SINTL(1) | General Commands Manual | SINTL(1) |
sintl - simple HTML5 translation
sintl [-cekq] [-j xliff] [-u xliff] [html5...]
The sintl utility extracts translatable strings from HTML5 files (-e), joins XLIFF translation files and untranslated HTML5 files (-j), and merges new or removed translations (-u). Its arguments are as follows:
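-c
    With -e, assigns the same target as the source. With -u, does the same for new entries. By default, in either case the target is left blank. For -j, missing translations are filled in from the input file's content.
-e
    Extracts translatable strings from the HTML5 input files.
-j xliff
    Joins the XLIFF translation file xliff with the untranslated HTML5 input files.
-k
    With -u, keeps entries that are no longer valid. Otherwise it is ignored.
-q
    Be quiet when -u is used.
-u xliff
    Merges new or removed translations into xliff. Entries that are no longer valid are removed unless -k is specified. Additions and deletions are noted on standard error.

By default, sintl behaves as if -e were used.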
Each text node in the HTML5 input files is its own translatable string, unless the text node is in a phrasing content element (except <iframe>, <noscript>, <select>, <script>, and <textarea>), in which case it becomes part of the enclosing string. For example,

    <section>
      <div>foo <i>bar</i> baz</div>
      foobar
    </section>

results in two translatable strings: "foo <i>bar</i> baz" and "foobar". Contiguous whitespace is collapsed into a single space and empty keys are ignored. This is why the text node preceding the div is omitted. You may override the whitespace behaviour with the xml:space="preserve" attribute, which affects the current and descendant nodes by not trimming whitespace at all.
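The whitespace rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the code sintl uses: the function name normalise_key and its preserve parameter (standing in for xml:space="preserve") are hypothetical.

```python
import re


def normalise_key(text, preserve=False):
    """Return the key for a text node, or None when the key is empty.

    preserve=True stands in for xml:space="preserve": the text is
    returned untouched, with no collapsing and no trimming.
    """
    if preserve:
        return text if text else None
    # Collapse every run of whitespace to one space, then trim the ends.
    collapsed = re.sub(r"\s+", " ", text).strip()
    # A key that is empty after collapsing is ignored.
    return collapsed or None
```

Under these assumptions, the whitespace-only text node preceding the div in the example yields no key at all, while "foobar" surrounded by line breaks yields just "foobar".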
Translation may be controlled with the its:translate attribute, which is set to either yes or no. When set to no, descendants of the labelled node are not examined for translatable content. When set to yes, the opposite is true.
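As a rough sketch of the its:translate="no" rule, the following Python walks a parsed document and skips any subtree labelled as non-translatable. It is a simplification under stated assumptions: the function name translatable_text is hypothetical, each text node is reported on its own (the phrasing content grouping described earlier is not modelled), and the input is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET

# Namespaced name of the its:translate attribute.
ITS_TRANSLATE = "{http://www.w3.org/2005/11/its}translate"


def translatable_text(elem):
    """Yield trimmed text nodes under elem, skipping its:translate="no"."""
    if elem.get(ITS_TRANSLATE) == "no":
        return  # descendants of this node are not examined
    if elem.text and elem.text.strip():
        yield elem.text.strip()
    for child in elem:
        yield from translatable_text(child)
        # Text following a child belongs to elem, so it is still examined.
        if child.tail and child.tail.strip():
            yield child.tail.strip()


doc = ET.fromstring(
    '<body xmlns:its="http://www.w3.org/2005/11/its">'
    "<p>hello</p>"
    '<p its:translate="no">skip <b>me</b></p>'
    "<p>world</p>"
    "</body>"
)
strings = list(translatable_text(doc))
```

Here strings holds only the text of the first and last paragraphs; nothing inside the labelled paragraph is visited.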
Attributes are carried over into the translatable keys to differentiate similar content.
In a break from standard usage, translations may change attribute values simply by changing the attribute content. For example,
    <trans-unit id="unit1">
      <source><g id="unit1-1" xhtml:href="foo.html">Hi</g>!</source>
      <target>Le <g id="unit1-1" xhtml:href="foo.fr.html">hi</g> !</target>
    </trans-unit>
In this example, the attribute of the translated element will replace that of the source.
sintl performs a number of optimisations to prevent superfluous content from being considered for translation. First, translation strings consisting only of an empty tag are removed. For example,

    <p> <img src="path/to/image.png" /> </p>

These tags may be surrounded by whitespace and arbitrarily nested. Second, tags surrounding text are stripped away. For example,

    <p> <a href="a/link.html"><i><strong>Hello.</strong></i></a> </p>

This produces only "Hello." for translation.
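One way to picture the two optimisations is as a descent through wrapper tags. The Python below is a sketch under assumptions, not sintl's implementation: strip_wrappers and is_empty_tag are hypothetical names, and only the single-child case shown in the examples is handled.

```python
import xml.etree.ElementTree as ET


def _blank(text):
    """True for a missing or whitespace-only text node."""
    return text is None or not text.strip()


def strip_wrappers(elem):
    """Descend through tags that merely wrap a single child element."""
    while len(elem) == 1 and _blank(elem.text) and _blank(elem[0].tail):
        elem = elem[0]
    return elem


def is_empty_tag(elem):
    """True when nothing translatable remains: no children and no text."""
    return len(elem) == 0 and _blank(elem.text)


image = strip_wrappers(ET.fromstring('<p> <img src="path/to/image.png" /> </p>'))
link = strip_wrappers(ET.fromstring(
    '<p> <a href="a/link.html"><i><strong>Hello.</strong></i></a> </p>'))
```

In this sketch image ends up as the lone img element, which is_empty_tag reports as empty, and link ends up as the innermost element whose text is "Hello.".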
The sintl utility exits 0 on success, and >0 if an error occurs.
Let the following simple file, index.xml, be used as a template for translating into different languages.
    <!DOCTYPE html>
    <html xmlns:its="http://www.w3.org/2005/11/its">
      <head><title>title</title></head>
      <body><p>hello <img src="foo.jpg" /> world</p></body>
    </html>
We can then create an initial XLIFF file by running sintl with -e on index.xml. Now edit the XLIFF file.
    <xliff version="1.2">
      <file source-language="TODO" target-language="en">
        <body>
          <trans-unit id="unit1">
            <source>title</source>
            <target>Title</target>
          </trans-unit>
          <trans-unit id="2">
            <source>hello <x id="0" xhtml:src="foo.jpg"/> world</source>
            <target>Hello, World!</target>
          </trans-unit>
        </body>
      </file>
    </xliff>
If the lang attribute were specified on the input <html> root element, it would have been propagated to the source-language attribute. It defaults to TODO. Finally, create a translated output file by running sintl with -j and the edited XLIFF file on index.xml.
This can be repeated for as many translation files as necessary. Many systems will use a baseline translation (e.g., English) as the template, but I find it easier to translate based on sources that are identifiers, not content.
HTML5 files to translate must be valid XML-form HTML5 documents annotated with a subset of the W3C ITS v2.0 attributes. Files holding translation dictionaries must be valid XLIFF 1.2 files.
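To illustrate the dictionary format, the sketch below reads source and target pairs from a small XLIFF document and lists the entries whose target is still blank, as they would be after extraction without -c. It is an assumption-laden example rather than part of sintl: read_pairs and untranslated are hypothetical helpers, the sample omits the XLIFF namespace declaration (as the example in this manual does), and inline markup inside a source is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the shape of the XLIFF shown in this manual.
SAMPLE = """
<xliff version="1.2">
  <file source-language="TODO" target-language="en">
    <body>
      <trans-unit id="1">
        <source>title</source>
        <target>Title</target>
      </trans-unit>
      <trans-unit id="2">
        <source>hello</source>
        <target></target>
      </trans-unit>
    </body>
  </file>
</xliff>
""".strip()


def read_pairs(xliff_text):
    """Map each source string to its target string ("" when blank)."""
    root = ET.fromstring(xliff_text)
    pairs = {}
    for unit in root.iter("trans-unit"):
        source = unit.findtext("source", default="")
        target = unit.findtext("target", default="")
        pairs[source] = target
    return pairs


def untranslated(pairs):
    """Return the sources whose target is empty or whitespace-only."""
    return [source for source, target in pairs.items() if not target.strip()]


pairs = read_pairs(SAMPLE)
todo = untranslated(pairs)
```

A list like todo is a quick way to see which entries still need a translator's attention before joining.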
The sintl utility was written by Kristaps Dzonsons, kristaps@bsd.lv.
sintl ignores translation comments within translated phrasing content. For example:
<i>Hello, <span its:translate="no">world</span>.</i>
In this example, the non-translatable content is simply passed into the output. Non-conformant HTML5, with non-phrasing content embedded in phrasing content, is explicitly disallowed. For example:
<i>Hello, <div its:translate="no">world</div>.</i>
June 28, 2019 | OpenBSD 6.7 |