The sintl utility extracts translatable strings
from HTML5 files (-e), joins XLIFF translation
files and untranslated HTML5 files (-j), or
merges new or removed translations (-u). Its
arguments are as follows:
- -c: Copy mode. When used with -e,
assigns the same target as the source. With
-u, does the same for new entries. By
default, in either case the target is left blank. For
-j, missing translations are filled in from
the input file's content.
- -e: Extracts translatable strings from
html5, emitting a skeleton XLIFF
translation file on standard output.
- -j xliff: Translate (“join”) html5 using the translations in
xliff, emitting translated HTML5 on standard output.
- -k: When used with -u, keep
entries that are no longer valid. Otherwise it is ignored.
- -q: Quiet: don't note additions and deletions when
-u is used.
- -u xliff: Update xliff with new
translatable strings in html5.
Non-matching terms are discarded unless -k is
specified. Additions and deletions are noted on standard error.
- html5: HTML5 input files to be translated or mined for
translatable strings.
By default, sintl
behaves as if
Each text node in the HTML5 input files is its own translatable string, unless
the text node is in a phrasing content element. (Except
.) For example,
<div>foo <i>bar</i> baz</div>
results in two translatable strings: “foo <i>bar</i>
baz” and “foobar”.
Contiguous white-space is collapsed into a single space and empty keys are
ignored. This is why the text node preceding the
is omitted. You may override the whitespace behaviour with the
xml:space attribute, which affects the current and descendant nodes by not
trimming whitespace at all.
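The whitespace rule above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not sintl's implementation; the function name normalise_key and the preserve_space parameter are hypothetical, and passing preserved text through completely unchanged is an assumption.

```python
import re

def normalise_key(text, preserve_space=False):
    """Return the key for one text node, or None if it would be ignored."""
    if preserve_space:
        # Assumption: with the whitespace override, text is left as written.
        return text if text else None
    # Collapse each run of whitespace to one space, then trim both ends.
    key = re.sub(r"\s+", " ", text).strip()
    # An empty key is ignored.
    return key or None

print(normalise_key("  hello \n\t world  "))  # hello world
print(normalise_key(" \n "))                  # None
```

A text node holding only whitespace collapses to an empty key, which is why such nodes never appear in the extracted strings.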
Translation may be controlled with the its:translate
attribute, which is set to either yes or no.
When set to no,
descendants of the labelled node are not examined for translatable content.
When set to yes, the opposite is true.
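As a rough illustration of how its:translate prunes the search for translatable content, the following Python sketch walks an XML tree and skips any subtree labelled as non-translatable. It is not sintl's code: the function name translatable_text is hypothetical, the sketch ignores the distinction between phrasing and non-phrasing content, and it assumes the standard ITS namespace URI.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TRANSLATE = "{%s}translate" % ITS_NS

def translatable_text(elem, out=None):
    """Collect non-empty text, skipping subtrees marked its:translate="no"."""
    if out is None:
        out = []
    if elem.get(TRANSLATE) == "no":
        # Descendants of this node are not examined at all.
        return out
    if elem.text and elem.text.strip():
        out.append(elem.text.strip())
    for child in elem:
        translatable_text(child, out)
        # Tail text follows the child but belongs to the current element.
        if child.tail and child.tail.strip():
            out.append(child.tail.strip())
    return out

doc = ET.fromstring(
    '<body xmlns:its="http://www.w3.org/2005/11/its">'
    '<p>one</p>'
    '<div its:translate="no"><p>two</p></div>'
    '<p its:translate="yes">three</p>'
    '</body>'
)
print(translatable_text(doc))  # ['one', 'three']
```

In this sketch an explicit "yes" behaves the same as an unlabelled node, since examining content is the default.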
Attributes are carried over into the translatable keys to differentiate similar
content.
In a break from standard usage, translations may change attribute values simply
by changing the attribute content. For example,
<source><g id="unit1-1" xhtml:href="foo.html">Hi</g>!</source>
<target>Le <g id="unit1-1" xhtml:href="foo.fr.html">hi</g> !</target>
In this example, the href attribute of the translated element will replace that of
the source.
sintl performs a number of optimisations to prevent
superfluous content from being considered for translation. First, translation
strings consisting only of an empty tag are removed. For example,
<p> <img src="path/to/image.png" /> </p>
These tags may be surrounded by white-space and arbitrarily nested.
Second, tags surrounding text are stripped away. For example,
<p> <a href="a/link.html"><i><strong>Hello.</strong></i></a> </p>
This will produce only “Hello.” for translation.
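The two optimisations can be approximated with a short Python sketch. It is a simplified model under stated assumptions, not sintl's algorithm: the function name reduce_fragment is hypothetical, and it only unwraps an element that contains exactly one child and no text of its own.

```python
import xml.etree.ElementTree as ET

def reduce_fragment(elem):
    """Return the string left for translation, or None if nothing remains."""
    node = elem
    while True:
        children = list(node)
        has_text = bool((node.text or "").strip())
        # Stop at real text, or when the node is not a plain single-child wrapper.
        if has_text or len(children) != 1:
            break
        if (children[0].tail or "").strip():
            break
        node = children[0]  # strip the surrounding tag
    # Serialise what is left inside the innermost node (child tails included).
    inner = (node.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in node
    )
    inner = " ".join(inner.split())  # collapse whitespace
    return inner or None

empty = ET.fromstring('<p> <img src="path/to/image.png" /> </p>')
wrapped = ET.fromstring(
    '<p> <a href="a/link.html"><i><strong>Hello.</strong></i></a> </p>'
)
print(reduce_fragment(empty))    # None
print(reduce_fragment(wrapped))  # Hello.
```

The first fragment reduces to an empty tag and is dropped; the second is unwrapped down to its text.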
The sintl utility exits 0 on success,
and >0 if an error occurs.
Let the following simple file, index.xml, be used
as a template for translating into different languages.
<body><p>hello <img src="foo.jpg" /> world</p></body>
We can then create an initial XLIFF file as follows.
sintl -e index.xml > index.en.xliff
Now edit the XLIFF file.
<file source-language="TODO" target-language="en">
<source>hello <x id="0" xhtml:src="foo.jpg"/> world</source>
If a language attribute were specified on the input
root element, it would have been
propagated into
source-language. Finally, create a translated output
file as follows.
sintl -j index.en.xliff index.xml > index.en.html
This can be repeated for as many translation files as necessary. Many systems
will use a baseline translation (e.g., English) as the template, but I find it
easier to translate based on sources that are identifiers, not content.
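Repeating the join step for several translation files can be scripted. The Python sketch below builds one sintl -j invocation per language, using only the command form shown above; the helper names, the index.LANG.xliff and index.LANG.html naming pattern, and the "fr" language in the usage note are assumptions for illustration.

```python
import subprocess

def join_command(xliff, template):
    """Build the argument list for one `sintl -j` run."""
    return ["sintl", "-j", xliff, template]

def translate_all(template, languages, run=subprocess.run):
    """Write index.<lang>.html from index.<lang>.xliff for each language."""
    for lang in languages:
        # sintl writes the translated HTML5 to standard output,
        # so redirect it into the per-language output file.
        with open("index.%s.html" % lang, "wb") as out:
            run(join_command("index.%s.xliff" % lang, template),
                stdout=out, check=True)
```

Calling translate_all("index.xml", ["en", "fr"]) would then produce index.en.html and index.fr.html, provided sintl is installed and both XLIFF files exist.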
HTML5 files to translate must be valid XML-form HTML5 documents annotated with a
subset of the W3C ITS v2.0 attributes. Files holding translation dictionaries
must be valid XLIFF 1.2 files.
The sintl utility was written by
sintl ignores translation comments within
translated phrasing content. For example:
<i>Hello, <span its:translate="no">world</span>.</i>
In this example, the non-translatable content is simply passed into the output.
Non-conformant HTML5, with non-phrasing content embedded in phrasing content,
is explicitly disallowed. For example:
<i>Hello, <div its:translate="no">world</div>.</i>