XUL Localizability issues
by Tao Cheng <tao@netscape.com>
Document History
- 02/26/99: Correct entity reference syntax.
- 02/19/99: Insert two XUL L12y solutions: '"@.*;" + property file',
proposed by Daniel Matejka, and
"Using XLinks and XPointers for XUL Localisation" by Daniel McGowan.
- 02/17/99: At last, a consensus on the solution for XUL localizability has been reached. The "XUL + language-specific DTD" approach is adopted.
- 02/10/99: In solution #2, relax the rule that the content data must
be a text attribute value; instead, treat content data as the element
content. See the samples in solution #2.
- 02/09/99: Add explanation of the "default string" to solution #2.
- 02/08/99: Add "How to locate the language-specific file" section.
- 02/08/99: Take out all out-dated sections.
- 02/08/99: Re-evaluate solution #1 and #2 in the comparison table.
- 02/08/99: Revised solution #2 to reduce the number of IDs needed for
resources identification. In the revised version, resources are uniquely
identified by the combination of widget ID and resource tag.
The Java property file format is also extended to support structured
resources.
- 02/05/99: Revamp the "Candidates of the final solution" section and
add a table of comparison.
- 02/05/99: Compile for final review; record more discussion.
- 01/29/99: Record more discussion.
- 01/28/99: Added two sections: "Ideas" and "Candidates of the final
solution".
- 01/28/99: Compiled feedback from Daniel Matejka.
- 01/25/99: Feedback from Daniel Matejka in red.
- 01/25/99: Post this document to newsgroup,
news://news.mozilla.org/netscape.public.mozilla.xpfe,
for broader audience and discussion.
- 01/25/99: Per Scott Collins, http://www.meer.net/ScottCollins/, not all
UI widgets can be described in a XUL file. Q: does this mean we need another
mechanism to handle the non-XUL UI components?
- 01/20/99: Add a new section, "How does the XUL concept work?"
- 01/20/99: In the "XUL Localizability issues" held today, it's proposed
that
- 01/20/99: Incorporate Rob Thorne's comments and suggestions (in blue).
- 01/19/99: Add reference to Erik's String Resources. We might be able to
consolidate the idea presented there with the gettext() scheme.
- 01/16/99: First draft.
Goals
This document serves the following purposes:
- Identify the Internationalization and Localization requirements in the
Seamonkey project.
- Discuss the XUL Localizability issues in Seamonkey, 5.0.
- Record the proposed solutions, and their pros and cons.
- Keep track of the status of the related issues.
- Feature complete by first beta so it can be tested.
Principles
Here is a list of principles the author intends to follow in seeking the
solution for this issue.
- Simple. Both core development and localization work will be easy
and less error prone. Win-win situation ;)
- Leverageable. Localization results shall be leverageable from release
to release. Localization costs money.
- Consistent. If possible, we shall seek a scheme that will work across
modules instead of within the XUL component only.
- Portable. The final solution will be achievable on all platforms
including Unix, Windows, Mac, and others.
- Extensible. The adopted solution will be flexible for customization
and future extension.
- Dynamic binding. Some of the items requiring translation may be
dynamic, usually because they require string composition
("Installing item 5 of 10").
- Validatable. Localizers/translators will be able to validate the
localization results.
- Parseable. It should be possible to unambiguously and automatically
determine which embedded items contain localizable text, and what items
need to be locked.
- Invisible (Internationalization). As much as possible, the standard
tools that create US UI should emit files that are already localizable,
without requiring additional processing.
Candidates of the final solution
- XUL + language-specific DTD. (adopted)
Description:
- Put all localizable resources in a language DTD file. Examples of such
resources are text strings, customizable icons, and URLs. Most of
them can be described by text/parsed entities.
- Non-locale-sensitive resources shall not be in this DTD file.
- Use a SYSTEM identifier to reference this DTD file.
- Need to implement locale-sensitive file lookup for the
language-specific DTD file.
- Put format strings, such as "Item %d of %d", in text entities and
compute the value in the application code such as MailCore or
BrowserCore.
- To dynamically switch languages, we need to reload the XUL and its
DTD (probably from a remote host). This is because once the DOM tree
is created, the entities and DTDs have already been processed.
Sample XUL: toolbar.xul
<!DOCTYPE xui SYSTEM "toolbar.dtd">
<xul:toolbar>
  &txtContentData;
  <button cmd="nsCmd:BrowserBack"
          style="background-color:rgb(192,192,192);">
    <img src="resource:/res/toolbar/TB_Back.gif"/>
    &txtBack;
  </button>
  <button cmd="nsCmd:BrowserForward"
          style="background-color:rgb(192,192,192);">
    <img src="resource:/res/toolbar/TB_Forward.gif"/>
    &txtForward;
  </button>
  <button cmd="nsCmd:BrowserWizard"
          style="background-color:rgb(192,192,192);">
    <img src="&iconWizard;"/>
    &txtWizard;
  </button>
</xul:toolbar>
Sample DTD: toolbar.dtd
<!ENTITY txtContentData "Random content data">
<!ENTITY txtBack "Back to &#37;s">
<!ENTITY txtForward "Forward">
<!ENTITY iconWizard "resource:/res/toolbar/TB_Wizard.gif">
<!ENTITY txtWizard "Wizard">
Pros:
- Already standard compliant; no new syntax names or tags need to be
introduced.
- Only one minor tweak needed: escape the "%" used in format strings,
such as "%d out of %d", for dynamic string binding. For example,
use a numeric character reference (NCR), '&#37;', to escape
'%'.
- Text replacement can be in either content or attribute values
(but not in the attribute names).
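The escaping rule and the "format in application code" step can be sketched as follows. This is only an illustration in Python, not the Seamonkey implementation: the DTD is inlined as an internal subset instead of a separate file, and the entity name txtProgress and the function progress_label are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a XUL file; the language DTD is inlined here as an
# internal subset, whereas the proposal keeps it in a separate file.
DOCUMENT = """<!DOCTYPE toolbar [
<!ENTITY txtProgress "Installing item &#37;d of &#37;d">
]>
<toolbar><label>&txtProgress;</label></toolbar>"""


def progress_label(current, total):
    """Let the parser expand the entity, then format in application code."""
    root = ET.fromstring(DOCUMENT)
    # The NCR &#37; has become a literal "%" by the time the text is read.
    template = root.find("label").text
    return template % (current, total)


print(progress_label(5, 10))
```

The parser only delivers the format string; the numbers are filled in afterwards, which is the division of labour the description assigns to code such as MailCore or BrowserCore.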
Cons:
- The language-specific DTD file is not a flat file. Need a DTD parser
to extract localizable resources into a flat file for localizers.
- Two file formats to deal with: the property file and the DTD
file.
- Hard to group text entities by UI component.
- We lose the information of text entities after parsing.
- When switching languages, we need to reload the XUL and its
DTD (probably from a remote host) and reconstruct the DOM tree.
In the example of a dialog UI, if we used entities and DTDs, we
would have to tear down the whole DOM tree and the dialog that sits
on top of that, and then rebuild a new DOM tree and dialog. This
would be wasteful, since our layout manager is able to resize
elements dynamically, so we could "edit" the DOM tree and have the
dialogs redraw themselves automatically.
However, we can live with this performance drag since users
are unlikely to switch languages at run time very often.
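The last two points can be seen in a small sketch: once parsed, the tree holds only the expanded text, so a language switch means parsing again with a different entity set. The XUL body, the per-language declarations and the function load_toolbar are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XUL body and per-language entity declarations.
XUL_BODY = "<toolbar><button>&txtForward;</button></toolbar>"
LANGUAGE_DTDS = {
    "en": '<!ENTITY txtForward "Forward">',
    "fr": '<!ENTITY txtForward "Suivant">',
}


def load_toolbar(locale):
    """Parse the XUL body against the entity set of one locale."""
    source = "<!DOCTYPE toolbar [\n%s\n]>\n%s" % (LANGUAGE_DTDS[locale], XUL_BODY)
    return ET.fromstring(source)


tree = load_toolbar("en")
# Only the expanded text survives; the entity name is no longer in the tree.
print(ET.tostring(tree, encoding="unicode"))

# Switching language: there is nothing to edit in place, so parse again.
tree = load_toolbar("fr")
print(tree.find("button").text)
```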
- Single XUL file with Java-like property file.
(ruled out due to technical difficulty)
Description:
- Assign a widgetID to each widget in the XUL file, and a
resTag to each localizable resource/attribute of a widget
in the widget code.
Then, call gettext(widgetID, resTag, default_string) to retrieve
the resources from a Java-like property file at run time.
For example, a label widget can be described as
<label widgetID="345" text="label string"/> in a XUL file. Then, the
function call to retrieve localized text will be
gettext(345, RES_TEXT, "label string");
- If the property file does not exist or the combination of widgetID and
resTag does not resolve to a resource string, the default_string
will be returned instead.
- All localizable resources must be stored in a Java-like property file.
- Resource replacement may happen as early as parsing or
as late as widget initialization.
- The reference to the property file will be declared as an external
unparsed entity and stashed in the DOM tree for later use. See the sample
XUL declaration below.
Sample XUL: toolbar.xul
<!DOCTYPE xui SYSTEM "toolbar.dtd" [
  <!-- L10N-PTY type of data: file format can be found at
       http://www.netscape.com/PropertyFile -->
  <!NOTATION L10N-PTY SYSTEM "http://www.netscape.com/PropertyFile">
  <!ENTITY JFile
           SYSTEM "http://www.home.org/l10n.property"
           NDATA L10N-PTY>
]>
<xul:toolbar>
  <label widgetID="8000">Random content data
  </label>
  <button widgetID="8001"
          cmd="nsCmd:BrowserBack"
          style="background-color:rgb(192,192,192);"
          img="resource:/res/toolbar/TB_Back.gif">Back to %s
  </button>
  <button widgetID="8002"
          cmd="nsCmd:BrowserForward"
          style="background-color:rgb(192,192,192);"
          img="resource:/res/toolbar/TB_Forward.gif">Forward
  </button>
  <button widgetID="8003"
          cmd="nsCmd:BrowserWizard"
          style="background-color:rgb(192,192,192);"
          img="resource:/res/toolbar/TB_Wizard.gif">Wizard
  </button>
</xul:toolbar>
Sample property file: property.toolbar
8000: Random content data
8001.img: resource:/res/toolbar/TB_Back.gif
8001: Back to %s
8002.img: resource:/res/toolbar/TB_Forward.gif
8002: Forward
8003.img: resource:/res/toolbar/TB_Wizard.gif
8003: Wizard
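A reader for this format might look like the sketch below. It assumes details the sample does not state: the key is split from the value at the first colon, an untagged key such as "8000" denotes the text resource, and blank lines or lines starting with "#" are skipped. The names parse_properties and DEFAULT_TAG are hypothetical.

```python
SAMPLE = """\
8000: Random content data
8001.img: resource:/res/toolbar/TB_Back.gif
8001: Back to %s
"""

DEFAULT_TAG = "text"  # assumed name for the untagged (content) resource


def parse_properties(text):
    """Map (widgetID, resTag) pairs to resource strings."""
    resources = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment convention
            continue
        # Split at the first colon only: values like "resource:/..." contain colons.
        key, _, value = line.partition(":")
        widget_id, _, tag = key.strip().partition(".")
        resources[(int(widget_id), tag or DEFAULT_TAG)] = value.strip()
    return resources


table = parse_properties(SAMPLE)
print(table[(8001, "img")])
print(table[(8001, "text")])
```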
Sample resource tags definition
#define RES_TEXT 0x1234
#define RES_IMG 0x1235
To get the text string for a "Back" button's label, we call
gettext(8001, RES_TEXT, "Back to %s")
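The lookup and its fallback can be sketched as below. This is an assumption-laden illustration in Python, not the actual gettext() design: the resource table is a plain dictionary and the French string is invented for the example.

```python
RES_TEXT = 0x1234
RES_IMG = 0x1235

# Hypothetical in-memory form of a localized property file.
RESOURCES = {
    (8001, RES_TEXT): "Retour vers %s",
    (8001, RES_IMG): "resource:/res/toolbar/TB_Back.gif",
}


def gettext(widget_id, res_tag, default_string, resources=RESOURCES):
    """Return the localized resource, or default_string when none is found."""
    return resources.get((widget_id, res_tag), default_string)


print(gettext(8001, RES_TEXT, "Back to %s"))      # entry found
print(gettext(8002, RES_TEXT, "Forward"))         # no entry: default is used
print(gettext(8001, RES_TEXT, "Back to %s", {}))  # no property file loaded
```

Because the default string travels with every call, the UI keeps working when the property file is missing or incomplete.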
Pros
- All localizable resources are uniquely identified by the combination of
widgetID and the resource tags. The application/front end developers
can easily update a UI element's attribute/resource.
- Core development work will not be blocked by the gettext() implementation.
However, we shall request the UI developers to put English strings,
localization notes, and comments in the property file.
- The fallback mechanism allows the developers to work without the
presence of property files.
- The English version of the property file can be automatically generated
during XUL to DOM conversion.
- Provides a fallback mechanism to default strings.
- The property file is flat and in clear text; easy to localize and
leverage.
- The implementation of the nsStringBundle interface is about to finish. The
basic facilities of parsing the property file and retrieving text are
ready to check in.
- Consistent with the scheme in "String Resources"; only one file format
to deal with.
- Resources are grouped by widgets. This also makes the property file more
readable.
- Easy to leverage the property file. All resources are IDed and ready for
comparison.
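As a rough sketch of the "automatically generated" point, the English property file could be derived by walking the parsed XUL. The sample below is simplified (no namespace prefix, no DOCTYPE), and the function name extract_properties and the choice of "img" as the only localizable attribute are assumptions.

```python
import xml.etree.ElementTree as ET

XUL = """<toolbar>
  <button widgetID="8001" cmd="nsCmd:BrowserBack"
          img="resource:/res/toolbar/TB_Back.gif">Back to %s
  </button>
  <button widgetID="8002" cmd="nsCmd:BrowserForward"
          img="resource:/res/toolbar/TB_Forward.gif">Forward
  </button>
</toolbar>"""

LOCALIZABLE_ATTRIBUTES = ("img",)  # assumption: which attributes get extracted


def extract_properties(xul_source):
    """Emit property-file lines for every element that carries a widgetID."""
    lines = []
    for element in ET.fromstring(xul_source).iter():
        widget_id = element.get("widgetID")
        if widget_id is None:
            continue
        for name in LOCALIZABLE_ATTRIBUTES:
            if name in element.attrib:
                lines.append("%s.%s: %s" % (widget_id, name, element.get(name)))
        text = (element.text or "").strip()
        if text:
            lines.append("%s: %s" % (widget_id, text))
    return lines


print("\n".join(extract_properties(XUL)))
```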
Cons
- Need to treat content data as the text resource of a label widget. (So
it can be identified and edited by application code.)
- Need to implement a mechanism to automatically bind localizable resources
to widgets. However, the amount of work can be reduced by performing
the localized resource binding at widget initialization time, since
we need to bind the UI attributes in the DOM to the underlying widgets
anyway.
- Need to ensure the uniqueness of the widgetID. However, the appCore
developers need to have a way to uniquely identify a widget anyway.
- Localizable resource strings are duplicated: once in XUL and once
in the property file.
- Need to extend the Java-like property file to support structured
resources.
- Technical difficulty: once XUL has been converted to a
DOM tree, the content can't be changed anymore.
- Use text entities for content data and IDs for widget
resources. (ruled out due to technical difficulty)
Description:
With the marriage of #1 and #2, we can take advantage of both
worlds. The idea is to use text entities for content data to remedy the
awkwardness of the #2 approach in dealing with content data.
Pros:
- Reference to content data is XML standard compliant (general entity).
- All localizable resources are uniquely identified.
- UI developers will be able to specify widget resources directly in
XUL. Extraction of localizable resources can be performed in the client's
build process. Localization is invisible to the UI developers.
- For UI that does not contain HTML data, we have only one file, the
property file, to deal with.
- For those that contain HTML data, we deal with them outside of XUL. This
also helps us keep the XUL file clean.
Cons:
- Why not simply use the DTD approach?
- Technical difficulty: once XUL has been converted to a
DOM tree, the content can't be changed anymore.
- "@.*;" + property file
Description:
Assuming the "timely access" problem can be overcome, we could get around
the "syntax constraint" problem by using an entity-like syntax of our own.
That is, we invent something, say we use the "@" symbol like entities use
the "&". Then these things are used throughout the content just like
entities would have been used to do localization.
This still assumes we have some way to get at the
language-specific-substitution text after parsing (so it can't be a parser
directive; it may have to be some sort of special element that XUL will
recognize and not display). If all this worked, we'd be free to stick in
localizable text anywhere without constraining the element and attribute
structure. The above example
<element l10nID="100" text="english version"/>
becomes
<element text="@100;"/> ( or <element>@100;</element>,
if that's more appropriate for the widget).
There just needs to be a single routine somewhere central that knows
where to find the table of localized text strings. It finds "@.*;"
sequences and substitutes them. We have to walk the content model
after parsing and hand every string to this routine, and widgets have
to pass all their text strings through it before they do anything with
them.
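The "single routine somewhere central" could be as small as the sketch below. It is an illustration under assumptions: IDs are limited to word characters (the proposal only shows "@100;"), unknown IDs are left untouched, and the names localize and REFERENCE are made up. Note that the literal pattern "@.*;" would be greedy as a regular expression, so a narrower pattern is used.

```python
import re

# Assumption: IDs consist of word characters; the proposal only shows "@100;".
REFERENCE = re.compile(r"@(\w+);")


def localize(text, table):
    """Replace each @id; with its table entry; unknown IDs are left as is."""
    return REFERENCE.sub(lambda m: table.get(m.group(1), m.group(0)), text)


german = {"100": "deutsche Version"}  # hypothetical localized string table
print(localize("@100;", german))
print(localize("label: @100; / @999;", german))
```

Every string taken from the content model, and every string a widget handles, would have to pass through such a routine, which is the extra work the description mentions.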
Cons:
- The entity solution is more XML compliant and less work to implement.
- Using XLinks and XPointers for XUL Localisation
(by Daniel McGowan) (ruled out due to technical difficulty)
Abstract:
Use XLink & XPointer to specifically reference text in a file that
is separate from the base XUL file so that this text can be easily localised
and display this text to the end user in a manner consistent with XPFE requirements.
Pros:
Since it is all written in vanilla XML there is no need to create custom
file types and this system can accept anything the parser can handle. It
maintains the name-value pairing essential for localisation. It allows us
to add localisation and developer notes to the object (e.g. button) and
the localised text separately but maintain a direct link between the two.
The text is pulled into the UI elements when the XUL file is parsed. This
also addresses the goal of separating markup, style and content.
Cons:
This does not leave us with a flat file solution. However the file
containing the text to be localised is of such a simple format that writing
a tool to parse it is a trivial exercise. We are going to need some form of
tool to convert native encoding to Unicode character references.
There are 4 files to track! Actually the language-specific DTD is complete
and valid as is, so it could easily be declared inline in the language-specific
XML file. The link-attributes entity has been entitised and could conceivably
be inherited from a higher level DTD.
In reloading downloadable chrome, not all related files can be
blown away by the client.
Here is an example of the syntax needed for a button UI element.
UI.XUL
<button href="&locale/uilang.xml|id(1234).child(text)">
  <content-info>Put comments on button
    functionality here
  </content-info>
  other xul markup
</button>
UI.DTD
<!ENTITY % link-attributes
  "xlink:form   CDATA #FIXED 'simple'
   href         CDATA #REQUIRED
   content-info CDATA #IMPLIED
   show         CDATA #FIXED 'embed'
   actuate      CDATA #FIXED 'auto'"
>
<!ELEMENT button (#PCDATA)>
<!ATTLIST button
  %link-attributes;
  other button specific attributes
>
UILANG.XML
<loctext id = "1234">
  <text>Gallia est omnis divisa in partes tres,
    quarum unam incolunt
    Belgae, aliam Aquitani,
    tertiam qui ipsorum
    lingua Celtae,
    nostra Galli appellantur.
  </text>
  <note>These are Caesar's first words on Gaul.
    This button should
    be centered on
    column 1 of the
    dialog
  </note>
</loctext>
UILANG.DTD
<!ELEMENT loctext (text, note?)>
<!ATTLIST loctext id ID #REQUIRED>
<!ELEMENT text (#PCDATA)>
<!ELEMENT note (#PCDATA)>
So, when the <button> tag is parsed, the "simple" xlink href
(which is #REQUIRED) is automatically (actuate = 'auto') embedded (show
= 'embed') with the text from the <text> child element of the element
with id = 1234 in the file at the URI location which is the value of &locale(some
more globally set value)/UILANG.XML.
OK so that is probably a bad explanation but I hope the code is clear
enough. If you have any questions don't hesitate to flame away.
Daniel.
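What the link is meant to resolve to can be approximated with a plain lookup. The sketch below is not an XLink or XPointer processor; it only mimics the effect of id(1234).child(text) on a language file. The wrapper element uilang, the shortened strings and the function resolve_text are assumptions added so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical language file; a wrapper root is added so that several
# <loctext> entries can live in one document.
UILANG = """<uilang>
  <loctext id="1234">
    <text>Gallia est omnis divisa in partes tres</text>
    <note>This button should be centered on column 1 of the dialog</note>
  </loctext>
</uilang>"""


def resolve_text(language_xml, loc_id):
    """Return the <text> child of the <loctext> with the given id, else None."""
    root = ET.fromstring(language_xml)
    for loctext in root.iter("loctext"):
        if loctext.get("id") == loc_id:
            return loctext.findtext("text")
    return None


print(resolve_text(UILANG, "1234"))
```

The <note> element stays in the language file for localizers and is never pulled into the UI, which matches the separation the proposal describes.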
Comparison (*****: excellent, *: show stopper)
| Criteria to examine | XUL + Language-specific DTD | XUL + Language-specific property file | Description |
| Simple | ***** | **** Need to define resource tags in widget code and widgetID in XUL file. | Both core development and localization work shall be made easy and less error prone. |
| Leverageable | *** Need a parser to list, identify, and compare resources. | ***** All resources are in property files, which are flat and easier to leverage. | Localization results shall be leverageable from release to release. |
| Consistent | *** Two file formats, DTD and property file, to deal with. | ***** Only one file format, property file. | The scheme will work across modules instead of within the XUL component only. |
| Standard compliant | ***** | **** (We can extend property file format to have similar syntax to X/MOTIF's application default file.) | Achievable on all platforms including Unix, Windows, Mac, and others. |
| Portable | ***** | ***** | Achievable on all platforms including Unix, Windows, Mac, and others. |
| Extensible | ***** In the same direction as XML. | **** Need to treat content data as the text resource of a label widget. | The adopted solution will be flexible for customization and future extension. |
| Dynamic binding | *** Resource binding mostly happens in the XML parser. | ***** Resource binding occurs at the last minute. | Some of the items requiring translation may be dynamic, usually because they require string composition ("Installing item 5 of 10"). |
| Validatable | *** (need a DTD parser) | ***** (right on the scene) | Localizers/translators will be able to validate the localization results. |
| Parsable | *** (DTD file contains XML tags, keywords, and others) | ***** (localizable resources are easily identified) | It should be possible to unambiguously and automatically determine which embedded items contain localizable text, and what items need to be locked. |
| Invisible (Internationalization) | *** (entity defined in external DTD) | **** (developers need to assign an id to each resource; but the generation of the US/EN property file could be done by the XUL parser.) | As much as possible, the standard tools that create US UI should emit files that are already localizable, without requiring additional processing. |
| Identifiable | **** (entity names are unique; but we lose them after parsing.) | **** (all resources are identified by the combination of widgetID and resTag; but we must treat content data as the text resource of a label widget) | All resources shall be uniquely identifiable. |
| Dynamic Language Switching | *** Need to reload the XUL | **** We can design it to modify localizable attributes only. | Dynamically switch to a different language and reflect it in the UI. (This does not happen often.) |
How to locate the language-specific file
In general, we need two pieces of information to locate the language-specific file:
- The reference to the location of the Java property files declared in the
unparsed entities as described in solution #2.
- The locale information in the client.
For example, if we declare the entity as
<!ENTITY JFile SYSTEM "http://www.home.org/l10n-property.xxx"
NDATA L10N-PTY>
and the current locale is "ja", then the real URI is
"http://www.home.org/l10n-property_ja.xxx"
or
"http://www.home.org/ja/l10n-property.xxx"
The real location of the language-specific DTD file can be determined in a
similar fashion.
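Both naming schemes can be derived mechanically from the declared URI and the client locale. The sketch below only builds the two candidate URIs; the function name localized_candidates is hypothetical, and which candidate is tried first is left open, as in the text.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def localized_candidates(uri, locale):
    """Return the locale-suffixed and locale-directory variants of a URI."""
    parts = urlsplit(uri)
    directory, filename = posixpath.split(parts.path)
    stem, extension = posixpath.splitext(filename)
    # Variant 1: append "_<locale>" to the file name, before the extension.
    suffixed = posixpath.join(directory, "%s_%s%s" % (stem, locale, extension))
    # Variant 2: put the file in a per-locale subdirectory.
    nested = posixpath.join(directory, locale, filename)
    return [urlunsplit(parts._replace(path=path)) for path in (suffixed, nested)]


for candidate in localized_candidates("http://www.home.org/l10n-property.xxx", "ja"):
    print(candidate)
```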