XUL Localizability issues

by Tao Cheng <tao@netscape.com>

Document History

Goals

This document serves the following purposes

Localizability issues

Historically, we have encountered some difficulties in localizing Web-based documents. The Mozilla Internationalization group wants to address these issues in the XUL world.

Criteria of the solution

Before embarking on the solution-seeking journey, let's lay out a set of criteria we intend to meet. While it might not be feasible to find a solution that satisfies all the criteria, they shall be used as factors in decision making.

XUL Localizability dependency

XUL localizability depends on the following items.

Candidates of the final solution

  1. XUL + language-specific DTD. (adopted)

    Description:

    In this approach, we declare general (text) entities for all locale-sensitive resources in an external DTD (Document Type Definition) subset and use XML entity references, "&entity;", to reference them:

    Sample XUL: toolbar.xul


      <!DOCTYPE xui SYSTEM "chrome://navigator/locale/toolbar.dtd">

      <xul:toolbar>

        &txtContentData;
        <button cmd="nsCmd:BrowserBack" style="background-color:rgb(192,192,192);">
          <img src="chrome://navigator/locale/TB_Back.gif"/>
          &txtBack;
        </button>

        <button cmd="nsCmd:BrowserForward" style="background-color:rgb(192,192,192);">
          <img src="chrome://navigator/locale/TB_Forward.gif"/>
          &txtForward;
        </button>

        <button cmd="nsCmd:BrowserWizard" style="background-color:rgb(192,192,192);">
          <img src="&iconWizard;"/>
          &txtWizard;
        </button>

      </xul:toolbar>

    Sample DTD: toolbar.dtd


      <!ENTITY txtContentData "Random content data">
      <!ENTITY txtBack "Back to &#37;s">
      <!ENTITY txtForward "Forward">
      <!ENTITY iconWizard "chrome://navigator/locale/TB_Wizard.gif">
      <!ENTITY txtWizard "Wizard">
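    A minimal sketch of how this approach swaps languages, assuming Python's standard xml.etree.ElementTree module. To stay self-contained it simplifies the toolbar.xul markup above and inlines each locale's declarations as an internal DTD subset instead of loading an external chrome:// DTD. The render helper and the French strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified toolbar markup; every visible string is an entity reference.
XUL_BODY = """<toolbar>
  <button cmd="nsCmd:BrowserBack">&txtBack;</button>
  <button cmd="nsCmd:BrowserForward">&txtForward;</button>
</toolbar>"""

# One set of entity declarations per locale (the French text is made up).
LOCALE_DTDS = {
    "en-US": '<!ENTITY txtBack "Back to &#37;s">\n<!ENTITY txtForward "Forward">',
    "fr-FR": '<!ENTITY txtBack "Retour vers &#37;s">\n<!ENTITY txtForward "Suivant">',
}


def render(locale):
    """Parse the same markup against one locale's entities; return button labels."""
    # Stand-in for <!DOCTYPE ... SYSTEM "chrome://.../toolbar.dtd">: the
    # declarations go into the internal subset so the parser can see them.
    doc = "<!DOCTYPE toolbar [\n%s\n]>\n%s" % (LOCALE_DTDS[locale], XUL_BODY)
    root = ET.fromstring(doc)
    return [button.text for button in root.findall("button")]


labels_en = render("en-US")
labels_fr = render("fr-FR")
```

    The markup is identical for both locales; only the entity declarations change. The character reference &#37; expands to a literal percent sign when the entity is declared, so the parsed English label reads "Back to %s".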

    Pros:

    Cons:

  2. Resource ID + String Resource Manager. (ruled out due to technical difficulty)

    Description:

    Sample XUL: toolbar.xul


      <!DOCTYPE xui SYSTEM "toolbar.dtd">

      <!-- L10N-PTY type of data: file format can be found at http://www.netscape.com/PropertyFile -->
      <!NOTATION L10N-PTY SYSTEM "http://www.netscape.com/PropertyFile">
      <!ENTITY JFile SYSTEM "http://www.home.org/l10n.property" NDATA L10N-PTY>

      <xul:toolbar>


        <label widgetID="8000">Random content data</label>
        <button widgetID="8001"
          cmd="nsCmd:BrowserBack"
          style="background-color:rgb(192,192,192);"
          img="resource:/res/toolbar/TB_Back.gif">Back to &#37;s
        </button>

        <button widgetID="8002"
          cmd="nsCmd:BrowserForward"
          style="background-color:rgb(192,192,192);"
          img="resource:/res/toolbar/TB_Forward.gif">Forward
        </button>

        <button widgetID="8003"
          cmd="nsCmd:BrowserWizard"
          style="background-color:rgb(192,192,192);"
          img="resource:/res/toolbar/TB_Wizard.gif">Wizard
        </button>

      </xul:toolbar>

    Sample property file: property.toolbar

      8000: Random content data
      8001.img: resource:/res/toolbar/TB_Back.gif
      8001: Back to &#37;s
      8002.img: resource:/res/toolbar/TB_Forward.gif
      8002: Forward
      8003.img: resource:/res/toolbar/TB_Wizard.gif
      8003: Wizard

    Sample resource tags definition

      #define RES_TEXT  0x1234
      #define RES_IMG   0x1235

    To get the text string for a "Back" button's label, we call

      gettext(8001, RES_TEXT, "Back to &#37;s")
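    A minimal sketch of such a lookup, assuming the "key: value" property format shown above. The load_properties helper, the mapping from resource tag to key suffix, and the extra table argument to gettext are assumptions made for illustration; the original call takes only the widget ID, the resource tag, and the default string.

```python
RES_TEXT = 0x1234  # resource tags from the sample definitions
RES_IMG = 0x1235

# Assumed mapping from a resource tag to the key suffix used in the property file.
SUFFIX = {RES_TEXT: "", RES_IMG: ".img"}

PROPERTY_FILE = """\
8000: Random content data
8001.img: resource:/res/toolbar/TB_Back.gif
8001: Back to &#37;s
8002.img: resource:/res/toolbar/TB_Forward.gif
8002: Forward
"""


def load_properties(text):
    """Parse 'key: value' lines into a dict, splitting at the first colon only."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(":")
            table[key.strip()] = value.strip()
    return table


def gettext(table, widget_id, res_tag, default):
    """Return the resource for (widget_id, res_tag), or the default if missing."""
    return table.get("%d%s" % (widget_id, SUFFIX[res_tag]), default)


table = load_properties(PROPERTY_FILE)
back_label = gettext(table, 8001, RES_TEXT, "Back to &#37;s")
back_image = gettext(table, 8001, RES_IMG, "")
```

    Because the lookup happens when the widget asks for the string, the default passed by the caller is used whenever the property file has no entry for that widget.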

    Pros

    Cons

  3. "@.*;" + property file

    Description: Assuming the "timely access" problem can be overcome, we could get around the "syntax constraint" problem by using an entity-like syntax of our own. That is, we invent something, say we use the "@" symbol like entities use the "&" symbol. Then these things are used throughout the content just like entities would have been used to do localization. This still assumes we have some way to get at the language-specific-substitution text after parsing (so it can't be a parser directive; it may have to be some sort of special element that XUL will recognize and not display). If all this worked, we'd be free to add localizable text anywhere without constraining the element and attribute structure.

    For example

    becomes

    There just needs to be a central single routine that knows where to find the table of localized text strings. It finds "@.*;" sequences and substitutes them. We have to walk the content model after parsing and hand every string to this routine, and widgets have to pass all their text strings through it before they do anything with them.
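    A minimal sketch of that central routine and of the walk over the parsed content. The string table, the nested-dict content model, and the function names are hypothetical. The literal pattern "@.*;" would match greedily across several tokens, so a narrower pattern is assumed here.

```python
import re

# Matches "@name;" where name is letters, digits, underscores, dots or hyphens.
TOKEN = re.compile(r"@([A-Za-z_][\w.-]*);")

# Hypothetical table of localized strings for one language.
STRINGS = {"txtBack": "Back", "txtForward": "Forward"}


def localize(text, table=STRINGS):
    """Replace every known @name; token; leave unknown tokens untouched."""
    return TOKEN.sub(lambda m: table.get(m.group(1), m.group(0)), text)


def localize_tree(node):
    """Walk a parsed content model (nested dicts here) and localize all strings."""
    node["text"] = localize(node.get("text", ""))
    node["attrs"] = {k: localize(v) for k, v in node.get("attrs", {}).items()}
    for child in node.get("children", []):
        localize_tree(child)
    return node


tree = {
    "text": "",
    "attrs": {},
    "children": [
        {"text": "@txtBack;", "attrs": {"tooltip": "@txtForward; @missing;"}},
    ],
}
localize_tree(tree)
```

    Leaving unknown tokens in place, as this sketch does, makes missing translations visible in the UI instead of silently dropping text.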

    Cons:

  4. Using XLinks and XPointers for XUL Localisation (by Daniel McGowan) (ruled out due to technical difficulty)

    Abstract:
    Use XLink and XPointer to reference a specific piece of text in a file that is separate from the base XUL file, so that this text can be easily localized and displayed to the end user in a manner consistent with XPFE (Cross Platform Front End) requirements.

    Pros:
    Since it is all written in vanilla XML, there is no need to create custom file types; the system can accept anything the parser can handle. It maintains the name-value pairing essential for localization. It allows us to add localization and developer notes to the object (e.g. button) and the localized text separately, but maintain a direct link between the two. The text is pulled into the UI elements when the XUL file is parsed. This also addresses the goal of separating markup, style and content.
    Cons:
    This does not leave us with a flat-file solution. However, the file containing the text to be localised is of such a simple format that writing a tool to parse it is a trivial exercise. We are going to need some form of tool to convert native encodings to Unicode character references.
    There are 4 files to track! Actually the language specific DTD is complete and valid as is so it could easily be declared inline in the language specific XML file. The link-attributes has been entitised and could conceivably be inherited from a higher level DTD.
    When reloading downloadable chrome, not all related files can be blown away by the client.

    Here is an example syntax needed for a button UI element.
    UI.XUL:

      <button
         href="&locale;/uilang.xml|id(1234).child(text)"
      >

         <content-info>
         Put comments on button functionality here
         </content-info>

        other xul markup
      </button>

    UI.DTD:

      <!ENTITY  % link-attributes
        "xlink:form     CDATA    #FIXED 'simple'
         href           CDATA    #REQUIRED
         content-info   CDATA    #IMPLIED
         show           CDATA    #FIXED 'embed'
         actuate        CDATA    #FIXED 'auto'"
      >

      <!ELEMENT button (#PCDATA)>
      <!ATTLIST button
          %link-attributes;
          other button specific attributes
      >


    UILANG.XML:

      <loctext id="1234">
        <text>
         Gallia est omnis divisa in partes tres,
         quarum unam incolunt Belgae,
         aliam Aquitani, tertiam qui ipsorum lingua
         Celtae, nostra Galli appellantur.
        </text>
        <note>These are Caesar's first words on Gaul.
               This button should be centered on
               column 1 of the dialog
        </note>
      </loctext>

    UILANG.DTD:

      <!ELEMENT loctext (text, note?)>
      <!ATTLIST loctext id ID #REQUIRED>
      <!ELEMENT text (#PCDATA)>
      <!ELEMENT note (#PCDATA)>

    So, when the <button> tag is parsed, the "simple" XLink href (which is #REQUIRED) is followed automatically (actuate = 'auto'), and the text of the <text> child of the element with id = 1234 is embedded (show = 'embed') in the button. That element lives in the file whose URI is the value of &locale (some more globally set value) followed by /UILANG.XML.
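    A minimal sketch of that resolution step, assuming Python's standard xml.etree.ElementTree module. It is not an XLink or XPointer implementation: it only understands the "uri|id(...).child(...)" locator form used in the button example. The resolve helper, the in-memory documents table, the en-US path, the wrapping strings element, and the sample text are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical language file; a wrapper root element is assumed.
UILANG_XML = """<strings>
  <loctext id="1234">
    <text>Back</text>
    <note>Hypothetical localization note.</note>
  </loctext>
</strings>"""

# Splits "uri|id(1234).child(text)" into its three parts.
LOCATOR = re.compile(
    r"^(?P<uri>[^|]*)\|id\((?P<id>[^)]+)\)\.child\((?P<child>[^)]+)\)$"
)


def resolve(href, documents):
    """Return the text of the named child of the element with the given id."""
    match = LOCATOR.match(href)
    if match is None:
        raise ValueError("unsupported locator: %r" % href)
    root = ET.fromstring(documents[match.group("uri")])
    for element in root.iter():
        if element.get("id") == match.group("id"):
            target = element.find(match.group("child"))
            if target is not None and target.text:
                return target.text.strip()
            return None
    return None


# The locale prefix plays the role of the globally set &locale value.
documents = {"en-US/uilang.xml": UILANG_XML}
label = resolve("en-US/uilang.xml|id(1234).child(text)", documents)
```

    Switching language then amounts to changing the locale prefix of the URI, while the id and the child name in the locator stay the same.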

Comparison (*****: excellent, *: show stopper)

Criteria to examine

  Simple
    XUL + language-specific DTD: *****
    XUL + language-specific property file: **** Need to define resource tags in widget code and widgetID in the XUL file.
    Description: Both core development and localization work shall be made easy and less error prone.

  Leverageable
    XUL + language-specific DTD: *** Need a parser to list, identify, and compare resources.
    XUL + language-specific property file: ***** All resources are in property files, which are flat and easier to leverage.
    Description: Localization results shall be leverageable from release to release.

  Consistent
    XUL + language-specific DTD: *** Two file formats, DTD and property file, to deal with.
    XUL + language-specific property file: ***** Only one file format, property file.
    Description: The scheme will work across modules instead of within the XUL component only.

  Standard compliant
    XUL + language-specific DTD: *****
    XUL + language-specific property file: **** (We can extend the property file format to have a syntax similar to X/MOTIF's application default file.)
    Description: Achievable on all platforms including Unix, Windows, Mac, and others.

  Portable
    XUL + language-specific DTD: *****
    XUL + language-specific property file: *****
    Description: Achievable on all platforms including Unix, Windows, Mac, and others.

  Extensible
    XUL + language-specific DTD: ***** In the same direction as XML.
    XUL + language-specific property file: **** Need to treat content data as the text resource of a label widget.
    Description: The adopted solution will be flexible for customization and future extension.

  Dynamic binding
    XUL + language-specific DTD: *** Resource binding mostly happens in the XML parser.
    XUL + language-specific property file: ***** Resource binding occurs at the last minute.
    Description: Some of the items requiring translation may be dynamic, usually because they require string composition ("Installing item 5 of 10").

  Validatable
    XUL + language-specific DTD: *** (need a DTD parser)
    XUL + language-specific property file: ***** (right on the scene)
    Description: Localizers/translators will be able to validate the localization results.

  Parsable
    XUL + language-specific DTD: *** (DTD file contains XML tags, keywords, and others)
    XUL + language-specific property file: ***** (localizable resources are easily identified)
    Description: It should be possible to unambiguously and automatically determine which embedded items contain localizable text, and what items need to be locked.

  Invisible (Internationalization)
    XUL + language-specific DTD: *** (entity defined in external DTD)
    XUL + language-specific property file: **** (developers need to assign an id to each resource; but the generation of the US/EN property file could be done by the XUL parser.)
    Description: As much as possible, the standard tools that create US UI should emit files that are already localizable, without requiring additional processing.

  Identifiable
    XUL + language-specific DTD: **** (entity names are unique; but we lose them after parsing.)
    XUL + language-specific property file: **** (all resources are identified by the combination of widgetID and resTag; but we must treat content data as the text resource of a label widget)
    Description: All resources shall be uniquely identifiable.

  Dynamic Language Switching
    XUL + language-specific DTD: *** Need to reload the XUL.
    XUL + language-specific property file: **** We can design it to modify localizable attributes only.
    Description: Dynamically switch to a different language and reflect it in the UI. (This does not happen very often.)
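
The "Dynamic binding" criterion concerns strings that are composed at run time, such as "Installing item 5 of 10". A minimal sketch of late-bound composition, assuming per-locale templates with named placeholders so that a translation can reorder the values. The template keys, the compose helper, and the "xx-reordered" pseudo-locale are hypothetical.

```python
# Hypothetical per-locale templates. Named placeholders let a translation
# change the order of the values without touching the calling code.
TEMPLATES = {
    "en-US": {"installProgress": "Installing item {current} of {total}"},
    # Made-up pseudo-locale whose grammar puts the total first.
    "xx-reordered": {"installProgress": "Of {total} items, installing number {current}"},
}


def compose(locale, key, **values):
    """Look up the template at call time and fill in the run-time values."""
    return TEMPLATES[locale][key].format(**values)


message_en = compose("en-US", "installProgress", current=5, total=10)
message_xx = compose("xx-reordered", "installProgress", current=5, total=10)
```

A template bound at parse time would freeze the word order of one language; resolving it when the string is composed avoids that.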


References

Document History (old)