    DITA: Troubleshooting non-English indexes

    This article describes how to configure the sorting of index entries for users of the DITA Open Toolkit (DITA OT). The DITA OT is installed automatically with XMetaL Author Enterprise.

    Rules for sorting index entries vary by language, and often also by region (locale). For details, see: http://en.wikipedia.org/wiki/Collation. For example, under English sorting rules “ñ” is treated as an accented “n”, so the word “Ñacunday” comes before the word “natural”. Under Spanish sorting rules “ñ” is a separate letter that follows “n”, so all words starting with “n” come before words starting with “ñ”, and “natural” comes first.
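
    As a rough illustration of the difference, here is a minimal Python sketch of the two orderings. It is not how the DITA OT sorts indexes. The function names (strip_accents, english_key, spanish_key) and the simplified alphabet are assumptions made only for this example, and real collation rules handle many more cases.

```python
import unicodedata

# Simplified modern Spanish alphabet: "ñ" is its own letter, right after "n".
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"


def strip_accents(text):
    """Drop combining marks, so that "ñ" becomes "n" and "à" becomes "a"."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def english_key(word):
    """English-style key: ignore case and treat accented letters as their base letter."""
    return strip_accents(word.casefold())


def spanish_key(word):
    """Spanish-style key: like english_key, but "ñ" sorts as a letter after "n"."""
    key = []
    for ch in unicodedata.normalize("NFC", word.casefold()):
        base = ch if ch == "ñ" else (strip_accents(ch) or ch)
        position = SPANISH_ALPHABET.find(base)
        # Characters outside the alphabet are placed after all letters.
        key.append(position if position >= 0 else len(SPANISH_ALPHABET) + ord(base[0]))
    return key


words = ["natural", "Ñacunday", "nube"]
print(sorted(words, key=english_key))  # ['Ñacunday', 'natural', 'nube']
print(sorted(words, key=spanish_key))  # ['natural', 'nube', 'Ñacunday']
```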


    • Before you begin, [url=http://forums.XMetaL.com/index.php/topic,939.0.html]locate your active copy of the DITA OT folder[/url].
    • Make sure the root element of each DITA topic and map is tagged with an appropriate xml:lang attribute. See [url=http://forums.xmetal.com/index.php/topic,953.msg2983.html#msg2983]DITA: Indicating the language of content for DITA processing[/url] for details.
    • You do not need to know XSL or any programming language to work with the files mentioned in this article. However, make sure you know how to open configuration files using an [url=http://forums.xmetal.com/index.php/topic,955.0.html]appropriate multilingual text editor[/url].
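
    If you have many topics and maps, the xml:lang check in the second item above can be scripted. The following Python sketch is one possible approach, not part of XMetaL or the DITA OT. The function name root_language and the sample topic are made up for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def root_language(xml_text):
    """Return the xml:lang value of the root element, or None if it is not set."""
    root = ET.fromstring(xml_text)
    return root.get(XML_LANG)


# Hypothetical sample topics, used only to show the check.
tagged = '<topic id="t1" xml:lang="es-ES"><title>Ejemplo</title></topic>'
untagged = '<topic id="t2"><title>Example</title></topic>'

print(root_language(tagged))    # es-ES
print(root_language(untagged))  # None
```

    A result of None means that the root element has no xml:lang attribute and the topic or map should be corrected before you generate output.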

    Index sorting rules for non-PDF Output
    To our knowledge, index sorting rules for non-PDF output are not documented. We are not aware of any major issues with the pre-configured index sorting rules for non-PDF output.

    Index sorting rules for PDF output
    For “XMetaL Enhanced PDF via RenderX” output, index sorting rules are defined for each language in the following folder: DITA_OT\demo\xmfo\cfg\common\index. They are overridden by any files in the following folder: DITA_OT\demo\xmfo\Customization\common\index.

    If you are generating PDF output using a standalone installation of the DITA Open Toolkit, index sorting rules are in the following folder: DITA_OT\demo\fo\cfg\common\index. They are overridden by any files in the following folder: DITA_OT\demo\fo\Customization\common\index.
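
    To see which rules file is actually in effect for a language, you can check the two folders in override order. The Python sketch below illustrates that lookup under stated assumptions. The function name effective_index_file is made up, the folder layout (demo, then xmfo or fo, then cfg or Customization, then common, then index) is assumed, and the file name pattern of language code plus .xml (for example es.xml) is also an assumption that you should check against your own installation.

```python
from pathlib import Path
from typing import Optional


def effective_index_file(dita_ot: Path, lang: str, fo_dir: str = "xmfo") -> Optional[Path]:
    """Return the index rules file in effect for lang, or None if none exists.

    fo_dir is "xmfo" for XMetaL Enhanced PDF output and "fo" for a
    standalone DITA OT installation.
    """
    file_name = f"{lang}.xml"  # assumed naming pattern, e.g. es.xml
    candidates = [
        # Files in the Customization folder override the defaults in cfg.
        dita_ot / "demo" / fo_dir / "Customization" / "common" / "index" / file_name,
        dita_ot / "demo" / fo_dir / "cfg" / "common" / "index" / file_name,
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Example call; replace the path with your active DITA OT folder.
print(effective_index_file(Path("DITA_OT"), "es"))
```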

    The default index sorting rules for some languages are poor. For example, index terms that begin with accented letters might not appear in the index at all. The DITA OT treats a, A, à, and À as separate letters, so the uppercase and lowercase versions of every accented letter must be listed explicitly in the index sorting rules. Other DITA users have generously made available a [url=http://tech.groups.yahoo.com/group/dita-users/files/PDF%20Localization%20Files/index%20files/]set of index rules files[/url] that is much more complete than the default set. The files are free, but downloading them requires Yahoo! registration. Copy the files to your DITA_OT\demo\xmfo\cfg\common\index folder.
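
    Before generating a PDF, you may want to check whether any index terms start with a letter that your rules do not list. The Python sketch below shows the idea only. It does not read the DITA OT rules files, so you supply the listed characters yourself, and the function names with_case_variants and uncovered_first_letters are made up for this example.

```python
import unicodedata


def with_case_variants(chars):
    """Return the given characters plus their uppercase and lowercase forms."""
    variants = set()
    for ch in chars:
        variants.update({ch, ch.upper(), ch.lower()})
    return variants


def uncovered_first_letters(index_terms, covered_chars):
    """Return the first letters of index terms that are missing from covered_chars."""
    covered = set(covered_chars)
    first_letters = set()
    for term in index_terms:
        # NFC keeps an accented letter as one character, e.g. "é" rather than "e" plus a mark.
        text = unicodedata.normalize("NFC", term.strip())
        if text:
            first_letters.add(text[0])
    return sorted(first_letters - covered)


# Hypothetical index terms and a rules set that lists only unaccented letters.
terms = ["Àpropos", "école", "index"]
plain_rules = with_case_variants("abcdefghijklmnopqrstuvwxyz")
print(uncovered_first_letters(terms, plain_rules))  # ['À', 'é']

# Listing "à" and "é" with both case variants covers every term.
fuller_rules = plain_rules | with_case_variants("àé")
print(uncovered_first_letters(terms, fuller_rules))  # []
```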

    For sorting indexes in Chinese and Japanese documents, some customers have told us that the [url=http://www.antennahouse.com/product/i18nindexlib_V2.1/index.html]Antenna House I18n Index Library[/url] gives the best results.


