DITA-OT publishing German characters with HTML symbols

Derek Read

XMetaL actually defaults to using numbered entities in the XML and will use those unless a named entity is defined in your DTD (though various settings can also come into play). However, it looks like you are authoring DITA, and the only named entity in DITA (which people should actually be avoiding now) is nbsp. That means that if XMetaL is inserting entities into your documents (and normally it should not be), they would be inserted as numbered character references in hex form.

In either case, for entities to be inserted into the XML, the encoding of your XML files needs to be something more limited than UTF-8 (the default) or UTF-16. For the characters in question that most likely means US-ASCII, since LATIN1 / ISO-8859-1 supports German characters. If an encoding supports a character (and UTF-8 supports these), the character is saved as the character itself and not as a numbered entity (and named entities need to be defined, as previously stated).
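To illustrate the rule above, here is a minimal Python sketch. It is not XMetaL's code; `serialize` is a hypothetical helper that only mimics the behaviour described: a character is kept when the target encoding can represent it and is otherwise written as a hex numbered character reference.

```python
def serialize(text: str, encoding: str) -> str:
    """Mimic saving `text` in `encoding` (hypothetical helper, not XMetaL's API).

    Characters the encoding can represent are kept as-is; the rest are
    written as hex numbered character references.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)  # raises if the encoding lacks this character
            out.append(ch)
        except UnicodeEncodeError:
            out.append(f"&#x{ord(ch):X};")
    return "".join(out)


if __name__ == "__main__":
    # Compare the three encodings mentioned above on a German word.
    for enc in ("us-ascii", "iso-8859-1", "utf-8"):
        print(enc, serialize("Größe", enc))
```

Only the US-ASCII case should produce references for ö and ß; ISO-8859-1 and UTF-8 keep the characters themselves.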

I think we'll need to see some test files to see if we can reproduce this. The more information you can provide about the setup, the better. The most likely cause for issues here would be where the files are being written to / stored. If a CMS or anything other than a Windows file system is involved, I'd start looking there for issues first.

Or perhaps the issue is with the DITA Open Toolkit. I'm not aware of anything that sounds like this, but it is possible. In that case the XML files themselves (and XMetaL Author Enterprise) are likely not the cause. The XML itself might be fine, but the DITA OT or modifications to it might be doing this.
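One quick way to narrow this down is to check whether the references are already present in the source topics or only show up in the published output. The sketch below is just an illustration of that check, under my own assumptions: `find_char_refs` is a hypothetical helper, and the regular expression is a rough approximation that ignores the five predefined XML entities.

```python
import re
from typing import List

# Numbered references (decimal or hex) and named entities such as &auml;
CHAR_REF = re.compile(r"&#(?:x[0-9A-Fa-f]+|[0-9]+);|&[A-Za-z][A-Za-z0-9]*;")

# Predefined XML entities are expected in any document, so skip them.
PREDEFINED = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"}


def find_char_refs(markup: str) -> List[str]:
    """Return the entity and character references found in `markup`."""
    return [m for m in CHAR_REF.findall(markup) if m not in PREDEFINED]


if __name__ == "__main__":
    source = "<p>Größe &amp; Gewicht</p>"              # sample source topic text
    output = "<p>Gr&ouml;&#xDF;e &amp; Gewicht</p>"    # sample published HTML
    print("source:", find_char_refs(source))
    print("output:", find_char_refs(output))
```

If the list is empty for the source topic but not for the generated HTML, the authoring side is probably fine and the publishing step is the place to look.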