Author Topic: Issues with CJK Editing in XMetaL  (Read 3535 times)
pmasal
Member

Posts: 86


« on: May 09, 2011, 08:32:20 AM »

We are using XMetaL 5.5 EE Author/Server for Documentum Webtop. We are having many issues with the way CJK DITA topic content is displayed in XMetaL.

1. When we import the DITA topic content into our CMS and check it out via XMetaL, most of the characters appear garbled and do not publish correctly after the content is checked in. Please see the attached xmetal_check_out_cms_text_garbled_cjk.JPG.

2. We can fix the above problem temporarily by taking a file system CJK object, opening it in XMetaL, and saving it over the CMS object. However, when we do that, we get garbled text when we insert xrefs or conrefs from other file system objects (ones never imported into the CMS). Please see the attached xmetal_insert_xref_text_garbled_cjk.JPG.

3. Regardless of whether the DITA topic is a CMS or file system object, the dialog box text for inserting cross references to content in the same topic is always garbled, so we can never tell whether we are inserting the correct cross reference. Please see the attached xmetal_check_xref_dialog_text_garbled_cjk.JPG.

Can you please tell me if these are known issues and if there are any fixes we need to apply? Many thanks.
Paul


* xmetal_check_out_cms_text_garbled_cjk.JPG (93.71 KB, 676x545 - viewed 568 times.)

* xmetal_check_xref_dialog_text_garbled_cjk.JPG (70.27 KB, 676x597 - viewed 566 times.)

* xmetal_insert_xref_text_garbled_cjk.JPG (39.16 KB, 635x163 - viewed 545 times.)
Logged
Derek Read
Program Manager (XMetaL)
Administrator
Member

Posts: 2621



WWW
« Reply #1 on: May 09, 2011, 03:16:36 PM »

From the description and the screen captures, my first suspect is the CMS. You are running DCTM, though, and it does support Unicode (UTF-8), so perhaps it is not configured to save as UTF-8?

This might also just be a language rendering issue. If you are running XP do you have the "Install files for East Asian Languages" checkbox checked in Regional and Language Settings (Control Panel)?

That might explain why the characters are not displaying properly (they appear as the "missing character" glyph, a square). It might be easiest to send me actual copies of the files (before and after being in the CMS) so I can confirm their encoding and whether they contain the correct characters. It is possible that the files are actually OK and the software is simply not rendering them properly.

Do you ever see correctly rendered Korean in your documents? (It sounds like you are saying they are OK before they go into / come out of the CMS.) If you do, but at some later point you do not, that smells very much like something is messing with the encoding of the document.

Unfortunately, I do see that our Cross-Reference Properties dialog also displays "missing glyph" characters for Korean syllables, even when a system is configured to support "East Asian Languages", as far as I can tell. I'm not sure why that is the case, but I will let our development team know. I think that's your #3.
Logged
pmasal
Member

Posts: 86


« Reply #2 on: May 10, 2011, 07:40:10 AM »

Hi Derek, thank you. The "Install files for East Asian languages" box is checked in Regional and Language Options, so I'm ok there.

We are running a localization workflow that exports English DITA objects from our CMS, ftp's them to a translation vendor, then imports them back into the CMS. When we receive the files back from the vendor, they are ok in the file system. When we import them into the CMS, they are still ok (and publish just fine). But when I check them out of the CMS to edit them, they display as garbled in XMetaL.

I'm attaching a zip file (EMC_dita_before_after_CJK_issue.zip) with the following:
   CSA_blade_<language_code>_before.dita - Files that come back from vendor (OK)
   CSA_blade_<language_code>_after.dita - Files exported from CMS (not OK)

Thanks for any help you can provide.
Paul
   

* EMC_dita_before_after_CJK_issue.zip (4.59 KB - downloaded 246 times.)
Logged
Derek Read
Program Manager (XMetaL)
Administrator
Member

Posts: 2621



WWW
« Reply #3 on: May 10, 2011, 05:25:15 PM »

The files do appear to be broken after being exported from the CMS. I don't need to open them in XMetaL to see the broken or missing characters (the results are obvious in several Unicode-capable plain text editors I have). Some characters are missing a portion of their bytes (the character encoding has been garbled) or have been replaced entirely with a question mark and one or more square brackets.

I don't think it is our integration with Documentum Webtop that could be responsible for this. There is a slim possibility, but our code merely asks Documentum Webtop to check out a particular file and place it in the Documentum cache. XMetaL then opens the file directly from the cache on your local drive.

I would remove XMetaL from your tests to confirm that this is not an XMetaL issue:

1) Use the Internet Explorer interface for Documentum Webtop to check out a document.
2) Navigate to the location of the checked-out file and open it in a 3rd party tool, such as Notepad, to confirm whether some characters are already messed up there.
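
If eyeballing the file in a text editor is inconclusive, the same check can be scripted. This is only a rough sketch, assuming the checked-out file is supposed to be UTF-8; the function name `check_utf8` and the report keys are made up for illustration and are not part of XMetaL or Documentum.

```python
def check_utf8(data: bytes) -> dict:
    """Report signs of encoding damage in a file that should be UTF-8.

    Hypothetical helper for illustration only.
    """
    report = {
        "valid_utf8": True,
        "error_offset": None,                # byte offset of the first bad sequence
        "replacement_chars": 0,              # U+FFFD characters after decoding
        "question_marks": data.count(b"?"),  # literal 0x3F bytes in the file
    }
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        report["valid_utf8"] = False
        report["error_offset"] = exc.start
        # Decode again leniently so the damaged characters can still be counted.
        text = data.decode("utf-8", errors="replace")
    report["replacement_chars"] = text.count("\ufffd")
    return report

# Example use (the path is a placeholder):
# with open("checked_out_topic.dita", "rb") as f:
#     print(check_utf8(f.read()))
```

Running it on both the "before" and "after" copies makes the comparison easy: a file that fails to decode, or that has many more question marks than its "before" copy, was most likely damaged before XMetaL ever opened it.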
« Last Edit: May 10, 2011, 05:30:41 PM by Derek Read » Logged
Derek Read
Program Manager (XMetaL)
Administrator
Member

Posts: 2621



WWW
« Reply #4 on: May 10, 2011, 05:34:11 PM »

Note that when I say a character is replaced by a question mark, I mean directly in the file itself (not just in the rendering). I have confirmed this by using the binary editor in Visual Studio 2005 to examine the hex codes for each of the bytes in the "after" files.
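
For anyone without a binary editor handy, here is a minimal sketch of the same kind of byte-level inspection. The helper names `hex_bytes` and `question_mark_offsets` are invented for this example. In UTF-8 a Korean syllable takes three bytes, while a literal question mark is the single byte 3F, so a substitution is easy to spot in the hex.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)

def question_mark_offsets(data: bytes) -> list:
    """Return the byte offsets of every literal '?' (0x3F) in the data."""
    return [i for i, b in enumerate(data) if b == 0x3F]

# An intact Korean syllable versus a literal question mark:
print(hex_bytes("한".encode("utf-8")))  # ED 95 9C
print(hex_bytes(b"?"))                  # 3F
```

Question marks that show up in an "after" file at positions where the "before" file has multi-byte sequences mean the characters were lost in the file itself, not just in how it is displayed.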
Logged