Derek Read

Reply to: Copy/paste table from MS Word works in Dita for CALS table but not in HTML table

The DITA authoring solution implements a transformation from HTML to DITA that supports many of the basic DITA elements. This transformation works for any HTML content on the clipboard (when you copy, MS Word puts HTML on the clipboard together with several other formats). Any similar transformation needs to be tuned to the target schema's markup. Keep in mind that every version of MS Word puts something slightly different on the clipboard. The same is true of different browsers and many browser versions: even when the underlying HTML source is the same, many applications put their own version of what they have built in memory onto the clipboard, or some variation of it. In Word's case the HTML on the clipboard contains a lot of proprietary Word (non-HTML) markup, so that is stripped out and the result is converted to XHTML before the transformation is done. At that point the actual transform is essentially XML (XHTML) to XML (DITA).
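
To give a rough idea of what the "strip out the Word markup" step involves, here is a minimal sketch. It is not the code that ships with XMetaL, and the function name stripWordMarkup is made up for this illustration. It is written in plain ES3-style JavaScript on the assumption that something close to it would also be acceptable as JScript, which I have not tested inside XMetaL. It only removes some common Word residue (comments, Office-namespace tags, class/style/lang attributes); it does not produce well-formed XHTML by itself, and a regex approach like this will miss cases a real parser would handle.

```javascript
// Hypothetical helper (not part of XMetaL): removes some common
// Word-specific markup from an HTML string copied from MS Word.
function stripWordMarkup(html) {
  var out = html;
  // Conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
  out = out.replace(/<!--\[if[\s\S]*?<!\[endif\]-->/gi, "");
  // Downlevel-revealed markers, e.g. <![if !supportLists]> and <![endif]>
  out = out.replace(/<!\[(?:if|endif)[^\]]*\]>/gi, "");
  // Remaining ordinary comments, e.g. <!--StartFragment-->
  out = out.replace(/<!--[\s\S]*?-->/g, "");
  // Tags in the o:, v: and w: namespaces (text inside them is kept)
  out = out.replace(/<\/?[ovw]:[^>]*>/gi, "");
  // class, style and lang attributes, quoted or unquoted
  out = out.replace(
    /\s+(?:class|style|lang)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)/gi,
    ""
  );
  return out;
}
```

For example, passing in a fragment such as `<td style='width:1in'><p class=MsoNormal>A<o:p></o:p></p></td>` should leave just `<td><p>A</p></td>`. What Word actually puts on the clipboard varies by version, so the patterns above would need to be checked against your own samples.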

If you want to implement something similar for your own schema, you might have a look at the JScript code that implements the DITA functionality in an XMetaL Author Enterprise installation. It tries to handle as many cases as possible, however, so it is complex. It is also undocumented, as it is not really meant to serve as an example, though it does have a few comments in it.

Some of the cleanup steps might be simplified, or you might get away without doing any cleanup before a transformation to your schema's format, if you know all of the following: your MS Word documents were all created using similar steps for constructing tables (so Word's internal format for them is consistent); they were made using the same version of MS Word (this may not be important); all of your users will be copying from the same version of MS Word (this is important); and users will only be copying tables or content from tables. In an XMetaL Author macro you can access the Windows clipboard using Application.Clipboard, and you can use various paste-related properties, methods and events to manipulate the content you obtain from it so that what is actually pasted into the document matches what is allowed. Essentially, you need to transform it into a table that can be validly inserted into your document. If you think this might be feasible, I would recommend searching for “clipboard”, “paste” and “When text is dropped” in the Programmer's Guide.
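
As an illustration of the "transform it into a table that can be validly inserted" step, here is a minimal sketch that rewrites a simple, already cleaned HTML table as a CALS-style table. The function name htmlTableToCals is hypothetical, and CALS is used only as an example target; your schema's table model would dictate the actual element names. It assumes no nested tables and no row or column spans, and it flattens each cell to plain text. How the input string is obtained from Application.Clipboard and handed back to the paste is deliberately left out, since those details are in the Programmer's Guide.

```javascript
// Hypothetical helper (not part of XMetaL): converts a simple HTML table
// string into CALS-style markup. Assumes no nested tables and no spans.
function htmlTableToCals(xhtml) {
  var rowRe = /<tr\b[^>]*>([\s\S]*?)<\/tr>/gi;
  var cellRe = /<t[dh]\b[^>]*>([\s\S]*?)<\/t[dh]>/gi;
  var rows = [];
  var maxCols = 0;
  var rowMatch, cellMatch;

  while ((rowMatch = rowRe.exec(xhtml)) !== null) {
    var entries = [];
    cellRe.lastIndex = 0; // restart the cell search for each row
    while ((cellMatch = cellRe.exec(rowMatch[1])) !== null) {
      // Drop inline tags inside the cell and trim surrounding whitespace
      var text = cellMatch[1]
        .replace(/<[^>]+>/g, "")
        .replace(/^\s+|\s+$/g, "");
      entries.push("<entry>" + text + "</entry>");
    }
    if (entries.length > maxCols) {
      maxCols = entries.length;
    }
    rows.push("<row>" + entries.join("") + "</row>");
  }

  if (rows.length === 0) {
    return null; // no table rows found; caller should fall back
  }
  // tgroup/@cols is taken from the widest row found
  return '<table><tgroup cols="' + maxCols + '"><tbody>' +
    rows.join("") + "</tbody></tgroup></table>";
}
```

Returning null when no rows are found lets the calling macro fall back to whatever the default paste behavior is, rather than inserting an empty table.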