XMetaL Tips and Tricks


  • Derek Read

    Script Example: Sort Lowercase CALS Tables (DITA, DocBook, etc)

    Participants 5
    Replies 6
    Last Activity 13 years ago

    Products:
    Tested with XMetaL Author Enterprise 5.1.1.017 and 5.5.0.219 on Windows XP with CALS tables inside DITA topics.
    This should probably work with DocBook and similar Schemas that also use standard CALS tables, but I haven't done any testing for these.

    XMetaL Author Essential and Enterprise 8.0
    Far more robust table sorting has been added as native functionality to XMetaL Author Enterprise and XMetaL Author Essential. The feature supports sorting of both CALS and HTML tables as well as lists (elements can be configured to be treated as lists in the CTM settings in your customization). If you need to sort tables or lists, don't bother with this script example; upgrade to the current release. You might still use this sample if you need to sort other types of markup, or as an example of how to pass data to and from MSXML from an XMetaL macro (including for purposes other than sorting).

    Purpose:
    Sort a CALS <table> that has lowercase element names based on the content of cells in a given column (ie: a DITA or DocBook <table>).
    Assumes the author running the script wants to sort the current table based on the cell content of the column their insertion point (cursor) is within. The sort is hard-coded to perform an alphabetic sort from low to high values. With this version the user is therefore not prompted for any additional requirements, though they could be (see MCR file for comments).

    In addition to sorting the table this script demonstrates how to build an XSLT on the fly and pass it, together with a chunk of XML (in this case the table markup), to MSXML, which then returns a string (or perhaps throws an error). The existing table markup is deleted and replaced by the newly sorted markup.
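    For readers who want to see the general shape of the technique without opening the MCR file, here is a minimal sketch. It is not the code from the attachment: the function names buildSortXslt and sortTableWithMsxml are made up for illustration, and the stylesheet assumes every cell is a plain entry with no spanning (namest/nameend), so that entry[N] really is column N. The MSXML calls only work where ActiveXObject is available (for example inside an XMetaL macro on Windows).

```javascript
// Build an XSLT 1.0 stylesheet (as a string) that sorts the <row> elements
// of every <tbody> by the text of the cell at the given 1-based position.
// Hypothetical helper; the stylesheet in the actual MCR file may differ.
function buildSortXslt(columnIndex) {
  var col = parseInt(columnIndex, 10);
  if (isNaN(col) || col < 1) {
    throw new Error("columnIndex must be a positive integer");
  }
  var x = "";
  x += '<xsl:stylesheet version="1.0"';
  x += ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">\n';
  x += '<xsl:output method="xml" omit-xml-declaration="yes"/>\n';
  // Identity template: copy everything that is not handled below,
  // including comments and processing instructions.
  x += '<xsl:template match="@*|node()">\n';
  x += '  <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>\n';
  x += '</xsl:template>\n';
  // Sort only the rows inside <tbody>, low to high, compared as text.
  x += '<xsl:template match="tbody">\n';
  x += '  <xsl:copy>\n';
  x += '    <xsl:apply-templates select="@*"/>\n';
  x += '    <xsl:apply-templates select="row">\n';
  x += '      <xsl:sort select="entry[' + col + ']" data-type="text" order="ascending"/>\n';
  x += '    </xsl:apply-templates>\n';
  x += '  </xsl:copy>\n';
  x += '</xsl:template>\n';
  x += '</xsl:stylesheet>\n';
  return x;
}

// Run the transform through MSXML 4. Returns the transformed markup as a
// string, or "" if either the table or the stylesheet fails to parse.
function sortTableWithMsxml(tableStr, columnIndex) {
  var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  var xslDoc = new ActiveXObject("Msxml2.DOMDocument.4.0");
  xmlDoc.async = false;
  xslDoc.async = false;
  if (!xmlDoc.loadXML(tableStr) || !xslDoc.loadXML(buildSortXslt(columnIndex))) {
    return "";
  }
  return xmlDoc.transformNode(xslDoc);
}
```

    Building the stylesheet as a string keeps the column number a simple substitution. Note that this sketch only carries over row elements inside tbody, so a comment sitting directly between two rows would be dropped.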

    Note: If the sort did not accomplish anything (ie: the returned markup is the same because it didn't need sorting) the existing markup is still replaced.

    Demo Code:
    Before using this script please read the notes and comments in the MCR file (which also includes some legal stuff). Basically, this code is provided as a demo and should be treated as if it were completely untested. I have tested it as best I can, but it has not gone through our regular rigorous test process. You may also wish to adapt this code by altering the XSLT and / or the script logic itself as you may find that the functionality does not meet the exact needs of your end users, etc.

    Please also do not use this script without the permission of the people that maintain your XMetaL installation (if that isn't you). Although the possibility is low given the way I have coded this it could conflict with special customizations or scripts, 3rd party tools or plug-ins, a specific work-flow they have set up and wish you to follow, or any number of other things I cannot even guess at. I would recommend telling them about your wish to have something like this and let them integrate it and test it for you.

    Installation:
    1. Unzip attached ZIP file.
    2. Place MCR file in the Startup folder of your XMetaL Author Enterprise installation.
    3. Restart XMetaL Author Enterprise if already running.

    Note: You must be logged in to download attachments on this forum.

    Requirements:
    The script relies on the availability of Microsoft's MSXML COM control (ActiveX) version 4 (most systems will probably have versions 3 thru 5+). This should be the case on the vast majority of Windows XP and Vista machines. However, if the script throws an error regarding MSXML, the installed MSXML version is the first thing to check.
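    If you want to check which MSXML versions a machine can actually create, something along these lines could help. This is only a sketch: findMsxmlProgId and its create parameter are hypothetical (the parameter exists so the logic can be tried outside Windows), and whether the sort script behaves the same on versions other than 4 is untested.

```javascript
// Try MSXML ProgIDs in order of preference and return the first one that
// can be created, or null if none can. The `create` argument is a
// hypothetical injection point; inside a macro it can be omitted, in which
// case ActiveXObject is used.
function findMsxmlProgId(create) {
  var progIds = [
    "Msxml2.DOMDocument.4.0", // the version the script is written against
    "Msxml2.DOMDocument.6.0",
    "Msxml2.DOMDocument.3.0"
  ];
  var factory = create || function (progId) {
    return new ActiveXObject(progId);
  };
  for (var i = 0; i < progIds.length; i++) {
    try {
      factory(progIds[i]);
      return progIds[i];
    } catch (e) {
      // This version is not registered; try the next one.
    }
  }
  return null;
}
```

    A null result means none of the listed versions could be instantiated, which would explain an MSXML-related error from the script.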

    Usage:
    Warning: Do not use this script with real content until you are satisfied it passes your testing.
    1. Open or create a new DITA test document that contains a <table>.
    2. Place cursor inside the <table> in the column you wish to sort by.
    3. Run the script called “Demo: Lowercase CALS Table Sort”.
    4. If your document is messed up you should probably be able to undo the entire script with one Undo operation.

    Modifications, Extending Code, and Known Limitations:
    See comments in the MCR file.


    dcramer

    Reply to: Script Example: Sort Lowercase CALS Tables (DITA, DocBook, etc)

    I just tested this on a CALS table in a DocBook document in XMetaL 4.6. It works great IF there are no entities in the table (e.g. &foo; where you've declared the entity in the DOCTYPE or DTD). If there are, it fails by having the table just disappear (i.e. it pastes over the old table with an empty string). So that's a pretty serious limitation and would affect any attempt to use an xslt to massage the markup like this, which is a potentially powerful tool. Obviously the parser can't resolve the entities without a doctype, but if it resolves the entities then it's going to replace the table with the entities resolved. Another problem for the DITA folk is that this will never work with specialization without access to the DTD.

    Option 1: Obfuscate the entities before processing by replacing & in tableStr with @@@ before running the xslt and then replacing @@@ with & when done.
    Advantage: Entities remain unresolved in the result of the sort.
    Disadvantage: If any entities were in the sort column then the sort is inaccurate.

    Option 2: Get the DOCTYPE (is that possible?) and prepend it to tableStr.
    Advantage: MSXML can resolve the entities and process tableStr.
    Disadvantages: Entities are resolved in the result, defeating the purpose of using them in the first place. Also, if you use catalog files to find the dtd and other resources, would MSXML know about them?

    Option 2.a: Munge tableStr replacing &foo; with &foo;. Get the DOCTYPE (assuming that's possible) and prepend it to tableStr. Process with the xslt letting msxml resolve the entities (hopefully it can find the dtd). Once done, replace bar with &foo;. Note that this assumes you're only using entities for fairly simple inline cases and you don't do something like blah“>, which would probably trip up your xslt.
    Advantage: Can sort based on entity values and return a result with the unresolved entities in place.
    Disadvantage: Sounds hard. Can it be made to work with catalog files? Is it possible to get the text of the DOCTYPE?


    Derek Read


    Interesting find. I didn't even think about entities, and because MSXML is involved here that adds to the complexity. Your logic on the workarounds seems doable in script, but I'm not sure I will try yet.

    At this point I think the script is useful for the majority of people (the 80/20 rule probably applies, or maybe even 90/10 as the majority of our DITA clients are not specializing yet, though there are quite a few). A large portion of the population also doesn't use entities so I guess they'd be fine too.

    However, I suppose I should at least try to stop the table from being deleted. I assume Undo works to restore the table?

    There are other completely different approaches I've tried, one of which was a pure JScript string manipulation. It was about 1/2 done maybe when I had what I thought at the time was a flash of genius to use XSLT instead. I might go back to that other code and see if it might work better or be easier to implement (and ultimately it would be nice to have this type of functionality right inside the product in the form of some new APIs, though I don't see that happening anytime soon).

    One nice thing about the XSLT version is that it would likely be fairly easy to extend it to sort on more than one column at a time (because the MSXML support for XSLT has such a thing built-in).
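    To illustrate the multi-column idea: XSLT 1.0 lets an xsl:apply-templates contain several xsl:sort children, where the first is the primary key and each following one breaks ties. A rough sketch of generating those keys is below. The helper name buildSortKeys and the shape of its argument (index, numeric, descending) are invented for this example and are not part of the MCR file.

```javascript
// Build one <xsl:sort> element per requested column. The first entry in
// `columns` becomes the primary sort key, the second breaks ties, etc.
// Each item: { index: 1-based column, numeric: bool, descending: bool }.
function buildSortKeys(columns) {
  var keys = "";
  for (var i = 0; i < columns.length; i++) {
    var c = columns[i];
    keys += '<xsl:sort select="entry[' + c.index + ']"' +
            ' data-type="' + (c.numeric ? "number" : "text") + '"' +
            ' order="' + (c.descending ? "descending" : "ascending") + '"/>\n';
  }
  return keys;
}

// Example: sort by column 2 as text, then by column 1 as a number, high to low.
var exampleKeys = buildSortKeys([
  { index: 2 },
  { index: 1, numeric: true, descending: true }
]);
```

    The resulting string would be placed inside the xsl:apply-templates that selects the rows, in place of a single hard-coded xsl:sort.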


    dcramer


    I like the xslt approach very much because it would be easy to create variants on this macro to transform other structures in ways that would be difficult or impossible using JScript and DOM, but the entity thing has to be addressed. I think Option 1 would be easy and would take care of most of the situations I would need it for. I'll take a stab at that and let you know how it goes. For Option 2.a. I would need help since I wouldn't know how to capture the DOCTYPE (not just the public and system identifiers but also the locally declared entities) as a string.

    Even better would be to use Saxon 9 to do the xslt because then you could use xslt 2.0 and much more interesting things. But I have no idea how you'd go about doing that.

    Thanks,
    David


    Derek Read


    Here's a quick fix to stop the table from being removed when it contains entities.

    It doesn't actually fix the issue (I'll let dcramer see if he can work that out), but at least this will tell you the table couldn't be sorted and stops it from disappearing in the case of entities and possibly other cases:

    Old
    [code] rngWork.TypeText(sortedTable);[/code]

    New
    [code] if (sortedTable > "") {
        rngWork.TypeText(sortedTable);
    }
    else {
        Application.Alert("This script is not smart enough to sort your table.");
    }[/code]
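    The same guard could also cover the case mentioned in the first post, where the table is replaced even though the sort changed nothing. A small sketch follows, with a hypothetical helper name (shouldReplaceTable). It compares plain strings, so it is conservative: if MSXML serializes the markup differently from the original (whitespace, quoting), the table would still be replaced even when the row order is unchanged.

```javascript
// Decide whether the sorted markup should be typed over the original.
// Returns false when the transform produced nothing usable or when the
// result is character-for-character identical to the original.
function shouldReplaceTable(originalTable, sortedTable) {
  if (typeof sortedTable !== "string" || sortedTable === "") {
    return false; // transform failed or returned an empty string
  }
  return sortedTable !== originalTable;
}
```

    In the macro this would wrap the existing call, along the lines of: if (shouldReplaceTable(tableStr, sortedTable)) { rngWork.TypeText(sortedTable); }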


    dcramer


    Here's a fix to hide entity references:

    [code]
    244a245,259
    >
    > // HACK ALERT!!
    > // Here we replace any ampersands in the source
    > // XML with a placeholder value, “@@[email protected]@”
    > // to keep MSXML from trying to resolve the entity.
    > // We'll change it back later. However we first
    > // make sure that the string @@[email protected]@ does
    > // not already exist in the table.
    > if(tableStr.match(/@@[email protected]@/)){
    > Application.Alert("This table already contains the string @@[email protected]@.\nThis macro reserves the [email protected]@[email protected]@ to hide entity references.");
    > return;
    > }else{
    > tableStr = tableStr.replace(/&/g,'@@[email protected]@');
    > }
    >
    269c284
    < rngWork.TypeText(sortedTable);
    ---
    > rngWork.TypeText(sortedTable.replace(/@@[email protected]@/g,'&'));
    [/code]
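    Since the diff above is hard to read on its own, here is the same idea as a standalone sketch. The placeholder value "@@@AMP@@@" and the function names are stand-ins chosen for this example, not necessarily what the patch uses. As in the diff, every ampersand is masked (including &amp; and character references), so rows whose sort cell contains an entity are ordered by the placeholder text rather than by the entity's value.

```javascript
// Hypothetical placeholder; any string that cannot occur in the table works.
var AMP_PLACEHOLDER = "@@@AMP@@@";

// Hide every "&" so the XML parser never sees an entity reference.
// Returns null if the placeholder already occurs in the input, because the
// round trip would then be ambiguous.
function maskAmpersands(xmlStr) {
  if (xmlStr.indexOf(AMP_PLACEHOLDER) !== -1) {
    return null;
  }
  return xmlStr.split("&").join(AMP_PLACEHOLDER);
}

// Restore the ampersands after the transform has run.
function unmaskAmpersands(xmlStr) {
  return xmlStr.split(AMP_PLACEHOLDER).join("&");
}

// Round trip on a small fragment.
var masked = maskAmpersands("<entry>&foo; &amp; more</entry>");
var restored = unmaskAmpersands(masked);
```

    The caller would mask tableStr before handing it to MSXML, unmask the returned string before typing it back into the document, and abort with a message when maskAmpersands returns null.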


    dcramer


    Another suggested change:

    [code]
    < xsltStr += 'n';

    > xsltStr += '
    n';
    [/code]

    Otherwise you lose comments and PIs in the table.

