Pages: 1
Print
Author Topic: Unique ids  (Read 7125 times)
gcrews
Member

Posts: 265


« on: March 30, 2011, 12:59:48 PM »

Do the unique ids really have to be long, 32-digit random ids? They are unfriendly for authors to work with and they compress badly in HTML output, whether that is gzip over HTTP or CHM compression. I did a quick search across all our DITA content and, despite those GUIDs, we have over 25,000 duplicate ids due to file duplication and content copying. From what I understand, the ids only have to be unique within the current file.

Would it not be better to have something like this in the makeDitaUniqueID and makeDitaUniqueIDME functions?
   //unique id within file
   var doc = nde.ownerDocument;
   var num = 0;
   while(doc.getElementById(strElementName + "_" + num))
      num++;
   var newId = strElementName + "_" + num;
   return newId;

It may not be the quickest way, but you can't rely on element position or anything like that because it may change as a document is edited. It's probably faster in the long run to auto-generate shorter ids like that than to spend the time parsing the longer files that result from the longer GUIDs.
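
As a rough illustration of the loop above outside XMetaL, here is a minimal sketch in plain JavaScript. The `makeShortId` name and the `existingIds` lookup object are hypothetical stand-ins for the real function and for `doc.getElementById`; they are not part of any XMetaL API.

```javascript
// Hypothetical stand-in for the proposed per-file id generator.
// existingIds: object whose keys are the ids already used in the file
// (this replaces doc.getElementById, which needs a real document).
function makeShortId(elementName, existingIds) {
  var num = 0;
  // Count upward until an unused "<element>_<n>" value is found.
  while (Object.prototype.hasOwnProperty.call(existingIds, elementName + "_" + num)) {
    num++;
  }
  var newId = elementName + "_" + num;
  existingIds[newId] = true; // remember it so the next call skips it
  return newId;
}

var used = { "li_0": true, "li_1": true };
var first = makeShortId("li", used);   // "li_2"
var second = makeShortId("li", used);  // "li_3"
var other = makeShortId("step", used); // "step_0"
```

The cost grows with the number of ids already used for a given element name, since the loop restarts from zero on every call.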

It looks like I can override the auto id generation by using a database specializer. It doesn’t look like the dita map auto id generation has a similar mechanism for changing the behavior though.
« Last Edit: March 30, 2011, 01:01:32 PM by gcrews » Logged
Derek Read
Program Manager (XMetaL)
Administrator
Member

Posts: 2621



WWW
« Reply #1 on: March 31, 2011, 01:57:36 PM »

This feature was well-intentioned and we have so far chosen to leave it as is for consistency and because we have not heard of anyone having the types of issues you are raising here. Is the DITA OT really slower when parsing id values like this during output creation? I'll see if I can reproduce that. When this feature was created DITA was suggesting that id values should be universally unique (hence our use of GUID-like id values). Early versions of the DITA OT would fail with fatal errors if this was not the case (that has not been true for a long time now, and I believe id values are now rewritten during the merging phases of map processing by the DITA OT).

You can override this behavior in an "extender" script file. Various entry points are provided, including one that allows customizers to alter the format of the values used for the auto-id functionality. This is not documented as it is typically something our partners would be expected to do, however, you can do it yourself if you are familiar with JScript.

There is a way to extend the product to create id values of your own design. A sample file is included here with (very) basic documentation inside it:
<XMetaL install path>\Author\DITA\XACs\ditabase\ditabase_ditabase.off.js

To properly enable this functionality you need to:
1. Clone that file to the XAC folder containing the topic type(s) you use.

2. Rename the file so that it matches the name of the DTD in that folder but with a .js file extension.
Example:
 <XMetaL install path>\Author\DITA\XACs\task\task_ditabase.js

3. Inside the file change line 47:
Old:  function ditabase_ditabase()
New:  function task_ditabase()

4. Change line 70 and the script between lines 71 and 74 to add your own override code (the stuff in red is what you will change):
 ditabase_ditabase.prototype.makeUniqueId = function(domNode)
  {
   // Override factory auto-id generator function
    return "ID" + strGetGUID() + "_" + domNode.nodeName;

  }


Example 1:
This one uses the standard JScript function Date(), so there is no need to know your way around any of the XMetaL Author APIs.

 task_ditabase.prototype.makeUniqueId = function(domNode)
  {
   // Override factory auto-id generator function
   var d = new Date();
   return "gcrews" + "_" + d.getTime();

  }


Example 2:
This one is similar to Example 1 above but uses an additional XMetaL Author API called UniqueAttributeValue. It also uses the domNode value that is passed in to this prototype function (the function we are providing to override the default auto-id functionality), which allows you to obtain information about the element the ID is being set for. In this case I use it twice: once to obtain information about the element itself (the nodeName), and once to access the document node containing the element so that I can call UniqueAttributeValue on the correct document. You may wish to have a look at the XMetaL Developer Programmers Guide for documentation on these things.

  task_ditabase.prototype.makeUniqueId = function(domNode)
  {
   // Override factory auto-id generator function
   var prefix = domNode.ownerDocument.doctype.name + "_" + domNode.nodeName + "_";
   var docUniqueID = domNode.ownerDocument.UniqueAttributeValue("id",prefix,1);
   return docUniqueID;
  }


5. Remove all of the other prototype functions that you are not using (in your case all of them) except the one you changed and the one at the very top that now looks like this:
function task_ditabase()
{
}


6. Save the file.

7. Repeat this process, if desired, for each document type you want to enable this behavior for.
  
These scripts are loaded dynamically when a document is opened or a new document is created from a template, so to test this you can simply save the .js file and then select File > New (selecting the correct topic type, in this case a task). Make sure you test thoroughly on all document types you make this change for. If your script creates an invalid value, XMetaL will catch it during validation (F9 or during a save) but not inside this script.
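
Because an invalid value is only caught later by validation, one option is to check the generated string inside the override before returning it. The following is a minimal sketch under stated assumptions: `isSafeId` and `makeTimestampId` are hypothetical helper names, the pattern is a deliberately conservative ASCII-only subset of the XML name rules (not the full definition), and the counter is added because two ids created in the same millisecond would otherwise collide with the Date-based approach in Example 1.

```javascript
// Conservative ASCII subset of XML name syntax: a letter or underscore
// first, then letters, digits, dots, hyphens or underscores.
// (Real XML names allow more characters; this is intentionally strict.)
var SAFE_ID = /^[A-Za-z_][A-Za-z0-9._-]*$/;

function isSafeId(value) {
  return typeof value === "string" && SAFE_ID.test(value);
}

var idCounter = 0; // hypothetical per-session counter

// Timestamp plus counter, so ids made in the same millisecond differ.
function makeTimestampId(prefix) {
  idCounter++;
  var candidate = prefix + "_" + new Date().getTime() + "_" + idCounter;
  if (!isSafeId(candidate)) {
    // Fall back to a value that is known to match the pattern.
    candidate = "id_" + new Date().getTime() + "_" + idCounter;
  }
  return candidate;
}

var a = makeTimestampId("gcrews");
var b = makeTimestampId("gcrews");
var c = makeTimestampId("9 bad prefix");
```

Inside the extender file, the body of makeUniqueId could return the result of such a helper instead of building the string directly. Whether that suits your content is something to test per document type.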
« Last Edit: March 31, 2011, 02:12:04 PM by Derek Read » Logged
gcrews
Member

Posts: 265


« Reply #2 on: April 01, 2011, 12:47:39 PM »

Thanks for the reply. I wasn't saying that it was significantly slower by any means. I was only comparing it with the added time of the while loop cycling through ids until it finds one that is not used, which probably takes less than 200 milliseconds or so. It may slightly decrease editing performance whenever a new unique ID is created, but that would probably be made up in document loading time by not having to parse the extra data.

Thanks for the detailed info on how to override the behavior. I kind of figured that's about how to do it. I was wondering about the ditamap editor though. There is a makeDitaUniqueIdForME function at about line 230 of ditamap_utils.js. It doesn't look like it makes any call out to an extension. I guess the guids for the maps aren't too much of a bother and we can live with them.

The long guids really start to clutter up the topic files if you ever have to look at the code view, use a text editor, or look at the generated HTML. Is it common to leave the defaults for which tags get auto-generated ids? Or is it more common to disable auto id generation except for the main element? Our content has over 35,000 guids on li tags, 8,000 guids on ul tags, 3,000 guids on step tags, etc., none of which are ever referenced by anything. Any tag that is referenced by a conref, href, or xref has had the id changed to something nicer.
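
For anyone wanting to gather similar per-tag counts, here is a minimal sketch in plain JavaScript. The `countIdsByElement` name and the sample string are made up for illustration, and the approach is regex-based rather than a real XML parse, so it is only an approximation (comments, CDATA sections and `>` inside attribute values are not handled).

```javascript
// Count id attributes per element name in a chunk of XML text.
// Regex-based, so the result is approximate (see the note above).
function countIdsByElement(xmlText) {
  var counts = {};
  // An opening tag name, then any attributes, then a whitespace-led id="...".
  var tagWithId = /<([A-Za-z][\w.-]*)\b[^>]*?\sid\s*=\s*(?:"[^"]*"|'[^']*')/g;
  var match;
  while ((match = tagWithId.exec(xmlText)) !== null) {
    var name = match[1];
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}

var sample =
  '<task id="t1"><steps><step id="s1"><cmd>Do it</cmd></step>' +
  '<step id="s2"><cmd>Again</cmd></step></steps>' +
  '<ul id="u1"><li id="a"/><li id="b"/><li>no id</li></ul></task>';
var result = countIdsByElement(sample);
```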
« Last Edit: April 01, 2011, 12:49:24 PM by gcrews » Logged
Derek Read
Program Manager (XMetaL)
Administrator
Member

Posts: 2621



WWW
« Reply #3 on: April 01, 2011, 02:28:31 PM »

We merely set up the defaults for which elements to auto-id in order to facilitate reuse or linking to elements that we think are likely to be used for those purposes. The elements that were chosen were based on some early surveys we performed. Knowing that different people will need more / fewer elements to have id values automatically set is why the interface for choosing those elements is provided in Tools > DITA Options.

So, if you don't reference elements (either for conref/reuse or cross referencing) then of course you don't need any id values at all, except the topic's root. You need an @id on the root for the document to be valid because in the DITA DTDs the @id for the root element in each topic type is defined as a true ID and #REQUIRED whereas all of the other @id are CDATA and #IMPLIED.

If you do need @id on some elements but prefer to manually set them to nice values, then maybe in your case you should disable auto-id for everything except <concept>, <reference>, <task> and <topic>.
Logged