General XMetaL Discussion


  • russurquhart1

    Does using associative arrays slow things down?

    Participants 3
    Replies 4
    Last Activity 12 years, 9 months ago

    With the help I got here, I was able to get my script working. I then tried tightening up the code and reduced the line count by using associative arrays.

    The program first creates a node list of elements having a Pointer attribute. Then another node list is created of elements having a Target attribute. This Target node list is then traversed, and the value of the Target attribute is put into the array, indexed by its value.

    Then I go through each of the Pointer nodes and check whether the value of its Pointer attribute exists in the Target array.

    The code is pretty simple, but it seems to be running pretty slowly. (Admittedly, there are a lot of elements containing Pointers and Targets.) It seems to take about 5 minutes to create the Target array, and then another 5 minutes to compare all the Pointer values to the Target array.

    My code is below. Can anyone suggest how to speed this up?

    Thanks again!

    Russ

    ————————

              var app_str = "";
              var target_locations = new Array();

              var href_nodelist = ActiveDocument.getNodesByXpath("//*[@Pointer]");
              var nodecount = href_nodelist.length;
              Application.Alert("href count is " + nodecount);
              var target_nodelist = ActiveDocument.getNodesByXpath("//*[@Target]");
              var target_node_count = target_nodelist.length;
              Application.Alert("target count is " + target_node_count);
              Application.Alert("Building target table");
              // Index every Target value by itself so it can be looked up by key.
              for (var i = 0; i < target_node_count; i++)
              {
                  var item = target_nodelist.item(i);
                  var item_val = item.getAttribute("Target");
                  target_locations[item_val] = item_val;
              }
              Application.Alert("Target Table built");
              var rng = ActiveDocument.Range;

              // Report every Pointer value that has no matching Target.
              for (i = 0; i < nodecount; i++)
              {
                  item = href_nodelist.item(i);
                  item_val = item.getAttribute("Pointer");

                  if (target_locations[item_val] === undefined)
                  {
                      app_str += "Target not found in this document: " + i + " - " + item_val + "\n";
                  }
              }
              Application.Alert(app_str);
              ActiveDocument.FormattingUpdating = true;

    Reply

    Derek Read

    Reply to: Does using associative arrays slow things down?

    I don't think arrays are your issue (arrays in JScript are fast and would not be the bottleneck most of the time).

    I think the most likely bottleneck is the use of ActiveDocument.getNodesByXPath, as it can be a fairly expensive call to make on large and complex documents (that is, many nodes and/or a complex node structure).

    You may also wish to look at your usage of FormattingUpdating. Turning that off and then on will cause the document to be reformatted and if you are not modifying the document (as in your example) then calling it should not be necessary and would trigger an unnecessary complete reformat of the document. However, if you are making changes to the document this API will stop the document from flickering all over the place, and also in this case it will actually be faster as the document reformatting will only take place once at the very end of your script. I don't think your script is doing anything that would require this API (though perhaps this is not your real script).
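
A minimal sketch of the off-then-on pattern described above, for the case where a script really does modify the document. `withFormattingSuspended` is a hypothetical helper name, not an XMetaL API; `doc` stands in for `ActiveDocument`, and the only member it is assumed to have is the writable `FormattingUpdating` property used in the original script. The try/finally is there so formatting gets switched back on even if the work throws.

```javascript
// Hypothetical helper (not part of XMetaL): run `work` with formatting
// updates switched off, then switch them back on exactly once.
// `doc` is assumed to expose a writable FormattingUpdating property.
function withFormattingSuspended(doc, work) {
    doc.FormattingUpdating = false;
    try {
        return work(doc);
    } finally {
        // Runs whether `work` returned normally or threw.
        doc.FormattingUpdating = true;
    }
}
```

In a macro this might be called as `withFormattingSuspended(ActiveDocument, function (doc) { /* edits go here */ });`. A read-only check such as the Pointer/Target report would simply not use it.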

    It is possible to obtain identical results using the Range method instead of getNodesByXPath by visiting each element in the document in turn (or by jumping to elements with a specific name). Range is generally quite fast when used this way. Also, putting some of this inside a function might make things slightly faster (that part is relying directly on the JScript engine).

    [code]//XMetaL Script Language JScript:

    //Define a new function for searching within arrays
    Array.prototype.exists = function(strVal){
        for (var i=0; i<this.length; i++) {
            if (this[i] == strVal) {
                return true;
            }
        }
        return false;
    }

    /*
    Visit every element in the document and put all Pointer
    and Target values into two different arrays.

    Note: Might be slightly faster (depending on document
    size and complexity) if rewritten to visit specific
    elements rather than visiting all elements.
    */

    var rng = ActiveDocument.Range;
    var pointerAttrs = new Array();
    var targetAttrs = new Array();
    var pointerAttr;
    var targetAttr;
    var i1 = 0;
    var i2 = 0;
    var msg = "";

    rng.MoveToDocumentStart();
    while (rng.MoveToElement()) {
        pointerAttr = rng.ContainerAttribute("Pointer");
        if (pointerAttr > "") {
            pointerAttrs[i1] = pointerAttr;
            i1++;
        }
        /*
        Note: Not sure if it is really necessary to
        do the following check. Putting empty values into
        the array might be slightly faster. However, we
        check these later on (when searching) so that tradeoff
        might go either way. If speed is of real importance
        test those two scenarios.
        */
        targetAttr = rng.ContainerAttribute("Target");
        if (targetAttr > "") {
            targetAttrs[i2] = targetAttr;
            i2++;
        }
    }

    //Now that we have two arrays containing all values
    //call the prototype function to see if Target values exist for each Pointer
    for (var k=0; k<pointerAttrs.length; k++) {
        if (!targetAttrs.exists(pointerAttrs[k])) {
            msg += "Target not found in this document: " + k + " - " + pointerAttrs[k] + "\n";
        }
    }
    if (msg > "") {
        Application.MessageBox(msg,64);
    }[/code]
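
The `exists` helper above scans the whole Target array once per Pointer, so the comparison step costs roughly (number of pointers) times (number of targets) string comparisons. If that ever becomes the slow part, a keyed lookup in a plain object (the "associative array" idea from the original post) avoids the inner scan. The sketch below is plain JScript-compatible JavaScript with no XMetaL calls. `findMissingTargets` is a hypothetical name, and its two inputs are assumed to be arrays of strings like `pointerAttrs` and `targetAttrs`.

```javascript
// Hypothetical helper: returns one report line for each Pointer value
// that has no matching Target value. Both inputs are arrays of strings.
function findMissingTargets(pointerValues, targetValues) {
    var seen = {};      // plain object used as a string-keyed set
    var missing = [];
    var i;

    // The "$" prefix keeps attribute values such as "constructor" or
    // "length" from colliding with built-in property names.
    for (i = 0; i < targetValues.length; i++) {
        seen["$" + targetValues[i]] = true;
    }
    for (i = 0; i < pointerValues.length; i++) {
        if (seen["$" + pointerValues[i]] !== true) {
            missing.push("Target not found in this document: " + i + " - " + pointerValues[i]);
        }
    }
    return missing;
}
```

For example, `findMissingTargets(["a", "b", "constructor"], ["a"])` returns two lines, one for index 1 (`b`) and one for index 2 (`constructor`). Joining the result with `"\n"` gives a message in the same format as `msg` above.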

    Reply

    russurquhart1

    Reply to: Does using associative arrays slow things down?

    Thanks!

    That made a hell of a difference.

    So is there a problem with using XPath? (I'm getting the feeling that maybe I shouldn't use it at all.) Should XPath only be used in special cases, then?

    Thanks!

    Russ

    Reply

    Derek Read

    Reply to: Does using associative arrays slow things down?

    Running two tests on your script as it is written on a particular test file (created with basically random content and approximately 2500 elements) gives me vastly different results, as follows:

    As written (FormattingUpdating is turned off then on): about 35 seconds (the vast majority of this time is spent reformatting the entire document)

    FormattingUpdating lines commented out but otherwise identical: 3 seconds for the first run, less than 1 second for subsequent runs (probably because either the JScript engine has cached the script, or because the engine is already initialized and Windows can run it faster after caching the DLL)
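
Timings like these can be collected from inside the script itself. A minimal sketch, assuming the millisecond resolution of the standard `Date` object is good enough; `timeIt` is a hypothetical helper name, not an XMetaL API.

```javascript
// Hypothetical helper: runs `fn` once and returns the elapsed
// wall-clock time in milliseconds, using only the Date object.
function timeIt(fn) {
    var start = new Date().getTime();
    fn();
    return new Date().getTime() - start;
}
```

Each phase of a macro (building the node lists, filling the table, comparing values) could be wrapped separately, for example `Application.Alert("Build: " + timeIt(buildTable) + " ms");`, where `buildTable` is whatever function holds that phase. That shows which phase the time is actually going to.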

    In this particular case what you are searching for is fairly simple (ie: it does not require a complicated XPath statement and can be easily transformed into some basic logic instead, as I've done). There is no problem with using XPath, however, there can be a trade-off in speed. Only testing will show that for sure. The difference will depend on the size and complexity of the document and the complexity of what needs to be found (ie: the 3rd paragraph of any element containing the attribute foo with the exact value bar would be pretty hard to write using scripting logic, though it could be done of course).

    Reply

    russurquhart1

    Reply to: Does using associative arrays slow things down?

    That's interesting. (I have yet to try taking the formatting calls out. You're right, I don't really need them; they came from another bit of code I copied.)

    However, when I was running my script as it was, with the XPath expressions, it would take, for this document, 5 minutes to copy the elements to the array (the length was reported almost immediately) and another 5 minutes to compare the pointers to the targets.

    My document has about the same number of found pointers and targets, plus a bunch of other elements, so maybe that is it, but there was a GREAT time difference.

    I'll look some more on my end. Thanks for the heads-up and the alternate algorithm.

    Russ

    Reply
