General XMetaL Discussion
Bradley Shoebottom May 17, 2012 at 2:45 pm
XMetal Author Enterprise 7.0 (188.8.131.52) Word Count Algorithm
Can you tell me which words are counted by the Word Count feature? First, which elements are included? Second, what does Word Count consider to be a word and not a word? I have been using a number of word-counting tools and I want to be able to explain what was or was not counted.
Derek Read May 18, 2012 at 12:03 am
In this case a “word” is defined as one or more characters followed or preceded by a white-space character and contained inside an element that allows text.
The basic logic is to load the entire document as a string, then:
1. Remove all tags, comments and processing instructions from the document (anything starting with < and ending with >).
2. Replace all contiguous sequences of non-white-space characters (no matter the length) with a single character (in the code the letter “a” is used but that is of no real importance).
3. Remove all white-space characters from the document.
4. Count the remaining characters (each one the letter “a”), which now represent the original number of words in the document.
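The four steps above can be sketched in a few lines of JavaScript. This is only an illustration of the described logic, not the actual XMetaL source: the function name `countWords` and the exact regular expressions are assumptions, and the tag pattern simply follows the "anything starting with < and ending with >" description.

```javascript
// Illustrative sketch of the four steps; countWords is a hypothetical name.
function countWords(xml) {
  // Step 1: remove tags, comments and processing instructions
  // (assumed pattern: anything starting with "<" and ending with ">").
  const textOnly = xml.replace(/<[^>]*>/g, "");

  // Step 2: replace each contiguous run of non-white-space characters
  // with a single "a", whatever the run's length.
  const markers = textOnly.replace(/\S+/g, "a");

  // Step 3: remove all white-space characters.
  const packed = markers.replace(/\s+/g, "");

  // Step 4: each remaining "a" stands for one original word.
  return packed.length;
}

console.log(countWords("<p>Hello <b>big</b> world.</p>")); // 3
console.log(countWords("<p>one</p>\n<p>two</p>"));         // 2
console.log(countWords("<p>one</p><p>two</p>"));           // 1
```

The last call shows a side effect of this sketch: because tags are removed rather than replaced with white space, text in adjacent elements with nothing between them runs together and counts as one word.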
This is done using regular expressions in JScript because JScript string manipulation is far faster than loading the XML into an XML processor and parsing for text nodes, etc. You will see from this logic that it does not attempt to deal specifically with numbers, parentheses or other punctuation, etc. I think once you get beyond this basic logic you need to start building in language-specific smarts and perhaps business logic as well, plus personal preferences, and there are probably as many different ways to do that as there are writers.
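To make the point about numbers and punctuation concrete, here is a small sketch in JavaScript. The helper `countTokens` is hypothetical and works on plain text that has already had its markup removed; it counts runs of non-white-space characters, which gives the same result as the replace-and-count logic described in this reply.

```javascript
// Hypothetical helper: counts runs of non-white-space characters.
function countTokens(text) {
  const runs = text.match(/\S+/g); // null when there are no runs at all
  return runs === null ? 0 : runs.length;
}

console.log(countTokens("Hello, world!"));   // 2: punctuation stays attached
console.log(countTokens("well-known"));      // 1: hyphenated term is one word
console.log(countTokens("well - known"));    // 3: a lone hyphen counts as a word
console.log(countTokens("3.14 ( approx )")); // 4: the number and each parenthesis count
```

A tool that ignores free-standing punctuation or treats numbers differently will report different totals for the same text, which is one likely source of mismatches between word counters.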
This is basically the same logic as used in the following demo (except that it has now been integrated into the new 7.0 “Cross-Files” feature): http://forums.xmetal.com/index.php/topic,28.0.html