Reply to: search result not complete for CCJK
January 10, 2013 at 12:09 am
In order for our current implementation of WebHelp to support Chinese, we would likely need to add an extra step that breaks Chinese sentences into words (Chinese is written without spaces between words), something along these lines: http://nlp.stanford.edu/software/segmenter.shtml
Or one of various other solutions one might find searching http://www.google.com/search?q=中文分词
Because the code that generates our JavaScript array is written in Java, the first option (which is also written in Java) has some chance of being implemented, but I don't see that happening anytime soon given current priorities.
That still leaves out Japanese, which would require a different solution.
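To give a rough idea of what such an extra indexing step could look like, here is a minimal sketch in Java. It is not the Stanford segmenter and it is not what WebHelp does today. Instead of true word segmentation it uses overlapping character bigrams, a dictionary-free fallback that treats Han, Hiragana, and Katakana characters the same way, so it would cover Japanese as well. The class name `CjkBigramTokenizer` and its methods are hypothetical, made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of WebHelp: splits text into index terms.
final class CjkBigramTokenizer {

    // True for Han ideographs, Hiragana, and Katakana code points.
    static boolean isCjk(int codePoint) {
        Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
        return script == Character.UnicodeScript.HAN
                || script == Character.UnicodeScript.HIRAGANA
                || script == Character.UnicodeScript.KATAKANA;
    }

    // Runs of CJK characters become overlapping two-character terms;
    // runs of other letters and digits become one lowercased term each.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int[] cps = text.codePoints().toArray();
        int i = 0;
        while (i < cps.length) {
            if (isCjk(cps[i])) {
                int start = i;
                while (i < cps.length && isCjk(cps[i])) {
                    i++;
                }
                if (i - start == 1) {
                    // A lone CJK character is kept as its own term.
                    tokens.add(new String(cps, start, 1));
                } else {
                    for (int j = start; j + 1 < i; j++) {
                        tokens.add(new String(cps, j, 2));
                    }
                }
            } else if (Character.isLetterOrDigit(cps[i])) {
                int start = i;
                while (i < cps.length
                        && Character.isLetterOrDigit(cps[i])
                        && !isCjk(cps[i])) {
                    i++;
                }
                tokens.add(new String(cps, start, i - start)
                        .toLowerCase(Locale.ROOT));
            } else {
                // Whitespace and punctuation only separate terms.
                i++;
            }
        }
        return tokens;
    }
}
```

For example, `CjkBigramTokenizer.tokenize("中文分词")` yields the terms 中文, 文分, and 分词. The search query would have to be run through the same function so that the index and the query agree. Bigrams are cruder than real segmentation and can produce false matches, so treat this as a stopgap rather than a replacement for a proper segmenter.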
I think the most robust solution would be to implement a full-blown search solution. Google certainly seems to be able to handle any Chinese and Japanese you throw at it, so that is one very strong option.