
Derek Read

Reply to: Regular expressions in XSD

We've concluded (partly offline with Greg) that XMetaL Author is following the W3C specs in this case.

The “lazy” quantifier in his expression (for example, +?) is not defined by the W3C Schema specifications; the regular expression grammar in XML Schema Part 2: Datatypes has no such construct. Lazy quantifiers are supported in other languages that support regular expressions (notably Perl, though I've tested with the WSH JScript engine as well). So an XML parser that accepts the “lazy” quantifier in W3C Schema regular expressions has extended the W3C specs with custom functionality.
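For anyone unfamiliar with the distinction, here is a minimal sketch of greedy versus lazy matching. It uses Python's `re` module purely as an illustration; that is an assumption on my part, since Python's engine is not the W3C Schema regular expression language and the sample string is made up.

```python
import re

text = "<a><b>"

# Greedy: ".+" grabs as much as it can, then backs off to the last ">".
greedy = re.match(r"<.+>", text).group(0)

# Lazy: ".+?" grabs as little as it can, stopping at the first ">".
lazy = re.match(r"<.+?>", text).group(0)

print(greedy)  # <a><b>
print(lazy)    # <a>

# A W3C Schema pattern has to match the whole value; re.fullmatch
# approximates that here. Under whole-string matching, the greedy and
# lazy forms accept exactly the same inputs.
for value in ["<a>", "<a><b>", "<>", "a>"]:
    assert bool(re.fullmatch(r"<.+>", value)) == bool(re.fullmatch(r"<.+?>", value))
```

Because a Schema pattern is matched against the entire value, the lazy marker should not change which strings validate, so dropping the trailing ? from the quantifier is likely to give a portable pattern with the same result. Test that against your own expression before relying on it.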

The W3C Schema specifications can be somewhat vague (not to mention difficult to read), which is partly why implementations vary between XML parsers. I suspect that a company that reuses a regular expression engine (from another product it has built, or perhaps something it has acquired) is more likely to support things the W3C did not define. Perhaps that is what occurred with the Visual Studio and XML Spy implementations. Or perhaps they simply decided to extend their W3C Schema support beyond the W3C specs (in which case it would be nice if those extensions were documented somewhere). Or they may have been confused by the regular expression support in XQuery and/or XPath, both of which define things that the regular expression support in W3C Schema does not, including this “lazy” quantifier.

Ideally, XMetaL would give a more meaningful error in this case: it would tell you that the Schema itself has an issue rather than hinting that you can fix the string to match the regular expression. The software is not designed to validate DTDs or Schemas (it is an authoring tool), though it will warn of significant issues in some cases. If this were something authors would be seeing I would push harder to implement something, but I think this is a developer issue that will, hopefully, be caught before authors ever see a deployed customization.
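If you want to catch this yourself before deploying a customization, something along these lines could work. It is a rough sketch in Python, not anything XMetaL provides: the function names and the sample schema are hypothetical, and the check is a heuristic scan for a quantifier immediately followed by ?, not a full parser for the W3C Schema regular expression grammar.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"  # XML Schema namespace

def has_lazy_quantifier(pattern: str) -> bool:
    """Heuristic: True if a quantifier (?, *, +, {n,m}) is directly followed by '?'."""
    depth = 0           # nesting depth of [...] character classes
    prev_quant = False  # was the previous token a quantifier?
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":              # escaped character: skip it entirely
            i += 2
            prev_quant = False
            continue
        if depth > 0:               # inside a character class: only track brackets
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
            prev_quant = False
        elif ch == "[":
            depth = 1
            prev_quant = False
        elif ch == "{":             # {n}, {n,}, {n,m} quantifier
            end = pattern.find("}", i)
            if end == -1:
                break
            i = end
            prev_quant = True
        elif ch in "?*+":
            if ch == "?" and prev_quant:
                return True         # e.g. +?, *?, ??, {2,3}?
            prev_quant = True
        else:
            prev_quant = False
        i += 1
    return False

def lazy_patterns(xsd_text: str) -> list:
    """Return the value of every xs:pattern facet that looks lazy."""
    root = ET.fromstring(xsd_text)
    values = (el.get("value", "") for el in root.iter(XS + "pattern"))
    return [v for v in values if has_lazy_quantifier(v)]

# Hypothetical schema used only to exercise the check.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="code">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]+?-[0-9]{3}"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="id">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]+[0-9]*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

print(lazy_patterns(SAMPLE))  # ['[A-Z]+?-[0-9]{3}']
```

Run against your own XSD files as part of whatever checks you already do on a customization. Any pattern it reports is worth testing in XMetaL directly, since the scan can miss or misreport unusual expressions.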