General XMetaL Discussion

  • Glovit

    Regular expressions in XSD

    Participants 0
    Replies 1
    Last Activity 10 years, 6 months ago

    I'm using XMetaL Author Essentials and am having some issues with validation and rules generation against my XSDs, which include patterns for restricting some values.

    First, I'm finding that XMetaL doesn't seem to like multiple pattern elements in a single restriction in an XSD. It doesn't throw an error when generating the RLD file, but the patterns do not all seem to be picked up correctly, and this causes lots of validation issues. I can work around this by modifying the schema so that there is only one pattern element containing one long line of patterns separated by pipes (“|”). The RLD then generates and everything validates as expected.

    I'm also seeing that if a pattern is not fully enclosed in parentheses, the RLD file is generated, but any files containing elements that use this restriction will not validate. I'm able to solve this by making sure each pattern is within ().
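    The two workarounds above (a single pattern element, with every alternative wrapped in parentheses) can be sketched as a small helper. This is only an illustration in Python: `merge_patterns` and `pattern_facet` are hypothetical names, the `xs:` prefix is assumed, and Python's `re` module only approximates the XSD regular expression dialect.

```python
import re
from xml.sax.saxutils import quoteattr


def merge_patterns(patterns):
    """Join several pattern values into one alternation, each branch in ()."""
    return "|".join(f"({p})" for p in patterns)


def pattern_facet(patterns):
    """Build a single xs:pattern element (the xs: prefix is an assumption)."""
    return f"<xs:pattern value={quoteattr(merge_patterns(patterns))}/>"


merged = merge_patterns(["[A-Z]{3}", "[0-9]{5}"])
# merged is "([A-Z]{3})|([0-9]{5})"

# re.fullmatch stands in for XSD's whole-value matching.
accepts_letters = re.fullmatch(merged, "ABC") is not None
accepts_digits = re.fullmatch(merged, "12345") is not None
rejects_mixed = re.fullmatch(merged, "AB123") is None
```

    Since the schema rules treat several pattern facets in the same restriction step as alternatives, the merged form should accept the same values as the original separate elements.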

    Now I'm running into an issue that I haven't been able to figure out, and I'd like to know whether there are characters or limits that XMetaL does not support in regular expressions when generating RLD files.

    The following expression is valid in the schema and validates in Visual Studio, XMLSpy, and some other Regex testing tools I've tried online.

    Pattern = (^(((([-_a-zA-Z0-9 ]+)/)?(([-_a-zA-Z0-9 ]+)/))?([a-fA-F0-9]{8}(-[a-fA-F0-9]{4}){4}[a-fA-F0-9]{8}))(?.+?)?(#.+?)?$)

    String to match = ba674c1a-b603-4e37-842a-1b38b8185a1e?foo#bar

    The pattern seems to be failing on the last 2 parts of the expression (?.+?)?(#.+?)?. I can remove these parts of the pattern and the ?foo#bar from the string, and it generates and validates as expected.

    Any insight into why the last part of this fails and possible workarounds?



    Derek Read

    Reply to: Regular expressions in XSD

    We've concluded (partly offline with Greg) that XMetaL Author is following the W3C specs in this case.

    The “lazy” quantifier in his expression (example: +?) is not defined by the W3C Schema Primers. This type of expression is supported in other languages that support regular expressions (notably Perl, though I've tested with the WSH JScript engine as well). So, an XML parser that implements the “lazy” quantifier in regular expressions for W3C Schema has extended the W3C specs with custom functionality.
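    As a possible workaround, the lazy quantifiers can simply be made greedy. A schema pattern has to match the entire value, and for whole-value matching a lazy and a greedy quantifier should accept the same set of strings; laziness only changes how a match is split between groups. The sketch below checks this in Python, whose `re` module is not the W3C Schema dialect but does support both forms. It assumes the literal question mark was meant to be escaped as `\?` (the posted pattern shows `(?.+?)`, so the backslash may have been lost), and it leaves out `^` and `$` because `re.fullmatch` already anchors at both ends.

```python
import re

# Pieces of the pattern from the original post.
SEGMENTS = r"((([-_a-zA-Z0-9 ]+)/)?(([-_a-zA-Z0-9 ]+)/))?"
GUID = r"([a-fA-F0-9]{8}(-[a-fA-F0-9]{4}){4}[a-fA-F0-9]{8})"

# Assumption: "\?" is the intended escape for the literal question mark.
LAZY = SEGMENTS + GUID + r"(\?.+?)?(#.+?)?"    # as posted, with +?
GREEDY = SEGMENTS + GUID + r"(\?.+)?(#.+)?"    # lazy markers removed

SAMPLES = [
    "ba674c1a-b603-4e37-842a-1b38b8185a1e?foo#bar",
    "ba674c1a-b603-4e37-842a-1b38b8185a1e#bar",
    "ba674c1a-b603-4e37-842a-1b38b8185a1e",
    "folder/ba674c1a-b603-4e37-842a-1b38b8185a1e?foo",
    "not-a-guid",
]


def accepts(pattern, value):
    """True when the pattern matches the whole value, as a schema pattern must."""
    return re.fullmatch(pattern, value) is not None


# Each entry is (sample, lazy result, greedy result).
results = [(s, accepts(LAZY, s), accepts(GREEDY, s)) for s in SAMPLES]
```

    If the two result columns agree for every sample, the greedy form is a candidate replacement in the schema, though it would still need to be confirmed by regenerating the RLD file.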

    The W3C Schema Primers can be somewhat vague (not to mention difficult to read), which is partly why variations in implementation between XML parsers exist. I suspect that if a company reused a regular expression engine (from another product it has built or perhaps something it has acquired) it is more likely to support things the W3C did not define. Perhaps that is what occurred with the Visual Studio and XMLSpy implementations. Or perhaps they simply decided to extend their W3C Schema support beyond the W3C specs (in which case it would be nice if those extensions were documented somewhere). Or they may have been confused by the regular expression support from XQuery and/or XPath, both of which define things that the regular expression support in W3C Schema does not, including this “lazy” quantifier.

    Ideally, XMetaL would give a more meaningful error in this case: it would tell you that the Schema itself has an issue rather than hinting that you can fix the string to match the regular expression. The software is not designed to validate DTDs or Schemas (it is an authoring tool), though it will warn of significant issues in some cases. If this were something authors would be seeing I would push harder to implement something, but I think this is a developer issue that will, hopefully, be caught before authors ever see a deployed customization.
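    Until then, a developer could catch this before deployment with a small pre-check on the schema. The following is a rough sketch, not an XMetaL feature: `find_lazy_quantifiers` and `lazy_patterns_in_xsd` are hypothetical helpers that report every `?` placed directly after another quantifier in a pattern facet. Escapes such as `\?` and anything inside a character class are skipped.

```python
import xml.etree.ElementTree as ET

XS_PATTERN = "{http://www.w3.org/2001/XMLSchema}pattern"


def find_lazy_quantifiers(pattern):
    """Return the index of each '?' that directly follows another quantifier."""
    hits = []
    i = 0
    class_depth = 0            # > 0 while inside [...], including nested classes
    after_quantifier = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # Escapes such as \? or \p{L} are atoms, never quantifiers.
            if pattern[i + 1:i + 3] in ("p{", "P{"):
                end = pattern.find("}", i)
                i = len(pattern) if end == -1 else end + 1
            else:
                i += 2
            after_quantifier = False
            continue
        if class_depth:
            if ch == "[":
                class_depth += 1
            elif ch == "]":
                class_depth -= 1
        elif ch == "[":
            class_depth = 1
            after_quantifier = False
        elif ch in "*+":
            after_quantifier = True
        elif ch == "{":
            end = pattern.find("}", i)
            if end == -1:
                break          # unterminated {n,m}; stop scanning
            i = end
            after_quantifier = True
        elif ch == "?":
            if after_quantifier:
                hits.append(i)             # second quantifier mark: lazy form
                after_quantifier = False
            else:
                after_quantifier = True    # plain "optional" quantifier
        else:
            after_quantifier = False
        i += 1
    return hits


def lazy_patterns_in_xsd(xsd_text):
    """List (pattern value, positions) for each facet that uses a lazy quantifier."""
    report = []
    for facet in ET.fromstring(xsd_text).iter(XS_PATTERN):
        value = facet.get("value", "")
        hits = find_lazy_quantifiers(value)
        if hits:
            report.append((value, hits))
    return report
```

    Running `lazy_patterns_in_xsd` over the text of each schema in a customization would give a list of patterns to review before the RLD file is generated.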

