Author Topic: Working with Invalid XML in XMetaL Author and Getting it into a Valid State  (Read 2146 times)
Derek Read
Program Manager (XMetaL)
« on: December 01, 2010, 02:46:45 PM »

A guiding philosophy behind XMetaL Author as an XML editor is that an author should never have to work with invalid XML, and the software goes to great lengths to keep you out of that situation.

However, there are times when you might need to work with an invalid XML document. Here are some common reasons why a document might be invalid:
  • Legacy documents were converted from some other format (HTML, Word, etc) to XML using some automated process or by a contractor.
  • You previously used an XML authoring tool that was not very strict in enforcing XML validity.
  • You subcontract some parts of your authoring processes out and the people who do that work sometimes mess things up.
  • Your XML files are valid according to your current schema but you migrate your documents to a newer, but more restrictive, version of the same schema. In an ideal world this would not happen due to good planning and foresight, but of course "stuff" happens. In general, it would be best to try to batch process all such changes automatically external to XMetaL Author (using XSLT or similar) so your authors do not need to deal with this type of issue.
  • Your XML files are valid according to one schema and you are migrating to a similar (though different) schema, and that process cannot be easily or inexpensively automated (using XSLT or some other solution).
  • You have an automated process that tends to muck things up in specific instances and nobody wants to spend the time fixing it because the number of instances is relatively small.
  • Someone using XMetaL Author has edited a document and put it into an invalid state (most likely using Plain Text view, where "Rules Checking" is never enabled).
  • File system corruption.

I'm sure there are other reasons I have not thought of, but hopefully you get the idea.

Strict Enforcement (Rules Checking)
XMetaL Author strictly enforces XML validity according to the DTD or Schema associated with a particular document in order to stop the document author from creating an invalid XML document. We call this "Rules Checking", something you might also call "live validation". The idea behind this is that everything you do during editing is checked and if it would put the document into an invalid state XMetaL (usually) does not allow it.

This is almost always a good thing, as you don't want your authors creating invalid XML. It is also a good thing because it means (given all the other functionality in the product that helps guide you) authors don't need to remember every nuance of any particular DTD or Schema to create a valid XML document.

This feature is on by default, provided you remain in Tags On or Normal view. Plain Text view is considered an advanced view and never enforces "Rules Checking" (nor can it be made to do so).

But what if one or more of the following are true?
  • The XML you are starting with is really badly mangled by some other software or process and needs fixing.
  • You want to take advantage of most of the XML authoring functionality in XMetaL Author but you don't want it bugging you about invalid documents or stopping you from doing things that you know are wrong.
  • You really know the DTD or Schema like the back of your hand and have the confidence to work with documents that are (temporarily) mostly invalid.
  • You want to make a lot of intermediary changes to a document that will put it into an invalid state but that will allow you to more quickly get to your final goal (a valid XML document).

Solutions
There are two main solutions for this. In both cases you need to know your schema well enough to be able to get your document back into a valid state with less guidance than XMetaL normally provides in Tags On or Normal view.

In both cases you always have access to user-initiated validation using the option Tools > Validate Document, which has the shortcut key F9 (and you will likely be using that quite a bit).

In both cases the Element List will list all elements defined in the schema and will not limit itself to showing just those valid at the current position. The idea here is that you might wish to (or feel the need to) temporarily insert an element that you might later rename, move, wrap with another element or otherwise change.

The Attribute Inspector still limits itself to displaying only the attributes for the current element, and the values allowed for them, regardless of the view or other options. A side effect of this is that editing in Plain Text view with the Attribute Inspector open can slow down editing. For this reason you may wish to close the Attribute Inspector until you need it. Shift+F6 toggles it on and off and, when on, puts the selection in the last attribute that you edited, if that attribute exists for the current element.

1. Plain Text View
  • The first option, one that requires no setting changes, is to simply switch to Plain Text view and work from there. In Plain Text view "Rules Checking" (a feature that checks if the change you are about to make will result in a valid document) is always turned off and cannot be enabled as this view truly is meant for advanced authors to use. In this view you are free to make any changes you like. However, you will also be missing features you have in Tags On view, such as tables rendered graphically as tables, CSS styling, macros, etc. For many people this is fine and a very good option.

    In some cases if you open a severely mangled XML file in XMetaL Author the product might switch to Plain Text view immediately (unless you use one of the following options).

Note: It is not uncommon for a customization or an integration with 3rd party software (such as a CMS) to disable Plain Text view. If Plain Text view is disabled for you, somebody wants to stop you from using it, and there could be various reasons for that. A customization may also be making the assumption that Rules Checking should always be on. So you should probably check with the people who wrote the customization you are using (CMS integration and document authoring customizations) before trying any of the following.

2. Disable Rules Checking
  • There are three ways to allow an author to disable Rules Checking on purpose.

    Direct INI variable:
    • rules_checking_always_off = yes (the default is no)

      Setting this INI variable turns Rules Checking off the next time you launch XMetaL Author. This affects all documents you open.

    Indirect INI variable:
    • rules_checking_always_off_option_shown = yes (default is no)

      This option enables an additional checkbox in the Tools > Options dialog. You can open that dialog any time you like and either check or uncheck the option.

      Checking the option turns Rules Checking off (immediately) and also sets the INI variable rules_checking_always_off = yes.

      Unchecking the option tries to turn Rules Checking on if possible and sets the INI variable rules_checking_always_off = no. If you try to turn Rules Checking on while your document is still mangled, Rules Checking may not come back on (for that document). So it is best to put your document into a valid state (use F9 to check that) before unchecking this option in the Tools > Options dialog.
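
      To see at a glance how the two INI variables above combine, here is a minimal sketch of a helper that reads them out of INI-style text. The function name readRulesCheckingSettings is hypothetical, the caller is assumed to supply the INI text (no file name or location is assumed here), and treating the values case-insensitively is my assumption rather than documented XMetaL behavior. Both settings fall back to "no", matching the defaults stated above.

```javascript
// Hypothetical helper, not part of XMetaL: reports the two Rules Checking
// settings found in a block of INI-style text ("key = value" per line).
function readRulesCheckingSettings(iniText) {
  // Defaults as described in the post: both variables default to "no".
  var settings = {
    rules_checking_always_off: "no",
    rules_checking_always_off_option_shown: "no"
  };

  iniText.split(/\r?\n/).forEach(function (line) {
    var match = /^\s*([A-Za-z_]+)\s*=\s*(\S+)\s*$/.exec(line);
    // Only the two known keys are recorded; every other line is ignored.
    if (match && Object.prototype.hasOwnProperty.call(settings, match[1])) {
      // Assumption: values are compared case-insensitively.
      settings[match[1]] = match[2].toLowerCase();
    }
  });

  return {
    // true: Rules Checking is off for all documents at the next launch.
    alwaysOff: settings.rules_checking_always_off === "yes",
    // true: the extra checkbox appears in Tools > Options.
    optionShown: settings.rules_checking_always_off_option_shown === "yes"
  };
}
```

      Passing text that contains only rules_checking_always_off_option_shown = yes would report the checkbox as shown while leaving Rules Checking on, which is the state you are in before you check the box for the first time.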

    API:
    • ActiveDocument.RulesChecking = [boolean]

      A developer (someone trained in creating scripts to customize XMetaL Author) can call this API from script to toggle Rules Checking on and off. How it is used is up to the developer. The same caveats about making sure your document is valid before turning Rules Checking back on for any particular document apply (i.e., if the document is still mangled it might not be possible to turn Rules Checking back on).

      This API affects the current document only, which makes it different from the INI variables previously mentioned. It is generally meant for temporary use in a script that needs to make changes to a document using APIs that would otherwise not allow those changes. The same script is generally expected to restore the document to a valid state before turning Rules Checking back on. However, when Rules Checking is turned off using this API, XMetaL Author will also attempt to turn it back on the next time it sees that the document is in a valid state. That will occur when the document is next validated. Various user actions may trigger validation or another API may do so. This API should respect the INI settings above and not restore Rules Checking automatically if the INI settings say the user wants Rules Checking off.
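
      The "turn it off, make changes, turn it back on" pattern described above could be sketched roughly as follows. The only XMetaL name used is the RulesChecking property from this post; the wrapper withRulesCheckingOff and its makeChanges callback are hypothetical names, and the sketch assumes the property can be read as well as written. It is an illustration of the pattern, not a tested customization.

```javascript
// Hypothetical wrapper: runs makeChanges(doc) with Rules Checking off and
// then puts the property back to whatever it was before.
// doc is any object exposing a boolean RulesChecking property; in an XMetaL
// script that would presumably be ActiveDocument.
function withRulesCheckingOff(doc, makeChanges) {
  // Assumption: RulesChecking is readable, so the old value can be saved.
  var previous = doc.RulesChecking;
  doc.RulesChecking = false;
  try {
    // The callback is expected to leave the document valid again.
    return makeChanges(doc);
  } finally {
    // Runs even if makeChanges throws. As noted in the post, turning Rules
    // Checking back on may not succeed if the document is still invalid.
    doc.RulesChecking = previous;
  }
}
```

      A script would call it along the lines of withRulesCheckingOff(ActiveDocument, function (doc) { /* edits */ }); whether the restore actually takes effect still depends on the document being valid at that point.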

      This means a developer could add a feature for advanced authors to allow them to temporarily turn Rules Checking off. These types of authors could run the script to turn Rules Checking off, make any changes they like to the document, including invalid changes, then restore validity to the document by making changes and repeatedly (if necessary) pressing F9 to validate the document until it is valid. Once it is valid Rules Checking would be automatically restored.
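
      To make the sequence of events in that workflow easier to follow, here is a toy model of it in plain JavaScript. None of this is XMetaL code: createModel, turnOffFromScript, validate and invalidSpots are all invented for illustration, and the model only mirrors the behavior described in this post (Rules Checking comes back on at the next successful validation unless the INI setting says the user wants it off).

```javascript
// Toy model of the behavior described in the post; not XMetaL's implementation.
// options.alwaysOff stands in for the INI setting rules_checking_always_off.
function createModel(options) {
  var model = {
    rulesChecking: !options.alwaysOff,
    pendingRestore: false, // true after a script has turned Rules Checking off
    invalidSpots: 0        // stand-in for the number of validation errors
  };

  // Stand-in for a script setting ActiveDocument.RulesChecking = false.
  model.turnOffFromScript = function () {
    model.rulesChecking = false;
    model.pendingRestore = true;
  };

  // Stand-in for pressing F9 (or any other action that triggers validation).
  model.validate = function () {
    var valid = model.invalidSpots === 0;
    // Restore only when the document is valid and the INI setting allows it.
    if (valid && model.pendingRestore && !options.alwaysOff) {
      model.rulesChecking = true;
      model.pendingRestore = false;
    }
    return valid;
  };

  return model;
}
```

      In this model an author who turns Rules Checking off, introduces two invalid spots and presses F9 sees validation fail and Rules Checking stay off; once the spots are fixed, the next F9 succeeds and Rules Checking is back on.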

« Last Edit: October 29, 2013, 04:47:30 PM by Derek Read »