Reply to: 5.1 -> 6 changes?
January 3, 2011 at 10:20 pm
We issue release notes listing features as well as the most important bug fixes with each version of our software. You can also view copies of these files on the forum here: http://forums.xmetal.com/index.php/topic,108.0.html
We have not made any specific changes for handling “large” files (in your case somewhere between 5MB and 40MB) with the release of XMetaL Author 6.0.
The product essentially relies on there being enough memory available to do two things:
1) Build a DOM (node tree) representation of the document.
2) Build a visual representation of the document from the DOM (using CSS among other things).
This is a general limitation of any XML authoring software with editing capabilities similar to XMetaL: an editable DOM needs to be created, and that DOM needs to be rendered to the user. XMetaL does many other things besides rendering the document (validation, etc.), and to some extent they might come into play, but these two are the main ones.
As it stands today, given enough memory XMetaL should be able to open a file of any size. The main issue you are likely to have is that your document might take a long time to render before you can begin editing and you may not want to wait (essentially making the product unusable for you for files of this size).
This is not something we plan to address directly. However, we will continue to add support for newer versions of Windows. As newer versions of Windows support faster CPUs, multiple CPUs, and larger amounts of directly addressable RAM, that may help if you feel you must edit documents this large.
A single 40MB file must contain [u]a lot[/u] of content. It sounds like an encyclopedia. Whatever it is, I would expect that it could be organized into smaller, more manageable pieces.
So, What Do People Do? (i.e., What Can You Do?)
People typically don't run into these types of problems because they are not working with files this large. They may be producing large documents (after transformation into HTML, PDF, etc.), but they are editing documents of a more manageable size. Often this is because it is prescribed by the CMS they have integrated XMetaL with, or because it is a DITA 'best practice', or because they are using some other schema (such as DocBook) and that community has given some general guidance on how to manage these things.
Most widely-used document-centric XML schemas have clearly defined methods (as part of their specs) for breaking documents up into smaller portions (chapters/sections for DocBook, topics for DITA, etc.) and a means to reference and organize those smaller portions (books in DocBook, maps in DITA, etc.). The W3C XML Recommendation itself also defines a way to reference files (external entities) for use with any XML document type.
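To make the external entity approach concrete, here is a minimal Python sketch (not an XMetaL feature) that splits one large document into per-chapter pieces plus a master file that pulls them back in with external entities. The function name split_into_parts, the "chapter" element name, and the generated file names are all assumptions for illustration, and the sketch assumes the root element contains nothing but those chapter elements.

```python
import xml.etree.ElementTree as ET

def split_into_parts(xml_text, part_tag="chapter"):
    """Return (master_document, {file_name: part_xml}).

    Hypothetical helper: assumes every child of the root is a part_tag
    element; anything else under the root is not carried over.
    """
    root = ET.fromstring(xml_text)
    parts = {}   # file name -> serialized part
    decls = []   # <!ENTITY ...> declarations for the master file
    refs = []    # &name; references, in document order
    for index, child in enumerate(root.findall(part_tag), start=1):
        name = f"{part_tag}{index}"
        child.tail = None  # keep inter-element whitespace out of the part
        parts[f"{name}.xml"] = ET.tostring(child, encoding="unicode")
        decls.append(f'  <!ENTITY {name} SYSTEM "{name}.xml">')
        refs.append(f"  &{name};")
    lines = [f"<!DOCTYPE {root.tag} ["]
    lines += decls
    lines += ["]>", f"<{root.tag}>"]
    lines += refs
    lines += [f"</{root.tag}>"]
    return "\n".join(lines) + "\n", parts

# Usage: a tiny two-chapter "book" standing in for a very large file.
master, parts = split_into_parts(
    "<book><chapter><title>A</title></chapter>"
    "<chapter><title>B</title></chapter></book>"
)
print(master)
```

Each piece can then be opened and edited on its own, while the master file stays small. If your schema already has its own mechanism (DITA maps, DocBook books), that is usually the better choice than raw entities.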
These portions (chapters/sections, topics, entities, etc) can then be combined during the publishing phase by an XML, XSLT, or similar processor with less memory and CPU overhead. Such processors are less memory intensive because they do not need to allow the user to interact directly with and modify a “live” DOM (and they also typically do not need to render the document on screen).
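As a rough illustration of why a publishing-stage processor can get by with less memory, the Python sketch below reads a document as a stream and discards each portion once it has been handled, rather than keeping an editable tree of the whole file. The function name collect_titles and the "topic" and "title" element names are assumptions for the example; a real publishing pipeline would do far more than collect titles.

```python
import io
import xml.etree.ElementTree as ET

def collect_titles(source, part_tag="topic"):
    """Stream through source and return the title of each part_tag element."""
    titles = []
    # iterparse reports each element as its end tag is read, so a part can
    # be processed as soon as it is complete.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == part_tag:
            titles.append(elem.findtext("title", default=""))
            # Drop this part's children and text so they can be freed.
            elem.clear()
    return titles

# Usage: an in-memory stand-in for a large file on disk.
data = io.BytesIO(
    b"<map><topic><title>One</title><p>x</p></topic>"
    b"<topic><title>Two</title></topic></map>"
)
print(collect_titles(data))
```

An editor cannot take this shortcut, because the user may jump to and change any part of the document at any time.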
What About Data-Centric XML?
For portions of your final document that are data-centric (content that is probably most easily organized in a table structure: table>row>cell), it may make sense to use an editor designed and optimized for that specific purpose (examples would include Microsoft Excel or a database with some kind of front end). This data is then transformed to XML during the publishing phase and injected into the appropriate location while the rest of the output is being built, or it might be transformed directly into the final format and included as-is. It may also be the case that the people working with the data do not need to author the rest of your XML content, and vice versa, in which case using different tools simply makes the most sense.
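As a sketch of the "transform the data to XML during publishing" step, the Python snippet below turns CSV text (for example, a sheet exported from Excel) into the table>row>cell structure mentioned above. The function name csv_to_table and the exact element names are assumptions; your schema's table model will likely differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_table(csv_text):
    """Convert CSV text into a <table><row><cell>...</cell></row></table> string."""
    table = ET.Element("table")
    for record in csv.reader(io.StringIO(csv_text)):
        row = ET.SubElement(table, "row")
        for value in record:
            # ElementTree escapes &, < and > in text when serializing.
            ET.SubElement(row, "cell").text = value
    return ET.tostring(table, encoding="unicode")

# Usage: two rows exported from a spreadsheet.
print(csv_to_table("part,qty\nbolt,12\n"))
```

The resulting fragment could then be injected at the appropriate location by whatever builds the rest of your output.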
There are also benefits to organizing your content into smaller portions that you may not currently be seeing (translation cost savings, content reuse, content management, workflow management, among others).