General Guidance for Developing New Shells and Modules

While the creation of new document shells and modules is a largely mechanical process, it is one that involves a lot of moving parts and fiddly bits. There are many opportunities for error and many of these errors can be difficult to track down because of all the pointing and indirection going on in the files involved.

To ensure success you must work carefully and methodically. The tutorials presented here reflect the methodical approach that I depend on.

In general, the methodical approach involves the following principles and practices:
  • Test first

    You must test from the very beginning. This practice is generally called "test-driven development": create test cases first, verify that they fail (for example, documents don't validate or transforms produce no output), then implement until the test cases pass. When they pass, you know you're done.

  • Test at every step

    You must ensure that you are in a known working state before making any change. Then, when something stops working, you know that the last change you made caused the breakage, and you only have to back out that one change to get back to a good starting state.

  • Check your assumptions

    If you're getting an inexplicable failure, test your basic assumptions. For example, when debugging references to DTD components through catalogs, you can test your assumption that the catalog is correct by tracing down through a chain of references. The OxygenXML editor's "open file at cursor" feature makes this easy, since you can start with the root catalog and chain down through the catalog-to-catalog and catalog-to-file references to make sure everything is hooked up correctly (other editors have similar features). Likewise, you can use search and replace to verify that strings match between DOCTYPE declarations or between schema location values and catalog entries.

    It's also good to verify you're changing the file you think you are. With the Toolkit there are often two or three copies of files: the copy you develop against in your source tree, the copy deployed to the Toolkit instance, and, for "template" files, the copy generated by the Toolkit's integration process. It's easy to accidentally open the wrong copy and then wonder what happened to your changes, either because you forgot to deploy them or because you modified the copy in the Toolkit by mistake and then redeployed over your changes from your source tree.

    If you have multiple Toolkits installed you should verify that you're running the code you think you are, since it's easy to run against the wrong Toolkit.

  • Start simple and work up

    Implement in small increments. For example, start with all the new files for a related set of shells and vocabulary modules in one directory so you don't have to worry about setting up catalogs initially. Once everything works in that context, reorganize the files to reflect the desired organizational structure, creating and testing the necessary catalogs as you go.

    Likewise, if you are creating several new document type shells, implement one completely before implementing the others, to ensure that you're not copying any mistakes.

  • Watch for cut-and-paste errors

    A lot of the work in creating new document shells and modules is cutting and pasting from existing files to create new ones. That is part of what makes creating new modules so fast, but it also invites insidious cut-and-paste errors: you copy something you shouldn't have, or you inadvertently copy the same mistake multiple times.

  • Use code control

    Use a code control system like Subversion or Git and commit your code frequently. You do not want to let uncommitted changes sit too long, or you'll risk losing data and time. By testing early and often, you can commit code that isn't broken, even if it's not complete. Then, if things go wrong, you can simply restore from your last commit and start over. As a general rule, you never want to be at risk of losing more than an hour or two's worth of work, and certainly not more than a day's work. Even if you are simply supporting yourself as a lone author, you should use code control to manage your code and your authoring work. There are low-cost and free Subversion services, or you can just set up a repository on your work machine (just make sure you back up the repository itself regularly).
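
As a minimal sketch of the string-matching check described under "Check your assumptions," the following Python script compares the public identifier in a document's DOCTYPE declaration against the public entries in an OASIS XML catalog. The function names, the sample public identifier, and the sample file paths are hypothetical and exist only for illustration. The script reads a single catalog and does not follow nextCatalog or delegate entries, so you would still need to trace catalog-to-catalog references yourself.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by OASIS XML catalog files.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Matches <!DOCTYPE name PUBLIC "public-id" ...> with either quote style.
DOCTYPE_PUBLIC = re.compile(
    r'<!DOCTYPE\s+\S+\s+PUBLIC\s+(["\'])(.*?)\1', re.DOTALL
)


def normalize(public_id):
    """Collapse runs of whitespace, as catalog resolvers do for public IDs."""
    return " ".join(public_id.split())


def catalog_public_ids(catalog_xml):
    """Return the normalized publicId values of all <public> entries."""
    root = ET.fromstring(catalog_xml)
    return {
        normalize(entry.get("publicId"))
        for entry in root.iter("{%s}public" % CATALOG_NS)
        if entry.get("publicId") is not None
    }


def doctype_public_id(document_text):
    """Return the normalized public ID from the DOCTYPE, or None."""
    match = DOCTYPE_PUBLIC.search(document_text)
    return normalize(match.group(2)) if match else None


def check_doctype(document_text, catalog_xml):
    """Return (public_id, found_in_catalog) for one document and catalog."""
    public_id = doctype_public_id(document_text)
    return public_id, public_id in catalog_public_ids(catalog_xml)


if __name__ == "__main__":
    # Hypothetical sample data; substitute the text of your own files.
    catalog = (
        '<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">'
        '<public publicId="-//EXAMPLE//DTD Example Topic//EN"'
        ' uri="dtd/example-topic.dtd"/>'
        "</catalog>"
    )
    document = (
        '<!DOCTYPE topic PUBLIC "-//EXAMPLE//DTD Example Topic//EN"'
        ' "example-topic.dtd">\n<topic id="t1"/>'
    )
    print(check_doctype(document, catalog))
```

A result of False for a document that you expect to resolve usually points to a typo in either the DOCTYPE declaration or the catalog entry, which is exactly the kind of mismatch that is hard to see by eye.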