Specialization

Specialization is the process of creating structural or domain vocabulary modules that provide new markup for specific requirements.

The essential aspect of specialization is that every element type or attribute defined in a vocabulary module must be based on and consistent with an element type or attribute defined in a more-general vocabulary module or in the base topic or map type.

This requirement ensures that any element, no matter how specialized, can always be mapped back to a known type and therefore understood and processed in terms of that type. As a result, every DITA document, no matter how specialized, can always be processed in some way. That is, new markup should never break existing specialization-aware DITA processing.

Every element type exists in a specialization hierarchy that goes from the base module (topic or map) through any intermediate modules to the element itself.

For example, if you defined a specialization of <concept> called <myConcept>, its specialization hierarchy would be <topic> -> <concept> -> <myConcept>. A processor given a <myConcept> document would be able to process it as a <myConcept> topic, a <concept> topic, or a generic <topic>, as appropriate.

The magic of specialization is the @class attribute.

Every DITA element must have a @class attribute. The value of the @class attribute specifies the specialization hierarchy for the element. The syntax of the @class attribute is:
  • A leading "-" or "+" character: "-" for structural types, "+" for domain types.
  • One or more space-separated module/element-type pairs: "topic/p", "topic/body", "hi-d/i", etc.
  • A trailing space character, which ensures accurate string matching on the last term in the hierarchy.
For the <myConcept> topic type, the @class value would be:
"- topic/topic concept/concept myConcept/myConcept "
You read this value right to left as:
The <myConcept> element in the "myConcept" module, which specializes <concept> from the "concept" module, which in turn specializes <topic> from the "topic" module.
If the <myConcept> topic type defined a specialized body element, say <myConceptBody>, then its @class value would be:
"- topic/body concept/conbody myConcept/myConceptBody "
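The syntax rules above can be sketched as a small parser. This is only an illustration of the syntax as described here, not code from any DITA tool: the names ClassValue and parse_class are hypothetical, and the checks cover just the three rules listed (leading "-" or "+", space-separated module/element-type pairs, trailing space).

```python
from typing import List, NamedTuple, Tuple


class ClassValue(NamedTuple):
    """Parsed form of a @class value (hypothetical helper type)."""
    kind: str                         # "structural" for "-", "domain" for "+"
    hierarchy: List[Tuple[str, str]]  # (module, element type), most general first


def parse_class(value: str) -> ClassValue:
    """Split a @class value into its prefix and its module/type pairs."""
    if not value or value[0] not in ("-", "+"):
        raise ValueError(f"@class must start with '-' or '+': {value!r}")
    if not value.endswith(" "):
        raise ValueError(f"@class must end with a space: {value!r}")

    kind = "structural" if value[0] == "-" else "domain"
    pairs = []
    for token in value[1:].split():
        module, sep, element_type = token.partition("/")
        if not (sep and module and element_type):
            raise ValueError(f"expected module/type, got {token!r}")
        pairs.append((module, element_type))
    if not pairs:
        raise ValueError("@class must name at least one module/type pair")
    return ClassValue(kind, pairs)


parsed = parse_class("- topic/body concept/conbody myConcept/myConceptBody ")
# The last pair is the element itself; earlier pairs are its ancestors.
most_specific = parsed.hierarchy[-1]
base_type = parsed.hierarchy[0]
```

Reading the parsed list from the end toward the start gives the same right-to-left reading described above: the element itself first, then each type it specializes.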
Looking at an instance of the <myConcept> element, you would find these @class attributes:
<myConcept id="topicid"
  class="- topic/topic concept/concept myConcept/myConcept "
>
  <title>My Concept</title>
  <myConceptBody
    class="- topic/body concept/conbody myConcept/myConceptBody "
  />
</myConcept>

Note that these are attributes of element instances. While we tend to think of the @class attribute as something that is set in DTDs or XSDs, that is merely a convenience. What's really important is that the attributes are available to XML processors, which will be the case whether they are defaulted in DTDs or specified explicitly in instances—the two are identical to XML processors.
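As a rough illustration of how a processor might use these attributes, the sketch below selects elements by matching a space-delimited module/type token against @class. It assumes the attributes are written explicitly in the instance, because Python's xml.etree.ElementTree does not apply defaults from an external DTD. The helper names is_a and find_all are hypothetical, and the <p> and <pre> children are added here only to show why the delimiting spaces (including the trailing one) matter.

```python
import xml.etree.ElementTree as ET

# Sample instance with every @class written out explicitly (an assumption
# of this sketch; in practice the values are often defaulted by a DTD or XSD).
DOC = """\
<myConcept id="topicid" class="- topic/topic concept/concept myConcept/myConcept ">
  <title class="- topic/title ">My Concept</title>
  <myConceptBody class="- topic/body concept/conbody myConcept/myConceptBody ">
    <p class="- topic/p ">A paragraph.</p>
    <pre class="- topic/pre ">Preformatted text.</pre>
  </myConceptBody>
</myConcept>
"""


def is_a(element, module_type):
    """True if the element is, or specializes, the given module/type."""
    # Matching " token " with a space on each side only works for the last
    # token because every @class value ends with a trailing space.
    return f" {module_type} " in element.get("class", "")


def find_all(root, module_type):
    """Return all elements (in document order) that match the module/type."""
    return [el for el in root.iter() if is_a(el, module_type)]


root = ET.fromstring(DOC)

# The specialized body is found when asking for the general type.
bodies = [el.tag for el in find_all(root, "topic/body")]

# Delimited matching finds only <p>.
paragraphs = [el.tag for el in find_all(root, "topic/p")]

# A naive substring test without delimiters also matches "topic/pre".
naive = [el.tag for el in root.iter() if "topic/p" in el.get("class", "")]
```

Here bodies contains only myConceptBody and paragraphs contains only p, while the naive test also picks up pre.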

The magic of the @class attribute is that specialized DITA documents can "just work" when processed by general-purpose specialization-aware processors, such as the DITA Open Toolkit.
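One way to picture this fallback behavior is a lookup that walks the @class value from the most specific type to the most general and uses the first type it knows how to handle. The sketch below is a simplified model and not how the DITA Open Toolkit is implemented; the HANDLERS table, its output strings, and the function names are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical processing rules, keyed by module/type. Note that nothing is
# registered for the "myConcept" module.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "topic/topic": lambda text: f"[topic] {text}",
    "topic/body": lambda text: f"[body] {text}",
    "concept/concept": lambda text: f"[concept] {text}",
}


def select_handler(class_value: str) -> str:
    """Return the most specific module/type in @class that has a handler."""
    tokens = class_value.split()[1:]  # drop the leading "-" or "+"
    for token in reversed(tokens):    # most specific type comes last
        if token in HANDLERS:
            return token
    raise LookupError(f"no handler for any type in {class_value!r}")


def process(class_value: str, text: str) -> str:
    """Process text using the best available handler for the element."""
    return HANDLERS[select_handler(class_value)](text)


# <myConcept> has no handler of its own, so it is processed as a <concept>.
topic_result = process(
    "- topic/topic concept/concept myConcept/myConcept ", "My Concept"
)

# <myConceptBody> falls back all the way to the generic topic body.
body_key = select_handler(
    "- topic/body concept/conbody myConcept/myConceptBody "
)
```

Registering a handler under "myConcept/myConcept" would override the fallback for that element alone, without touching the more general rules.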

One implication of this magic is that you can define new markup without also implementing every form of processing that might be applied to it; it will just work. As long as your specialized markup requires no specialized processing, you never need to implement new processing for it.

If your specialized markup does require specific processing, DITA-aware tools tend to make adding that processing easier because they are usually modular themselves. For example, the DITA Open Toolkit provides a general plugin mechanism for implementing and deploying specialization-specific processing that extends the out-of-the-box processing with as little custom code as possible.