SPFE vs. DITA

DITA is the best-known structured writing architecture in use today, so any other architecture needs to explain what it offers that is different from DITA. Every architecture or system optimizes for some things at the expense of others. SPFE is not intended to be another DITA. Rather, it presents a very different architectural approach to structured writing that optimizes for different things.

Neither SPFE nor DITA is a closed system. Both are extensible, both have the capacity to create structured data, and both allow you to add functionality by creating new scripts. Thus any comparison between SPFE and DITA is not about ultimate capability: any form of output you can create with DITA, you can create with SPFE, and vice versa, just as any output you can create with Word, you can create with Frame, and vice versa. The differences between the two have to do with optimization, not output. Different business needs are best served by different kinds of optimization. Here is a comparison of the SPFE and DITA approaches to some key structured writing and publishing features.

Queries vs. Maps

Perhaps the most obvious difference between SPFE and DITA is that DITA uses maps to organize content and SPFE uses queries. In DITA, if you want to create a collection of topics for a specific purpose (a book, a help system, etc.) you select and order the topics for that collection by writing a DITA map. The DITA publishing process then reads that map, selects each named topic, and creates the topic collection. To create a topic collection in SPFE, you write a query expression that selects the topics to be included based on their metadata. One of the most obvious consequences of this is that if you create a new topic in a DITA system, that topic will not occur in any topic collections unless the maps for those collections are explicitly updated to include it. In SPFE, if you create a new topic, and apply the correct metadata to it, it will automatically be included in every topic collection whose query selects it based on its metadata.
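
The contrast can be sketched in a few lines of Python. This is only an illustration of the idea, not SPFE or DITA syntax: the topic records, the product and audience metadata fields, and both helper functions are hypothetical.

```python
# Hypothetical topic records: a name plus metadata (not real SPFE or DITA markup).
topics = [
    {"name": "install-widget", "product": "widget", "audience": "admin"},
    {"name": "configure-widget", "product": "widget", "audience": "admin"},
    {"name": "widget-overview", "product": "widget", "audience": "user"},
]

# Map style: the collection is an explicit, ordered list of topic names.
admin_guide_map = ["install-widget", "configure-widget"]


def collect_by_map(topic_map, topics):
    """Return the names listed in the map, in map order, if the topic exists."""
    known = {t["name"] for t in topics}
    return [name for name in topic_map if name in known]


# Query style: the collection is defined by a predicate over metadata.
def collect_by_query(topics, **criteria):
    """Return names of topics whose metadata matches every criterion."""
    return [
        t["name"]
        for t in topics
        if all(t.get(key) == value for key, value in criteria.items())
    ]


# A topic created later, with the appropriate metadata applied.
topics.append({"name": "upgrade-widget", "product": "widget", "audience": "admin"})

map_result = collect_by_map(admin_guide_map, topics)
query_result = collect_by_query(topics, product="widget", audience="admin")
```

In this sketch the new topic appears in query_result without any further work, while map_result stays unchanged until someone edits admin_guide_map. The map, in return, gives explicit control over order, which the query as written does not.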

Topic style

DITA is based on a specific model of topic-based authoring, which is a derivative of information mapping, and is based on the idea that every topic belongs to one of three basic types: task, concept, and reference. In choosing DITA, one is also choosing its particular brand of information mapping. SPFE is not based on this model, though it isn’t antagonistic to it. You can implement a task/concept/reference topic scheme in SPFE if you want, or you can implement something different. (Given SPFE’s database orientation, it makes a lot of sense in SPFE to model a reference as a database rather than as a set of topics, for instance.)

SPFE is intended to support the Every Page is Page One (EPPO) approach to topic design. EPPO is not inconsistent with information mapping, as you could create EPPO topics using the information mapping design paradigm. The EPPO approach is really only about making sure that any topic works as page one for a reader, and that it does the job it is supposed to do, and no other. These constraints are neither required by, nor inconsistent with, information mapping.

Linking

One of the key features of SPFE is its use of soft linking. Soft linking is a process by which links between topics are not formed by authors creating links directly, but by authors marking up mentions of real-world objects. The system uses queries at build time to find topics on those objects and creates links to those topics automatically. (This is another case of using queries rather than maps.) The three principal advantages of soft linking are:

  • Authors do not have to discover resources to link to (thus saving time and allowing far more links to be created).
  • Resources created after the current resource will automatically be found and linked to in the next build (thus removing the need to update a topic to link it to new resources).
  • When a topic is reused in a new context, it is automatically linked to the resources available in that context (thus removing the need to remove links to reuse content, or to maintain multiple map files to specify link targets in each new context).
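
A minimal sketch of the soft-linking idea, assuming a hypothetical representation in which each mention is a (type, name) pair and each topic declares the object it covers. None of these names come from SPFE itself.

```python
def build_index(topics):
    """Index topics by the real-world object they describe."""
    return {(t["type"], t["subject"]): t["name"] for t in topics}


def resolve_mentions(mentions, index):
    """Map each marked-up mention to a link target, or None if no topic exists."""
    return {mention: index.get(mention) for mention in mentions}


# The author marks up mentions of real-world objects, not link targets.
mentions = [("function", "open_file"), ("function", "close_file")]

# Today's build: only one of the mentioned objects has a topic.
topics_today = [
    {"name": "open-file-reference", "type": "function", "subject": "open_file"},
]
links_today = resolve_mentions(mentions, build_index(topics_today))

# A later build, after someone writes a topic about close_file.
topics_later = topics_today + [
    {"name": "close-file-reference", "type": "function", "subject": "close_file"},
]
links_later = resolve_mentions(mentions, build_index(topics_later))
```

The source text containing the mentions is identical in both builds. Only the set of available topics changes, and the second build picks up the new link target without the author touching the original topic.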

DITA uses conventional hard links, and indirect links managed by maps.

Reuse

Reuse is certainly the number one selling point of DITA. DITA is architected very much to allow the greatest scope for content reuse. By allowing a DITA topic to consist, in some cases, of nothing more than a heading or a sentence, and using maps to stitch these tiny topics together into larger topics, DITA allows you to optimize the percentage of reuse in your project.

However, DITA reuse does come with a cost: to reuse content in this way, authors have to search the repository for reusable content and then create maps to stitch the pieces together. All of this creates authoring and content management overhead, which adds to the cost of the system and the content.

SPFE also supports a degree of granular reuse through its support for fragments and strings, but it is more constrained than DITA in this regard. In general, the SPFE approach to reuse is again based on queries. If topic collections are created by queries, any given topic will naturally show up in any topic collection whose query selects it. Provided the topic metadata is set correctly, a topic will automatically be reused by every query that selects it. This greatly reduces the authoring and content management overhead for reuse. Thus while SPFE may not support reuse as granular and ad hoc as DITA, reuse in SPFE can be significantly cheaper to set up and maintain.

One limitation of the DITA reuse model is that it only supports reusing DITA content. Topics have to conform to a DITA topic type to be reusable. SPFE, on the other hand, does not mandate topic types and is therefore capable of accessing, importing, and reusing a wide variety of content from heterogeneous systems.

Also, because it uses soft linking, SPFE provides a greater degree of what might be called reuse-by-reference, in which a topic is reused in different contexts not by actually appearing in those contexts but by being referenced from them.

Exchange

One of the considerations in creating a structured writing system is the exchange of content both within and outside the enterprise. SPFE and DITA take different approaches to exchange. The DITA approach is to foster exchange through the promotion of DITA itself as a universal standard for content creation. The DITA specialization mechanism is designed so that even content from different specializations can be exchanged and can be published by the base topic transform in any DITA system (with, potentially, some loss of quality).

SPFE rejects the idea that exchange requires all content to be created in the same format. In fact, the SPFE philosophy is that it is always a mistake to author content in an exchange format. Instead, content should always be created in a format that is as specific as possible to the particular organization and the particular subject matter. The use of highly specific formats helps ensure a high degree of accuracy and consistency in the content. For exchange, the content is transformed into an appropriate exchange format. In general, the use of highly specific formats for authoring ensures a highly consistent translation to the exchange format, producing a higher quality content exchange than if authors had written in the exchange format. It also means that SPFE systems are not limited in which systems they can exchange content with.
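
As a rough sketch of authoring in a specific format and transforming for exchange, the Python below converts a made-up function reference format into a made-up generic topic format. Both vocabularies (function-reference, name, returns, description, topic, title, body, p) are assumptions for illustration only and are not taken from SPFE or DITA.

```python
import xml.etree.ElementTree as ET

# Content authored in a hypothetical, highly specific format.
SPECIFIC = """<function-reference>
  <name>open_file</name>
  <returns>int</returns>
  <description>Opens a file and returns a handle.</description>
</function-reference>"""


def to_exchange(specific_xml):
    """Transform the specific format into a hypothetical generic topic format."""
    src = ET.fromstring(specific_xml)
    topic = ET.Element("topic")
    ET.SubElement(topic, "title").text = src.findtext("name")
    body = ET.SubElement(topic, "body")
    ET.SubElement(body, "p").text = src.findtext("description")
    # The specific markup tells us exactly what this value is, so the
    # transform can label it the same way for every function.
    ET.SubElement(body, "p").text = "Returns: " + src.findtext("returns")
    return topic


exchange = to_exchange(SPECIFIC)
exchange_xml = ET.tostring(exchange, encoding="unicode")
```

Because the source records the return type as its own element, every function reference comes out of the transform with the same wording and placement, which is the kind of consistency the paragraph above describes.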

Specialization vs. Modularity

One of DITA’s most noticeable features, and the one from which it takes its name, is its specialization mechanism, a design inspired by, though not fully implementing, the concept of inheritance from object-oriented programming. In DITA, all topics are specializations of a single base topic type. Publishing scripts use an analogous specialization mechanism. The specialization mechanism provides a useful degree of reuse for schemas (DTDs) and processing code in DITA, and it underpins DITA’s approach to exchange. One disadvantage is that it somewhat restricts how schemas can be constructed, since they must all be specializations of the base DITA type, or of another specialization. Only the features supported by specialization are therefore available to schema designers, not the full set of possibilities offered by native XML. Another is that the specialization mechanism for code effectively locks DITA into using a single programming language, XSLT, for all DITA processing. The specialization mechanism itself is based on one of the properties of the XSLT language, and even if other languages could express the same logic, their code would not integrate with the rest of the processing tree that is written in XSLT.

SPFE rejects the limitations imposed by specialization and takes its inspiration from a different programming paradigm: modularity and loose coupling. SPFE combines a high degree of modularity in the design of both schemas and code with a loosely coupled modular architecture that uses XML as a communication mechanism between the layers. This means that there are very few architectural restrictions on how schemas for authoring and content capture are written, and no restrictions on which languages are used to implement different stages of the publishing process.

Content management

While you can do DITA without a content management system, most experts would agree that for any but the smallest and simplest system, you need a CMS to do DITA successfully. In no small part this is due to the need to create and maintain maps, which requires frequent review of the content in the system. Because it uses queries rather than maps, SPFE is much less dependent on the capabilities of a content management system. This does not mean that SPFE systems can’t use a CMS, but they don’t require one, which means that they can take advantage of other kinds of repository design such as databases or version control systems. It also means that SPFE systems can be implemented using open source version control systems such as Subversion, and can be integrated more tightly with the source control and build systems used by software development organizations.

Focus: the common vs. the specific

Overall, the difference in focus between DITA and SPFE might be summed up as the difference between the common and the specific. DITA aims to be a common platform not only for technical communication but for a wide range of corporate information as well. While its specialization mechanism does allow the development of more specific topic types, they are all built on a common base, and the emphasis of the DITA community is clearly more on encouraging the widespread adoption of certain common industry-wide specializations than on the development of specialized systems in individual organizations and groups.

SPFE, on the other hand, is designed to facilitate the development of highly specific systems while reducing the costs associated with developing a specific system through its modular architecture and its lessened dependence on complex content management systems.

One feature of specialized systems is that, in being more specific to a particular organization or group, they can be simpler to use, just as, for instance, it is easier to prepare your taxes in a tax preparation program that is specific to the task than in a spreadsheet program that is not.

A more specific system also allows a greater degree of automation, and a finer degree of error prevention and detection. Because it is specific to the particular task of preparing a tax return, tax preparation software does more for you, prevents many mistakes, and catches many more. While SPFE can certainly be used in a general way, it is really optimized for building structured authoring systems that are highly specific to an individual content creation task (while taking advantage of a common publishing stream), thus providing greater ease of use for authors, increased automation, and improved error prevention and detection.

SPFE and DITA, then, represent very different approaches to structured writing and publishing. While SPFE does challenge DITA in the sense that it contests DITA’s push for commonality at the authoring level, it should not be seen as another tool for doing the same thing DITA does. Rather, SPFE represents a different way of thinking about and executing a structured authoring strategy, one that optimizes for different things and is therefore appropriate for different types of projects serving different types of business needs.

Comments

  1. Peter Fournier

    One of the questions I’ve always had about all query-based dynamic generation of information is the sequencing problem.

    If I want to install a major piece of equipment in a Central Office, there has to be a sequence in the installation. See crude outline below.

    So, how do I create a metadata lexicon that ensures sequencing at the gross and at the detailed level, especially when the details may change depending on the exact configuration of the product the customer has received?

    I can easily see how to dynamically deliver complete documentation based on the Bill of Materials (BoM), but that would depend on standalone, complex chunks, related to every item in the BoM.

    How can equivalent sequence be done based on a query?

    1) Mark up the floor based on the provided template to ensure the bolt-hole locations are properly laid out.
    2) Bolt such and so “A” to the floor.
    2-1) Optional instructions on bolting.
    3) Attach the vertical posts to the such-and-so “A”.

    ….

    4) Unpack the server.
    5) Set up the power.
    6) Install the brackets that will receive the server.

    … many steps in server install

    8) Install the power converter.
    9) Connect converter to server.
    10) Connect converter to CO power.

    • Hi Peter. Thanks for the comment.

      Indeed, sequence is a key issue. I would start off by making a distinction between the sequence of a task and the sequence of topics. The steps of a task have to be kept in the right order for correct performance, so if a component-based authoring system broke those steps up into separate pieces, it would indeed have to use a manifest to assemble them correctly. I would suggest, however, that it is unwise to move operational information from the content to the metadata. My view is that if there is a sequence to a task, it should always be presented unbroken in a single content component.

      A sequence of topics, on the other hand, expresses an author’s view about the ideal order in which the reader should read the topics. The author is setting up a curriculum for the reader. Here, the question becomes: does the reader actually want such a curriculum established for them, or do they prefer to choose their own path through the content? Many readers, clearly, prefer to choose their own path. If the author wants to impose an arbitrary curriculum, they will indeed need a manifest. If they want to accommodate the reader’s desire to choose their own path, they don’t.

      This supposes, of course, that we are dealing with components that are designed to be readable in any order; in other words, that they are Every Page is Page One topics. If they are written so that they only make sense in a particular sequence, then again you will need a manifest to keep them in that order.

      Returning to the issue of the order of tasks, it is worth noting that tasks are fractal. That is, each step in a high-level task can be broken down into several lower-level tasks. You can go on breaking down the fractal structure of tasks practically to infinity, but most technical documents only include a few levels of this practically infinite breakdown. In many cases, you will find two-level tasks in which each major step is described, and then a minor procedure is given for performing this step. Occasionally you will even see three levels. And occasionally you will see these nested procedures flattened, often resulting in procedures with 20 or 30 steps in them. These are the kinds of issues that your example deals with.

      The approach I would generally recommend for this is to put the fractal levels of procedure into separate topics, and reference the minor procedure from the major procedure in the source. That way, each procedure level is fully contained in a single topic, and you don’t need a manifest to keep the steps in the right order. For presentation, you can then retrieve the minor procedure (by query) and either insert it into the overall procedure flow, or link to it.
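
      That approach can be sketched as follows. This is a hypothetical illustration, not SPFE code: the procedures dictionary, the text and ref keys, and the render function are all invented for the example, and the sketch assumes procedures never reference each other in a cycle.

```python
# Each procedure is a complete, ordered topic. A step may reference a
# minor procedure by subject instead of repeating its steps.
procedures = {
    "install-rack": [
        {"text": "Mark up the floor."},
        {"text": "Bolt the base to the floor.", "ref": "bolt-base"},
        {"text": "Attach the vertical posts."},
    ],
    "bolt-base": [
        {"text": "Drill the bolt holes."},
        {"text": "Tighten each bolt."},
    ],
}


def render(subject, procedures, inline=True, depth=0):
    """Render a procedure, inserting or linking to referenced minor procedures."""
    lines = []
    indent = "  " * depth
    for step in procedures.get(subject, []):
        ref = step.get("ref")
        if ref is None:
            lines.append(indent + step["text"])
        elif inline and ref in procedures:
            # Look up the minor procedure and insert it under the major step.
            lines.append(indent + step["text"])
            lines.extend(render(ref, procedures, inline, depth + 1))
        else:
            # Leave the minor procedure in its own topic and point to it.
            lines.append(indent + step["text"] + " (see: " + ref + ")")
    return lines


inlined = render("install-rack", procedures, inline=True)
linked = render("install-rack", procedures, inline=False)
```

      The order of steps lives inside each procedure topic, so no external manifest is needed. The choice between inserting and linking is made at presentation time.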
