[wp-docs] New WordPress Handbook

Fri Jan 23 21:57:43 GMT 2009

I've waited a while to see how this conversation develops before joining in.

Lorelle (and other non-technical folks), the fact that Subversion is to
be used is not really very important. It is merely revision control of
the changes. The most important thing we will get from Subversion, as
Mike and Matt pointed out, is the ability to branch the book's
'source' and to continue to develop those branches of the book
independently.

Austin, you asked what problems using Subversion will solve that the
wiki doesn't. The main one is the ability to have one or more
'released' versions of the book at the same time as one or more
versions in development. A wiki is always the 'latest' version,
regardless of whether people are in the middle of updating things.
Sure, you can look at past revisions of individual pages, but someone
who wants the 'current' documentation, as opposed to the half-finished
next version, is stuck. You end up trying to
work around this problem with duplicated extra pages or marking up
sections of a page with a version indicator. Far from ideal.

For those assuming the process of submitting updated and new content
for a book will be in any way related to the process for submitting
code, there really is no correspondence. If you submit a patch to the
handbook that is 'wrong', at worst a paragraph or two might not make
sense, or will be grammatically incorrect. Whereas submitting code
that is wrong means the program doesn't run, or it corrupts the
database, or leaves a security hole.
The two are not comparable. I also don't expect the exact same set of
people will be involved.

Now to the important stuff: the real value of the 'subversion
book' method is that it uses DocBook to generate the output. That
is, the raw content of the book (the chapters, paragraphs, tables,
screen shots, etc.) is stored in DocBook's XML format. This has a
number of advantages, many of which have been raised in this thread,
including supporting multiple formats, building the book on demand and
so on.

However, I need to raise some very serious problems with the
'subversion book' method.

Firstly, the subversion book is stored in just 16 files! That is, each
chapter and appendix is a single file, the largest of which is 300K.
It will be impossible for multiple people to contribute to such large
pieces of work with actual patches. As soon as the first couple of
patches are applied the rest will be hopelessly out of date.

Which brings me to a second problem. Subversion does not, to my
knowledge, diff XML files as XML files. That is, it does not understand
the semantics of an XML file structure. It treats them as ASCII
records (lines), and that doesn't work for structured files. Subversion
supports pluggable diff engines, but I am not aware of a third-party
add-on that will do the job either, at least not an open source one.

The upshot of which is that I do not believe subversion can be used to
directly manage 'patches' to the document in the way it is used for
code.
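
To make the line-versus-structure point concrete, here is a minimal
Python sketch. It is not anything Subversion itself runs; the sample
paragraph and the helper names (line_diff, normalised) are made up for
illustration, and the "structural" comparison is deliberately crude (it
only looks at element names and whitespace-collapsed text). Re-wrapping
a paragraph changes no words, yet a line-based diff still reports
changed lines:

```python
import difflib
import xml.etree.ElementTree as ET

# Two serialisations of the same hypothetical DocBook-style paragraph.
# Only the line wrapping differs; the words are identical.
OLD = """<para>
  Subversion keeps every revision
  of a file.
</para>
"""
NEW = """<para>
  Subversion keeps every
  revision of a file.
</para>
"""


def line_diff(old, new):
    """Return the added/removed lines that a line-based diff reports."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


def normalised(xml_text):
    """Reduce a fragment to (tag, whitespace-collapsed text) pairs."""
    root = ET.fromstring(xml_text)
    return [(el.tag, " ".join((el.text or "").split())) for el in root.iter()]


changed_lines = line_diff(OLD, NEW)
same_content = normalised(OLD) == normalised(NEW)
print(len(changed_lines), "changed lines; same content:", same_content)
```

A patch made against OLD would conflict with any other edit touching
those lines, even though nothing in the paragraph actually changed.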

The file size issue could be addressed by splitting things up into
much smaller pieces (I presume DocBook supports that), but it still
doesn't solve the diff and patch issue.

So how to address that problem?

As was mentioned by Tom Johnson, I believe DITA is the correct
solution. Some background: At my last job the technical documentation
team and I came up with a solution to overwhelming documentation
demands and problems by creating a system based on DITA, Subversion,
some home grown php code, oh, and WordPress.

In a large nutshell, the DITA system breaks the content (the words in
the book) into manageable, and *reusable* chunks. Those chunks can be
as large as a whole chapter (not practical, as explained above) or as
small as a single phrase (which could be just one word -- probably
going too far). Most chunks are at the sub-topic level -- a couple of paragraphs
or so.

These chunks are then assembled in two stages. First, you create
topics which reference (and pull in) the smaller chunks to cover a
particular subject.
Then those topics are pulled together (in a topicmap or bookmap --
like DocBook's 'book' XML) to create chapters and whole books.

Now for the clever bits.

Firstly, single sourcing or re-use: wherever you have a common phrase,
or couple of sentences (a chunk), perhaps a common overview of a
subject that needs to be in several different topics (because in
reference books you do have to repeat yourself a lot), you write that
once, then reference it in all the topics where it has to appear.
Got a typo - fix it once, and all the topics which reference it are
now fixed! Need to update it for the new version of WP - update it
once and all those topics are now updated.
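
A rough sketch of that "fix it once" behaviour, in Python rather than
real DITA markup (DITA's own mechanism for this is the conref
attribute). The chunk ids, topic names and sample sentences are all
invented for illustration:

```python
# Hypothetical chunk store: id -> text. In a real DITA setup these would
# be elements in XML files; plain strings keep the sketch short.
chunks = {
    "intro-dashboard": "The Dashbaord is the first screen you see after logging in.",
    "write-post": "Use the Write panel to create a new post.",
    "manage-comments": "Use the Comments panel to approve or delete comments.",
}

# Topics hold references to chunks, never copies of the text.
topics = {
    "getting-started": ["intro-dashboard", "write-post"],
    "moderation": ["intro-dashboard", "manage-comments"],
}


def build_topic(name):
    """Assemble a topic by pulling in the current text of each chunk."""
    return "\n".join(chunks[ref] for ref in topics[name])


# The typo ("Dashbaord") shows up in every topic that references the chunk.
before = [build_topic(name) for name in topics]

# Fix it once, in the chunk itself...
chunks["intro-dashboard"] = chunks["intro-dashboard"].replace("Dashbaord", "Dashboard")

# ...and every topic that references it is fixed on the next build.
after = [build_topic(name) for name in topics]
print("\n\n".join(after))
```

The design point is that topics store only references, so there is
never a second copy of the sentence to forget about.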

Next: audiences and other metadata. Think of adding tags to the
content, tags like WP2.6, WP2.7, advanced-level, beginner-level,
offline, etc. You can add those at the level of an individual chunk, a
topic, or a whole chapter (in a topicmap).

So you might have a topic that begins with an intro (pulled in from a
common chunk), an overview, a simple beginner's run through (tagged
beginner-level), followed by an advanced run through -- tagged with
advanced-level. Finally you tag all the screen shots with 'offline'.

Now imagine being able to say "I want to build a beginner's manual for
version 2.6 in PDF" - so you include chunks with tag WP2.6, exclude
chunks with tag advanced-level and include chunks with tag offline,
and build as PDF.

Or you want a version 2.7 advanced manual in RTF? Include chunks with
WP2.7, beginner-level, advanced-level, and offline and build.

How about a beginner's guide for on-line help use? Exclude
advanced-level and exclude offline, build as XHTML. Why exclude
offline? Screen shots don't add much to the help when you have the
screen in front of you.
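
The include/exclude builds described above can be sketched in a few
lines of Python. This is only an illustration under one assumed rule: a
chunk is kept when every tag it carries is in the wanted set, so
untagged chunks always go in. Real DITA filtering works on attributes
(audience, product and so on) driven by a DITAVAL file, and the Chunk
class, build function and sample content here are all hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    tags: frozenset = frozenset()  # no tags means "applies everywhere"


# One hypothetical topic made of tagged chunks.
topic = [
    Chunk("Intro: what the Write panel is for."),
    Chunk("Beginner run through.", frozenset({"beginner-level"})),
    Chunk("Advanced run through.", frozenset({"advanced-level"})),
    Chunk("[screen shot of the 2.6 Write panel]", frozenset({"WP2.6", "offline"})),
    Chunk("[screen shot of the 2.7 Write panel]", frozenset({"WP2.7", "offline"})),
]


def build(chunks, wanted):
    """Keep a chunk only if all of its tags are in the wanted set."""
    wanted = frozenset(wanted)
    return [c.text for c in chunks if c.tags <= wanted]


# Beginner's manual for 2.6, destined for PDF: screen shots included.
beginner_26_pdf = build(topic, {"WP2.6", "beginner-level", "offline"})

# Beginner's on-line help for 2.6: leave 'offline' out, so no screen shots.
online_help_26 = build(topic, {"WP2.6", "beginner-level"})

# Advanced manual for 2.7: both levels plus the 2.7 screen shots.
advanced_27 = build(topic, {"WP2.7", "beginner-level", "advanced-level", "offline"})

for line in beginner_26_pdf:
    print(line)
```

Choosing the output format (PDF, RTF, XHTML) is a separate step that
happens after this selection, which is why it does not appear here.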

Oh and I mentioned WordPress -- I built an importer which will take
the XHTML generated by DITA and import it into WordPress as a
hierarchical set of pages. I even have a version for WPMU that
supports populating each mu blog with separate areas of content.

Here is a link about DITA for the curious:
http://dita.xml.org/book/getting-started

Rather than have you Google search for this stuff and the lovely Miss
Von Tees show up at awkward moments, try my delicious links
http://delicious.com/mikelittle/dita

By the way, this doesn't solve the problem of diffs and patches, but
the much smaller chunks are manageable with a locking pattern rather
than copy and merge (I know it's sacrilege to use Subversion in that
way!)

Hope this gives some food for thought. I have a whole load more on
process, standards, user contribution mechanisms and the need for
control.

Mike
-- 
Mike Little
http://zed1.com/