Sunday, April 24, 2005

DocBook investigation: progress update

Quick update on the DocBook investigation. I've tried a number of schema-aware XML editors for generating DocBook sources and didn't like any of them. The problem, I think, is one of familiarity: I don't know the DocBook schema very well, so vanilla schema-assisted editing doesn't give me enough support. I also tried saving OpenOffice documents as DocBook. This works, but doesn't seem to offer much fine-grained control over the generated XML, and I couldn't get it to round-trip nicely.

The best solution I've found so far is XMLMind. There's a free standard edition and a payware professional edition; I've only tried the standard edition. It's a Java/Swing application with a slightly odd feel to the UI, but I quickly found myself adjusting. It has easily been the most effective way I've found of editing DocBook XML with a WYSIWYG(-ish) presentation.

I've had to step outside the editor and hack the XML directly once or twice, for example to insert XInclude instructions to modularise my thesis into one-chapter-per-file chunks. XMLMind coped easily with the XIncludes once I had entered them. There may be a way of doing XInclude from the interface, but I couldn't see it.

The standard edition of XMLMind doesn't generate PDF files: you need the professional edition for that. However, I yum install'ed fop, and that works fine. It was nice to see that XMLMind is very up-to-date with the XSL stylesheets from SourceForge for transforming DocBook.
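
For anyone curious what the one-chapter-per-file XInclude arrangement mentioned above amounts to, here is a minimal sketch using Python's standard xml.etree.ElementInclude module. It is only an illustration, not what XMLMind does internally. The file names (chapter1.xml, chapter2.xml), the chapter contents and the helper names load_chapter and assemble are all made up for the example, and the "files" are held in memory so the sketch runs on its own.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical master document: one xi:include per chapter file.
MASTER = """<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Thesis</title>
  <xi:include href="chapter1.xml"/>
  <xi:include href="chapter2.xml"/>
</book>"""

# Stand-ins for the chapter files (assumption: real chapters would live on
# disk; they are kept in a dict here so the example is self-contained).
CHAPTERS = {
    "chapter1.xml": "<chapter><title>Introduction</title><para>...</para></chapter>",
    "chapter2.xml": "<chapter><title>Background</title><para>...</para></chapter>",
}


def load_chapter(href, parse, encoding=None):
    """Custom loader: return the parsed chapter for an xi:include href."""
    return ET.fromstring(CHAPTERS[href])


def assemble(master_xml):
    """Parse the master document and expand its xi:include elements in place."""
    root = ET.fromstring(master_xml)
    ElementInclude.include(root, loader=load_chapter)
    return root


book = assemble(MASTER)
chapter_titles = [chapter.findtext("title") for chapter in book.findall("chapter")]
print(chapter_titles)
```

With real files, leaving out the loader argument makes ElementInclude read each href from disk instead.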

Next goal is bibliography processing. DocBook can handle references already; I just need my refs in the appropriate format. I have a large collection of existing reference data in ProCite for MS Windows. For managing the references on Linux, RefDB looks like a good choice. Unfortunately, I've not been able to install it so far due to incompatibilities with libdbi on Fedora Core 3. I've asked on the RefDB list to see if anyone has a solution. It may also be that Fedora Core 4 has more up-to-date libraries when it ships (the problem isn't just libdbi, but conflicts with the MySQL and PostgreSQL versions installed on FC3). Fingers crossed. In the meantime, I haven't yet found a ProCite to RefDB translator. ProCite can export data as comma-delimited fields, but the meaning of each field depends on the reference type. I have a sinking feeling I may end up writing my own ProCite to XML converter. Sigh.
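
To make the "field meaning depends on the reference type" problem concrete, here is a rough sketch of how such a converter might be organised: a lookup table maps each reference type to the names of its columns, and each row is turned into an XML element accordingly. Everything specific here is an assumption for illustration only: the column layouts in FIELD_MAPS, the idea that the first column holds the reference type, the output element names and the sample rows are all invented, and the real ProCite export would need checking against actual data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# HYPOTHETICAL layouts: the first column is assumed to hold the reference
# type, and the meaning of each later column depends on that type.
FIELD_MAPS = {
    "Book": ["author", "title", "publisher", "year"],
    "Journal Article": ["author", "title", "journal", "volume", "year"],
}


def row_to_element(row):
    """Convert one exported row into a <reference> element."""
    ref_type, values = row[0], row[1:]
    names = FIELD_MAPS.get(ref_type)
    if names is None:
        raise ValueError(f"no field map for reference type {ref_type!r}")
    ref = ET.Element("reference", {"type": ref_type})
    for name, value in zip(names, values):
        if value:  # skip empty fields
            ET.SubElement(ref, name).text = value
    return ref


def convert(export_text):
    """Convert a whole comma-delimited export into a <references> tree."""
    root = ET.Element("references")
    for row in csv.reader(io.StringIO(export_text)):
        if row:  # ignore blank lines
            root.append(row_to_element(row))
    return root


# Placeholder rows, not real bibliographic data.
SAMPLE = (
    '"Book","Doe, J.","A Placeholder Book","Example Press","2001"\n'
    '"Journal Article","Doe, J.","An Example Title","Example Journal","12","2004"\n'
)

references = convert(SAMPLE)
print(ET.tostring(references, encoding="unicode"))
```

Raising an error on an unknown reference type, rather than guessing, seems the safer choice when the column meanings cannot be inferred from the data itself.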

Final note: I've been using Bob Stayton's DocBook XSL: The Complete Guide, second edition, as one source of assistance in learning my way around DocBook's world. It's an excellent resource, thoroughly recommended.


Serge Stinckwich said...

Why not mix DocBook with TeX, more precisely ConTeXt?

Look here :

Ian said...

Serge -
Thanks for the suggestion. I used to use TeX (well, LaTeX) early on in my career. I confess that LaTeX and I didn't get along that well. It seems to me that TeX and DocBook occupy similar territory as text markup languages, but DocBook has the advantage of more clearly separating content (i.e. structure) from presentation. I think I'll continue the current investigation for now, but I'll surely come back and look more deeply at ConTeXt if I get into difficulty.