On Process: Generating Documentation with Markdown, PHP, and HTMLdoc
I've been working on a fantastic project lately: a PHP platform built with the Smarty template engine that will power three (and more) websites. The sites share a library used to bootstrap the app, and the site-specific logic is contained in extensible Smarty templates.
Each site has a large body of "documentation" files. The client's previous situation was such that each person responsible for the documentation was using their own (and thus, different) solution. So I recommended settling on a standardized documentation format.
I thought Markdown was a great fit and found a fantastic PHP port of John Gruber's tool. This particular version, Markdown Extra, adds footnotes, tables, and other nice things to the Markdown port. I created a very simple Smarty plugin (the initial version is below) so that documentation writers can load a Markdown file into a Smarty template:
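// smarty/plugins/block.markdown.php
function smarty_block_markdown($params, $content, &$template, &$repeat)
{
    // Smarty calls block plugins for the opening and the closing tag;
    // only render on the closing tag, when $repeat is false.
    if (!$repeat)
    {
        if (isset($params['source']))
        {
            // Render an external Markdown file.
            $source = $params['source'];
            $contents = file_get_contents($source);
            $rendered = Markdown($contents);
        }
        else if (!empty($content))
        {
            // Render the Markdown written inline between the tags.
            $rendered = Markdown($content);
        }
        else
        {
            return "";
        }
        return $rendered;
    }
}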
This is quite a simplified version of the final plugin. The final version includes support for replaceable tokens, among other things, but basically it's used like so:
{* render the contents of a file called markdown/file/name.md as HTML *}
{markdown source="markdown/file/name"}{/markdown}
{* render the markdown source in the block directly *}
{markdown}
# header {#header-id}
This is some markdown.
{/markdown}
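The usage comment above implies that `markdown/file/name` ends up pointing at `markdown/file/name.md`. The simplified plugin doesn't show how that happens, so here is a rough sketch of one way it could work. The helper name `resolve_markdown_source`, the base directory, and the default extension are all assumptions for illustration, not the real plugin's code.

```php
<?php
// Hypothetical helper: the original plugin's lookup rules are not shown,
// so the base directory and ".md" default here are assumptions.
function resolve_markdown_source($source, $base_dir = '.', $extension = '.md')
{
    // Append the extension only when the name does not already end with it.
    if (substr($source, -strlen($extension)) !== $extension) {
        $source .= $extension;
    }

    // Join the base directory and the relative source path with one slash.
    return rtrim($base_dir, '/') . '/' . ltrim($source, '/');
}

$path = resolve_markdown_source('markdown/file/name', '/var/www/docs');
```

Inside the plugin, the resolved path would then be handed to `file_get_contents()` in place of the raw `source` parameter.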
So why separate the markdown instead of just injecting it directly into the smarty views or just writing it in HTML? Well, part of the requirements of the project was that any documentation written for the website should be able to be easily converted to a PDF. So how do we do that? After some research on what tools would be easy for others to use as well, it ended up being a simple process; in short:
- Load markdown files, parse with Markdown Extra, converting to HTML.
- Take the concatenated HTML and pipe the output into HTMLDOC.
HTMLDOC is an open-source command-line tool that converts HTML documents into PDFs. I decided to use this tool to automate the build of the PDF documentation. So for web viewing, Markdown is converted to HTML and cached on subsequent requests as HTML via Smarty, and for download, the PDF docs are created using the same Markdown; a nice separation of concerns, I thought.
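The first step in the list above (load, parse, concatenate) might look something like the following. This is only a sketch: `build_docs_html` is a made-up name, the HTML wrapper is a guess at what HTMLDOC needs, and the renderer is passed in as a callable so that the snippet doesn't depend on any particular Markdown library being loaded.

```php
<?php
// Hypothetical helper, not the original build script.
// $render is any callable that turns Markdown text into HTML.
function build_docs_html(array $markdown_files, callable $render, $title = 'Documentation')
{
    $body = '';
    foreach ($markdown_files as $file) {
        $source = file_get_contents($file);
        if ($source === false) {
            throw new RuntimeException("Unable to read {$file}");
        }
        // Render each file and append it in the order given.
        $body .= $render($source) . "\n";
    }

    // Wrap the concatenated fragments in a minimal HTML document.
    return "<html>\n<head><title>" . htmlspecialchars($title) . "</title></head>\n"
        . "<body>\n{$body}</body>\n</html>\n";
}
```

With Markdown Extra loaded, the call would be along the lines of `build_docs_html($files, 'Markdown')`, with the result written to the temporary HTML file via `file_put_contents()`.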
Here's a section of the build-docs PHP script that is run during the Ant build for each site. Prior to this section, the script looks through the documentation, loads the specified Markdown files, parses them with Markdown Extra, and writes the concatenated results to a temporary HTML file. That file is then passed to HTMLDOC:
$command = "htmldoc ";
$command .= "--book "; // generate in book format with TOC
$command .= "--links "; // link up hyperlinks
$command .= "--title "; // include a title page
$command .= "--toctitle " . escapeshellarg($toc_title) . " "; // TOC title
$command .= "--linkstyle underline "; // underline hyperlinks
if ($title_image)
{
    $command .= "--titleimage " . escapeshellarg($title_image) . " ";
}
$command .= "--footer h./ "; // current heading on the left, page n/N on the right
$command .= "--header .t. "; // document title in the center
//$command .= "--bodyfont helvetica ";
$command .= "-t pdf14 "; // PDF 1.4 output
// Escape both paths so spaces or shell metacharacters can't break the command.
$command .= "-f " . escapeshellarg($output_file) . " " . escapeshellarg($temp_file);
exec($command, $result, $return);
We execute the htmldoc command via PHP and the documentation PDF is generated. The variables allow us to create a bootstrap file for each site's documentation to configure the output a little bit.
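To give a feel for the per-site bootstrap idea, here's a small sketch that builds the same kind of command from an options array. The function name `build_htmldoc_command` and the array layout are assumptions of mine, and the option values are made up; only the htmldoc flags themselves come from the script above.

```php
<?php
// Hypothetical helper, not taken from the original build script.
// $options maps an htmldoc flag to true (bare flag), a string value,
// or null/false (flag skipped for this site).
function build_htmldoc_command(array $options, $output_file, $input_file)
{
    $parts = array('htmldoc');
    foreach ($options as $flag => $value) {
        if ($value === false || $value === null) {
            continue; // option switched off for this site
        }
        $parts[] = $flag;
        if ($value !== true) {
            // Every value goes through escapeshellarg before it reaches the shell.
            $parts[] = escapeshellarg($value);
        }
    }
    $parts[] = '-f';
    $parts[] = escapeshellarg($output_file);
    $parts[] = escapeshellarg($input_file);

    return implode(' ', $parts);
}

// Example per-site settings (values are made up for illustration).
$site_options = array(
    '--book'       => true,
    '--toctitle'   => 'Table of Contents',
    '--titleimage' => null, // this site has no title image
    '-t'           => 'pdf14',
);
$command = build_htmldoc_command($site_options, 'docs.pdf', 'docs.html');
```

The resulting string can be handed to `exec($command, $result, $return)` as before; checking `$return` afterwards is a cheap way to notice a failed build.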
The final requirement for the documentation was the ability to add a table of contents to any Markdown file. The documentation writers wanted a list of the <h1> to <h6> headings linked up to the corresponding sections on the page. I accomplished this by adding another parsing block to Markdown Extra. In short, this block uses Markdown Extra's existing list of parsed headers and wraps them in a <ul> list with anchor tag links. The regular expression for matching the "toc" tag is as follows:
/{toc(?:\|?([1-6])(?::([1-6]))?)?}/
Ain't regex lovely? This matches tags like the following, replacing each one with the rendered unordered list where it appears in the Markdown document.
- {toc} renders the entire table of contents.
- {toc|3} would display <h1> through <h3> tags in the table of contents.
- {toc|2:5} would display <h2> through <h5> tags in the table of contents.
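Here's a rough sketch of how those three forms could be turned into a heading range and then into a list. It is not the actual Markdown Extra modification: the function names, the flat (non-nested) list, and the shape of the `$headers` array are assumptions for illustration. The pattern is a variant of the one above, anchored to a single tag and with the braces escaped.

```php
<?php
// Hypothetical helpers; the real parsing block inside Markdown Extra is not shown.

// Turn a toc tag into array(min_level, max_level), or null if it is not a toc tag.
function parse_toc_tag($tag)
{
    if (!preg_match('/^\{toc(?:\|?([1-6])(?::([1-6]))?)?\}$/', $tag, $m)) {
        return null;
    }
    if (!isset($m[1])) {
        return array(1, 6);                  // {toc}: every heading level
    }
    if (!isset($m[2])) {
        return array(1, (int) $m[1]);        // {toc|3}: h1 through h3
    }
    return array((int) $m[1], (int) $m[2]);  // {toc|2:5}: h2 through h5
}

// Build a flat <ul> of anchor links for headings within the level range.
// Each header is assumed to look like array('level' => 2, 'id' => 'setup', 'text' => 'Setup').
function render_toc(array $headers, $min, $max)
{
    $items = '';
    foreach ($headers as $header) {
        if ($header['level'] >= $min && $header['level'] <= $max) {
            $items .= '<li><a href="#' . htmlspecialchars($header['id']) . '">'
                . htmlspecialchars($header['text']) . "</a></li>\n";
        }
    }
    return "<ul>\n{$items}</ul>";
}
```

A nested list that mirrors the heading hierarchy would be nicer to read, but a flat list keeps the sketch short.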
Anyway, that's some of the process stuff I've been plugging away on lately. I've sort of become obsessed with writing everything in Markdown now. If I need to send something to another developer or client, I'll do something like the following:
alias markdown='/path/to/Markdown_1.0.1/Markdown.pl'
markdown my-doc-file.md | htmldoc --format pdf14 - > my-doc-file.pdf
(Obviously there'll be a bunch of params for htmldoc, as seen in the PHP example above.) It's quite simple: convert the Markdown and pipe it to htmldoc. Of course, I don't get to use my fancy extended Markdown with tables of contents, but for day-to-day writing, it's ideal. Bam, lovely looking PDF.
Depending on the completion of these projects, I may release the updated Markdown Extra with Table of Contents and the related Markdown Smarty plugins. Check back later with me.