This section explains the criteria behind the toolchain selection, which resulted in:
EPUB as the chosen eBook format
MathML for math support
LaTeXML as the XHTML compiler
Calibre for EPUB3 packaging
Bitmap formulas look awful, so the only two approaches to reflowable, good-quality math available today for browsers and eReaders are MathML and MathJax. MathML is a standard in which equations are encoded in XML and rendered on the fly; it is supported natively by most browsers (although, surprisingly, not by Chrome). The second approach, MathJax, allows math to be written in a variety of formats (including LaTeX and MathML itself) and performs the rendering in JavaScript; this requires a browser (or a reader) equipped with JavaScript and sufficient computational power. Since MathML is also “understood” by MathJax, content written in MathML works with either approach, so I chose MathML.
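To make the "encoded in XML" point concrete, here is a minimal sketch in Python that builds the presentation MathML for x squared using only the standard library. The helper name mathml_power is hypothetical and exists only for illustration; the element names (math, msup, mi, mn) and the namespace URI are those of the MathML standard.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def mathml_power(base: str, exponent: str) -> str:
    """Return presentation MathML for base^exponent (hypothetical helper)."""
    # The namespace is written as a plain attribute so the output has no prefixes.
    math = ET.Element("math", {"xmlns": MATHML_NS})
    msup = ET.SubElement(math, "msup")          # superscript: base, then exponent
    ET.SubElement(msup, "mi").text = base       # <mi> marks an identifier
    ET.SubElement(msup, "mn").text = exponent   # <mn> marks a number
    return ET.tostring(math, encoding="unicode")


print(mathml_power("x", "2"))
```

A reader with native MathML support lays out this markup directly, while a MathJax-based reader would take the same markup as input and typeset it in JavaScript.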
Today, the most common devices to read eBooks are:
Kindle Paperwhite (or similar e-paper devices)
iPad or Android tablets
PCs
On the other hand, there are about thirty eBook formats in use worldwide. Many of these are either specialized formats (e.g. comic books) or proprietary formats (e.g. Sony reader format). The most important formats for our purposes are:
EPUB: an open standard, now in its 3rd version, based on HTML+CSS. Supports MathML and JavaScript.
iBook: a non-standard EPUB subset by Apple; not compatible with EPUB readers. Supports MathML and JavaScript.
AZW: Kindle’s proprietary format (now in version 8). Does NOT support MathML.
PDF: although not an eBook standard per se, still very relevant. Supports JavaScript.
Each standard should be evaluated according to two independent criteria: how beautiful the books look and which devices these books can be read on; additionally, one should consider DRM and marketing channels.
iBook produces very high quality books because the environment is fully controlled; however, the books only work on Apple devices.
AZW works natively on Kindle, and Amazon provides Kindle reader software for all platforms (including Apple devices).
EPUB is an open format and readers exist for tablets and PCs (but not for the Kindle). However, your mileage may vary with each reader since the standard is open and potentially very broad.
PDFs are supported by all devices but reflowing is not possible.
AZW (the Kindle format used by the “classic” Kindles) does not support MathML, so equations must be rendered as graphics. However, and this is critical, the AZW format does not support inline graphics, so inline equations look terrible. For classic Kindles, the best approach is simply to produce a fixed-layout PDF with a page size compatible with the Kindle screen. (This is what Amazon already does for existing textbooks.)
iBook is a closed standard and Apple handles the sales channel with an iron fist. Since iBook is basically a proprietary dialect of EPUB and since excellent EPUB readers exist for iPad, iBook would be difficult to recommend over EPUB.
EPUB therefore seems to be the obligatory choice, although, since it is an open standard, one should not expect the flawless level of device support offered by the proprietary formats.
Since the chosen target format, EPUB, is based on XHTML, the first order of business is finding a good tool to convert LaTeX to XHTML. Since TeX is a Turing-complete programming language, conversion is anything but trivial and, indeed, independently of the chosen tool, there will be limitations on the external packages that can be used in a convertible manuscript.
Amongst the available conversion tools, I chose LaTeXML [3] since it formats equations in MathML and it supports a reasonable number of external packages via so-called “bindings” (see Figure 1 and [4] for the up-to-date list). A binding is a piece of custom code that makes the package understandable by LaTeXML; writing a custom binding seems to be a daunting task, so we will not attempt that here.
LaTeXML proceeds in two passes: first it converts LaTeX to XML, then it formats the XML via style sheets into XHTML.
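The two passes map onto LaTeXML's two command-line programs, latexml and latexmlpost. The sketch below only builds and prints the command lines; the helper name latexml_commands and the file name book.tex are assumptions made for illustration, and the options shown should be checked against the installed LaTeXML version before use.

```python
import shlex
from pathlib import Path


def latexml_commands(tex_file: str) -> list[list[str]]:
    """Build the command lines for the LaTeX -> XML -> XHTML passes."""
    src = Path(tex_file)
    xml_file = src.with_suffix(".xml")
    xhtml_file = src.with_suffix(".xhtml")
    return [
        # Pass 1: latexml parses the LaTeX source and writes LaTeXML's XML.
        ["latexml", f"--destination={xml_file}", str(src)],
        # Pass 2: latexmlpost applies the style sheets and writes XHTML.
        ["latexmlpost", "--format=xhtml",
         f"--destination={xhtml_file}", str(xml_file)],
    ]


for cmd in latexml_commands("book.tex"):
    print(shlex.join(cmd))
    # To actually run a pass (requires LaTeXML on the PATH):
    # subprocess.run(cmd, check=True)
```

Keeping the intermediate XML file around is convenient: when only the styling changes, the second pass can be rerun without parsing the LaTeX source again.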
The EPUB format is simply a compressed archive containing a set of XHTML pages (the chapters of the book) plus CSS files and a variety of indexing files. Unfortunately, direct conversion to EPUB from LaTeXML’s XML file is still experimental and undocumented. Until this changes, I have found that the most robust way to package the XHTML file into an EPUB-compliant document is to use Calibre’s ebook-convert utility [7].
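To illustrate what "a compressed archive" means here, the following sketch writes the outer layout of an EPUB with Python's zipfile module. It is not a replacement for ebook-convert and does not produce a complete book: the function name write_epub_skeleton, the OEBPS directory, and the chapter file name are assumptions chosen for illustration, and the package file (content.opf) that a real EPUB requires is deliberately left out.

```python
import io
import zipfile

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def write_epub_skeleton(target, chapters: dict[str, str]) -> None:
    """Write the container layout of an EPUB (not a complete, valid book)."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry must come first and be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        # container.xml tells the reader where the package (OPF) file lives.
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        # The XHTML chapters; CSS files would be added the same way.
        for name, xhtml in chapters.items():
            epub.writestr(f"OEBPS/{name}", xhtml)
        # Omitted here: OEBPS/content.opf (manifest and spine) and the
        # navigation document, i.e. the "indexing files" of a real EPUB.


buffer = io.BytesIO()
write_epub_skeleton(buffer, {
    "chapter1.xhtml":
        '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>Hi</p></body></html>',
})
with zipfile.ZipFile(buffer) as epub:
    print(epub.namelist())
```

Generating the omitted indexing files correctly is the tedious part, which is why the packaging step is delegated to Calibre. Its basic invocation takes an input file and an output file, for example ebook-convert book.xhtml book.epub (the file names are placeholders), and the output format is inferred from the extension.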