Converting Academic Press PDFs to EPUB — Springer, Elsevier, Wiley
Academic press PDFs from Springer, Elsevier, and Wiley present specific challenges for conversion: two-column layouts, figure captions, multi-level footnotes, cross-references, and complex mathematical notation. This guide covers what to expect and how to get the best results.
Challenges specific to academic PDFs
- Two-column layout: Journal articles use side-by-side columns. A naive converter reads left-to-right across both columns, producing garbled text. Correct conversion requires column detection and re-flow.
- Footnotes and endnotes: Academic papers use extensive footnotes that must be separated from body text and linked correctly.
- Mathematical notation: Equations may be embedded as images (from older LaTeX pipelines) or as text set in math fonts. Image equations cannot be searched or read aloud by a screen reader; equations that end up as real MathML in the EPUB are searchable and accessible.
- Figures and captions: Figure placement relative to body text is fluid in PDF; converters must decide where to insert figures in the reflowable output.
- References section: Dense bibliography formatting requires correct paragraph detection to avoid merging adjacent references into a single block.
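The column detection and re-flow step described in the list above can be sketched as follows. This is a minimal illustration, not toolkit.bot's actual method: it assumes a PDF extractor has already produced text blocks with bounding boxes, and the `Block` type, `reading_order` function, and `full_width_ratio` threshold are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    x0: float  # left edge
    y0: float  # top edge (smaller means higher on the page)
    x1: float  # right edge
    text: str


def reading_order(blocks, page_width, full_width_ratio=0.6):
    """Return blocks in column-aware reading order.

    Full-width blocks (titles, abstracts) split the page into bands.
    Inside each band the left column is read before the right column.
    """
    mid = page_width / 2
    ordered, left, right = [], [], []

    def flush():
        # Emit the whole left column, then the whole right column.
        ordered.extend(left)
        ordered.extend(right)
        left.clear()
        right.clear()

    for b in sorted(blocks, key=lambda b: b.y0):
        if (b.x1 - b.x0) > full_width_ratio * page_width:
            flush()
            ordered.append(b)
        elif (b.x0 + b.x1) / 2 < mid:
            left.append(b)
        else:
            right.append(b)
    flush()
    return ordered


page = [
    Block(310, 100, 550, "R1"),
    Block(50, 40, 550, "Title"),
    Block(50, 200, 290, "L2"),
    Block(50, 100, 290, "L1"),
    Block(310, 200, 550, "R2"),
]
print([b.text for b in reading_order(page, page_width=600)])
```

A naive sort by vertical position alone would interleave the columns (L1, R1, L2, R2), which is the garbled output described above. Real pages also need handling for figures, sidebars, and three-column layouts, which this sketch ignores.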
Springer / Nature PDFs
Springer journal articles are typically two-column PDFs, while book chapters are single-column; both use a clear heading hierarchy. Key characteristics:
- Standard two-column layout in journals; single-column in book chapters
- Numbered references in brackets (e.g., [1], [23])
- Figures placed inline with body text, numbered sequentially
- Abstract in a shaded box at the top
Conversion tip: Springer book chapter PDFs (single-column) convert significantly better than journal article PDFs. If you have both formats, prefer the book chapter version for e-reader use.
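Bracketed reference numbers such as [1] and [23] also give a handle for repairing a bibliography that a converter has merged into one block. The sketch below is one possible post-processing step, assuming the list starts at [1] and is numbered consecutively; `split_references` is a hypothetical helper, not part of any converter.

```python
import re


def split_references(block):
    """Split a merged bibliography into one string per reference.

    Only markers that continue the expected sequence ([1], [2], ...)
    start a new entry, so a stray citation like "[1]" inside another
    reference does not cause a false split.
    """
    text = " ".join(block.split())  # collapse line breaks and extra spaces
    starts = []
    expected = 1
    for m in re.finditer(r"\[(\d+)\]", text):
        if int(m.group(1)) == expected:
            starts.append(m.start())
            expected += 1
    ends = starts[1:] + [len(text)]
    return [text[a:b].strip() for a, b in zip(starts, ends)]


merged = (
    "[1] Smith J. Paper A. 2019.\n"
    "[2] Lee K. Reply to [1]. 2020. [3] Wu H. Paper C. 2021."
)
for ref in split_references(merged):
    print(ref)
```

The sequence check is a deliberate simplification: it fails on lists that skip numbers or restart, so treat the result as a first pass to review by hand.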
Elsevier / ScienceDirect PDFs
Elsevier articles are notoriously complex: columns, sidebars, author contribution statements, and keyword blocks all interrupt the body text flow.
- Two-column layout with wide margins
- Highlights box and keyword list at top (often before abstract)
- Author affiliations as footnotes on the first page
- Numbered references with full bibliographic detail
Conversion tip: Elsevier offers HTML full-text on ScienceDirect for most articles — HTML downloads often convert to EPUB better than the PDF, since column layout issues don't apply. Check if an HTML version is available before converting the PDF.
Wiley Online Library PDFs
Wiley's PDF layout varies by journal but generally features:
- Two-column or single-column depending on journal
- Supporting information appendices at the end
- Wiley branding headers/footers on every page
Conversion tip: Wiley headers and footers sometimes interfere with page boundary detection. If the converted EPUB has stray header text mixed into body paragraphs, this is the likely cause. toolkit.bot's layout detection handles most Wiley patterns correctly.
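If stray header text does show up, one common cleanup is to drop lines that repeat at the top or bottom of most pages. The following is a rough sketch under stated assumptions: the text has already been extracted page by page as lists of lines, and `strip_running_lines` and `min_share` are hypothetical names. It does not describe how toolkit.bot works internally.

```python
import re
from collections import Counter


def strip_running_lines(pages, min_share=0.6):
    """Remove first/last lines that recur on at least min_share of pages.

    pages is a list of pages, each page a list of text lines.
    """
    def key(line):
        # Replace digit runs so "Page 3" and "Page 4" compare equal.
        return re.sub(r"\d+", "#", line.strip())

    counts = Counter()
    for lines in pages:
        # Only the first and last line of a page are header/footer candidates.
        counts.update({key(l) for l in lines[:1] + lines[-1:]})

    threshold = min_share * len(pages)
    repeated = {k for k, n in counts.items() if k and n >= threshold}

    cleaned = []
    for lines in pages:
        out = list(lines)
        if out and key(out[0]) in repeated:
            out = out[1:]
        if out and key(out[-1]) in repeated:
            out = out[:-1]
        cleaned.append(out)
    return cleaned


pages = [
    ["Journal of Examples", "Body text A", "Page 1"],
    ["Journal of Examples", "Body text B", "Page 2"],
    ["Journal of Examples", "Body text C", "Page 3"],
]
print(strip_running_lines(pages))
```

Checking only the first and last line keeps the risk of deleting real body text low, at the cost of missing multi-line headers.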
Best conversion approach for academic PDFs
- Try toolkit.bot first: It handles two-column detection and re-flow for academic layouts. Upload the PDF and check that the reading order is correct.
- Check the figures: Figures may appear out of order in the EPUB relative to the PDF. This is a known limitation of reflowable EPUB — figures can only be placed where the HTML flow allows.
- Verify math rendering: If equations appear as blank spaces, the PDF most likely set them in custom symbol fonts rather than Unicode math characters. This is a PDF-level issue that no converter can fully fix without the original LaTeX source.
- For Calibre users: Calibre's PDF-to-EPUB conversion does not handle two-column layouts well. toolkit.bot or a dedicated academic converter will produce better results.
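The math check in the list above can be partly automated. Custom symbol fonts often surface in extracted text as private-use or replacement characters, so counting them gives a quick warning sign. This is a heuristic sketch only; `suspicious_char_ratio` is a hypothetical helper, and any threshold you apply to its result is an assumption to tune on your own files.

```python
import unicodedata


def suspicious_char_ratio(text):
    """Share of non-whitespace characters that look like font debris.

    Counts private-use (Co) and unassigned (Cn) code points plus the
    replacement character U+FFFD.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    bad = sum(
        1
        for c in chars
        if c == "\ufffd" or unicodedata.category(c) in ("Co", "Cn")
    )
    return bad / len(chars)


print(suspicious_char_ratio("E = mc\u00b2"))        # clean Unicode text
print(suspicious_char_ratio("\ue000\ue001 = ab"))   # private-use glyphs mixed in
```

A high ratio on pages with equations suggests the source PDF is the problem, not the converter. A low ratio does not prove the math is correct, since wrong but valid characters pass this test.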
| Publisher | Layout | Typical conversion quality |
|---|---|---|
| Springer (book chapters) | Single-column | Excellent |
| Springer (journal articles) | Two-column | Good |
| Elsevier | Two-column | Good (HTML preferred) |
| Wiley | Varies | Good |
| Nature (journals) | Single-column | Excellent |
| PLOS ONE | Two-column | Good |
| arXiv preprints | Single-column (LaTeX) | Excellent |
FAQ
Can I legally convert a journal article PDF to EPUB for personal use?
Format-shifting a legally obtained article for accessibility or personal reading is often permitted, for example under fair use in the United States, but the rules vary by jurisdiction and by publisher license. Redistributing the converted EPUB would generally violate copyright. Consult your institution's library for specific guidance.
Why does the math look broken after conversion?
Many older academic PDFs embed equations as images or using custom symbol fonts, not as real Unicode or MathML. When the custom font is not available, equations render as empty boxes or wrong characters. This is a limitation of the source PDF — the original LaTeX source would be needed to produce correct MathML.
Is there a better format than PDF for academic content on e-readers?
Yes: many arXiv preprints are available as LaTeX source, which can be compiled to proper EPUB3 with correct MathML. For published articles, some publishers and repositories offer HTML full-text (Elsevier ScienceDirect, PubMed Central), which converts to EPUB more reliably than PDF.
How do I convert a whole journal issue?
Download each article as a separate PDF and convert individually, or use the toolkit.bot API with a batch script. See PDF to EPUB with curl for a shell script approach.
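A batch run over a folder of article PDFs might be structured like the sketch below. It deliberately leaves the conversion step as a callable you supply, because the toolkit.bot API details are not covered here; `convert_issue` and its `convert(pdf_path, epub_path)` argument are hypothetical names for illustration.

```python
from pathlib import Path


def convert_issue(pdf_dir, out_dir, convert):
    """Convert every PDF in pdf_dir and return a list of failures.

    convert is any callable taking (pdf_path, epub_path); it is a
    placeholder for whichever converter or API client you use.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failures = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = out / (pdf.stem + ".epub")
        if target.exists():
            continue  # already converted on an earlier run
        try:
            convert(pdf, target)
        except Exception as exc:
            # Record the failure and keep going with the next article.
            failures.append((pdf.name, str(exc)))
    return failures
```

Skipping existing outputs makes the script safe to re-run after a partial failure, and collecting errors instead of stopping means one bad article does not block the rest of the issue.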
toolkit.bot handles two-column layouts, footnotes, and academic paper structure — free, no account required.