How to Convert Large PDF Files to EPUB — Files Over 50 MB
Most online PDF-to-EPUB converters impose a file size limit — typically 10 to 50 MB. This is a real constraint for scanned books, illustrated textbooks, and multi-chapter academic works that regularly exceed 100 MB. This guide covers strategies to work around these limits.
Why are some PDFs so large?
- Scanned books: Each page is a high-resolution image (300 DPI TIFF or JPEG). A 300-page scanned book can easily reach 150–300 MB.
- Embedded high-res images: Art books, cookbooks, and scientific papers with full-color figures embed large image files.
- Embedded fonts: Every embedded font adds to the file size, from tens of kilobytes for a subset to several megabytes for a fully embedded font (large CJK fonts are the worst case).
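- Multiple layers: PDFs with OCR text layers (text + image) keep the full page image and add a hidden text layer on top. The text itself is small, but some OCR tools re-encode the page images and can noticeably increase the file size.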
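To see where the scanned-book figure comes from, here is a rough back-of-the-envelope estimate in Python. The page size (US Letter), grayscale depth, and the 10:1 JPEG compression ratio are illustrative assumptions, not measurements; real scans vary widely.

```python
def scanned_pdf_size_mb(pages, dpi=300, width_in=8.5, height_in=11.0,
                        bytes_per_pixel=1, compression_ratio=10.0):
    """Rough size estimate for a scanned PDF, in MB (1 MB = 1,048,576 bytes).

    bytes_per_pixel=1 models 8-bit grayscale; use 3 for RGB color.
    compression_ratio is an assumed JPEG ratio, not a measured value.
    """
    pixels_per_page = round(width_in * dpi) * round(height_in * dpi)
    raw_bytes = pixels_per_page * bytes_per_pixel     # uncompressed page image
    compressed_bytes = raw_bytes / compression_ratio  # after assumed JPEG compression
    return pages * compressed_bytes / 1_048_576


# A 300-page grayscale scan at 300 DPI versus the same book at 150 DPI
print(f"300 DPI: {scanned_pdf_size_mb(300):.0f} MB")
print(f"150 DPI: {scanned_pdf_size_mb(300, dpi=150):.0f} MB")
```

Under these assumptions the 300 DPI scan lands at roughly 240 MB, and halving the resolution cuts the pixel count (and so the estimate) by a factor of four.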
Step 1: Compress the PDF first
Before converting, reduce the PDF's file size. Image resolution is the biggest lever:
Using Ghostscript (free, command line)
# Compress to screen quality (72 DPI): smallest file, but scanned text can become hard to read
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=compressed.pdf input.pdf
# Or ebook quality (150 DPI) — better balance
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=compressed.pdf input.pdf
Each preset downsamples images to a target resolution: /screen to 72 DPI (screen viewing), /ebook to 150 DPI (e-reader quality), and /printer to 300 DPI. For EPUB reading, /ebook is usually the right balance between file size and legibility.
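If you prefer to drive Ghostscript from Python, the command above can be wrapped as follows. This is a minimal sketch: the function names are made up for this guide, and it assumes the `gs` binary is on your PATH.

```python
import subprocess

# Preset names accepted by Ghostscript's -dPDFSETTINGS option
PRESETS = {"screen", "ebook", "printer", "prepress", "default"}


def build_gs_command(input_pdf, output_pdf, preset="ebook"):
    """Return the Ghostscript argument list used in this guide."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return [
        "gs", "-sDEVICE=pdfwrite", "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS=/{preset}", "-dNOPAUSE", "-dQUIET", "-dBATCH",
        f"-sOutputFile={output_pdf}", input_pdf,
    ]


def compress_pdf(input_pdf, output_pdf, preset="ebook"):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.run(build_gs_command(input_pdf, output_pdf, preset), check=True)
```

Calling `compress_pdf("input.pdf", "compressed.pdf")` runs the same /ebook command shown earlier. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.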
Using online tools
iLovePDF (ilovepdf.com) and Smallpdf (smallpdf.com) both offer free PDF compression. Good for one-off files when Ghostscript is not installed.
Expected size reduction
| Original type | Typical size with Ghostscript /ebook (before → after) | Reduction |
|---|---|---|
| 300 DPI scanned book (200 pages) | 200 MB → ~30 MB | 85% reduction |
| Art book with color images | 150 MB → ~20 MB | 87% reduction |
| Academic paper (text-heavy) | 5 MB → ~3 MB | 40% reduction |
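The reduction percentages in the table follow directly from the before and after sizes. A small sketch of the arithmetic (the helper name is hypothetical), which you can reuse to check your own results:

```python
def reduction_percent(before_mb, after_mb):
    """Percentage by which a file shrank, given sizes in the same unit."""
    if before_mb <= 0:
        raise ValueError("before_mb must be positive")
    return (before_mb - after_mb) / before_mb * 100


# Sizes taken from the table above
for label, before, after in [
    ("Scanned book", 200, 30),
    ("Art book", 150, 20),
    ("Academic paper", 5, 3),
]:
    print(f"{label}: {reduction_percent(before, after):.0f}% smaller")
```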
Step 2: Split into smaller parts
If compression alone doesn't get you under the limit, split the PDF into chapters or sections. Each part can be converted independently and later merged in Calibre or Sigil.
Using pdftk (free)
# Split pages 1-100 into part 1
pdftk input.pdf cat 1-100 output part1.pdf
# Split pages 101-200 into part 2
pdftk input.pdf cat 101-200 output part2.pdf
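For longer books, typing each range by hand gets tedious. The sketch below generates the pdftk commands for every part; the function name and the `partN.pdf` naming are assumptions that follow the examples above. You need the total page count first, which `pdftk input.pdf dump_data` reports as NumberOfPages.

```python
import shlex


def pdftk_split_commands(input_pdf, total_pages, pages_per_part=100):
    """Build one 'pdftk ... cat FIRST-LAST output partN.pdf' command per part."""
    commands = []
    starts = range(1, total_pages + 1, pages_per_part)  # pdftk pages are 1-based
    for part, first in enumerate(starts, start=1):
        last = min(first + pages_per_part - 1, total_pages)
        args = ["pdftk", input_pdf, "cat", f"{first}-{last}",
                "output", f"part{part}.pdf"]
        commands.append(shlex.join(args))  # quotes file names when needed
    return commands


for command in pdftk_split_commands("input.pdf", total_pages=250):
    print(command)
```

For a 250-page file this prints three commands covering pages 1-100, 101-200, and 201-250.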
Using Python (pypdf)
from pypdf import PdfReader, PdfWriter


def split_pdf(input_path, pages_per_chunk=100):
    """Write consecutive chunks of at most pages_per_chunk pages each."""
    reader = PdfReader(input_path)
    total = len(reader.pages)
    for start in range(0, total, pages_per_chunk):
        writer = PdfWriter()
        end = min(start + pages_per_chunk, total)
        # Copy pages start..end-1 (0-based) into the new chunk
        for i in range(start, end):
            writer.add_page(reader.pages[i])
        # File name carries the 0-based start and the exclusive end index
        out = f"chunk_{start:04d}_{end:04d}.pdf"
        with open(out, "wb") as f:
            writer.write(f)
        print(f"Written: {out}")


split_pdf("large_book.pdf", pages_per_chunk=80)
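Choosing pages_per_chunk by trial and error is slow. A rough way to pick it is to divide the size limit by the average page size. The sketch below assumes pages are roughly equal in size, which is rarely exact, so it applies a safety margin (0.8 here, an arbitrary choice); the function name is hypothetical.

```python
import math


def pages_per_chunk_for_limit(file_size_bytes, total_pages,
                              limit_bytes=50 * 1024 * 1024, safety=0.8):
    """Estimate how many pages fit under limit_bytes, with a safety margin."""
    if file_size_bytes <= 0 or total_pages <= 0:
        raise ValueError("file size and page count must be positive")
    avg_page_bytes = file_size_bytes / total_pages
    pages = math.floor(limit_bytes * safety / avg_page_bytes)
    # At least one page per chunk, and never more than the whole document
    return max(1, min(pages, total_pages))


# A 180 MB, 300-page scan against a 50 MB limit
print(pages_per_chunk_for_limit(180 * 1024 * 1024, 300))
```

Check the size of each chunk after splitting anyway: image-heavy chapters can exceed the estimate, and shared resources such as fonts may be copied into every chunk.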
Step 3: Use the API for files over the browser limit
The toolkit.bot API supports larger files than the browser tool. Use the REST API directly:
# Check your plan's file size limit
curl https://toolkit.bot/api/v1/limits -H "Authorization: Bearer $TOOLKIT_API_KEY"
# Upload a large file via API (same as normal upload)
curl -X POST https://toolkit.bot/api/v1/jobs -H "Authorization: Bearer $TOOLKIT_API_KEY" -F "file=@large_document.pdf"
API users on paid plans have higher file size limits than the browser tool's free tier. See toolkit.bot/api for current plan limits.
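The three steps in this guide amount to a simple decision based on file size. A sketch of that decision, where `api_limit_bytes` is a placeholder you would fill in from your own plan (this guide does not state the paid limits) and the 50 MB browser limit is the free-tier figure quoted here:

```python
BROWSER_LIMIT_BYTES = 50 * 1024 * 1024  # free browser limit quoted in this guide


def choose_route(size_bytes, api_limit_bytes,
                 browser_limit_bytes=BROWSER_LIMIT_BYTES):
    """Decide how to convert a PDF of the given size.

    api_limit_bytes is your plan's limit; look it up, do not guess it.
    """
    if size_bytes <= browser_limit_bytes:
        return "browser"
    if size_bytes <= api_limit_bytes:
        return "api"
    return "compress-or-split"


# Example with a made-up 200 MB plan limit
MB = 1024 * 1024
print(choose_route(120 * MB, api_limit_bytes=200 * MB))
```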
Batch processing a large collection
For converting a whole library of large PDFs, use a shell script combining compression, splitting, and API upload:
#!/usr/bin/env bash
set -euo pipefail
shopt -s nullglob   # skip the loop when the directory has no PDFs

LIMIT=52428800      # 50 MB in bytes

for pdf in *.pdf; do
  # GNU stat first, BSD/macOS stat as fallback
  size=$(stat -c%s "$pdf" 2>/dev/null || stat -f%z "$pdf")
  epub="${pdf%.pdf}.epub"
  src="$pdf"

  if [ "$size" -gt "$LIMIT" ]; then
    echo "Compressing $pdf ($(( size / 1048576 )) MB)..."
    src="/tmp/compressed_$pdf"
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$src" "$pdf"
  fi

  echo "Converting $src..."
  UPLOAD=$(curl -sf -X POST "https://toolkit.bot/api/v1/jobs" -H "Authorization: Bearer $TOOLKIT_API_KEY" -F "file=@$src")
  # grep-based parsing only matches compact JSON with no space after the colon
  JOB_ID=$(echo "$UPLOAD" | grep -o '"job_id":"[^"]*"' | cut -d'"' -f4 || true)
  if [ -z "$JOB_ID" ]; then
    echo "No job_id in response for $pdf, skipping" >&2
    continue
  fi

  converted=0
  for i in $(seq 1 60); do   # poll every 5 seconds, up to 5 minutes
    sleep 5
    STATUS=$(curl -sf "https://toolkit.bot/api/v1/jobs/$JOB_ID" -H "Authorization: Bearer $TOOLKIT_API_KEY")
    if echo "$STATUS" | grep -q '"done"'; then
      DL=$(echo "$STATUS" | grep -o '"download_url":"[^"]*"' | cut -d'"' -f4)
      curl -sfL "$DL" -H "Authorization: Bearer $TOOLKIT_API_KEY" -o "$epub"
      echo "Done: $epub"
      converted=1
      break
    fi
  done
  [ "$converted" -eq 1 ] || echo "Timed out waiting for $pdf" >&2
done
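The polling loop in the script above relies on grep to read JSON, which breaks if the response contains extra whitespace. Below is a sketch of the same loop in Python with proper JSON parsing. The response shape is an assumption inferred from the script's grep patterns: `download_url` and the value "done" appear there, while the `status` field name and the "failed"/"error" values are hypothetical. The HTTP call is passed in as a function so the loop can be tested without a network.

```python
import json
import time


def wait_for_job(fetch_status, attempts=60, delay=5.0, sleep=time.sleep):
    """Poll until the job is done and return its download URL.

    fetch_status: callable returning the raw JSON body of the job status
    endpoint (for example, a small wrapper around urllib or requests).
    """
    for _ in range(attempts):
        sleep(delay)
        job = json.loads(fetch_status())
        status = job.get("status")  # assumed field name
        if status == "done":
            return job["download_url"]
        if status in ("failed", "error"):  # hypothetical failure values
            raise RuntimeError(f"conversion failed: {job}")
    raise TimeoutError(f"job not finished after {attempts} attempts")


# Dry run with canned responses instead of real HTTP calls
responses = iter([
    '{"status": "processing"}',
    '{"status": "done", "download_url": "https://example.invalid/book.epub"}',
])
print(wait_for_job(lambda: next(responses), delay=0))
```

Unlike the shell loop, this version reports a timeout explicitly and stops early if the job fails.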
FAQ
What is toolkit.bot's file size limit?
The free browser tool supports files up to 50 MB. API users on paid plans have higher limits — see the pricing page at toolkit.bot/api.
Will compressing the PDF affect conversion quality?
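Ghostscript's /ebook setting (150 DPI) is appropriate for reading on e-readers. In PDFs that contain real text, text fidelity is unaffected, because only image resolution is reduced. In scanned PDFs the text exists only as page images, so downsampling softens the text too and can lower OCR accuracy on small print. For purely text-based PDFs, compression has negligible quality impact.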
Can I merge the EPUB parts back together?
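Yes. Note that Calibre's built-in "Merge book records" command combines library entries rather than joining the content of the EPUB files. To concatenate the parts, install the EpubMerge plugin for Calibre: add all EPUB parts to the library, select them in reading order, and run EpubMerge. The merged EPUB will have all chapters in sequence. You may need to manually fix the table of contents in Sigil after merging.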
Upload your PDF (up to 50 MB free) at toolkit.bot — or use the API for larger files.