[Rx] How to Convert PDF to ePUB

Converting a PDF to ePUB format isn’t difficult, but getting quality results seems to be nigh on impossible.

DevelopRx assembled a team of crack engineers (actually just me) to tackle this issue and solve it using readily available open source software (a.k.a. FREE!).

The process described in this article is primarily for text-based PDFs with common headers and footers, otherwise known as “books.”

 

Problem

A typical book-type PDF normally has chapter headings and page numbers. When these pages are converted to another format like ePUB, all these artifacts litter the text. It’s possible to go in and edit out all these strings by hand, but it’s very tedious.

Typical PDF with Header and Footer

Converted PDF in ePUB Format

Solution

The solution is to crop the PDF and chop off the headers and footers. Save that header-less, page-number-less file and then convert to ePUB. Seems so easy, no? But wait, there’s more!

When a PDF is cropped, the only thing that changes is the visible portion. This means you’ll still get all the headers and footers in the converted ePUB. I won’t tell you how long it took me to figure this out…
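To see why a crop alone doesn’t help, here is a toy model in Python. It is not a real PDF library, and every name in it (TextItem, Page, crop, extract_text, extract_visible_text) is made up for illustration. The assumption is that a crop only records a smaller visible window, while a converter reads every text item on the page regardless.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TextItem:
    text: str
    y: float  # vertical position in points, measured from the bottom of the page


@dataclass
class Page:
    height: float
    items: List[TextItem]
    # (bottom, top) of the visible area; None means the whole page is visible
    crop_box: Optional[Tuple[float, float]] = None


def crop(page: Page, bottom: float, top: float) -> Page:
    """Mimic a viewer-style crop: only the visible window changes."""
    return Page(page.height, page.items, crop_box=(bottom, top))


def extract_text(page: Page) -> List[str]:
    """Mimic a converter that reads every text item and ignores the crop box."""
    return [item.text for item in page.items]


def extract_visible_text(page: Page) -> List[str]:
    """Keep only the text items that fall inside the crop box."""
    if page.crop_box is None:
        return extract_text(page)
    bottom, top = page.crop_box
    return [item.text for item in page.items if bottom <= item.y <= top]


# Hypothetical page: a header, one line of body text, and a page number.
page = Page(
    height=792,
    items=[
        TextItem("Chapter 1", 760),
        TextItem("It was a dark and stormy night.", 400),
        TextItem("Page 12", 30),
    ],
)
cropped = crop(page, bottom=60, top=730)

print(extract_text(cropped))          # header and footer are still in the list
print(extract_visible_text(cropped))  # only the body text is left
```

The second function is what we actually want from the workflow: the text outside the crop window has to be dropped, not just hidden.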

If you have Adobe Acrobat Pro (which you can rent for $15/month), you can do the cropping magic and actually easily delete the cropped area. Yay! Or you can…

Convert PDF to ePUB using Open Source Software

Download Briss, Calibre, and LibreOffice Writer (part of LibreOffice).

BRISS

Run Briss to crop pages.

Briss is a simple Java program that runs by clicking the executable.

Select your PDF under File > Load File.

If a dialog pops up after you select the file, cancel it.

When the file is loaded, Briss shows all the pages overlaid on one another, with one stack for even pages and one for odd pages. Usually the program detects the boundaries for you. You can also drag the corners to set the boundaries at the maximum page width.

Choose Rectangle > Select all, then Action > Crop PDF.

This will save the file with a _cropped suffix: Lorum Ipsum_cropped.pdf

LIBREOFFICE WRITER

Run LibreOffice Writer and open the cropped PDF. It may take a while if it’s a large file.

Depending on your default font, you may get a file with words outside the boundaries.

DON’T PANIC!

This appears to be a font problem. The easiest way to solve it is to change the size of the pages:

Page > Properties

Change the width (not the height!) and set the margins to zero.

This puts the text back inside the boundaries and still leaves the headers and footers in the Gray Zone.

Export the file to PDF.

CALIBRE E-BOOK MANAGER

Open Calibre and drag the newly exported PDF file into it.

Click Convert books, set Output format (upper right) to EPUB, then click OK (bottom right).

When the little chipmunks stop running on the wheel in the bottom right of the main screen, the converted PDF in ePUB format should be in your library.
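If you’d rather skip the clicking, Calibre also ships a command-line tool called ebook-convert. Below is a minimal Python sketch that builds and runs that command. It assumes ebook-convert is on your PATH; the function names are made up for illustration, and the output name is simply the input name with an .epub extension.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(pdf_path: str) -> List[str]:
    """Build an ebook-convert command that writes an EPUB next to the PDF."""
    source = Path(pdf_path)
    target = source.with_suffix(".epub")
    return ["ebook-convert", str(source), str(target)]


def convert(pdf_path: str) -> None:
    """Run the conversion; needs Calibre's command-line tools on the PATH."""
    subprocess.run(build_convert_command(pdf_path), check=True)


# Show the command without running it (the file name is only an example).
print(build_convert_command("Lorum Ipsum_cropped.pdf"))
```

Calling convert() on the PDF exported from the previous step should produce the same kind of result as the Convert books button, though the default settings may differ from what you have set in the GUI.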

There may be a few line return errors but the rest should be…

 All good!
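If the stray line returns bother you, one possible cleanup is sketched below. It is not part of the workflow above: it assumes you have the chapter text as plain text with blank lines between paragraphs, and the function name unwrap_lines is made up for illustration. The heuristic is crude: any line break inside a paragraph is treated as accidental and replaced with a space.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join lines broken mid-paragraph; blank lines stay as paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", text)
    joined = []
    for paragraph in paragraphs:
        # Drop empty fragments and glue the remaining lines with single spaces.
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        joined.append(" ".join(lines))
    return "\n\n".join(joined)


sample = "The quick brown\nfox jumped.\n\nNext paragraph\ncontinues here."
print(unwrap_lines(sample))
```

This would also flatten anything that is meant to keep its line breaks, such as poetry or code listings, so check the result before trusting it.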