Crop PDFs
Monday, 28 January 2008 14:22

I came across a lot of ebooks in PDF format (e.g., from Planet PDF) which are in principle suitable for display on the CyBook, but feature margins that are way too large. Headers, footers, and advertisements are not really needed either. A PDF file can be cropped while keeping its structure, such as bookmarks and links, by replacing the MediaBox or CropBox and then repairing the XRef table with pdftk (pdfcrop2):

perl -pe "s/(Crop|Media)Box\s*\[(.+?)\]/\$1Box\[$2\]/g;" "$1" | pdftk - output "$3"

The cropping script takes three arguments:

  1. Input file
  2. New crop box, i.e. "left bottom right top" in PostScript points (1/72 inch) measured from the bottom-left corner of the page, e.g. "0 0 612 792" for a full US Letter page. To obtain these values, one could use xv or any other image manipulation program that can open PDF pages without applying their existing CropBox.
  3. Output file
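
Image programs typically report positions in pixels counted from the top left, while the crop box wants points counted from the bottom left. The following is a rough sketch of that conversion, assuming the page was rendered in full at a known resolution and that its MediaBox starts at "0 0". The function name px_to_box and its argument order are made up for illustration, and the integer arithmetic rounds down.

```shell
#!/bin/sh
# Hypothetical helper: convert a rectangle measured in an image viewer
# (pixels, origin at the top left, y growing downward) into the
# "left bottom right top" order used for the crop box
# (points, origin at the bottom left).
# Usage: px_to_box DPI PAGE_HEIGHT_PT X_LEFT Y_TOP X_RIGHT Y_BOTTOM
px_to_box() {
    dpi=$1; page_h=$2; x0=$3; y0=$4; x1=$5; y1=$6
    # 72 points per inch, so points = pixels * 72 / dpi (integer division).
    left=$(( x0 * 72 / dpi ))
    right=$(( x1 * 72 / dpi ))
    # Flip the vertical axis: a distance from the top becomes a height from the bottom.
    top=$(( page_h - y0 * 72 / dpi ))
    bottom=$(( page_h - y1 * 72 / dpi ))
    echo "$left $bottom $right $top"
}

# A Letter page (792 pt high) rendered at 144 dpi, text area selected in the viewer.
px_to_box 144 792 100 120 1124 1464
# prints: 50 60 562 732
```

The printed string can then be passed as the second argument of the cropping script.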