0x2a.at

... and their answer to everything

Archive for Christoph Sieghart

Manipulating pdfs with pdfjam

posted on 2011-02-14, author: Christoph Sieghart

Every once in a while you need to manipulate pdfs at a high level. That's where pdfjam comes to the rescue. It is the Swiss Army knife of high-level pdf manipulation. The operations I perform most often are merging, splitting, trimming and putting multiple pages on one sheet.

Merging

To merge two or more pdf documents, just pass them to pdfjam with the appropriate page selectors.

pdfjam file1.pdf '-' file2.pdf '1,2' file3.pdf '2-' --outfile output.pdf

This takes all pages of file1.pdf, pages 1 and 2 of file2.pdf and every page of file3.pdf from page 2 onward, and merges them into a file called output.pdf. For more information on pdfjam page selectors, see the pdfjam help.

Splitting

Splitting works just the way merging does.

pdfjam file1.pdf '1,2' --outfile first.pdf
pdfjam file1.pdf '3-' --outfile second.pdf

If you know of a way to do it in one go, drop me an email.
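As far as the commands above go, each output file needs its own pdfjam call. A minimal sketch of a wrapper that at least hides the repetition is shown below. The function name split_pdf, the 'SELECTOR:OUTFILE' argument format and the PDFJAM variable are all made up for this example and are not part of pdfjam.

```shell
# Hypothetical helper, not part of pdfjam:
#   split_pdf INPUT 'SELECTOR:OUTFILE' ['SELECTOR:OUTFILE' ...]
# It still runs pdfjam once per output file.
split_pdf() {
    input=$1
    shift
    for spec in "$@"; do
        pages=${spec%%:*}     # text before the first colon: the page selector
        outfile=${spec#*:}    # text after the first colon: the output file
        # Set PDFJAM=echo to print the arguments instead of running pdfjam.
        "${PDFJAM:-pdfjam}" "$input" "$pages" --outfile "$outfile" || return 1
    done
}
```

With this in place, split_pdf file1.pdf '1,2:first.pdf' '3-:second.pdf' should issue the same two pdfjam calls as above.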

Trimming

pdfjam --trim '1cm 2cm 1cm 2cm' --clip true file1.pdf --outfile output.pdf

This trims 1cm from the left and right, and 2cm from the top and bottom of every page. The four values are given in the order left, bottom, right, top. This is especially useful for removing blank margins from pdfs (really nice for reading on an ebook reader like the Amazon Kindle).
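The order of the four trim values (left, bottom, right, top) is easy to mix up. A small sketch of a helper that builds the value from one horizontal and one vertical margin follows. The name trim_spec is hypothetical and only serves this example.

```shell
# Hypothetical helper: build a --trim value from a horizontal and a
# vertical margin.
trim_spec() {
    horizontal=$1   # trimmed from the left and the right
    vertical=$2     # trimmed from the bottom and the top
    # The order is left, bottom, right, top.
    printf '%s %s %s %s\n' "$horizontal" "$vertical" "$horizontal" "$vertical"
}
```

It could then be used as pdfjam --trim "$(trim_spec 1cm 2cm)" --clip true file1.pdf --outfile output.pdf, which should be equivalent to the command above.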

Upping

pdfjam --nup 2x2 file1.pdf --outfile output.pdf

This rearranges the pdf file so that each sheet contains 4 pages of the original. Useful for printing slides.

This is just a small peek at what pdfjam can do. If you need help, check out

pdfjam --help

and the LaTeX pdfpages package.

tag: cli, tools

sed - non-greedy matching

posted on 2008-07-08, author: Christoph Sieghart

I just needed non-greedy matching in sed and the man page had nothing to say about it (I was hoping for a simple flag, but nada).

Like all other regexp engines I know of, sed uses greedy matching by default. The trick to get non-greedy matching in sed is to replace the wildcard with a character class that matches every character except the one that terminates the match. I know, a no-brainer, but I wasted precious minutes on it and shell scripts should be, after all, quick and easy.

So in case somebody else might need it:

Greedy matching:

% echo "<b>foo</b>bar" | sed 's/<.*>//g'
bar

Non-greedy matching:

% echo "<b>foo</b>bar" | sed 's/<[^>]*>//g'
foobar
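
The same trick carries over to other single-character terminators. As a small sketch, here it is applied to double-quoted strings; the sample input and the replacement text X are made up for illustration. Note that it relies on the terminator being a single character.

```shell
# Replace double-quoted strings with X: greedy vs. negated character class.
input='a "one" b "two" c'

# Greedy: .* runs up to the last quote on the line.
printf '%s\n' "$input" | sed 's/".*"/X/g'
# a X c

# Negated class: [^"]* stops at the next quote.
printf '%s\n' "$input" | sed 's/"[^"]*"/X/g'
# a X b X c
```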
tag: shell