Thomas Fischer's Weblog

Life, Linux, LaTeX

Archive for April 2008

Text on PDF Files

As promised last time, I’m going to show a small bash script which puts text on existing pdf files. The script uses pdflatex, some styles and auxiliary programs to do its magic. It is currently a work in progress, so feedback is welcome.

Once the script is placed somewhere in the path, you can apply it to any pdf document. The script’s parameters are grouped into three categories:

  • Global parameters:

    -h
    Prints help on parameters
    -l
    Puts all boxes (see below) above any content in the pdf file. May be necessary under certain circumstances.
    -p prefixfile
    Inserts additional commands (e.g. \usepackage) from prefixfile into the document prefix (the LaTeX preamble)
  • I/O parameters:

    -i infile
    Input file’s name
    -o outfile
    Optional argument for the output file’s name. If no name is given, the input file’s name with -textonpdf appended will be used
  • Box parameters. Boxes are text frames with coordinates (x/y), a width, and textual content. Multiple boxes may be defined; each -t text is placed at the coordinates and width given by the preceding parameters

    -x number
    absolute x-coordinate of the next box, given in cm
    -y number
    absolute y-coordinate of the next box, given in cm
    -w number
    width of the next box, given in cm
    -t text
    textual content of the box. Occurrences of %d will be replaced by the current date, %p will be replaced with the current page’s number

    Boxes will be put on every page. The boxes’ content is centered.
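How the script collects several boxes from one command line is not shown in this post. As a rough sketch only, repeated -x/-y/-w/-t options could be gathered with bash's getopts, emitting one box each time -t is seen. The function name parse_boxes, the `|`-separated output format and the default values of 0 are assumptions made for illustration and are not taken from textonpdf.sh:

```shell
# Sketch only: collect repeated -x/-y/-w/-t options into boxes.
# parse_boxes, its output format and the 0 defaults are hypothetical.
parse_boxes() {
  local OPTIND=1 opt x=0 y=0 w=0
  while getopts "x:y:w:t:" opt; do
    case "$opt" in
      x) x=$OPTARG ;;
      y) y=$OPTARG ;;
      w) w=$OPTARG ;;
      # -t closes a box: print it with the most recent x, y and w
      t) printf '%s|%s|%s|%s\n' "$x" "$y" "$w" "$OPTARG" ;;
      *) return 1 ;;
    esac
  done
}

parse_boxes -x 5 -y 27 -w 11 -t '%p' -x 2 -w 17 -y 2 -t 'File created on %d'
```

Because a box is only emitted at -t, the order of -x, -y and -w before it does not matter, which matches the description that each text uses the coordinates specified in the previous parameters.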

Let me show you some examples:

textonpdf.sh -x 0 -y 27 -w 21 -t '%p' -i myfile.pdf -o output.pdf

will put page numbers (as given by -t '%p') centered (given a page is 21cm wide) at the bottom of each page (27cm below the top). As mentioned, multiple boxes may be specified. The following example additionally puts the current date at the top of each page:

textonpdf.sh -x 5 -y 27 -w 11 -t '%p' -x 2 -w 17 -y 2 \
  -t 'File created on \textbf{%d}' -i myfile.pdf -o output.pdf
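In both examples, the x-coordinate and width of each box are chosen so that the box sits in the middle of a 21cm wide page: a box of width w starts at (21 - w) / 2. The small helper below does this arithmetic. It is only a sketch: center_x is a hypothetical name, and it assumes that -x denotes the left edge of the box, which the examples suggest but the post does not state explicitly.

```shell
# center_x WIDTH [PAGE_WIDTH]: print the left x-coordinate (in cm) that
# centers a box of WIDTH cm on a page of PAGE_WIDTH cm (default: 21).
# Hypothetical helper, not part of textonpdf.sh.
center_x() {
  local width=$1 page=${2:-21}
  awk -v w="$width" -v p="$page" 'BEGIN { printf "%g\n", (p - w) / 2 }'
}

center_x 11   # prints 5, as used for the page number box
center_x 17   # prints 2, as used for the date box
```

The result can be passed on directly, e.g. -x "$(center_x 11)" -w 11.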

As you can see, LaTeX commands can be used in the inserted text, as the boxes’ content is directly used in the intermediate tex document.

Updated: Finally, a more complex example for a watermark. First, we create a prefix file, which will be loaded using the -p switch.

\usepackage[scaled]{helvet}
\usepackage[utf8]{inputenc}
\usepackage{rotating}
\usepackage{xcolor}

These packages load Helvetica as the font, set the input encoding to UTF-8 (may be different for you), and include the rotating and xcolor packages. Now, we call the script:

textonpdf.sh -x 1 -y 1 -p prefix.tex -w 10 -t \
  '\centering\begin{turn}{25}\begin{minipage}{10cm}\centering\bfseries\sffamily\Huge\color{red}Nur für den\\internen Gebrauch\end{minipage}\end{turn}' \
  -i test.pdf -o test2.pdf

Here, you can see two new things: first, the prefix file prefix.tex (see above), and second, some tex commands included in the text given to the -t switch. In the text, there are commands to turn the text by 25 degrees and to create a minipage 10cm wide (required for the line break later). Inside the minipage, all text is centered, set in Helvetica, huge and red. The text itself is two lines long and warns you that the document is for internal use only. Whether you add the -l switch makes a difference here, as it puts the watermark above the page’s content.
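If you call the script from another script, the prefix file can be written on the fly with a here-document. This is a minimal sketch; write_prefix is a hypothetical helper name, and the quoted 'EOF' delimiter is what keeps the shell from touching the backslashes:

```shell
# write_prefix FILE: write the four \usepackage lines from above to FILE.
# Hypothetical helper, not part of textonpdf.sh.
write_prefix() {
  cat > "$1" <<'EOF'
\usepackage[scaled]{helvet}
\usepackage[utf8]{inputenc}
\usepackage{rotating}
\usepackage{xcolor}
EOF
}

dir=$(mktemp -d)
write_prefix "$dir/prefix.tex"
cat "$dir/prefix.tex"
# The file can now be handed to the script via -p "$dir/prefix.tex".
rm -rf "$dir"
```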

Does it work for you, too? Any bugs or comments? Let me know.

Download script

Written by Thomas Fischer

April 27, 2008 at 0:00

Posted in LaTeX, Linux

Modifying PDF Files with Command Line Tools

Sometimes it is necessary to edit a pdf file, e.g. by extracting selected pages, putting multiple pages on one page or changing the page margins. For all these problems there is a toolchain called PDFjam.

PDFjam is a collection of shell scripts that use pdfLaTeX to perform modifications on one or more pdf documents. Installing PDFjam is quite simple, as most Linux distributions offer packages.

PDFjam consists of three shell scripts:

pdfnup
Puts multiple pages on one page, may perform additional operations such as scaling. The PostScript equivalent for this tool is psnup.
pdfjoin
Combines multiple pdf documents into one file, may perform additional operations.
pdf90
Rotates pages in 90 degree steps.

For PostScript documents, similar operations can be performed using PSUtils or mpage.

I’m going to present some typical use cases where you want to use the PDFjam tools:

  1. You got slides from a lecture or presentation. Each slide is one page in the pdf document and it is a waste of paper (and trees ;-)) to print it this way. To put 2×4 pages on one page, use the following command:
    pdfnup --nup 2x4 slides.pdf

    The result file will be called slides-2x4.pdf. Now, you may want to have some margin around these eight slides on each page, e.g. to add personal notes or for punching. Simply scale the page’s content:

    pdfnup --nup 2x4 --scale 0.9 slides.pdf

    Adapt the scale parameter to your personal needs. To have space between each of the eight slides, add the --delta parameter:

    pdfnup --nup 2x4 --scale 0.9 --delta "1cm 1cm" slides.pdf

    Here, one centimeter of space is added between slides, both horizontally and vertically. Of course, the program is not limited to slides: you can also put two pages on one side of a sheet of paper. In this case, use --nup 2x1.

  2. You have a set of pdf documents from several sources and you want to combine them into a single document. E.g. from SpringerLink you can get whole books split into single files per chapter or section. To recombine a sequence of pdf files into one pdf file, use the following command:
    pdfjoin --outfile book.pdf  chapter1.pdf chapter2.pdf ...

    Parameter --outfile determines the resulting pdf file’s name.
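With many chapter files, a plain glob such as chapter*.pdf sorts chapter10.pdf before chapter2.pdf. The following bash sketch builds the file list in natural order before passing it to pdfjoin. The function names list_chapters and join_chapters are made up for this example, sort -V is a GNU extension, and file names containing newlines are not handled:

```shell
# list_chapters DIR: print DIR's chapter*.pdf files in natural order,
# so chapter2.pdf comes before chapter10.pdf (sort -V is a GNU extension).
list_chapters() {
  find "$1" -maxdepth 1 -name 'chapter*.pdf' | sort -V
}

# join_chapters DIR OUT: hand the ordered list to pdfjoin.
join_chapters() {
  local files
  mapfile -t files < <(list_chapters "$1")
  pdfjoin --outfile "$2" "${files[@]}"
}
```

Calling join_chapters . book.pdf would then combine all chapter files of the current directory into book.pdf.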

Both pdfjoin and pdfnup provide many more options to scale, rotate, or crop the files to be processed. Simply use the parameter --help to get an overview of all available options.

Next time, we will have a look at how to add text or any other content to existing pdf documents …

Written by Thomas Fischer

April 25, 2008 at 0:00

Posted in LaTeX, Linux