PDFTK
NAME
pdftk - A handy tool for manipulating PDF
SYNOPSIS
pdftk <input PDF files | - | PROMPT>
[ input_pw <input PDF owner passwords | PROMPT> ]
[ <operation> <operation arguments> ]
[ output <output filename | - | PROMPT> ]
[ encrypt_40bit | encrypt_128bit ]
[ allow <permissions> ]
[ owner_pw <owner password | PROMPT> ]
[ user_pw <user password | PROMPT> ]
[ flatten ] [ need_appearances ]
[ compress | uncompress ]
[ keep_first_id | keep_final_id ] [ drop_xfa ] [ drop_xmp ]
[ verbose ] [ dont_ask | do_ask ]
Where:
<operation> may be empty, or:
[ cat | shuffle | burst | rotate |
generate_fdf | fill_form |
background | multibackground |
stamp | multistamp |
dump_data | dump_data_utf8 |
dump_data_fields | dump_data_fields_utf8 |
dump_data_annots |
update_info | update_info_utf8 |
attach_files | unpack_files ]
Handles are often omitted. They are useful when specifying PDF passwords or page ranges, later.
If handles are not given, then passwords are associated with input files by order.
Most pdftk features require that encrypted input PDFs be accompanied by the owner password. If
the input PDF has no owner password, then the user password must be given, instead. If the input
PDF has no passwords, then no password should be given.
When running in do_ask mode, pdftk will prompt you for a password if the supplied password is
incorrect or none was given.
[<operation> <operation arguments>]
Available operations are: cat, shuffle, burst, rotate, generate_fdf, fill_form, background,
multibackground, stamp, multistamp, dump_data, dump_data_utf8, dump_data_fields,
dump_data_fields_utf8, dump_data_annots, update_info, update_info_utf8, attach_files,
unpack_files. Some operations take additional arguments, described below.
If this optional argument is omitted, then pdftk runs in filter mode. Filter mode takes only one
PDF input and creates a new PDF after applying all of the output options, like encryption and
compression.
cat [<page ranges>]
Assembles (catenates) pages from input PDFs to create a new PDF. Use cat to merge PDF
pages or to split PDF pages from documents. You can also use it to rotate PDF pages. Page
order in the new PDF is specified by the order of the given page ranges. Page ranges are
described like this:
[<input PDF handle>][<begin page number>[-<end page number>[<qualifier>]]][<page rotation>]
Where the handle identifies one of the input PDF files, and the beginning and ending page
numbers are one-based references to pages in the PDF file. The qualifier can be even or odd,
and the page rotation can be north, south, east, west, left, right, or down.
If a PDF handle is given but no pages are specified, then the entire PDF is used. If no pages are
specified for any of the input PDFs, then the input PDFs' bookmarks are also merged and
included in the output.
If the handle is omitted from the page range, then the pages are taken from the first input PDF.
The even qualifier causes pdftk to use only the even-numbered PDF pages, so 1-6even yields
pages 2, 4 and 6 in that order. 6-1even yields pages 6, 4 and 2 in that order.
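The selection rule above can be sketched in plain shell. This is only an illustration of the documented behavior, not pdftk's own code; the function name expand_range is hypothetical.

```shell
# expand_range BEGIN END [even|odd]
# Prints the page numbers a range selects, one per line, in range order.
expand_range() {
    begin=$1
    end=$2
    qualifier=${3:-}
    # Walk forward or backward depending on the direction of the range.
    if [ "$begin" -le "$end" ]; then step=1; else step=-1; fi
    page=$begin
    while :; do
        case $qualifier in
            even) want=0 ;;
            odd)  want=1 ;;
            *)    want=$((page % 2)) ;;   # no qualifier: every page matches
        esac
        if [ $((page % 2)) -eq "$want" ]; then echo "$page"; fi
        if [ "$page" -eq "$end" ]; then break; fi
        page=$((page + step))
    done
}

echo $(expand_range 1 6 even)   # prints: 2 4 6
echo $(expand_range 6 1 even)   # prints: 6 4 2
```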
The page rotation setting can cause pdftk to rotate pages and documents. Each option sets the
page rotation as follows (in degrees): north: 0, east: 90, south: 180, west: 270, left: -90, right:
+90, down: +180. left, right, and down make relative adjustments to a page's rotation.
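The difference between the absolute and the relative keywords can be sketched as follows. The function name rotate_page is hypothetical, and wrapping the result into the range 0 to 359 is an assumption made for the illustration rather than something stated above.

```shell
# rotate_page CURRENT KEYWORD
# Prints the rotation (in degrees) that results from applying KEYWORD to a
# page whose rotation is currently CURRENT.
rotate_page() {
    current=$1
    case $2 in
        north) result=0 ;;                      # absolute settings
        east)  result=90 ;;
        south) result=180 ;;
        west)  result=270 ;;
        left)  result=$((current - 90)) ;;      # relative adjustments
        right) result=$((current + 90)) ;;
        down)  result=$((current + 180)) ;;
        *)     echo "unknown rotation: $2" >&2; return 1 ;;
    esac
    # Assumption: wrap the result into the range 0..359.
    echo $(( (result % 360 + 360) % 360 ))
}

rotate_page 90 east    # prints: 90  (absolute, current value ignored)
rotate_page 90 left    # prints: 0
rotate_page 0 left     # prints: 270
rotate_page 270 right  # prints: 0
```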
If no arguments are passed to cat, then pdftk combines all input PDFs in the order they were
given to create the output.
NOTES:
* <end page number> may be less than <begin page number>.
* The keyword end may be used to reference the final page of a document instead of a page
number.
* Reference a single page by omitting the ending page number.
* The handle may be used alone to represent the entire PDF document, e.g., B1-end is the
same as B.
* You can reference page numbers in reverse order by prefixing them with the letter r. For
example, page r1 is the last page of the document, r2 is the next-to-last page of the document,
and rend is the first page of the document. You can use this prefix in ranges, too, for example
r3-r1 is the last three pages of a PDF.
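As an illustration of the end keyword and the r prefix, the following sketch converts a page reference into an ordinary one-based page number for a document of known length. The function name resolve_page is hypothetical; pdftk performs this resolution internally.

```shell
# resolve_page REFERENCE PAGE_COUNT
# Prints the one-based page number that REFERENCE denotes.
resolve_page() {
    ref=$1
    count=$2
    case $ref in
        end)  echo "$count" ;;                      # final page
        rend) echo 1 ;;                             # first page
        r*)   echo $((count - ${ref#r} + 1)) ;;     # counted from the end
        *)    echo "$ref" ;;                        # plain page number
    esac
}

# For a 10-page document:
resolve_page r1 10    # prints: 10
resolve_page r2 10    # prints: 9
resolve_page rend 10  # prints: 1
resolve_page r3 10    # prints: 8, so r3-r1 covers pages 8 to 10
```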
The qualifier can be even or odd, and the page rotation can be north, south, east, west, left,
right, or down.
Each option sets the page rotation as follows (in degrees): north: 0, east: 90, south: 180, west:
270, left: -90, right: +90, down: +180. left, right, and down make relative adjustments to a
page's rotation.
The given order of the pages doesn't change the page order in the output.
generate_fdf
Reads a single input PDF file and generates an FDF file suitable for fill_form out of it to the
given output filename or (if no output is given) to stdout. Does not create a new PDF.
fill_form <FDF data filename | XFDF data filename | - | PROMPT>
Fills the single input PDF's form fields with the data from an FDF file, XFDF file or stdin.
Enter the data filename after fill_form, or use - to pass the data via stdin, like so:
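A minimal sketch of both forms. The file names form.pdf, data.fdf and the two output names are hypothetical, and the guard only exists so that the snippet does nothing when pdftk or the example files are missing.

```shell
# Hypothetical file names; substitute your own form and data files.
form=form.pdf
data=data.fdf

if command -v pdftk >/dev/null 2>&1 && [ -f "$form" ] && [ -f "$data" ]; then
    # Data file named on the command line.
    pdftk "$form" fill_form "$data" output form.filled.pdf
    # Same data passed on stdin by giving - in place of the filename.
    pdftk "$form" fill_form - output form.filled.stdin.pdf < "$data"
    status=ran
else
    status=skipped
fi
echo "fill_form example: $status"
```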
If the input FDF file includes Rich Text formatted data in addition to plain text, then the Rich
Text data is packed into the form fields as well as the plain text. Pdftk also sets a flag that cues
Reader/Acrobat to generate new field appearances based on the Rich Text data. So when the
user opens the PDF, the viewer will create the Rich Text appearance on the spot. If the user's
PDF viewer does not support Rich Text, then the user will see the plain text data instead. If
you flatten this form before Acrobat has a chance to create (and save) new field appearances,
then the plain text field data is what you'll see.
Pdftk uses only the first page from the background PDF and applies it to every page of the
input PDF. This page is scaled and rotated as needed to fit the input page. You can use - to
pass a background PDF into pdftk via stdin.
If the input PDF does not have a transparent background (such as a PDF created from page
scans) then the resulting background won't be visible -- use the stamp operation instead.
multibackground <background PDF filename | - | PROMPT>
Same as the background operation, but applies each page of the background PDF to the
corresponding page of the input PDF. If the input PDF has more pages than the background PDF,
then the final background page is repeated across these remaining pages in the input PDF.
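The page pairing described for multibackground can be sketched like this. The function name background_page_for is hypothetical and only illustrates the stated rule.

```shell
# background_page_for INPUT_PAGE BACKGROUND_PAGE_COUNT
# Prints which page of the background PDF is applied to INPUT_PAGE.
background_page_for() {
    if [ "$1" -le "$2" ]; then
        echo "$1"     # a corresponding background page exists
    else
        echo "$2"     # past the end: the final page is repeated
    fi
}

# A 5-page input combined with a 3-page background PDF:
for page in 1 2 3 4 5; do
    echo "input page $page <- background page $(background_page_for "$page" 3)"
done
```

With these numbers, input pages 4 and 5 both receive background page 3.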
This operation does not change the metadata stored in the PDF's XMP stream, if it has one.
(For this reason you should include a ModDate entry in your updated info with a current
date/timestamp, format: D:YYYYMMDDHHmmSS, e.g. D:201307241346 -- omitted data
after YYYY revert to default values.)
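A timestamp in this format can be produced with the standard date utility. This is a sketch; the function name moddate is hypothetical, and it uses local time with no time zone suffix.

```shell
# Print the current local time as D:YYYYMMDDHHmmSS.
moddate() {
    date +D:%Y%m%d%H%M%S
}

moddate
```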
For example:
or, interactively:
The permissions section may include one or more of the following features:
Printing
    Top Quality Printing
DegradedPrinting
    Lower Quality Printing
ModifyContents
    Also allows Assembly
Assembly
CopyContents
    Also allows ScreenReaders
ScreenReaders
ModifyAnnotations
    Also allows FillIn
FillIn
AllFeatures
    Allows the user to perform all of the above, and top quality printing.
[owner_pw <owner password | PROMPT>]
[user_pw <user password | PROMPT>]
If an encryption strength is given but no passwords are supplied, then the owner and user
passwords remain empty, which means that the resulting PDF may be opened and its security
parameters altered by anybody.
[compress | uncompress]
These are only useful when you want to edit PDF code in a text editor like vim or emacs. Remove
PDF page stream compression by applying the uncompress filter. Use the compress filter to
restore compression.
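A sketch of the edit round trip. The file names doc.pdf, doc.unc.pdf and doc.edited.pdf are hypothetical, and the guard skips the commands when pdftk or the input file is missing.

```shell
# Hypothetical input file; substitute your own.
src=doc.pdf

if command -v pdftk >/dev/null 2>&1 && [ -f "$src" ]; then
    # Remove page stream compression so the PDF can be read in a text editor.
    pdftk "$src" output doc.unc.pdf uncompress
    # ... edit doc.unc.pdf here ...
    # Restore compression once editing is done.
    pdftk doc.unc.pdf output doc.edited.pdf compress
    status=ran
else
    status=skipped
fi
echo "uncompress/compress example: $status"
```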
[flatten]
Use this option to merge an input PDF's interactive form fields (and their data) with the PDF's
pages. Only one input PDF may be given. Sometimes used with the fill_form operation.
[need_appearances]
Sets a flag that cues Reader/Acrobat to generate new field appearances based on the form field
values. Use this when filling a form with non-ASCII text to ensure the best presentation in Adobe
Reader or Acrobat. It won't work when combined with the flatten option.
[keep_first_id | keep_final_id]
When combining pages from multiple PDFs, use one of these options to copy the document ID
from either the first or final input document into the new output PDF. Otherwise pdftk creates a
new document ID for the output PDF. When no operation is given, pdftk always uses the ID from
the (single) input PDF.
[drop_xfa]
If your input PDF is a form created using Acrobat 7 or Adobe Designer, then it probably has XFA
data. Filling such a form using pdftk yields a PDF with data that fails to display in Acrobat 7 (and
6?). The workaround solution is to remove the form's XFA data, either before you fill the form
using pdftk or at the time you fill the form. Using this option causes pdftk to omit the XFA data
from the output PDF form.
This option is only useful when running pdftk on a single input PDF. When assembling a PDF
from multiple inputs using pdftk, any XFA data in the input is automatically omitted.
[drop_xmp]
Many PDFs store document metadata using both an Info dictionary (old school) and an XMP
stream (new school). Pdftk's update_info operation can update the Info dictionary, but not the
XMP stream. The proper remedy for this is to include a ModDate entry in your updated info with
a current date/timestamp. The date/timestamp format is: D:YYYYMMDDHHmmSS, e.g.
D:201307241346 -- omitted data after YYYY revert to default values. This newer ModDate
should cue PDF viewers that the Info metadata is more current than the XMP data.
Alternatively, you might prefer to remove the XMP stream from the PDF altogether -- that's what
this option does. Note that objects inside the PDF might have their own, separate XMP metadata
streams, and that drop_xmp does not remove those. It only removes the PDF's document-level
XMP stream.
[verbose]
By default, pdftk runs quietly. Append verbose to the end and it will speak up.
[dont_ask | do_ask]
Depending on the compile-time settings (see ASK_ABOUT_WARNINGS), pdftk might prompt
you for further input when it encounters a problem, such as a bad password. Override this default
behavior by adding dont_ask (so pdftk won't ask you what to do) or do_ask (so pdftk will ask
you what to do).
When running in dont_ask mode, pdftk will overwrite files with its output without notice.
EXAMPLES
Collate scanned pages
pdftk A=even.pdf B=odd.pdf shuffle A B output collated.pdf
or if odd.pdf is in reverse order:
pdftk A=even.pdf B=odd.pdf shuffle A Bend-1 output collated.pdf
Decrypt a PDF
pdftk secured.pdf input_pw foopass output unsecured.pdf
Encrypt a PDF using 128-bit strength (the default), withhold all permissions (the default)
pdftk 1.pdf output 1.128.pdf owner_pw foopass
Same as above, except password baz must also be used to open output PDF
pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz