Releases: pdfcpu/pdfcpu
v0.9.1
Folks!
To get rid of the CLI message about validating links, please go get the latest commit.
This somehow sneaked into the release, and I am not ready to push another release yet.
Thank you!
Hello dear pdfcpu user!
This release extends the image command so you can update individual images in a PDF.
It also extends the pdfcpu configuration with parameters for controlling outbound HTTP access and introduces the config version.
Moreover we introduce a config command that lets you reset the config.yml to the current major version whenever pdfcpu issues a corresponding warning or you just feel like it for other reasons.
And we have a nice extension for the booklet command and lots of fixes and parser improvements.
Let's dive right in...
Update Images
pdfcpu images list [-p(ages) selectedPages] -- inFile...
pdfcpu images extract [-p(ages) selectedPages] -- inFile outDir
pdfcpu images update inFile imageFile [outFile] [ objNr | (pageNr Id) ]
Using the new images command you can now update images in your PDF file.
Consider the following use case:
pdfcpu images list gallery.pdf
gallery.pdf:
1 images available (1.8 MB)
Page Obj# │ Id  │ Type  SoftMask ImgMask │ Width │ Height │ ColorSpace Comp bpc Interp │   Size │ Filters
━━━━━━━━━━┿━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━┿━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━┿━━━━━━━━━━━━
   1    3 │ Im0 │ image                  │  1268 │    720 │ DeviceRGB     3   8    *   │ 1.8 MB │ FlateDecode
Extract all images into the current dir:
pdfcpu images extract gallery.pdf .
extracting images from gallery.pdf into ./ ...
optimizing...
writing gallery_1_Im0.png
Let's update the image with Id=Im0 on page=1 with gallery_1_Im0.png:
pdfcpu images update gallery.pdf gallery_1_Im0.png
or update the image object (#3) with logo.png:
pdfcpu images update gallery.pdf logo.png 3
or why not update the image with Id=Im0 on page=1 with logo.jpg:
pdfcpu images update gallery.pdf logo.jpg 1 Im0
Of course, you can also dry-run the command by writing the result to a separate out.pdf:
pdfcpu images update gallery.pdf gallery_1_Im0.png out.pdf
pdfcpu images update gallery.pdf logo.png out.pdf 3
pdfcpu images update gallery.pdf logo.jpg out.pdf 1 Im0
The behavior of pdfcpu images extract is the same as pdfcpu extract -mode image.
See more here and don't forget there is always pdfcpu help images.
Reset Configuration and new config command
Sometimes it is necessary to extend the pdfcpu configuration.
In the past, after upgrading to such a release you had to manually remove your config.yml, and a fresh one would be created on the execution of the next command on the CLI.
This is now history.
Starting with this release pdfcpu will issue a warning if your configuration needs to be upgraded:
**************************** WARNING ****************************
* Your configuration is not based on the current major version. *
* Please backup and then reset your configuration: *
* $ pdfcpu config reset *
*****************************************************************
The warning will only appear if the major version of the installed pdfcpu executable
does not match the major version of the new configuration version we are also introducing with this release:
# version (Do not edit!)
version: v0.9.1 dev
If you do not reset your configuration in this situation, you risk nasty side effects and, in the worst case, a hard landing - ouch.
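The check described above can be pictured roughly as follows. This is a minimal Go sketch, not pdfcpu's actual code: the helpers releasePrefix and needsReset are hypothetical, and the sketch assumes the compared part is the first n dot-separated components of the version string, since these notes do not spell out how pdfcpu delimits the "major version".

```go
package main

import (
	"fmt"
	"strings"
)

// releasePrefix returns the first n dot-separated components of a version
// string such as "v0.9.1 dev". Anything after the first space is ignored.
// Hypothetical helper, for illustration only.
func releasePrefix(version string, n int) string {
	v := strings.TrimPrefix(strings.TrimSpace(version), "v")
	if i := strings.IndexByte(v, ' '); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if n < 0 {
		n = 0
	}
	if n > len(parts) {
		n = len(parts)
	}
	return strings.Join(parts[:n], ".")
}

// needsReset reports whether the config version and the executable version
// differ in their first n components (assumption: that is what triggers
// the warning).
func needsReset(configVersion, exeVersion string, n int) bool {
	return releasePrefix(configVersion, n) != releasePrefix(exeVersion, n)
}

func main() {
	fmt.Println(needsReset("v0.8.0", "v0.9.1 dev", 2)) // differing prefix
	fmt.Println(needsReset("v0.9.0", "v0.9.1 dev", 2)) // same prefix
}
```

Comparing only a prefix means patch releases never trigger the warning, which matches the idea that a reset is only needed when the configuration format itself changes.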
From now on all you have to do is execute the new config reset command:
$ pdfcpu config reset
Did you make a backup of /Users/horstrutter/Library/Application Support/pdfcpu/config.yml ?
(yes/no): yes
Are you ready to reset your config.yml to v0.9.1 dev ?
(yes/no): yes
resetting..
Ready - Don't forget to update config.yml with your modifications.
Using the occasion we extended what you know as pdfcpu config into:
$ pdfcpu help config
usage: pdfcpu config list
pdfcpu config reset
Make sure you also read the docs.
Controlling HTTP Traffic
Right now there are two use cases involving outbound HTTP traffic:
- validation check for broken links
- loading images into image boxes
We are introducing 2 new configuration parameters with this release:
# internet availability.
offline: false
# http timeout in seconds.
timeout: 5
There is also a new offline common flag for operations, which is probably most useful for testing scenarios and consistent benchmarking.
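To illustrate how the two parameters might be honored by code that performs outbound requests, here is a small Go sketch using only the standard library. The netConfig struct, its field names and newHTTPClient are assumptions made for this example and are not pdfcpu's API.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// netConfig mirrors the two configuration parameters shown above.
// The struct and its field names are assumptions made for this sketch.
type netConfig struct {
	Offline bool // corresponds to "offline"
	Timeout int  // corresponds to "timeout", in seconds
}

var errOffline = errors.New("offline mode: outbound HTTP access is disabled")

// newHTTPClient returns a client that honors the timeout,
// or an error if outbound access is switched off.
func newHTTPClient(c netConfig) (*http.Client, error) {
	if c.Offline {
		return nil, errOffline
	}
	return &http.Client{Timeout: time.Duration(c.Timeout) * time.Second}, nil
}

func main() {
	if _, err := newHTTPClient(netConfig{Offline: true, Timeout: 5}); err != nil {
		fmt.Println("skipping link validation:", err)
	}
	client, _ := newHTTPClient(netConfig{Offline: false, Timeout: 5})
	fmt.Println("timeout:", client.Timeout)
}
```

Failing early when offline keeps test runs and benchmarks independent of network conditions, which is the use case mentioned above.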
Extended Booklet command
Thanks to @adamgreenhall for once again making the booklet command even more powerful.
Please check out all the details here.
In addition feel free to consult pdfcpu help booklet.
Thanks
to all of you for reporting bugs and testing fixes.
Special shoutout also to @carlwilson and everybody else for putting time into submitting a PR.
All of this ensures pdfcpu gets more robust and better and better by the minute.
Changelog
- 1d4a5a6 Bump version
- c703429 Fix config file handling
- 853c877 Bump version
- a7c32de Add warning if config.yml is outdated, add config reset cmd.
- 22ebeff Booklet 8up orientations (#969)
- b1b9f99 Fix #868, add config parms offline, timeout
- 23311b7 Fix #859, #965
- dc38554 Fix #897
- 9a32118 Fix #455
- bb789c7 Fix form validation
- ac650d9 Fix #940
- 9749d6d Fix #953
- 8711370 Fix #955
- e2a8e58 Fix #951
- 84cdec0 Fix #948
- 007356f Merge pull request #947 from carlwilson/docker-entrypoint
- 748b3cb Merge remote-tracking branch 'upstream/master' into docker-entrypoint
- 3f7e650 FIX: Docker run command in README
- 75f26b2 Fix #935, Clean up
- 0174d86 Fix Docker execution instructions in README.
- e2441b9 Fix #941
- 5677395 Use Docker ENTRYPOINT rather than CMD
v0.8.1
Yet Another Maintenance Release
This release has been overdue.
Lots of fixed bugs to report as well as major improvements of CJK support.
The API now ships with enhanced support for adding PDF annotations.
The corresponding tests are located in annotation_test.go and the generated artifacts here.
I recommend using Adobe Reader to view these because many other PDF viewers lack the necessary PDF spec compliance.
Thanks
to all of you test driving pdfcpu and reporting bugs along the way.
Special PR thanks go to @toshi1127 and @xelan.
Changelog
- 66fee12 Bump version
- 1bdc717 Add support for Caret annotation.
- df5e3c7 Add support for Line annotation.
- 8e281b1 Add support for PolyLine and Polygon annotations.
- ee932c4 Fix #931
- c6decf5 Fix #932
- b9c28ae Fix #930
- 3aff1b0 Add support for FreeText annotation.
- d7593cb Fix #918
- 1abb9f6 Fix #926
- fdaf5a4 Fix #911
- 68d2f39 Fix #628, #924
- 649f511 Fix #921
- e987369 Fix #912
- 2059677 Fix #910
- c0c7f90 Fix #890, #915
- 6a9df2e Fix #907
- a1d0f95 Fix #908
- d87622b Fix #687, clean up
- 88bee8f Fix #898, clean up
- b54a425 Fix #903
- 699a216 Merge PR #884
- cd40e60 Merge PR #895
- 402000d Fix #850
- c3d8e18 Fix #885
- 12ffda1 Fix #891
- a938dd5 Fix #886
- a467f3c Improve reading corrupted files
- c342327 Merge PR #881
- 3406273 Fix #871
- 551f87e Fix #767
- 1f3886c Fix #853
- d38d51b Fix #867
- 7cae81e Fix #819, clean up
- d1433b9 Fix #862
v0.8.0
Maintenance Release
PDF 2.0 Support
PDF 2.0 encryption is now supported and you are free to use the following commands with your PDF 2.0 input files:
- encrypt
- decrypt
- changeopw
- changeupw
- permissions
Performance
We can report another @fancycode parser improvement resulting in a significant performance boost and lower memory overhead, especially for large files:
Before:
$ time go run test.go
2024/03/21 09:03:55.874443 Parsing ...
2024/03/21 09:04:07.947987 Done, uses 4244 MiBytes heap memory, 6755 MiBytes system memory
2024/03/21 09:04:07.948013 Parsed 1133 pages
real 0m12,743s
user 0m21,830s
sys 0m2,589s
After:
$ time go run test.go
2024/03/21 09:04:30.639673 Parsing ...
2024/03/21 09:04:30.899588 Done, uses 12 MiBytes heap memory, 11 MiBytes system memory
2024/03/21 09:04:30.899609 Parsed 1133 pages
real 0m0,568s
user 0m0,881s
sys 0m0,228s
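To put the two runs side by side, the ratios can be worked out with a few lines of Go. The numbers are copied from the output above; the ratio helper is just a name chosen for this sketch.

```go
package main

import "fmt"

// ratio returns how many times larger before is than after.
func ratio(before, after float64) float64 {
	return before / after
}

func main() {
	// Values copied from the "Before" and "After" runs above.
	fmt.Printf("real time:     %.1fx faster\n", ratio(12.743, 0.568))
	fmt.Printf("heap memory:   %.1fx smaller\n", ratio(4244, 12))
	fmt.Printf("system memory: %.1fx smaller\n", ratio(6755, 11))
}
```

For this particular 1133-page file that works out to roughly a 22-fold reduction in elapsed time and a heap footprint several hundred times smaller.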
Configuration Changes
We have added options to skip some optimization steps or disable internal optimization altogether:
If you disable the following option there will be no internal optimization of the cross reference table once it is loaded into memory.
This will only affect commands that do not rely on optimization themselves, unlike e.g. optimize.
# toggle optimization
optimize: true
Disabling the following option skips the parsing of page content streams that is used to detect unused resources like images or fonts.
# optimize page resources via content stream analysis.
optimizeResourceDicts: true
The following option decides if pdfcpu will scan for and remove duplicate content streams.
# optimize duplicate content streams across pages.
optimizeDuplicateContentStreams: false
Caution is advised and you have to know what you are doing when using these options.
Tuning or turning off optimization can make sense in environments where you deal with large PDF files that usually share the same structure, so there are no surprises.
Since the pdfcpu configuration has changed you are encouraged to recreate your config.yml:
- Locate your config.yml using pdfcpu conf
- Remove/backup your config.yml
- Create a new config.yml from scratch by executing any pdfcpu cmd on the CLI, e.g. execute pdfcpu conf one more time
- Edit your configuration
Thanks
to all of you test driving pdfcpu and reporting bugs along the way.
Special PR thanks also to @adamgreenhall for improving the booklet command and to @xelan as well.
Changelog
- 576f15e Bump version
- 38b2992 Fix #851
- 41333df Cancel parsing in "buffer" if context is cancelled.
- b462c01 Handle case where referenced stream length does not exist.
- ca6d15e Avoid pointer receiver and don't call PDFString of lazy objects internally.
- 91619f0 Write out LazyObjectStreamObject without temporary decoding.
- df5d53d Lazily decode data of StreamObject objects.
- 82f1929 filter: Add API to partially decode data.
- fc09c1f Lazily parse ObjectStream objects.
- 7188e6a Fix #852
- 5bafded Merge PR #855
- 2d06bc7 Add missing author info
- 0988c5e Add PDF 2.0 encryption
- f783bf2 Fix #834
- 87abdcc Fix #849
- d568466 Fix #844
- deb697d Fix #847
- a647579 Fix #843
- 05d2d1f Fix #841
- 6d95797 Fix #839
- 9d76f84 Merge in PR #817
- b9c7a89 Fix #838
- c5db1d9 Fix #826
- 26c5fb2 Fix #826
- aff022f Fix #835, Add config flags for optimization
- 3282d8a Fix #823
- 6158a91 Another fix for #828
- 57030ec Fix #828
- 5ccea97 Fix #135
- fd34b05 Fix #821
v0.7.0
Hello!
We packed lots of goodies into this release for you.
Performance
You will like this.
Thanks to @fancycode we have improved PDF parsing significantly.
While this is not easily comparable, running the pdfcpu test suite now takes about 8 seconds less user CPU time under macOS 14.2.1 (the total elapsed time is nearly unchanged):
Before:
./coverage.sh 67.60s user 13.35s system 119% cpu 1:07.93 total
After:
./coverage.sh 59.64s user 12.55s system 107% cpu 1:07.01 total
PDF 2.0 Support
We now have basic support for writing back PDF 2.0 files.
This means you may start using all pdfcpu operations that update validated PDF 2.0 files.
Basic support means, your mileage may vary, especially when you try to process a file using one of the new 2.0 features.
Since it is hard to get hold of PDF 2.0 files using a specific new 2.0 feature, there is a disclaimer printed on the command line asking for your input and contribution. Please open an issue and share your file in case pdfcpu has a problem digesting your file.
The same applies if you just want to see some specific 2.0 feature supported.
In general, please report back any issues - there is no way to fix something that does not get reported!
New Zoom Command
pdfcpu zoom [-p(ages) selectedPages] -- description inFile [outFile]
Zoom in/out of selected pages either by magnification factor or corresponding margin.
When zooming out, the unused page content space results in horizontal and vertical margins.
These are different from each other but correspond to a certain factor.
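The relation between factor and margins can be sketched with a little arithmetic. The Go snippet below is an illustration, not pdfcpu's implementation: it assumes the scaled content is centered on the page and takes "horizontal margin" to mean the strip left and right of the content. The function names are made up for this example.

```go
package main

import "fmt"

// marginsForFactor returns the horizontal and vertical margins left over
// when page content of size w x h is scaled by factor f (0 < f <= 1)
// and centered on the page. Centering is an assumption of this sketch.
func marginsForFactor(w, h, f float64) (hmargin, vmargin float64) {
	return w * (1 - f) / 2, h * (1 - f) / 2
}

// factorForHMargin returns the factor that yields the given horizontal margin.
func factorForHMargin(w, hmargin float64) float64 {
	return 1 - 2*hmargin/w
}

func main() {
	// A4 in points: 595 x 842.
	hm, vm := marginsForFactor(595, 842, 0.5)
	fmt.Println("hmargin:", hm, "vmargin:", vm)
	fmt.Println("factor recovered from hmargin:", factorForHMargin(595, hm))
}
```

Because width and height differ, one factor produces two different margins, and specifying either margin pins down the factor.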
Examples:
Zoom into magnification of 200%
pdfcpu zoom -- "factor: 2" in.pdf out.pdf
Zoom out to magnification of 50%
pdfcpu zoom -- "factor: .5" in.pdf out.pdf
Zoom out to a magnification equivalent to a horizontal margin of 1 cm
pdfcpu zoom -unit cm -- "hmargin: 1" in.pdf out.pdf
Zoom out to a magnification equivalent to a vertical margin of 30 points.
Draw a border around zoomed out page content and fill unused page space light gray
pdfcpu zoom -- "vmargin: 30, border:true, bgcolor:lightgray" in.pdf out.pdf ...
Please consult pdfcpu help zoom for more and also the official documentation.
Enhanced Booklet command
Thanks to @adamgreenhall we have an even more powerful booklet command for producing zines:
We now have booklet styles 2, 4, 6 and 8 and you may choose one of the following booklet types, each representing a certain method for arranging pages into a booklet:
booklet, bookletadvanced, perfectbound
Examples:
Arrange pages of in.pdf 2 per sheet side (4 per sheet, back and front) onto out.pdf
pdfcpu booklet -- "formsize:Letter" out.pdf 2 in.pdf
Arrange pages of in.pdf 4 per sheet side (8 per sheet, back and front) onto out.pdf:
pdfcpu booklet -- "formsize:Ledger" out.pdf 4 in.pdf
Arrange pages of in.pdf 6 per sheet side (12 per sheet, back and front) onto out.pdf
pdfcpu booklet -- "formsize:Ledger" out.pdf 6 in.pdf
Arrange pages of in.pdf 8 per sheet side (16 per sheet, back and front) onto out.pdf
pdfcpu booklet -- "formsize:A3" out.pdf 8 in.pdf
Arrange pages of in.pdf 4 per sheet side, with short-edge binding onto out.pdf
pdfcpu booklet -- "formsize:A3, binding:short" out.pdf 4 in.pdf
Arrange pages of in.pdf 2 per sheet side as a sequence of folios covering 4*foliosize pages each.
pdfcpu booklet -- "formsize:A4, multifolio:on" hardbackbook.pdf 2 in.pdf
Arrange pages of in.pdf 2 per sheet side, arranged for perfect binding, onto out.pdf
pdfcpu booklet -- "formsize:A4, btype:perfectbound" out.pdf 2 in.pdf
Arrange pages of in.pdf 4 per sheet side, arranged for advanced binding, onto out.pdf
pdfcpu booklet -- "formsize:A3, btype:bookletadvanced" out.pdf 4 in.pdf
Please consult pdfcpu help booklet for more and also the official documentation.
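The "n per sheet side, 2n per sheet" arithmetic from the examples above can be sketched as follows. This is plain counting under the assumption that both sides of every sheet are used. It ignores how the different booklet types order or pad pages, and sheetsNeeded is a hypothetical helper.

```go
package main

import "fmt"

// sheetsNeeded returns the number of physical sheets required when
// perSide pages are placed on each sheet side and both sides are used.
func sheetsNeeded(pages, perSide int) int {
	perSheet := 2 * perSide
	return (pages + perSheet - 1) / perSheet // integer division, rounded up
}

func main() {
	for _, n := range []int{2, 4, 6, 8} {
		fmt.Printf("%d per side: %d per sheet, %d sheets for 30 pages\n",
			n, 2*n, sheetsNeeded(30, n))
	}
}
```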
Configuration Changes
There are two changes to the configuration:
- validationNone was eliminated
- postProcessValidate is new and enables safeguard validation
Validation mode ValidationNone has been eliminated for a couple of reasons.
First of all during validation there are a lot of things happening like internalizing and caching needed for command processing,
secondly PDF validation has become quite performant.
We are introducing the new config flag postProcessValidate.
This flag, which is turned on by default, enables the validation of your processed cross reference table right before writing.
This is considered a useful safeguard: a problematic cross reference table can be written back without any error, in which case only the next read/parse/validation attempt will notice the problem.
If you disable this you will get an additional performance boost overall but with the caveat described above.
As usual please renew your configuration!
Form filling now expects the user font Roboto-Regular when using Eastern European scripts.
You can do this manually or just remove your pdfcpu configuration altogether and recreate it like so:
- Locate the pdfcpu folder using pdfcpu conf
- Remove/backup the pdfcpu folder
- Recreate a brand new pdfcpu folder by executing any pdfcpu cmd on the CLI, e.g. execute pdfcpu conf one more time
- Edit your configuration
Samples And Tests
All of this complements the official documentation.
To get a better understanding of pdfcpu's operations please make sure you check out all tests and the corresponding PDF output and all JSON input where appropriate:
pdfcpu/pkg/samples/* comes loaded with 230 MB worth of PDFs produced by corresponding tests and JSON input located at:
- pdfcpu/pkg/api/test
- pdfcpu/pkg/testdata/json
Thanks
to all bug reporters and feature requestors.
Special thanks for contributed PRs go to @adamgreenhall, @fancycode, @kalimit, @sivukhin and @afh.
Little Commercial Break
pdfcpu is in need of more frequent financial supporters!
Please consider becoming a sponsor, especially if you are a (small) business.
If you are a developer within a business please go to your superior or team lead and have them compare the benefits/costs vs. commercial solutions. If you prefer to operate in stealth mode that's fine - you can always become a private sponsor.
What's important is to keep the project funded and on a clear, steady path.
Meet The Maintainer
I will be in the San Francisco Bay Area this fall.
If you are a recurring sponsor, or a business using pdfcpu even without sponsoring, I would like to get to know you and your pdfcpu use case. I'll also be happy to meet one-on-one for a technical chat/discussion and to get feedback right from the trenches.
Just get in touch with me: hhrutter@gmail.com
Next Steps
Support for PDF 2.0 encryption will be tackled next, after that digital signatures.
A Beta version is within reach.
Have fun with pdfcpu!
Changelog
- dfaa588 Bump version, fix #818
- c0a39e9 Add zoom cmd, fix #756
- d581dc1 Fix #809
- 5b7d844 Add config flag postProcessValidate
- 8735421 Fix #815
- 88f1b3d Fix #814
- 268e6bb Merge PR #811
- da12eed Fix #813
- dedaddc Merge #795, cleanup
- 95c2d64 Avoid copying from "bytes.Buffer" to get underlying bytes.
- 044a6c0 Use type switch instead of long list of type tests.
- 3d4cbdb Further improve parsing of dictionaries / names.
- fc87a22 Fix #794
- b4af9ea Eliminate model.ValidationNone
- d5fd063 Fix #807
- 8f3e992 Fix #628
- cfd7627 Finalize extended booklet cmd as contributed by Adam Greenhall
- a893411 Fix booklet cmd parsing, clean up
- d3e607d Fix #807
- 4527ff4 Fix #806
- 18b8e77 Fix #805
- a8b4a4a Fix #798
- 694f81f Fix #794 , add PDF 2.0 disclaimer
- 1d5da77 cli documentation
- 032b32d Fix #773
- 055e03f Fix #765
- 261c563 Fix #758, #770
- 9295163 Fix #779, #780
- 865e6b7 Fix #724
- 222cf6c Fix #796
- 6ae90db Fix #772
- 96659b7 Fix #789
- dae09eb Fix #786
- ac6f14a Fix #766
- 60e13f3 Fix #760
- 6e235c0 Fix #759
- 793c509 Fix #771
- 6935271 Fix bug with types when splitting pdf
- ef9bfc9 add type cast check
- ef759de fix validation in ParseXRefStreamDict for even sized arrays
- 6e5acd7 fix bug in clone for FilterPipeline DecodeParams in StreamDict object
- 12e046d Fix building from distribution archive
- 043541b Fix #775, #490
- 04634d3 Add testcase that parses a large dictionary.
- bec27a4 Avoid calling "DecodeName" when parsing dictionaries.
- d5443fe Add tests for new reading functions that take a Context.
- f2e4421 Add new reading / parsing functions that take a Context object.
- b89d7b1 Fix #766
- e33b502 Fix #755
v0.6.0
Hello!
This release comes ready for you to play around with during the holidays.
It is packed with new features, and it is the first one dealing with PDF 2.0 (ISO 32000-2) support.
Let's get right into it..
PDF 2.0 Support
We start with basic support for validation and you can play around with the validate and info commands.
The work around this is ongoing and will stretch over the next couple of releases.
Please report back any issues.
CLI
There are three new commands:
Manage the page layout which shall be used when the document is opened:
pdfcpu pagelayout list inFile
pdfcpu pagelayout set inFile value
pdfcpu pagelayout reset inFile
See pdfcpu help pagelayout and the pagelayout documentation.
Manage how the document shall be displayed when opened:
pdfcpu pagemode list inFile
pdfcpu pagemode set inFile value
pdfcpu pagemode reset inFile
See pdfcpu help pagemode and the pagemode documentation.
Manage the way the document shall be displayed on the screen and shall be printed:
pdfcpu viewerpref list [-a(ll)] [-j(son)] inFile
pdfcpu viewerpref set inFile (inFileJSON | JSONstring)
pdfcpu viewerpref reset inFile
See pdfcpu help viewerpref and the viewerpref documentation.
The split command now also allows for splitting along page boundaries:
pdfcpu split [-m(ode) span|bookmark|page] inFile outDir [span|pageNr...]
See pdfcpu help split and the split documentation.
The merge command allows for divider pages at file boundaries and zipping two files together:
pdfcpu merge [-m(ode) create|append|zip] [-s(ort) -b(ookmarks) -d(ivider)] outFile inFile...
See pdfcpu help merge and the merge documentation.
The permissions command is now more useful:
pdfcpu permissions list [-upw userpw] [-opw ownerpw] inFile...
pdfcpu permissions set [-perm none|print|all|max4Hex|max12Bits] [-upw userpw] -opw ownerpw inFile
It now also allows for conveniently setting individual PDF access bits either via a binary or hexadecimal number.
See pdfcpu help permissions and the permissions documentation.
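To make the "binary or hexadecimal number" idea concrete, the following Go sketch parses the same bit pattern from both notations. It is an illustration only: parsePermBits and its base argument are hypothetical, the sample value is arbitrary, and the exact syntax pdfcpu expects for -perm is the one documented by pdfcpu help permissions.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePermBits parses a permission value written either as a binary number
// (base 2, at most 12 digits) or as a hexadecimal number (base 16, at most
// 16 bits). Hypothetical helper; not pdfcpu's flag parser.
func parsePermBits(s string, base int) (uint16, error) {
	if base != 2 && base != 16 {
		return 0, fmt.Errorf("unsupported base %d", base)
	}
	if base == 2 && len(s) > 12 {
		return 0, fmt.Errorf("binary value %q has more than 12 bits", s)
	}
	v, err := strconv.ParseUint(s, base, 16)
	if err != nil {
		return 0, fmt.Errorf("invalid permission value %q: %w", s, err)
	}
	return uint16(v), nil
}

func main() {
	fromHex, _ := parsePermBits("F0C", 16)
	fromBin, _ := parsePermBits("111100001100", 2)
	fmt.Printf("%#x %012b equal=%v\n", fromHex, fromBin, fromHex == fromBin)
}
```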
API
Thanks to @vsenko the stamp command in combination with PDF stamps has become more powerful.
- You can set the PDF attribute in your Watermark struct with a cached reader which should save you some memory.
- Multi stamping is the process where the pages of some input PDF file will be stamped one by one with the next page from a stamp PDF file, e.g.:
pdfcpu stamp add -mode pdf -- "stamp.pdf" "" in.pdf out.pdf
There is now a way to fine tune multi stamping, e.g.:
pdfcpu stamp add -mode pdf -- "stamp.pdf:2:3" "" in.pdf out.pdf
will initiate multi stamping at page 2 of stamp.pdf and page 3 of in.pdf.
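The page pairing implied by "stamp.pdf:2:3" can be pictured with a short Go sketch. It only models the pairing of page numbers. Stopping as soon as either file runs out of pages is an assumption of this sketch, and stampPairs is a hypothetical name.

```go
package main

import "fmt"

// stampPairs pairs input pages with stamp pages, starting at the given
// 1-based page numbers and advancing both by one until either file runs out.
// Each pair is {input page, stamp page}.
func stampPairs(stampStart, inStart, stampPages, inPages int) [][2]int {
	var pairs [][2]int
	for s, p := stampStart, inStart; s <= stampPages && p <= inPages; s, p = s+1, p+1 {
		pairs = append(pairs, [2]int{p, s})
	}
	return pairs
}

func main() {
	// "stamp.pdf:2:3" with a 4-page stamp file and a 5-page input file.
	for _, pr := range stampPairs(2, 3, 4, 5) {
		fmt.Printf("in.pdf page %d <- stamp.pdf page %d\n", pr[0], pr[1])
	}
}
```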
Configuration
There are three changes to the configuration:
- headerBufSize was eliminated since PDF 2.0 comes with a flexible header location specification.
- permissions are now a 4 digit hex number instead of a negative integer.
- needAppearances is a flag you can set for form filling.
Since the pdfcpu configuration has changed you are encouraged to recreate your config.yml:
- Locate your config.yml using pdfcpu conf
- Remove/backup your config.yml
- Create a new config.yml from scratch by executing any pdfcpu cmd on the CLI, e.g. execute pdfcpu conf one more time
- Edit your configuration
If and only if you are having fun using pdfcpu...
pdfcpu is in need of financial supporters. There are membership fees, meetings and countless hours I am putting into this project.
Please consider supporting me in any way you can by becoming a sponsor.
Go to your superior or team lead and have them compare the benefits/costs vs. commercial solutions.
Big Thanks
to all bug reporters and PR contributors.
Have fun with pdfcpu!
Changelog
- e3358c4 PDF 2.0 safe guard, fix #740, bump version
- dce5085 Fix #665, #473
- cad9003 Fix #723
- 297212b Fix #732, #733, #734, #472, add vsenko to contr.
- d7a0231 Fix #747
- f7d021c Fix #739
- 9964328 Fix #727
- cf3b64a Fix #738
- d6f60e1 Fix #737
- b34248d Fix #736
- 125cee2 Fix #716, add option for merging with divider
- 4f99328 Fix #722, 713
- 1807371 Add Split by page number command
- 821def5 Add viewerpref command
- 9cdf2fd add pagemode, pagelayout commands
- 4a2298c Fix #717
- b4d3bd5 Fix #711
- 685340d validate PageMode, PageLayout
- 4fbc44e Fix #710, #635
- 04840dd Fix #708
- c5e4405 Fix #701
- 2a899ed Fix #705
- 3987bb6 Fix #706
- 5349a52 Fix layout ParsePageFormat
- b4556cc Cleanup table column paddings
- 27f0464 ColPaddings
- d298613 Fix #689
- 7a908c8 Fix #677
- d1a66fd Add State/Statemodel validation for TextAnnotation
- c135369 Fix #679, #680
- b42e6eb Fix #676
- 91b6234 Fix #675
v0.5.0
The Bookmark Release
Hello!
This release features the new bookmark command and there are substantial changes to the API including better support for scenarios with parallel execution.
CLI
Finally you are able to get rid of unwanted bookmarks, replace existing bookmarks or even create new ones.
There are four commands to list your bookmarks, import/export bookmarks via JSON or to remove all bookmarks:
pdfcpu bookmarks list inFile
pdfcpu bookmarks import [-r(eplace)] inFile inFileJSON [outFile]
pdfcpu bookmarks export inFile [outFileJSON]
pdfcpu bookmarks remove inFile [outFile]
Please check out the documentation.
API
pdfcpu is ready for Go 1.21!
Many API calls now return structs for corresponding objects and thanks to @semvis123 and @yyoshiki41 we were able to remove two significant points of contention. These changes should result in a much better experience running pdfcpu within goroutines. Your feedback is highly appreciated.
Parser
As always there is steady improvement to the PDF parser and thanks go to every single user reporting issues.
Remember, just because you are stuck parsing a specific file does not mean we can't do anything about it - but you are encouraged to take the time and file an issue.
pdfcpu is now a proud member of the PDF association
This is a long term commitment for using the optimal resources, going in the right direction in order to make pdfcpu a sound tool for both developers and CLI users for the time to come.
If and only if you are having fun using pdfcpu...
There are membership fees, meetings and countless hours I am putting into this project.
Please consider supporting me in any way you can by becoming a sponsor.
Go to your superior or team lead and have them compare the benefits/costs vs. commercial solutions.
Thanks!
As always, thanks to all bug reporters and PR contributors.
Have fun processing your PDFs with pdfcpu!
Changelog
- 1d309dc Add bookmarks cmd, Fix #506, #621, #671, bump version
- be32323 Fix #664, #666, #667, #669
- a9afcfe Fix #663
- 3161cdf Fix #659
- f0e5a70 Fix #664
- 70de7fc Fix concurrent write to map userfontmetrics
- bda08dd Fix #660
- c676831 Cleanup #604
- 7c02e0d fix(api): Protects DisableConfigDir from concurrent access using mutex
- fff3853 Fix #657
- 337d257 Export form errors
v0.4.2
Maintenance Release
This release is a cut-off after fixing a couple of issues.
A notable feature of this release is bookmark support for merging PDFs.
Existing bookmarks will be preserved during merging and from now on the output file also has a bookmark hierarchy representing the input files.
Bookmarks are created by default during merging.
You can skip bookmark creation on the CLI by supplying -bookmarks=false
If you want to skip bookmark creation when using the API you need to turn off a new configuration parameter:
# merge creates bookmarks
createBookmarks: true
Since the pdfcpu configuration has changed you are encouraged to recreate your config.yml:
- Locate your config.yml using pdfcpu conf
- Remove/backup your config.yml
- Create a new config.yml from scratch by executing any pdfcpu cmd on the CLI, e.g. execute pdfcpu conf one more time
- Edit your configuration
Please
pdfcpu is in need of more supporters.
If you use it please consider the hard work put into this and consider sponsorship.
Go to your superior or team lead and have them compare the benefits/costs vs. commercial solutions.
Thanks!
As always, thanks to all bug reporters and PR contributors.
Have fun processing your PDFs with pdfcpu!
Changelog
- 4aef7f7 Bump version
- bc1661f Cleanup
- 7da9606 Fix #649
- 9b69fb1 Fix #654
- d77fba2 Fix #608, #632, #650
- fd12f8b Fix #637
- 353d3c0 Fix #626
- 53c7728 Fix #617, #635
- b36ed33 Merge in PR #631
- b1c460e Fix #636
- 05d0e93 Fix #624
- b17ba03 Fix #644
- df2a612 Fix #623
- bec6f91 Fix #606
- 6bf4168 Fix #622
- 5248cad Fix #630
- 8315a3e Fix #627
- 9dfe68b Fix #618
- a7fd28e Skip PieceInfo validation in relaxed mode
v0.4.1
A Release About Cutting PDF Pages
There are three new commands that will cut your PDF page in one way or another:
- Cut
pdfcpu cut [-p(ages) selectedPages] -- description inFile outDir [outFileName]
A low level command for fine grained custom page cutting.
Apply any number of horizontal or vertical page cuts:
See the documentation and examples.
- N-down
pdfcpu ndown [-p(ages) selectedPages] -- [description] n inFile outDir [outFileName]
Cut selected page into n pages symmetrically.
Think the inverse operation of n-up:
See the documentation and examples.
- Poster
pdfcpu poster [-p(ages) selectedPages] -- description inFile outDir [outFileName]
Create a poster with full control over scaling and tile size and more:
See the documentation and examples.
API-Change
A notable API change is an additional parameter for AddBookmarks.
The replace flag enforces deleting any old bookmarks that may be present in the input file:
AddBookmarksFile(inFile, outFile string, bms []pdf.Bookmark, replace bool, conf *model.Configuration)
As always, thanks to all bug reporters and PR contributors.
pdfcpu is getting better & better every day.
Have fun fiddling around with your PDFs!
.. Ohh, and before jumping right in please do me a favor and click here
Changelog
- 0342a83 Bump version
- b80cf0f Fix #383
- c411201 Upgrade dependencies
- 30f3667 Fix #474
- bdf5fd1 Add replace flag to AddBookmarks
- adcdddd Fix #557
- 73c50d3 Fix #448
- e560c3e Fix #586
- add1372 Fix #483
- bb20b1e Fix #598
- 9ffd937 Cleanup
- 7059d42 Cleanup
- 256b5ef Fix #591
- 7fbccb8 Fix #467, #557, #573
- f965b0a wip
- 91d97f3 Fix #583
- 6c0092e Fix #589
- 71a87ec Fix #571
- 8295bb4 compute watermark position in floats, not ints (#610)
- 5348f88 Fix #593
- a410fdc Fix #584
- 92f7b2b Fix #575, #579
- 768cf96 Fix #571, #572
- 25f7cef Bump go.mod
- 1fab2dc Fix #566
- 6e87941 add missing refactoring due to rename (#565)
v0.4.0
Hello, the main focus of this release is PDF form management!
This release has been a long time coming. Thank you for your patience!
API-Users, please proceed with caution!
The codebase has been refactored heavily and there may be some side effects.
E.g. usages of: api.Merge(rsc []io.ReadSeeker,...) need to be migrated to api.MergeRaw(rsc []io.ReadSeeker, ...)
pdfcpu v0.4.0 is now based on go1.20 and comes with two new commands:
- The new powerful form command solves all major form handling use cases:
pdfcpu form list inFile...
pdfcpu form remove inFile [outFile] fieldID...
pdfcpu form lock inFile [outFile] [fieldID...]
pdfcpu form unlock inFile [outFile] [fieldID...]
pdfcpu form reset inFile [outFile] [fieldID...]
pdfcpu form export inFile [outFileJSON]
pdfcpu form fill inFile inFileJSON [outFile]
pdfcpu form multifill [-m(ode) single|merge] inFile inFileData outDir [outName]
DISCLAIMER 1
You are free to export and fill already existing PDF forms - this may or may not work!
Feel free to open an issue and we may be able to make it work.
DISCLAIMER 2
All forms generated with pdfcpu create are optimized for Adobe Reader.
Mac Preview is not well suited for form handling!
The following workflows are supported:
The Regular Workflow
- Create your form with the already introduced create command.
- Print a list of your form fields on the command line.
- export your form to JSON.
- Edit your JSON and enter your form data.
- fill your form with this JSON payload.
- Optionally write protect individual fields using lock.
The Fill & Merge Workflow
"Give me access to your contacts db and I will generate a single PDF containing a page sequence of contact sheets"
This use case is implemented by extending the regular workflow with an additional integrated merge step.
As for automatically filling a form with your data pdfcpu gives you two options:
- JSON - better suited for API providers, verbose but powerful JSON elements, allows full fledged imageBoxes for image fields
- CSV - for on-premises bulk fill scenarios, one CSV line representing a form instance, has its limitations eg. need to fake image fields
- The new resize command comes to the rescue when you're stuck with some large pages in front of a regular small form printer:
pdfcpu resize [-p(ages) selectedPages] -- [description] inFile [outFile]
Scale your pages up or down or resize to one of the many supported standard form sizes (pdfcpu paper prints a list), optionally enforcing portrait or landscape mode:
Examples:
pdfcpu resize "scale:2" in.pdf out.pdf
Enlarge pages by doubling the page dimensions, keep orientation.
pdfcpu resize -pages 1-3 -- "sc:.5" in.pdf out.pdf
Shrink first 3 pages by cutting in half the page dimensions, keep orientation.
pdfcpu resize -u cm -- "dim:40 0" in.pdf out.pdf
Resize pages to width of 40 cm, keep orientation.
pdfcpu resize "form:A4" in.pdf out.pdf
Resize pages to A4, keep orientation.
pdfcpu resize "f:A4P, bgcol:#d0d0d0" in.pdf out.pdf
Resize pages to A4 and enforce orientation(here: portrait mode), apply background color.
pdfcpu resize "dim:400 200" in.pdf out.pdf
Resize pages to 400 x 200 points, keep orientation.
pdfcpu resize "dim:400 200, enforce:true" in.pdf out.pdf
Resize pages to 400 x 200 points, enforce orientation.
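The arithmetic behind the scale and dim examples can be sketched in Go as follows. This is an illustration, not pdfcpu's implementation: the helper names are hypothetical, and the reading of a 0 height in "dim:40 0" as "derive the height from the aspect ratio" is an assumption.

```go
package main

import "fmt"

// pointsPerCm converts centimeters to PDF points (1 inch = 72 points = 2.54 cm).
const pointsPerCm = 72.0 / 2.54

// scaleDim multiplies both page dimensions by factor, preserving the aspect ratio.
func scaleDim(w, h, factor float64) (float64, float64) {
	return w * factor, h * factor
}

// fitWidth resizes to a target width and derives the height from the
// original aspect ratio (assumed meaning of a 0 height in "dim:40 0").
func fitWidth(w, h, targetW float64) (float64, float64) {
	return targetW, h * targetW / w
}

func main() {
	w, h := 595.0, 842.0 // roughly A4 portrait, in points

	sw, sh := scaleDim(w, h, 2)
	fmt.Printf("scale:2          -> %.0f x %.0f pt\n", sw, sh)

	hw, hh := scaleDim(w, h, 0.5)
	fmt.Printf("sc:.5            -> %.1f x %.1f pt\n", hw, hh)

	fw, fh := fitWidth(w, h, 40*pointsPerCm)
	fmt.Printf("dim:40 0 (in cm) -> %.1f x %.1f pt\n", fw, fh)
}
```

Working in points internally and converting units only at the edges keeps the calculation independent of the display unit chosen with -u.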
Countless bugs have been rolling in and many of those have been fixed. Thank you all, also for your PRs!
Unfortunately, due to heavy refactoring of the code base, some of them had to be merged in manually or still will be.
Changelog
- 7ff654b Bump version
- 580f9d7 Update form samples
- 4ac95e3 Fix #461
- 98c062b Fix #537
- 4e12e03 Fix #363,#524
- 73a7d98 Fix #535
- b45bfb7 Fix #539
- 74efd88 Fix #515
- d26377a Fix #481 Add Rene Kaufmann to contributors
- de1d264 Fix #430 Add Dmitry Ivanov to contributors
- 7af9343 Eliminate io.ReadAll
- a86469b Fix #447
- 7a47cb5 Bump go.mod
- 363a1f2 Add form handling
- 07d9762 Fix #523
- 97106ac Fix #511
- 8f6a813 Fix #466
- a45919b Fix #522
- aa87096 Update Dockerfile (#518)
- 9aa382a fix ListImages comment (#516)
- a021566 Fix #479
- ba2518c Fix #459
- bfaf786 Fix #489
- 4382301 Fix #458
- d8d83be Fix #469
- b9818b8 Fix #497
- d16027e Fix #493
- 5ef3bae Fix #494
- 3decd49 Fix #488
- 5d37c49 Fix #487
- fe532c2 Fix #485
- 0ce7ef0 Fix #471, #475, #478
- 8b9e92e Fix #480
- 11d755a Fix #457
- 432b649 Fix #453
- c54a411 Cleanup
- c202488 Fix free list validation
- e456479 Fix #357, #451
- 74d211b Fix #389
- 8f9a93c Add Fedora instructions (#439)
- a056f85 Fix #446
- 15f4842 Fix #380
- 437ac57 Fix #440
- 437b0a0 Fix #438
- e3d3f26 Fix #429
- adf6c1b Fix #434
- 515c2ef Fix #437
- 17e5f68 Fix #442, #443
- 8c35cdc Cleanup
- 92c29f7 Cleanup
- 1b1f5e4 go lint
- a002745 Fix #385
- 96a0a30 Fix #391
- b8b77ee Fix #411
- 31d2490 Fix #414
- f23a3da Fix #418
- 68200b9 Fix #413
- ad65664 Add issue template
- c78f959 Cleanup
- 013a11b Fix #402
- 8767334 Fix #400
- 6b5d856 Fix of a bug in Fix #407
- df8cec3 Fix #407
- f5de06f Fix #404
- 5a3802e Fix #379
- 9d3225c Add Eng Zer Jun as contributor
- 8e53ad6 refactor: move from io/ioutil to io and os packages (#403)
- c37ef1a Integrate #397, Add Juan Iscar as contributor
- e9f927d Fix #396, Add config cmd
- a8a031e Fix #398
v0.3.13
Hello, this release is all about PDF generation!
A new command interprets a JSON structure representing page declarations and renders PDF pages accordingly:
pdfcpu create in.json [in.pdf] out.pdf
in.json
contains a page sequence with content composed of text, images, colored boxes, tables and more.
Each content element follows a box model consisting of margin, border and padding and may define fonts and colors where appropriate. You may also set general page attributes like paper size, background color or your crop box and you may also define your page headers and footers.
in.pdf
if present, existing page content of in.pdf will be modified by appending to it.
out.pdf
is where rendered pages are written to.
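The box model mentioned for in.json can be pictured with a small Go sketch. It assumes CSS-like nesting (margin outermost, then border, then padding) with uniform widths on all four sides; the type and field names are hypothetical and pdfcpu's actual layout rules may differ.

```go
package main

import "fmt"

// Box is a hypothetical content element: an outer size plus uniform
// margin, border and padding widths, all in points.
type Box struct {
	Width, Height           float64
	Margin, Border, Padding float64
}

// ContentSize returns the area left for content after subtracting
// margin, border and padding on all four sides, clamped at zero.
func (b Box) ContentSize() (float64, float64) {
	inset := 2 * (b.Margin + b.Border + b.Padding)
	w, h := b.Width-inset, b.Height-inset
	if w < 0 {
		w = 0
	}
	if h < 0 {
		h = 0
	}
	return w, h
}

func main() {
	b := Box{Width: 200, Height: 100, Margin: 10, Border: 2, Padding: 5}
	w, h := b.ContentSize()
	fmt.Printf("outer %.0f x %.0f pt, content %.0f x %.0f pt\n", b.Width, b.Height, w, h)
}
```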
How it works
The way this command is set up allows for repeatedly adding content to a PDF.
This fosters an incremental approach to PDF generation, whether during the design phase or in production.
Learning
The best documentation for this command is the combination of the content of:
- pkg/testdata/json/ and
- pkg/samples/create/ - the home of the corresponding result PDFs.
If you are a Go developer you can play around with createFromJSON_test.go by modifying the JSON file and then executing the test which will give you immediate visual feedback by regenerating the corresponding result PDF.
The JSON is self-explanatory and I highly recommend working through all the examples!
Many of the examples are multi page PDFs so make sure you don't miss anything!
You will learn about
- setting up your layout coordinate system
- absolute vs. anchored positioning
- inheritance of fonts, margin, border, padding
- setting up your custom colors
- setting up your resource dirs for imageBoxes
- setting up pools for reused boxes, images and text
- using guides throughout the layout process
- how to highlight your crop box and content boxes
Note
Although this command allows for the modification of any PDF it works best for PDFs generated by pdfcpu itself.
PDF Forms
Eventually, and this release is really the preparation for it, you will be able to create your PDF forms with this command.
To catch a glimpse of this effort have a look at:
- pkg/testdata/json/textfield.json
- pkg/testdata/json/textarea.json
- pkg/testdata/json/checkbox.json
- pkg/testdata/json/radiobuttonsHor.json
and again the corresponding PDF files in pkg/samples/create/
You are welcome to experiment with form generation based on these form element samples but PDF form creation has NOT been released!
Thanks everybody for submitting issues and PRs!
Thank you for using pdfcpu!
Changelog
e89570a Fix OS agnostic fileName resolving
dc0561f Bump version
72e4e71 Relax validation for FontDescriptor Lang
9616d4b Add create cmd
3c08a45 Add Stefan Huber as contributor
92bfd3e Extend relaxed validation for CIDToGIDMap
c3eb4c0 Fix #335, #358
ba9f089 Fix #349
281745f Fix #353
b80c13b Fix #354
64e3df6 Fix #356
5e1ae87 Fix #362
befba81 Fix #366
5a7da1d Fix #371
437ef37 Fix #380, #387
4adf70c Fix #381
0c4c829 Fix #386
af9e334 Fix #388
6509cea Fix #394
8837dd1 Fix io.Reader based encryption #372
ce70c15 Merge branch 'pr/signalwerk/360' into master
16f43aa stamping: Fix OCG reuse
7e1546b validate cmd: wildcard support