optimize content streams #385

Closed
maxiride opened this issue Oct 4, 2021 · 2 comments
maxiride commented Oct 4, 2021

I've been using pdfcpu optimize for a while now with very good results. A peer recently brought another tool, cpdfsqueeze, to my attention (source and binaries linked below). I'm not opening this issue to promote cpdfsqueeze; I'm opening it because its source is available, which allows for learning from it and potentially improving pdfcpu.

cpdfsqueeze managed to compress a 280MB PDF down to 19MB, while pdfcpu compressed it to 153MB and 156MB (with the xrefstream option set to true and false, respectively).

In a purely side-by-side comparison, the resulting PDFs look identical (they are all meant for printing, so forms and other content are not important). However, with my very limited knowledge of PDF internals, I can't really tell what has been done.

If the contributors find the source code useful and "clean" in its PDF manipulation, I am willing to put up a bounty to improve the optimization process.


cpdfsqueeze
Source: https://github.com/johnwhitington/cpdfsqueeze
Binaries: https://github.com/coherentgraphics/cpdfsqueeze-binaries

Because it is difficult to produce large enough sample PDFs, I've sent demo production files via email.

Collaborator
hhrutter commented Oct 4, 2021

Checking for duplicate content streams is something I noticed cpdfsqueeze does and pdfcpu does not do right now.
I am further investigating this.
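
As a rough illustration of what checking for duplicate content streams could look like, here is a minimal sketch in Go. It is not pdfcpu's or cpdfsqueeze's actual implementation: the `Stream` type and the `DedupStreams` function are hypothetical names made up for this example, and it assumes that two streams count as duplicates when their raw bytes hash to the same SHA-256 digest.

```go
package main

import "crypto/sha256"

// Stream is a hypothetical, simplified stand-in for a PDF stream object:
// an object number plus its raw (still encoded) bytes.
type Stream struct {
	ObjNr int
	Data  []byte
}

// DedupStreams maps each redundant object number to the object number of the
// first stream seen with identical bytes. Streams that are kept do not appear
// in the returned map.
func DedupStreams(streams []Stream) map[int]int {
	seen := make(map[[sha256.Size]byte]int) // content hash -> first object number
	redirect := make(map[int]int)           // redundant object -> kept object
	for _, s := range streams {
		h := sha256.Sum256(s.Data)
		if first, ok := seen[h]; ok {
			redirect[s.ObjNr] = first
			continue
		}
		seen[h] = s.ObjNr
	}
	return redirect
}
```

With such a map, an optimizer would presumably rewrite every reference to a redundant object so that it points at the kept one, then drop the redundant objects when writing the file. A real implementation would likely also need to compare the stream dictionaries (filters, lengths, resources), which this sketch ignores.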

@hhrutter hhrutter self-assigned this Oct 5, 2021
@hhrutter hhrutter changed the title from "[Feedback] optimize performances" to "optimize content streams" Nov 30, 2021
Collaborator
hhrutter commented

Optimization now takes care of redundant content streams and forms.

adamgreenhall added a commit to adamgreenhall/pdfcpu that referenced this issue Feb 3, 2022
This reverts commit a002745.
adamgreenhall added a commit to adamgreenhall/pdfcpu that referenced this issue Apr 28, 2022
adamgreenhall added a commit to adamgreenhall/pdfcpu that referenced this issue Apr 28, 2022
* Fix pdfcpu#442, pdfcpu#443

* Fix pdfcpu#437

* Fix pdfcpu#434

* Fix pdfcpu#429

* Fix pdfcpu#438

* Fix pdfcpu#440

* Fix pdfcpu#380

* Fix pdfcpu#446

* Add Fedora instructions (pdfcpu#439)

* Fix pdfcpu#389

* Fix pdfcpu#357, pdfcpu#451

* Fix free list validation

* Cleanup

* Fix pdfcpu#453

* Fix pdfcpu#457

* Revert "Revert "Fix pdfcpu#385""

This reverts commit bbe8e25.

Co-authored-by: Horst Rutter <hhrutter@gmail.com>
Co-authored-by: Fabio Alessandro Locati <77888+Fale@users.noreply.github.com>