Impossible to retrieve the attachments from a PDF in version 1.4 #953

Closed
AmineZouitine opened this issue Sep 24, 2024 · 7 comments

@AmineZouitine
AmineZouitine commented Sep 24, 2024

pdfcpu does not allow extracting attachments from a PDF in version 1.4.

CLI Error:

> pdfcpu attachments list doc.pdf
unknown_error: facturx: validating xref table: pdfcpu: validateNameEntry: dict=rootDict entry=PageMode invalid dict entry: UseAttachment

Lib issue:

unknown_error: facturx: validating xref table: pdfcpu: validateNameEntry: dict=rootDict entry=PageMode invalid dict entry: UseAttachments

I upgraded the PDF version to 1.7 with Ghostscript, and then it worked. So the issue was indeed due to the PDF version.

Library version: v0.8.1
CLI version: v0.8.1 dev

OS: MacOS

It would be great not to have this constraint anymore 🙏

@GitHubRulesOK
GitHubRulesOK commented Sep 24, 2024

The version has nothing to do with attachments (it only relates to PDF/A numbering).
What likely happened is that Ghostscript always writes a new file with few or no errors, so it rewrote your file into a workable form.

I tested with a file that declares 1.4 compatibility and got no error; that file was written by Ghostscript in the past. So you need to look closer at why your source PDF fails; the version is rarely the cause. Please supply the PDF file for analysis, or trace why it fails at a given point.

pdfcpu_0.8.1_Windows_i386>pdfcpu attachments list facture-x-1.4.pdf
factur-x.xml (ZUGFeRD electronic invoice)
%PDF-1.4
%<binary marker bytes>
%%Invocation: path/gswin64c -dDisplayFormat=198788 -dDisplayResolution=96 -q -dBATCH -dNOPAUSE -dSAFER -dDELAYSAFER -dESTACKPRINT -sDEVICE=pdfwrite -dPDFA=3 -sColorConversionStrategy=RGB -dPDFACompatibilityPolicy=2 -sZUGFeRDXMLFile=? -sZUGFeRDProfile=?
%%+ -sZUGFeRDVersion=? -sZUGFeRDConformanceLevel=? ? ? -f ?
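
To look closer at a failing file, as suggested above, it can help to compare the version in the header with the catalog's PageMode value. The following is a minimal sketch, not part of pdfcpu; the function names are hypothetical. It assumes the catalog is stored uncompressed, so that /PageMode is visible in the raw bytes, and it ignores a /Version entry in the catalog, which may override the header version.

```python
import re


def pdf_header_version(data: bytes):
    """Return the version from the %PDF-x.y header as a string, or None."""
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def page_mode(data: bytes):
    """Return the first /PageMode name found in the raw bytes, or None.

    Assumption: the catalog dictionary is not inside a compressed
    object stream; otherwise this plain byte scan will not see it.
    """
    match = re.search(rb"/PageMode\s*/([A-Za-z]+)", data)
    return match.group(1).decode("ascii") if match else None


def inspect_pdf(path):
    """Read a file and return (header version, page mode)."""
    with open(path, "rb") as handle:
        data = handle.read()
    return pdf_header_version(data), page_mode(data)
```

Calling `inspect_pdf("doc.pdf")` on the file from the original report would be expected to show a 1.4 header together with the UseAttachments page mode, if the assumptions above hold for that file.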

@hhrutter
Collaborator
hhrutter commented Sep 25, 2024

Hello!

pdfcpu maintainer here!

The PDF spec defines the page mode "UseAttachments" only as of PDF 1.6.

Since your file's version is 1.4, this is a spec violation and has to be reported.

This is not related to attachment support per se.

@GitHubRulesOK
GitHubRulesOK commented Sep 25, 2024

I am confused: /UseAttachments is a viewer control, and the file does not need to contain an attachment. Here is a PDF 1.1 file from the pdf-association discussion with the /UseAttachments view: https://github.com/pdf-association/pdf-issues/files/11260962/out.pdf

Without trawling all the way back to 1.1, I see no reason to reject its presence in older version numbers.
The viewing state is unrelated to whether attachments are allowed. Clearly PDF/A-1a and PDF/A-1b don't allow attachments, but that should not affect the viewer setting, or should it?

@petervwyatt

That is correct - ISO 19005 (PDF/A) does not prohibit private data (including key values of documented keys) so long as it does not affect the visual appearance of rendered pages. In this case, it is just a viewer UI recommendation. You can consider it "private" since it wasn't formally documented until much more recently, as per the resolved PDF Errata 275, which is where the mentioned attachment comes from (see its resolution wording).

See also our recent publication "Understanding Private Data in PDF/A" for more info.

@hhrutter
Collaborator

This issue is not about the catalog entry ViewerPreferences.

It's about catalog entry PageMode:

The validation for this very file fails because it sets PageMode to UseAttachments and this is prohibited by the spec until v1.6:
Screenshot 2024-09-26 at 11 15 40

If this sounds like a contradiction then maybe it needs to be addressed in PDF Errata 275

@petervwyatt

That may be what the spec said, but Adobe implemented the otherwise undocumented setting in its products from the moment it was introduced, and so did other vendors to match that behaviour. Thus there are extant files in the wild with this value from that era. All PDF 2.0 has done is acknowledge this fact and make the de facto spec official. View it as an editorial oversight dating back many years :-)

@hhrutter
Collaborator

The latest commit relaxes validation of pageMode UseAttachments.
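
For readers following the thread, the difference between a strict and a relaxed check can be sketched roughly as below. This is an illustration only, not pdfcpu's actual code (pdfcpu is written in Go); the names `PAGE_MODE_SINCE`, `validate_page_mode`, and the `relaxed` flag are hypothetical. The version for UseAttachments (1.6) comes from this thread; the one for UseOC (1.5) is taken from my reading of the PDF 1.7 reference.

```python
# Page modes that carry an explicit "introduced in" version.
# Values without an entry here are treated as valid in every version.
PAGE_MODE_SINCE = {
    "UseOC": (1, 5),           # assumption: per the PDF 1.7 reference
    "UseAttachments": (1, 6),  # as stated in this thread
}

KNOWN_PAGE_MODES = {
    "UseNone", "UseOutlines", "UseThumbs",
    "FullScreen", "UseOC", "UseAttachments",
}


def validate_page_mode(mode, version, relaxed=False):
    """Return mode if acceptable for the given (major, minor) version.

    In strict mode, a value newer than the file's version is rejected.
    In relaxed mode, any known value is accepted regardless of version.
    """
    if mode not in KNOWN_PAGE_MODES:
        raise ValueError(f"invalid PageMode: {mode}")
    since = PAGE_MODE_SINCE.get(mode, (1, 0))
    if version < since and not relaxed:
        raise ValueError(
            f"PageMode {mode} requires PDF {since[0]}.{since[1]}, "
            f"file is {version[0]}.{version[1]}"
        )
    return mode
```

With this sketch, `validate_page_mode("UseAttachments", (1, 4))` raises a `ValueError`, while the same call with `relaxed=True` accepts the value, which mirrors the behaviour change described in the comment above.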
