
kvnlnk/pdf2pdfa


pdf2pdfa

A simple Ghostscript-based PDF to PDF/A-{1, 2, 3}B converter written in Python, with optional validation.

Currently only works on Windows.

Requirements

  • A recent version of Python (at least 3.12.6)
  • A recent version of Ghostscript (at least Ghostscript 10.05.1)
  • A recent version of Java (at least Java 21)
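A quick way to check these requirements before running the converter is sketched below. This is not part of pdf2pdfa. The executable names (`gswin64c`, `gswin32c`, `gs`, `java`) are assumptions about a typical installation, and the sketch only checks that the tools are on PATH, not their versions.

```python
import shutil
import sys

# Minimum Python version taken from the requirements list above.
MIN_PYTHON = (3, 12, 6)

# Assumed executable names: gswin64c/gswin32c are the usual Ghostscript
# console binaries on Windows; "gs" is common on other systems.
GHOSTSCRIPT_CANDIDATES = ("gswin64c", "gswin32c", "gs")


def python_ok(version=None):
    """Return True if the given (or the running) Python version meets MIN_PYTHON."""
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    return version >= MIN_PYTHON


def find_first(candidates):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("Ghostscript:", find_first(GHOSTSCRIPT_CANDIDATES) or "not found on PATH")
    print("Java:", find_first(("java",)) or "not found on PATH")
```

If Ghostscript or Java is reported as missing, it may still be installed but not added to PATH.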

Installation

Download the repository as a ZIP file and extract it, or clone it with Git:

    git clone https://github.com/kvnlnk/pdf2pdfa.git
    cd pdf2pdfa

If this fails because some paths exceed 260 characters, move the folder closer to the root of your drive (e.g. C:\pdf2pdfa), or, if you have Git 2.10 or higher, enable long paths with the following command:

git config --system core.longpaths true
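To see whether an extracted copy of the repository actually contains paths near the limit, a small check like the following can help. It is a sketch, not part of pdf2pdfa: the function name `paths_over_limit` is made up for illustration, and the threshold of 260 is taken from the note above.

```python
import os

# Limit mentioned in the note above; treated here as "260 characters or more".
MAX_PATH = 260


def paths_over_limit(root, limit=MAX_PATH):
    """Return all files and folders under root whose absolute path has at least `limit` characters."""
    root = os.path.abspath(root)
    too_long = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) >= limit:
                too_long.append(full)
    return sorted(too_long)


if __name__ == "__main__":
    for path in paths_over_limit("."):
        print(len(path), path)
```

An empty result suggests the current location is short enough; otherwise the listed paths show how much closer to the drive root the folder would need to be.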

Usage

Take the PDF file you want to convert and run the following command:

python pdf2pdfa.py [-h] [-i INPUT] [-o OUTPUT] [-t {profile,independent}] [-pv {1,2,3}] [-v {True,False}]

The output file name and location are optional. If you omit them, the output file keeps the input file's name with a _pdfa suffix added (see the -o option below for the default location). The following optional arguments are available:

  -h, --help            show this help message and exit

  -i INPUT, --input INPUT
                        Input PDF file with path

  -o OUTPUT, --output OUTPUT
                        Output path for the converted PDF/A-1B file. Default will be the current directory + '/[input_filename]_pdfa.pdf'

  -t {profile,independent}, --type {profile,independent}
                        Whether to use a color profile or device independent color.

  -pv {1,2,3}, --pdfaVersion {1,2,3}
                        PDF/A version (1, 2, or 3). Default is 1.

  -v {True,False}, --validate {True,False}
                        Validate the PDF/A file after conversion.  Default is False.                 
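The option list above could be produced by an argparse setup along the following lines. This is a minimal sketch reconstructed from the help text, not the actual source of pdf2pdfa.py. The helper `default_output` is hypothetical and assumes that `[input_filename]` means the input name without its extension.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser that mirrors the options listed in the help text."""
    parser = argparse.ArgumentParser(prog="pdf2pdfa.py")
    parser.add_argument("-i", "--input", help="Input PDF file with path")
    parser.add_argument("-o", "--output", help="Output path for the converted file")
    parser.add_argument("-t", "--type", choices=["profile", "independent"],
                        help="Color profile or device independent color")
    parser.add_argument("-pv", "--pdfaVersion", type=int, choices=[1, 2, 3],
                        default=1, help="PDF/A version (1, 2, or 3)")
    # The help text shows True/False as literal choices, so they are kept as strings.
    parser.add_argument("-v", "--validate", choices=["True", "False"],
                        default="False", help="Validate the file after conversion")
    return parser


def default_output(input_path):
    """Hypothetical default: current directory + '<input name>_pdfa.pdf'."""
    return Path.cwd() / f"{Path(input_path).stem}_pdfa.pdf"


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.input:
        print("Output:", args.output or default_output(args.input))
```

Because `-v` takes the literal strings True and False, a caller has to compare against `"True"` rather than rely on truthiness: the string `"False"` is truthy in Python.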

Examples

Convert a PDF to PDF/A-1B using a color profile and validate the output:

python pdf2pdfa.py -i input.pdf -o output.pdf -t profile -pv 1 -v True

Convert a PDF to PDF/A-2B using a color profile and validate the output:

python pdf2pdfa.py -i input.pdf -o output.pdf -t profile -pv 2 -v True

Convert a PDF to PDF/A-3B using device independent color, without validating the output:

python pdf2pdfa.py -i input.pdf -o output.pdf -t independent -pv 3 -v False
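To convert several files at once, the same command can be driven from a short Python script. The sketch below is not shipped with pdf2pdfa. The function names, the folder names and the `_pdfa.pdf` output naming are illustrative choices, and it assumes the script is run from the repository folder and that pdf2pdfa.py returns a non-zero exit code on failure.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdf, out_dir, pdfa_version=2, color_type="profile", validate=True):
    """Build the pdf2pdfa.py command line for one input file."""
    pdf = Path(pdf)
    output = Path(out_dir) / f"{pdf.stem}_pdfa.pdf"
    return [
        sys.executable, "pdf2pdfa.py",
        "-i", str(pdf),
        "-o", str(output),
        "-t", color_type,
        "-pv", str(pdfa_version),
        "-v", "True" if validate else "False",
    ]


def convert_folder(in_dir, out_dir):
    """Run the converter once for every *.pdf file in in_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(in_dir).glob("*.pdf")):
        result = subprocess.run(build_command(pdf, out_dir))
        # Assumption: a non-zero exit code signals a failed conversion.
        status = "ok" if result.returncode == 0 else f"failed ({result.returncode})"
        print(pdf.name, status)


if __name__ == "__main__":
    convert_folder("input_pdfs", "converted")
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.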

Credits

This project was inspired by and references:

  • pdf2archive by matteosecli - A simple Ghostscript-based PDF/A-1B converter written as a shell script.

Licensing

  • This project is licensed under the GNU General Public License - see the LICENSE file for details.
  • Ghostscript is licensed under the GNU Affero General Public License - see the Ghostscript License for details. Both PDFA_def.ps and srgb.icc are used from Ghostscript.
  • VeraPDF is open source software dual licensed under MPL v2+ and GPL v3+ - see the VeraPDF License for details.
