📄 An efficient Python script for automated, recursive PDF to TXT conversion. Intelligently processes nested folders, mirroring the original structure for clean output. Demonstrates practical scripting, file system management, and automation.

JackRipper01/PDF-to-TXT

📄 Automated PDF to TXT Converter: Efficient Document Text Extraction

This project is a practical Python script designed for automated, recursive conversion of PDF documents into plain text files. It intelligently navigates deeply nested folder structures, mirroring the original directory layout to ensure organized output of the .txt files.

It showcases practical scripting, recursive file system management, and automation skills.

Key Features & Technical Details:

  • Recursive File System Traversal: Automatically scans through complex and deeply nested directory structures to locate all PDF files, regardless of their location.
  • Batch PDF to TXT Conversion: Efficiently converts multiple PDF documents into their corresponding plain text files.
  • Preserves Directory Structure: Recreates the exact original folder hierarchy for the converted .txt files, ensuring intuitive organization and easy retrieval.
  • Simple Python Utility: A straightforward, single-script solution for automating common document processing tasks and making PDF content accessible.

This utility demonstrates strong command of Python scripting for automation, file system manipulation, and practical data transformation.
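The traversal and mirroring described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the names `convert_tree` and `extract_with_pypdf2`, the `input_dir`/`output_dir` parameters, and the use of `pathlib` are assumptions made for the example. Only the PyPDF2 dependency comes from this README, and the `PdfReader` call assumes PyPDF2 2.0 or later.

```python
from pathlib import Path
from typing import Callable, List


def extract_with_pypdf2(pdf_path: Path) -> str:
    """Extract the text of every page with PyPDF2 (imported lazily)."""
    from PyPDF2 import PdfReader  # assumption: PyPDF2 >= 2.0 is installed

    reader = PdfReader(str(pdf_path))
    # extract_text() can return None or "" for pages without a text layer.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def convert_tree(
    input_dir: Path,
    output_dir: Path,
    extract: Callable[[Path], str] = extract_with_pypdf2,
) -> List[Path]:
    """Convert every PDF under input_dir, mirroring its folders in output_dir.

    Hypothetical helper: the real script may be organized differently.
    """
    written = []
    # rglob walks all nested subfolders; the pattern is case-sensitive on
    # case-sensitive file systems, so "*.PDF" files would need extra handling.
    for pdf_path in sorted(input_dir.rglob("*.pdf")):
        relative = pdf_path.relative_to(input_dir)
        target = (output_dir / relative).with_suffix(".txt")
        # Recreate the original folder hierarchy before writing.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(extract(pdf_path), encoding="utf-8")
        written.append(target)
    return written
```

Passing the extractor as a parameter keeps the folder-mirroring logic independent of the PDF library, which also makes it easy to try the traversal on dummy files.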

Instructions:

Place your PDF files in any arrangement of nested folders, however deep. When you run the .py file, the same folder structure is mirrored, with every PDF file converted to a TXT file.

Note: The References and Acknowledgements sections are removed from the output, because this tool was built to convert PDF papers to TXT.
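One simple way such section removal could work is sketched below. This README does not say how the script detects these sections, so everything here is an assumption: the helper name `strip_tail_sections` is hypothetical, and the sketch simply truncates the text at the first line that consists only of a "References" or "Acknowledgements" heading (optionally numbered). That suits papers where these sections come last, but it would also drop any appendix that follows them.

```python
import re

# A heading alone on its line, optionally numbered ("7. References"),
# matched case-insensitively. Covers "Acknowledgment(s)" and
# "Acknowledgement(s)" spellings.
_TAIL_HEADING = re.compile(
    r"^[ \t]*(?:\d+\.?[ \t]+)?(?:references|acknowledge?ments?)[ \t]*:?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_tail_sections(text: str) -> str:
    """Cut the text at the first References/Acknowledgements heading.

    Hypothetical helper: a heuristic sketch, not the script's real logic.
    """
    match = _TAIL_HEADING.search(text)
    if match is None:
        return text  # no such heading found, leave the text untouched
    return text[: match.start()].rstrip() + "\n"
```

Because the pattern only matches a heading that stands alone on a line, ordinary sentences that mention "references" in the body text are left in place.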

Instructions:

Install the dependency with pip install PyPDF2, or install from the requirements file with pip install -r requirements.txt.

Then run the .py script inside the src folder.
