pdfminer Archives

Extracting text from a PDF file using PDFMiner in Python?

August 20, 2022 by Magenaut

I am looking for documentation or examples on how to extract text from a PDF file using PDFMiner with Python.

How do I use pdfminer as a library?

August 17, 2022 by Magenaut

I am trying to get text data from a PDF using pdfminer. I can already extract this data to a .txt file with the pdfminer command-line tool pdf2txt.py; I then run a Python script to clean up the .txt file. I would like to fold the PDF extraction into that script and save myself a step.
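One way this could look, as a rough sketch rather than a definitive answer: it assumes the maintained pdfminer.six fork is installed (`pip install pdfminer.six`), which provides `pdfminer.high_level.extract_text`; older pdfminer releases may not have that module. The names `clean_text` and `pdf_to_clean_text` are hypothetical, and the whitespace clean-up is only a stand-in for whatever the existing script does.

```python
import re


def clean_text(raw: str) -> str:
    """Hypothetical clean-up step: tidy whitespace in extracted text."""
    # pdfminer's plain-text output typically ends each page with a form feed;
    # treat it as a paragraph break.
    text = raw.replace("\f", "\n\n")
    # Drop trailing spaces and tabs at the end of each line.
    text = re.sub(r"[ \t]+\n", "\n", text)
    # Collapse runs of three or more newlines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def pdf_to_clean_text(pdf_path: str) -> str:
    """Extract text from a PDF in-process and clean it, with no intermediate .txt file."""
    # Assumption: pdfminer.six is installed. The import lives inside the
    # function so clean_text stays usable even when it is not.
    from pdfminer.high_level import extract_text

    return clean_text(extract_text(pdf_path))
```

Calling `pdf_to_clean_text("report.pdf")` (the file name is a placeholder) would then return the cleaned text as a string, replacing the separate pdf2txt.py run.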