Allow calling high level functions with file-like objects #392

Closed
jstockwin opened this issue Mar 16, 2020 · 1 comment · Fixed by #393

Comments

@jstockwin
Member

Feature request

Some of the functions in high_level.py (extract_text and extract_pages) take a pdf_file parameter, which is the path to the PDF file. This means the PDF must exist as a file on disk.

It would be nicer if there were functions available which allowed any file-like object (like extract_text_to_fp does). This means I can still call the functions even if the PDF isn't on my hard drive.
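
To illustrate the limitation: with a path-only API, a PDF that only exists in memory (for example, bytes downloaded over HTTP) has to be written to a temporary file first. Below is a minimal sketch of that workaround using only the standard library. Both `call_with_temp_path` and `size_of_file_at` are hypothetical names made up for this example; `size_of_file_at` just stands in for a path-only function such as extract_text.

```python
import os
import tempfile


def call_with_temp_path(data: bytes, path_only_func):
    """Write in-memory bytes to a temporary file so a path-only function can read them."""
    # delete=False so the file can be reopened by name on every platform.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(data)
        tmp_path = tmp.name
    try:
        return path_only_func(tmp_path)
    finally:
        # Always clean up the temporary file, even if the call fails.
        os.remove(tmp_path)


def size_of_file_at(path: str) -> int:
    # Hypothetical stand-in for a path-only function such as extract_text(pdf_file).
    with open(path, "rb") as fp:
        return len(fp.read())
```

Accepting a file-like object directly would make this detour through the file system unnecessary.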

I would guess we don't want breaking changes to the existing functions (?), so will create new *_from_io functions, and then the API for the existing ones will be unchanged.

@pietermarsman
Member

> I would guess we don't want breaking changes to the existing functions (?), so will create new *_from_io functions, and then the API for the existing ones will be unchanged.

No breaking changes is indeed what we want. But you can do it like the read_csv() method in pandas. It supports both paths and file-like objects with a single method.
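
A rough sketch of what that single-entry-point approach could look like, using only the standard library. The names `open_path_or_file` and `read_header` are hypothetical and exist only for this illustration; this is not necessarily how the fix was implemented. The idea is that a path gets opened (and closed again) by the helper, while a file-like object is passed through and left open for the caller.

```python
import io
import os
from contextlib import contextmanager
from typing import BinaryIO, Iterator, Union


@contextmanager
def open_path_or_file(source: Union[str, os.PathLike, BinaryIO]) -> Iterator[BinaryIO]:
    """Yield a binary file object for either a path or an already-open file-like object."""
    if isinstance(source, (str, os.PathLike)):
        # We opened the file, so we are responsible for closing it.
        fp = open(source, "rb")
        try:
            yield fp
        finally:
            fp.close()
    else:
        # The caller owns the file-like object; do not close it.
        yield source


def read_header(source, n: int = 8) -> bytes:
    # Hypothetical high-level function that works with both kinds of input.
    with open_path_or_file(source) as fp:
        return fp.read(n)
```

Existing callers that pass a path keep working unchanged, and new callers can pass something like `io.BytesIO(pdf_bytes)` to the same function.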

jstockwin added a commit to jstockwin/pdfminer.six that referenced this issue Mar 26, 2020
pietermarsman pushed a commit that referenced this issue Mar 26, 2020
* Allow file-like inputs to high level functions (#392)

* PR Review - move open_filename to utils
davidfraser pushed a commit to j5int/pdfminer.six that referenced this issue Sep 15, 2020
…er#393)

* Allow file-like inputs to high level functions (pdfminer#392)

* PR Review - move open_filename to utils

(cherry picked from commit 1a4a06d)