Reading tables from a PDF in Java
I need to read this document. I tried to do it with pdfbox:
PDDocument document = PDDocument.load(new File(path));
PDFTextStripper stripper = new PDFTextStripper();
String text = stripper.getText(document);
But no text is returned. How else can I do this?
Author: Александр Доценко, 2019-05-30
1 answer
As far as your use of PDFBox goes, you're doing everything right. The problem is that your document contains no text: it consists of scanned images placed in a PDF. PDFBox does not recognize text in images (it does no OCR), so nothing ends up in the text variable except spaces and line breaks. You get a similar effect if you insert a picture of text into a .doc document and then try to edit that text. You can easily verify that your code works by substituting a different PDF, one with a real text layer, as the input document.
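The symptom described in this answer (only spaces and line breaks in the result) can be checked for explicitly. The following is a minimal sketch in plain Java, not part of PDFBox: the class name ScannedPdfCheck, the method hasNoExtractableText, and the sample strings are hypothetical and made up for illustration. In the question's code, the string to test would be the value returned by stripper.getText(document).

```java
public class ScannedPdfCheck {

    // Returns true when the extracted text has no visible characters,
    // i.e. it is null, empty, or consists only of whitespace.
    static boolean hasNoExtractableText(String extractedText) {
        if (extractedText == null) {
            return true;
        }
        // allMatch on an empty stream is true, so "" also counts as "no text".
        return extractedText.codePoints().allMatch(Character::isWhitespace);
    }

    public static void main(String[] args) {
        // Hypothetical sample values standing in for stripper.getText(document).
        String fromScannedPdf = "\r\n \r\n\r\n";            // what a scan-only PDF tends to yield
        String fromTextPdf = "Name  Qty\r\nBolt  4\r\n";   // what a PDF with a text layer yields

        System.out.println(hasNoExtractableText(fromScannedPdf));
        System.out.println(hasNoExtractableText(fromTextPdf));
    }
}
```

If the check returns true for your file but false for an ordinary text PDF, the extraction code is fine and the document is most likely a scan. Note that Character.isWhitespace does not treat non-breaking spaces as whitespace, so this sketch may need adjusting for documents that contain them.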
Author: Дмитрий, 2019-05-30 17:08:43