Reading tables from a PDF in Java

I need to read this document. I tried to do it with PDFBox:

    PDDocument document = PDDocument.load(new File(path));
    PDFTextStripper stripper = new PDFTextStripper();
    String text = stripper.getText(document);

But no text is returned. How else can I do this?


1 answer

As far as your use of PDFBox goes, you're doing everything right. The problem is that your document contains no text: its pages are scanned images placed in a PDF. PDFBox does not recognize text in images, so nothing ends up in the text variable except spaces and line breaks. You get a similar effect if you insert a picture of text into a .doc document and then try to edit that text. You can also easily verify that your code works by substituting another PDF document for the input.
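To tell this situation apart from a real extraction failure, you can check whether the extracted string holds anything besides whitespace. The following is only a minimal sketch in plain Java: the class name ExtractedTextCheck and both method names are made up for illustration and are not part of PDFBox. It assumes you pass in the String returned by stripper.getText(document).

```java
// Hypothetical helper, not part of PDFBox: inspects the String that
// PDFTextStripper.getText(...) returned and reports whether it contains
// any visible characters at all.
final class ExtractedTextCheck {

    private ExtractedTextCheck() {}

    /** Counts code points that are not whitespace, space separators, or control characters. */
    static long countVisibleChars(String extractedText) {
        if (extractedText == null) {
            return 0;
        }
        return extractedText.codePoints()
                .filter(cp -> !Character.isWhitespace(cp)
                        && !Character.isSpaceChar(cp)
                        && !Character.isISOControl(cp))
                .count();
    }

    /** True when the text consists only of spaces and line breaks (or is null or empty). */
    static boolean hasNoRealText(String extractedText) {
        return countVisibleChars(extractedText) == 0;
    }
}
```

If ExtractedTextCheck.hasNoRealText(text) returns true for your file but false for an ordinary text-based PDF, that supports the explanation above: the pages are most likely images, and the text would have to come from OCR rather than from PDFTextStripper.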

Author: Дмитрий, 2019-05-30 17:08:43