PDF reader for Java that splits text like PDF.js

We have a project where we use pdf.js to render a PDF into a web page. It creates HTML container elements for the PDF pages, and the content of each page is split into HTML span elements in the view.

The attached image shows how the PDF text is rendered in the view. It also shows that each span has a data-key and that a span does not correspond to a line in the PDF.


Now I need a PDF reader for Java that reads the content and breaks it into the same spans, either with the data-key or just as the spans in the same order.

There are a lot of Java libraries available that read PDF content, but they return the content line by line, and that does not solve my issue. I need a Java library that can break the content into pieces equivalent to the spans in the view.
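To make the difference concrete, here is a minimal sketch in plain Java that uses no PDF library at all. `Fragment`, `Span`, `toSpans`, and `toLines` are hypothetical names invented for illustration, and the assumption that the data-key is a sequential index is mine; the real keys in the view may be generated differently. It only shows the output shape I am after (one entry per fragment, in order) next to the line-by-line shape that existing libraries give me.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SpanVsLine {
    // Hypothetical stand-in for one text fragment as a PDF reader might report it:
    // the text plus the baseline y coordinate it was drawn at.
    record Fragment(String text, double y) {}

    // Hypothetical equivalent of one span in the browser view.
    // Assumption: the data-key is simply the position of the span in reading order.
    record Span(int dataKey, String text) {}

    // Span-level output (what I need): one entry per fragment, in input order.
    static List<Span> toSpans(List<Fragment> fragments) {
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < fragments.size(); i++) {
            spans.add(new Span(i, fragments.get(i).text()));
        }
        return spans;
    }

    // Line-level output (what I get today): fragments on the same baseline are merged.
    static List<String> toLines(List<Fragment> fragments) {
        Map<Double, StringBuilder> byBaseline = new LinkedHashMap<>();
        for (Fragment f : fragments) {
            byBaseline.computeIfAbsent(f.y(), k -> new StringBuilder()).append(f.text());
        }
        List<String> lines = new ArrayList<>();
        for (StringBuilder sb : byBaseline.values()) {
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Fragment> fragments = List.of(
            new Fragment("Invoice ", 700.0),
            new Fragment("No. 42", 700.0),
            new Fragment("Total due", 680.0));
        System.out.println(toLines(fragments)); // two merged lines
        System.out.println(toSpans(fragments)); // three spans, keyed 0..2
    }
}
```

With the sample input, the first two fragments share a baseline, so the line-based result merges them, while the span-based result keeps all three fragments separate and in order.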



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
