Search for images containing paper with text

There are a number of photos. I need to find among them the ones that show a sheet of paper with text. That is, the program should separate photos of book/notebook pages from all other photos (faces, nature, etc.). Only photos where the paper with text takes up most of the frame should be selected (for example, a photo of a wall with a small ad glued to it is of no interest, so there is no need to look for small text regions in the image).

What technologies can be used to implement this? So far I see these options:

  • OpenCV (it has many text recognition/detection algorithms, but what I found is designed for a more general case, such as "searching for text in a photo" or "highlighting areas with text")
  • Neural networks

I couldn't find anything specific to my task by Googling, because, as far as I understand, the usual task is to locate text regions or to recognize characters, and I don't need either.

If there are any libraries for this in C++/Java/Python, that would be very useful.

Here are 2 examples of the desired images and 1 example of an unsuitable image.

[Image: example of a desired image] [Image: example of a desired image] [Image: example of an unsuitable image]

Author: AnatoliySultanov, 2017-04-11

2 answers

Here is an example of a text detection algorithm in MATLAB. I would advise you to start with it. If you don't have MATLAB, you can do the same with OpenCV.

It does solve a more general problem. But if the text is on a light background and takes up most of the image, then perhaps this is enough for your purposes.
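To illustrate the "light background, text takes up most of the image" idea, here is a very rough sketch in plain Python. It is not the MATLAB algorithm mentioned above, only a crude pre-filter: it counts bright (paper) and dark (ink) pixels in a grayscale image. The function name and all thresholds are assumptions made up for illustration and would need tuning on real photos.

```python
def looks_like_text_page(gray, bright_thresh=160, dark_thresh=90,
                         min_bright=0.6, min_dark=0.02, max_dark=0.35):
    """Crude check: is the image mostly light paper with some dark ink?

    gray is a 2D sequence of rows holding 0..255 grayscale values.
    All thresholds are illustrative guesses, not tuned values.
    """
    total = bright = dark = 0
    for row in gray:
        for value in row:
            total += 1
            if value >= bright_thresh:
                bright += 1      # counts as paper
            elif value <= dark_thresh:
                dark += 1        # counts as ink
    if total == 0:
        return False
    bright_frac = bright / total
    dark_frac = dark / total
    # Mostly paper, plus a moderate amount of ink (not blank, not a dark scene).
    return bright_frac >= min_bright and min_dark <= dark_frac <= max_dark


# Synthetic example: a white 20x20 "page" with dark "text lines" on every 4th row.
page = [[40 if (r % 4 == 0 and 2 <= c < 18) else 255 for c in range(20)]
        for r in range(20)]
night_scene = [[40] * 20 for _ in range(20)]

print(looks_like_text_page(page))
print(looks_like_text_page(night_scene))
```

With OpenCV installed, a real photo could be fed in as `cv2.imread(path, cv2.IMREAD_GRAYSCALE).tolist()`. A check like this will also accept things such as a white wall with a dark object on it, so it is probably only useful as a cheap first pass before a real text detector or a classifier.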

 0
Author: Dima, 2017-06-14 20:20:24

1 Neural networks

2 Python

3 An ordinary convolutional neural network: give the data generator a folder containing 2 subfolders, suitable/unsuitable, and that's all)
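A minimal sketch of what this could look like, assuming TensorFlow/Keras is installed and the photos are sorted into two subfolders (the names `suitable` and `unsuitable`, the image size, the layer sizes and the function names are all assumptions for illustration). It uses `tf.keras.utils.image_dataset_from_directory` in place of the older generator classes; both take their labels from the subfolder names.

```python
from pathlib import Path

IMG_SIZE = (128, 128)  # assumed input size, chosen arbitrarily


def find_class_folders(root):
    """Return the sorted subfolder names of root; each one is treated as a class."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def train_page_classifier(root, epochs=5):
    """Train a small binary CNN on root/<class_name>/*.jpg style data."""
    # Imported here so the folder helper above works without TensorFlow.
    import tensorflow as tf

    classes = find_class_folders(root)
    if len(classes) != 2:
        raise ValueError(f"expected exactly 2 class folders, found {classes}")

    # Labels are inferred from the subfolder names.
    common = dict(labels="inferred", label_mode="binary", image_size=IMG_SIZE,
                  batch_size=32, validation_split=0.2, seed=42)
    train_ds = tf.keras.utils.image_dataset_from_directory(
        str(root), subset="training", **common)
    val_ds = tf.keras.utils.image_dataset_from_directory(
        str(root), subset="validation", **common)

    model = tf.keras.Sequential([
        tf.keras.Input(shape=IMG_SIZE + (3,)),
        tf.keras.layers.Rescaling(1.0 / 255),          # scale pixels to 0..1
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # single probability output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(train_ds, validation_data=val_ds, epochs=epochs)
    return model
```

Usage would be something like `model = train_page_classifier("photos")`, where `photos/suitable` and `photos/unsuitable` hold the labelled examples. How well such a small network works depends heavily on how many labelled photos are available.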

 0
Author: Polina, 2021-01-10 16:28:40