What are the terms "Cascade " and" Classifier " in relation to computer vision?

Whenever I read something about OpenCV and Computer Vision I get two terms that are Cascade and Classifier, these Terms leave me very confused about what they can be or mean.

Therefore, I would like the following doubts to be clarified:

  1. What is Cascade?
  2. what is Classifier?
  3. There are differences between Cascade and Classifier or both represent the same things?
Author: Luiz Vieira, 2017-12-12

1 answers

Classifiers

A classifier is a computational system that, having some input data that characterizes an example of something, classifies that something among some options.

The use of classifiers in Computer Vision is the most diverse. Famous illustrative examples are those machines that separate ripe tomatoes from Greens or ripe coffee beans from Greens, commonly based on color analysis. They are also used in the detection of intruders, from fireplaces, face recognition (identity), Nudity Detection, recognize digits , and so on.

However, classifiers are also useful in many other tasks that do not involve image processing. They can, for example, be used to classify a stock as buy or sell based on its transaction history, a player as novice or expert based on its game history, flowers in different categories based on their physical characteristics, different spelling structures in a sentence written based only on the text, different voice commands based on variations in voice frequency, and so on.

This term is traditionally used in Artificial Intelligence , especially in machine learning . What happens is that machine learning is quite popular nowadays, mainly in the processing of images, and may generate confusion that the classifier term is unique to that area. To learn how a classifier works, take a look at in this and also in this another question.

Cascades

Cascade (cascade, in English) or Haar Cascade is the "nickname" of a famous image object search algorithm, whose official name is Viola-Jones algorithm (due to the names of the authors).

This algorithm it uses masks (the so-called Haar Features) to characterize an object through variations in brightness (mainly at edges). Masks capture these variations in different amplitudes and directions, and values characterizing a certain type of object (an entire face or just an eye, for example) are learned with a machine learning algorithm called AdaBoost to generate multiple classifiers, one for each Haar Feature. Since these classifiers are produced - that is, trained from example images (in the case of Cascade, with positive images - which have the object - and negative ones - which do not have the object), it finds the object in a new image by running the various "cascading" mini classifiers (and hence the nickname of the algorithm):

  1. the algorithm sets a window size, scales the Haar Features to that size, and scans the image being searched according to that window. (It is much more efficient the image, because the features can already be kept pre-scaled).
  2. in each window, the algorithm selects and executes one of the mini classifiers (for a given feature) based on the pixel values in the image under that window.
  3. if the feature returns false (that is, it is not the object from its point of view), the algorithm proceeds to the next window. If the windows are finished, he concludes that he does not have the desired object in the image.
  4. if a feature returns true( that is, it is the object from her point of view), the algorithm passes to the next classifier and repeats from Step 2. If there are no more mini classifiers to run, the algorithm concludes because it found the object in that window.

The following image illustrates this process in the OpenCV face detector (features are reproduced from the OpenCV page).

insert the description of the image here

The use of AdaBoost is important for performance, since that this algorithm iteratively searches the image with windows of various sizes. So he is able to find objects quite robustly at different scales (that is, with different sizes). An important point is that your accuracy in finding objects strongly depends on the images used in the training. If images with the object are used in only one orientation in the training, the algorithm will not be able to identify the object in other orientations (for example example, a banana detection algorithm trained with bananas horizontally will not recognize them in any orientation other than that).

Conclusion

Definitely the terms don't mean the same thing. Classifiers are algorithms used for predicting classes in input data. The Viola-Jones algorithm, or Haar Cascade, or just Cascade, is an object detection algorithm that uses several classifiers.

 16
Author: Luiz Vieira, 2017-12-14 23:10:34