How to separate similar images (Python / Machine Learning)

Goal: separate the images that share the same characteristics out of a folder containing many images.

(e.g. foto1, foto2, foto3, foto4, foto5 -> foto1.Copo1, foto2.Copo2, foto3.Copo3; foto4.Dog1, foto5.Dog2, ...)

I would like some pointers on the subject. From what I have studied so far, I believe it would be something along the lines of: machine learning -> unsupervised -> clustering.

Author: Fred Guth, 2018-06-24

2 answers

Your question is very generic, so there is no way to answer it specifically.

One thing you can do is use k-means to cluster the images by some similarity criterion. You choose the criterion: 1) you can cluster by color, for example; 2) if the images are normalized, you can use SIFT and take the number of keypoint matches that are inliers as the criterion.
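To make the color option concrete, here is a minimal sketch, assuming scikit-learn and NumPy are installed. The images are synthetic arrays standing in for real photos (which you would load yourself, for example with Pillow), and the helper names `color_histogram` and `make_image` are made up for illustration. The number of clusters is also an assumption you have to choose.

```python
import numpy as np
from sklearn.cluster import KMeans


def color_histogram(image, bins=8):
    """Return a normalized per-channel color histogram for an RGB uint8 array."""
    channels = [
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]
    hist = np.concatenate(channels).astype(float)
    return hist / hist.sum()


def make_image(base_color, rng, size=32):
    """Synthetic stand-in for a photo: a solid color plus a little noise."""
    noise = rng.integers(-20, 21, size=(size, size, 3))
    return np.clip(np.array(base_color) + noise, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
# Hypothetical data: three reddish and three bluish images.
images = [make_image((200, 40, 40), rng) for _ in range(3)]
images += [make_image((40, 40, 200), rng) for _ in range(3)]

# One feature vector per image, then k-means on those vectors.
features = np.array([color_histogram(img) for img in images])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels)  # images with the same label landed in the same cluster
```

The cluster numbers themselves are arbitrary; only which images share a number matters.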

I'm assuming you don't have any category information for the images, since you mentioned unsupervised learning. If you do have category information, the results will be better.

 1
Author: Fred Guth, 2018-06-24 23:42:14

First, I would advise reducing the dimensionality of these images. If you apply k-means directly, you can run into the curse of dimensionality, which makes algorithms that rely on distances between points lose accuracy. I do not mean literally shrinking the image, but using PCA or SVD, since these retain the relevant information in the image.
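As a rough illustration of that reduction step, here is a sketch of PCA computed through SVD using only NumPy. The data is synthetic (flattened 32x32 grayscale images built around two made-up patterns), and `pca_reduce` and the choice of 5 components are assumptions for the example, not a recommendation.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components using SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


rng = np.random.default_rng(0)
# Hypothetical data: 20 grayscale 32x32 images flattened to 1024 values each,
# drawn around two different brightness patterns.
pattern_a = rng.random(1024)
pattern_b = rng.random(1024)
X = np.vstack([
    pattern_a + 0.05 * rng.standard_normal((10, 1024)),
    pattern_b + 0.05 * rng.standard_normal((10, 1024)),
])

X_reduced = pca_reduce(X, n_components=5)
print(X.shape, "->", X_reduced.shape)
```

The reduced rows can then be passed to k-means (or any other distance-based method) in place of the raw pixels.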

There are other clustering approaches, such as hierarchical clustering and autoencoders, that can be useful too.
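For the hierarchical option, a minimal sketch with scikit-learn's `AgglomerativeClustering` could look like this. The feature vectors are random placeholders for whatever you extract from your images, and both the Ward linkage and the two clusters are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
# Hypothetical feature vectors (e.g. color histograms or PCA outputs),
# five per group, eight features each.
features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(5, 8)),
    rng.normal(loc=1.0, scale=0.1, size=(5, 8)),
])

# Ward linkage merges, at each step, the pair of clusters that increases
# the within-cluster variance the least.
model = AgglomerativeClustering(n_clusters=2, linkage="ward")
labels = model.fit_predict(features)
print(labels)
```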

Another important point is the memory required to handle this many images. Depending on the algorithm and how much memory your computer has, you can freeze it.
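One possible way around the memory problem, sketched here under the assumption that scikit-learn is available, is to feed the features in batches to `MiniBatchKMeans.partial_fit` instead of loading everything at once. The generator `feature_batches` is hypothetical: it yields random data where real code would read and featurize a few images from disk at a time.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def feature_batches(rng, n_batches=5, batch_size=20, n_features=16):
    """Hypothetical generator standing in for loading and featurizing
    images from disk one batch at a time."""
    half = batch_size // 2
    for _ in range(n_batches):
        group_a = rng.normal(0.0, 0.1, size=(half, n_features))
        group_b = rng.normal(1.0, 0.1, size=(half, n_features))
        yield np.vstack([group_a, group_b])


rng = np.random.default_rng(0)
model = MiniBatchKMeans(n_clusters=2, n_init=3, random_state=0)

# Only one batch is held in memory at any moment.
for batch in feature_batches(rng):
    model.partial_fit(batch)

# Assign clusters to a fresh batch using the centers learned so far.
new_batch = next(feature_batches(rng, n_batches=1))
labels = model.predict(new_batch)
print(labels)
```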

There are also more direct methods, such as comparing pieces of image A with pieces of image B. (But I don't think that works very well.)
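Just to show what such a direct comparison might look like, here is a sketch that splits two same-sized grayscale images into patches and measures the mean absolute difference per patch. The function `patch_difference` and the patch size are made up for illustration. Note that this only works when the images are aligned, which is one reason the approach tends to be fragile.

```python
import numpy as np


def patch_difference(image_a, image_b, patch=8):
    """Mean absolute difference per patch between two same-sized grayscale images."""
    h, w = image_a.shape
    h, w = h - h % patch, w - w % patch  # drop edge pixels that don't fill a patch
    diff = np.abs(image_a[:h, :w].astype(float) - image_b[:h, :w].astype(float))
    # Reshape into (rows, patch, cols, patch) and average within each patch.
    return diff.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))


rng = np.random.default_rng(0)
image_a = rng.integers(0, 256, size=(32, 32)).astype(np.uint8)
image_b = image_a.copy()
image_b[:8, :8] = 0  # alter only the top-left patch

scores = patch_difference(image_a, image_b)
print(scores.shape)  # one score per patch
print(scores)        # only the altered patch has a non-zero score
```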

In short, there are several ways to do this type of clustering.

 0
Author: Júlio Cesar Pereira Rocha, 2018-08-23 13:01:17