Comparing objects in python

Question

There are two objects. Comparing them attribute by attribute shows that they are equal (that is, comparing x.min_x with y.min_x, x.min_y with y.min_y, and so on). But if you compare them with the == operator, they turn out not to be equal. Everything is run in PyCharm.

x = Rectangle(min_x=1, max_x=4, min_y=0, max_y=3)
y = Rectangle(min_x=1, max_x=4, min_y=0, max_y=3)
a = x.max_y == y.max_y  # True
a = x.max_x == y.max_x  # True
a = x.min_x == y.min_x  # True
a = x.min_y == y.min_y  # True
a = x == y  # False

How to solve the problem?

1

python data-structures

Author: hedgehogues, 2017-08-22

3 answers

You have two different objects with the same content. If you want to compare the contents of the objects rather than the references to them, you need at least to define the __eq__ method.
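One way to get such an __eq__ without writing it by hand, assuming Python 3.7 or later, is the standard dataclasses module: the decorator generates an __eq__ that compares the fields in order. This is only a sketch; the int annotations are an assumption about the Rectangle class from the question, whose definition is not shown.

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    # Field types are assumed; the question does not show the class body.
    min_x: int
    max_x: int
    min_y: int
    max_y: int


x = Rectangle(min_x=1, max_x=4, min_y=0, max_y=3)
y = Rectangle(min_x=1, max_x=4, min_y=0, max_y=3)

print(x == y)  # True: the generated __eq__ compares the fields
print(x is y)  # False: these are still two distinct objects
```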

4

Author: Sergey Gornostaev, 2017-08-22 15:19:56

total_ordering and magic methods.

Any operation applied to an object requires the object to have a method corresponding to that operation. Such a method is called special, or magic. Some magic methods are defined for objects by default; the rest you have to define yourself.

For example, for an object to behave like a function, i.e. to be callable with (): myObj(), myObj must have a __call__ method.

To compare objects with the > operator: myObj1 > myObj2, there must be a __gt__ method.

The same holds for any operation applicable to an object.
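A minimal sketch of both methods on one class. The Multiplier class and its factor attribute are hypothetical and exist only for illustration:

```python
class Multiplier:
    """Hypothetical class used only to illustrate special methods."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        # Invoked by obj(value)
        return self.factor * value

    def __gt__(self, other):
        # Invoked by obj > other
        if isinstance(other, Multiplier):
            return self.factor > other.factor
        return NotImplemented


double = Multiplier(2)
triple = Multiplier(3)

print(double(10))       # 20
print(triple > double)  # True
print(double > triple)  # False
```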

For comparison operators, to avoid defining every possible comparison operation by hand, it is convenient to use functools.total_ordering.
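A rough sketch of how this might look for rectangles. Ordering rectangles by area is an assumption made for this illustration, not something the question asks for; total_ordering only requires __eq__ plus one of __lt__, __le__, __gt__, __ge__ and derives the rest.

```python
from functools import total_ordering


@total_ordering
class Rectangle:
    def __init__(self, min_x, min_y, max_x, max_y):
        self.min_x, self.min_y = min_x, min_y
        self.max_x, self.max_y = max_x, max_y

    def area(self):
        return (self.max_x - self.min_x) * (self.max_y - self.min_y)

    def __eq__(self, other):
        # Assumption for this sketch: rectangles are compared by area
        if isinstance(other, Rectangle):
            return self.area() == other.area()
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, Rectangle):
            return self.area() < other.area()
        return NotImplemented


small = Rectangle(0, 0, 2, 2)  # area 4
large = Rectangle(0, 0, 5, 5)  # area 25

print(small < large)   # True, from __lt__
print(small >= large)  # False, derived by total_ordering
print(large > small)   # True, derived by total_ordering
```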

1

Author: vadim vaduxa, 2017-08-23 17:04:13

score 11 · Accepted Answer

By default, Python does not know how to compare user-defined objects by their contents: classes can be far more complex internally than plain sets of values. Comparing them may involve only some of the attributes rather than all of them, and sometimes requires extra processing. And that is before mentioning that we may sometimes need to compare objects of different classes.

So in the general case, two different instances of custom classes will always be considered unequal to each other. To test this, first define a class for our objects:

class Rectangle(object):
    def __init__(self, min_x, min_y, max_x, max_y):
        self.min_x, self.min_y = min_x, min_y
        self.max_x, self.max_y = max_x, max_y

Let's try to compare two arbitrary objects that have nothing in common, just in case:

foo = Rectangle(0, 0, 42, 42)
bar = Rectangle(4, 8, 15, 16)

print(foo == bar) # False

So far, in general, nothing unexpected. Now let's see what happens when comparing instances that have identical sets of attributes:

baz = Rectangle(0, 0, 42, 42)
# at first glance, this object is no different from the first one
print(foo == baz) # False

Contrary to expectations, the comparison still gives False: the == operator doesn't even try to look at the objects' attributes.

It seems that no matter what we try to compare an instance of a custom class with, the result will always be negative. This is close to the truth, but with one small exception:

print(foo == foo) # True

By default, every object is equal to itself: generally speaking, the default comparison is equivalent to checking object1 is object2. You can rely on this, although it is still better to make that check explicitly with is.

And if you remember how assignment works in Python, it becomes clear how the following case works:

quux = foo
print(quux == foo, quux is foo) # True True

We did not create a new object but simply bound the name quux to the existing foo object, so there is still only one object. And it is, as before, equal to itself.

Now about how this situation can be changed.

A brute-force solution is to take the dictionaries of all the attributes of the objects and compare them directly; dictionaries are considered equal if and only if they have the same keys and the corresponding values are equal:

print(foo.__dict__ == baz.__dict__) # True

However, such direct access to internal attributes from the outside without a good reason is considered ugly and un-Pythonic, and it does not fully solve the problem anyway: if the attribute values include not only simple values but also, for example, instances of other classes, the comparison will again fail to work as intended. The notation itself is also inconvenient and cumbersome.
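The limitation with nested instances can be seen in a small sketch. The Point and Box classes here are hypothetical and deliberately define no __eq__:

```python
class Point:
    """Hypothetical nested class without __eq__."""

    def __init__(self, x, y):
        self.x, self.y = x, y


class Box:
    """Hypothetical class that stores a Point instance as an attribute."""

    def __init__(self, corner, size):
        self.corner = corner
        self.size = size


a = Box(Point(0, 0), 10)
b = Box(Point(0, 0), 10)

# 'size' matches, but the two Point objects are compared by identity
print(a.__dict__ == b.__dict__)  # False

# With one shared Point, the dictionaries do compare equal
shared = Point(0, 0)
c = Box(shared, 10)
d = Box(shared, 10)
print(c.__dict__ == d.__dict__)  # True
```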

The "correct" solution in such cases is to define an __eq__ method in the class, which determines how objects interact with the == operator. The method takes two arguments, self and other, and returns True if the objects are equal and False if they are not. If objects of these types cannot be compared with each other at all, return NotImplemented: on seeing it, the interpreter will try the __eq__ method of the second operand, and if that also returns NotImplemented, it falls back to the default identity check, which gives False for two distinct objects.

Let's redefine our class so that __eq__ compares the coordinates of the rectangles pairwise:

class Rectangle(object):
    ...
    def __eq__(self, other):
        # compare two rectangles
        if isinstance(other, Rectangle):
            return (self.min_x == other.min_x and
                    self.min_y == other.min_y and
                    self.max_x == other.max_x and
                    self.max_y == other.max_y)
        # otherwise return NotImplemented
        return NotImplemented

foo = Rectangle(0, 0, 42, 42)
bar = Rectangle(4, 8, 15, 16)

print(foo == bar) # still False

baz = Rectangle(0, 0, 42, 42)
print(foo == baz) # True

print(foo == 10) # False
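The last comparison gives False rather than an error because Rectangle.__eq__ returns NotImplemented for an int, the int in turn does not know how to compare itself with a Rectangle, and the interpreter falls back to an identity check. The fallback can be traced with a small sketch; the Left, Right and Stubborn classes and the calls list are hypothetical names used only to record the order of calls:

```python
calls = []


class Left:
    def __eq__(self, other):
        calls.append("Left.__eq__")
        return NotImplemented  # Left does not know how to compare


class Right:
    def __eq__(self, other):
        calls.append("Right.__eq__")
        return isinstance(other, Left)


print(Left() == Right())  # True, the answer comes from Right.__eq__
print(calls)              # ['Left.__eq__', 'Right.__eq__']


class Stubborn:
    def __eq__(self, other):
        return NotImplemented  # never gives an answer


# Both sides decline, so the identity check decides the result
print(Stubborn() == Stubborn())  # False
s = Stubborn()
print(s == s)                    # True
```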

The __eq__ method adds a lot of flexibility to our class - objects can have attributes that are not involved in the comparison process (for example, unique names), and the algorithm itself can be complicated if necessary: for example, you can compare only the shape of rectangles, regardless of their location on the plane.

class Rectangle(object):
    def __init__(self, min_x, min_y, max_x, max_y, name=None):
        ...
        self.name = name

    def __eq__(self, other):
        if isinstance(other, Rectangle):
            return (self.max_x - self.min_x == other.max_x - other.min_x and
                         self.max_y - self.min_y == other.max_y - other.min_y)
        return NotImplemented

print(Rectangle(4, 8, 15, 16, 'spam') == Rectangle(14, 18, 25, 26, 'eggs')) # True

Or you can go further and compare rectangles only by area, and not only with each other but also, say, with triangles or even with numeric values:

from math import sqrt

class Rectangle(object):
    ...
    def area(self):
        return (self.max_x - self.min_x)*(self.max_y - self.min_y)

    def __eq__(self, other):
        if isinstance(other, (Rectangle, Triangle)):
            return self.area() == other.area()
        if isinstance(other, (int, float)):
            return self.area() == other
        return NotImplemented

class Triangle(object):
    def __init__(self, a_x, a_y, b_x, b_y, c_x, c_y):
        # defined by the coordinates of its three vertices
        self.a_x, self.a_y = a_x, a_y
        self.b_x, self.b_y = b_x, b_y
        self.c_x, self.c_y = c_x, c_y
        # precompute the side lengths
        self.ab = sqrt((a_x - b_x)**2 + (a_y - b_y)**2)
        self.bc = sqrt((b_x - c_x)**2 + (b_y - c_y)**2)
        self.ca = sqrt((c_x - a_x)**2 + (c_y - a_y)**2)

    def area(self):
        # Heron's formula: p is the semi-perimeter
        p = (self.ab + self.bc + self.ca)/2
        return sqrt(p * (p - self.ab) * (p - self.bc) * (p - self.ca))

    def __eq__(self, other):
        ...

print(Rectangle(57, 179, 59, 185) == Rectangle(-100, 4, -96, 7)) # True
print(Rectangle(1, 4, 9, 7) == Triangle(0, 0, 0, 6, 8, 0)) # True
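The triangle area here is a float computed through square roots, so an exact == happens to work for these particular values but is not guaranteed in general. If approximate equality of areas is acceptable, one option is math.isclose from the standard library. A minimal sketch, where heron_area is a hypothetical helper that mirrors the area method above and the default tolerance of isclose is an assumption about what counts as "equal":

```python
from math import isclose, sqrt


def heron_area(a, b, c):
    """Area of a triangle from its side lengths (Heron's formula)."""
    p = (a + b + c) / 2
    return sqrt(p * (p - a) * (p - b) * (p - c))


# Right triangle with legs 1 and 1; the exact area is 0.5
area = heron_area(1.0, 1.0, sqrt(2))

# Compare with a relative tolerance instead of exact equality
print(isclose(area, 0.5))  # True
```

Inside __eq__, the same idea would replace self.area() == other.area() with isclose(self.area(), other.area()).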

Strictly speaking, we did not have to define the __eq__ method for triangles at all: for a comparison between the two classes, it is enough that the method is present in at least one of them. But in general it is still worth defining it in both. First, this makes it possible to compare triangles with each other. Second, the interpreter normally tries the __eq__ of the left operand first and turns to the right operand's only if it gets NotImplemented, so having the method in both classes lets the comparison resolve on the first call regardless of operand order.
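One side effect worth knowing, assuming Python 3: a class that defines __eq__ without also defining __hash__ gets its __hash__ set to None, so its instances can no longer be put into sets or used as dictionary keys. A minimal sketch of restoring hashability for the coordinate-based comparison; the _key helper is a hypothetical name, and the approach only makes sense if the attributes are not changed after the object is created:

```python
class Rectangle:
    def __init__(self, min_x, min_y, max_x, max_y):
        self.min_x, self.min_y = min_x, min_y
        self.max_x, self.max_y = max_x, max_y

    def _key(self):
        # Hypothetical helper: the tuple of values that define equality
        return (self.min_x, self.min_y, self.max_x, self.max_y)

    def __eq__(self, other):
        if isinstance(other, Rectangle):
            return self._key() == other._key()
        return NotImplemented

    def __hash__(self):
        # Equal objects must have equal hashes, so hash the same key
        return hash(self._key())


seen = {Rectangle(0, 0, 42, 42)}
print(Rectangle(0, 0, 42, 42) in seen)                      # True
print(len({Rectangle(0, 0, 1, 1), Rectangle(0, 0, 1, 1)}))  # 1
```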

That's about it. For more information about __eq__, its inverse __ne__, and other special class methods, see the documentation.