Comparing objects in Python
I have two objects. Comparing them attribute by attribute (x.min_x with y.min_x, x.min_y with y.min_y, and so on) shows that every pair is equal. But if I compare the objects themselves with the == operator, they turn out not to be equal. Everything is run in PyCharm.
x = Rectangle(min_x=1, max_x=4, min_y=0, max_y=3)
y = Rectangle(min_x=1, max_x=4, min_y=0, max_y=3)
a = x.max_y == y.max_y # True
a = x.max_x == y.max_x # True
a = x.min_x == y.min_x # True
a = x.min_y == y.min_y # True
a = x == y # False
How to solve the problem?
3 answers
Python does not know how to compare instances of user-defined classes by default. Classes can be far more complex inside than a plain set of values: a comparison may involve only some of the attributes rather than all of them, and sometimes the values need extra processing first. On top of that, we may sometimes need to compare objects of different classes.
So in the general case, two different instances of custom classes will always be considered unequal to each other. To test this, first define a class for our objects:
class Rectangle(object):
    def __init__(self, min_x, min_y, max_x, max_y):
        self.min_x, self.min_y = min_x, min_y
        self.max_x, self.max_y = max_x, max_y
Let's try to compare two arbitrary objects that have nothing in common, just in case:
foo = Rectangle(0, 0, 42, 42)
bar = Rectangle(4, 8, 15, 16)
print(foo == bar) # False
So far, in general, nothing unexpected. Now let's see what happens when comparing instances that have identical sets of attributes:
baz = Rectangle(0, 0, 42, 42)
# it would seem that this object is no different from the first one
print(foo == baz) # False
Contrary to expectations, the comparison still gives False: the == operator doesn't even try to look at the attributes of the objects.
It seems that no matter what we try to compare an instance of a custom class with, the result will always be negative. This is close to the truth, but with one small exception:
print(foo == foo) # True
Each object is equal to itself by default. Generally speaking, the default comparison is equivalent to checking object1 is object2. You can rely on this, although it is better to write the check directly with is.
Well, if you remember how assignment works in Python, it becomes clear how the following case works:
quux = foo
print(quux == foo, quux is foo) # True True
We did not create a new object; we simply bound the name quux to the existing foo object, so there is still only one object. And it is, as before, equal to itself.
Now about how this situation can be changed.
A brute-force solution is to take the dictionaries of all the objects' attributes and compare them directly; dictionaries are considered equal if and only if they have the same keys and the corresponding values are equal:
print(foo.__dict__ == baz.__dict__) # True
However, reaching into internal attributes like this from outside without a good reason is considered ugly and un-Pythonic. It also does not fully solve the problem: if the attribute values include not only simple values but also, for example, instances of other classes, the comparison will again not work as expected. And the notation itself is inconvenient and cumbersome.
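The limitation with nested instances can be seen in a minimal sketch. The Point and Segment classes here are hypothetical and exist only for illustration; neither defines __eq__, so the nested Point values are compared by identity.

```python
class Point(object):
    """Hypothetical helper class with no __eq__ of its own."""
    def __init__(self, x, y):
        self.x, self.y = x, y


class Segment(object):
    """Hypothetical class whose attributes are Point instances."""
    def __init__(self, start, end):
        self.start, self.end = start, end


# Flat attributes (plain numbers): the dictionaries compare equal.
flat_equal = Point(0, 0).__dict__ == Point(0, 0).__dict__

# Nested attributes: each dictionary value is a Point, and two distinct
# Point objects are never equal by default, so the dictionaries differ.
seg_a = Segment(Point(0, 0), Point(1, 1))
seg_b = Segment(Point(0, 0), Point(1, 1))
nested_equal = seg_a.__dict__ == seg_b.__dict__

print(flat_equal, nested_equal)  # True False
```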
The "correct" solution in such cases is to define the __eq__ method in the class, which determines how objects interact with the == operator. The method takes two arguments, self and other, and returns True if the objects are equal and False if they are not. If objects of these types cannot be compared with each other at all, return NotImplemented: on seeing it, the interpreter will try the __eq__ method of the second operand, and if that also returns NotImplemented, the result falls back to an identity check, which gives False for two distinct objects.
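The fallback sequence can be observed directly. This is a small sketch, assuming Python 3 and two unrelated hypothetical classes, Left and Right, whose __eq__ methods only record that they were called and then decline the comparison.

```python
calls = []  # records the order in which the __eq__ methods run


class Left(object):
    def __eq__(self, other):
        calls.append("Left.__eq__")
        return NotImplemented  # "I don't know how to compare with this"


class Right(object):
    def __eq__(self, other):
        calls.append("Right.__eq__")
        return NotImplemented


# Both methods decline, so Python falls back to an identity check.
result = Left() == Right()

print(result)  # False
print(calls)   # ['Left.__eq__', 'Right.__eq__']
```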
Let's redefine our class so that __eq__ compares the coordinates of the rectangles pairwise:
class Rectangle(object):
    ...

    def __eq__(self, other):
        # compare two rectangles
        if isinstance(other, Rectangle):
            return (self.min_x == other.min_x and
                    self.min_y == other.min_y and
                    self.max_x == other.max_x and
                    self.max_y == other.max_y)
        # otherwise return NotImplemented
        return NotImplemented

foo = Rectangle(0, 0, 42, 42)
bar = Rectangle(4, 8, 15, 16)
print(foo == bar) # still False
baz = Rectangle(0, 0, 42, 42)
print(foo == baz) # True
print(foo == 10) # False
The __eq__ method adds a lot of flexibility to our class: objects can have attributes that take no part in the comparison (for example, unique names), and the algorithm itself can be made more elaborate if necessary. For example, you can compare only the shape of rectangles, regardless of their position on the plane.
class Rectangle(object):
    def __init__(self, min_x, min_y, max_x, max_y, name=None):
        ...
        self.name = name

    def __eq__(self, other):
        if isinstance(other, Rectangle):
            return (self.max_x - self.min_x == other.max_x - other.min_x and
                    self.max_y - self.min_y == other.max_y - other.min_y)
        return NotImplemented

print(Rectangle(4, 8, 15, 16, 'spam') == Rectangle(14, 18, 25, 26, 'eggs')) # True
Or you can go further and compare rectangles by area alone, and not only with each other but also, say, with triangles or even with numeric values:
from math import sqrt

class Rectangle(object):
    ...

    def area(self):
        return (self.max_x - self.min_x)*(self.max_y - self.min_y)

    def __eq__(self, other):
        if isinstance(other, (Rectangle, Triangle)):
            return self.area() == other.area()
        if isinstance(other, (int, float)):
            return self.area() == other
        return NotImplemented

class Triangle(object):
    def __init__(self, a_x, a_y, b_x, b_y, c_x, c_y):
        # defined by the coordinates of its three vertices
        self.a_x, self.a_y = a_x, a_y
        self.b_x, self.b_y = b_x, b_y
        self.c_x, self.c_y = c_x, c_y
        # compute the side lengths right away
        self.ab = sqrt((a_x - b_x)**2 + (a_y - b_y)**2)
        self.bc = sqrt((b_x - c_x)**2 + (b_y - c_y)**2)
        self.ca = sqrt((c_x - a_x)**2 + (c_y - a_y)**2)

    def area(self):
        # Heron's formula; p is the semi-perimeter
        p = (self.ab + self.bc + self.ca)/2
        return sqrt(p * (p - self.ab) * (p - self.bc) * (p - self.ca))

    def __eq__(self, other):
        ...

print(Rectangle(57, 179, 59, 185) == Rectangle(-100, 4, -96, 7)) # True
print(Rectangle(1, 4, 9, 7) == Triangle(0, 0, 0, 6, 8, 0)) # True
Strictly speaking, we did not have to define __eq__ for triangles at all: for a comparison between classes, it is enough that the method is present in at least one of them. But in general it is still worth defining it in both. First, it makes it possible to compare triangles with each other when needed. Second, the interpreter does not always try the left operand's __eq__ first (for example, the right operand's method takes priority when its class is a subclass of the left operand's class), and having both methods will speed up processing slightly.
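Here is a minimal sketch of a one-sided __eq__, assuming Python 3. Square and Strip are hypothetical stand-ins for the rectangle and triangle classes; only Square defines __eq__.

```python
class Square(object):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side

    def __eq__(self, other):
        if isinstance(other, (Square, Strip)):
            return self.area() == other.area()
        return NotImplemented


class Strip(object):
    """Deliberately has no __eq__ of its own."""
    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height


# Square.__eq__ is used directly.
print(Square(2) == Strip(1, 4))    # True
# The default Strip comparison declines, so Square.__eq__ is tried instead.
print(Strip(1, 4) == Square(2))    # True
# Two strips fall back to identity, because Strip has no __eq__.
print(Strip(1, 4) == Strip(1, 4))  # False
```

The last line shows why defining the method in both classes is still worthwhile: without it, two objects of the second class cannot be compared by area.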
That's about it. For more information about __eq__, its inverse __ne__, and other special class methods, see the documentation.
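As a brief illustration of how __ne__ relates to __eq__: in Python 3, a class that defines only __eq__ gets a != operator that inverts the result, whereas Python 2 needs an explicit __ne__. The Version class below is a hypothetical example that assumes Python 3.

```python
class Version(object):
    """Hypothetical class that defines __eq__ but not __ne__."""
    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def __eq__(self, other):
        if isinstance(other, Version):
            return (self.major, self.minor) == (other.major, other.minor)
        return NotImplemented


# In Python 3 the default __ne__ calls __eq__ and inverts the result.
print(Version(1, 2) != Version(1, 2))  # False
print(Version(1, 2) != Version(1, 3))  # True
```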
You have two different objects with the same content. If you want to compare the contents of the objects rather than the references to them, you should at least define the __eq__ method.
total_ordering and magic methods.
Any operation applied to an object "requires" the object to have a method corresponding to that operation. Such a method is called special or magic. Some magic methods are defined for objects by default; the rest you must define yourself.
For example, for an object to behave like a function, i.e. to be callable with (), as in myObj(), myObj must have a __call__ method.
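A minimal sketch of a callable object follows; the Scaler class and its factor parameter are hypothetical and used only for illustration.

```python
class Scaler(object):
    """Hypothetical class whose instances can be called like functions."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, value):
        # Runs when the instance is called: double(21)
        return value * self.factor


double = Scaler(2)
print(double(21))        # 42
print(callable(double))  # True
```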
To compare objects with the > operator, as in myObj1 > myObj2, there must be a __gt__ method.
And so with any operation applicable to the object.
For comparison operators, to avoid defining every possible comparison operation, it is convenient to use total_ordering, as in the example.
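One possible sketch with functools.total_ordering is shown below. Ordering rectangles by area is an assumption made for illustration, not something the question requires. The decorator needs __eq__ plus one ordering method (here __lt__) and generates the remaining comparisons.

```python
from functools import total_ordering


@total_ordering
class Rectangle(object):
    def __init__(self, min_x, min_y, max_x, max_y):
        self.min_x, self.min_y = min_x, min_y
        self.max_x, self.max_y = max_x, max_y

    def area(self):
        return (self.max_x - self.min_x) * (self.max_y - self.min_y)

    def __eq__(self, other):
        # Assumption for this sketch: rectangles are equal if areas match.
        if isinstance(other, Rectangle):
            return self.area() == other.area()
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, Rectangle):
            return self.area() < other.area()
        return NotImplemented


small = Rectangle(0, 0, 1, 1)  # area 1
big = Rectangle(0, 0, 3, 3)    # area 9

print(small < big)   # True  (defined explicitly)
print(small <= big)  # True  (generated by total_ordering)
print(big > small)   # True  (generated by total_ordering)
print(big >= small)  # True  (generated by total_ordering)
```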