What must I do to use objects of a custom type as keys in a Python dictionary (where I don’t want the “object id” to act as the key), e.g.
class MyThing:
    def __init__(self, name, location, length):
        self.name = name
        self.location = location
        self.length = length
I’d want to use MyThing instances as keys that are considered the same if name and location are the same.
From C#/Java I’m used to having to override and provide an equals and hashcode method, and promise not to mutate anything the hashcode depends on.
What must I do in Python to accomplish this? Should I even?
(In a simple case like this one, it might be better to just use a (name, location) tuple as the key, but assume I want the key to be an object.)
Answers:
Method 1
You need to add two methods, __hash__ and __eq__:
class MyThing:
    def __init__(self, name, location, length):
        self.name = name
        self.location = location
        self.length = length

    def __hash__(self):
        return hash((self.name, self.location))

    def __eq__(self, other):
        return (self.name, self.location) == (other.name, other.location)

    def __ne__(self, other):
        # Needed on Python 2 so that x == y and x != y are never both True;
        # Python 3 derives != from __eq__ automatically.
        return not (self == other)
The Python dict documentation defines these requirements on key objects, i.e. they must be hashable.
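As a quick check of how this behaves, here is a minimal sketch that uses such objects as dictionary keys. The class is a condensed restatement of the one above; the isinstance guard and the sample values are assumptions added for illustration.

```python
class MyThing:
    def __init__(self, name, location, length):
        self.name = name
        self.location = location
        self.length = length

    def __hash__(self):
        # Hash only the fields that define the key.
        return hash((self.name, self.location))

    def __eq__(self, other):
        # Assumption: objects of other types are never equal to a MyThing.
        if not isinstance(other, MyThing):
            return NotImplemented
        return (self.name, self.location) == (other.name, other.location)


inventory = {MyThing("rope", "shed", 10): "first"}

# Same name and location, different length: treated as the same key,
# so this assignment overwrites the existing entry.
inventory[MyThing("rope", "shed", 25)] = "second"

print(len(inventory))                         # 1
print(inventory[MyThing("rope", "shed", 0)])  # second
```

Because length is not part of the hash or the comparison, it can change freely; name and location should be left alone once the object is in a dict.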
Method 2
An alternative in Python 2.6 or above is to use collections.namedtuple() — it saves you writing any special methods:
from collections import namedtuple
MyThingBase = namedtuple("MyThingBase", ["name", "location"])
class MyThing(MyThingBase):
    def __new__(cls, name, location, length):
        obj = MyThingBase.__new__(cls, name, location)
        obj.length = length
        return obj
a = MyThing("a", "here", 10)
b = MyThing("a", "here", 20)
c = MyThing("c", "there", 10)
a == b
# True
hash(a) == hash(b)
# True
a == c
# False
Method 3
You override __hash__ if you want special hash semantics, and __cmp__ (Python 2 only) or __eq__ in order to make your class usable as a key. Objects that compare equal need to have the same hash value.
Python expects __hash__ to return an integer, returning Banana() is not recommended 🙂
User-defined classes have a default __hash__ that is derived from id(self), as you noted.
Here are some extra tips from the documentation:
Classes which inherit a __hash__() method from a parent class but change the meaning of __cmp__() or __eq__() such that the hash value returned is no longer appropriate (e.g. by switching to a value-based concept of equality instead of the default identity based equality) can explicitly flag themselves as being unhashable by setting __hash__ = None in the class definition. Doing so means that not only will instances of the class raise an appropriate TypeError when a program attempts to retrieve their hash value, but they will also be correctly identified as unhashable when checking isinstance(obj, collections.Hashable) (unlike classes which define their own __hash__() to explicitly raise TypeError).
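To illustrate the quoted passage, here is a small sketch with two hypothetical classes, Point and Tag, which are not taken from the original answer. It assumes Python 3, where the abstract base class lives at collections.abc.Hashable and where a class that defines __eq__ without __hash__ gets __hash__ set to None implicitly.

```python
from collections.abc import Hashable


class Point:
    """Hypothetical class: value-based equality, explicitly unhashable."""
    __hash__ = None

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


class Tag:
    """Hypothetical class: defines __eq__ only, so Python 3 sets __hash__ to None."""
    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        if not isinstance(other, Tag):
            return NotImplemented
        return self.label == other.label


print(isinstance(Point(1, 2), Hashable))  # False
print(Tag.__hash__ is None)               # True

try:
    {Point(1, 2): "somewhere"}
except TypeError as exc:
    # Using an unhashable object as a dict key fails immediately.
    print("TypeError:", exc)
```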
Method 4
I noticed that in Python 3.8.8 (and probably earlier) you do not need to declare __eq__() and __hash__() explicitly just to use your own class as a key in a dict.
class Apple:
    def __init__(self, weight):
        self.weight = weight

    def __repr__(self):
        return f'Apple({self.weight})'
apple_a = Apple(1)
apple_b = Apple(1)
apple_c = Apple(2)
apple_dictionary = {apple_a : 3, apple_b : 4, apple_c : 5}
print(apple_dictionary[apple_a]) # 3
print(apple_dictionary) # {Apple(1): 3, Apple(1): 4, Apple(2): 5}
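Python does manage this on its own, but only through the default identity-based __eq__ and __hash__ inherited from object, so apple_a and apple_b remain separate keys even though their weights are equal.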
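A short sketch, reusing a trimmed version of the Apple class from this answer, shows what that default means in practice: a lookup succeeds only with the very same object. The variable names are illustrative.

```python
class Apple:
    def __init__(self, weight):
        self.weight = weight


apple_a = Apple(1)
prices = {apple_a: 3}

print(apple_a in prices)    # True: the same object
print(Apple(1) in prices)   # False: equal weight, but a different object
print(apple_a == Apple(1))  # False: the default __eq__ compares identity
```

If the goal is for two apples of equal weight to count as the same key, __eq__ and __hash__ still have to be defined.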
Method 5
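Since other people may end up here as I did, the current answer is to use dataclasses, available in Python 3.7 and later. A dataclass generates __eq__ for you, and it also generates __hash__ when declared with frozen=True (or unsafe_hash=True); a plain @dataclass with the default eq=True is left unhashable.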
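A minimal sketch of the dataclass approach, assuming Python 3.7 or later. The frozen=True argument is what makes the class hashable, and marking length with field(compare=False) is an assumption made here to match the question, where only name and location define the key.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MyThing:
    name: str
    location: str
    # Excluded from __eq__ and, by default, from __hash__ as well.
    length: int = field(compare=False)


a = MyThing("a", "here", 10)
b = MyThing("a", "here", 20)

print(a == b)              # True
print(hash(a) == hash(b))  # True
print({a: "x"}[b])         # x
```

Freezing the class also enforces the promise mentioned in the question: assigning to a field after construction raises an error, so the hash cannot drift while the object sits in a dict.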
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0