I understand the difference between copy vs. deepcopy in the copy module. I’ve used copy.copy and copy.deepcopy before successfully, but this is the first time I’ve actually gone about overloading the __copy__ and __deepcopy__ methods. I’ve already Googled around and looked through the built-in Python modules to look for instances of the __copy__ and __deepcopy__ functions (e.g. sets.py, decimal.py, and fractions.py), but I’m still not 100% sure I’ve got it right.
Here’s my scenario:
I have a configuration object. Initially, I’m going to instantiate one configuration object with a default set of values. This configuration will be handed off to multiple other objects (to ensure all objects start with the same configuration). However, once user interaction starts, each object needs to tweak its configurations independently without affecting each other’s configurations (which says to me I’ll need to make deepcopys of my initial configuration to hand around).
Here’s a sample object:
class ChartConfig(object):
    def __init__(self):
        # Drawing properties (Booleans/strings)
        self.antialiased = None
        self.plot_style = None
        self.plot_title = None
        self.autoscale = None

        # X axis properties (strings/ints)
        self.xaxis_title = None
        self.xaxis_tick_rotation = None
        self.xaxis_tick_align = None

        # Y axis properties (strings/ints)
        self.yaxis_title = None
        self.yaxis_tick_rotation = None
        self.yaxis_tick_align = None

        # A list of non-primitive objects
        self.trace_configs = []

    def __copy__(self):
        pass

    def __deepcopy__(self, memo):
        pass
What is the right way to implement the copy and deepcopy methods on this object to ensure copy.copy and copy.deepcopy give me the proper behavior?
Answers:
Method 1
Putting together Alex Martelli’s answer and Rob Young’s comment you get the following code:
from copy import copy, deepcopy

class A(object):
    def __init__(self):
        print('init')
        self.v = 10
        self.z = [2, 3, 4]

    def __copy__(self):
        # Create the new instance without calling __init__, then share attributes
        cls = self.__class__
        result = cls.__new__(cls)
        result.__dict__.update(self.__dict__)
        return result

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        # Register the copy before recursing so self-references resolve to it
        memo[id(self)] = result
        for k, v in self.__dict__.items():
            setattr(result, k, deepcopy(v, memo))
        return result

a = A()
a.v = 11
b1, b2 = copy(a), deepcopy(a)
a.v = 12
a.z.append(5)
print(b1.v, b1.z)
print(b2.v, b2.z)
prints
init
11 [2, 3, 4, 5]
11 [2, 3, 4]
Here, __deepcopy__ registers the new object in the memo dict before copying its attributes. This avoids redundant copying (and infinite recursion) when the object is referenced from one of its own members.
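To see why the memo entry matters, here is a minimal sketch with an object that refers back to itself. The Node class and its attribute names are hypothetical, made up for illustration only; the __deepcopy__ body is the same pattern as above.

```python
from copy import deepcopy

class Node(object):
    """Hypothetical class whose attributes can point back at an instance."""
    def __init__(self, name):
        self.name = name
        self.children = []
        self.owner = None

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        # Register the copy first so any reference back to self reuses it
        memo[id(self)] = result
        for k, v in self.__dict__.items():
            setattr(result, k, deepcopy(v, memo))
        return result

root = Node("root")
root.owner = root            # the object references itself
child = Node("child")
child.owner = root           # and is referenced from a member
root.children.append(child)

clone = deepcopy(root)
print(clone.owner is clone)              # True
print(clone.children[0].owner is clone)  # True
print(clone.children[0] is child)        # False
```

Without the memo[id(self)] = result line, copying root.owner would call __deepcopy__ on root again and recurse without end.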
Method 2
The recommendations for customizing are at the very end of the docs page:
Classes can use the same interfaces to control copying that they use to control pickling. See the description of module pickle for information on these methods. The copy module does not use the copy_reg registration module.
In order for a class to define its own copy implementation, it can define special methods __copy__() and __deepcopy__(). The former is called to implement the shallow copy operation; no additional arguments are passed. The latter is called to implement the deep copy operation; it is passed one argument, the memo dictionary. If the __deepcopy__() implementation needs to make a deep copy of a component, it should call the deepcopy() function with the component as first argument and the memo dictionary as second argument.
Since you appear not to care about pickling customization, defining __copy__ and __deepcopy__ definitely seems like the right way to go for you.
Specifically, __copy__ (the shallow copy) is pretty easy in your case…:
def __copy__(self):
    newone = type(self)()
    newone.__dict__.update(self.__dict__)
    return newone
__deepcopy__ would be similar (accepting a memo arg too), but before the return it would have to set newone.foo = deepcopy(self.foo, memo) for any attribute self.foo that needs deep copying (essentially attributes that are containers: lists, dicts, non-primitive objects which hold other stuff through their __dict__s).
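Applied to the question's class, that description might look roughly like the sketch below. ChartConfig is trimmed to two attributes for brevity, TraceConfig is a hypothetical stand-in for the "non-primitive objects" in trace_configs, and registering the copy in memo is borrowed from Method 1 rather than from this answer.

```python
from copy import copy, deepcopy

class TraceConfig(object):
    """Hypothetical stand-in for the question's non-primitive trace objects."""
    def __init__(self, color):
        self.color = color

class ChartConfig(object):
    """Trimmed version of the question's class (most attributes omitted)."""
    def __init__(self):
        self.plot_title = None
        self.trace_configs = []

    def __copy__(self):
        newone = type(self)()
        newone.__dict__.update(self.__dict__)
        return newone

    def __deepcopy__(self, memo):
        newone = type(self)()
        memo[id(self)] = newone  # assumption: memo registration as in Method 1
        newone.__dict__.update(self.__dict__)
        # Only the container attribute needs a real deep copy
        newone.trace_configs = deepcopy(self.trace_configs, memo)
        return newone

default = ChartConfig()
default.plot_title = "Default"
default.trace_configs.append(TraceConfig("red"))

shallow = copy(default)
deep = deepcopy(default)

deep.trace_configs[0].color = "blue"               # does not touch default
shallow.trace_configs.append(TraceConfig("green")) # shared list, so default sees it

print(default.trace_configs[0].color)  # red
print(len(default.trace_configs))      # 2
print(len(deep.trace_configs))         # 1
```

Note that type(self)() only works here because __init__ takes no arguments; a class with required constructor arguments would need the cls.__new__(cls) approach from Method 1.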
Method 3
Following Peter’s excellent answer, to implement a custom deepcopy, with minimal alteration to the default implementation (e.g. just modifying a field like I needed) :
from copy import deepcopy

class Foo(object):
    def __deepcopy__(self, memo):
        # Temporarily hide __deepcopy__ so deepcopy() falls back to its default behaviour
        deepcopy_method = self.__deepcopy__
        self.__deepcopy__ = None
        cp = deepcopy(self, memo)
        self.__deepcopy__ = deepcopy_method
        cp.__deepcopy__ = deepcopy_method
        # custom treatments
        # for instance: cp.id = None
        return cp
Method 4
It’s not clear from your problem why you need to override these methods, since you don’t want to do any customization to the copying methods.
Anyhow, if you do want to customize the deep copy (e.g. by sharing some attributes and copying others), here is a solution:
from copy import deepcopy

def deepcopy_with_sharing(obj, shared_attribute_names, memo=None):
    '''
    Deepcopy an object, except for a given list of attributes, which should
    be shared between the original object and its copy.

    obj is some object
    shared_attribute_names: A list of strings identifying the attributes that
        should be shared between the original and its copy.
    memo is the dictionary passed into __deepcopy__. Ignore this argument if
        not calling from within __deepcopy__.
    '''
    assert isinstance(shared_attribute_names, (list, tuple))
    shared_attributes = {k: getattr(obj, k) for k in shared_attribute_names}

    if hasattr(obj, '__deepcopy__'):
        # Do hack to prevent infinite recursion in call to deepcopy
        deepcopy_method = obj.__deepcopy__
        obj.__deepcopy__ = None

    for attr in shared_attribute_names:
        del obj.__dict__[attr]

    # Pass memo through so objects already copied by the caller are reused
    clone = deepcopy(obj, memo)

    for attr, val in shared_attributes.items():
        setattr(obj, attr, val)
        setattr(clone, attr, val)

    if hasattr(obj, '__deepcopy__'):
        # Undo hack
        obj.__deepcopy__ = deepcopy_method
        del clone.__deepcopy__

    return clone


class A(object):
    def __init__(self):
        self.copy_me = []
        self.share_me = []

    def __deepcopy__(self, memo):
        return deepcopy_with_sharing(self, shared_attribute_names=['share_me'], memo=memo)


a = A()
b = deepcopy(a)
assert a.copy_me is not b.copy_me
assert a.share_me is b.share_me

c = deepcopy(b)
assert c.copy_me is not b.copy_me
assert c.share_me is b.share_me
Method 5
I might be a bit off on the specifics, but here goes.
From the copy docs:
- A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
- A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
In other words: copy() will copy only the top element and leave the rest as pointers into the original structure. deepcopy() will recursively copy over everything.
That is, deepcopy() is what you need.
If you need to do something really specific, you can override __copy__() or __deepcopy__(), as described in the manual. Personally, I’d probably implement a plain function (e.g. config.copy_config() or such) to make it plain that it isn’t Python standard behaviour.
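A small sketch of the difference, using a plain nested dict rather than the question's class. The copy_config helper is hypothetical; it only illustrates the "plain function" idea and is not part of any library.

```python
import copy

def copy_config(config):
    """Hypothetical helper: an explicitly named deep copy of a config."""
    return copy.deepcopy(config)

original = {"title": "chart", "traces": [[1, 2], [3, 4]]}

shallow = copy.copy(original)
deep = copy_config(original)

original["traces"][0].append(99)   # mutate a nested object
original["title"] = "changed"      # rebind a top-level key

print(shallow["title"])      # chart (the top-level dict itself was copied)
print(shallow["traces"][0])  # [1, 2, 99] (nested lists are shared)
print(deep["traces"][0])     # [1, 2] (nested lists were copied too)
```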
Method 6
The copy module eventually uses the __getstate__()/__setstate__() pickling protocol, so these are also valid targets to override.
The default implementation just returns and sets the __dict__ of the instance, so you don’t have to call super() and worry about Eino Gourdin’s clever trick, above.
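As a rough sketch of that approach: the Session class and its cache attribute below are hypothetical, chosen to show one attribute being left out of the copy. Keep in mind that these hooks also change how the object is pickled.

```python
import copy

class Session(object):
    """Hypothetical class that should not carry its cache into copies."""
    def __init__(self, name):
        self.name = name
        self.cache = {"expensive": 42}

    def __getstate__(self):
        # Hand the copy/pickle machinery everything except the cache
        state = self.__dict__.copy()
        state.pop("cache", None)
        return state

    def __setstate__(self, state):
        # Rebuild the new object from the saved state, with a fresh cache
        self.__dict__.update(state)
        self.cache = {}

original = Session("a")
clone = copy.deepcopy(original)

print(clone.name)      # a
print(clone.cache)     # {}
print(original.cache)  # {'expensive': 42}
```

copy.copy(original) goes through the same two hooks; the only difference is that the state is not deep-copied before being handed to __setstate__.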
Method 7
Building on Antony Hatchkins’ clean answer, here’s my version where the class in question derives from another custom class (s.t. we need to call super):
import copy

class Foo(FooBase):  # FooBase is the custom base class, defined elsewhere
    def __init__(self, param1, param2):
        self._base_params = [param1, param2]
        super(Foo, self).__init__(*self._base_params)

    def __copy__(self):
        cls = self.__class__
        result = cls.__new__(cls)
        result.__dict__.update(self.__dict__)
        super(Foo, result).__init__(*self._base_params)
        return result

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result
        for k, v in self.__dict__.items():
            setattr(result, k, copy.deepcopy(v, memo))
        super(Foo, result).__init__(*self._base_params)
        return result
Method 8
Peter’s and Eino Gourdin’s answers are clever and useful, but they have a very subtle bug!
Python methods are bound to their object. When you do cp.__deepcopy__ = deepcopy_method, you are actually giving the object cp a reference to __deepcopy__ on the original object. Any calls to cp.__deepcopy__ will return a copy of the original!
If you deepcopy your object and then deepcopy that copy, the output is NOT a copy of the copy!
Here’s a minimal example of the behavior, along with my fixed implementation where you copy the __deepcopy__ implementation and then bind it to the new object:
from copy import deepcopy
import types

class Good:
    def __init__(self):
        self.i = 0

    def __deepcopy__(self, memo):
        deepcopy_method = self.__deepcopy__
        self.__deepcopy__ = None
        cp = deepcopy(self, memo)
        self.__deepcopy__ = deepcopy_method

        # Copy the function object
        func = types.FunctionType(
            deepcopy_method.__code__,
            deepcopy_method.__globals__,
            deepcopy_method.__name__,
            deepcopy_method.__defaults__,
            deepcopy_method.__closure__,
        )

        # Bind to cp and set
        bound_method = func.__get__(cp, cp.__class__)
        cp.__deepcopy__ = bound_method

        return cp


class Bad:
    def __init__(self):
        self.i = 0

    def __deepcopy__(self, memo):
        deepcopy_method = self.__deepcopy__
        self.__deepcopy__ = None
        cp = deepcopy(self, memo)
        self.__deepcopy__ = deepcopy_method
        cp.__deepcopy__ = deepcopy_method
        return cp


x = Bad()
copy = deepcopy(x)
copy.i = 1
copy_of_copy = deepcopy(copy)
print(copy_of_copy.i)  # 0

x = Good()
copy = deepcopy(x)
copy.i = 1
copy_of_copy = deepcopy(copy)
print(copy_of_copy.i)  # 1
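The binding behaviour behind the bug can be seen in isolation with a much smaller sketch. The Widget class is hypothetical and has nothing to do with copying; it only shows that a bound method keeps pointing at the object it was taken from.

```python
class Widget(object):
    """Hypothetical class used only to illustrate method binding."""
    def __init__(self, name):
        self.name = name

    def who(self):
        return self.name

a = Widget("a")
b = Widget("b")

# Storing a's bound method on b does not rebind it to b
b.who = a.who

print(b.who())              # a
print(b.who.__self__ is a)  # True
```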
Method 9
I came here for performance reasons. Using the default copy.deepcopy() function was slowing down my code by up to 30 times.
Using the answer by @Antony Hatchkins as a starting point, I realized that copy.deepcopy() is really slow for e.g. lists. I replaced the setattr loop with simple [:] slicing to copy whole lists. Note that slicing makes a shallow copy, so this is only equivalent to a deep copy when the list elements are immutable (numbers, strings and the like). For anyone concerned with performance, it is worthwhile doing timeit.timeit() comparisons and replacing the calls to copy.deepcopy() with faster alternatives.
import timeit

setup = 'import copy; l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]'
timeit.timeit(setup=setup, stmt='m=l[:]')
timeit.timeit(setup=setup, stmt='m=l.copy()')
timeit.timeit(setup=setup, stmt='m=copy.deepcopy(l)')
will give these results:
0.11505379999289289
0.09126630000537261
6.423627900003339
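A sketch of what replacing the setattr loop with slicing could look like. The Samples class is hypothetical, and the shortcut assumes the list holds only immutable values; the last few lines show what goes wrong when that assumption does not hold.

```python
import copy

class Samples(object):
    """Hypothetical container whose list holds only immutable numbers."""
    def __init__(self, values):
        self.label = "samples"
        self.values = list(values)

    def __deepcopy__(self, memo):
        cls = self.__class__
        result = cls.__new__(cls)
        memo[id(self)] = result
        result.label = self.label
        # Slicing is a shallow copy; safe here only because the elements are ints
        result.values = self.values[:]
        return result

a = Samples([1, 2, 3])
b = copy.deepcopy(a)
b.values.append(4)
print(a.values)  # [1, 2, 3]
print(b.values)  # [1, 2, 3, 4]

# Contrast: slicing a nested list still shares the inner lists
nested = [[1], [2]]
sliced = nested[:]
sliced[0].append(99)
print(nested[0])  # [1, 99]
```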
Method 10
Similar to Zach Price’s thoughts, there is a simpler way to achieve that goal: unbind the original __deepcopy__ method, then bind it to cp.
from copy import deepcopy
import types

class Good:
    def __init__(self):
        self.i = 0

    def __deepcopy__(self, memo):
        deepcopy_method = self.__deepcopy__
        self.__deepcopy__ = None
        cp = deepcopy(self, memo)
        self.__deepcopy__ = deepcopy_method
        # Bind to cp by types.MethodType
        cp.__deepcopy__ = types.MethodType(deepcopy_method.__func__, cp)
        return cp
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.