How do I serialize a Python dictionary into a string, and then back to a dictionary? The dictionary will have lists and other dictionaries inside it.
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
It depends on what you’re wanting to use it for. If you’re just trying to save it, you should use pickle (or, if you’re using CPython 2.x, cPickle, which is faster).
>>> import pickle
>>> pickle.dumps({'foo': 'bar'})
b'x80x03}qx00Xx03x00x00x00fooqx01Xx03x00x00x00barqx02s.'
>>> pickle.loads(_)
{'foo': 'bar'}
If you want it to be readable, you could use json:
>>> import json
>>> json.dumps({'foo': 'bar'})
'{"foo": "bar"}'
>>> json.loads(_)
{'foo': 'bar'}
json is, however, very limited in what it will support, while pickle can be used for arbitrary objects (if it doesn’t work automatically, the class can define __getstate__ to specify precisely how it should be pickled).
>>> pickle.dumps(object()) b'x80x03cbuiltinsnobjectnqx00)x81qx01.' >>> json.dumps(object()) Traceback (most recent call last): ... TypeError: <object object at 0x7fa0348230c0> is not JSON serializable
Method 2
Pickle is great but I think it’s worth mentioning literal_eval from the ast module for an even lighter weight solution if you’re only serializing basic python types. It’s basically a “safe” version of the notorious eval function that only allows evaluation of basic python types as opposed to any valid python code.
Example:
>>> d = {}
>>> d[0] = range(10)
>>> d['1'] = {}
>>> d['1'][0] = range(10)
>>> d['1'][1] = 'hello'
>>> data_string = str(d)
>>> print data_string
{0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], '1': {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], 1: 'hello'}}
>>> from ast import literal_eval
>>> d == literal_eval(data_string)
True
One benefit is that the serialized data is just python code, so it’s very human friendly. Compare it to what you would get with pickle.dumps:
>>> import pickle >>> print pickle.dumps(d) (dp0 I0 (lp1 I0 aI1 aI2 aI3 aI4 aI5 aI6 aI7 aI8 aI9 asS'1' p2 (dp3 I0 (lp4 I0 aI1 aI2 aI3 aI4 aI5 aI6 aI7 aI8 aI9 asI1 S'hello' p5 ss.
The downside is that as soon as the the data includes a type that is not supported by literal_ast you’ll have to transition to something else like pickling.
Method 3
Use Python’s json module, or simplejson if you don’t have python 2.6 or higher.
Method 4
If you fully trust the string and don’t care about python injection attacks then this is very simple solution:
d = { 'method' : "eval", 'safe' : False, 'guarantees' : None }
s = str(d)
d2 = eval(s)
for k in d2:
print k+"="+d2[k]
If you’re more safety conscious then ast.literal_eval is a better bet.
Method 5
One thing json cannot do is dict indexed with numerals. The following snippet
import json
dictionary = dict({0:0, 1:5, 2:10})
serialized = json.dumps(dictionary)
unpacked = json.loads(serialized)
print(unpacked[0])
will throw
KeyError: 0
Because keys are converted to strings. cPickle preserves the numeric type and the unpacked dict can be used right away.
Method 6
pyyaml should also be mentioned here. It is both human readable and can serialize any python object.
pyyaml is hosted here:
https://pypi.org/project/PyYAML
Method 7
While not strictly serialization, json may be reasonable approach here. That will handled nested dicts and lists, and data as long as your data is “simple”: strings, and basic numeric types.
Method 8
A new alternative to JSON or YaML is NestedText. It supports strings that are nested in lists and dictionaries to any depth. It conveys nesting through the use of indenting, and so has no need for either quoting or escaping. As such, the result tends to be very readable. The result looks like YaML, but without all the special cases. It is especially appropriate for serializing code snippets. For example, here is an a single test case extracted from a much larger set that was serialized with NestedText:
base tests:
-
args: --quiet --config test7 files -N configs/subdir
expected:
> Archive: test7-dddd-dd-ddTdd:dd:dd
> «TESTS»/configs/subdir/
> «TESTS»/configs/subdir/file
Be aware, that integers, floats, and bools are converted to strings.
Method 9
If you are trying to only serialize then pprint may also be a good option. It requires the object to be serialized and a file stream.
Here’s some code:
from pprint import pprint
my_dict = {1:'a',2:'b'}
with open('test_results.txt','wb') as f:
pprint(my_dict,f)
I am not sure if we can deserialize easily. I was using json to serialize and deserialze earlier which works correctly in most cases.
f.write(json.dumps(my_dict, sort_keys = True, indent = 2, ensure_ascii=True))
However, in one particular case, there were some errors writing non-unicode data to json.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0