I have 2 CSV files: ‘Data’ and ‘Mapping’:
- ‘Mapping’ file has 4 columns: Device_Name, GDN, Device_Type, and Device_OS. All four columns are populated.
- ‘Data’ file has these same columns, with Device_Name column populated and the other three columns blank.
- I want my Python code to open both files and for each Device_Name in the Data file, map its GDN, Device_Type, and Device_OS value from the Mapping file.
I know how to use dict when only 2 columns are present (1 is needed to be mapped) but I don’t know how to accomplish this when 3 columns need to be mapped.
Following is the code using which I tried to accomplish mapping of Device_Type:
x = dict([])
with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
    file_map = csv.reader(in_file1, delimiter=',')
    for row in file_map:
        typemap = [row[...], row[2]]
        x.append(typemap)

with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
    writer = csv.writer(out_file, delimiter=',')
    for row in csv.reader(in_file2, delimiter=','):
        try:
            row[27] = x[row[...]]
        except KeyError:
            row[27] = ""
        writer.writerow(row)
It raises an AttributeError.
After some researching, I think I need to create a nested dict, but I don’t have any idea how to do this.
Answers:
Method 1
A nested dict is a dictionary within a dictionary. A very simple thing.
>>> d = {}
>>> d['dict1'] = {}
>>> d['dict1']['innerkey'] = 'value'
>>> d['dict1']['innerkey2'] = 'value2'
>>> d
{'dict1': {'innerkey': 'value', 'innerkey2': 'value2'}}
You can also use a defaultdict from the collections module to facilitate creating nested dictionaries.
>>> import collections
>>> d = collections.defaultdict(dict)
>>> d['dict1']['innerkey'] = 'value'
>>> d # currently a defaultdict type
defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
>>> dict(d) # but is exactly like a normal dictionary.
{'dict1': {'innerkey': 'value'}}
You can populate that however you want.
I would recommend in your code something like the following:
d = {}  # can use defaultdict(dict) instead

for row in file_map:
    # derive row key from something
    # when using defaultdict, we can skip the next step creating a dictionary on row_key
    d[row_key] = {}
    for idx, col in enumerate(row):
        d[row_key][idx] = col
According to your comment:
may be above code is confusing the question. My problem in nutshell: I
have 2 files a.csv b.csv, a.csv has 4 columns i j k l, b.csv also has
these columns. i is kind of key columns for these csvs’. j k l column
is empty in a.csv but populated in b.csv. I want to map values of j k
l columns using ‘i’ as key column from b.csv to a.csv file
My suggestion would be something like this (without using defaultdict):
a_file = "path/to/a.csv"
b_file = "path/to/b.csv"

# read from file a.csv
with open(a_file) as f:
    # skip headers
    next(f)
    # get first column as keys; build the set now, while the file is still open
    keys = set(line.strip().split(',')[0] for line in f)

# create empty dictionary:
d = {}

# read from file b.csv
with open(b_file) as f:
    # gather headers except first key header
    headers = next(f).strip().split(',')[1:]
    # iterate lines
    for line in f:
        # gather the columns
        cols = line.strip().split(',')
        # check to make sure this key should be mapped.
        if cols[0] not in keys:
            continue
        # add key to dict
        d[cols[0]] = dict(
            # inner keys are the header names, values are columns
            (headers[idx], v) for idx, v in enumerate(cols[1:]))
Please note though, that for parsing csv files there is a csv module.
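As a rough sketch of what a csv-module version might look like for the original Device_Name problem: the snippet below builds a nested dict keyed by Device_Name from the mapping rows, then fills the blank columns in the data rows. It assumes Python 3. The in-memory sample data (mapping_text, data_text) and the helper names build_mapping and fill_rows are made up for illustration; with real files you would pass open file objects instead.

```python
import csv
import io

# Hypothetical sample data standing in for the two CSV files.
mapping_text = """Device_Name,GDN,Device_Type,Device_OS
dev1,G1,Router,IOS
dev2,G2,Switch,NX-OS
"""
data_text = """Device_Name,GDN,Device_Type,Device_OS
dev1,,,
dev3,,,
"""

def build_mapping(fileobj, key="Device_Name"):
    """Return {key value: {other column: value}} for every row."""
    mapping = {}
    for row in csv.DictReader(fileobj):
        name = row.pop(key)
        mapping[name] = dict(row)
    return mapping

def fill_rows(fileobj, mapping, key="Device_Name"):
    """Return the data rows with blank columns filled from the mapping."""
    filled = []
    for row in csv.DictReader(fileobj):
        # Devices missing from the mapping keep their blank columns.
        row.update(mapping.get(row[key], {}))
        filled.append(dict(row))
    return filled

mapping = build_mapping(io.StringIO(mapping_text))
rows = fill_rows(io.StringIO(data_text), mapping)
print(mapping["dev1"]["Device_Type"])
print(rows)
```

Using mapping.get(..., {}) rather than indexing means an unknown device is simply left blank instead of raising KeyError.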
Method 2
UPDATE: For an arbitrary length of a nested dictionary, go to this answer.
Use defaultdict from the collections module.
High performance: it avoids a separate “if key not in dict” check before every insert, which adds up when the data set is large.
Low maintenance: the code is more readable and easier to extend.
from collections import defaultdict
target_dict = defaultdict(dict)
target_dict[key1][key2] = val
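To make the snippet above concrete, here is a minimal sketch that fills a defaultdict(dict) from a handful of rows. The sample rows and their (device, column, value) layout are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical rows: (device name, column name, value).
rows = [
    ("dev1", "GDN", "G1"),
    ("dev1", "Device_Type", "Router"),
    ("dev2", "GDN", "G2"),
]

target_dict = defaultdict(dict)
for key1, key2, val in rows:
    # The inner dict is created automatically the first time key1 is seen.
    target_dict[key1][key2] = val

print(dict(target_dict))
```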
Method 3
For arbitrary levels of nestedness:
In [1]: import collections
In [2]: def nested_dict():
   ...:     return collections.defaultdict(nested_dict)
   ...:
In [3]: a = nested_dict()
In [4]: a
Out[4]: defaultdict(<function __main__.nested_dict>, {})
In [5]: a['a']['b']['c'] = 1
In [6]: a
Out[6]:
defaultdict(<function __main__.nested_dict>,
{'a': defaultdict(<function __main__.nested_dict>,
{'b': defaultdict(<function __main__.nested_dict>,
{'c': 1})})})
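The session above can also be written as a plain script. The sketch below adds a small helper, to_plain, that converts the nested defaultdicts back into regular dicts once the structure is built; to_plain is a hypothetical name introduced here, not part of the original answer.

```python
import collections

def nested_dict():
    # Each missing key produces another nested_dict, at any depth.
    return collections.defaultdict(nested_dict)

def to_plain(d):
    """Recursively convert nested defaultdicts into regular dicts (hypothetical helper)."""
    if isinstance(d, collections.defaultdict):
        return {k: to_plain(v) for k, v in d.items()}
    return d

a = nested_dict()
a['a']['b']['c'] = 1
a['a']['x'] = 2

plain = to_plain(a)
print(plain)
```

Converting at the end means later lookups of missing keys raise KeyError again instead of silently creating entries.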
Method 4
It is important to remember when using defaultdict and similar nested dict modules such as nested_dict, that looking up a nonexistent key may inadvertently create a new key entry in the dict and cause a lot of havoc.
Here is a Python3 example with nested_dict module:
import nested_dict as nd
nest = nd.nested_dict()
nest['outer1']['inner1'] = 'v11'
nest['outer1']['inner2'] = 'v12'
print('original nested dict: \n', nest)
try:
    nest['outer1']['wrong_key1']
except KeyError as e:
    print('exception missing key', e)
print('nested dict after lookup with missing key. no exception raised:\n', nest)

# Instead, convert back to normal dict...
nest_d = nest.to_dict(nest)
try:
    print('converted to normal dict. Trying to lookup Wrong_key2')
    nest_d['outer1']['wrong_key2']
except KeyError as e:
    print('exception missing key', e)
else:
    print(' no exception raised:\n')

# ...or use dict.keys to check if key in nested dict
print('checking with dict.keys')
print(list(nest['outer1'].keys()))
if 'wrong_key3' in list(nest.keys()):
    print('found wrong_key3')
else:
    print(' did not find wrong_key3')
Output is:
original nested dict: {"outer1": {"inner2": "v12", "inner1": "v11"}}
nested dict after lookup with missing key. no exception raised:
{"outer1": {"wrong_key1": {}, "inner2": "v12", "inner1": "v11"}}
converted to normal dict. Trying to lookup Wrong_key2
exception missing key 'wrong_key2'
checking with dict.keys
['wrong_key1', 'inner2', 'inner1']
did not find wrong_key3
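The same pitfall can be reproduced with only the standard library. The sketch below uses collections.defaultdict instead of the nested_dict module and shows that a plain lookup creates the missing key, while a membership test or .get() does not. The key names are made up for illustration.

```python
from collections import defaultdict

d = defaultdict(dict)
d['outer1']['inner1'] = 'v11'

# A plain lookup on a missing key silently creates it.
_ = d['missing_outer']
created = 'missing_outer' in d

# Membership tests and .get() do not create entries.
has_other = 'other_outer' in d
value = d.get('other_outer')
still_absent = 'other_outer' not in d

print(created, has_other, value, still_absent)
```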
Method 5
pip install addict
from addict import Dict
mapping = Dict()
mapping.a.b.c.d.e = 2
print(mapping) # {'a': {'b': {'c': {'d': {'e': 2}}}}}
Method 6
If you want to create a nested dictionary given a list (arbitrary length) for a path and perform a function on an item that may exist at the end of the path, this handy little recursive function is quite helpful:
def ensure_path(data, path, default=None, default_func=lambda x: x):
    """
    Function:
    - Ensures a path exists within a nested dictionary

    Requires:
    - `data`:
        - Type: dict
        - What: A dictionary to check if the path exists
    - `path`:
        - Type: list of strs
        - What: The path to check

    Optional:
    - `default`:
        - Type: any
        - What: The default item to add to a path that does not yet exist
        - Default: None
    - `default_func`:
        - Type: function
        - What: A single input function that takes in the current path item (or default) and adjusts it
        - Default: `lambda x: x` # Returns the value in the dict or the default value if none was present
    """
    if len(path)>1:
        if path[0] not in data:
            data[path[0]]={}
        data[path[0]]=ensure_path(data=data[path[0]], path=path[1:], default=default, default_func=default_func)
    else:
        if path[0] not in data:
            data[path[0]]=default
        data[path[0]]=default_func(data[path[0]])
    return data
Example:
data={'a':{'b':1}}
ensure_path(data=data, path=['a','c'], default=[1])
print(data) #=> {'a':{'b':1, 'c':[1]}}
ensure_path(data=data, path=['a','c'], default=[1], default_func=lambda x:x+[2])
print(data) #=> {'a': {'b': 1, 'c': [1, 2]}}
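For comparison, the same "make sure the path exists" idea can be sketched iteratively with dict.setdefault. This is not the function above, just an alternative that assumes every intermediate value on the path is a dict; set_path is a hypothetical name.

```python
def set_path(data, path, value):
    """Set data[path[0]][path[1]]... = value, creating intermediate dicts."""
    node = data
    for key in path[:-1]:
        # setdefault returns the existing inner dict or inserts a new one.
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return data

data = {'a': {'b': 1}}
set_path(data, ['a', 'c', 'd'], 5)
print(data)
```

Unlike ensure_path, this variant always overwrites the final value and has no default_func hook, so it only covers the simplest case.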
Method 7
This is a nested list whose data we will add to the empty dicts:
ls = [['a','a1','a2','a3'],['b','b1','b2','b3'],['c','c1','c2','c3'], ['d','d1','d2','d3']]
This creates four empty dicts inside data_dict:
data_dict = {f'dict{i}':{} for i in range(4)}
for i in range(4):
    upd_dict = {'val' : ls[i][0], 'val1' : ls[i][1],'val2' : ls[i][2],'val3' : ls[i][3]}
    data_dict[f'dict{i}'].update(upd_dict)
print(data_dict)
The output:
{'dict0': {'val': 'a', 'val1': 'a1', 'val2': 'a2', 'val3': 'a3'},
 'dict1': {'val': 'b', 'val1': 'b1', 'val2': 'b2', 'val3': 'b3'},
 'dict2': {'val': 'c', 'val1': 'c1', 'val2': 'c2', 'val3': 'c3'},
 'dict3': {'val': 'd', 'val1': 'd1', 'val2': 'd2', 'val3': 'd3'}}
Method 8
#in jupyter
import sys
!conda install -c conda-forge --yes --prefix {sys.prefix} nested_dict
import nested_dict as nd
d = nd.nested_dict()
‘d’ can be used now to store the nested key value pairs.
Method 9
travel_log = {
    "France" : {"cities_visited" : ["paris", "lille", "dijon"], "total_visits" : 10},
    "india" : {"cities_visited" : ["Mumbai", "delhi", "surat",], "total_visits" : 12}
}
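A literal like this can then be read and updated through chained keys. Here is a small sketch; the added city, the incremented count and the "Germany" entry are made-up values for illustration.

```python
travel_log = {
    "France": {"cities_visited": ["paris", "lille", "dijon"], "total_visits": 10},
    "india": {"cities_visited": ["Mumbai", "delhi", "surat"], "total_visits": 12},
}

# Read a nested value.
french_cities = travel_log["France"]["cities_visited"]

# Update nested values in place.
travel_log["india"]["cities_visited"].append("pune")
travel_log["india"]["total_visits"] += 1

# Add a new outer key with its own inner dict.
travel_log["Germany"] = {"cities_visited": ["berlin"], "total_visits": 1}

print(travel_log["india"])
```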
Method 10
You can initialize an empty NestedDict and then assign values to new keys.
from ndicts.ndicts import NestedDict
nd = NestedDict()
nd["level1", "level2", "level3"] = 0
>>> nd
NestedDict({'level1': {'level2': {'level3': 0}}})
ndicts is on PyPI:
pip install ndicts
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.