I have a dataframe df1:
id store address
1 100 xyz
2 200 qwe
3 300 asd
4 400 zxc
5 500 bnm
I have another dataframe df2:
serialNo store_code warehouse
1 300 Land
2 500 Sea
3 100 Land
4 200 Sea
5 400 Land
I want my final dataframe to look like:
id store address warehouse
1 100 xyz Land
2 200 qwe Sea
3 300 asd Land
4 400 zxc Land
5 500 bnm Sea
i.e., map values from one dataframe onto the other, creating a new column.
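For anyone who wants to reproduce the examples, the two frames described above can be built roughly as follows. This is only a sketch: the names df1 and df2 follow the ones used in the answers, and the column dtypes are assumed to be plain integers and strings.

```python
import pandas as pd

# First dataframe: one row per store, with its address.
df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "store": [100, 200, 300, 400, 500],
    "address": ["xyz", "qwe", "asd", "zxc", "bnm"],
})

# Second dataframe: store codes (in a different order) and their warehouse.
df2 = pd.DataFrame({
    "serialNo": [1, 2, 3, 4, 5],
    "store_code": [300, 500, 100, 200, 400],
    "warehouse": ["Land", "Sea", "Land", "Sea", "Land"],
})

print(df1)
print(df2)
```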
Answers:
Method 1
df.merge
out = (df1.merge(df2, left_on='store', right_on='store_code')
          .reindex(columns=['id', 'store', 'address', 'warehouse']))
print(out)
id store address warehouse
0 1 100 xyz Land
1 2 200 qwe Sea
2 3 300 asd Land
3 4 400 zxc Land
4 5 500 bnm Sea
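Note that merge performs an inner join by default, so any row of df1 whose store has no match in df2 would be dropped. A minimal sketch of a left join that keeps such rows is shown here. The store 600 and its address are hypothetical values added purely for illustration, and validate='many_to_one' is an optional check that each store_code occurs only once in df2.

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 2, 3],
    "store": [100, 200, 600],  # 600 is a hypothetical store absent from df2
    "address": ["xyz", "qwe", "new"],
})
df2 = pd.DataFrame({
    "serialNo": [1, 2],
    "store_code": [200, 100],
    "warehouse": ["Sea", "Land"],
})

out = (
    df1.merge(
        df2[["store_code", "warehouse"]],  # bring over only the needed columns
        left_on="store",
        right_on="store_code",
        how="left",                        # keep every row of df1
        validate="many_to_one",            # raise if store_code repeats in df2
    )
    .drop(columns="store_code")            # the key is redundant after the merge
)
print(out)
```

The unmatched store ends up with a missing value (NaN) in warehouse instead of disappearing from the result.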
pd.concat + df.sort_values
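# Sort both frames on the key and reset the index so that rows line up
u = df1.sort_values('store').reset_index(drop=True)
v = df2.sort_values('store_code')[['warehouse']].reset_index(drop=True)
# Concatenate column-wise; axis must be passed by keyword in recent pandas
out = pd.concat([u, v], axis=1)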
print(out)
id store address warehouse
0 1 100 xyz Land
1 2 200 qwe Sea
2 3 300 asd Land
3 4 400 zxc Land
4 5 500 bnm Sea
The first sort call is redundant if your dataframe is already sorted on store, in which case you may remove it. This approach assumes every store appears exactly once in both dataframes.
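One thing to keep in mind is that pd.concat with axis=1 aligns rows on index labels, not on position, so both sorted frames need a fresh index whenever df1 is not already in store order. A small sketch with a deliberately shuffled df1 (the row order here is made up for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 2, 3],
    "store": [300, 100, 200],  # deliberately not sorted on store
    "address": ["asd", "xyz", "qwe"],
})
df2 = pd.DataFrame({
    "serialNo": [1, 2, 3],
    "store_code": [200, 300, 100],
    "warehouse": ["Sea", "Land", "Land"],
})

# Sort each frame on its key, then reset the index so labels are 0..n-1 in both.
u = df1.sort_values("store").reset_index(drop=True)
v = df2.sort_values("store_code")[["warehouse"]].reset_index(drop=True)

# With matching indexes, column-wise concat pairs the rows correctly.
out = pd.concat([u, v], axis=1)
print(out)
```

The result comes back ordered by store rather than in the original row order of df1, which is one reason merge or map may be preferable here.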
df.replace/df.map
s = df1.store.replace(df2.set_index('store_code')['warehouse'])
print(s)
0 Land
1 Sea
2 Land
3 Land
4 Sea
df1['warehouse'] = s
print(df1)
id store address warehouse
0 1 100 xyz Land
1 2 200 qwe Sea
2 3 300 asd Land
3 4 400 zxc Land
4 5 500 bnm Sea
Alternatively, create a mapping explicitly. This works if you want to use it later.
mapping = dict(df2[['store_code', 'warehouse']].values)
df1['warehouse'] = df1.store.map(mapping)
print(df1)
id store address warehouse
0 1 100 xyz Land
1 2 200 qwe Sea
2 3 300 asd Land
3 4 400 zxc Land
4 5 500 bnm Sea
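The two differ in how they treat a store that is missing from the mapping: map returns NaN for it, whereas replace leaves the original value untouched. A minimal sketch of handling the map case is shown here. The store 600 and the "Unknown" label are hypothetical, chosen only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 2, 3],
    "store": [100, 200, 600],  # 600 is a hypothetical store with no mapping
    "address": ["xyz", "qwe", "new"],
})
df2 = pd.DataFrame({
    "serialNo": [1, 2],
    "store_code": [200, 100],
    "warehouse": ["Sea", "Land"],
})

# Build the lookup as a plain dict: store_code -> warehouse.
mapping = dict(zip(df2["store_code"], df2["warehouse"]))

# map() yields NaN for keys missing from the dict; fill those with a placeholder.
df1["warehouse"] = df1["store"].map(mapping).fillna("Unknown")
print(df1)
```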
Method 2
df1['warehouse'] = df1['store'].map(df2.set_index('store_code')['warehouse'])
print (df1)
id store address warehouse
0 1 100 xyz Land
1 2 200 qwe Sea
2 3 300 asd Land
3 4 400 zxc Land
4 5 500 bnm Sea
df1 = df1.join(df2.set_index('store_code'), on='store').drop(columns='serialNo')
print(df1)
id store address warehouse
0 1 100 xyz Land
1 2 200 qwe Sea
2 3 300 asd Land
3 4 400 zxc Land
4 5 500 bnm Sea
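Both approaches assume that store_code is unique in df2. If it is not, join produces duplicated rows, and map through a Series with a non-unique index typically raises an error. A sketch of one way to guard against that, assuming it is acceptable to keep the first occurrence of each code (the duplicate row below is hypothetical):

```python
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 2],
    "store": [100, 200],
    "address": ["xyz", "qwe"],
})
df2 = pd.DataFrame({
    "serialNo": [1, 2, 3],
    "store_code": [200, 100, 100],  # 100 appears twice (hypothetical duplicate)
    "warehouse": ["Sea", "Land", "Land"],
})

# Keep one row per store_code, then use it as a lookup Series.
lookup = (
    df2.drop_duplicates(subset="store_code", keep="first")
       .set_index("store_code")["warehouse"]
)

df1["warehouse"] = df1["store"].map(lookup)
print(df1)
```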
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.