I have the following 2 dataframes
Example1 sku loc flag 122 61 True 123 61 True 113 62 True 122 62 True 123 62 False 122 63 False 301 63 True Example2 sku dept 113 a 122 b 123 b 301 c
I want to perform a merge, or join opertation using Pandas (or whichever Python operator is best) to produce the below data frame.
Example3 sku loc flag dept 122 61 True b 123 61 True b 113 62 True a 122 62 True b 123 62 False b 122 63 False b 301 63 True c Both df_Example1.join(df_Example2,lsuffix='_ProdHier') df_Example1.join(df_Example2,how='outer',lsuffix='_ProdHier')
Aren’t working.
What am I doing wrong?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
Perform a left merge, this will use sku column as the column to join on:
In [26]: df.merge(df1, on='sku', how='left') Out[26]: sku loc flag dept 0 122 61 True b 1 122 62 True b 2 122 63 False b 3 123 61 True b 4 123 62 False b 5 113 62 True a 6 301 63 True c
If sku is in fact your index then do this:
In [28]:
df.merge(df1, left_index=True, right_index=True, how='left')
Out[28]:
loc flag dept
sku
113 62 True a
122 61 True b
122 62 True b
122 63 False b
123 61 True b
123 62 False b
301 63 True c
Another method is to use map, if you set sku as the index on your second df, so in effect it becomes a Series then the code simplifies to this:
In [19]: df['dept']=df.sku.map(df1.dept) df Out[19]: sku loc flag dept 0 122 61 True b 1 123 61 True b 2 113 62 True a 3 122 62 True b 4 123 62 False b 5 122 63 False b 6 301 63 True c
Method 2
A more generic application would be to use apply and lambda as follows:
dict1 = {113:'a',
122:'b',
123:'b',
301:'c'}
df = pd.DataFrame([['1', 113],
['2', 113],
['3', 301],
['4', 122],
['5', 113]], columns=['num', 'num_letter'])
Add as a new dataframe column
**df['letter'] = df['num_letter'].apply(lambda x: dict1[x])** num num_letter letter 0 1 113 a 1 2 113 a 2 3 301 c 3 4 122 b 4 5 113 a
OR replace the existing (‘num_letter’) column
**df['num_letter'] = df['num_letter'].apply(lambda x: dict1[x])** num num_letter 0 1 a 1 2 a 2 3 c 3 4 b 4 5 a
Method 3
VLookup in VBA is just like pandas.dataframe.merge
I always look for so many procedures for VBA in the past and now python dataframe saves me a ton of work, good thing is I don’t need write a vlookup method.
>>> A >>> B
lkey value rkey value
0 foo 1 0 foo 5
1 bar 2 1 bar 6
2 baz 3 2 qux 7
3 foo 4 3 bar 8
>>> A.merge(B, left_on='lkey', right_on='rkey', how='outer')
lkey value_x rkey value_y
0 foo 1 foo 5
1 foo 4 foo 5
2 bar 2 bar 6
3 bar 2 bar 8
4 baz 3 NaN NaN
5 NaN NaN qux 7
You can also try the following to do a left merge.
import pandas as pd pd.merge(left, right, left_on = 'key', right_on = 'key', how='left')
outer or left act like SQL, python’s built-in class DataFrame has the method merge taking many args, which is very detailed and handy.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0