vlookup in Pandas using join

I have the following 2 dataframes

Example1
sku loc flag  
122  61 True 
123  61 True
113  62 True 
122  62 True 
123  62 False
122  63 False
301  63 True 

Example2 
sku dept 
113 a
122 b
123 b
301 c

I want to perform a merge, or join opertation using Pandas (or whichever Python operator is best) to produce the below data frame.

Example3
sku loc flag   dept  
122  61 True   b
123  61 True   b
113  62 True   a
122  62 True   b
123  62 False  b
122  63 False  b
301  63 True   c

Both 
df_Example1.join(df_Example2,lsuffix='_ProdHier')
df_Example1.join(df_Example2,how='outer',lsuffix='_ProdHier')

Aren’t working.
What am I doing wrong?

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Perform a left merge, this will use sku column as the column to join on:

In [26]:

df.merge(df1, on='sku', how='left')
Out[26]:
   sku  loc   flag dept
0  122   61   True    b
1  122   62   True    b
2  122   63  False    b
3  123   61   True    b
4  123   62  False    b
5  113   62   True    a
6  301   63   True    c

If sku is in fact your index then do this:

In [28]:

df.merge(df1, left_index=True, right_index=True, how='left')
Out[28]:
     loc   flag dept
sku                 
113   62   True    a
122   61   True    b
122   62   True    b
122   63  False    b
123   61   True    b
123   62  False    b
301   63   True    c

Another method is to use map, if you set sku as the index on your second df, so in effect it becomes a Series then the code simplifies to this:

In [19]:

df['dept']=df.sku.map(df1.dept)
df
Out[19]:
   sku  loc   flag dept
0  122   61   True    b
1  123   61   True    b
2  113   62   True    a
3  122   62   True    b
4  123   62  False    b
5  122   63  False    b
6  301   63   True    c

Method 2

A more generic application would be to use apply and lambda as follows:

dict1 = {113:'a',
         122:'b',
         123:'b',
         301:'c'}

df = pd.DataFrame([['1', 113],
                   ['2', 113],
                   ['3', 301],
                   ['4', 122],
                   ['5', 113]], columns=['num', 'num_letter'])

Add as a new dataframe column

 **df['letter'] = df['num_letter'].apply(lambda x: dict1[x])**

  num  num_letter letter
0   1         113      a
1   2         113      a
2   3         301      c
3   4         122      b
4   5         113      a

OR replace the existing (‘num_letter’) column

 **df['num_letter'] = df['num_letter'].apply(lambda x: dict1[x])**

  num num_letter
0   1          a
1   2          a
2   3          c
3   4          b
4   5          a

Method 3

VLookup in VBA is just like pandas.dataframe.merge

I always look for so many procedures for VBA in the past and now python dataframe saves me a ton of work, good thing is I don’t need write a vlookup method.

pandas.DataFrame.merge

>>> A              >>> B
    lkey value         rkey value
0   foo  1         0   foo  5
1   bar  2         1   bar  6
2   baz  3         2   qux  7
3   foo  4         3   bar  8
>>> A.merge(B, left_on='lkey', right_on='rkey', how='outer')
   lkey  value_x  rkey  value_y
0  foo   1        foo   5
1  foo   4        foo   5
2  bar   2        bar   6
3  bar   2        bar   8
4  baz   3        NaN   NaN
5  NaN   NaN      qux   7

You can also try the following to do a left merge.

import pandas as pd
pd.merge(left, right, left_on = 'key', right_on = 'key', how='left')

outer or left act like SQL, python’s built-in class DataFrame has the method merge taking many args, which is very detailed and handy.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x