Joining pandas DataFrames by Column names

I have two DataFrames with the following column names:

frame_1:
event_id, date, time, county_ID

frame_2:
countyid, state

I would like to get a DataFrame with the following columns by joining (left) on county_ID = countyid:

joined_dataframe
event_id, date, time, county, state

I cannot figure out how to do it if the columns on which I want to join are not the index.

Answers:


Method 1

You can use the left_on and right_on options of pd.merge as follows:

pd.merge(frame_1, frame_2, left_on='county_ID', right_on='countyid')

Or equivalently with DataFrame.merge:

frame_1.merge(frame_2, left_on='county_ID', right_on='countyid')

I was not sure from the question whether you wanted to keep every row of the left-hand DataFrame, even when its key has no match in the right-hand one. The calls above use the default how='inner', which keeps only rows whose key appears in both frames. To keep all rows of frame_1, pass how='left':

pd.merge(frame_1, frame_2, how='left', left_on='county_ID', right_on='countyid')

Or

frame_1.merge(frame_2, how='left', left_on='county_ID', right_on='countyid')
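To make the difference concrete, here is a minimal sketch with made-up sample data (only the column names come from the question; the values are assumptions). Since merging on differently named keys keeps both key columns, the sketch also drops countyid and renames county_ID to county to get the columns asked for:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
frame_1 = pd.DataFrame({
    "event_id": [1, 2, 3],
    "date": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "time": ["10:00", "11:30", "09:15"],
    "county_ID": [10, 20, 99],  # 99 has no match in frame_2
})
frame_2 = pd.DataFrame({
    "countyid": [10, 20, 30],
    "state": ["CA", "NY", "TX"],
})

# Default (inner) merge: the event in county 99 is dropped.
inner = frame_1.merge(frame_2, left_on="county_ID", right_on="countyid")

# Left merge: every row of frame_1 is kept; unmatched rows get NaN for state.
joined_dataframe = (
    frame_1.merge(frame_2, how="left", left_on="county_ID", right_on="countyid")
    .drop(columns="countyid")                  # redundant copy of the key
    .rename(columns={"county_ID": "county"})   # column name used in the question
)

print(joined_dataframe)
```

The drop and rename steps are optional; they are only there to match the column list in the question.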

Method 2

You need to make countyid the index of the right frame (frame_2), then join on county_ID from the left frame:

frame_1.join(frame_2.set_index('countyid', verify_integrity=True),
             on='county_ID', how='left')

For your information, a left join in pandas can give unexpected results when the right frame has non-unique values in the joining column. See this bug.

So verify integrity before joining by passing verify_integrity=True to set_index, which raises a ValueError if the new index contains duplicates.
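As a rough illustration of that check, the sketch below uses small made-up frames (the values and the frame_2_dupes name are assumptions, and frame_1 is taken as the left frame, as in the question). It first joins against a right frame with unique keys, then shows that set_index refuses a right frame with a repeated key:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
frame_1 = pd.DataFrame({"event_id": [1, 2], "county_ID": [10, 20]})
frame_2 = pd.DataFrame({"countyid": [10, 20], "state": ["CA", "NY"]})

# Unique keys on the right: set_index succeeds and the join proceeds.
joined = frame_1.join(
    frame_2.set_index("countyid", verify_integrity=True),
    on="county_ID",
    how="left",
)
print(joined)

# Duplicate keys on the right: set_index raises before any join happens.
frame_2_dupes = pd.DataFrame({"countyid": [10, 10], "state": ["CA", "NV"]})
try:
    frame_2_dupes.set_index("countyid", verify_integrity=True)
    duplicate_detected = False
except ValueError as err:
    duplicate_detected = True
    print("Duplicate keys in right frame:", err)
```

Failing early this way means a repeated county id surfaces as an error rather than as silently duplicated event rows in the result.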


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
