This is probably easy, but I have the following data:
In data frame 1:
index  dat1
0      9
1      5
In data frame 2:
index  dat2
0      7
1      6
I want a data frame with the following form:
index  dat1  dat2
0      9     7
1      5     6
I’ve tried using the append method, but I get a cross join (i.e. cartesian product).
What’s the right way to do this?
Answers:
Method 1
It seems in general you’re just looking for a join:
> import pandas as pd
> dat1 = pd.DataFrame({'dat1': [9,5]})
> dat2 = pd.DataFrame({'dat2': [7,6]})
> dat1.join(dat2)
   dat1  dat2
0     9     7
1     5     6
Method 2
You can also use:
dat1 = pd.concat([dat1, dat2], axis=1)
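For the data in the question, the two approaches should give the same frame. Here is a minimal self-contained sketch to check that. The names `joined` and `concatenated` are only illustrative, and the check assumes both frames share the default 0..n-1 index.

```python
import pandas as pd

dat1 = pd.DataFrame({'dat1': [9, 5]})
dat2 = pd.DataFrame({'dat2': [7, 6]})

# Method 1: left join on the index.
joined = dat1.join(dat2)

# Method 2: column-wise concatenation, also aligned on the index.
concatenated = pd.concat([dat1, dat2], axis=1)

# Raises AssertionError if the two results differ.
pd.testing.assert_frame_equal(joined, concatenated)
print(concatenated)
```

Note that the snippet in this method reassigns `dat1`; binding the result to a new name keeps the original frames intact.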
Method 3
Both join() and concat() can solve the problem. One warning, however: both align rows on the index, so reset the index before calling join() or concat() if your data frames were built by selecting rows from another DataFrame.
One example below shows some interesting behavior of join and concat:
dat1 = pd.DataFrame({'dat1': range(4)})
dat2 = pd.DataFrame({'dat2': range(4,8)})
dat1.index = [1,3,5,7]
dat2.index = [2,4,6,8]
# way1 join 2 DataFrames
print(dat1.join(dat2))
# output
dat1 dat2
1 0 NaN
3 1 NaN
5 2 NaN
7 3 NaN
# way2 concat 2 DataFrames
print(pd.concat([dat1,dat2],axis=1))
#output
dat1 dat2
1 0.0 NaN
2 NaN 4.0
3 1.0 NaN
4 NaN 5.0
5 2.0 NaN
6 NaN 6.0
7 3.0 NaN
8 NaN 7.0
#reset index
dat1 = dat1.reset_index(drop=True)
dat2 = dat2.reset_index(drop=True)
# both ways now give the same result
print(dat1.join(dat2))
dat1 dat2
0 0 4
1 1 5
2 2 6
3 3 7
print(pd.concat([dat1,dat2],axis=1))
dat1 dat2
0 0 4
1 1 5
2 2 6
3 3 7
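The "selecting rows from another DataFrame" case mentioned above might look like the following sketch. The `source`, `evens`, and `odds` names and the filter conditions are made up for illustration. Filtered frames keep their original row labels, which is why the indices no longer match.

```python
import pandas as pd

source = pd.DataFrame({'value': range(8)})

# Boolean selection keeps the original row labels.
evens = source[source['value'] % 2 == 0].rename(columns={'value': 'even'})
odds = source[source['value'] % 2 == 1].rename(columns={'value': 'odd'})

# Labels are [0, 2, 4, 6] and [1, 3, 5, 7], so an index-aligned
# join or concat would not pair any rows.
print(evens.index.equals(odds.index))

# Resetting both indices pairs the rows by position instead.
side_by_side = pd.concat(
    [evens.reset_index(drop=True), odds.reset_index(drop=True)],
    axis=1,
)
print(side_by_side)
```

Comparing the indices with `Index.equals` before combining is a cheap way to spot this situation early.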
Method 4
You can assign a new column. Pandas uses the indices to align corresponding rows:
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]}, index=[0, 1, 2])
df2 = pd.DataFrame({'C': [100, 200, 300]}, index=[1, 2, 3])
df1['C'] = df2['C']
Result:
   A   B      C
0  1  10    NaN
1  2  20  100.0
2  3  30  200.0
Ignore indices:
df1['C'] = df2['C'].reset_index(drop=True)
Result:
   A   B    C
0  1  10  100
1  2  20  200
2  3  30  300
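As a variation on the same idea, `DataFrame.assign` returns a new frame instead of modifying `df1` in place, and passing a plain NumPy array (via `Series.to_numpy()`) sidesteps index alignment altogether. This is only a sketch of an alternative, not part of the original answer. The names `aligned` and `positional` are illustrative, and the positional version assumes both frames have the same number of rows.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]}, index=[0, 1, 2])
df2 = pd.DataFrame({'C': [100, 200, 300]}, index=[1, 2, 3])

# Label-aligned: row 0 has no match in df2, so it gets NaN.
aligned = df1.assign(C=df2['C'])

# Position-based: the array carries no index, so values are
# assigned in order. Lengths must match or pandas raises ValueError.
positional = df1.assign(C=df2['C'].to_numpy())

print(aligned)
print(positional)
```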
Method 5
Perhaps too simple, but anyway…
dat1 = pd.DataFrame({'dat1': [9,5]})
dat2 = pd.DataFrame({'dat2': [7,6]})
dat1['dat2'] = dat2 # Uses indices from dat1
Result:
   dat1  dat2
0     9     7
1     5     6
Method 6
Just a matter of the right google search:
data = pd.concat([dat_1, dat_2])  # DataFrame.append was removed in pandas 2.0
data = data.groupby(data.index).sum()
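One thing to watch with this approach: stacking the frames row-wise fills the missing cells with NaN, and `sum()` skips NaN, so a row that exists in only one frame ends up with 0 in the other column rather than a missing value. The sketch below uses `pd.concat` for the row-wise stacking, and the extra third row in `dat_2` is hypothetical, added only to show the effect.

```python
import pandas as pd

dat_1 = pd.DataFrame({'dat1': [9, 5]})
dat_2 = pd.DataFrame({'dat2': [7, 6, 3]})  # extra row, for illustration only

# Stack row-wise: each row has a value in one column and NaN in the other.
stacked = pd.concat([dat_1, dat_2])

# Collapse rows that share an index label; NaN is skipped by sum().
data = stacked.groupby(stacked.index).sum()
print(data)
```

The columns also come back as floats because of the intermediate NaN values, so a column-wise join or concat is usually the more direct choice.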
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.