Assuming the following DataFrame:
  key.0 key.1 key.2  topic
1   abc   def   ghi      8
2   xab   xcd   xef      9
How can I combine the values of all the key.* columns into a single column ‘key’, that’s associated with the topic value corresponding to the key.* columns? This is the result I want:
   topic  key
1      8  abc
2      8  def
3      8  ghi
4      9  xab
5      9  xcd
6      9  xef
Note that the number of key.N columns is not fixed; it depends on some external N.
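For anyone who wants to run the answers below, here is a minimal sketch that rebuilds the sample frame. The index labels 1 and 2 are read off the printed table, and the name key_cols is only an illustrative helper, not something from the original post.

```python
import pandas as pd

# Sample frame from the question; index labels 1 and 2 mirror the printed table.
df = pd.DataFrame(
    {
        "key.0": ["abc", "xab"],
        "key.1": ["def", "xcd"],
        "key.2": ["ghi", "xef"],
        "topic": [8, 9],
    },
    index=[1, 2],
)

# The number of key.N columns varies, so select them by prefix
# instead of hard-coding their names.
key_cols = [c for c in df.columns if c.startswith("key.")]
print(key_cols)
```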
Answers:
Method 1
You can melt your dataframe:
>>> keys = [c for c in df if c.startswith('key.')]
>>> pd.melt(df, id_vars='topic', value_vars=keys, value_name='key')
   topic variable  key
0      8    key.0  abc
1      9    key.0  xab
2      8    key.1  def
3      9    key.1  xcd
4      8    key.2  ghi
5      9    key.2  xef
It also gives you the source of the key.
Since pandas 0.20, melt is also available as a DataFrame method:
>>> df.melt('topic', value_name='key').drop('variable', axis=1)
   topic  key
0      8  abc
1      9  xab
2      8  def
3      9  xcd
4      8  ghi
5      9  xef
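Note that melt returns the rows grouped by source column (all of key.0 first, then key.1, and so on), whereas the question asks for them grouped by topic. One possible way to get that order, sketched here on the sample frame from the question (rebuilt so the snippet runs on its own), is a stable sort on topic after melting:

```python
import pandas as pd

# Sample frame from the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame(
    {
        "key.0": ["abc", "xab"],
        "key.1": ["def", "xcd"],
        "key.2": ["ghi", "xef"],
        "topic": [8, 9],
    },
    index=[1, 2],
)

key_cols = [c for c in df.columns if c.startswith("key.")]

result = (
    df.melt(id_vars="topic", value_vars=key_cols, value_name="key")
      # mergesort is stable, so key.0, key.1, ... keep their order within each topic
      .sort_values("topic", kind="mergesort")
      .drop("variable", axis=1)
      .reset_index(drop=True)
)
print(result)
```

If the 1-based index shown in the question matters, adding 1 to result.index afterwards should be enough.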
Method 2
After trying various approaches, I find the following fairly intuitive, provided you understand what stack does:
# keep topic as index, stack other columns 'against' it
stacked = df.set_index('topic').stack()
# set the name of the new series created
df = stacked.reset_index(name='key')
# drop the 'source' level (key.*)
df.drop('level_1', axis=1, inplace=True)
The resulting dataframe is as required:
   topic  key
0      8  abc
1      8  def
2      8  ghi
3      9  xab
4      9  xcd
5      9  xef
You may want to print intermediary results to understand the process in full. If you don’t mind having more columns than needed, the key steps are set_index('topic'), stack() and reset_index(name='key').
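As a rough illustration of those intermediate steps, the sketch below prints the stacked Series and then builds the result in one chain. It assumes the sample frame from the question and that every column other than topic is a key.N column; dropping the index level by position (level=1) is just one way to discard the source column.

```python
import pandas as pd

# Sample frame from the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame(
    {
        "key.0": ["abc", "xab"],
        "key.1": ["def", "xcd"],
        "key.2": ["ghi", "xef"],
        "topic": [8, 9],
    },
    index=[1, 2],
)

# Step 1: topic becomes the index, then stack pivots the remaining columns
# into rows. The result is a Series indexed by (topic, original column name).
stacked = df.set_index("topic").stack()
print(stacked)

# Step 2: discard the inner level (key.0, key.1, ...) and turn the Series
# back into a DataFrame whose value column is named 'key'.
result = stacked.reset_index(level=1, drop=True).reset_index(name="key")
print(result)
```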
Method 3
OK, because one of the current answers is marked as a duplicate of this question, I will answer here.
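Using wide_to_long:
# sep='.' is needed on current pandas so that the stub 'key' matches key.0, key.1, ...
pd.wide_to_long(df, ['key'], 'topic', 'age', sep='.').reset_index().drop('age', axis=1)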
Out[123]:
   topic  key
0      8  abc
1      9  xab
2      8  def
3      9  xcd
4      8  ghi
5      9  xef
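A slightly expanded sketch of the same idea, assuming the sample frame from the question: wide_to_long puts the numeric suffix of each key.N column into a second index level, so sorting the index before resetting it yields the topic-grouped order the question asks for. The level name n is an arbitrary placeholder, and the topic values must be unique for wide_to_long to accept them as the id column.

```python
import pandas as pd

# Sample frame from the question, rebuilt so this snippet is self-contained.
df = pd.DataFrame(
    {
        "key.0": ["abc", "xab"],
        "key.1": ["def", "xcd"],
        "key.2": ["ghi", "xef"],
        "topic": [8, 9],
    },
    index=[1, 2],
)

# stubnames='key' with sep='.' matches key.0, key.1, ...; the numeric suffix
# ends up in the index level named 'n' (an arbitrary, illustrative name).
long_df = pd.wide_to_long(df, stubnames="key", i="topic", j="n", sep=".")

# Sort by (topic, n) so the rows are grouped by topic, then flatten the index.
result = long_df.sort_index().reset_index()
print(result)

# Drop the suffix column if the source of each key is not needed.
print(result.drop("n", axis=1))
```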
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.