I have an excel sheet that looks like so:
Column1 Column2 Column3 0 23 1 1 5 2 1 2 3 1 19 5 2 56 1 2 22 2 3 2 4 3 14 5 4 59 1 5 44 1 5 1 2 5 87 3
And I’m looking to extract that data, group it by column 1, and add it to a dictionary so it appears like this:
{0: [1],
1: [2,3,5],
2: [1,2],
3: [4,5],
4: [1],
5: [1,2,3]}
This is my code so far
excel = pandas.read_excel(r"e:test_data.xlsx", sheetname='mySheet', parse_cols'A,C')
myTable = excel.groupby("Column1").groups
print myTable
However, my output looks like this:
{0: [0L], 1: [1L, 2L, 3L], 2: [4L, 5L], 3: [6L, 7L], 4: [8L], 5: [9L, 10L, 11L]}
Thanks!
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You could groupby on Column1 and then take Column3 to apply(list) and call to_dict?
In [81]: df.groupby('Column1')['Column3'].apply(list).to_dict()
Out[81]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
Or, do
In [433]: {k: list(v) for k, v in df.groupby('Column1')['Column3']}
Out[433]: {0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
Method 2
According to the docs, the GroupBy.groups:
is a dict whose keys are the computed unique groups and corresponding
values being the axis labels belonging to each group.
If you want the values themselves, you can groupby ‘Column1’ and then call apply and pass the list method to apply to each group.
You can then convert it to a dict as desired:
In [5]:
dict(df.groupby('Column1')['Column3'].apply(list))
Out[5]:
{0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
(Note: have a look at this SO question for why the numbers are followed by L)
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0