Accessing the first element of each list in a Pandas DataFrame column

I have a Pandas DataFrame with a column containing list objects:

      A
0   [1,2]
1   [3,4]
2   [8,9] 
3   [2,6]

How can I access the first element of each list and save it into a new column of the DataFrame? To get a result like this:

      A     new_col
0   [1,2]      1
1   [3,4]      3
2   [8,9]      8
3   [2,6]      2

I know this could be done via iterating over each row, but is there any “pythonic” way?

Answers:


Method 1

As always, remember that storing non-scalar objects in frames is generally disfavoured, and should really only be used as a temporary intermediate step.

That said, you can use the .str accessor even though it’s not a column of strings:

>>> df = pd.DataFrame({"A": [[1,2],[3,4],[8,9],[2,6]]})
>>> df["new_col"] = df["A"].str[0]
>>> df
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2
>>> df["new_col"]
0    1
1    3
2    8
3    2
Name: new_col, dtype: int64
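
A small sketch of how the accessor behaves on less tidy data. The Series below is made up for illustration and is not part of the question. As far as I can tell, rows holding an empty list or a missing value come back as NaN instead of raising, which also means the result may no longer have an integer dtype:

```python
import pandas as pd

# Hypothetical data: one empty list and one missing value mixed in.
s = pd.Series([[1, 2], [], None, [8, 9]])

# Element-wise indexing through the .str accessor.
first = s.str[0]

print(first)
```

If you need a different fill value than NaN, chaining .fillna(...) onto the result should work.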

Method 2

You can use map with a lambda function:

df.loc[:, 'new_col'] = df.A.map(lambda x: x[0])

Method 3

Use apply with x[0]:

df['new_col'] = df.A.apply(lambda x: x[0])
print(df)
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2

Method 4

You can use a conditional list comprehension that takes the first value of each iterable item and uses None for anything that is not iterable. List comprehensions are very Pythonic.

df['new_col'] = [val[0] if hasattr(val, '__iter__') else None for val in df["A"]]

>>> df
        A  new_col
0  [1, 2]        1
1  [3, 4]        3
2  [8, 9]        8
3  [2, 6]        2
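
One caveat with the `__iter__` check: it still lets through values that are iterable but cannot be indexed with [0], such as an empty list or a set. A possible alternative, sketched below, is to try the lookup and fall back to None on failure. The helper name first_or_none and the sample data are assumptions for illustration only:

```python
import pandas as pd


def first_or_none(val):
    """Hypothetical helper: return val[0], or None if that lookup fails."""
    try:
        return val[0]
    except (TypeError, IndexError, KeyError):
        # TypeError: not subscriptable (None, int, set)
        # IndexError: empty sequence
        # KeyError: mapping without a key 0
        return None


# Made-up column with an empty list, a missing value and a plain scalar.
df = pd.DataFrame({"A": [[1, 2], [], None, [8, 9], 5]})
df["new_col"] = [first_or_none(val) for val in df["A"]]
print(df)
```

Note that pandas may store the None placeholders as NaN once they land in the column.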

Timings

df = pd.concat([df] * 10000)

%timeit df['new_col'] = [val[0] if hasattr(val, '__iter__') else None for val in df["A"]]
100 loops, best of 3: 13.2 ms per loop

%timeit df["new_col"] = df["A"].str[0]
100 loops, best of 3: 15.3 ms per loop

%timeit df['new_col'] = df.A.apply(lambda x: x[0])
100 loops, best of 3: 12.1 ms per loop

%timeit df.A.map(lambda x: x[0])
100 loops, best of 3: 11.1 ms per loop

Removing the safety check that ensures an iterable:

%timeit df['new_col'] = [val[0] for val in df["A"]]
100 loops, best of 3: 7.38 ms per loop
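
The numbers above come from IPython's %timeit and will differ by machine and pandas version. To rerun the comparison as a plain script, something like the following sketch with the standard timeit module should do. The candidates dict and the repeat counts are my own choices, not part of the original measurements:

```python
import timeit

import pandas as pd

df = pd.DataFrame({"A": [[1, 2], [3, 4], [8, 9], [2, 6]]})
df = pd.concat([df] * 10000)

# Each candidate only computes the new values; none assigns to df.
candidates = {
    "list comprehension": lambda: [val[0] for val in df["A"]],
    ".str[0]": lambda: df["A"].str[0],
    "apply": lambda: df["A"].apply(lambda x: x[0]),
    "map": lambda: df["A"].map(lambda x: x[0]),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats with 10 calls each, converted to ms per call.
    best = min(timeit.repeat(func, number=10, repeat=3)) / 10
    timings[name] = best * 1000
    print(f"{name:>20}: {timings[name]:.2f} ms per call")
```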

Method 5

You can use the method str.get:

df['A'].str.get(0)
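
For completeness, a short sketch that assigns the result back to the frame, using the data from the question. The extra column named last is hypothetical and only there to show that a negative position counts from the end of each list:

```python
import pandas as pd

df = pd.DataFrame({"A": [[1, 2], [3, 4], [8, 9], [2, 6]]})

df["new_col"] = df["A"].str.get(0)   # first element of each list
df["last"] = df["A"].str.get(-1)     # hypothetical extra column: last element

print(df)
```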


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
