Get last “column” after .str.split() operation on column in pandas DataFrame

I have a column in a pandas DataFrame that I would like to split on a single space. The splitting is simple enough with DataFrame.str.split(' '), but I can’t make a new column from the last entry. When I .str.split() the column I get a list of arrays and I don’t know how to manipulate this to get a new column for my DataFrame.

Here is an example. Each entry in the column contains ‘symbol data price’ and I would like to split off the price (and eventually remove the “p”… or “c” in half the cases).

import pandas as pd
temp = pd.DataFrame({'ticker' : ['spx 5/25/2001 p500', 'spx 5/25/2001 p600', 'spx 5/25/2001 p700']})
temp2 = temp.ticker.str.split(' ')

which yields

0    ['spx', '5/25/2001', 'p500']
1    ['spx', '5/25/2001', 'p600']
2    ['spx', '5/25/2001', 'p700']

But temp2[0] just gives one list entry’s array and temp2[:][-1] fails. How can I convert the last entry in each array to a new column? Thanks!

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Do this:

In [43]: temp2.str[-1]
Out[43]: 
0    p500
1    p600
2    p700
Name: ticker

So all together it would be:

>>> temp = pd.DataFrame({'ticker' : ['spx 5/25/2001 p500', 'spx 5/25/2001 p600', 'spx 5/25/2001 p700']})
>>> temp['ticker'].str.split(' ').str[-1]
0    p500
1    p600
2    p700
Name: ticker, dtype: object

Method 2

You could use the tolist method as an intermediary:

In [99]: import pandas as pd

In [100]: d1 = pd.DataFrame({'ticker' : ['spx 5/25/2001 p500', 'spx 5/25/2001 p600', 'spx 5/25/2001 p700']})

In [101]: d1.ticker.str.split().tolist()
Out[101]: 
[['spx', '5/25/2001', 'p500'],
 ['spx', '5/25/2001', 'p600'],
 ['spx', '5/25/2001', 'p700']]

From which you could make a new DataFrame:

In [102]: d2 = pd.DataFrame(d1.ticker.str.split().tolist(), 
   .....:                   columns="symbol date price".split())

In [103]: d2
Out[103]: 
  symbol       date price
0    spx  5/25/2001  p500
1    spx  5/25/2001  p600
2    spx  5/25/2001  p700

For good measure, you could fix the price:

In [104]: d2["price"] = d2["price"].str.replace("p","").astype(float)

In [105]: d2
Out[105]: 
  symbol       date  price
0    spx  5/25/2001    500
1    spx  5/25/2001    600
2    spx  5/25/2001    700

PS: but if you really just want the last column, apply would suffice:

In [113]: temp2.apply(lambda x: x[2])
Out[113]: 
0    p500
1    p600
2    p700
Name: ticker

Method 3

https://pandas.pydata.org/pandas-docs/stable/text.html

s2 = pd.Series(['a_b_c', 'c_d_e', np.nan, 'f_g_h'])
s2.str.split('_').str.get(1)

or

s2.str.split('_').str[1]

Method 4

Using Pandas 0.20.3:

In [10]: import pandas as pd
    ...: temp = pd.DataFrame({'ticker' : ['spx 5/25/2001 p500', 'spx 5/25/2001 p600', 'spx 5/25/2001 p700']})
    ...:

In [11]: temp2 = temp.ticker.str.split(' ', expand=True)  # the expand=True return a DataFrame

In [12]: temp2
Out[12]:
     0          1     2
0  spx  5/25/2001  p500
1  spx  5/25/2001  p600
2  spx  5/25/2001  p700

In [13]: temp3 = temp.join(temp2[2])

In [14]: temp3
Out[14]:
               ticker     2
0  spx 5/25/2001 p500  p500
1  spx 5/25/2001 p600  p600
2  spx 5/25/2001 p700  p700

Method 5

If you are looking for a one-liner (like I came here for), this should do nicely:

temp2 = temp.ticker.str.split(' ', expand = True)[-1]

You can also trivially modify this answer to assign this column back to the original DataFrame as follows:

temp['last_split'] = temp.ticker.str.split(' ', expand = True)[-1]

Which I imagine is a popular use case here.


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x