How can I split and parse a string in Python?

I am trying to split this string in python: 2.7.0_bf4fda703454

I want to split that string on the underscore _ so that I can use the value on the left side.

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

"2.7.0_bf4fda703454".split("_") gives a list of strings:

In [1]: "2.7.0_bf4fda703454".split("_")
Out[1]: ['2.7.0', 'bf4fda703454']

This splits the string at every underscore. If you want it to stop after the first split, use "2.7.0_bf4fda703454".split("_", 1).

If you know for a fact that the string contains an underscore, you can even unpack the LHS and RHS into separate variables:

In [8]: lhs, rhs = "2.7.0_bf4fda703454".split("_", 1)

In [9]: lhs
Out[9]: '2.7.0'

In [10]: rhs
Out[10]: 'bf4fda703454'

An alternative is to use partition(). The usage is similar to the last example, except that it returns three components instead of two. The principal advantage is that this method doesn’t fail if the string doesn’t contain the separator.

Method 2

Python string parsing walkthrough

Split a string on space, get a list, show its type, print it out:

<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="c9aca589a8b9a6a5a5a6">[email protected]</a>:~/foo$ python
>>> mystring = "What does the fox say?"

>>> mylist = mystring.split(" ")

>>> print type(mylist)
<type 'list'>

>>> print mylist
['What', 'does', 'the', 'fox', 'say?']

If you have two delimiters next to each other, empty string is assumed:

<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="46232a062736292a2a29">[email protected]</a>:~/foo$ python
>>> mystring = "its  so   fluffy   im gonna    DIE!!!"

>>> print mystring.split(" ")
['its', '', 'so', '', '', 'fluffy', '', '', 'im', 'gonna', '', '', '', 'DIE!!!']

Split a string on underscore and grab the 5th item in the list:

<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="9bfef7dbfaebf4f7f7f4">[email protected]</a>:~/foo$ python
>>> mystring = "Time_to_fire_up_Kowalski's_Nuclear_reactor."

>>> mystring.split("_")[4]
"Kowalski's"

Collapse multiple spaces into one

<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="a8cdc4e8c9d8c7c4c4c7">[email protected]</a>:~/foo$ python
>>> mystring = 'collapse    these       spaces'

>>> mycollapsedstring = ' '.join(mystring.split())

>>> print mycollapsedstring.split(' ')
['collapse', 'these', 'spaces']

When you pass no parameter to Python’s split method, the documentation states: “runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace”.

Hold onto your hats boys, parse on a regular expression:

<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="cfaaa38faebfa0a3a3a0">[email protected]</a>:~/foo$ python
>>> mystring = 'zzzzzzabczzzzzzdefzzzzzzzzzghizzzzzzzzzzzz'
>>> import re
>>> mylist = re.split("[a-m]+", mystring)
>>> print mylist
['zzzzzz', 'zzzzzz', 'zzzzzzzzz', 'zzzzzzzzzzzz']

The regular expression “[a-m]+” means the lowercase letters a through m that occur one or more times are matched as a delimiter. re is a library to be imported.

Or if you want to chomp the items one at a time:

<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="4520290524352a29292a">[email protected]</a>:~/foo$ python
>>> mystring = "theres coffee in that nebula"

>>> mytuple = mystring.partition(" ")

>>> print type(mytuple)
<type 'tuple'>

>>> print mytuple
('theres', ' ', 'coffee in that nebula')

>>> print mytuple[0]
theres

>>> print mytuple[2]
coffee in that nebula

Method 3

If it’s always going to be an even LHS/RHS split, you can also use the partition method that’s built into strings. It returns a 3-tuple as (LHS, separator, RHS) if the separator is found, and (original_string, '', '') if the separator wasn’t present:

>>> "2.7.0_bf4fda703454".partition('_')
('2.7.0', '_', 'bf4fda703454')

>>> "shazam".partition("_")
('shazam', '', '')


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x