How to split strings inside a list by whitespace characters

Reading from stdin gives me a list in which each line of text is a separate element.
How do I split all of them into single words?

mylist = ['this is a string of text \n', 'this is a different string of text \n', 'and for good measure here is another one \n']

wanted output:

newlist = ['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']

Answers:


Method 1

You can use a simple list comprehension, like:

newlist = [word for line in mylist for word in line.split()]

This generates:

>>> [word for line in mylist for word in line.split()]
['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
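
Since the question mentions lines coming from stdin, here is a minimal sketch of the same comprehension applied to a file-like object. It assumes io.StringIO as a stand-in for sys.stdin so that it runs without piped input; the name fake_stdin and the sample text are made up for illustration.

```python
import io

# io.StringIO stands in for sys.stdin here (an assumption for the example).
fake_stdin = io.StringIO(
    "this is a string of text\n"
    "this is a different string of text\n"
)

# readlines() keeps the trailing newline on each element,
# which matches the shape of the list in the question.
lines = fake_stdin.readlines()

# split() with no argument splits on any run of whitespace,
# so the trailing newline never ends up in a word.
words = [word for line in lines for word in line.split()]
print(words)
```

With real input, replacing fake_stdin with sys.stdin (after import sys) should behave the same way.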

Method 2

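You could just do:

words = " ".join(mylist).split()

So you join the list into one string and then split it on whitespace. Note that calling str() on the list would not work here, because it returns the list's printed representation, with the brackets, quotes, and commas included.
Because split() with no arguments splits on any run of whitespace, the \n's are discarded along with the spaces, so no separate replace() step is needed.

If you did want to remove the newlines explicitly, you could do it in one line:

words = " ".join(mylist).replace("\n", " ").split()

This works in both Python 2 and Python 3.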
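
As a rough illustration of why the way the list becomes a string matters, the sketch below compares three calls on a shortened, hypothetical version of mylist. The variable names via_str, via_join, and via_space are made up for this example.

```python
# Hypothetical sample data; each element ends with a newline, as in the question.
mylist = ['this is a string of text \n', 'and here is another one \n']

# str() on a list returns its printed representation, so brackets and
# quotes end up glued to the first and last words.
via_str = str(mylist).split()

# Joining the elements first gives one plain string; split() with no
# argument then splits on any run of whitespace, newlines included.
via_join = ' '.join(mylist).split()

# An explicit ' ' separator only splits on single spaces, so the
# newline survives as its own token.
via_space = mylist[0].split(' ')

print(via_str[0])
print(via_join)
print(via_space)
```

Only the join-then-split version gives clean words here, which is why it is the safer form of this method.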

Method 3

Besides the list comprehension answer above, which I vouch for, you could also do it in a for loop:

# Define newlist as an empty list
newlist = []
# Iterate over the items in mylist
for item in mylist:
    # Split the element string into a list of words
    itemWords = item.split()
    # Extend newlist to include all of itemWords
    newlist.extend(itemWords)
print(newlist)

Eventually, newlist will contain all the words from every element of mylist.

But the list comprehension looks much nicer, and you can do a lot more with it. See here for more:

https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions

Method 4

Alternatively, you can map the str.split method over every string in the list and then chain the resulting lists together with itertools.chain.from_iterable:

from itertools import chain

mylist = ['this is a string of text \n', 'this is a different string of text \n', 'and for good measure here is another one \n']
result = list(chain.from_iterable(map(str.split, mylist)))
print(result)
# ['this', 'is', 'a', 'string', 'of', 'text', 'this', 'is', 'a', 'different', 'string', 'of', 'text', 'and', 'for', 'good', 'measure', 'here', 'is', 'another', 'one']
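
As a quick sanity check, the sketch below runs the list comprehension, the for loop, and the chain approach on the same shortened, hypothetical input and confirms that they agree. The names by_comprehension, by_loop, and by_chain are made up for this example.

```python
from itertools import chain

# Hypothetical, shortened sample data in the same shape as the question's list.
mylist = ['this is a string of text \n', 'and for good measure here is another one \n']

# Nested list comprehension.
by_comprehension = [word for line in mylist for word in line.split()]

# Explicit for loop with extend().
by_loop = []
for line in mylist:
    by_loop.extend(line.split())

# map() + chain.from_iterable().
by_chain = list(chain.from_iterable(map(str.split, mylist)))

# All three approaches should produce the same flat list of words.
assert by_comprehension == by_loop == by_chain
print(by_chain)
```

chain.from_iterable returns a lazy iterator, so the list() call can be dropped if you only need to loop over the words once.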


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
