Splitting a list based on a delimiter word

I have a list containing various string values. I want to split the list whenever I see WORD. The result will be a list of lists (which will be the sublists of original list) containing exactly one instance of the WORD I can do this using a loop but is there a more pythonic way to do achieve this ?

Example = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']

result = [['A'], ['WORD','B','C'],['WORD','D']]

This is what I have tried but it actually does not achieve what I want since it will put WORD in a different list that it should be in:

def split_excel_cells(delimiter, cell_data):

    result = []

    temp = []

    for cell in cell_data:
        if cell == delimiter:
            temp.append(cell)
            result.append(temp)
            temp = []
        else:
            temp.append(cell)

    return result

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

import itertools

lst = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']
w = 'WORD'

spl = [list(y) for x, y in itertools.groupby(lst, lambda z: z == w) if not x]

this creates a splitted list without delimiters, which looks more logical to me:

[['A'], ['B', 'C'], ['D']]

If you insist on delimiters to be included, this should do the trick:

spl = [[]]
for x, y in itertools.groupby(lst, lambda z: z == w):
    if x: spl.append([])
    spl[-1].extend(y)

Method 2

I would use a generator:

def group(seq, sep):
    g = []
    for el in seq:
        if el == sep:
            yield g
            g = []
        g.append(el)
    yield g

ex = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']
result = list(group(ex, 'WORD'))
print(result)

This prints

[['A'], ['WORD', 'B', 'C'], ['WORD', 'D']]

The code accepts any iterable, and produces an iterable (which you don’t have to flatten into a list if you don’t want to).

Method 3

Given

import more_itertools as mit


iterable = ["A", "WORD", "B" , "C" , "WORD" , "D"]
pred = lambda x: x == "WORD"

Code

list(mit.split_before(iterable, pred))
# [['A'], ['WORD', 'B', 'C'], ['WORD', 'D']]

more_itertools is a third-party library installable via > pip install more_itertools.

See also split_at and split_after.

Method 4

  • @NPE’s solution looks very pythonic to me. This is another one using itertools:
  • izip is specific to python 2.7. Replace izip with zip to work in python 3
from itertools import izip, chain
example = ['A', 'WORD', 'B' , 'C' , 'WORD' , 'D']
indices = [i for i,x in enumerate(example) if x=="WORD"]
pairs = izip(chain([0], indices), chain(indices, [None]))
result = [example[i:j] for i, j in pairs]


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x