Is the a short syntax for joining a list of lists into a single list( or iterator) in python?
For example I have a list as follows and I want to iterate over a,b and c.
x = [["a","b"], ["c"]]
The best I can come up with is as follows.
result = [] [ result.extend(el) for el in x] for el in result: print el
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
import itertools a = [['a','b'], ['c']] print(list(itertools.chain.from_iterable(a)))
Method 2
x = [["a","b"], ["c"]] result = sum(x, [])
Method 3
If you’re only going one level deep, a nested comprehension will also work:
>>> x = [["a","b"], ["c"]] >>> [inner ... for outer in x ... for inner in outer] ['a', 'b', 'c']
On one line, that becomes:
>>> [j for i in x for j in i] ['a', 'b', 'c']
Method 4
l = [] map(l.extend, list_of_lists)
shortest!
Method 5
This is known as flattening, and there are a LOT of implementations out there.
How about this, although it will only work for 1 level deep nesting:
>>> x = [["a","b"], ["c"]] >>> for el in sum(x, []): ... print el ... a b c
From those links, apparently the most complete-fast-elegant-etc implementation is the following:
def flatten(l, ltypes=(list, tuple)):
ltype = type(l)
l = list(l)
i = 0
while i < len(l):
while isinstance(l[i], ltypes):
if not l[i]:
l.pop(i)
i -= 1
break
else:
l[i:i + 1] = l[i]
i += 1
return ltype(l)
Method 6
If you need a list, not a generator, use list():
from itertools import chain x = [["a","b"], ["c"]] y = list(chain(*x))
Method 7
A performance comparison:
import itertools import timeit big_list = [[0]*1000 for i in range(1000)] timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100) timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100) timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100) timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100) [100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)]
Producing:
>>> import itertools >>> import timeit >>> big_list = [[0]*1000 for i in range(1000)] >>> timeit.repeat(lambda: list(itertools.chain.from_iterable(big_list)), number=100) [3.016212113769325, 3.0148865239060227, 3.0126415732791028] >>> timeit.repeat(lambda: list(itertools.chain(*big_list)), number=100) [3.019953987082083, 3.528754223385439, 3.02181439266457] >>> timeit.repeat(lambda: (lambda b: map(b.extend, big_list))([]), number=100) [1.812084445152557, 1.7702404451095965, 1.7722977998725362] >>> timeit.repeat(lambda: [el for list_ in big_list for el in list_], number=100) [5.409658160700605, 5.477502077679354, 5.444318360412744] >>> [100*x for x in timeit.repeat(lambda: sum(big_list, []), number=1)] [399.27587954973444, 400.9240571138051, 403.7521153804846]
This is with Python 2.7.1 on Windows XP 32-bit, but @temoto in the comments above got from_iterable to be faster than map+extend, so it’s quite platform and input dependent.
Stay away from sum(big_list, [])
Method 8
This works recursively for infinitely nested elements:
def iterFlatten(root):
if isinstance(root, (list, tuple)):
for element in root:
for e in iterFlatten(element):
yield e
else:
yield root
Result:
>>> b = [["a", ("b", "c")], "d"]
>>> list(iterFlatten(b))
['a', 'b', 'c', 'd']
Method 9
Late to the party but …
I’m new to python and come from a lisp background. This is what I came up with (check out the var names for lulz):
def flatten(lst):
if lst:
car,*cdr=lst
if isinstance(car,(list,tuple)):
if cdr: return flatten(car) + flatten(cdr)
return flatten(car)
if cdr: return [car] + flatten(cdr)
return [car]
Seems to work. Test:
flatten((1,2,3,(4,5,6,(7,8,(((1,2)))))))
returns:
[1, 2, 3, 4, 5, 6, 7, 8, 1, 2]
Method 10
What you’re describing is known as flattening a list, and with this new knowledge you’ll be able to find many solutions to this on Google (there is no built-in flatten method). Here is one of them, from http://www.daniel-lemire.com/blog/archives/2006/05/10/flattening-lists-in-python/:
def flatten(x):
flat = True
ans = []
for i in x:
if ( i.__class__ is list):
ans = flatten(i)
else:
ans.append(i)
return ans
Method 11
There’s always reduce (being deprecated to functools):
>>> x = [ [ 'a', 'b'], ['c'] ] >>> for el in reduce(lambda a,b: a+b, x, []): ... print el ... __main__:1: DeprecationWarning: reduce() not supported in 3.x; use functools.reduce() a b c >>> import functools >>> for el in functools.reduce(lambda a,b: a+b, x, []): ... print el ... a b c >>>
Unfortunately the plus operator for list concatenation can’t be used as a function — or fortunate, if you prefer lambdas to be ugly for improved visibility.
Method 12
Or a recursive operation:
def flatten(input):
ret = []
if not isinstance(input, (list, tuple)):
return [input]
for i in input:
if isinstance(i, (list, tuple)):
ret.extend(flatten(i))
else:
ret.append(i)
return ret
Method 13
For one-level flatten, if you care about speed, this is faster than any of the previous answers under all conditions I tried. (That is, if you need the result as a list. If you only need to iterate through it on the fly then the chain example is probably better.) It works by pre-allocating a list of the final size and copying the parts in by slice (which is a lower-level block copy than any of the iterator methods):
def join(a):
"""Joins a sequence of sequences into a single sequence. (One-level flattening.)
E.g., join([(1,2,3), [4, 5], [6, (7, 8, 9), 10]]) = [1,2,3,4,5,6,(7,8,9),10]
This is very efficient, especially when the subsequences are long.
"""
n = sum([len(b) for b in a])
l = [None]*n
i = 0
for b in a:
j = i+len(b)
l[i:j] = b
i = j
return l
Sorted times list with comments:
[(0.5391559600830078, 'flatten4b'), # join() above. (0.5400412082672119, 'flatten4c'), # Same, with sum(len(b) for b in a) (0.5419249534606934, 'flatten4a'), # Similar, using zip() (0.7351131439208984, 'flatten1b'), # list(itertools.chain.from_iterable(a)) (0.7472689151763916, 'flatten1'), # list(itertools.chain(*a)) (1.5468521118164062, 'flatten3'), # [i for j in a for i in j] (26.696547985076904, 'flatten2')] # sum(a, [])
Method 14
Sadly, Python doesn’t have a simple way to flatten lists. Try this:
def flatten(some_list):
for element in some_list:
if type(element) in (tuple, list):
for item in flatten(element):
yield item
else:
yield element
Which will recursively flatten a list; you can then do
result = []
[ result.extend(el) for el in x]
for el in flatten(result):
print el
Method 15
I had a similar problem when I had to create a dictionary that contained the elements of an array and their count. The answer is relevant because, I flatten a list of lists, get the elements I need and then do a group and count. I used Python’s map function to produce a tuple of element and it’s count and groupby over the array. Note that the groupby takes the array element itself as the keyfunc. As a relatively new Python coder, I find it to me more easier to comprehend, while being Pythonic as well.
Before I discuss the code, here is a sample of data I had to flatten first:
{ "_id" : ObjectId("4fe3a90783157d765d000011"), "status" : [ "opencalais" ],
"content_length" : 688, "open_calais_extract" : { "entities" : [
{"type" :"Person","name" : "Iman Samdura","rel_score" : 0.223 },
{"type" : "Company", "name" : "Associated Press", "rel_score" : 0.321 },
{"type" : "Country", "name" : "Indonesia", "rel_score" : 0.321 }, ... ]},
"title" : "Indonesia Police Arrest Bali Bomb Planner", "time" : "06:42 ET",
"filename" : "021121bn.01", "month" : "November", "utctime" : 1037836800,
"date" : "November 21, 2002", "news_type" : "bn", "day" : "21" }
It is a query result from Mongo. The code below flattens a collection of such lists.
def flatten_list(items): return sorted([entity['name'] for entity in [entities for sublist in [item['open_calais_extract']['entities'] for item in items] for entities in sublist])
First, I would extract all the “entities” collection, and then for each entities collection, iterate over the dictionary and extract the name attribute.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0