Iterable Operators
The synchronous helpers in pdum.plumbum.iterops make it easy to work with pipelines that consume and transform iterables.
The async module pdum.plumbum.aiterops mirrors the same API for async iterators.
In [1]:
from pdum.plumbum import pb
from pdum.plumbum.iterops import (
    take,
    tail,
    skip,
    dedup,
    uniq,
    select,
    where,
    groupby,
    sort,
    reverse,
    transpose,
    batched,
    chain,
    chain_with,
    islice,
    izip,
)
Basic selection and filtering
In [2]:
data = [1, 2, 3, 4]
pipeline = select(lambda value: value * 2) | where(lambda value: value % 2 == 0) | pb(list)
data > pipeline
Out[2]:
[2, 4, 6, 8]
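It may not be obvious how `data > pipeline` can work when `data` is a plain list. The following is a minimal, stdlib-only sketch of one way such operators can be built. It is not pdum.plumbum's implementation: `Pipe`, `select_sketch`, and `where_sketch` are hypothetical names introduced only for illustration. The sketch relies on a documented Python rule: when `list.__gt__` returns `NotImplemented`, the interpreter falls back to the right-hand operand's reflected `__lt__`.

```python
class Pipe:
    """Hypothetical stand-in for a pipeline stage (not the pdum.plumbum class)."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # stage_a | stage_b: feed the output of stage_a into stage_b.
        return Pipe(lambda value: other.func(self.func(value)))

    def __lt__(self, value):
        # `value > pipe` ends up here: list.__gt__ returns NotImplemented,
        # so Python tries the reflected comparison pipe.__lt__(value).
        return self.func(value)


def select_sketch(fn):
    # Lazily map fn over the incoming iterable.
    return Pipe(lambda items: (fn(item) for item in items))


def where_sketch(pred):
    # Lazily keep only the items for which pred is true.
    return Pipe(lambda items: (item for item in items if pred(item)))


pipeline = select_sketch(lambda v: v * 2) | where_sketch(lambda v: v % 4 == 0) | Pipe(list)
result = [1, 2, 3, 4] > pipeline
print(result)  # [4, 8]
```

The predicate here tests divisibility by 4 so that the filter visibly removes items; in the cell above every doubled value is even, so `where` lets everything through.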
Trimming data
In [3]:
list(range(5)) > (take(3) | pb(list))
Out[3]:
[0, 1, 2]
In [4]:
list(range(5)) > (tail(2) | pb(list))
Out[4]:
[3, 4]
In [5]:
list(range(5)) > (skip(3) | pb(list))
Out[5]:
[3, 4]
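For comparison, the three trimming results above can be reproduced with the standard library alone. This is a sketch of equivalent behavior for the inputs shown, not a statement about how pdum.plumbum implements these operators; `take_sketch`, `skip_sketch`, and `tail_sketch` are hypothetical names.

```python
from collections import deque
from itertools import islice


def take_sketch(items, n):
    # First n items; stops early, so it also works on unbounded iterators.
    return islice(items, n)


def skip_sketch(items, n):
    # Everything after the first n items.
    return islice(items, n, None)


def tail_sketch(items, n):
    # Last n items; a bounded deque discards older items as new ones arrive.
    # The whole input must be consumed before anything can be yielded.
    return iter(deque(items, maxlen=n))


print(list(take_sketch(range(5), 3)))  # [0, 1, 2]
print(list(tail_sketch(range(5), 2)))  # [3, 4]
print(list(skip_sketch(range(5), 3)))  # [3, 4]
```

One practical consequence of this formulation: a tail-style operation has to read its entire input, so it cannot finish on an infinite iterator, whereas take and skip remain lazy.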
Deduplication
In [6]:
[1, 1, 2, 2, 3] > (dedup() | pb(list))
Out[6]:
[1, 2, 3]
In [7]:
[1, 1, 2, 1, 1] > (uniq() | pb(list))
Out[7]:
[1, 2, 1]
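The uniq output shows that it only collapses consecutive runs (the trailing 1 survives), much like the Unix `uniq` tool. The dedup example is consistent with dropping every repeated value regardless of position, although this notebook does not show dedup on input with non-adjacent repeats, so that reading is an assumption. Under that assumption, the two behaviors can be sketched with the standard library; `dedup_sketch` and `uniq_sketch` are hypothetical names, not the library's code.

```python
from itertools import groupby


def dedup_sketch(items):
    # Assumed semantics: keep the first occurrence of each value, drop the rest.
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


def uniq_sketch(items):
    # Collapse each run of equal adjacent values into a single value.
    for key, _group in groupby(items):
        yield key


print(list(dedup_sketch([1, 1, 2, 2, 3])))  # [1, 2, 3]
print(list(uniq_sketch([1, 1, 2, 1, 1])))   # [1, 2, 1]
```

Note that the set-based approach requires hashable items and holds every distinct value in memory, while the run-collapsing approach needs neither.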
Sorting and grouping
In [8]:
[3, 1, 2] > (sort() | pb(list))
Out[8]:
[1, 2, 3]
In [9]:
list(range(5)) > (reverse | pb(list))
Out[9]:
[4, 3, 2, 1, 0]
In [10]:
words = ["apple", "apricot", "banana"]
words > (groupby(lambda w: w[0]) | pb(list))
Out[10]:
[('a', <itertools._grouper at 0x7f9c747f4ac0>),
('b', <itertools._grouper at 0x7f9c747f5a20>)]
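The `itertools._grouper` objects in this output indicate that the groups are lazy iterators of the kind produced by `itertools.groupby`. Assuming the operator follows `itertools.groupby` semantics (an inference from that repr, not something this notebook states), two points are worth knowing: only adjacent items with equal keys end up in the same group, and each group should be materialized before the iteration advances to the next key. A stdlib-only sketch:

```python
from itertools import groupby


def first_letter(word):
    return word[0]


words = ["apple", "apricot", "banana"]

# Materialize each group while iterating, before groupby moves on.
grouped = [(key, list(group)) for key, group in groupby(words, key=first_letter)]
print(grouped)  # [('a', ['apple', 'apricot']), ('b', ['banana'])]

# Equal keys that are not adjacent produce separate groups...
mixed = ["apple", "banana", "apricot"]
unsorted_keys = [key for key, _group in groupby(mixed, key=first_letter)]
print(unsorted_keys)  # ['a', 'b', 'a']

# ...so sort by the same key first when one group per key is wanted.
sorted_groups = [
    (key, list(group))
    for key, group in groupby(sorted(mixed, key=first_letter), key=first_letter)
]
print(sorted_groups)  # [('a', ['apple', 'apricot']), ('b', ['banana'])]
```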
Shaping data
In [11]:
[[1, 2], [3, 4]] > (transpose | pb(list))
Out[11]:
[(1, 3), (2, 4)]
In [12]:
list(range(5)) > (batched(2) | pb(list))
Out[12]:
[(0, 1), (2, 3), (4,)]
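The tuple output, including the short final batch `(4,)`, matches the behavior of `itertools.batched`, which is only available from Python 3.12. For reference, an equivalent that also runs on older interpreters (3.8+) can be sketched with `islice`. `batched_sketch` is a hypothetical name, and this is not claimed to be how pdum.plumbum implements the operator.

```python
from itertools import islice


def batched_sketch(items, size):
    # Generator: the size check runs when iteration starts, not at call time.
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    # Pull up to `size` items at a time; an empty tuple means the input is exhausted.
    while batch := tuple(islice(iterator, size)):
        yield batch


print(list(batched_sketch(range(5), 2)))  # [(0, 1), (2, 3), (4,)]
```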
Combining iterables
In [13]:
[[1, 2], [3]] > (chain | pb(list))
Out[13]:
[1, 2, 3]
In [14]:
[1, 2] > (chain_with([3, 4]) | pb(list))
Out[14]:
[1, 2, 3, 4]
In [15]:
list(range(5)) > (islice(1, 4) | pb(list))
Out[15]:
[1, 2, 3]
In [16]:
[1, 2] > (izip(["a", "b"]) | pb(list))
Out[16]:
[(1, 'a'), (2, 'b')]
The async equivalents live in pdum.plumbum.aiterops and use the same names with an a prefix (e.g. aselect for select).
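This notebook does not show how an aiterops pipeline is invoked, so the following does not use the library at all. It is a stdlib-only sketch of what a select-style operator over an async iterator looks like conceptually; `aselect_sketch` and `numbers` are hypothetical names chosen for illustration.

```python
import asyncio


async def numbers():
    # Hypothetical async source that yields control between items.
    for value in [1, 2, 3, 4]:
        await asyncio.sleep(0)
        yield value


async def aselect_sketch(fn, source):
    # Async counterpart of a map step: transform each item as it arrives.
    async for item in source:
        yield fn(item)


async def main():
    # Collect the transformed stream with an async comprehension.
    return [item async for item in aselect_sketch(lambda v: v * 2, numbers())]


result = asyncio.run(main())
print(result)  # [2, 4, 6, 8]
```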