Iterable && Iterator
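- Iterable: an object that defines only the __iter__ method. Strings, lists, tuples, dicts, and files are iterables. Call iter(iterable) to get an iterator object explicitly; list(iterable) and for xxx in iterable call __iter__ indirectly.
- Iterator (the Iteration Protocol): an object that defines both __iter__ and __next__. __iter__ returns the iterator itself (so it can be used in a for loop), and __next__ returns the next element, raising StopIteration once no elements are left. In Python 2 the method is named next; in Python 3 it is __next__.

iterator = iter(l)  # equivalent to iterator2 = l.__iter__()
list(l)             # calls l.__iter__() indirectly
for xxx in l:       # calls l.__iter__() indirectly
    pass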
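The protocol above can be exercised by hand. The following is a minimal sketch for Python 3, where the method is `__next__` and the built-in `next()` calls it; the list `letters` is only an illustrative value.

```python
letters = ["a", "b", "c"]

# A list is an iterable: it has __iter__ but no __next__.
assert hasattr(letters, "__iter__") and not hasattr(letters, "__next__")

it = iter(letters)           # same as letters.__iter__()
assert iter(it) is it        # an iterator returns itself from __iter__

first = next(it)             # same as it.__next__()
rest = list(it)              # consumes whatever is left

try:
    next(it)                 # nothing left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

print(first, rest, exhausted)  # a ['b', 'c'] True
```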
yrange
Example 1: the iterable and the iterator are the same object.
y = iterable()
list(y)
list(y)
for i in y
Only the first pass outputs all the values; later passes output nothing.
class yrange:
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        print("__iter__1")
        return self

    def next(self):  # Python 2 name
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            raise StopIteration()

    __next__ = next  # Python 3 name for the same method
Calling list(iterable) invokes the iterable's __iter__ method, which returns the iterator.
output
>>> y = yrange(3)
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
StopIteration
>>> list(yrange(3))
__iter__1
[0, 1, 2]
>>> sum(yrange(3))
__iter__1
3
>>> y = yrange(3)
>>> list(y)
__iter__1
[0, 1, 2]
>>> list(y)
__iter__1
[]
>>> y = yrange(3)
>>> list(y.__iter__())
__iter__1
__iter__1
[0, 1, 2]
>>> list(y.__iter__())
__iter__1
__iter__1
[]
zrange
Example 2: the iterable and the iterator are different objects.
z = iterable()
list(z)
list(z)
for i in z
No matter how many times it is iterated, every pass outputs all the values.
class zrange:
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        print("__iter__1")
        return zrange_iter(self.n)


class zrange_iter:
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        print("__iter__2")
        # Iterators are iterables too.
        # Adding this method makes them so.
        return self

    def next(self):  # Python 2 name
        if self.i < self.n:
            i = self.i
            self.i += 1
            return i
        else:
            raise StopIteration()

    __next__ = next  # Python 3 name for the same method
output
>>> z = zrange(3)
>>> list(z)
__iter__1
[0, 1, 2]
>>> list(z)
__iter__1
[0, 1, 2]
>>> z = zrange(3)
>>> list(z.__iter__())
__iter__1
__iter__2
[0, 1, 2]
>>> list(z.__iter__())
__iter__1
__iter__2
[0, 1, 2]
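The difference between the two examples comes down to what `__iter__` returns. The sketch below restates that idea compactly for Python 3; the class names `OneShot` and `Reusable` are illustrative and do not appear in the examples above.

```python
class OneShot:
    """Iterable and iterator in one object (like yrange)."""
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self              # hands out itself, state included

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        self.i += 1
        return self.i - 1


class Reusable:
    """Iterable that builds a fresh iterator each time (like zrange)."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return OneShot(self.n)   # new iterator, new state


y = OneShot(3)
z = Reusable(3)
print(iter(y) is y)          # True
print(list(y), list(y))      # [0, 1, 2] []
print(iter(z) is z)          # False
print(list(z), list(z))      # [0, 1, 2] [0, 1, 2]
```

Because `OneShot` carries the position counter itself, a second pass finds it already at the end, while `Reusable` starts a new counter on every call to `__iter__`.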
Generator
Generator functions are ordinary functions defined using yield instead of return. When called, a generator function returns a generator object, which is a kind of iterator: it has a next() method (__next__() in Python 3). When you call next(), the next value yielded by the generator function is returned.
These notes use the word "generator" to mean the generated object and "generator function" to mean the function that generates it.
A generator is also an iterator. Generator functions simplify creating iterators: a single yield replaces hand-written __iter__ and next methods.
def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1
output
>>> y = yrange(3)
>>> y
<generator object yrange at 0x401f30>
>>> y.next()
0
>>> y.next()
1
>>> y.next()
2
>>> y.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
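That a generator object really is an iterator can be checked directly. This is a small Python 3 sketch of the same `yrange` generator function, using the built-in `next()` instead of the Python 2 `.next()` method shown in the session above.

```python
def yrange(n):
    i = 0
    while i < n:
        yield i
        i += 1


g = yrange(3)
print(iter(g) is g)            # True: __iter__ returns the generator itself
print(hasattr(g, "__next__"))  # True
print(next(g))                 # 0
print(list(g))                 # [1, 2]
print(list(g))                 # []
```

Like the class-based `yrange`, the generator is one-shot: once exhausted, iterating it again yields nothing.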
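How it works
When a generator function is called, it returns a generator object without even beginning execution of the function. When the next method is called for the first time, the function starts executing and runs until it reaches a yield statement. The yielded value is returned by the next call. Each later call resumes right after that yield and runs to the next one; if the function finishes instead, StopIteration is raised.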
>>> def foo():
...     print "begin"
...     for i in range(3):
...         print "before yield", i
...         yield i
...         print "after yield", i
...     print "end"
...
>>> f = foo()  # runs nothing yet; returns a generator object
>>> f.next()   # runs until the first yield and returns the yielded value
begin
before yield 0
0
>>> f.next()   # resumes right after the previous yield and runs until the next yield
after yield 0
before yield 1
1
>>> f.next()   # resumes right after the previous yield and runs until the next yield
after yield 1
before yield 2
2
>>> f.next()   # resumes after the last yield; no further yield is reached, so StopIteration is raised
after yield 2
end
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
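The same pause-and-resume behaviour can be verified without reading printed output. In this Python 3 sketch the generator records each step in a list; the `events` list is a helper introduced only for this illustration.

```python
events = []  # illustrative log of what the generator body has executed


def foo():
    events.append("begin")
    for i in range(2):
        events.append("before yield %d" % i)
        yield i
        events.append("after yield %d" % i)
    events.append("end")


f = foo()
assert events == []                  # calling foo() ran none of its body

assert next(f) == 0                  # runs up to the first yield
assert events == ["begin", "before yield 0"]

assert next(f) == 1                  # resumes after the first yield
assert events[-2:] == ["after yield 0", "before yield 1"]

try:
    next(f)                          # body finishes without another yield
except StopIteration:
    pass
assert events[-2:] == ["after yield 1", "end"]
```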
Another example:
def integers():
    """Infinite sequence of integers."""
    i = 1
    while True:
        yield i
        i = i + 1

def squares():
    for i in integers():
        yield i * i

def take(n, seq):
    """Returns first n values from the given sequence."""
    seq = iter(seq)
    result = []
    try:
        for i in range(n):
            result.append(seq.next())
    except StopIteration:
        pass
    return result

print take(5, squares())  # prints [1, 4, 9, 16, 25]
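As an aside, the standard library can express `take` directly. The sketch below is a Python 3 variant, not the code above: it swaps the hand-written loop for `itertools.islice` and the `integers()` generator for `itertools.count`.

```python
from itertools import count, islice


def squares():
    for i in count(1):       # count(1) yields 1, 2, 3, ... forever
        yield i * i


def take(n, seq):
    """Return the first n values of seq as a list."""
    return list(islice(seq, n))


print(take(5, squares()))    # [1, 4, 9, 16, 25]
print(take(5, [1, 2]))       # [1, 2]
```

`islice` stops quietly when the input runs out, so the explicit `try`/`except StopIteration` is no longer needed.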
Generator Expressions
Generator expressions are the generator version of list comprehensions. They look like list comprehensions, but return a generator instead of a list.
>>> a = [x for x in range(5)]
>>> a
[0, 1, 2, 3, 4]
>>> a = (x*x for x in range(5))
>>> a
<generator object <genexpr> at 0x0000000005232630>
>>> sum(a)
30
>>> sum((x*x for x in range(10)))
285
>>> # When the generator expression is the only argument, its own parentheses can be omitted
>>> sum(x*x for x in range(10))
285
>>> pyt = ((x, y, z) for z in integers()
...        for y in xrange(1, z)
...        for x in range(1, y)
...        if x*x + y*y == z*z)
>>> pyt
<generator object <genexpr> at 0x0000000005232828>
>>> take(5, pyt)
[(3, 4, 5), (6, 8, 10), (5, 12, 13), (9, 12, 15), (8, 15, 17)]
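One practical difference from a list comprehension is that a generator expression can be consumed only once. A short Python 3 sketch, with variable names chosen only for illustration:

```python
squares_list = [x * x for x in range(5)]   # built eagerly, reusable
squares_gen = (x * x for x in range(5))    # built lazily, one-shot

first_total = sum(squares_gen)
second_total = sum(squares_gen)            # generator is already exhausted

print(sum(squares_list), sum(squares_list))  # 30 30
print(first_total, second_total)             # 30 0
```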
Example: Reading multiple files
Python's built-in file object is itself an iterator:
f = open('./1.txt')
f.next()
f.next()
Generators can simplify the code.
old
def cat(filenames):
    for f in filenames:
        for line in open(f):
            print line,

def grep(pattern, filenames):
    for f in filenames:
        for line in open(f):
            if pattern in line:
                print line,
new with generator
def readfiles(filenames):
    for f in filenames:
        for line in open(f):
            yield line

def grep(pattern, lines):
    return (line for line in lines if pattern in line)

def printlines(lines):
    for line in lines:
        print line,

def main(pattern, filenames):
    lines = readfiles(filenames)
    lines = grep(pattern, lines)
    printlines(lines)
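To try the generator pipeline end to end, here is a self-contained Python 3 sketch. The file names and their contents are made-up sample data written to a temporary directory, and `readfiles` uses a `with` block (an addition, not in the version above) so each file is closed once it has been read.

```python
import os
import tempfile


def readfiles(filenames):
    for name in filenames:
        with open(name) as f:    # close each file once it is exhausted
            for line in f:
                yield line


def grep(pattern, lines):
    return (line for line in lines if pattern in line)


# Hypothetical sample data, written to temporary files for the demo.
samples = [("a.txt", "spam\neggs\n"), ("b.txt", "ham\nspam and eggs\n")]

with tempfile.TemporaryDirectory() as tmpdir:
    paths = []
    for name, text in samples:
        path = os.path.join(tmpdir, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)

    # Nothing is read until list() starts pulling lines through the pipeline.
    matches = list(grep("spam", readfiles(paths)))

print(matches)   # ['spam\n', 'spam and eggs\n']
```

Each stage only handles one line at a time, so the pipeline works the same way on files too large to fit in memory.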
Itertools
import itertools

it1 = iter([1, 2, 3])
it2 = iter([4, 5, 6])
for v in itertools.chain(it1, it2):
    print v

for x, y in itertools.izip(["a", "b", "c"], [1, 2, 3]):
    print x, y
#a 1
#b 2
#c 3
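The snippet above is Python 2. In Python 3, `itertools.izip` no longer exists because the built-in `zip` is already lazy; `itertools.chain` is unchanged. A rough Python 3 equivalent:

```python
import itertools

it1 = iter([1, 2, 3])
it2 = iter([4, 5, 6])
chained = list(itertools.chain(it1, it2))
print(chained)                   # [1, 2, 3, 4, 5, 6]

pairs = zip(["a", "b", "c"], [1, 2, 3])
print(iter(pairs) is pairs)      # True: zip returns an iterator
print(list(pairs))               # [('a', 1), ('b', 2), ('c', 3)]
```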
History
- 20181029: created.