Exercises¶
This chapter is an intermezzo that allows us to check and have a deeper understanding of the concepts seen so far by means of exercises. We will see how the code shown can be rewritten to take advantage of battle-tested solutions and idioms that emerges from daily practice.
First of all, we import some modules (be free to skim the corresponding documentation for each one of them),
import functools, operator, math, itertools, random, collections, statistics, bisect, operator, heapq
that contains useful definitions for the code that we are going to write. Moreover, an utility for generators,
def take(iterable, n):
return map(lambda p: p[1], zip(range(n), iterable))
that consumes an iterable and return a generator that will yield \(n\) objects at most. For the sake of clarity,
taken = take(itertools.count(), 50)
taken
<generator object take at 0x7f9dbe3b65f0>
is a actually generator and its content equals
assert list(taken) == list(range(50))
Before starting, we initialize the random generator with a nice prime
random.seed(11)
Intersection¶
A = list(range(10000))
B = list(range(10000))
random.shuffle(A)
random.shuffle(B)
def intersection(A, B):
B = set(B)
return (a for a in A if a in B)
%timeit list(intersection(A, B))
1.73 ms ± 48.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit list(zip(A, set(B)))
1.44 ms ± 206 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
def intersection(A, B):
A, B = iter(sorted(A)), iter(sorted(B))
a, b = next(A), next(B)
while True:
try:
if a == b:
yield a
a, b = next(A), next(B)
elif a < b:
a = next(A)
else:
b = next(B)
except StopIteration:
break
%timeit list(intersection(A, B))
6.47 ms ± 1.09 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
(Pythagorean) tuples¶
Let
def tuples(*slices):
return itertools.product(*map(lambda s: range(s.start, s.stop), slices))
INTERMEZZO
def A(a, b, c, d):
pass
def A(*args):
return list(map(lambda i: i + 4, args))
def AA(args):
return list(map(lambda i: i + 4, args))
def B(a, b, *args):
return [a, b] + list(map(lambda i: i + 4, args))
A(1, 2, 3)
A([1, 2, 3])
AA([1, 2, 3])
B(1,)
B(1, 2)
B(1, 2, 3)
A()
A(1, 2, 3)
A(1, 2, 3, 4, 5, 6, 7)
container = range(5)
A( *container )
where
help(itertools.product)
Consider the application to an empty sequence of slide
s,
units = tuples()
units
then saturate it
list(units)
Now, build tuples using just a slide
object,
singletons = tuples(slice(5, 11))
singletons
then saturate it
list(singletons)
Now, build tuples using a twin slide
object,
s = slice(5, 11)
pairs = tuples(s, s)
pairs
then saturate it
list(pairs)
Now, build tuples using a three different slide
objects (taking into
account of splitting the returned generator),
triples_a, triples_b = itertools.tee(tuples(slice(5, 11), slice(6, 13), slice(7, 14)))
where
help(itertools.tee)
then saturate it
list(triples_a)
Now a corner case, but still interesting for ensuring a sound behavior,
triples = tuples(slice(5, 11), slice(6, 6), slice(7, 14)) # ouch!
L = [1, 2, 3, 4]
L[2:2]
L[slice(2, 2)]
then saturate it
list(triples) # who we have to blame?
Finally, let
type(True)
def is_pythagorean(tup: tuple, n=2) -> bool: # is_pythagorean is a *predicate*
'''A Pythagorean triple consists of three positive integers a, b, and c, such that a^2 + b^2 = c^2.
Such a triple is commonly written (a, b, c), and a well-known example is (3, 4, 5).
If (a, b, c) is a Pythagorean triple, then so is (ka, kb, kc) for any positive integer k.
A primitive Pythagorean triple is one in which a, b and c are coprime (that is,
they have no common divisor larger than 1).
See also https://en.wikipedia.org/wiki/Pythagorean_triple.
'''
a, b, c = tup # tuple unpacking
return (a**n + b**n == c**n) if a <= b <= c else False
in
filter(is_pythagorean, triples_b)
list(filter(is_pythagorean, triples_b)) # do a selection
and
help(is_pythagorean) # just to show that writing docstrings is cool and useful.
sum_upto
¶
Let
def sum_upto(n):
return functools.reduce(operator.add, range(n+1))
and test according to Euler’s quicker formula
n = 100
v = sum_upto(n)
gauss = (n*(n+1)/2)
assert v == gauss == 5050
where
help(functools.reduce)
and
help(operator.add)
sqrt
¶
Let
def sqrt(n):
refined = n
while True:
yield refined
refined = (n/refined + refined)/2
to enumerate 15 approximation of the square root of 37
n = 37
list(take(sqrt(37), 15))
and check with respect to
math.sqrt(n)
where
help(math.sqrt)
Help on built-in function sqrt in module math:
sqrt(x, /)
Return the square root of x.
\(\pi\)¶
According to https://en.wikipedia.org/wiki/Leibniz_formula_for_%CF%80, let
def pi_Leibniz():
d = 0
for i, coeff in enumerate(itertools.count(1, step=2)):
yield 4*d
d += (-1)**i/coeff
in
list(take(pi_Leibniz(), 1000))[-10:]
and check against the
math.pi
where
help(itertools.count)
The Collatz’s conjecture¶
Consider the following operation on an arbitrary positive integer:
If the number is even, divide it by two.
If the number is odd, triple it and add one.
See also https://en.wikipedia.org/wiki/Collatz_conjecture. Let
def collatz(n):
while True:
yield n
n = 3*n + 1 if n % 2 else n // 2 # be aware that we lose track of the original `n`!
in
[list(take(collatz(n), 15)) for n in range(1, 20)]
Fibonacci numbers¶
Directly from https://docs.python.org/3/library/functools.html#functools.cache:
@functools.lru_cache()
def factorial(n):
print('•', end='')
return n * factorial(n-1) if n else 1
no previously cached result, makes 11 recursive calls (count the • symbols)
factorial(10)
•••••••••••
just looks up cached value result
factorial(5)
makes two new recursive calls, the other 10 are cached
factorial(12)
••
Uniform random
on segmented interval¶
The problem here reads as follow: sample uniformly from \([a, b)\)
and \([c, d)\) where \(b <= c\). Eventually, try to generate to
an arbitrary sequence of slice
s, assuming they are fed in sorted
order with respect to <
.
help(random.random)
Help on built-in function random:
random() method of random.Random instance
random() -> x in the interval [0, 1).
def samples(*slices):
step = 1/len(slices)
steps = itertools.count(step, step)
bins = [(s, sl) for sl, s in zip(slices, steps)]
while True:
r = random.random()
i = bisect.bisect_left(bins, (r, None))
sl = slices[i]
yield abs(sl.stop - sl.start) * (r - (i*step))/step + sl.start
samples(slice(10, 20), slice(35, 40))
<generator object samples at 0x7f94b236eba0>
Then define the generator with respect to \([10, 20)\) and \([35, 40)\)
observations = take(samples(slice(10, 20), slice(35, 40)), 1_000_000)
observations
<generator object take at 0x7f94b2336970>
have a look at some observations
sorted([i for _, i in zip(range(100), observations)])
then observe the quantiles:
statistics.quantiles(observations)
it looks uniform. By the way, use different intervals, \([14, 20)\) and \([35,40)\),
observations = take(samples(slice(14, 20), slice(35, 40)), 1_000_000)
look again at some observations,
sorted([i for _, i in zip(range(100), observations)])
and check the corresponding quantiles
statistics.quantiles(observations)
it should be uniform too. Finally, we test the corner case where \(b=c\), so let \([10, 20)\) and \([20,40)\),
observations = take(samples(slice(10, 20), slice(20, 40)), 1_000_000)
look again at some observations,
sorted([i for _, i in zip(range(100), observations)])
and check the corresponding quantiles
statistics.quantiles(observations)
it should be uniform either. Finally, attempt a sampling from 4
slices,
observations = take(samples(slice(0, 5), slice(10, 15), slice(20, 25), slice(30, 35)), 1_000_000)
look again at some observations,
sorted([i for _, i in zip(range(100), observations)])
and check the corresponding quantiles
statistics.quantiles(observations)
it should be uniform either.
Bernoulli random variable¶
int(True) # this is a very quick check to see if a Boolean can be used as integer
def Bernoulli(p):
'This is a generator for a Bernoulli random variable of parameter `p` for success.'
while True: # forever we loop
r = random.random() # get a sample
yield int(r < p) # if that sample denotes a success or a failure we *yield* that outcome
B = Bernoulli(p=0.6) # B is our random variable
B
next(B)
next(B)
next(B)
next(B)
list(take(B, 20))
C = collections.Counter(take(B, 1_000_000))
C
C[1]/(C[0]+C[1])
where
print(collections.Counter.__doc__)
Russian Peasant Multiplication¶
Let
def halves_doubles(n, m):
halving = n
doubling = m
acc = 0
while halving:
digit = halving % 2
acc = acc + digit * doubling
yield (digit, halving, doubling, acc)
halving = halving >> 1 # int(halving / 2)
doubling = doubling << 1
in
list(halves_doubles(89, 18))
[(1, 89, 18, 18),
(0, 44, 36, 18),
(0, 22, 72, 18),
(1, 11, 144, 162),
(1, 5, 288, 450),
(0, 2, 576, 450),
(1, 1, 1152, 1602)]
see https://en.wikipedia.org/wiki/Ancient_Egyptian_multiplication and also https://www.cut-the-knot.org/Curriculum/Algebra/PeasantMultiplication.shtml. Then,
def rpm(n, m):
*prefix, (b, h, d, s) = halves_doubles(n, m)
return s
so the check passes,
assert rpm(89, 18) == 89 * 18 == 1602
because
bin(89)
'0b1011001'
Of course, it works too when the first number is even,
rpm(6, 100)
600
Of course our implementation
%timeit rpm(293819385789379687596845, 921038209831568476843584365)
33.2 µs ± 111 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
is slower than the primitive one
%timeit 293819385789379687596845 * 921038209831568476843584365
98.8 ns ± 0.164 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
because arithmetic is performed in the virtual machine.
Let us give a strict version also,
def rpm_strict(n, m):
halving = n
doubling = m
acc = 0
while halving:
digit = halving % 2
acc = acc + digit * doubling
halving = halving >> 1
doubling = doubling << 1
return acc
check that it is correct,
rpm_strict(89, 18)
and observe that it is a little bit faster than our former implementation
%timeit rpm_strict(293819385789379687596845, 921038209831568476843584365)
Fixed sum¶
def subarrays(L):
return (L[i:j] for i in range(len(L)) for j in range(i, len(L)+1))
L = [-1, 5, 8, -9, 4, 1]
list(subarrays(L))
def fixed_sum(L, n):
return filter(lambda s: sum(s)==n, subarrays(L))
list(fixed_sum(L, 10))
def partial_sums(L):
g = itertools.accumulate(subarrays(L), lambda s, each: s + each[-1] if each else 0, initial=0)
next(g) # to ignore the initial 0 given above
return g
list(partial_sums(L))
Toward an optimization…
def subarrays_rev(L):
return (tuple(L[i:j]) for i in range(len(L)-1, -1, -1) for j in range(i+1, len(L)+1))
list(subarrays_rev(L))
def fixed_sum_rev(L, n, cache={}):
for tup in subarrays_rev(L):
rest = tup[1:]
s = tup[0] + cache.get(rest, 0)
cache[tup] = s
if s == n: yield tup
cache = {}
list(fixed_sum_rev(L, 10, cache))
cache # have a look at the collected values
def sample(n):
O, b, *rest = bin(random.getrandbits(n)) # because `string`s are iterable objects indeed.
return list(map(int, rest))
where
help(random.getrandbits)
LL = sample(1000)
assert set(map(tuple, fixed_sum(LL, 10))) == set(fixed_sum_rev(LL, 10))
%timeit list(fixed_sum(LL, 10))
%timeit list(fixed_sum_rev(LL, 10))
INTERMEZZO
if 4 < 8:
print('a')
else:
pass
b = if 4 < 8:
'''
lots of code
'''
else:
6
b = 5 if 4 < 8 else 6
b
Some strange uses of recursion¶
For more on this recursion schemata see https://www.cs.ox.ac.uk/people/ralf.hinze/publications/ICFP09.pdf and also https://www.sciencedirect.com/science/article/pii/S1571066104809721.
Constants¶
def const(n):
yield n
yield from const(n)
const(1)
<generator object const at 0x7f425c332970>
ones = const(1)
list(take(ones, 10))
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Nats¶
def nats():
yield 0
g = nats() # !!
yield from map(lambda n: n + 1, g)
list(take(nats(), 10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Primes¶
Consider the following functional specification for the naturals that are also primes
primes = filterPrime [2..]
where filterPrime (p:xs) =
p : filterPrime [x | x <- xs, x `mod` p /= 0]
def primes():
def P(numbers):
prime = next(numbers) # get the next prime from the iterator `it`.
yield prime # yield the next prime number
def not_divisible_by_prime(n): # a mnemonic predicate.
q, r = divmod(n, prime)
return r != 0
yield from P(filter(not_divisible_by_prime, numbers)) # `numbers` has been advanced before.
yield from P(itertools.count(2))
list(take(primes(), 20))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
Fibonacci, again¶
Remember,
def fibs(first=0, second=1):
yield first # the first number in the Fibonacci series,
yield second # ... and the second one.
f, ff = itertools.tee(fibs(first, second)) # duplicate the stream of fibonacci numbers.
next(ff) # advance just one of them
yield from map(operator.add, f, ff) # according to the Fibonacci rule, yield all the rest.
list(take(fibs(), 20))
[0,
1,
1,
2,
3,
5,
8,
13,
21,
34,
55,
89,
144,
233,
377,
610,
987,
1597,
2584,
4181]
…and again¶
from sympy import IndexedBase, init_printing # SymPy for symbolic computation
init_printing() # pretty printing math symbols and expressions
x = IndexedBase('x')
x[1] # indexing as done in math.
fibos = list(take(fibs(x[0], x[1]), 20)) # generate an abstract schema
fibos
[expr.subs({x[0]:0, x[1]:1}) for expr in fibos] # Fibonacci numbers, as usual.
[expr.subs({x[0]:2, x[1]:1}) for expr in fibos] # Lucas numbers, less usual.