September 14, 2009

List comprehensions and generators: Python and haskell for beginers

List comprehensions and generators are features I always liked from Python. I'll try to write a post I would love to read some time ago when I first met these wonders...

A comprehension list is a way to obtaining a lis in a "descriptive fashion", for example: the list of the first ten powers of 2:


[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]


Could be obtained from the list comprehension:


[2**n for n in range(0,11)]


Which could be read as: "2 to the n-th, for n in the range from 0 to 11". EIn this case range() is a number generator.

What does being a generator mean? It simply returns an iterator and it will produce numbers as long as we ask for them (there's a more performant function to do this, called xrange(), with better memory usage).

We would like to build, for example a generator that returns the numbers in the fibonacci sequence. Let's see which Python provides to achieve this.

The next code counts. Forever. It returns one number every time we ask for it:



def count():
x = 0
while True:
yield x
x = x + 1



The yield operator is powerful enough to build a generator. It returns the control to the caller of the generator function. Everytime the generator is invoked again (with the next() method it continues the execution in the line next to the last executed yield, avoiding any infinite loop as the while True may suggest.

In a Python console we can give it a try:


>>> a = count()
>>> a.next()
0
>>> a.next()
1
>>> a.next()
2
>>> a.next()
3
>>> a.next()
4


Now we do the same with the ever-useful sequence (the first two numbers are hard-coded since they are the base cases of the recursive definition of the sequence)




def fib():
i = 0
first = 0
yield first
second = 1
yield second
while True:
next = first + second
first = second
second = next
yield next



Now great, everything is very ince but where do these list comprehension and generator stuff come from? Well, list comprehensions are a feature that python "borrowed" from another language...Haskell, from the functional pradigm.

How do we do this same thing we've done in Python with Haskell? Let's start with the list comprehensions:


Main> [2^n| n<-[0..10]]
[1,2,4,8,16,32,64,128,256,512,1024]


Which can be read as: "The 2-to-the-nth that come from taking n from the list that goes from 0 to 10", since Haskell talks the mathematician language, it doesn't have problems like Python's range(n,m), which goes from n to (m-1)...

Let's continue. Do you notice anything strange in the Haskell line? A generator!, to be more precise... two! Why?


Main> [0..10]
[0,1,2,3,4,5,6,7,8,9,10]


Generates the sequence that goes from 0 to 10 while n <-[0..10] in the context of the comprehension list "takes n's" from that list.

Something important to say about generators. Python and Haskell allow generator chaining, try these codes in both Python and Haskell (respectively):


Main> [ (x,y)| x<-[0..5], y<-[11..15] ]



[(x,y) for x in range(0,6) for y in range(11,16)]


They generate the same!

Speed test


Anybody reading this is ready to go on reading by his own about generator and comprehension list wonders... you can find some links at the end of the post.
Bu I kept playing and tried some things... For example, have a look at these definitions in Haskell to generate the fibonacci sequence:



fibo 0 = 0
fibo 1 = 1
fibo n = fib (n-1) + fib (n-2)


fib :: Int -> Integer
fib n = fibs !! n
where
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)



fibo is the recursive version, SLOW. Every time I ask for the n-th numer it recalculates the (n-1)-th and (n-2)-th and for each of these goes on like that... clearly taking exponential complexity.

En cambio fib is a version that makes use of the super-great functionalities present in functional languages: the folds, high order functions. This last version has lineal complexity, which means, it can compete with the Python version with no (prior) disadvantage (in algorithmic means). Let's do this!

The contestants :


fib :: Int -> Integer
fib n = fibs !! n
where
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)



VS.



def fib():
i = 0
first = 0
yield first
second = 1
yield second
while True:
next = first + second
first = second
second = next
yield next



Try this.... my results in the next post.
Hoe you enjoyed this post!

Links:
Este link Pyton comparison with haskell from python docs.
http://www.haskell.org/haskellwiki/List_comprehension
http://www.haskell.org/haskellwiki/Fold
http://www.zvon.org/other/haskell/Outputprelude/zipWith_f.html
http://docs.python.org/tutorial/datastructures.html#list-comprehensions
http://docs.python.org/tutorial/classes.html#generators