Lesson A7 – Iterations

Programs often need to perform a defined task many, many times. In this part, we will learn how to write code that repeats itself a number of times without literally being written out multiple times.

Let’s demonstrate this with an example, which we will explain in more detail later, using a dictionary of solvents and boiling points as in the container lesson.

[1]:
solvents_dict = {
    "water": 100,
    "ethanol": 78,
    "N,N-dimethylformamide": 153,
    "dichloromethane": 40,
    }

We could print each entry in the dictionary with a separate line:

[2]:
# Imagine doing this for hundreds of entries ...
# Or imagine you do not know all the dictionary keys beforehand ...
# Or imagine the keys are changing over time ...
print(f"The boiling point of water is {solvents_dict['water']} °C")
print(f"The boiling point of ethanol is {solvents_dict['ethanol']} °C")
print(f"The boiling point of N,N-dimethylformamide is {solvents_dict['N,N-dimethylformamide']} °C")
print(f"The boiling point of dichloromethane is {solvents_dict['dichloromethane']} °C")
The boiling point of water is 100 °C
The boiling point of ethanol is 78 °C
The boiling point of N,N-dimethylformamide is 153 °C
The boiling point of dichloromethane is 40 °C

But it would be far less tedious and error-prone if we used a self-repeating code block for this:

[3]:
# With a loop you can write:
for k, v in solvents_dict.items():
    print(f"The boiling point of {k} is {v} °C")

# You need to write much less code
# It works for arbitrarily many entries
# You do not need to know the entries beforehand
The boiling point of water is 100 °C
The boiling point of ethanol is 78 °C
The boiling point of N,N-dimethylformamide is 153 °C
The boiling point of dichloromethane is 40 °C

For-loops

One way to implement a repeating code block is to use a for-loop. For-loops are meant to be used whenever you want to do something a pre-defined number of times. In Python, for-loops are tied to iterables. An iterable is an object you can iterate over. Collections such as lists are examples of iterables. We iterate over a container with the pattern: for element in iterable.

[4]:
for i in [1, 2, 3]:  # <- colon after the for statement
    # loop over elements
    print(i)
1
2
3

Note: The code block after the for statement, which is the part to be repeated, is indented by 4 spaces. As we saw for if-else statements, indentation is also important in this new context to mark code as subordinate to the loop.

When the interpreter encounters the for directive, it takes the first element of the following iterable, assigns it to the control variable i (which does not need to be defined before), and executes the indented code block. After the code block has ended, the interpreter jumps back to the beginning of the loop and assigns the next element of the iterable to i, and so on until the end of the iterable is reached.

Advanced: If you have some familiarity with other programming languages, this might seem strange to you, as for-loops are often implemented differently. Just as a side note, the following example does not work in Python:

>>> i = 0  # initialise control variable
... for i < 2:  # check if control variable meets condition
...     print(i)  # do something
...     i += 1  # modify control variable

File "<ipython-input-15-a9edde6964ac>", line 2
    for i < 2:  # check if control variable meets condition
          ^
SyntaxError: invalid syntax
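
For comparison, here is a minimal sketch of how the same intent (visiting the values 0 and 1) could be written in Python by iterating over the values directly. The list `[0, 1]` and the helper list `seen` are illustrative assumptions, not part of the lesson's own code:

```python
seen = []  # hypothetical helper list, only used to record the visited values

for i in [0, 1]:  # the loop itself assigns each element to i in turn
    print(i)
    seen.append(i)
```

No explicit initialisation, condition check, or increment of `i` is needed, because the iterable already determines which values are visited.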

Use for-loops in Python as iterations over an iterable with the in operator. Do not set and modify the control variable explicitly. If you are iterating and you do not use the control variable, try to make this clear by using a single underscore _.

[5]:
for _ in [1, 1, 1]:
    # loop over elements but do something unrelated
    print("Hello!")
Hello!
Hello!
Hello!

As another rule of thumb, do not modify the iterable you are looping over while the loop is running.

[6]:
l = [1, 2, 3]
for i in l:
    if i < 3:
        l.append(4)  # Don't!
    print(i)
1
2
3
4
4
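
If the list really has to grow based on its own elements, two possible ways around the problem are sketched below: looping over a copy, or collecting the result in a separate list. The variable names (`numbers`, `original`, `extended`) are only chosen for illustration:

```python
numbers = [1, 2, 3]

# Option 1: iterate over a shallow copy, so that appending to the
# original list does not change what the loop visits.
for i in numbers.copy():
    if i < 3:
        numbers.append(4)
print(numbers)  # [1, 2, 3, 4, 4]

# Option 2: leave the original untouched and collect the result separately.
original = [1, 2, 3]
extended = list(original)
for i in original:
    if i < 3:
        extended.append(4)
print(extended)  # [1, 2, 3, 4, 4]
```

In both variants the loop runs exactly three times, once per element that was present when the loop started.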

You can use other types of collections and objects you know for iterations.

[7]:
string = "Hello, World!"
for i in string:
    print(i.upper(), end="-")
H-E-L-L-O-,- -W-O-R-L-D-!-

Note that you can also use collections that are not sequences for this.

[8]:
set_ = {1, 2, 3}
for i in set_:
    print(i ** 2)
1
4
9

Especially interesting is the use of dictionaries as iterables. Consider once more our example from the introduction.

[9]:
solvents_dict = {
    "water": 100,
    "ethanol": 78,
    "N,N-dimethylformamide": 153,
    "dichloromethane": 40,
    }

for i in solvents_dict:
    print(i)
water
ethanol
N,N-dimethylformamide
dichloromethane

Looping over a dictionary is the same as looping over its keys. You can of course loop over the values, or keys and values at the same time instead.

[10]:
print("Loop over keys:")
for k in solvents_dict.keys():
    print(f"    Key: {k}")

print("\nLoop over values:")
for v in solvents_dict.values():
    print(f"    Value: {v}")

print("\nLoop over items:")
for k, v in solvents_dict.items():
    print(f"    Key: {k}, Value: {v}")
Loop over keys:
    Key: water
    Key: ethanol
    Key: N,N-dimethylformamide
    Key: dichloromethane

Loop over values:
    Value: 100
    Value: 78
    Value: 153
    Value: 40

Loop over items:
    Key: water, Value: 100
    Key: ethanol, Value: 78
    Key: N,N-dimethylformamide, Value: 153
    Key: dichloromethane, Value: 40
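
As a small application of looping over items, the following sketch searches the dictionary for the solvent with the highest boiling point. The names `highest_name` and `highest_bp` are introduced here only for illustration:

```python
solvents_dict = {
    "water": 100,
    "ethanol": 78,
    "N,N-dimethylformamide": 153,
    "dichloromethane": 40,
    }

highest_name = None
highest_bp = None
for name, bp in solvents_dict.items():
    # Keep the entry if it is the first one or beats the current maximum
    if highest_bp is None or bp > highest_bp:
        highest_name, highest_bp = name, bp

print(f"{highest_name} has the highest boiling point ({highest_bp} °C)")
```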

Break and continue

Two Python directives associated with loops come in handy when you want to tune the looping behaviour. The continue directive skips the rest of the current iteration when it is encountered. The break directive aborts the current loop completely. Both are commonly used conditionally.

[11]:
for k, v in solvents_dict.items():
    if k == "water":
        continue
    if v == 153:
        break
    print(f"    Key: {k}, Value: {v}")
    Key: ethanol, Value: 78

Also important in this regard is the else statement. Used at the end of a loop, it introduces a code block that is only executed if the loop ends normally without being interrupted by a break.

[12]:
for i in range(10):
    if i == 10:
        print("Aborting the loop")
        break
else:
    print("`i` was never 10")
`i` was never 10
[13]:
for i in range(10):
    if i == 3:
        print("Aborting the loop")
        break
else:
    print("`i` was never 3")
Aborting the loop

Do you see how important indentation is here? Indenting the else block by one more level, so that it belongs to the if statement instead of the for-loop, completely changes the code!

[14]:
for i in range(10):
    if i == 3:
        print("Aborting the loop")
        break
    else:
        print("`i` was never 3")
`i` was never 3
`i` was never 3
`i` was never 3
Aborting the loop
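
A typical use of the loop-level else is a search that may or may not succeed. The sketch below looks for a solvent boiling above a threshold; the value of `threshold` and the variable `message` are assumptions made for this illustration:

```python
solvents_dict = {
    "water": 100,
    "ethanol": 78,
    "N,N-dimethylformamide": 153,
    "dichloromethane": 40,
    }

threshold = 200  # hypothetical limit in °C

for name, bp in solvents_dict.items():
    if bp > threshold:
        message = f"{name} boils above {threshold} °C"
        break  # a match was found, the else block is skipped
else:
    # Only reached if the loop finished without hitting break
    message = f"No solvent boils above {threshold} °C"

print(message)
```

With a threshold of 200 no entry matches, so the else block runs. Lowering the threshold to, say, 90 would trigger the break on the first matching entry instead.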

range() and enumerate()

Iterations over collections are nice, but what if you do not have a suitable collection at your disposal? Imagine you wanted to repeat a task a hundred times. It would be tedious to first create, let’s say, a list with a hundred elements manually for this. This is what range() is for. This function gives you an iterable of integers following the pattern range(start, stop, step).

[15]:
for i in range(10):
    # loop over ten elements
    # start = 0  # inclusive
    # stop = 10  # exclusive
    # step = 1
    print(i)
0
1
2
3
4
5
6
7
8
9
[16]:
for i in range(10, 101, 10):
    print(i, end=" - ")
10 - 20 - 30 - 40 - 50 - 60 - 70 - 80 - 90 - 100 -
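
The step may also be negative, in which case range() counts downwards. A short sketch (the variable names are illustrative):

```python
# Count down from 5 to 1; the stop value 0 is exclusive
countdown = list(range(5, 0, -1))
print(countdown)  # [5, 4, 3, 2, 1]

# With the default step of 1, a start above the stop yields nothing
empty = list(range(5, 0))
print(empty)  # []
```

The conversion with list() is only used here to make the elements visible; in a for-loop the range can be used directly.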

Another very useful tool is the enumerate() function. Use it if you have a collection at hand for an iteration over its elements but at the same time want to use the index of the iteration. It wraps your iterable with a counter and gives you an iterable of (counter, element) tuples. Sounds difficult? It is not:

[17]:
solvent_list = ["water", "ethanol", "dichloromethane"]

print("Use enumerate starting at 0:")
for c, i in enumerate(solvent_list):
    print(f"    Index: {c} -- Element: {i}")
Use enumerate starting at 0:
    Index: 0 -- Element: water
    Index: 1 -- Element: ethanol
    Index: 2 -- Element: dichloromethane
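
The counter does not have to start at 0. enumerate() accepts an optional second argument, start, as sketched below. The list `lines` is a hypothetical helper that just collects the printed text:

```python
solvent_list = ["water", "ethanol", "dichloromethane"]

lines = []  # hypothetical helper to collect the formatted lines
print("Use enumerate starting at 1:")
for c, i in enumerate(solvent_list, start=1):
    line = f"    Number: {c} -- Element: {i}"
    lines.append(line)
    print(line)
```

Note that the counter is then no longer a valid list index for the corresponding element, since list indices still begin at 0.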

While-loop

Another way we could realise a repeating code block is to use a while directive. Imagine you want to do a specific task more than once, but you do not exactly know how many times. You know, however, that you want to do it as long as a certain condition is met (or not met).

Let’s look at this with an example. Suppose, a potential energy function \(E(x)\) is not given as an analytic expression, but only as energy values at discrete values of \(x\) in the interval \([-10, 10]\).

[18]:
x = list(range(-10, 11))
E = [xi**2 - 2 for xi in x]  # use a separate name for the loop variable
print("x: ", x)
print("E: ", E)
# E(-10) = 98, E(-9)=79 etc.
x:  [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
E:  [98, 79, 62, 47, 34, 23, 14, 7, 2, -1, -2, -1, 2, 7, 14, 23, 34, 47, 62, 79, 98]

Presuming that the energy is positive at the lowest value of \(x\), we want to go through the list of energies to find out if the energy crosses the zero-line. If this is the case, we want to print the two \(x\) values between which the crossing happens.

[19]:
i = 0  # start at the beginning of the list

# Check whether the potential energy of the current x value
# is greater than zero
while E[i] > 0:
    # If this is the case,
    # print the potential energy of the current x-value
    print(f"E({x[i]}) = {E[i]}")

    # and move the index to the next x-value.
    i += 1

    # If you reach the end of the list, stop iterating
    if i == len(x):
        break

# Print the result
if i != len(x):
    print(f"E(x) crosses the zero line between x = {x[i-1]} and x = {x[i]}.")
else:
    print("E(x) does not cross the zero-line.")
E(-10) = 98
E(-9) = 79
E(-8) = 62
E(-7) = 47
E(-6) = 34
E(-5) = 23
E(-4) = 14
E(-3) = 7
E(-2) = 2
E(x) crosses the zero line between x = -2 and x = -1.
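
The while-loop above stops at the first crossing. If all crossings are of interest, a for-loop over neighbouring pairs is one possible alternative. The sketch below assumes that a sign change between two neighbouring energies (a negative product) marks a crossing; energies that are exactly zero would not be detected by this criterion. The name `crossings` is illustrative:

```python
x = list(range(-10, 11))
E = [xi**2 - 2 for xi in x]

crossings = []
for i in range(len(E) - 1):
    # A negative product means the two neighbouring energies differ in sign
    if E[i] * E[i + 1] < 0:
        crossings.append((x[i], x[i + 1]))

for left, right in crossings:
    print(f"E(x) crosses the zero line between x = {left} and x = {right}.")
```

Because the number of pairs is known in advance, a for-loop fits this variant better than a while-loop.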

As we mentioned, while-loops are intended for situations where you want to loop for an undefined number of times. A frequent pattern that is seen in this context is:

[20]:
l = [2, 1, 3]

element = 0
while True:
    print("Loop as long as necessary...")

    if l[element]**2 > 8:
        break
    element += 1
Loop as long as necessary...
Loop as long as necessary...
Loop as long as necessary...

The condition after the while directive (True) is of course always True. This starts a potentially infinite loop and emphasises that we do not intend to loop a certain number of times. It is easy to get stuck in a loop that goes on forever with this pattern. The only way out is the break directive. Once it is encountered, we escape the loop immediately. In our example we go through the elements of a list and check if they exceed 8 when raised to the power of 2. Once we find an element that fulfills the criterion, we stop looping.

Note: Use a while-loop if you want to repeat a code block an indefinite number of times. Take care to prevent infinite loops.
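
The while True pattern shown above relies on the list containing a matching element. If none exists, the index eventually runs past the end of the list and an IndexError is raised. One possible safeguard, sketched here with an illustrative list and a hypothetical variable `found`, is a second exit condition:

```python
l = [2, 1, 2]  # illustrative list: no element exceeds 8 when squared

element = 0
found = None  # hypothetical variable holding the first matching element
while True:
    if element >= len(l):
        break  # end of the list reached without a match

    if l[element]**2 > 8:
        found = l[element]
        break
    element += 1

print(found)  # None
```

Every while True loop should have at least one exit that is guaranteed to be reached eventually.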

Advanced

Here is another while-loop example for illustration.

[21]:
l = [2, 1, 3]  # List of integers

element = 0
while element < (len(l) - 1) and l[element] > l[element + 1]:
    # Condition for loop execution:
    # List element is at most second last and larger than the next

    # Switch elements
    l[element], l[element + 1] = l[element + 1], l[element]
    element += 1

print(l)
[1, 2, 3]

Let this code fragment sink in for a minute. What does it do? First, we create a list of numbers. Then we set a counter or index variable element to 0, so that it points to the first list element.

When the interpreter meets the while directive, it checks whether the following conditions are fulfilled: Is the element index (0) smaller than the length of the list (len(l)) minus 1 (\(0 < 3 - 1 = 2 \rightarrow\) True), and is the list element greater by value than the element with the next higher list index (\(2 > 1 \rightarrow\) True)? If both conditions are met, which is the case in the first run-through, the indented code block is executed.

The pattern a, b = b, a is a concise, idiomatic way to switch two variable values. In our case the two list elements change places, so that the larger element is moved to the right. After the position change is completed, we increase the element index.

After the execution of the indented statements, the interpreter jumps back to the position where the while condition is checked. This time the check evaluates to False, because the list element 2 is not larger than 3. As a consequence, the indented code block is not executed and the interpreter moves on to what comes after it. What we did here is a quite naive approach to bringing the list elements in order.
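
The single while-loop only moves the first element to the right until it meets a larger neighbour, so it does not sort arbitrary lists. As a sketch of how the idea could be extended (this is a simple bubble sort, not something the lesson itself presents), a for-loop pass can be repeated inside a while-loop until no swap is needed anymore. The list `numbers` and the flag `swapped` are illustrative:

```python
numbers = [3, 1, 2, 5, 4]  # illustrative input

swapped = True
while swapped:
    # Assume the list is sorted until a swap proves otherwise
    swapped = False
    for idx in range(len(numbers) - 1):
        if numbers[idx] > numbers[idx + 1]:
            # Move the larger element one position to the right
            numbers[idx], numbers[idx + 1] = numbers[idx + 1], numbers[idx]
            swapped = True

print(numbers)  # [1, 2, 3, 4, 5]
```

In practice the built-in sorted() function or the list method sort() would be the better choice; the sketch only illustrates how for- and while-loops can be combined.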

Advanced

Generators, Iterators and other curious creatures

In the context of containers and iterations we came across the important term iterable. An iterable is anything you can use for element-wise iterations. We know that indexable sequences (like strings or lists) can be iterables, as well as unordered collections (like sets and dictionaries). We also saw that range(), for example, gives us something we can use for iterations. In other words, range() gives us an iterable. Let’s have a closer look at this function.

[22]:
r = range(10)
print(r)
range(0, 10)

So what is this exactly? The call of the range() function gives us something special: A range object.

[23]:
type(r)
[23]:
range
[24]:
isinstance(r, list)  # ?
[24]:
False

A range object is not a list or any other type of collection we know. It is an object of the special type range. The elements we are iterating over are, however, somewhat hidden in this object. We can convert this range into a list to expose them.

[25]:
list(r)
[25]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

A range object also acts as a sequence in the sense that we can request its elements by index.

[26]:
r[1]
[26]:
1

The really special thing about this is that the elements the range object contains are not present (i.e. stored in memory) anywhere before they are used.

Note: The range() function returns a range object which can be used as a special type of iterable.
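
One way to make this visible is to compare the memory footprint of a range with that of a list, as in the sketch below. It uses sys.getsizeof(), which reports the size of the object itself in bytes; the exact numbers depend on the Python version and platform, so none are quoted here. The variable names are illustrative:

```python
import sys

large_range = range(10**9)        # a billion elements, created instantly
small_list = list(range(1000))    # only a thousand elements, all stored

# The range only stores start, stop and step, not its elements
print(sys.getsizeof(large_range))
print(sys.getsizeof(small_list))

# Length and individual elements are still available on request
print(len(large_range))           # 1000000000
print(large_range[123456789])     # 123456789
```

On a typical installation the range with a billion elements reports a far smaller size than the list with a thousand elements.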

Remember to call help() on anything you want to learn more about.

[27]:
help(range)
Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |
 |  Methods defined here:
 |
 |  __bool__(self, /)
 |      self != 0
 |
 |  __contains__(self, key, /)
 |      Return key in self.
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __ge__(self, value, /)
 |      Return self>=value.
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __getitem__(self, key, /)
 |      Return self[key].
 |
 |  __gt__(self, value, /)
 |      Return self>value.
 |
 |  __hash__(self, /)
 |      Return hash(self).
 |
 |  __iter__(self, /)
 |      Implement iter(self).
 |
 |  __le__(self, value, /)
 |      Return self<=value.
 |
 |  __len__(self, /)
 |      Return len(self).
 |
 |  __lt__(self, value, /)
 |      Return self<value.
 |
 |  __ne__(self, value, /)
 |      Return self!=value.
 |
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |
 |  __reduce__(...)
 |      helper for pickle
 |
 |  __repr__(self, /)
 |      Return repr(self).
 |
 |  __reversed__(...)
 |      Return a reverse iterator.
 |
 |  count(...)
 |      rangeobject.count(value) -> integer -- return number of occurrences of value
 |
 |  index(...)
 |      rangeobject.index(value, [start, [stop]]) -> integer -- return index of value.
 |      Raise ValueError if the value is not present.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  start
 |
 |  step
 |
 |  stop

Besides range(), we also used the enumerate() function above. It gives us yet another very special thing.

[28]:
e = enumerate(range(10), 1)
print(e)
<enumerate object at 0x7fed6a34dca8>

So what is this again? As for range, we can see from the print statement that we received an enumerate object from our function call. This enumerate object has some similarity with the range object in that it is an iterable whose elements are somewhat hidden and not really physically stored anywhere until they are used.

[29]:
list(e)
[29]:
[(1, 0),
 (2, 1),
 (3, 2),
 (4, 3),
 (5, 4),
 (6, 5),
 (7, 6),
 (8, 7),
 (9, 8),
 (10, 9)]

On the other hand, the enumerate object differs from the range object: for example, we cannot treat it as a sequence and we cannot use an index on it.

[30]:
e[1]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-4151b458e4a6> in <module>
----> 1 e[1]

TypeError: 'enumerate' object is not subscriptable

But what is this?! Here is another strange thing about this enumerate object. When we try to get its elements as a list again, we get an empty list.

[31]:
list(e)
[31]:
[]

Objects like the enumerate object can only give you each element once. They are designed to do so. After we have received every element, the object is exhausted and remains empty. We can call this object an iterator. For now let’s say iterators are yet another type of iterable and leave it at that. We can request the elements of an iterator one by one using the next() function.

[32]:
e = enumerate(range(10), 1)
while True:
    print(next(e))
(1, 0)
(2, 1)
(3, 2)
(4, 3)
(5, 4)
(6, 5)
(7, 6)
(8, 7)
(9, 8)
(10, 9)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-32-65e26bd8ffc3> in <module>
      1 e = enumerate(range(10), 1)
      2 while True:
----> 3     print(next(e))

StopIteration:

Once an iterator has been exhausted, trying to squeeze another element out of it will raise a StopIteration exception.

Note: The enumerate() function returns an enumerate object which can be used as a special type of iterable. An enumerate object is an iterator that can return its elements one by one only once before it is exhausted.
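
If the exception is not wanted, next() accepts an optional second argument that is returned instead once the iterator is exhausted. The following sketch uses None as this default, which is an assumption that only works if None is not a regular element; the names `e` and `collected` are illustrative:

```python
e = enumerate(["a", "b"], 1)

collected = []
while True:
    item = next(e, None)  # None is returned instead of raising StopIteration
    if item is None:
        break
    collected.append(item)

print(collected)  # [(1, 'a'), (2, 'b')]
```

A for-loop performs this bookkeeping automatically, which is why exhausted iterators normally go unnoticed there.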

In the container lesson you have already met another function that creates an iterator: zip().

[33]:
z = zip([1, 2, 3], ["a", "b", "c"])
print(z)
<zip object at 0x7fed682e6b08>

Functions that create iterators like this are sometimes loosely called generators (strictly speaking, Python reserves the term generator for iterators produced by generator functions and generator expressions). We do not want to dive too deep into this advanced topic, but let us finish with one last very useful thing that you can use in many different situations: generator expressions. With a generator expression we can create our own iterator in basically one line. Such an iterator is a good thing to have if you want to create an iterable for one-time use only that is very memory efficient (it only returns one element at a time). A generator expression closely resembles what we have seen as a for-loop:

[34]:
g = (x for x in "abc")
print(g)
<generator object <genexpr> at 0x7fed68217200>
[35]:
while True:
    print(next(g))
a
b
c
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-35-23b2aec53ec8> in <module>
      1 while True:
----> 2     print(next(g))

StopIteration:

We can also turn a generator expression into a list comprehension by substituting the parentheses with square brackets. The result is then an ordinary list instead of an iterator.

[36]:
gl = [x/10 for x in range(1, 11)]  # Create a range of floats
print(gl)
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
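
Generator expressions are often passed directly to functions that consume an iterable, so that no intermediate list has to be built. A brief sketch with illustrative variable names:

```python
# Sum of squares without creating a list first; the extra
# parentheses can be dropped when the generator is the only argument
total = sum(x**2 for x in range(1, 11))
print(total)  # 385

# The equivalent list comprehension stores all elements before summing
squares = [x**2 for x in range(1, 11)]
print(sum(squares))  # 385
```

Both variants give the same result; the generator expression just avoids holding all squares in memory at the same time.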

Again, we don’t want to discuss this useful but maybe confusing thing here in detail. But should you encounter such an expression any time soon, you know at least what lies before you.