3. Advanced strings and functions, files and debugging.#

It is common for the first cell of a Notebook to contain all the imports needed.

We start by importing the constant pi from the math module, together with the os module; we will use both later.

from math import pi 
import os 

3.1 Advanced Strings#

Welcome to the third Notebook. In this Notebook we are going to learn some advanced Python. Let’s first start with strings. Run the code below and see what it prints out.

MyFString = f"This is an F-String"
MyString = "This is a string"
print(MyFString)
print(MyString)
This is an F-String
This is a string

Now let’s try inserting some data into our print() function. We’ll use the list of integers [1,2,3,4].

Data = [1,2,3,4]

MyFString = f"Data1: {Data[0]}, Data2: {Data[1]}, Data3: {Data[2]}, Data4: {Data[3]}"

print(MyFString)
print("Data1:",Data[0],",Data2:",Data[1],",Data3:",Data[2],",Data4:",Data[3])
Data1: 1, Data2: 2, Data3: 3, Data4: 4
Data1: 1 ,Data2: 2 ,Data3: 3 ,Data4: 4

As you can see from the above code, it is much easier to insert variables in a string by using an f-string (formatted string).

Formatting numbers#

Using f-strings makes formatting numbers really easy. Just add a colon character after a number value and specify how you want to format the number. The following table demonstrates a couple of examples with the number \(1\):

Code      Result
1:.2f     1.00
1:.0f     1
1:.10f    1.0000000000
1:%       100.000000%
1:.1%     100.0%
1:e       1.000000e+00

As you can see, the default number of decimal places is six. Furthermore, the % format specifier treats \(1\) as \(100\)%, which is convenient when working with fractions, and the e specifier formats the number using scientific notation.
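
To see these format specifiers on a less trivial number, the short sketch below applies them to pi from the math module. The variable name fraction and its value are only illustrative.

```python
from math import pi

# Fixed-point notation with a chosen number of decimals
print(f"{pi:.2f}")    # 3.14
print(f"{pi:.0f}")    # 3
print(f"{pi:e}")      # 3.141593e+00

# The % specifier multiplies by 100 and appends a percent sign
fraction = 0.256
print(f"{fraction:.1%}")   # 25.6%

# A width before the dot pads the number, which helps to align columns
print(f"[{pi:10.3f}]")     # [     3.142]
```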

Now let’s use our newfound knowledge of strings to make a simple progress bar. During other courses, you’ll sometimes have to write algorithms that take a long time to run. In this case, it is useful to have a progress bar. Our example of a progress bar makes use of the sleep() function, from the time module, to simulate elapsed time.

import time

for i in range(11):
    print(f"Loading: {i*10}%")
    time.sleep(0.5) 
Loading: 0%
Loading: 10%
Loading: 20%
Loading: 30%
Loading: 40%
Loading: 50%
Loading: 60%
Loading: 70%
Loading: 80%
Loading: 90%
Loading: 100%

This works, though it is not that pretty to look at. It would look nicer if it did not print a new line each time. This is where escape characters come in. These characters can do some special things in strings. Below are some examples of escape characters:

Escape characters

Code    Result
\'      '
\\      \
\n      new line
\r      carriage return
\t      tab
\b      backspace

We can use some of these characters in our code. Let’s use the carriage return character to stop our progress bar from printing a new line every time. We can do this by adding end="\r" to our print() call. The end keyword specifies a string that gets printed after the message (by default, a new line). Here that string is the carriage return character, which moves the cursor back to the start of the line, so the next print() call overwrites the line that was just printed. Try it and see what happens:

print("Will I get overwritten?", end="\r")
print("This is a very important message")
This is a very important message

Now let’s add this to our progress bar…

import time
for i in range(11):
    print(f"Loading: {i*10}%", end="\r")
    time.sleep(0.5) 
print("Loading complete!")
Loading complete!

As you can see, it works beautifully!
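
The same idea can be taken one step further by drawing an actual bar. The cell below is only a sketch: the helper render_bar, its width parameter and the characters # and - are choices made for this illustration, not part of any library.

```python
import time

def render_bar(step, total, width=20):
    """Return a text bar such as '[#####---------------]  25%'."""
    filled = int(width * step / total)
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {step / total:4.0%}"

total_steps = 10
for step in range(total_steps + 1):
    print(render_bar(step, total_steps), end="\r")
    time.sleep(0.05)  # stands in for real work
print()  # move to a new line so the finished bar stays visible
```

Because every string returned by render_bar has the same length, each new bar fully overwrites the previous one.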

3.2 Advanced Functions#

Sometimes you want to use the same code multiple times, so you could embed this code into a function. However, sometimes the code you want to use is so short that putting it into a function feels a bit over the top. This is where lambda functions are useful.

Lambda functions are functions that can take any number of arguments but can only have one expression in their function body. To demonstrate, see the code below. Here we have two functions that do exactly the same thing, but one is a lambda function and the other one is a normal function.

sqrt_lambda = lambda x : x**0.5

def sqrt(x):
    sqrt = x**0.5
    return sqrt

print(f"The square root of 16 is equal to {sqrt_lambda(16):.0f}")
print(f"The square root of 16 is equal to {sqrt(16):.0f}")
The square root of 16 is equal to 4
The square root of 16 is equal to 4

As you can see, the lambda version is much more concise. It automatically returns the computed value for you as well.
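
Two more uses of lambda functions are sketched below: a lambda with several arguments, and a lambda passed directly to another function (here the key argument of the built-in sorted()). The names hypotenuse and buildings, and the data, are made up for this example.

```python
# A lambda can take several arguments, separated by commas
hypotenuse = lambda a, b: (a**2 + b**2) ** 0.5
print(hypotenuse(3, 4))   # 5.0

# Lambdas are often passed directly to another function.
# Here each (name, height) pair is sorted by its second element.
buildings = [("tower", 95.0), ("bridge", 32.5), ("dam", 60.0)]
by_height = sorted(buildings, key=lambda pair: pair[1])
print(by_height)
```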

3.3 Working with files#

A lot of the work you’ll do in Python will have the following structure:

  1. Read data from a file

  2. Perform computations on the data

  3. Visualize the results and/or save the results to a file

So far, we have only learned about computations, so let’s learn a bit about how to manage files. Opening or saving files is usually done with the help of modules, which you will learn about in more detail in Notebooks 4 and 6. What we’ll discuss here is how to manage file paths.

File paths#

To learn how to use files we need to learn how file paths in computers work. If you are tech-savvy and know how file paths work you can skip this part.

File paths in computers work like a tree. They start at the root directory, which on Windows is often the C: drive. This is the name of the hard drive that stores your Operating System. From the C: drive you can navigate into other directories. On Windows this is done using the \ character, whereas most other Operating Systems use the / delimiter.

For the folder Users, which is stored in the C: drive, the path would be C:\Users. These types of file paths are called absolute paths. This file path is valid for most computers that run Windows, but some other Operating Systems may have different folder setups. This is why it is useful to use relative paths. Relative paths do not start from the root directory. Instead, they start from the directory you are currently in. By default, Jupyter Notebooks are stored in C:\Users\CurrentUser (where CurrentUser is your Windows username). To move into a directory using a relative path, for example, to the desktop folder, you would just write .\Desktop. To move back a directory, using a relative path, you would type .. (two dots).

os.listdir() or os.listdir('./') lists all the entries in your current directory, while os.listdir('../') lists all the entries one level up.

Note

We use / as the delimiter, since \ does not work as a path delimiter on macOS or Linux.

import os

print(os.listdir())
print(os.listdir('./'))

print(os.listdir('../'))
['01.ipynb']
['01.ipynb']
['Exercises', 'In_a_Nutshell', 'Theory']
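
To connect relative and absolute paths, the sketch below asks the os module where the current directory is and converts the relative paths . and .. into absolute ones. The output depends on your computer, so none is shown for those lines; the relative path ./Desktop is only used as an example string and does not need to exist.

```python
import os

current = os.getcwd()                  # absolute path of the current directory
print(current)

# '.' is the current directory and '..' is its parent
print(os.path.abspath('.'))
print(os.path.abspath('..'))

# os.path.isabs tells absolute and relative paths apart
print(os.path.isabs(current))          # True
print(os.path.isabs('./Desktop'))      # False
```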

Warning

Keep in mind that, in Python, file paths are passed as strings (or as Path objects from the pathlib module)!

pathlib and os modules#

These modules are very useful for managing and navigating your file paths. The function os.path.expanduser('~'), from the os module, returns your home directory, independent of your Operating System. Try the cell below to see it.

from pathlib import Path
import os

root_path = os.path.expanduser('~')
print(root_path)
C:\Users\mmendozalugo

The path shown above is thus the absolute path to your home directory (stored here in the variable root_path).

This can come in handy when you write code that needs to create directories on the user’s computer to save data files and/or plots. As an example, the code below checks if a directory exists and, if it doesn’t, creates it.

os.path.join is used to join path components into a single path string with the appropriate delimiter.

Here, the code checks whether a directory named plots exists in root_path and, if not, creates it. Because of exist_ok=True, no error is raised when the directory already exists.

print('Contents of current directory (before):')
print(os.listdir(root_path))

imdir = os.path.join(root_path,'plots') 
print(f'\nimdir = {imdir}')

Path(imdir).mkdir(parents=True, exist_ok=True)

print('\nContents of current directory (after creating the new directory):')
print(os.listdir(root_path))
Contents of current directory (before):
['Exercises', 'In_a_Nutshell', 'Theory']
imdir =  C:\Users\mmendozalugo\plots

Contents of current directory (after creating the new directory):
['Exercises', 'In_a_Nutshell', 'plots', 'Theory']

To delete the folder that was just created, we run the code below.

try:
    os.rmdir(imdir)
    print(f'Directory {imdir} has been deleted.')
except OSError:
    print('You already deleted the folder. :)')
Directory C:\Users\mmendozalugo\plots has been deleted.
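
The same create-and-delete steps can also be written with pathlib alone. The cell below is a sketch under one assumption: it works inside a temporary directory (from the standard tempfile module) instead of your home directory, so that nothing on your computer is changed. The name plots_dir is only illustrative.

```python
import tempfile
from pathlib import Path

# Work inside a temporary directory so nothing in your home folder is touched
with tempfile.TemporaryDirectory() as tmp:
    plots_dir = Path(tmp) / "plots"      # the / operator joins path parts
    print(plots_dir.exists())            # False

    plots_dir.mkdir(parents=True, exist_ok=True)
    created = plots_dir.is_dir()
    print(created)                       # True

    plots_dir.rmdir()                    # only works on an empty directory
    removed = not plots_dir.exists()
    print(removed)                       # True
```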

Now you are, hopefully, a bit more used to working with file paths. For the next test, we are going to try to open a file. We can use some built-in Python functions to open a *.txt file and print its contents.
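
As a minimal sketch of that idea, the cell below uses the built-in open() function. The file name notes.txt and its contents are made up for illustration, and the file is first written to a temporary directory so that the example runs on any computer.

```python
import os
import tempfile

# Hypothetical example file, created in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "notes.txt")

    # 'w' opens the file for writing; the with-block closes it automatically
    with open(file_path, "w", encoding="utf-8") as f:
        f.write("first line\nsecond line\n")

    # 'r' opens the file for reading
    with open(file_path, "r", encoding="utf-8") as f:
        contents = f.read()

print(contents)
```

Using a with-block is a common choice here, because the file is closed again even if an error occurs while reading or writing.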

3.4 Debugging#

It is very easy (and common) to make mistakes when programming. We call these errors bugs. Finding these bugs in your program and resolving them is what we call debugging.

According to Think Python, Appendix A, there are three different types of errors:

1. Syntax errors#

“In computer science, the syntax of a computer language is the set of rules that defines the combinations of symbols that are considered to be correctly structured statements or expressions in that language.”

Therefore, a syntax error is an error that does not obey the rules of the programming language. For example, parentheses always come in pairs, so (1+2) is OK, but 1+2) is not. Below is another example of a syntax error. As you will see, this error is caught by the interpreter before the code is run (hence, the print statements do not result in anything being printed).

For example, if I try to raise 2 to the 3rd power using the wrong syntax, it causes a syntax error.

print('Message before')
2***3
print('Message after')
  Cell In[12], line 4
    2***3
       ^
SyntaxError: invalid syntax
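
One way to convince yourself that the syntax check happens before anything runs is sketched below. It hands the same three lines, as a string, to the built-in compile() function, which only parses code without executing it. The names source and syntax_ok, and the label "<example>", are chosen for this illustration.

```python
source = "print('Message before')\n2***3\nprint('Message after')"

try:
    # compile() only parses the code; it does not run it
    compile(source, "<example>", "exec")
    syntax_ok = True
except SyntaxError as err:
    syntax_ok = False
    print(f"Syntax error on line {err.lineno}: {err.msg}")

print(syntax_ok)   # False, and 'Message before' was never printed
```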

2. Runtime errors#

“The second type of error is a runtime error. This type of error does not appear until after the program has started running. These errors are also called exceptions, as they usually indicate that something exceptional (and bad) has happened.”

Below is an example of a small script that expresses fractions as decimals and causes a runtime error. The error appears because you cannot divide by 0.

numerators = [1, 7, 5, 12, -1]
denominators = [6, 8, -1, 0, 5]
fractions = []

for i in range(len(numerators)):
    fractions.append(numerators[i] / denominators[i])
    print(f'New fraction was added from {numerators[i]} '
          f'and {denominators[i]}!\n It is equal to {fractions[i]:.3f}')

New fraction was added from 1 and 6!
 It is equal to 0.167
New fraction was added from 7 and 8!
 It is equal to 0.875
New fraction was added from 5 and -1!
 It is equal to -5.000
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[1], line 6
      3 fractions = []
      5 for i in range(len(numerators)):
----> 6     fractions.append(numerators[i] / denominators[i])
      7     print(f'New fraction was added from {numerators[i]} '
      8           f'and {denominators[i]}!\n It is equal to {fractions[i]:.3f}')

ZeroDivisionError: division by zero
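
If you expect that such an exception can occur, you can catch it with try and except so that the rest of the loop still runs. The cell below is one possible sketch; storing None for an undefined fraction is an assumption made for this example, not the only reasonable choice.

```python
numerators = [1, 7, 5, 12, -1]
denominators = [6, 8, -1, 0, 5]
fractions = []

for num, den in zip(numerators, denominators):
    try:
        fractions.append(num / den)
    except ZeroDivisionError:
        # Assumption: an undefined fraction is stored as None
        fractions.append(None)
        print(f"Skipped {num}/{den}: division by zero")

print(fractions)
```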

3. Semantic errors#

According to the Oxford Dictionary, ‘semantic’ is an adjective relating to meaning. Therefore, a ‘semantic error’ is an error in the meaning of your code. Your code will still run without giving any error back, but it will not result in what you expected (or desired). For that reason, semantic errors are the hardest to identify. Below is an example:

I want to raise 2 to the 3rd power. However, I use the ^ operator, which does not mean “to the power of” in Python (that would be ** or pow()).

No error message is created, because ^ is a valid Python operator: the bitwise XOR. However, this results in an output I neither expected nor desired.

power_of_2 = 2^3
print(f'2 to the 3rd power is {power_of_2}')
2 to the 3rd power is 1
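
To make the difference explicit, the short sketch below compares exponentiation with the ^ operator, and prints the numbers in binary to show why 2^3 gives 1.

```python
print(2 ** 3)      # 8, exponentiation
print(pow(2, 3))   # 8, the built-in pow() does the same
print(2 ^ 3)       # 1, bitwise XOR

# XOR compares the binary digits: 10 XOR 11 = 01
print(f"{2:02b} ^ {3:02b} = {2 ^ 3:02b}")   # 10 ^ 11 = 01
```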

Debugging strategies#

There are a few ways to debug a program. A simple one is to track your values using print statements. By printing the values of the variables at intermediate steps, we can find where the program does something unwanted. For example, consider the code block below:

A = [0, 1, 2, 3]

def sumA(my_list):
    """Return the sum of all the values in a given list."""
    my_sum = 0
    i = 0
    while i < len(my_list):
        my_sum = my_list[i]
        i += 1
    return my_sum

print('The sum of the elements of the list A is {}.'.format(sumA(A)))
The sum of the elements of the list A is 3.

We see that our sumA() function outputs \(3\), which isn’t the sum of the contents of the list \(A\). By adding a print(my_sum) inside the loop we can get a clearer understanding of what goes wrong.

def sumA(my_list):
    """Return the sum of all the values in a given list."""
    my_sum = 0
    i = 0
    while i < len(my_list):
        my_sum = my_list[i]
        print('var my_sum[{}] = {}'.format(i,my_sum))
        i += 1
    return my_sum

print('The sum of the elements of the list A is {}.'.format(sumA(A)))
var my_sum[0] = 0
var my_sum[1] = 1
var my_sum[2] = 2
var my_sum[3] = 3
The sum of the elements of the list A is 3.

It looks like the function is just stating the values of the list \(A\), but not adding them, so we must have forgotten to add something. Below is the fixed version of that function, which uses += so that each value is added to the running total.

def sumA_fixed(my_list):
    """Return the sum of all the values in a given list."""
    my_sum = 0
    i = 0
    # Loop over the argument my_list, not the global list A
    while i < len(my_list):
        my_sum += my_list[i]
        print('var my_sum[{}] = {}'.format(i,my_sum))
        i += 1
    return my_sum

print('The sum of the elements of the list A is {}.'.format(sumA_fixed(A)))
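
Another simple strategy is to compare your function against a result you already trust. The sketch below uses assert statements to check a summing function against Python's built-in sum() on a few small lists. The function name sum_list and the test lists are chosen for this illustration.

```python
def sum_list(my_list):
    """Return the sum of all the values in a given list."""
    my_sum = 0
    for value in my_list:
        my_sum += value
    return my_sum

# Check the function against Python's built-in sum() on a few test lists
for test_list in ([0, 1, 2, 3], [], [-5, 5], [2.5, 2.5]):
    assert sum_list(test_list) == sum(test_list), f"wrong sum for {test_list}"

print(sum_list([0, 1, 2, 3]))   # 6
```

If one of the checks fails, Python stops with an AssertionError that shows the message after the comma, which points you directly at the input that went wrong.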

Additional study material#

After this Notebook you should be able to:

  • print a variable, formatting it in an appropriate manner

  • know the existence of escape characters

  • know how to use lambda functions

  • understand how file paths work

  • create and delete new directories

  • know the three different types of errors

  • have a plan when debugging your code