Exercises

You can start a live Jupyter session in the cloud directly from this book by clicking the launch button at the top of this page.

You have a few options to choose from:

Launch on Binder: By selecting this option, you can instantly launch a live Jupyter session using Binder.
Launch on Google Colab: This option allows you to launch a live Jupyter session using Google Colab.

Alternatively, you can also click on the Jupyter Lite session link, which will open a new tab where you can freely write and run your code.

Jupyter Lite session

You can start a Jupyter Lite session here.

Wait until the message “You may begin!” is printed.

Exercise 6.2.1#

Complete the function below so that it uses a for loop to collect the type of every item stored in the series my_series.

Type your code where the three (...) dots are placed. Do not change the name of the variables.

import pandas as pd

my_list = ['begin', 2, 3/4, "end"]
my_series = pd.Series(data=my_list)

def list_types(series):
    series_types = "Types inside series:\n"
    for ...
        item_type = ...
        series_types += str(item_type) + '\n'
    
    return series_types
        
print(list_types(my_series))

Check your answer!

To check your answer in a Jupyter Lite session, simply run the following line of code immediately after your code implementation.

If you are in Google Colab, just run the cell below.

check.notebook_6(question_number=0, arguments=[list_types])

(Searching) Exercise 6.3.1#

Complete the code below to print the total number of ’NaN’ values in mineral_properties.

Change only the (...) below; do not add any other lines to this cell.

import numpy as np
import pandas as pd

file_location = ("https://raw.githubusercontent.com/TUDelft-CITG/"
                "learn-python/mike/book/06/Theory/")
mineral_properties = pd.read_csv(file_location + 'mineral_properties.txt', 
                  skiprows=1,skipinitialspace=True)
mineral_properties['new_column'] = np.nan

def count_nans(df):
    nan_total = ...
    return nan_total

print(mineral_properties.head(3))
print(f'\ntotal amount of nans = {count_nans(mineral_properties)}')

Check your answer!

To check your answer in a Jupyter Lite session, simply run the following line of code immediately after your code implementation.

If you are in Google Colab, just run the cell below.

check.notebook_6(question_number=1, arguments=[count_nans, mineral_properties])

(Searching/Fixing) Exercise 6.3.3#

A geologist wrote a line of code to count how many of the listed minerals have a hardness greater than or equal to \(3\). However, something is not correct in his code… could you fix it? There is a syntax error as well as a semantic error.

def count_minerals(df, minimal_hardness):
    amount_of_minerals = len(df(df.hardness.gt(minimal_hardness)))
    return amount_of_minerals
             
print(mineral_properties.head(3))
print(f'Amount of minerals with hardness >= 3: {count_minerals(mineral_properties, 3)}')

Check your answer!

To check your answer in a Jupyter Lite session, simply run the following line of code immediately after your code implementation.

If you are in Google Colab, just run the cell below.

check.notebook_6(question_number=2, arguments=[count_minerals, mineral_properties])

The above exercise might look like a silly one, but take some time to analyze it carefully… what is happening there? Break down each part of that line of code to understand it properly.
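One way to take such an expression apart is to give every intermediate step its own name and inspect it. The sketch below does this on a small made-up table; the `rocks` DataFrame and its values are hypothetical and are not the mineral_properties data used in the exercise.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
rocks = pd.DataFrame({
    "name": ["talc", "calcite", "quartz", "diamond"],
    "hardness": [1, 3, 7, 10],
})

# Step 1: a comparison on a column returns a boolean Series (a "mask").
mask_gt = rocks.hardness.gt(3)   # strictly greater than 3
mask_ge = rocks.hardness.ge(3)   # greater than or equal to 3

# Step 2: square brackets select the rows where the mask is True.
# A DataFrame cannot be called with parentheses.
selected = rocks[mask_ge]

# Step 3: len() counts the selected rows.
count_gt = len(rocks[mask_gt])
count_ge = len(selected)

print(mask_ge)
print(selected)
print(count_gt, count_ge)
```

Printing each intermediate object makes it easier to see where the type or the comparison differs from what was intended.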

(Searching) Exercise 6.5.1#

A geologist is interested in the tallest mountain chains around Earth. For that, he created the tallest_mountains.csv table, containing information on all mountains above \(8000\) meters. Your task is to do the following \(5\) assignments.

1. Read the tallest_mountains.csv file.
2. What are the names of the columns?
3. What is the height of the tallest mountain?
4. What is the row number (or index) of this mountain?
5. What is the name of the tallest mountain in the dataset?

Use pandas functions to answer the questions.
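As a reminder of the kind of pandas calls that are useful here, the sketch below applies them to a small made-up table. The CSV text, the `rivers` name, and the column names are all hypothetical; the real file has its own columns, and `pd.read_csv` also accepts a file path or URL instead of an in-memory buffer.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file.
csv_text = "Name,Length_km\nriver_a,350\nriver_b,1230\nriver_c,925\n"
rivers = pd.read_csv(io.StringIO(csv_text))

column_names = list(rivers.columns)             # names of the columns
longest_length = rivers["Length_km"].max()      # largest value in a column
longest_index = rivers["Length_km"].idxmax()    # index label of that value
longest_name = rivers.loc[longest_index, "Name"]  # look up another column in that row

print(column_names, longest_length, longest_index, longest_name)
```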

Source for this and next exercises: Wikipedia

Write your code here. Do not change any variable names.

file_location = ("https://raw.githubusercontent.com/TUDelft-CITG/"
                "learn-python/mike/book/06/Exercises/")

mountains_8000 = ... #1 
cols = ... #2 
max_height = ... #3
index_max = ... #4 
tallest_mountain = ... #5 

print(mountains_8000.head())

Check your answer!

To check your answer in a Jupyter Lite session, simply run the following line of code immediately after your code implementation.

If you are in Google Colab, just run the cell below.

check.notebook_6(question_number=3, arguments=[mountains_8000, cols, max_height, index_max, tallest_mountain])

(Searching) Exercise 6.5.2#

Now, our geologist friend got another table with all mountains above \(7000\) meters, stored in the mountains_above_7000m.csv file. Your task is to do the following \(6\) assignments.

1. Read the mountains_above_7000m.csv file.
2. Append mountains_8000 to it.
3. Remove the column describing the mountain range they belong to.
4. Fix the row-indexing issue.
5. Create a Series indicating which entries of ‘Feet’ are missing. Hint: use the .isnull() function.
6. Fill in the values that are missing in the ‘Feet’ column, using \(1\) m = \(3.28084\) feet. Hint: use the .mask() function or the .fillna() function.
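The steps above follow a common pattern: stack two tables, drop a column, renumber the rows, and fill gaps in one column from another. The sketch below walks through that pattern on made-up data; the `short_trips` and `long_trips` tables and their columns are hypothetical. It assumes a recent pandas version, where `pd.concat` is used for appending because `DataFrame.append` was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd

# Hypothetical toy tables, only for illustration.
short_trips = pd.DataFrame({"Route": ["a", "b"], "Region": ["north", "south"],
                            "Km": [10.0, 20.0], "Miles": [6.21371, np.nan]})
long_trips = pd.DataFrame({"Route": ["c"], "Region": ["east"],
                           "Km": [100.0], "Miles": [np.nan]})

# Stacking keeps the original row labels, so the index becomes 0, 1, 0.
combined = pd.concat([short_trips, long_trips])

# Drop a column by name, then renumber the rows from 0.
no_region = combined.drop(columns="Region")
renumbered = no_region.reset_index(drop=True)

# Boolean Series: True where a value is missing.
missing_miles = renumbered["Miles"].isnull()

# Fill only the gaps, using another column and a conversion factor.
km_to_miles = 0.621371
renumbered["Miles"] = renumbered["Miles"].fillna(renumbered["Km"] * km_to_miles)
# Alternative with the same effect on the original column:
# renumbered["Miles"].mask(missing_miles, renumbered["Km"] * km_to_miles)

print(renumbered)
```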

Write your code for tasks 1, 2 and 3 in the cell below. Do not change any variable names.

file_location = ("https://raw.githubusercontent.com/TUDelft-CITG/"
                "learn-python/mike/book/06/Exercises/")

mountains_7000 = ... # 1
df_concat = ... # 2
df_concat_norange = ... # 3

print(df_concat_norange)

Write your code for the remaining tasks in the cell below. Do not change any variable names.

df_reset = ... # 4
missing_feet_series = ... # 5
df_reset["Feet"] = ... # 6
print(df_reset) 

Check your answer!

To check your answer in a Jupyter Lite session, simply run the following line of code immediately after your code implementation.

If you are in Google Colab, just run the cell below.

check.notebook_6(question_number=4, arguments=[df_reset])

(Searching) Exercise 6.5.3#

Now that our geologist friend has all this information, he wants to know how many of these mountains are claimed by China. Can you help him out?

Hint: you need to find out which elements in the appropriate column contain the string ‘China’.
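Substring matching on a text column can be sketched as follows. The `peaks` table, its column names, and the country names are made up for illustration, and the form of the result that the exercise checker expects is not assumed here.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
peaks = pd.DataFrame({
    "Peak": ["p1", "p2", "p3", "p4"],
    "Country": ["Freedonia", "Freedonia/Sylvania", "Sylvania", None],
})

# .str.contains returns a boolean Series; na=False treats missing entries as no match.
in_freedonia = peaks["Country"].str.contains("Freedonia", na=False)

freedonia_peaks = peaks[in_freedonia]         # rows that match
number_of_matches = int(in_freedonia.sum())   # True counts as 1

print(freedonia_peaks)
print(number_of_matches)
```

Note that a row listing several countries in one cell still matches, because the test looks for the substring anywhere in the text.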

Type your code where the three (...) dots are placed. Do not change the name of the variables.

china_mountains = ...

print(china_mountains)

Check your answer!

To check your answer in a Jupyter Lite session, simply run the following line of code immediately after your code implementation.

If you are in Google Colab, just run the cell below.

check.notebook_6(question_number=5, arguments=[china_mountains])
