Exercises
Last updated on 2026-03-31
Estimated time: 50 minutes
Overview
Questions
- How much did I learn over the past two days?
Objectives
- Test your knowledge on these tasks
Sorting Out References
OUTPUT
Hopper Grace
Slicing Strings
A section of an array is called a slice. We can take slices of character strings as well:
PYTHON
element = 'oxygen'
print('first three characters:', element[0:3])
print('last three characters:', element[3:6])
OUTPUT
first three characters: oxy
last three characters: gen
What is the value of element[:4]? What about
element[4:]? Or element[:]?
OUTPUT
oxyg
en
oxygen
Slicing Strings (continued)
What is element[-1]? What is
element[-2]?
OUTPUT
n
e
Slicing Strings (continued)
Given those answers, explain what element[1:-1]
does.
It creates a substring from index 1 up to (but not including) the final index, effectively removing the first and last letters from ‘oxygen’.
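As a quick check of that explanation, the sketch below applies the same slice to 'oxygen'. The variable name middle is introduced here only for illustration.

```python
element = 'oxygen'

# Start at index 1 and stop just before the last character (index -1).
middle = element[1:-1]
print(middle)  # xyge
```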
Slicing Strings (continued)
How can we rewrite the slice for getting the last three characters of
element, so that it works even if we assign a different
string to element? Test your solution with the following
strings: carpentry, clone,
hi.
PYTHON
element = 'oxygen'
print('last three characters:', element[-3:])
element = 'carpentry'
print('last three characters:', element[-3:])
element = 'clone'
print('last three characters:', element[-3:])
element = 'hi'
print('last three characters:', element[-3:])
OUTPUT
last three characters: gen
last three characters: try
last three characters: one
last three characters: hi
Overloading
+ usually means addition, but when used on strings or
lists, it means “concatenate”. Given that, what do you think the
multiplication operator * does on lists? In particular,
what will be the output of the following code?
1. [2, 4, 6, 8, 10, 2, 4, 6, 8, 10]
2. [4, 8, 12, 16, 20]
3. [[2, 4, 6, 8, 10], [2, 4, 6, 8, 10]]
4. [2, 4, 6, 8, 10, 4, 8, 12, 16, 20]
The technical term for this is operator overloading: a
single operator, like + or *, can do different
things depending on what it’s applied to.
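The exercise's own code is not shown here, so the following is only a sketch of the behaviour being asked about. The names counts, joined and repeats are assumptions chosen for illustration. It compares + and * on the same list.

```python
counts = [2, 4, 6, 8, 10]

# On lists, + concatenates two lists into a new one.
joined = counts + counts

# On lists, * repeats the list the given number of times.
repeats = counts * 2

print(joined)
print(repeats)
print(repeats == joined)  # True
```

Multiplying a list by 2 is therefore equivalent to concatenating the list with itself, rather than doubling each element.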
Thin Slices
The expression element[3:3] produces an empty string, i.e., a string that
contains no characters. If data holds our array of patient
data, what does data[3:3, 4:4] produce? What about
data[3:3, :]?
OUTPUT
array([], shape=(0, 0), dtype=float64)
array([], shape=(0, 40), dtype=float64)
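These shapes can be checked without the data file. The sketch below uses an array of zeros as a stand-in for data, assuming only that it has 60 rows and 40 columns. The names empty_both and empty_rows are hypothetical.

```python
import numpy

# Stand-in for the patient data: only the shape (60, 40) matters here.
data = numpy.zeros((60, 40))

# An empty range on both axes selects no rows and no columns.
empty_both = data[3:3, 4:4]

# An empty range on the rows, but every column, keeps the 40 columns.
empty_rows = data[3:3, :]

print(empty_both.shape)  # (0, 0)
print(empty_rows.shape)  # (0, 40)
```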
Stacking Arrays
Arrays can be concatenated and stacked on top of one another, using
NumPy’s vstack and hstack functions for
vertical and horizontal stacking, respectively.
PYTHON
import numpy
A = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print('A = ')
print(A)
B = numpy.hstack([A, A])
print('B = ')
print(B)
C = numpy.vstack([A, A])
print('C = ')
print(C)
OUTPUT
A =
[[1 2 3]
 [4 5 6]
 [7 8 9]]
B =
[[1 2 3 1 2 3]
 [4 5 6 4 5 6]
 [7 8 9 7 8 9]]
C =
[[1 2 3]
 [4 5 6]
 [7 8 9]
 [1 2 3]
 [4 5 6]
 [7 8 9]]
Write some additional code that slices the first and last columns of
A, and stacks them into a 3x2 array. Make sure to
print the results to verify your solution.
A ‘gotcha’ with array indexing is that singleton dimensions are
dropped by default. That means A[:, 0] is a one dimensional
array, which won’t stack as desired. To preserve singleton dimensions,
the index itself can be a slice or array. For example,
A[:, :1] returns a two dimensional array with one singleton
dimension (i.e. a column vector).
OUTPUT
D =
[[1 3]
 [4 6]
 [7 9]]
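One possible solution, sketched under the assumption that A is defined as in the example above: slicing with :1 and -1: keeps each column two dimensional, so the two columns can be stacked horizontally. The names first_column and last_column are introduced here for readability.

```python
import numpy

A = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Using slices (rather than a single index) preserves the singleton
# dimension, so each result has shape (3, 1) instead of (3,).
first_column = A[:, :1]
last_column = A[:, -1:]

D = numpy.hstack([first_column, last_column])
print('D = ')
print(D)
```

An alternative that avoids stacking altogether is to index with a list of column numbers, for example A[:, [0, -1]].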
Change In Inflammation
The patient data is longitudinal in the sense that each row represents a series of observations relating to one individual. This means that the change in inflammation over time is a meaningful concept. Let’s find out how to calculate changes in the data contained in an array with NumPy.
The numpy.diff() function takes an array and returns the
differences between two successive values. Let’s use it to examine the
changes each day across the first week of patient 3 from our
inflammation dataset.
OUTPUT
[0. 0. 2. 0. 4. 2. 2.]
Calling numpy.diff(patient3_week1) would subtract each value
from the one that follows it
and return the 6 difference values in a new array.
OUTPUT
array([ 0., 2., -2., 4., -2., 0.])
Note that the array of differences is shorter by one element (length 6).
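The calculation can be reproduced without the data file. In this sketch, patient3_week1 is built by hand from the seven values printed above rather than sliced from the dataset, and changes is a hypothetical name for the result.

```python
import numpy

# Values copied from the printed first week of patient 3 above.
patient3_week1 = numpy.array([0., 0., 2., 0., 4., 2., 2.])

# Each result is a later value minus the value just before it:
# 0 - 0, 2 - 0, 0 - 2, 4 - 0, 2 - 4, 2 - 2
changes = numpy.diff(patient3_week1)

print(changes)
print(len(patient3_week1), len(changes))  # 7 6
```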
When calling numpy.diff with a multi-dimensional array,
an axis argument may be passed to the function to specify
which axis to process. When applying numpy.diff to our 2D
inflammation array data, which axis would we specify?
Change In Inflammation (continued)
If the shape of an individual data file is (60, 40) (60
rows and 40 columns), what would the shape of the array be after you run
the diff() function and why?
The shape will be (60, 39) because there is one fewer
difference between columns than there are columns in the data.
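The shape can be confirmed with a placeholder array. This sketch assumes, as described above, that rows are patients and columns are days, so the differences are taken along axis=1. The zeros array only stands in for a real data file, and daily_changes is a hypothetical name.

```python
import numpy

# Placeholder with the same shape as one data file: 60 rows, 40 columns.
data = numpy.zeros((60, 40))

# Differences between successive columns (days) within each row (patient).
daily_changes = numpy.diff(data, axis=1)

print(daily_changes.shape)  # (60, 39)
```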
Change In Inflammation (continued)
How would you find the largest change in inflammation for each patient? Does it matter if the change in inflammation is an increase or a decrease?
By using the numpy.amax() function after you apply the
numpy.diff() function, you will get the largest difference
between days.
PYTHON
array([ 7., 12., 11., 10., 11., 13., 10., 8., 10., 10., 7.,
7., 13., 7., 10., 10., 8., 10., 9., 10., 13., 7.,
12., 9., 12., 11., 10., 10., 7., 10., 11., 10., 8.,
11., 12., 10., 9., 10., 13., 10., 7., 7., 10., 13.,
12., 8., 8., 10., 10., 9., 8., 13., 10., 7., 10.,
8., 12., 10., 7., 12.])
If inflammation values decrease along an axis, then the
difference from one element to the next will be negative. If you are
interested in the magnitude of the change and not the
direction, the numpy.absolute() function will provide
that.
Notice the difference if you get the largest absolute difference between readings.
PYTHON
array([ 12., 14., 11., 13., 11., 13., 10., 12., 10., 10., 10.,
12., 13., 10., 11., 10., 12., 13., 9., 10., 13., 9.,
12., 9., 12., 11., 10., 13., 9., 13., 11., 11., 8.,
11., 12., 13., 9., 10., 13., 11., 11., 13., 11., 13.,
13., 10., 9., 10., 10., 9., 9., 13., 10., 9., 10.,
11., 13., 10., 10., 12.])
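The two results above can be compared on a small example. The sketch below uses a tiny made-up array in place of the inflammation data, so the numbers are illustrative only. The names changes, largest_increase and largest_magnitude are assumptions.

```python
import numpy

# Made-up data: 2 patients (rows) observed on 4 days (columns).
data = numpy.array([[0., 3., 1., 2.],
                    [1., 1., 5., 0.]])

# Day-to-day change for each patient.
changes = numpy.diff(data, axis=1)

# Largest signed change per patient: decreases can never win here.
largest_increase = numpy.amax(changes, axis=1)

# Largest change per patient regardless of direction.
largest_magnitude = numpy.amax(numpy.absolute(changes), axis=1)

print(largest_increase)   # [3. 4.]
print(largest_magnitude)  # [3. 5.]
```

For the second patient the biggest change is a drop of 5, which only shows up once the absolute value is taken.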
From 1 to N
Python has a built-in function called range that
generates a sequence of numbers. range can accept 1, 2, or
3 parameters.
- If one parameter is given, range generates a sequence of that length, starting at zero and incrementing by 1. For example, range(3) produces the numbers 0, 1, 2.
- If two parameters are given, range starts at the first and ends just before the second, incrementing by one. For example, range(2, 5) produces 2, 3, 4.
- If range is given 3 parameters, it starts at the first one, ends just before the second one, and increments by the third one. For example, range(3, 10, 2) produces 3, 5, 7, 9.
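The three forms described above can be checked directly. In this sketch each range is converted to a list so that its contents can be printed. The variable names are chosen here for illustration.

```python
# One parameter: the length of the sequence, starting at zero.
one_argument = list(range(3))

# Two parameters: start, and the value to stop just before.
two_arguments = list(range(2, 5))

# Three parameters: start, stop (exclusive), and step.
three_arguments = list(range(3, 10, 2))

print(one_argument)     # [0, 1, 2]
print(two_arguments)    # [2, 3, 4]
print(three_arguments)  # [3, 5, 7, 9]
```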
Using range, write a loop that prints the first 3
natural numbers:
The body of the loop is executed 6 times.
Summing a list
Write a loop that calculates the sum of elements in a list by adding
each element and printing the final value, so
[124, 402, 36] prints 562
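One way to approach this, sketched with the list from the prompt: start an accumulator at zero and add each element in turn. The names numbers and total are assumptions.

```python
numbers = [124, 402, 36]

# Accumulator that holds the running sum.
total = 0
for number in numbers:
    total = total + number

print(total)  # 562
```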
Computing the Value of a Polynomial
The built-in function enumerate takes a sequence (e.g. a
list) and generates a new sequence of the
same length. Each element of the new sequence is a pair composed of the
index (0, 1, 2,…) and the value from the original sequence:
The code above loops through a_list, assigning the index
to idx and the value to val.
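The loop described here might look like the following sketch. The contents of a_list are made up for illustration, and pairs is a hypothetical list used only to record what the loop sees.

```python
a_list = ['a', 'b', 'c']  # hypothetical example values

pairs = []
for idx, val in enumerate(a_list):
    # idx is the position (0, 1, 2, ...) and val is the element there.
    pairs.append((idx, val))
    print(idx, val)
```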
Suppose you have encoded a polynomial as a list of coefficients in the following way: the first element is the constant term, the second element is the coefficient of the linear term, the third is the coefficient of the quadratic term, where the polynomial is of the form \(ax^0 + bx^1 + cx^2\).
OUTPUT
97
Write a loop using enumerate(coefs) which computes the
value y of any polynomial, given x and
coefs.
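A minimal sketch of one possible solution follows. The values x = 5 and coefs = [2, 4, 3] are assumptions chosen for illustration. With the encoding described above they give 2 + 4*5 + 3*25, which matches the 97 shown in the output.

```python
x = 5              # assumed value, chosen for illustration
coefs = [2, 4, 3]  # assumed coefficients: constant, linear, quadratic

y = 0
for idx, coef in enumerate(coefs):
    # The index of each coefficient is also the power of x it multiplies.
    y = y + coef * x**idx

print(y)  # 97
```

Because the loop uses the index as the exponent, the same code works for a polynomial of any degree.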
Plot Scaling
Why do all of our plots stop just short of the upper end of our graph?
Because matplotlib normally sets the x and y axis limits to the minimum and maximum of our data (depending on the data range).
Drawing Straight Lines
In the center and right subplots above, we expect all lines to look like step functions because non-integer values are not realistic for the minimum and maximum values. However, you can see that the lines are not always vertical or horizontal, and in particular the step function in the subplot on the right looks slanted. Why is this?
Because matplotlib interpolates (draws a straight line) between the
points. One way to avoid this is to use the Matplotlib
drawstyle option:
PYTHON
import numpy
import matplotlib.pyplot
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))
axes1 = fig.add_subplot(1, 3, 1)
axes2 = fig.add_subplot(1, 3, 2)
axes3 = fig.add_subplot(1, 3, 3)
axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0), drawstyle='steps-mid')
axes2.set_ylabel('max')
axes2.plot(numpy.amax(data, axis=0), drawstyle='steps-mid')
axes3.set_ylabel('min')
axes3.plot(numpy.amin(data, axis=0), drawstyle='steps-mid')
fig.tight_layout()
matplotlib.pyplot.show()
Make Your Own Plot
Create a plot showing the standard deviation (numpy.std)
of the inflammation data for each day across all patients.
Moving Plots Around
Modify the program to display the three plots on top of one another instead of side by side.
PYTHON
import numpy
import matplotlib.pyplot
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')
# change figsize (swap width and height)
fig = matplotlib.pyplot.figure(figsize=(3.0, 10.0))
# change add_subplot (swap first two parameters)
axes1 = fig.add_subplot(3, 1, 1)
axes2 = fig.add_subplot(3, 1, 2)
axes3 = fig.add_subplot(3, 1, 3)
axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0))
axes2.set_ylabel('max')
axes2.plot(numpy.amax(data, axis=0))
axes3.set_ylabel('min')
axes3.plot(numpy.amin(data, axis=0))
fig.tight_layout()
matplotlib.pyplot.show()
Mixing Default and Non-Default Parameters
Given the following code:
PYTHON
def numbers(one, two=2, three, four=4):
    n = str(one) + str(two) + str(three) + str(four)
    return n

print(numbers(1, three=3))
What do you expect will be printed? What is actually printed? What rule do you think Python is following?
1. 1234
2. one2three4
3. 1239
4. SyntaxError
Given that, what does the following piece of code display when run?
1. a: b: 3 c: 6
2. a: -1 b: 3 c: 6
3. a: -1 b: 2 c: 6
4. a: b: -1 c: 2
Attempting to define the numbers function results in
option 4, SyntaxError. The parameters two and
four are given default values. Because one and
three are not given default values, they must be
supplied as arguments when the function is called, and they must be placed
before any parameters that have default values in the function
definition.
The given call to func displays
a: -1 b: 2 c: 6. -1 is assigned to the first parameter
a, 2 is assigned to the next parameter b, and
c is not passed a value, so it uses its default value
6.
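Both answers can be tried out with the sketch below. The reordered numbers definition is one way to satisfy the rule described above. The func shown here is a hypothetical reconstruction, with defaults b=3 and c=6 inferred from the answer options, and it returns its text rather than printing it so the result is easy to inspect.

```python
def numbers(one, three, two=2, four=4):
    # Parameters without defaults now come before those with defaults,
    # so this definition is accepted by Python.
    n = str(one) + str(two) + str(three) + str(four)
    return n


def func(a, b=3, c=6):
    # Hypothetical stand-in for the exercise's func; the defaults are
    # inferred from the answer options, not taken from the original code.
    return 'a: ' + str(a) + ' b: ' + str(b) + ' c: ' + str(c)


print(numbers(1, three=3))  # 1234
print(func(-1, 2))          # a: -1 b: 2 c: 6
```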
Readable Code
Revise a function you wrote for one of the previous exercises to try to make the code more readable. Then, collaborate with one of your neighbors to critique each other’s functions and discuss how your function implementations could be further improved to make them more readable.
Return versus print
Note that return and print are not
interchangeable. print is a Python function that
prints data to the screen. It enables us, as users,
to see the data. The return statement, on the other hand, makes
data visible to the program. Let’s have a look at the following
function:
Question: What will we see if we execute the following commands?
Python will first execute the function add with
a = 7 and b = 3, and, therefore, print
10. However, because the function add does not
have a line that starts with return (no return
“statement”), it will, by default, return nothing, which in Python
is represented as None. Therefore, None
will be assigned to A, and the last line
(print(A)) will print None. As a result, we
will see:
OUTPUT
10
None
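The behaviour described above can be reproduced with the sketch below. The add function is reconstructed from the description (it prints the sum and has no return statement). add_and_return and B are hypothetical additions included for contrast.

```python
def add(a, b):
    # Prints the sum, but has no return statement.
    print(a + b)


def add_and_return(a, b):
    # Hypothetical variant: hands the sum back to the caller instead.
    return a + b


A = add(7, 3)             # prints 10
print(A)                  # None

B = add_and_return(7, 3)  # prints nothing
print(B)                  # 10
```

Only the second version makes the result available for further calculations, such as B * 2.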
Selecting Characters From Strings
If the variable s refers to a string, then
s[0] is the string’s first character and s[-1]
is its last. Write a function called outer that returns a
string made up of just the first and last characters of its input. A
call to your function should look like this:
OUTPUT
hm
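A minimal sketch of one possible solution follows. The input 'helium' is an assumption, chosen because its first and last characters give the hm shown above. The parameter name input_string is also hypothetical.

```python
def outer(input_string):
    # Index 0 is the first character and index -1 is the last.
    return input_string[0] + input_string[-1]


print(outer('helium'))  # hm
```

Note that this sketch raises an IndexError for an empty string, since there is no first or last character to select.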
Rescaling an Array
Write a function rescale that takes an array as input
and returns a corresponding array of values scaled to lie in the range
0.0 to 1.0. (Hint: If L and H are the lowest
and highest values in the original array, then the replacement for a
value v should be (v-L) / (H-L).)
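Before wrapping the hint in a function, it may help to apply the formula to a concrete array. The values below are made up for illustration, and scaled is a hypothetical name. Turning this into rescale is left as the exercise.

```python
import numpy

values = numpy.array([2.0, 4.0, 6.0, 10.0])  # made-up example data

L = numpy.amin(values)  # lowest value: 2.0
H = numpy.amax(values)  # highest value: 10.0

# Apply (v - L) / (H - L) to every element at once.
scaled = (values - L) / (H - L)

print(scaled)  # the values 0.0, 0.25, 0.5 and 1.0
```

The formula divides by H - L, so an array whose values are all equal would need separate handling.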
Testing and Documenting Your Function
Run the commands help(numpy.arange) and
help(numpy.linspace) to see how to use these functions to
generate regularly-spaced values, then use those values to test your
rescale function. Once you’ve successfully tested your
function, add a docstring that explains what it does.
PYTHON
import numpy

def rescale(input_array):
    """Takes an array as input, and returns a corresponding array scaled so
    that 0 corresponds to the minimum and 1 to the maximum value of the input array.

    Examples:
    >>> rescale(numpy.arange(10.0))
    array([ 0. , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
    0.55555556, 0.66666667, 0.77777778, 0.88888889, 1. ])
    >>> rescale(numpy.linspace(0, 100, 5))
    array([ 0. , 0.25, 0.5 , 0.75, 1. ])
    """
    L = numpy.amin(input_array)
    H = numpy.amax(input_array)
    output_array = (input_array - L) / (H - L)
    return output_array
Defining Defaults
Rewrite the rescale function so that it scales data to
lie between 0.0 and 1.0 by default, but will
allow the caller to specify lower and upper bounds if they want. Compare
your implementation to your neighbor’s: do the two functions always
behave the same way?
PYTHON
def rescale(input_array, low_val=0.0, high_val=1.0):
    """rescales input array values to lie between low_val and high_val"""
    L = numpy.amin(input_array)
    H = numpy.amax(input_array)
    intermed_array = (input_array - L) / (H - L)
    output_array = intermed_array * (high_val - low_val) + low_val
    return output_array
Identifying Syntax Errors
- Read the code below, and (without running it) try to identify what the errors are.
- Run the code, and read the error message. Is it a SyntaxError or an IndentationError?
- Fix the error.
- Repeat steps 2 and 3, until you have fixed all the errors.
Identifying Variable Name Errors
- Read the code below, and (without running it) try to identify what the errors are.
- Run the code, and read the error message. What type of NameError do you think this is? In other words, is it a string with no quotes, a misspelled variable, or a variable that should have been defined but was not?
- Fix the error.
- Repeat steps 2 and 3, until you have fixed all the errors.
There are 3 NameErrors: number is misspelled,
message is not defined, and a is not
in quotes.
Fixed version:
- Practice makes perfect.