Programming with Python¶

Session Overview¶

What is programming with Python?
Basic math with Python
Types of variables
Importing packages
Numpy
For loops, if statements, and logicals
Creating Functions
Reading in a NetCDF file

What is programming with Python?¶

Python is an open-source programming language and one of the most widely used languages in the world. Why it’s awesome:

Free to use 💸
Extensive ecosystem of libraries 📚
Huge community of users 👭
Tools for reproducible research 🧑‍🔬

Often, code is written in a text editor and then run in a command-line interface. Jupyter Notebooks 📓 allow us to write and run code within a single document, and to combine explanatory text with code.

Visual Studio Code (VSCode) is an easy-to-use development environment with extensions for every major programming language. We will be using VSCode for this workshop.

Basic Mathematical Operations 📝¶

Operation	Operator	Example	Value
Addition	`+`	`2 + 3`	`5`
Subtraction	`-`	`2 - 3`	`-1`
Multiplication	`*`	`2 * 3`	`6`
Division	`/`	`7 / 3`	`2.33333`
Modulus	`%`	`7 % 3`	`1`
Exponentiation	`**`	`2 ** 0.5`	`1.41421`
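
As a quick check of the table, the sketch below evaluates each operator once. The variable names are only illustrative and are not used elsewhere in the lesson.

```python
# One example per operator from the table above
addition = 2 + 3            # 5
subtraction = 2 - 3         # -1
multiplication = 2 * 3      # 6
division = 7 / 3            # about 2.33333 (division always returns a float)
modulus = 7 % 3             # 1, the remainder of 7 divided by 3
exponentiation = 2 ** 0.5   # about 1.41421, the square root of 2

print(addition, subtraction, multiplication)
print(round(division, 5), modulus, round(exponentiation, 5))
```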

We will enter our expressions in code cells. Hit shift + enter or press the “Run” button to execute the code in the cell.

[7]:

-15 + 23.42

[7]:

8.420000000000002

[8]:

8 ** 3

[8]:

512

Python follows the standard order of operations (PEMDAS) ✏️

[9]:

(2 + 3 + 4) / 3

[9]:

3.0
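
To see why the parentheses matter, here is a small sketch that compares the same numbers with and without them. The variable names are made up for illustration.

```python
# Without parentheses, the division happens before the additions
no_parens = 2 + 3 + 4 / 3        # 2 + 3 + 1.333...
with_parens = (2 + 3 + 4) / 3    # 9 / 3

# Exponentiation is applied before a leading minus sign
neg_square = -3 ** 2             # same as -(3 ** 2)
paren_square = (-3) ** 2

print(no_parens, with_parens)
print(neg_square, paren_square)
```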

Variables¶

A variable is a place to store a value or object, so it can be referred to later in our code. To define a variable, we use an assignment statement of the form name = expression, for example zebra = 23 - 14.

In this example, zebra is bound to 9 (the value), not 23 - 14 (the expression)
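
A minimal sketch of that binding, assuming the assignment was zebra = 23 - 14 (the name stripes is hypothetical and only added to show what happens on reassignment):

```python
zebra = 23 - 14        # the right-hand side is evaluated first, then bound to the name
print(zebra)           # the stored value is 9, not the text "23 - 14"

stripes = zebra * 2    # uses the value currently stored in zebra
zebra = zebra + 1      # reassignment binds the name to a new value
print(zebra, stripes)  # stripes keeps the value it was given earlier
```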

Example¶

Before we assign it a value, a variable is undefined.

[10]:

temp_in_c

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[10], line 1
----> 1 temp_in_c

NameError: name 'temp_in_c' is not defined

[ ]:

temp_in_c = 5
temp_in_c

[ ]:

temp_in_f = temp_in_c * 9/5 + 32
temp_in_f

41.0

Any time we use temp_in_f in an expression, 41.0 is substituted for it.

[ ]:

temp_in_f * -4

-164.0

The above expression does not change the value of temp_in_f, because we did not reassign temp_in_f

[ ]:

temp_in_f

41.0

Naming variables¶

Give your variables helpful names so that you and your collaborators know what they refer to

Variable names can contain uppercase and lowercase letters, digits, and underscores
- they cannot start with a digit
- they are case sensitive!
- there is no character limit!
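
The rules above can be tried out directly. This is only a sketch with hypothetical names. It uses the built-in string method isidentifier() to test whether a piece of text would be a legal name.

```python
# Hypothetical variable names, used only to illustrate the rules
temp_C = 20.5
temp_c = 5              # a different variable: names are case sensitive
buoy_2_depth_m = 12.0   # digits are allowed after the first character

print(temp_C, temp_c, buoy_2_depth_m)

# str.isidentifier() reports whether a string is a legal name
print("buoy_2_depth_m".isidentifier())
print("2nd_buoy".isidentifier())   # starts with a digit, so not a legal name
```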

Examples of valid but poor variable names:

[ ]:

six = 15

[ ]:

hours = 60 * 60 * 24 * 365

Examples of assignment statements that are valid and use good variable names:

[ ]:

seconds_per_hour = 60 * 60
hours_per_year = 24 * 365
seconds_per_year = seconds_per_hour * hours_per_year

Variable Types¶

What’s the difference?

[ ]:

4 / 2

2.0

[ ]:

5 - 3

To us, 2.0 and 2 are the same number, but Python treats them as two different types

Two numeric variable types: `int` and `float`¶

int: an integer of any size
float: a number with a decimal point

Integers `int`:¶

If you add, subtract, or multiply ints, the result is another int (exponentiation also gives an int when the exponent is a non-negative int)
int precision is exact (i.e., 5 is exactly 5)

Use type() to check the data type of a value

[ ]:

type(2 ** 300)

int

Floats `float`:¶

Specified using a decimal point
Might be printed using scientific notation

[ ]:

3.2 + 2.5

5.7

[ ]:

type(5.7)

float
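
The difference in precision is easy to see in a short sketch. Ints stay exact however large they get, while floats are stored approximately, which is why a result like 8.420000000000002 appeared earlier. math.isclose is one common way to compare floats, and the variable names here are illustrative.

```python
import math

big_int = 2 ** 300
print(big_int + 1 - big_int)   # exactly 1, even for a huge int

float_sum = 0.1 + 0.2
print(float_sum)               # 0.30000000000000004
print(float_sum == 0.3)        # False, because of rounding in the last digits

# Compare floats with a tolerance instead of ==
print(math.isclose(float_sum, 0.3))
```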

Strings `str` 🧶¶

A string is a snippet of text.

Enclosed by either single quotes (') or double quotes (")
Can be any length

[ ]:

"My string"

'My string'

[ ]:

type("My string")

str

Note: Python automatically determines types

String arithmetic¶

When using the + symbol between strings, the operation is called concatenation

[ ]:

s1 = 'send'
s2 = 'waves'

[ ]:

s1 + s2

'sendwaves'

[ ]:

s1 + ' ' + s2

'send waves'

String functions¶

You can use special functions on strings

Examples: upper, title, replace, but there are many more

[ ]:

s1.upper()

'SEND'

[ ]:

s1.title()

'Send'

[ ]:

s1.replace('s', 'b')

'bend'

We can look at the length of our string

[ ]:

len(s1)

Converting between data types¶

Mixing ints and floats in an expression results in a float
A value can be converted to an int, float, or str
Some strings can be converted to int and float

[ ]:

int(2.0 + 3)

[ ]:

str(3)

'3'
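
A few more conversions, as a sketch of which strings convert and which do not (the variable names are made up for this example):

```python
whole = int("42")            # a string of digits converts to an int
decimal = float("3.5")       # a numeric string converts to a float
truncated = int(7.9)         # float to int drops the fractional part (no rounding)
label = str(3) + " buoys"    # numbers must be converted before concatenation

print(whole, decimal, truncated, label)

# A string that does not look like a number cannot be converted
try:
    int("sixty")
    converted = True
except ValueError:
    converted = False
print("Could 'sixty' be converted?", converted)
```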

A note on built-in Python Functions¶

Functions in Python work the same way mathematical functions do

Input values to functions are called arguments
Calling a function asks the function to execute code on the given arguments

Python comes with a number of built-in functions such as int, float, str, type, and len (which we have already used)

Type ? after a function’s name to see its documentation, or use the help function

[ ]:

str?

Init signature: str(self, /, *args, **kwargs)
Docstring:
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to 'utf-8'.
errors defaults to 'strict'.
Type:           type
Subclasses:     StrEnum, DeferredConfigString, _rstr, LSString, include, Keys, InputMode, ColorDepth, CompleteStyle, FoldedCase, ...

[ ]:

help(len)

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.

Print statements¶

To look at the contents of a variable, we do not have to put the variable name on the last line of a cell

We can use the built-in print function instead

[ ]:

phrase = "world!"
print(phrase)

world!

We can also print statements before printing the variable

[ ]:

print("Hello", phrase)

Hello world!
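
print can also combine text and values. The sketch below shows the sep argument and an f-string, a formatting style that appears again in the activity solution later in this lesson. The values are the ones used earlier, and message is an illustrative name.

```python
phrase = "world!"
temp_in_c = 5

print("Hello", phrase)             # items are separated by a space by default
print("Hello", phrase, sep="-")    # sep changes the separator

# An f-string evaluates whatever is inside the curly braces
message = f"It is {temp_in_c} degrees C, or {temp_in_c * 9/5 + 32} degrees F"
print(message)
```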

Booleans¶

A comparison evaluates to either True or False

The result is called a boolean (bool for short) and can be stored in a variable.

Booleans can only be True or False

Comparison Operators

Symbol	Meaning
`==`	equal to
`!=`	not equal to
`<`	less than
`<=`	less than or equal to
`>`	greater than
`>=`	greater than or equal to

[ ]:

a = (5 == 6)
a

False

[ ]:

type(a)

bool

[ ]:

b = 9 + 10 < 21
b

True
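
Here is a small sketch that runs several of the comparison operators on one value. wave_height is a hypothetical variable. The last two lines show that arithmetic is evaluated before the comparison, which is why 9 + 10 < 21 needs no parentheses.

```python
wave_height = 2.5   # hypothetical value in meters

is_equal = wave_height == 2.5
is_not_equal = wave_height != 2.5
is_greater = wave_height > 2.5
is_at_least = wave_height >= 2.5
print(is_equal, is_not_equal, is_greater, is_at_least)

# Arithmetic happens first, then the comparison
b = 9 + 10 < 21
same_as_b = (9 + 10) < 21
print(b, same_as_b)
```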

Lists¶

A list is used to store multiple values. To create a new list, use [square brackets]

Lists are a sequence of any type of object

[ ]:

temp_list = [38, 33, 40, 34, 26, 23, 34]

[ ]:

type(temp_list)

list

To find the average temperature, we can divide the sum of the temperatures by the number of temperatures recorded, using the built-in functions sum and len:

[ ]:

sum(temp_list) / len(temp_list)

32.57142857142857

A single list can store elements of different types

[ ]:

mixed_temp = [68, 'sixty', 68.9, 62]
mixed_temp

[68, 'sixty', 68.9, 62]

A note on Lists¶

Lists are slow for data processing
To work with datasets, we want to use arrays instead
To gain this additional functionality, we need to import a library
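
Speed is not the only difference. As a sketch (using NumPy, which is introduced in the next section, and a shortened version of temp_list), the same operator can also mean something different for a list and for an array:

```python
import numpy as np

temp_list = [38, 33, 40]
temp_array = np.array(temp_list)

doubled_list = temp_list * 2     # repeats the list: [38, 33, 40, 38, 33, 40]
doubled_array = temp_array * 2   # multiplies every element by 2

print(doubled_list)
print(doubled_array)
```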

Importing Packages¶

Python doesn’t have everything we need built in. Without reinventing the wheel, we can utilize tools already developed

We import packages (AKA libraries) through import statements
Syntax for calling functions: package_name.function()

[ ]:

import numpy as np # numpy is usually imported as np
from numpy import ones_like

As seen above, instead of importing a complete library, we can import just the function we need by using:

from package import function
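
The three import styles can be compared side by side. This sketch uses the standard-library math module rather than NumPy so that it runs without installing anything. The alias m is arbitrary.

```python
import math              # call functions as math.function()
import math as m         # same module under a shorter alias
from math import sqrt    # import one function and call it directly

print(math.sqrt(16))
print(m.sqrt(16))
print(sqrt(16))
```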

Useful Packages:

Package	Purpose
numpy	Numerical operations
matplotlib	Plotting and visualization
netCDF4	Using netCDF files
pandas	Data analysis and manipulation
xarray	Labeled multi-dimensional arrays

Packages have their own associated documentation.

NumPy¶

NumPy provides support for arrays and matrix operations and is the most heavily used math library in Python.

Arrays¶

Arrays are collections of values (similar to lists), but are optimized for numerical computations

Store elements of a single, uniform data type

The simplest way to create an array is to pass a list of numbers as the input to np.array()

[ ]:

my_list = [0., 1., 2., 3., 4., 5.]
my_array = np.array(my_list)
my_array

array([0., 1., 2., 3., 4., 5.])

NumPy arrays have the type numpy.ndarray

[ ]:

type(my_array)

numpy.ndarray

Positions¶

Each element of an array has a position

Python is “0-indexed”

This means that the position of the first element in an array is 0, not 1.

[ ]:

my_array[0]

np.float64(0.0)

A negative index counts backward from the end (e.g., variable[-1] is the last element in the array)

[ ]:

my_array[-1]

np.float64(5.0)

Array-number arithmetic¶

Arrays make it easy to perform the same operation on every element. This is known as broadcasting.

[ ]:

# Increase all temperatures by 3 degrees
my_array + 3

array([3., 4., 5., 6., 7., 8.])

[ ]:

# halve all temperatures
my_array / 2

array([0. , 0.5, 1. , 1.5, 2. , 2.5])

Is my_array changed?

[ ]:

my_array # no!

array([0., 1., 2., 3., 4., 5.])

Slicing Arrays¶

When working with NumPy arrays, slicing allows us to extract a portion of the data using:

array[start:stop] -> starts at start index up to but not including the stop index
array[:stop] -> starts at the beginning (index 0) and goes up to stop -1
array[start:] -> starts at start and goes all the way to the end
array[:] -> gives you the entire array

[ ]:

pizza = np.arange(0, 8) # np.linspace(0, 7, 8) gives the same values, but as floats
print(pizza)
print("Number of pizza slices:", len(pizza))

[0 1 2 3 4 5 6 7]
Number of pizza slices: 8

[ ]:

# I want to eat slices 1-4
print(pizza[1:5])

[1 2 3 4]

[ ]:

# I want to eat up to slice 2
print(pizza[:3])

[0 1 2]

[ ]:

# I want to eat all slices except 0-3
print(pizza[4:])

[4 5 6 7]

[ ]:

# I want the whole pizza
print(pizza[:])

[0 1 2 3 4 5 6 7]

Array Methods¶

NumPy comes with many functions that can be used with arrays

A full list of methods can be found in the NumPy documentation

[ ]:

print("The max of my array is:", np.max(my_array))

The max of my array is: 5.0

We can call NumPy functions on an array with np.function(array) or array.function().

[ ]:

print("The minimum of my array is:", my_array.min())
print("The mean of my array is:", my_array.mean())

The minimum of my array is: 0.0
The mean of my array is: 2.5

np.append()¶

We can add to the end of our array with np.append.

[ ]:

np.append(pizza, 8.)

array([0., 1., 2., 3., 4., 5., 6., 7., 8.])
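
One detail worth checking: np.append returns a new array and leaves the original untouched, just like the arithmetic above. A minimal sketch, where bigger_pizza is a made-up name:

```python
import numpy as np

pizza = np.arange(0, 8)
bigger_pizza = np.append(pizza, 8)   # a new array with one extra element

print(len(pizza), len(bigger_pizza)) # the original still has 8 slices

# To keep the change under the same name, reassign the result
pizza = np.append(pizza, 8)
print(pizza)
```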

Activity¶

Suppose a coastal town is experiencing a storm. On Day 1, it rains 1 mm. Each day after that, the daily rainfall increases by 1 mm. If this continues for 30 days, how much total rain falls in that month in centimeters? Save this value as rain_total.

Hint: Use np.arange and .sum()

[ ]:

rain_total = np.arange(1,31).sum() / 10 # in cm
rain_total

np.float64(46.5)
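
We can double-check this answer without NumPy. The sum 1 + 2 + ... + n equals n(n + 1)/2, so the sketch below compares a plain Python sum with that formula. The variable names are illustrative.

```python
n_days = 30

total_mm_sum = sum(range(1, n_days + 1))       # 1 + 2 + ... + 30
total_mm_formula = n_days * (n_days + 1) // 2  # n(n + 1) / 2

rain_total_cm = total_mm_formula / 10          # 10 mm in 1 cm
print(total_mm_sum, total_mm_formula, rain_total_cm)
```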

If statements and For loops¶

What is an `If` statement?¶

An if statement lets you make decisions in your code

Checks whether a condition is True or False
If True, a block of code is run

If statements check a conditional statement, and they evaluate if it is True. The syntax is:

if (conditional statement):
    code to execute

[ ]:

temp_today = 22

if (temp_today > 20):
    print("It's warm today!")

It's warm today!

Adding `else` and `elif`¶

You can also use elif, short for else if, to check multiple conditions

An else statement catches everything else

[ ]:

temp_today = 19

if temp_today > 20:
    print("It's hot!")
elif temp_today > 18:
    print("It's warm")
else:
    print("it's chilly")

It's warm
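
The conditions are checked from top to bottom, and only the first one that is True runs. The sketch below uses a hypothetical temperature of 25 to show what happens when the broader condition is listed first.

```python
temp_today = 25

# Most specific condition first
if temp_today > 20:
    label = "hot"
elif temp_today > 18:
    label = "warm"
else:
    label = "chilly"

# Broader condition first: the elif branch can never be reached
if temp_today > 18:
    label_misordered = "warm"
elif temp_today > 20:
    label_misordered = "hot"
else:
    label_misordered = "chilly"

print(label, label_misordered)
```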

`for` loops¶

A for loop is used to repeat a block of code a certain number of times (perhaps for iterating over elements in a list, array, or range of numbers). range is commonly used in for loops, and is similar to np.arange().

[ ]:

for i in range(0, 10):  # i is the loop variable, which will range between 0 and 9
    print(i)
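
To see which values range produces, we can turn it into a list. This sketch also shows the optional third argument (the step) and a loop that adds up the values. total is an illustrative name.

```python
print(list(range(0, 10)))     # 0 through 9; the stop value is not included
print(list(range(10)))        # the start defaults to 0
print(list(range(2, 10, 3)))  # start at 2, step by 3

total = 0
for i in range(0, 10):
    total = total + i         # add each loop value to the running total
print(total)
```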

We can also iterate over lists/arrays

[ ]:

colors = ["Red", "Orange", "Yellow", "Green", "Blue", "Purple"]
for c in colors:
    print(c)

Red
Orange
Yellow
Green
Blue
Purple

`if` statements in `for` loops:¶

for loops and if statements can be combined!

[ ]:

temps = np.arange(15.,25.,2)

for t in temps:
    if t > 20:
        print("It's hot", t)
    else:
        print("It's cold", t)

It's cold 15.0
It's cold 17.0
It's cold 19.0
It's hot 21.0
It's hot 23.0

Creating counters¶

A counter is a variable you use to keep track of how many times something happens. The general format is:

Start the counter at 0
Add 1 every time something meets a condition
Shorthand notation for updating variables with add/subtract/multiply is: +=, -=, *=, etc.

Let’s try to find how many days are warm enough to swim without a wetsuit

[ ]:

count_warm = 0

for t in temps:
    if t > 20:
        count_warm += 1

print("Number of warm days: ", count_warm)

Number of warm days:  2
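
The same pattern works for running totals. In this sketch the temperatures are written out as a plain list with the same values as temps, and total_warm and mean_warm are names made up for the example.

```python
temps = [15.0, 17.0, 19.0, 21.0, 23.0]

count_warm = 0     # how many days are warm
total_warm = 0.0   # sum of the warm-day temperatures

for t in temps:
    if t > 20:
        count_warm += 1   # same as count_warm = count_warm + 1
        total_warm += t   # same as total_warm = total_warm + t

mean_warm = total_warm / count_warm
print(count_warm, mean_warm)
```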

Activity¶

Scenario: You are tasked to monitor local buoys. Each buoy records significant wave height in meters. Your task is to:

Print out the height of each wave
Use an if statement to flag dangerous waves (greater than 2.5 meters)
Count how many waves are dangerous

[ ]:

wave_heights = [1.2, 2.7, 3.1, 0.9, 2.0, 2.6, 1.8, 3.5, 2.3]

# 1. Loop through each wave height

# 2. Print the wave height

# 3. If the height is > 2.5, print a warning

# 4. Count how many are dangerous

Solution¶

[ ]:

# Solution

# Wave heights recorded by different buoys (in meters)
wave_heights = [1.2, 2.7, 3.1, 0.9, 2.0, 2.6, 1.8, 3.5, 2.3]

# Initialize counter for dangerous waves
danger_count = 0

# Loop through each wave height
for wave in wave_heights:
    if wave > 2.5:
        print(f"Wave height: {wave} m —  Danger!")
        danger_count += 1  # Increment counter
    else:
        print(f"Wave height: {wave} m")

# Print total number of dangerous waves
print(f"\nTotal dangerous waves: {danger_count}")

Wave height: 1.2 m
Wave height: 2.7 m —  Danger!
Wave height: 3.1 m —  Danger!
Wave height: 0.9 m
Wave height: 2.0 m
Wave height: 2.6 m —  Danger!
Wave height: 1.8 m
Wave height: 3.5 m —  Danger!
Wave height: 2.3 m

Total dangerous waves: 4

Creating Functions¶

Up until this point, we have used existing functions to learn Python

We can also define our own functions

Basic Syntax¶

def function_name(argument):
    # comment
    result = 1 + argument
    return result

def = tells Python we are defining a function
function_name = name of the function (you choose!)
argument = input(s) to the function
comment = text explaining the code
return = sends back a result
variables defined inside a function only exist inside the function; to keep a value, return it and store the result
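
The last point is easiest to see by trying it. In this sketch, double_height is a hypothetical function. The try/except block is only there so the example keeps running after the expected NameError.

```python
def double_height(height_m):
    """Return twice the given wave height (hypothetical helper)."""
    doubled = height_m * 2   # 'doubled' exists only while the function runs
    return doubled

result = double_height(1.5)  # store the returned value to keep it
print(result)

try:
    print(doubled)           # the local name is not visible out here
    local_is_visible = True
except NameError:
    local_is_visible = False
print("Is 'doubled' visible outside the function?", local_is_visible)
```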

[ ]:

def greeting():
    print("Hello!")
    return

greeting()

Hello!

[ ]:

# Convert Celsius to Fahrenheit
def c_to_f(temp_c):
    temp_f = (temp_c * 9/5) + 32
    return temp_f

[ ]:

c_to_f(20)

68.0

[ ]:

# add takes 2 arguments
def add(a,b):
    """ adds two numbers"""
    return a+b
add(2,2)

[ ]:

def pythagorean(a, b):
    '''Computes the hypotenuse length of a right triangle with legs a and b.'''

    c = (a ** 2 + b ** 2) ** 0.5

    return c

[ ]:

pythagorean(3,4)

5.0

Reading In Data: Scripps Pier Temperature¶

The data for this exercise is taken from the Scripps Pier. The data is stored in netCDF format, so we will import some tools from the netCDF4 Python package. In case you aren’t familiar, NetCDF is a file format that stores data together with its metadata. This allows us to have a temperature time series with an associated time, depth, lat, and lon.

[1]:

from netCDF4 import Dataset

# open the dataset in read mode, we will not be editing it
ds = Dataset("python_programming/scripps_pier-2023.nc", mode='r')

print(ds)

<class 'netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): time(125305), maxStrlen64(64)
    variables(dimensions): int64 time(time), float32 temperature(time), float32 conductivity(time), float32 pressure(time), float32 salinity(time), float32 chlorophyll_raw(time), float32 chlorophyll(time), int8 temperature_flagPrimary(time), int8 temperature_flagSecondary(time), int8 conductivity_flagPrimary(time), int8 conductivity_flagSecondary(time), int8 pressure_flagPrimary(time), int8 pressure_flagSecondary(time), int8 salinity_flagPrimary(time), int8 salinity_flagSecondary(time), int8 chlorophyll_flagPrimary(time), int8 chlorophyll_flagSecondary(time), float32 sigmat(time), float32 diagnosticVoltage(time), float32 currentDraw(time), float32 aux1(time), float32 aux3(time), float32 aux4(time), |S1 instrument1(maxStrlen64), |S1 instrument2(maxStrlen64), |S1 platform1(maxStrlen64), |S1 station(maxStrlen64), float32 lat(), float32 lon(), float32 depth(), float64 crs()
    groups:

[ ]:

# Print the available variable names with keys():
print(ds.variables.keys())

dict_keys(['time', 'temperature', 'conductivity', 'pressure', 'salinity', 'chlorophyll_raw', 'chlorophyll', 'temperature_flagPrimary', 'temperature_flagSecondary', 'conductivity_flagPrimary', 'conductivity_flagSecondary', 'pressure_flagPrimary', 'pressure_flagSecondary', 'salinity_flagPrimary', 'salinity_flagSecondary', 'chlorophyll_flagPrimary', 'chlorophyll_flagSecondary', 'sigmat', 'diagnosticVoltage', 'currentDraw', 'aux1', 'aux3', 'aux4', 'instrument1', 'instrument2', 'platform1', 'station', 'lat', 'lon', 'depth', 'crs'])

We can extract the temperature data using its name, temperature, and the [:] operator to get all of the values.

[ ]:

# Extract data
temp_nc = ds.variables['temperature'][:]
temp_nc

masked_array(data=[15.1105, 15.1084, 15.0969, ..., 16.6199, 16.6152,
                   16.6175],
             mask=False,
       fill_value=np.float64(1e+20),
            dtype=float32)

We can see that this has some extra metadata in it, so let’s convert it to a regular NumPy array.

[ ]:

temp = np.array(temp_nc)
temp

array([15.1105, 15.1084, 15.0969, ..., 16.6199, 16.6152, 16.6175],
      shape=(125305,), dtype=float32)
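
A word of caution, sketched with made-up numbers rather than the pier data: np.array() keeps the underlying values and drops the mask. Our file reports mask=False, so nothing is lost here, but for data that does contain masked points the fill values would end up in the plain array. One option in that case is the masked array's filled() method.

```python
import numpy as np

# Hypothetical masked data: the middle value is flagged as missing
temp_masked = np.ma.masked_array([15.1, 1e20, 15.3], mask=[False, True, False])

plain = np.array(temp_masked)          # the mask is dropped, 1e20 stays in the data
filled = temp_masked.filled(np.nan)    # masked entries become NaN instead

print(plain)
print(filled)

mean_masked = temp_masked.mean()       # masked entries are ignored
print(mean_masked)
```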

Convert all temperatures to Fahrenheit and assign the result to a new variable temp_far

Hint: $^\circ F = \left(\frac{9}{5} \times {}^\circ C\right) + 32$

[ ]:

# convert all temperatures to Fahrenheit
temp_far = (9/5) * temp + 32
#or
temp_far = c_to_f(temp)
temp_far

array([59.198902, 59.19512 , 59.17442 , ..., 61.91582 , 61.90736 ,
       61.9115  ], shape=(125305,), dtype=float32)

[ ]:

time_nc = ds.variables['time']
time_nc

<class 'netCDF4.Variable'>
int64 time(time)
    units: minutes since 2023-01-01 00:01:00
    calendar: proleptic_gregorian
unlimited dimensions: time
current shape = (125305,)
filling on, default _FillValue of -9223372036854775806 used

The metadata can be useful! Here it tells us that our units of time are minutes since Jan 1, 2023 at 12:01 AM. Let’s extract the time data, the same way we did for temperature.

[ ]:

# create an array
time = np.array(time_nc[:])
time

array([     0,      4,      8, ..., 504299, 504303, 504307],
      shape=(125305,))
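
Using the units from the metadata, these minute counts can be turned into calendar dates. This is a sketch with the standard-library datetime module and a few values copied from the output above. The names epoch, minutes, and timestamps are illustrative.

```python
from datetime import datetime, timedelta

# Reference time from the units: minutes since 2023-01-01 00:01:00
epoch = datetime(2023, 1, 1, 0, 1, 0)

# A few values from the time array, written as plain Python ints
# (values taken from a NumPy array may need int() before use in timedelta)
minutes = [0, 4, 8, 504307]

timestamps = [epoch + timedelta(minutes=m) for m in minutes]
for ts in timestamps:
    print(ts)
```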

What is the time between data points? (How frequently do we get our temperature measurements?)

[ ]:

dt = time[1] - time[0]

print('Temperature is measured every', dt, 'minutes')

Temperature is measured every 4 minutes

How many data points are there per day?

[ ]:

data_per_hr = 60/dt
data_per_day = data_per_hr * 24
data_per_day

np.float64(360.0)

Print the first hour’s worth of data

[ ]:

temp[:data_per_hr]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[157], line 1
----> 1 temp[:data_per_hr]

TypeError: slice indices must be integers or None or have an __index__ method

Whoops! We can’t index with a float. We need to convert to an int.

[ ]:

temp[:int(data_per_hr)]

array([15.1105, 15.1084, 15.0969, 15.0874, 15.0832, 15.0824, 15.0818,
       15.0799, 15.0732, 15.0711, 15.0799, 15.0741, 15.0745, 15.0684,
       15.095 ], dtype=float32)
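
Another way to avoid the error is floor division (//), which gives a whole number when both inputs are ints. The sketch below uses a plain list of made-up values so that it runs without the data file.

```python
temps_demo = list(range(20))    # made-up stand-in for the temperature data
dt = 4                          # minutes between measurements

data_per_hr_float = 60 / dt     # 15.0, a float
data_per_hr_int = 60 // dt      # 15, an int

try:
    temps_demo[:data_per_hr_float]
    float_slice_worked = True
except TypeError:
    float_slice_worked = False
print("Slicing with a float worked?", float_slice_worked)

first_hour = temps_demo[:data_per_hr_int]
print(len(first_hour))
```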

Compute the daily mean temperature for each day

Steps:

Define a function that takes in a 1D array of temperatures and returns a list of daily means
Use a for loop to slice the array into chunks of 360
Compute and store the mean for each day

[ ]:

## Starter Code

def compute_daily_means(temp_array):
    daily_means = []  # Start with an empty list

    # Calculate how many full days are in the dataset
    num_days = ...  # Hint: use len(temp_array) and floor division //

    # Loop through each day
    for i in range(num_days):
        # HINT: slice the array to get one day's worth of temperatures
        start = i * ...
        end = start + ...
        day_temps = temp_array[start:end]

        # Compute the daily mean
        mean = ...

        # Append the result to daily_means
        daily_means = ...

    return daily_means

# Run your function on the data and print results
daily_avgs = compute_daily_means(temp)

daily_avgs

Potential solution:

[ ]:

def compute_daily_means(temp_array):
    daily_means = []  # empty list
    num_days = len(temp_array) // 360  # number of full days (360 points per day)

    for i in range(num_days):
        start = i * 360
        end = start + 360
        day_temps = temp_array[start:end]
        daily_mean = np.mean(day_temps)
        daily_means.append(daily_mean)

    return daily_means

# Run the function
daily_means = compute_daily_means(temp)
daily_means[:10]

[np.float32(14.855899),
 np.float32(14.873767),
 np.float32(14.834493),
 np.float32(14.812212),
 np.float32(14.873991),
 np.float32(14.855767),
 np.float32(14.821135),
 np.float32(14.878883),
 np.float32(14.8914795),
 np.float32(14.778408)]
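
Once the loop version makes sense, the same daily means can be computed without a loop by reshaping the data into one row per day. This is an alternative sketch, not the method used above. It runs on synthetic data (temp_demo), and it assumes evenly spaced measurements with no gaps and drops any incomplete final day.

```python
import numpy as np

points_per_day = 360

# Synthetic stand-in for the temperature data: 2 full days plus 25 extra points
temp_demo = np.arange(2 * points_per_day + 25, dtype=float)

n_days = len(temp_demo) // points_per_day       # full days only
full_days = temp_demo[: n_days * points_per_day]

# One row per day, then average across each row
daily = full_days.reshape(n_days, points_per_day).mean(axis=1)
print(daily)
```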

Acknowledgements¶

Some of the material in this lesson is derived from the Software Carpentry Lessons for Python Programming and Plotting (https://swcarpentry.github.io/python-novice-inflammation/reference/) and HDSI at UC San Diego (https://datascience.ucsd.edu/).
