Lab 10, Part 2: Ironman data
In this part of lab, we are working with a subset of the results from the 2016 Ironman Triathlon race in Kona, Hawaii.
This race has 3 parts, all completed in order, without a break:
- A 2.4-mile (3.86 km) swim
- A 112-mile (180.25 km) bicycle ride
- A 26.22-mile (42.20 km) marathon run
Task 0. Familiarize yourself with the data
In your lab10 folder there is a provided file called kona.py that contains a list of dictionaries, where each dictionary corresponds to one athlete. We are only looking at athletes in the 18-24 year old range.
Create a new file called ironman.py. At the top of this file, import kona so you can use the provided list of dictionaries:
from kona import kona
Here is a snapshot of what the list of dictionaries from kona.py looks like (you can scroll horizontally to see more):
[
{'swim': 0.95, 'finish': 9.433333333333334, 'run': 3.283333333333333, 'firstname': 'Hans Christian', 'lastname': 'Tungesvik', 'genderRank': '128', 'overallRank': '137', 'bike': 5.083333333333333, 'country': 'NOR', 'divRank': '1'},
{'swim': 0.95, 'finish': 9.516666666666667, 'run': 3.4833333333333334, 'firstname': 'Kristian', 'lastname': 'Hindkjaer', 'genderRank': '159', 'overallRank': '169', 'bike': 4.933333333333333, 'country': 'DNK', 'divRank': '2'},
{'swim': 1.05, 'finish': 9.533333333333333, 'run': 3.35, 'firstname': 'Ivan', 'lastname': 'Kharin', 'genderRank': '172', 'overallRank': '183', 'bike': 5.0, 'country': 'RUS', 'divRank': '3'},
[...]
{'swim': 1.0666666666666667, 'finish': 'DNF', 'run': '---', 'firstname': 'Emily', 'lastname': 'Kempson', 'genderRank': '---', 'overallRank': '---', 'bike': '---', 'country': 'AUS', 'divRank': '---'},
]
Notes about the data:
- Each athlete's dictionary contains 10 key:value pairs: their first and last name, country, finish time, overall rank, rank by gender, division rank, and their times for the swim/bike/run portions of the event.
- Times were converted from a 3:15:23 format to a decimal value such as 3.25, where 3 is the number of hours and .25 is the fractional number of hours (i.e., 15 minutes → 1/4 of an hour). Seconds were discarded from our data.
- Did Not Finish
  - For the athletes that did not finish the race, the finish key has the value 'DNF', short for Did Not Finish.
  - For a DNF athlete, many of the values (e.g., divRank, run, bike, genderRank) are a string of three dashes: '---'. At some point, you may have to handle the dictionaries of athletes who did not finish, so keep this in mind.
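To make the time format concrete, here is a minimal sketch of how a clock string could be turned into fractional hours. The function name toFractionalHours is hypothetical and is not part of kona.py; the data you are given has already been converted.

```python
def toFractionalHours(clock):
    """Convert an 'H:MM:SS' string to fractional hours (hypothetical helper).

    Seconds are discarded, matching the note about the lab data.
    """
    hours, minutes, _seconds = clock.split(':')
    return int(hours) + int(minutes) / 60


print(toFractionalHours('3:15:23'))  # 3.25
```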
Task 1A. printNames - Prints athlete names
Write a function called printNames that takes a list of dictionaries (e.g. the kona data) and prints the name of each of the 69 athletes in the format lastname,firstname.
Example:
printNames(kona)
Tungesvik,Hans Christian
Hindkjaer,Kristian
Kharin,Ivan
Mortensen,Mikkel
Geddes,Alexander
Lopes,Andre
Manninen,Juuso
Tissot,Alexis
...
Davis,Jaimee
Talker,Elisa
Fritz,Grant
Kempson,Emily
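One possible approach, sketched on three rows trimmed from the snapshot above (only the name keys are kept). The helper formatName is a hypothetical name introduced here for illustration; you may prefer to build the string directly inside the loop.

```python
# Trimmed sample rows based on the snapshot (assumption: only name keys kept).
sample = [
    {'firstname': 'Hans Christian', 'lastname': 'Tungesvik'},
    {'firstname': 'Kristian', 'lastname': 'Hindkjaer'},
    {'firstname': 'Ivan', 'lastname': 'Kharin'},
]


def formatName(athlete):
    # Hypothetical helper: build 'lastname,firstname' with no space after the comma.
    return athlete['lastname'] + ',' + athlete['firstname']


def printNames(athletes):
    # Print one athlete per line, in the order the list provides them.
    for athlete in athletes:
        print(formatName(athlete))


printNames(sample)
```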
Task 1B. getNameTuples - Return a list of athlete name tuples
Write a function called getNameTuples that takes a list of dictionaries (e.g. the kona data) and returns a list of tuples of each of the 69 athletes in the list in the format (lastname, firstname), sorted by lastname.
Example:
nameTups = getNameTuples(kona)
nameTups[:10] # just showing first 10
# displayed for readability
[('Akiyama', 'Yuichi'),
('Appleby', 'Matthew'),
('Atkins', 'Wendy'),
('Azuma', 'Tomohiko'),
('Boll', 'Pascal'),
('Bonde', 'Line'),
('Braun', 'Maximilian'),
('Brock', 'Katrine'),
('Callaghan', 'Tom'),
('Carroll', 'Leah')]
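A minimal sketch of one way to do this, again using three rows trimmed from the snapshot. It relies on the fact that Python compares tuples element by element, so sorting (lastname, firstname) tuples with no key function orders them by lastname first.

```python
# Trimmed sample rows based on the snapshot (assumption: only name keys kept).
sample = [
    {'firstname': 'Hans Christian', 'lastname': 'Tungesvik'},
    {'firstname': 'Kristian', 'lastname': 'Hindkjaer'},
    {'firstname': 'Ivan', 'lastname': 'Kharin'},
]


def getNameTuples(athletes):
    # Collect (lastname, firstname) tuples, then sort them.
    nameTuples = []
    for athlete in athletes:
        nameTuples.append((athlete['lastname'], athlete['firstname']))
    return sorted(nameTuples)  # tuples compare by first element, then second


print(getNameTuples(sample))
# [('Hindkjaer', 'Kristian'), ('Kharin', 'Ivan'), ('Tungesvik', 'Hans Christian')]
```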
Task 1C. getNameTuplesSortedByFirstname - Return a list of athlete name tuples, sorted by firstname
This task is similar to Task 1B above, except that the tuples should be sorted by firstname, instead of lastname. This requires writing a helper function, and then using that helper function as a key function to sort by (see Slide 13-13 in this set of slides).
Example:
firstnameTups = getNameTuplesSortedByFirstname(kona)
firstnameTups[:10] # just showing first 10
# displayed for readability
[('Reid', 'Aj'),
('Whelan', 'Alex'),
('Geddes', 'Alexander'),
('Jackson', 'Alexander'),
('Tissot', 'Alexis'),
('Lopes', 'Andre'),
('Oleander', 'Anna'),
('Rudson', 'Benjamin'),
('Maas', 'Benjamin'),
('Pleckaitis', 'Braden')]
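The key-function idea can be sketched as follows. The helper name firstnameOf and the short nameTuples list are assumptions made for illustration; the point is only that sorted calls the helper on each tuple and orders the tuples by what it returns.

```python
# A few (lastname, firstname) tuples taken from the snapshot rows.
nameTuples = [
    ('Tungesvik', 'Hans Christian'),
    ('Hindkjaer', 'Kristian'),
    ('Kharin', 'Ivan'),
]


def firstnameOf(nameTuple):
    # Hypothetical key function: return the value to sort by (the firstname).
    return nameTuple[1]


sortedByFirstname = sorted(nameTuples, key=firstnameOf)
print(sortedByFirstname)
# [('Tungesvik', 'Hans Christian'), ('Kharin', 'Ivan'), ('Hindkjaer', 'Kristian')]
```

Note that the function is passed by name (key=firstnameOf), without parentheses. Python's sorted is stable, so tuples that share a firstname keep the relative order they had in the input list.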
Task 2. getNamesCountry - Returns a sorted list of tuples of athletes in the format (lastname, firstname) from a particular country
Write a function called getNamesCountry that takes a list of dictionaries (e.g. the kona data) and a 3-letter country abbreviation.
This function should return a sorted list of tuples of athlete names in the format (lastname, firstname) from the given country.
Example: Athletes from BRA:
getNamesCountry(kona, 'BRA')
[('Lopes', 'Andre'), ('Pacheco Venturini', 'Guilherme'), ('Ponte', 'Paula')]
Example: Athletes from JPN:
getNamesCountry(kona, 'JPN')
[('Akiyama', 'Yuichi'), ('Azuma', 'Tomohiko'), ('Shun', 'Hiraya')]
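A sketch of the filtering step, under the assumption that only the name and country keys matter here. The sample rows are trimmed from the snapshot, so the results below differ from the full-data examples above.

```python
# Trimmed sample rows based on the snapshot (assumption: other keys dropped).
sample = [
    {'firstname': 'Hans Christian', 'lastname': 'Tungesvik', 'country': 'NOR'},
    {'firstname': 'Kristian', 'lastname': 'Hindkjaer', 'country': 'DNK'},
    {'firstname': 'Ivan', 'lastname': 'Kharin', 'country': 'RUS'},
    {'firstname': 'Emily', 'lastname': 'Kempson', 'country': 'AUS'},
]


def getNamesCountry(athletes, country):
    # Keep only athletes whose country matches, then sort the name tuples.
    names = []
    for athlete in athletes:
        if athlete['country'] == country:
            names.append((athlete['lastname'], athlete['firstname']))
    return sorted(names)


print(getNamesCountry(sample, 'DNK'))  # [('Hindkjaer', 'Kristian')]
print(getNamesCountry(sample, 'BRA'))  # [] (no BRA athletes in this tiny sample)
```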
Task 3. getAllCountries - Return unique countries
Write a function called getAllCountries that takes a list of dictionaries (e.g. the kona data) and returns all unique countries represented at the Ironman race, sorted alphabetically.
Example:
print(getAllCountries(kona))
['ARG', 'AUS', 'AUT', 'BEL', 'BRA',
'CAN', 'CHE', 'DNK', 'ESP', 'FIN',
'FRA', 'GBR', 'ITA', 'JPN', 'LTU',
'MEX', 'NOR', 'NZL', 'PRI', 'RUS',
'SWE', 'USA']
There should be 22 unique countries represented across all athletes.
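One way to remove duplicates is to collect the countries in a set and sort the result. The rows below are a made-up miniature of the data (only the country key, with one duplicate added on purpose), not the real kona list.

```python
# Made-up miniature data: country codes come from the snapshot, the duplicate is invented.
sample = [
    {'country': 'NOR'},
    {'country': 'DNK'},
    {'country': 'RUS'},
    {'country': 'DNK'},
    {'country': 'AUS'},
]


def getAllCountries(athletes):
    # A set keeps each country only once, however many athletes share it.
    countries = set()
    for athlete in athletes:
        countries.add(athlete['country'])
    return sorted(countries)  # sorted() returns an alphabetical list


print(getAllCountries(sample))  # ['AUS', 'DNK', 'NOR', 'RUS']
```

An alternative that avoids sets is to append a country to a list only if it is not already in that list, then sort the list.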
Task 4. totalAthletesByCountry - Return countries & athlete count
Write a function called totalAthletesByCountry that takes a list of dictionaries (e.g. the kona data) and returns a dictionary where each key is a unique country, and the corresponding value is the total number of athletes from that country.
Example:
print(totalAthletesByCountry(kona))
Results:
{'NOR': 1, 'DNK': 6, 'RUS': 1, 'NZL': 2, 'BRA': 3, 'FIN': 1,
'FRA': 5, 'CHE': 2, 'USA': 22, 'CAN': 4, 'AUT': 2, 'MEX': 1,
'ITA': 1, 'GBR': 1, 'AUS': 6, 'PRI': 2, 'JPN': 3, 'LTU': 1,
'ARG': 1, 'SWE': 2, 'BEL': 1, 'ESP': 1}
(It's okay if your order does not match the example shown, just as long as all the values exist, because order does not matter in dictionaries.)
There are different ways you could approach this task.
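As one example of such an approach, the sketch below counts with dict.get, which returns a default (here 0) when the key is not in the dictionary yet. The sample rows are a made-up miniature of the data with one duplicate country added on purpose.

```python
# Made-up miniature data: country codes come from the snapshot, the duplicate is invented.
sample = [
    {'country': 'NOR'},
    {'country': 'DNK'},
    {'country': 'RUS'},
    {'country': 'DNK'},
    {'country': 'AUS'},
]


def totalAthletesByCountry(athletes):
    counts = {}
    for athlete in athletes:
        country = athlete['country']
        # get() supplies 0 the first time a country is seen.
        counts[country] = counts.get(country, 0) + 1
    return counts


print(totalAthletesByCountry(sample))
# {'NOR': 1, 'DNK': 2, 'RUS': 1, 'AUS': 1}
```

Another option is to check whether the country is already a key (with an if/else) before adding 1; both give the same counts.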
Task 5. buildDictWithAverageEventTimes - Average event times for each country's athletes.
Write a function called buildDictWithAverageEventTimes that...
- Takes a list of dictionaries (e.g. the kona data)
- Returns a dictionary where the keys are unique countries, and each corresponding value is a dictionary of average times for each event (swim, bike, run).
Example:
print(buildDictWithAverageEventTimes(kona))
# Note: Line breaks have been added for readability
{
'NOR': {'swim': 0.95, 'bike': 5.08, 'run': 3.28},
'DNK': {'swim': 1.12, 'bike': 5.46, 'run': 3.65},
'RUS': {'swim': 1.05, 'bike': 5.0, 'run': 3.35},
'NZL': {'swim': 1.14, 'bike': 5.88, 'run': 3.71},
'BRA': {'swim': 0.99, 'bike': 5.45, 'run': 3.77},
'FIN': {'swim': 0.88, 'bike': 5.05, 'run': 3.7},
'FRA': {'swim': 1.03, 'bike': 5.45, 'run': 3.85},
'CHE': {'swim': 1.03, 'bike': 5.51, 'run': 4.06},
'USA': {'swim': 1.06, 'bike': 5.86, 'run': 4.23},
'CAN': {'swim': 1.15, 'bike': 6.03, 'run': 4.35},
'AUT': {'swim': 1.21, 'bike': 6.48, 'run': 5.1},
'MEX': {'swim': 1.05, 'bike': 5.92, 'run': 3.35},
'ITA': {'swim': 1.08, 'bike': 5.65, 'run': 3.75},
'GBR': {'swim': 1.13, 'bike': 5.7, 'run': 3.65},
'AUS': {'swim': 0.88, 'bike': 4.87, 'run': 3.57},
'PRI': {'swim': 1.14, 'bike': 6.02, 'run': 4.79},
'JPN': {'swim': 1.17, 'bike': 6.1, 'run': 4.67},
'LTU': {'swim': 1.0, 'bike': 5.75, 'run': 4.35},
'ARG': {'swim': 1.1, 'bike': 6.18, 'run': 4.2},
'SWE': {'swim': 1.23, 'bike': 6.85, 'run': 4.88},
'BEL': {'swim': 0.88, 'bike': 5.1, 'run': 6.8},
'ESP': {'swim': 1.35, 'bike': 7.7, 'run': 4.55}
}
The resulting dictionary should have 22 elements.
It's okay if your order does not match the example shown, just as long as all the values exist. It's also fine if your values are slightly different: there are different approaches to the DNF entries that result in different values.
Tips:
- Divide the function into two parts: add up all the times, then average the times.
- The totalAthletesByCountry and getAllCountries functions may or may not be useful, depending on your strategy.
- Your results should be a dictionary of dictionaries; see this page for examples of different ways of constructing nested dictionaries.
- You will have to handle the DNF (Did Not Finish) dictionary entries; a boolean expression like type(x) == str can be used to determine if the time is a string ('DNF' or '---').
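The tips above can be sketched as follows. This is only one of several valid strategies, with three choices that are assumptions rather than requirements: each string time is skipped individually (so a DNF athlete's swim still counts), an event with no valid times is left out of that country's dictionary, and averages are rounded to 2 decimal places to resemble the example output. The sample rows are trimmed from the snapshot.

```python
EVENTS = ('swim', 'bike', 'run')

# Trimmed sample rows based on the snapshot (assumption: other keys dropped).
sample = [
    {'country': 'NOR', 'swim': 0.95, 'bike': 5.083333333333333, 'run': 3.283333333333333},
    {'country': 'DNK', 'swim': 0.95, 'bike': 4.933333333333333, 'run': 3.4833333333333334},
    {'country': 'AUS', 'swim': 1.0666666666666667, 'bike': '---', 'run': '---'},
]


def buildDictWithAverageEventTimes(athletes):
    # Part 1: add up the times (and count them) per country and event.
    totals = {}
    counts = {}
    for athlete in athletes:
        country = athlete['country']
        if country not in totals:
            totals[country] = {event: 0.0 for event in EVENTS}
            counts[country] = {event: 0 for event in EVENTS}
        for event in EVENTS:
            time = athlete[event]
            if type(time) == str:  # 'DNF' or '---': nothing to add
                continue
            totals[country][event] += time
            counts[country][event] += 1

    # Part 2: turn the totals into averages.
    averages = {}
    for country in totals:
        averages[country] = {}
        for event in EVENTS:
            if counts[country][event] > 0:  # avoid dividing by zero
                average = totals[country][event] / counts[country][event]
                averages[country][event] = round(average, 2)
    return averages


print(buildDictWithAverageEventTimes(sample))
```

With this sample, NOR comes out as {'swim': 0.95, 'bike': 5.08, 'run': 3.28}, while AUS only has a swim average because its single athlete did not finish the bike or run.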