Lab 10, Part 2: Ironman data

In this part of the lab, we are working with a subset of the results from the 2016 Ironman Triathlon race in Kona, Hawaii.

This race has 3 parts, all completed in order, without a break:

A 2.4-mile (3.86 km) swim
A 112-mile (180.25 km) bicycle ride
A 26.22-mile (42.20 km) marathon run

Task 0. Familiarize yourself with the data

In your lab10 folder there is a provided file called kona.py that contains a list of dictionaries, where each dictionary corresponds to one athlete. We are only looking at athletes in the 18-24 year old range.

Create a new file called ironman.py. At the top of this file, import kona so you can use the provided list of dictionaries:

from kona import kona

Here is a snapshot of what the list of dictionaries from kona.py looks like (you can scroll horizontally to see more):

[
    {'swim': 0.95, 'finish': 9.433333333333334, 'run': 3.283333333333333, 'firstname': 'Hans Christian', 'lastname': 'Tungesvik', 'genderRank': '128', 'overallRank': '137', 'bike': 5.083333333333333, 'country': 'NOR', 'divRank': '1'},
    {'swim': 0.95, 'finish': 9.516666666666667, 'run': 3.4833333333333334, 'firstname': 'Kristian', 'lastname': 'Hindkjaer', 'genderRank': '159', 'overallRank': '169', 'bike': 4.933333333333333, 'country': 'DNK', 'divRank': '2'},
    {'swim': 1.05, 'finish': 9.533333333333333, 'run': 3.35, 'firstname': 'Ivan', 'lastname': 'Kharin', 'genderRank': '172', 'overallRank': '183', 'bike': 5.0, 'country': 'RUS', 'divRank': '3'},

[...]

    {'swim': 1.0666666666666667, 'finish': 'DNF', 'run': '---', 'firstname': 'Emily', 'lastname': 'Kempson', 'genderRank': '---', 'overallRank': '---', 'bike': '---', 'country': 'AUS', 'divRank': '---'},
]

Notes about the data:

Each athlete's dictionary contains 10 key:value pairs: their first and last name, country, finish time, overall rank, rank by gender, division rank (divRank), and their times for the swim, bike, and run portions of the event.
Times were converted from a clock format such as 3:15:23 to fractional hours such as 3.25, where 3 is the number of hours and .25 is the fraction of an hour (i.e., 15 minutes → 1/4 of an hour). Seconds were discarded from our data.
Did Not Finish
- For athletes who did not finish the race, the finish key has the value 'DNF', short for Did Not Finish.
- For a DNF athlete, many of the values (e.g., divRank, run, bike, genderRank) are a string of three dashes: '---'. At some point, you may have to handle the dictionaries of athletes who did not finish, so keep this in mind.
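The notes above can be illustrated with a short sketch. The helper names hours_from_clock and did_not_finish, and the two-athlete sample list, are made up for illustration and are not part of kona.py; the conversion shown is one plausible way the times could have been produced, not necessarily how the data file was actually built.

```python
def hours_from_clock(clock):
    """Convert an 'H:MM:SS' string to fractional hours, discarding seconds."""
    hours, minutes, _seconds = clock.split(':')
    return int(hours) + int(minutes) / 60


def did_not_finish(athlete):
    """Return True if this athlete's dictionary marks them as DNF."""
    return athlete['finish'] == 'DNF'


# Hypothetical two-athlete sample in the same shape as the kona data
sample = [
    {'lastname': 'Kharin', 'finish': 9.533333333333333, 'run': 3.35},
    {'lastname': 'Kempson', 'finish': 'DNF', 'run': '---'},
]

print(hours_from_clock('3:15:23'))          # 3.25
print([did_not_finish(a) for a in sample])  # [False, True]
```

Note that a DNF athlete can still have numeric values for the events they completed (the swim time, for example), so checking only the finish key is not always enough.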

Task 1A. printNames - Prints athlete names

Partner A

Write a function called printNames that takes a list of dictionaries (e.g. the kona data) and prints the name of each of the 69 athletes, one per line, in the format lastname,firstname (with no space after the comma).

Example:

printNames(kona)

Tungesvik,Hans Christian
Hindkjaer,Kristian
Kharin,Ivan
Mortensen,Mikkel
Geddes,Alexander
Lopes,Andre
Manninen,Juuso
Tissot,Alexis
...
Davis,Jaimee
Talker,Elisa
Fritz,Grant
Kempson,Emily

Task 1B. getNameTuples - Return a list of athlete name tuples

Partner A (yes, A again)

Write a function called getNameTuples that takes a list of dictionaries (e.g. the kona data) and returns a list of tuples, one for each of the 69 athletes, in the format (lastname, firstname), sorted by lastname.

Example:

nameTups = getNameTuples(kona)
nameTups[:10]  # just showing first 10

# displayed for readability
[('Akiyama', 'Yuichi'), 
 ('Appleby', 'Matthew'), 
 ('Atkins', 'Wendy'), 
 ('Azuma', 'Tomohiko'), 
 ('Boll', 'Pascal'), 
 ('Bonde', 'Line'), 
 ('Braun', 'Maximilian'), 
 ('Brock', 'Katrine'), 
 ('Callaghan', 'Tom'), 
 ('Carroll', 'Leah')]

Task 1C. getNameTuplesSortedByFirstname - Return a list of athlete name tuples, sorted by firstname

Partner B

This task is similar to Task 1B above, except that the tuples should be sorted by firstname, instead of lastname. This requires writing a helper function, and then using that helper function as a key function to sort by (see Slide 13-13 in this set of slides).

Example:

firstnameTups = getNameTuplesSortedByFirstname(kona)
firstnameTups[:10]  # just showing first 10

# displayed for readability
[('Reid', 'Aj'), 
 ('Whelan', 'Alex'), 
 ('Geddes', 'Alexander'), 
 ('Jackson', 'Alexander'), 
 ('Tissot', 'Alexis'), 
 ('Lopes', 'Andre'), 
 ('Oleander', 'Anna'), 
 ('Rudson', 'Benjamin'), 
 ('Maas', 'Benjamin'), 
 ('Pleckaitis', 'Braden')]
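As a general illustration of sorting with a key function, here is a minimal sketch on unrelated data. The helper second_item and the list of (city, country) tuples are hypothetical; only sorted and its key parameter come from Python itself.

```python
def second_item(pair):
    """Key function: return the element of the tuple to sort by."""
    return pair[1]


# Hypothetical sample of (city, country) tuples
pairs = [('Oslo', 'NOR'), ('Kona', 'USA'), ('Bergen', 'NOR'), ('Aarhus', 'DNK')]

# Default sort compares the first element of each tuple first
by_first = sorted(pairs)

# With a key function, each tuple is compared by what the function returns
by_second = sorted(pairs, key=second_item)

print(by_first)
print(by_second)
```

Python's sort is stable, so tuples with equal keys (here the two 'NOR' entries) keep the order they had in the input list.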

Task 2. getNamesCountry - Returns a sorted list of tuples of athletes in the format (lastname, firstname) from a particular country

Partner B (yes, again)

Write a function called getNamesCountry that takes a list of dictionaries (e.g. the kona data) and a 3-letter country abbreviation.

This function should return a sorted list of tuples of athlete names in the format (lastname, firstname) from the given country.

Example: Athletes from BRA:

getNamesCountry(kona, 'BRA')

[('Lopes', 'Andre'), ('Pacheco Venturini', 'Guilherme'), ('Ponte', 'Paula')]

Example: Athletes from JPN:

getNamesCountry(kona, 'JPN')

[('Akiyama', 'Yuichi'), ('Azuma', 'Tomohiko'), ('Shun', 'Hiraya')]

Task 3. getAllCountries - Return unique countries

Partner A

Write a function called getAllCountries that takes a list of dictionaries (e.g. the kona data) and returns all unique countries represented at the Ironman race, sorted alphabetically.

Example:

print(getAllCountries(kona))

['ARG', 'AUS', 'AUT', 'BEL', 'BRA', 
 'CAN', 'CHE', 'DNK', 'ESP', 'FIN', 
 'FRA', 'GBR', 'ITA', 'JPN', 'LTU', 
 'MEX', 'NOR', 'NZL', 'PRI', 'RUS', 
 'SWE', 'USA']

There should be 22 unique countries represented across all athletes.
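One possible way to collect unique values, sketched on a small hypothetical sample rather than the real kona list, is to add each value to a set and then sort the result. This is only one approach; a list with a membership check would work as well.

```python
# Hypothetical sample in the same shape as the kona data
sample = [
    {'lastname': 'Alpha', 'country': 'USA'},
    {'lastname': 'Bravo', 'country': 'NOR'},
    {'lastname': 'Charlie', 'country': 'USA'},
    {'lastname': 'Delta', 'country': 'BRA'},
]

# A set keeps only one copy of each value, however many times it is added
codes = set()
for athlete in sample:
    codes.add(athlete['country'])

# sorted() accepts any iterable and returns a new, alphabetically ordered list
unique_sorted = sorted(codes)
print(unique_sorted)  # ['BRA', 'NOR', 'USA']
```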

Task 4. totalAthletesByCountry - Return countries & athlete count

Partner B

Write a function called totalAthletesByCountry that takes a list of dictionaries (e.g. the kona data) and returns a dictionary where each key is a unique country, and the corresponding value is the total number of athletes from that country.

Example:

print(totalAthletesByCountry(kona))

Results:

{'NOR': 1, 'DNK': 6, 'RUS': 1, 'NZL': 2, 'BRA': 3, 'FIN': 1, 
 'FRA': 5, 'CHE': 2, 'USA': 22, 'CAN': 4, 'AUT': 2, 'MEX': 1, 
 'ITA': 1, 'GBR': 1, 'AUS': 6, 'PRI': 2, 'JPN': 3, 'LTU': 1, 
 'ARG': 1, 'SWE': 2, 'BEL': 1, 'ESP': 1}

(It's okay if your order does not match the example shown, as long as all the key:value pairs are present: two dictionaries with the same pairs are equal regardless of order.)

There are different ways you could approach this task.
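To make "different ways" concrete, here is a sketch of two common counting patterns applied to a hypothetical list of country codes. Neither is presented as the expected solution; the variable names are illustrative only.

```python
from collections import Counter

# Hypothetical list of country codes, one per athlete
countries = ['USA', 'DNK', 'USA', 'NOR', 'DNK', 'USA']

# Approach 1: a plain dictionary, using get() to supply 0 for unseen keys
counts = {}
for code in countries:
    counts[code] = counts.get(code, 0) + 1

# Approach 2: collections.Counter does the same tallying in one step
counter_counts = dict(Counter(countries))

print(counts)                    # {'USA': 3, 'DNK': 2, 'NOR': 1}
print(counts == counter_counts)  # True
```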

Task 5. buildDictWithAverageEventTimes - Average event times for each country's athletes.

Partner A

Write a function called buildDictWithAverageEventTimes that...

Takes a list of dictionaries (e.g. the kona data)
Returns a dictionary where the keys are unique countries, and each corresponding value is a dictionary of average times for each event (swim, bike, run).

Example:

print(buildDictWithAverageEventTimes(kona))

# Note: Line breaks have been added for readability
{
    'NOR': {'swim': 0.95, 'bike': 5.08, 'run': 3.28},
    'DNK': {'swim': 1.12, 'bike': 5.46, 'run': 3.65},
    'RUS': {'swim': 1.05, 'bike': 5.0, 'run': 3.35},
    'NZL': {'swim': 1.14, 'bike': 5.88, 'run': 3.71},
    'BRA': {'swim': 0.99, 'bike': 5.45, 'run': 3.77},
    'FIN': {'swim': 0.88, 'bike': 5.05, 'run': 3.7},
    'FRA': {'swim': 1.03, 'bike': 5.45, 'run': 3.85},
    'CHE': {'swim': 1.03, 'bike': 5.51, 'run': 4.06},
    'USA': {'swim': 1.06, 'bike': 5.86, 'run': 4.23},
    'CAN': {'swim': 1.15, 'bike': 6.03, 'run': 4.35},
    'AUT': {'swim': 1.21, 'bike': 6.48, 'run': 5.1},
    'MEX': {'swim': 1.05, 'bike': 5.92, 'run': 3.35},
    'ITA': {'swim': 1.08, 'bike': 5.65, 'run': 3.75},
    'GBR': {'swim': 1.13, 'bike': 5.7, 'run': 3.65},
    'AUS': {'swim': 0.88, 'bike': 4.87, 'run': 3.57},
    'PRI': {'swim': 1.14, 'bike': 6.02, 'run': 4.79},
    'JPN': {'swim': 1.17, 'bike': 6.1, 'run': 4.67},
    'LTU': {'swim': 1.0, 'bike': 5.75, 'run': 4.35},
    'ARG': {'swim': 1.1, 'bike': 6.18, 'run': 4.2},
    'SWE': {'swim': 1.23, 'bike': 6.85, 'run': 4.88},
    'BEL': {'swim': 0.88, 'bike': 5.1, 'run': 6.8},
    'ESP': {'swim': 1.35, 'bike': 7.7, 'run': 4.55}
}

The resulting dictionary should have 22 elements.

It's okay if your order does not match the example shown, just as long as all the values exist. It's also fine if your values are slightly different: there are different approaches to the DNF entries that result in different values.

Tips:

Divide the function into two parts: add up all the times, then average the times.
The totalAthletesByCountry and getAllCountries functions may or may not be useful, depending on your strategy.
Your results should be a dictionary of dictionaries; see this page for examples of different ways of constructing nested dictionaries.
You will have to handle the DNF (Did Not Finish) dictionary entries; a boolean expression like type(x) == str can be used to determine if the time is a string (DNF or ---).
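The tips above can be sketched as follows on a tiny hypothetical sample. The function name average_times_by_country and the sample data are made up for illustration. This sketch makes several assumptions: string values are skipped per event, each event keeps its own count, averages are rounded to 2 decimal places, and None is stored when a country has no recorded time for an event. Other ways of handling DNF entries are equally valid and will give slightly different numbers.

```python
EVENTS = ('swim', 'bike', 'run')


def average_times_by_country(athletes):
    """Return {country: {event: average time}} using nested dictionaries."""
    totals = {}  # country -> {event: sum of numeric times}
    counts = {}  # country -> {event: how many numeric times were added}

    # Part 1: add up all the times, skipping string values ('DNF' or '---')
    for athlete in athletes:
        country = athlete['country']
        if country not in totals:
            totals[country] = {event: 0.0 for event in EVENTS}
            counts[country] = {event: 0 for event in EVENTS}
        for event in EVENTS:
            time = athlete[event]
            if type(time) == str:
                continue
            totals[country][event] += time
            counts[country][event] += 1

    # Part 2: turn the totals into averages
    averages = {}
    for country in totals:
        averages[country] = {}
        for event in EVENTS:
            n = counts[country][event]
            if n > 0:
                averages[country][event] = round(totals[country][event] / n, 2)
            else:
                averages[country][event] = None  # assumption: no recorded time
    return averages


# Hypothetical sample: the second athlete did not finish
sample = [
    {'country': 'AUS', 'swim': 1.0, 'bike': 5.0, 'run': 3.5},
    {'country': 'AUS', 'swim': 1.5, 'bike': '---', 'run': '---'},
    {'country': 'NOR', 'swim': 0.75, 'bike': 5.25, 'run': 3.25},
]

result = average_times_by_country(sample)
print(result['AUS'])  # {'swim': 1.25, 'bike': 5.0, 'run': 3.5}
```

Keeping a separate count per event means the DNF athlete's swim time still contributes to the swim average, while the bike and run averages are computed only from athletes who recorded those times.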
