1. Reviewing Sequences

We have seen four kinds of sequences so far: strings, lists, ranges, and tuples.
We can tell them apart by how they are written: quotes for strings, square brackets for lists, parentheses for tuples, and a call to range for ranges.

In [1]:
phrase = "Quincy's quilters quit quilting quickly"     # a string
numbers = [0, 10, 20, 30, 40, 50]                      # a list of numbers
courses = ["CS 111", "CS 115", "CS 121",
           "CS 204", "CS 220", "CS 230"]               # a list of strings
person = ('Frederick', 'Douglass')                     # a tuple of strings
date = ('November', 8, 20201)                          # a mixed tuple
odds = range(1, 10, 2)                                 # a range of numbers
In [2]:
print("type of 'phrase':", type(phrase))
print("type of 'numbers':", type(numbers))
print("type of 'courses':", type(courses))
print("type of 'person':", type(person))
print("type of 'date':", type(date))
print("type of 'odds':", type(odds))
type of 'phrase': <class 'str'>
type of 'numbers': <class 'list'>
type of 'courses': <class 'list'>
type of 'person': <class 'tuple'>
type of 'date': <class 'tuple'>
type of 'odds': <class 'range'>

Common Operations for Sequences

Sequences share many operations:

  • subscripting with indices,
  • slicing (with colon),
  • checking for membership with in,
  • finding the length with len,
  • iteration with loops.

a) Examples of indexing
Outputs in this case are a single element of the sequence.

In [3]:
phrase[14] # access element at index 14
Out[3]:
'e'
In [4]:
numbers[3] # access element at index 3
Out[4]:
30
In [5]:
person[1]  # access element at index 1
Out[5]:
'Douglass'
In [6]:
odds[2] # access element at index 2
Out[6]:
5

b) Examples of slicing
Outputs in this case are subsequences.

In [7]:
phrase[4:13] # slicing - get the elements from index 4 up to, but not including, index 13
Out[7]:
"cy's quil"
In [8]:
date[:2] # slicing - get the first two elements of the tuple
Out[8]:
('November', 8)
In [9]:
numbers[2:] # slicing - get all elements starting at index 2
Out[9]:
[20, 30, 40, 50]
In [10]:
odds[1::2] # slicing - get every other odd number starting at index 1
Out[10]:
range(3, 11, 4)

c) Examples of using the membership operator in
Outputs are boolean values.

In [11]:
2015 in date
Out[11]:
False
In [12]:
50 in numbers
Out[12]:
True
In [13]:
'quit' in phrase
Out[13]:
True
In [14]:
4 in odds
Out[14]:
False

Reminder: Why do we care about tuples?

Tuples are often used in Python to do multiple assignments in a single statement:

In [15]:
a, b = 0, 1
a, b
Out[15]:
(0, 1)
In [16]:
0, 1
Out[16]:
(0, 1)

Python generates tuples whenever we use commas to separate values:

In [17]:
len(phrase), len(numbers), len(courses), len(person), len(date), len(odds)
Out[17]:
(39, 6, 6, 2, 3, 5)

2. Motivating Dictionaries: Keys and Values

Key/Value Associations

In many real-life situations, information is naturally described in terms of two-column tables that associate a key in the first column with a value in the second column, where (1) the keys are unique and (2) the key/value association is rather arbitrary (i.e., there's no simple rule that determines the value from the key). For example:

Days in a Month

Month (key) Days (value)
Jan 31
Feb 28
Mar 31
Apr 30
May 31
Jun 30
Jul 31
Aug 31
Sep 30
Oct 31
Nov 30
Dec 31

Scrabble Points

Letter (key)  Points (value)    Letter (key)  Points (value)    Letter (key)  Points (value)
a             1                 j             8                 s             1
b             3                 k             5                 t             1
c             3                 l             1                 u             1
d             2                 m             3                 v             4
e             1                 n             1                 w             4
f             4                 o             1                 x             8
g             2                 p             3                 y             4
h             4                 q             10                z             10
i             1                 r             1

Email Addresses

Name (key) Email (value)
Ada Lovelace ada@babbage.com
Grace Hopper grahope@vassar.edu
Katherine Johnson johnsonk@nasa.gov
Margaret Hamilton mhamilton@mit.edu

Looking up the Value associated with a Key

We often want to look up the value associated with a given key. There are many ways to do this, but many of the approaches we're familiar with are tedious to express.

A multi-branch conditional that tests keys one by one

In [18]:
def lookupDaysInMonth1(month):
    if month == 'Jan':
        return 31
    elif month == 'Feb':
        return 28
    elif month == 'Mar':
        return 31
    elif month == 'Apr':
        return 30
    elif month == 'May':
        return 31
    elif month == 'Jun':
        return 30
    elif month == 'Jul':
        return 31
    elif month == 'Aug':
        return 31
    elif month == 'Sep':
        return 30
    elif month == 'Oct':
        return 31
    elif month == 'Nov':
        return 30
    elif month == 'Dec':
        return 31
    else: 
        return 'Not a valid month'
    
def testLookupDaysInMonth1(month):
    print(f'{month} has {lookupDaysInMonth1(month)} days')
    
testLookupDaysInMonth1('Apr')
testLookupDaysInMonth1('Oct')
testLookupDaysInMonth1('Feb')
testLookupDaysInMonth1('March')
Apr has 30 days
Oct has 31 days
Feb has 28 days
March has Not a valid month days

A multi-branch conditional that groups keys by repeated values

In [19]:
def lookupDaysInMonth2(month):
    if month in ['Jan', 'Mar', 'May', 'Jul', 'Aug', 'Oct', 'Dec']:
        return 31
    elif month in ['Apr', 'Jun', 'Sep', 'Nov']:
        return 30
    elif month in ['Feb']:
        return 28
    else: 
        return 'Not a valid month'

def testLookupDaysInMonth2(month):
    print(f'{month} has {lookupDaysInMonth2(month)} days')
    
testLookupDaysInMonth2('Apr')
testLookupDaysInMonth2('Oct')
testLookupDaysInMonth2('Feb')
testLookupDaysInMonth2('March')
Apr has 30 days
Oct has 31 days
Feb has 28 days
March has Not a valid month days

Searching for a key in a list of key/value tuples

In [20]:
daysInMonthPairs = [
    ('Jan', 31), ('Feb', 28), ('Mar', 31), 
    ('Apr', 30), ('May', 31), ('Jun', 30), 
    ('Jul', 31), ('Aug', 31), ('Sep', 30), 
    ('Oct', 31), ('Nov', 30), ('Dec', 31)
]

def lookupDaysInMonth3(monthToLookup):
    for mon, days in daysInMonthPairs:
        if monthToLookup == mon:
            return days # Early return!
    return 'Not a valid month'

def testLookupDaysInMonth3(month):
    print(f'{month} has {lookupDaysInMonth3(month)} days')
    
testLookupDaysInMonth3('Apr')
testLookupDaysInMonth3('Oct')
testLookupDaysInMonth3('Feb')
testLookupDaysInMonth3('March')
Apr has 30 days
Oct has 31 days
Feb has 28 days
March has Not a valid month days

The final approach has the nice feature that it separates the specification of the key/value pairs from the code that loops through pairs searching for the pair with the desired key, and then returns the value of that key.

Although this final approach is simpler than the earlier complex conditionals, it still has a big downside: the time it takes to find the value associated with a key is proportional to the number of key/value pairs. In practice, we could have millions of keys (or more!), and it might take a long time to find the desired key if we examine every key/value pair one-by-one.

3. Introducing Dictionaries: A Better Way to Look Up the Value for a Key

3.1 Dictionary Syntax

Dictionaries are unordered collections of key/value items.

In [21]:
daysInMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
               'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
               'Sep': 30, 'Oct': 31, 'Nov': 30} # Dec is missing on purpose
daysInMonth
Out[21]:
{'Jan': 31,
 'Feb': 28,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30}

Each key/value item is written key:value, and a dictionary is written as a comma-separated collection of items delimited by curly braces.

The type of a dictionary is dict:

In [22]:
type(daysInMonth)
Out[22]:
dict

3.2 The Order of Key/Value Items Doesn't Matter!

Unlike with sequences (such as strings, lists, and tuples), where the order of elements matters and we can access elements by an index related to their order, the order of key/value items in a dictionary does not matter and we cannot access the items by an index.

Note: Dictionaries behind the scenes do actually maintain the order in which keys/values were inserted into the dictionary. This explains why, when we run the cell above, the months are displayed in the same order they were written. But you should not treat dictionaries as an ordered collection in the same way you would a sequence like a list. You cannot access a dictionary by index. You cannot append to a dictionary because a dictionary does not have a "last" item. Therefore, conceptually you should treat the key/value items as unordered elements in the dictionary.

3.3 The Most Important Dictionary Feature: Easy & Efficient Lookup via Subscripting

The most important feature of a dictionary is that we can use the subscripting notation dictionary[key] to look up the value associated with key in dictionary:

In [23]:
daysInMonth['Apr']
Out[23]:
30
In [24]:
daysInMonth['Oct']
Out[24]:
31
In [25]:
daysInMonth['Feb']
Out[25]:
28

Importantly, "under the hood" this lookup does not require searching through the key/value items one-by-one. Instead, finding the value associated with a key is almost immediate, independent of how many key/value items there are. So looking up a key in a dictionary with millions of key/value items takes about the same time as looking it up in one with only dozens of items!
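To make the contrast concrete, here is a minimal sketch that counts how many pairs a one-by-one scan examines before it finds a key. The data (100,000 made-up keys named key0, key1, and so on) and the names pairs, table, and lookupInPairs are assumptions for illustration; they are not part of the examples above.

```python
# Hypothetical data: 100,000 made-up key/value pairs, stored two ways.
pairs = [(f'key{i}', i) for i in range(100_000)]   # a list of tuples
table = dict(pairs)                                # the same data as a dictionary

def lookupInPairs(keyToLookup):
    """Scan the pairs one by one. Return the value found (or None)
    together with the number of pairs that were examined."""
    examined = 0
    for key, value in pairs:
        examined += 1
        if key == keyToLookup:
            return value, examined
    return None, examined

value, examined = lookupInPairs('key99999')
print(value, examined)      # 99999 100000 (the last key forces a full scan)
print(table['key99999'])    # 99999 (no loop in our code at all)
```

The scan does work proportional to the position of the key in the list, while the dictionary subscript is a single step no matter where the key was inserted.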

3.4 Key Errors

When a key does not exist in the dictionary, looking up a key results in a KeyError:

In [26]:
daysInMonth['March']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[26], line 1
----> 1 daysInMonth['March']

KeyError: 'March'

Because dictionaries are not sequences, attempting to access items by an integer index will also fail with a KeyError:

In [27]:
daysInMonth[3]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[27], line 1
----> 1 daysInMonth[3]

KeyError: 3

However, integer indices are OK when the dictionary keys are themselves integers:

In [28]:
integerNames = {1: 'one', 3: 'three', 4: 'four', 7: 'seven'}

integerNames[3]
Out[28]:
'three'
In [29]:
integerNames[2]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[29], line 1
----> 1 integerNames[2]

KeyError: 2

3.5 Use in to Check if a Key exists in a Dictionary

The in operator is used to check if a key is in a dictionary:

In [30]:
'Apr' in daysInMonth
Out[30]:
True
In [31]:
'March' in daysInMonth
Out[31]:
False

Note that in behaves differently in sequences and dictionaries:

  • In a sequence, in determines if an element is in the sequence.
  • In a dictionary, in determines if a key appears in a key/value item in the dictionary.
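A short sketch of this difference, using a small stand-in dictionary (the name daysSample is hypothetical). It also uses the .values() method, which is covered later in this notebook, to search the values explicitly.

```python
# A small stand-in dictionary (hypothetical; not the full daysInMonth)
daysSample = {'Jan': 31, 'Feb': 28, 'Apr': 30}

print('Feb' in daysSample)           # True: 'Feb' is a key
print(28 in daysSample)              # False: 28 is a value, not a key
print(28 in daysSample.values())     # True: here we search the values instead
```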

Using the in operator, we can implement the dictionary version of the lookupDaysInMonth function from above:

In [32]:
def lookupDaysInMonth(monthToLookup):
    if monthToLookup in daysInMonth:
        return daysInMonth[monthToLookup] # No loop is needed!!!
    else: 
        return 'Not a valid month'

def testLookupDaysInMonth(month):
    print(f'{month} has {lookupDaysInMonth(month)} days')
    
testLookupDaysInMonth('Apr')
testLookupDaysInMonth('Oct')
testLookupDaysInMonth('Feb')
testLookupDaysInMonth('March')
Apr has 30 days
Oct has 31 days
Feb has 28 days
March has Not a valid month days

IMPORTANT: Note that we do not have to loop through the key/value pairs to find the value associated with the key. We just use the key as the subscript, and the dictionary "magically" returns the corresponding value. This is easy, powerful, and quick!

4. Exercise 1: Rewrite scrabblePoints

A great use for dictionaries is to store data that can simplify choosing among different values. Here is a scrabblePoints function.

In [33]:
def scrabblePoints(letter):
    "Return the scrabble score associated with a letter."
    if letter in 'aeilnorstu':
        return 1
    elif letter in 'dg':
        return 2
    elif letter in 'bcmp':
        return 3
    elif letter in 'fhvwy':
        return 4
    elif letter in 'k':
        return 5
    elif letter in 'jx':
        return 8
    elif letter in 'qz':
        return 10
    return 0

for letter in 'abdhjkq': 
    print(f"{letter} is worth {scrabblePoints(letter)} points.")
a is worth 1 points.
b is worth 3 points.
d is worth 2 points.
h is worth 4 points.
j is worth 8 points.
k is worth 5 points.
q is worth 10 points.

We can simplify the scrabblePoints function by storing the letters and their points in a dictionary:

In [34]:
scrabbleDict = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2, 
                'h': 4, 'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1, 
                'o': 1, 'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1, 
                'u': 1, 'v': 4, 'w': 4, 'x': 8, 'y': 4, 'z': 10}

Now let's define a simpler scrabblePoints2 function using scrabbleDict:

In [35]:
def scrabblePoints2(letter):
    "Return the scrabble score associated with a letter."
    # Algorithm
    # 1. If letter is in scrabbleDict, return its points
    # 2. Otherwise return 0
    
    # Your code here
    if letter in scrabbleDict:
        return scrabbleDict[letter]
    return 0
In [36]:
# Test with different values
for letter in 'abdhjkq7!': 
    print(f"{letter} is worth {scrabblePoints2(letter)} points.")
a is worth 1 points.
b is worth 3 points.
d is worth 2 points.
h is worth 4 points.
j is worth 8 points.
k is worth 5 points.
q is worth 10 points.
7 is worth 0 points.
! is worth 0 points.

5. More Dictionary Operations

5.1 Literal dictionaries

We can write the entire dictionary as a literal value by wrapping braces around a comma-separated sequence of key:value pairs:

In [37]:
student = {'name': 'Georgia Dome', 'dorm': 'Munger Hall', 
           'section': 2, 
           'year': 2023, 
           'CSMajor?': True}

student
Out[37]:
{'name': 'Georgia Dome',
 'dorm': 'Munger Hall',
 'section': 2,
 'year': 2023,
 'CSMajor?': True}

Unlike in our previous examples, the above student dictionary shows that the values in a dictionary can have different types.

In fact, even the keys in a dictionary can have different types:

In [38]:
# a dictionary may have different types as keys and values
mixedLabelDict = {"orange": "fruit", 
                  3: "March", 
                  "even": [2,4,6,8]} 
mixedLabelDict
Out[38]:
{'orange': 'fruit', 3: 'March', 'even': [2, 4, 6, 8]}

The keys in a dictionary are required to be unique. If you write a dictionary literal that repeats a key, this is not an error, but only the last key/value item with that key will be used.

In [39]:
{'a':1, 'b':2, 'a':3, 'c':4, 'a':5}
Out[39]:
{'a': 5, 'b': 2, 'c': 4}

Your Turn: Create a simple dictionary

  1. Define a dictionary named person that has two keys: 'first' and 'last' and as corresponding values your own first and last names.
  2. Use this dict in a print statement to display: Well done, FIRST LAST! (where FIRST and LAST are your first and last names read from the dictionary).
In [40]:
# create dict
# Your code here
person = {'first': 'Katherine', 'last': 'Johnson'}
In [41]:
# print phrase
# Your code here
print("Well done, " + person['first'], person['last'] + "!")
Well done, Katherine Johnson!

5.2 More examples of dictionaries

We'll use the following dictionaries in later examples in this notebook, so don't forget to run this cell.

In [42]:
student = {'name': 'Georgia Dome', 
           'dorm': 'Munger Hall', 
           'section': 2, 
           'year': 2023, 
           'CSMajor?': True}

phones = {'Gal Gadot': 5558671234, 
          'Trevor Noah': 9996541212, 
          'Paula A. Johnson': 7811234567}

computerScientists = {('Ada','Lovelace'):['ada@babbage.com', 1815],
                      ('Grace', 'Hopper'):['grahope@vassar.edu', 1906],
                      ('Katherine', 'Johnson'):['johnsonk@nasa.gov', 1918],
                      ('Margaret', 'Hamilton'):['mhamilton@mit.edu', 1936]
                      }

# these are contributions of edits by Wikipedia editors
contributions = {
                 'uma52': {2015: 10, 2016: 15},
                 'setam$3': {2012: 23, 2013: 34, 2014: 17},
                 'rid12': {2009: 5, 2010: 18, 2012: 4} 
                }

The dictionary computerScientists contains four important women in the field of computer science. Check out their Wikipedia pages below to learn more about them:

  1. Ada Lovelace: One of the first programmers and credited with writing the first computer program.
  2. Grace Hopper: One of the pioneers of compilers (programs that translate one computer language to another) and the COBOL programming language.
  3. Katherine Johnson: An American mathematician whose prowess in calculating orbital mechanics was essential in verifying calculations by early computers used in NASA flight missions. One of the first African American women to work as a NASA scientist.
  4. Margaret Hamilton: Her team developed the onboard flight software for the Apollo missions, and she was director of the Software Engineering Division of the MIT Instrumentation Laboratory.

President Barack Obama awarded all three American women the Presidential Medal of Freedom.

Your Turn: Practice with Subscripting Dictionaries

In [43]:
student
Out[43]:
{'name': 'Georgia Dome',
 'dorm': 'Munger Hall',
 'section': 2,
 'year': 2023,
 'CSMajor?': True}
In [44]:
# write the expression to retrieve the value 2023 from student
# Your code here
student['year']
Out[44]:
2023
In [45]:
phones
Out[45]:
{'Gal Gadot': 5558671234,
 'Trevor Noah': 9996541212,
 'Paula A. Johnson': 7811234567}
In [46]:
# write the expression to retrieve Gal Gadot's phone number from phones
# Your code here
phones['Gal Gadot']
Out[46]:
5558671234
In [47]:
computerScientists
Out[47]:
{('Ada', 'Lovelace'): ['ada@babbage.com', 1815],
 ('Grace', 'Hopper'): ['grahope@vassar.edu', 1906],
 ('Katherine', 'Johnson'): ['johnsonk@nasa.gov', 1918],
 ('Margaret', 'Hamilton'): ['mhamilton@mit.edu', 1936]}
In [48]:
# write the expression to retrieve Grace Hopper's information from computerScientists
# Your code here
computerScientists[('Grace', 'Hopper')]
Out[48]:
['grahope@vassar.edu', 1906]
In [49]:
# what does this return?
computerScientists[('Ada', 'Lovelace')][0][0]
Out[49]:
'a'

5.3 len gives the Size of a Dictionary (the Number of Key/Value Items)

In [50]:
len(student)
Out[50]:
5
In [51]:
len(scrabbleDict)
Out[51]:
26
In [52]:
len(daysInMonth) # Remember, Dec is missing!
Out[52]:
11

5.4 Dictionary Mutability 1: Growing a Dictionary by Adding a Key/Value Item

Dictionaries are mutable. One way we can change them is by adding new key/value items. This means we can start with an empty dictionary and grow it in much the same way we do a list. This is a common way to create dictionaries in many of our problems.

In [53]:
cart = {} # The empty dictionary
cart['oreos'] = 3.99 # Add the item 'oreos': 3.99; len(cart) is now 1
cart['kiwis'] = 2.54 # Add the item 'kiwis': 2.54; len(cart) is now 2

cart
Out[53]:
{'oreos': 3.99, 'kiwis': 2.54}

Note: Since dictionaries are unordered, the order in which we enter key/value pairs is irrelevant.
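As a sketch of how a dictionary is typically grown inside a loop, the following starts from the empty dictionary and adds one item per element of a list. The sample list words and the name wordLengths are made up for illustration.

```python
words = ['kiwi', 'oreo', 'banana']    # made-up sample data
wordLengths = {}                      # start with the empty dictionary

for word in words:
    wordLengths[word] = len(word)     # add one key/value item per word

print(wordLengths)                    # {'kiwi': 4, 'oreo': 4, 'banana': 6}
```

This mirrors the list pattern of starting with [] and calling append, except that each new item is stored under a key rather than at the end.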

Recall that the dictionary above was missing the month Dec, which has 31 days. Add it to the daysInMonth dictionary below.

In [54]:
# Your code here
daysInMonth['Dec'] = 31

5.5 Dictionary Mutability 2: Changing a Key/Value Item

When a key is already in a dictionary, assigning a value to that key in the dictionary changes the value at that key.

For example, the key Feb is associated with what value in the daysInMonth dictionary?

In [55]:
daysInMonth
Out[55]:
{'Jan': 31,
 'Feb': 28,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}
In [56]:
daysInMonth['Feb']
Out[56]:
28

If it's a leap year, we can change the value associated with Feb to be 29 via assignment with the subscript:

In [57]:
daysInMonth['Feb'] = 29   # change value associated with a key
In [58]:
daysInMonth
Out[58]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}
In [59]:
daysInMonth['Feb']
Out[59]:
29

5.6 Dictionary Keys must be Immutable!

Although dictionaries are mutable, the keys of dictionaries must be immutable.

In [60]:
daysInMonth[['Feb', 2021]] = 28   # try to use a key that has month and year in a list
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[60], line 1
----> 1 daysInMonth[['Feb', 2021]] = 28   # try to use a key that has month and year in a list

TypeError: unhashable type: 'list'

But the following works, because a tuple is immutable:

In [61]:
daysInMonth[('Feb', 2023)] = 28
daysInMonth
Out[61]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31,
 ('Feb', 2023): 28}

The computerScientists dictionary is an example of a dictionary with tuples as keys.

In [62]:
computerScientists
Out[62]:
{('Ada', 'Lovelace'): ['ada@babbage.com', 1815],
 ('Grace', 'Hopper'): ['grahope@vassar.edu', 1906],
 ('Katherine', 'Johnson'): ['johnsonk@nasa.gov', 1918],
 ('Margaret', 'Hamilton'): ['mhamilton@mit.edu', 1936]}

5.7 Dictionary Mutability 3: The pop method

Given a key, the pop method on a dictionary removes the key/value pair with that key from the dictionary and returns the value formerly associated with the key. pop mutates the dictionary.

In [63]:
cart
Out[63]:
{'oreos': 3.99, 'kiwis': 2.54}
In [64]:
cart.pop('oreos')
Out[64]:
3.99
In [65]:
cart
Out[65]:
{'kiwis': 2.54}
In [66]:
daysInMonth
Out[66]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31,
 ('Feb', 2023): 28}
In [67]:
daysInMonth.pop(('Feb', 2023))
Out[67]:
28
In [68]:
daysInMonth
Out[68]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}

QUESTION: It looks like the method pop works similarly to the one for lists. Do you think it will behave the same if we don't provide an argument value for it? Explain.

6. Exercise 2: Word Frequencies

Given text with words, we often want to know how many times each word appears in the text. This is an excellent illustration of the power of dictionaries!

To simplify the problem, assume we want to define a frequencies function that is given a list of words and returns a dictionary that associates each word in the list with the number of times it appears in the list. For example:

frequencies(["house", "bird", "house", "chirp", "feather", "chirp", "chirp"])
=> {'house': 2, 'bird': 1, 'chirp': 3, 'feather': 1}
In [69]:
def frequencies(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    # Algorithm
    # 1. create an empty dict
    # 2. iterate through the words of the given list
    # 3. set the value or increment the value for each word
    # 4. return the dict
    
    # Your code here
    freqDict = {}
    for word in wordList:
        if word in freqDict:
            freqDict[word] += 1
        else:
            freqDict[word] = 1
    return freqDict
In [70]:
frequencies(["house", "bird", "house", "chirp", "feather", "chirp", "chirp"])
Out[70]:
{'house': 2, 'bird': 1, 'chirp': 3, 'feather': 1}

7. View objects associated with Dictionaries: .keys(), .values(), and .items()

The .keys(), .values(), and .items() methods return so-called view objects associated with a dictionary. Each of these methods returns an object conceptually containing only the keys, values, and items of the dictionary, respectively.

In [71]:
daysInMonth # the entire dict
Out[71]:
{'Jan': 31,
 'Feb': 29,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}
In [72]:
daysInMonth.keys()
Out[72]:
dict_keys(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])

The keys in a dict_keys object are technically unordered, but they appear in the order in which the key/value pairs were added to the dictionary.

In [73]:
type(daysInMonth.keys())
Out[73]:
dict_keys

Programs are not supposed to depend on the order of the keys returned by .keys(). Accordingly, dict_keys objects are not sequences and cannot be subscripted with an index:

In [74]:
daysInMonth.keys()[3]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[74], line 1
----> 1 daysInMonth.keys()[3]

TypeError: 'dict_keys' object is not subscriptable
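If an indexable collection of keys really is needed, one option is to copy the keys into a list with the built-in list function. This is a minimal sketch with a hypothetical sample dictionary named monthDays; note that the position of each key in the list simply reflects the order in which the keys were inserted.

```python
monthDays = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30}   # hypothetical sample

keyList = list(monthDays.keys())   # copy the keys into an ordinary list
print(keyList)                     # ['Jan', 'Feb', 'Mar', 'Apr']
print(keyList[3])                  # Apr (a list can be subscripted)
```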

Note that the .keys(), .values(), and .items() methods each return a different type of object.

In [75]:
daysInMonth.values()
Out[75]:
dict_values([31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

Again, the values in a dict_values object are technically unordered, but they are guaranteed to have the same ordering as the corresponding keys returned by .keys().

In [76]:
type(daysInMonth.values())
Out[76]:
dict_values

The view returned by .values() is synchronized with the view returned by .keys(): the month at a given position in one corresponds to the number of days at the same position in the other.

In [77]:
daysInMonth.items()
Out[77]:
dict_items([('Jan', 31), ('Feb', 29), ('Mar', 31), ('Apr', 30), ('May', 31), ('Jun', 30), ('Jul', 31), ('Aug', 31), ('Sep', 30), ('Oct', 31), ('Nov', 30), ('Dec', 31)])
In [78]:
type(daysInMonth.items())
Out[78]:
dict_items

The objects of type dict_keys, dict_values, and dict_items are so-called dictionary views that reflect any subsequent changes to the underlying dictionary from which they were made.

In [79]:
numNames = {'one': 1, 'two': 2, 'three': 3}
ks = numNames.keys() 
vs = numNames.values()
its = numNames.items()
print('keys:', ks)
print('values:', vs)
print('items:', its)
keys: dict_keys(['one', 'two', 'three'])
values: dict_values([1, 2, 3])
items: dict_items([('one', 1), ('two', 2), ('three', 3)])
In [80]:
numNames['four'] = 4
print('keys:', ks)
print('values:', vs)
print('items:', its)
keys: dict_keys(['one', 'two', 'three', 'four'])
values: dict_values([1, 2, 3, 4])
items: dict_items([('one', 1), ('two', 2), ('three', 3), ('four', 4)])
In [81]:
numNames.pop('two')
print('keys:', ks)
print('values:', vs)
print('items:', its)
keys: dict_keys(['one', 'three', 'four'])
values: dict_values([1, 3, 4])
items: dict_items([('one', 1), ('three', 3), ('four', 4)])

8. Iterating over a dictionary

There are many ways to iterate over a dictionary:

  1. over the keys (iterate directly over the dictionary; do not use .keys())
  2. over the values (with .values())
  3. over the items (with .items())
In [82]:
phones
Out[82]:
{'Gal Gadot': 5558671234,
 'Trevor Noah': 9996541212,
 'Paula A. Johnson': 7811234567}
In [83]:
# iterate directly (by default Python goes over the keys, because they are unique)
for key in phones:
    print(key, phones[key])
Gal Gadot 5558671234
Trevor Noah 9996541212
Paula A. Johnson 7811234567

Using for to iterate over a dictionary means iterating over all the keys in the dictionary, so there is no need to use .keys(), which would create an unnecessary object. So we prefer to write for key in phones: rather than for key in phones.keys():.

However, we do need .values() and .items() to iterate over the values and items of a dictionary, respectively.

In [84]:
for val in phones.values():
    print("Call " + str(val) + "!")
Call 5558671234!
Call 9996541212!
Call 7811234567!
In [85]:
# sometimes it is useful to iterate over the items directly
# notice the tuple assignment in the for loop
for name, number in phones.items():
    print(f"Call {name} at {number}.")
Call Gal Gadot at 5558671234.
Call Trevor Noah at 9996541212.
Call Paula A. Johnson at 7811234567.
In [86]:
daysInMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
               'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
               'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}
In [87]:
for month, days in daysInMonth.items():
    print(f"{month} has {days} days.")
Jan has 31 days.
Feb has 28 days.
Mar has 31 days.
Apr has 30 days.
May has 31 days.
Jun has 30 days.
Jul has 31 days.
Aug has 31 days.
Sep has 30 days.
Oct has 31 days.
Nov has 30 days.
Dec has 31 days.
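Iterating over .values() is handy when only the values matter, for example to accumulate a total. The sketch below uses its own copy of the non-leap-year data under the hypothetical name daysSample so that it runs on its own.

```python
# Same non-leap-year data as above, under a separate (hypothetical) name
daysSample = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
              'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
              'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}

total = 0
for days in daysSample.values():   # the keys are not needed here
    total += days

print(total)                       # 365
```

The built-in sum function gives the same result in one step: sum(daysSample.values()).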

Keys and values in a dictionary are asymmetric, in the sense that going from keys to values is easy, but going from values to keys is hard.

While it is never necessary to use a loop to find the value associated with a key, it is necessary to use a loop to find all of the keys associated with a value (there may be more than one!). For example, consider the following function:

In [88]:
def findMonthsWithDays(targetDays):
    ''' Return a list of months that have targetDays'''
    monthList = [] 
    for month, days in daysInMonth.items():
        if days == targetDays:
            monthList.append(month)
    return monthList
In [89]:
findMonthsWithDays(30)
Out[89]:
['Apr', 'Jun', 'Sep', 'Nov']
In [90]:
findMonthsWithDays(27)
Out[90]:
[]

9. Exercise 3: Find Key with Largest Value

Define the function getKeyWithMaxValue that behaves as shown below:

In [2]: getKeyWithMaxValue({'A': 0.25, 'E': 0.36, 
                            'I': 0.16, 'O': 0.18, 'U': 0.05})

Out[2]: 'E'

Hint: Remember the built-in function max; when given a list, it returns the largest value in the list. Also, remember the method values for a dictionary.

In [91]:
def getKeyWithMaxValue(dct):
    """Given a dict whose values are numbers, return the key that
    corresponds to the highest value.
    """
    # One possible algorithm:
    # 1. find the max with the help of the .values method
    # 2. iterate through the keys to find which key has a value that is equal to max
    # 3. return that key (it can be an early return)
    
    # Your code here
    
    maxVal = max(dct.values())
    for letter in dct:
        if dct[letter] == maxVal:
            return letter
In [92]:
getKeyWithMaxValue({'A': 0.25, 'E': 0.36, 'I': 0.16, 'O': 0.18, 'U': 0.05})
Out[92]:
'E'
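For comparison, here is one possible shorter variant, offered as a sketch rather than as the intended solution. It relies on two assumptions beyond the hint: that max accepts an optional key argument (a function applied to each element before comparing), and that the dictionary method get, described later in this notebook, returns the value for a key. The names getKeyWithMaxValue2 and vowelFreqs are hypothetical.

```python
def getKeyWithMaxValue2(dct):
    """Return a key whose value is largest (hypothetical variant)."""
    # max iterates over the keys of dct; key=dct.get tells it to compare
    # the keys by their associated values rather than by the keys themselves.
    return max(dct, key=dct.get)

vowelFreqs = {'A': 0.25, 'E': 0.36, 'I': 0.16, 'O': 0.18, 'U': 0.05}
print(getKeyWithMaxValue2(vowelFreqs))   # E
```

Like the loop version, this returns the first key encountered when several keys share the largest value, and max raises a ValueError if the dictionary is empty.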

10. Exercise 4: Reversing a Dictionary

Define a function reverseDictionary that takes a dict that has many similar values and creates a new dict where the keys are the unique values and the values are lists of the keys.

Example:

reverseDictionary(daysInMonth) => 
{31: ['Jan', 'Mar', 'May', 'Jul', 'Aug', 'Oct', 'Dec'],
 28: ['Feb'],
 30: ['Apr', 'Jun', 'Sep', 'Nov']}
In [93]:
def reverseDictionary(dct):
    """Given a dict that has many repeating values, returns a new dict where 
    the old values become the new keys.  The new values are lists containing
    all the old keys with the same value.
    """
    # Algorithm
    # 1. Create an empty dict
    # 2. Iterate over the dictionary
    # 3. If a key exists, append to the corresponding value (which is a list)
    # 4. If not, create the key:value pair, by assigning a list with one element to the new key
    # 5. return dict
    
    # Your code here
    reverseDct = {}
    
    for key, value in dct.items():
        if value in reverseDct:
            reverseDct[value].append(key)
        else:
            reverseDct[value] = [key]
            
    return reverseDct
In [94]:
reverseDictionary(daysInMonth)
Out[94]:
{31: ['Jan', 'Mar', 'May', 'Jul', 'Aug', 'Oct', 'Dec'],
 28: ['Feb'],
 30: ['Apr', 'Jun', 'Sep', 'Nov']}
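The if/else in the solution can also be folded into one line with the dictionary method setdefault, which returns the value stored at a key and first stores a given default there if the key is missing. The following is a sketch of that alternative; the names reverseDictionary2 and sample are hypothetical.

```python
def reverseDictionary2(dct):
    """Variant of reverseDictionary that uses setdefault (a sketch)."""
    reverseDct = {}
    for key, value in dct.items():
        # setdefault returns the list stored at value, first storing
        # an empty list there if value is not yet a key.
        reverseDct.setdefault(value, []).append(key)
    return reverseDct

sample = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30}
print(reverseDictionary2(sample))   # {31: ['Jan', 'Mar'], 28: ['Feb'], 30: ['Apr']}
```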

11. Digging Deeper With Dictionaries

Use the built-in dict function

The dict function can create a dictionary from a list of tuples, where every tuple has two elements.

In [95]:
dict([('DEU', 49), ('ALB', 355), ('UK', 44)]) # a list of tuples for country codes
Out[95]:
{'DEU': 49, 'ALB': 355, 'UK': 44}

A tuple that is not part of a list will not work:

In [96]:
dict(('USA', 1))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[96], line 1
----> 1 dict(('USA', 1))

ValueError: dictionary update sequence element #0 has length 3; 2 is required

Calling dict with zero arguments creates an empty dictionary:

In [97]:
dict() # creates an empty dict
Out[97]:
{}
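Two further uses of dict, sketched under the assumption that the built-in zip function is available to pair up two lists element by element. The lists countries and codes are sample data made up to match the country-code example.

```python
# A single pair works as long as it is wrapped in a list
print(dict([('USA', 1)]))             # {'USA': 1}

# zip pairs up corresponding elements, and dict turns the pairs into items
countries = ['DEU', 'ALB', 'UK']
codes = [49, 355, 44]
countryCodes = dict(zip(countries, codes))
print(countryCodes)                   # {'DEU': 49, 'ALB': 355, 'UK': 44}
```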

The get method

The method get is used to avoid the step of checking for a key before updating.
This is possible because this method will return a "default" value when the key is not in the dictionary.
In all other cases, it will return the value associated with the given key.

In [98]:
daysInMonth
Out[98]:
{'Jan': 31,
 'Feb': 28,
 'Mar': 31,
 'Apr': 30,
 'May': 31,
 'Jun': 30,
 'Jul': 31,
 'Aug': 31,
 'Sep': 30,
 'Oct': 31,
 'Nov': 30,
 'Dec': 31}
In [99]:
daysInMonth.get('Oct', 'unknown')
Out[99]:
31
In [100]:
daysInMonth.get('March', 'unknown')
Out[100]:
'unknown'
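To illustrate how get avoids the check before an update, here is a sketch of the word-frequency function from Exercise 2 rewritten with get. The name frequencies2 is hypothetical; the behavior is meant to match frequencies.

```python
def frequencies2(wordList):
    """Given a list of words, return a dictionary of word frequencies."""
    freqDict = {}
    for word in wordList:
        # get returns 0 when word is not yet a key, so no membership test is needed
        freqDict[word] = freqDict.get(word, 0) + 1
    return freqDict

print(frequencies2(["house", "bird", "house", "chirp", "feather", "chirp", "chirp"]))
# {'house': 2, 'bird': 1, 'chirp': 3, 'feather': 1}
```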

Remember that if we try to access a non-existing key directly, we'll get a KeyError:

In [101]:
daysInMonth['March']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[101], line 1
----> 1 daysInMonth['March']

KeyError: 'March'

Using get allows us to avoid that error:

In [102]:
daysInMonth.get('March')

QUESTION: Why don't we see anything?

The update method

dict1.update(dict2) mutates dict1 by assigning the key/value pairs of dict2 to dict1.

In [103]:
# let's remind ourselves of the contributions
contributions
Out[103]:
{'uma52': {2015: 10, 2016: 15},
 'setam$3': {2012: 23, 2013: 34, 2014: 17},
 'rid12': {2009: 5, 2010: 18, 2012: 4}}
In [104]:
newContributions = {'brix4': {2011: 39, 2013: 27, 2015: 41},
                    'uma52': {2017: 21}}
In [105]:
contributions.update(newContributions)

QUESTION: Why didn't you see an output from running the cell above?

In [106]:
contributions
Out[106]:
{'uma52': {2017: 21},
 'setam$3': {2012: 23, 2013: 34, 2014: 17},
 'rid12': {2009: 5, 2010: 18, 2012: 4},
 'brix4': {2011: 39, 2013: 27, 2015: 41}}

QUESTION: Why did the 2015 and 2016 contributions for uma52 disappear?

The clear method

We can wipe out the content of a dictionary with clear:

In [107]:
letters = {"a" : 1, "b" : 2}
In [108]:
letters.clear()
letters
Out[108]:
{}

What does "hashable" mean?

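When Python stores a key in a dictionary, it uses the key's hash, an integer returned by the built-in hash function, to decide where to keep the item. An object that can be hashed is called hashable. Python's built-in mutable containers, such as lists and dictionaries, are not hashable, which is why they cannot be used as keys.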

In [109]:
hash("Wellesley")
Out[109]:
6868119656788521145
In [110]:
hash(['Feb', 2015])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[110], line 1
----> 1 hash(['Feb', 2015])

TypeError: unhashable type: 'list'
In [111]:
hash( ('Feb', 2015) ) # Tuples are hashable even though lists are not
Out[111]:
-2755314832634355469
In [112]:
hash(123456) # small integers like this one are their own hash value
Out[112]:
123456
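One subtlety worth a quick sketch: a tuple is hashable only if everything inside it is hashable too. The helper isHashable below is a hypothetical function written for this illustration; it simply tries to call hash and reports whether that worked.

```python
def isHashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False

print(isHashable(('Feb', 2015)))      # True: a tuple of a string and a number
print(isHashable(['Feb', 2015]))      # False: lists are not hashable
print(isHashable(('Feb', [2015])))    # False: the tuple contains a list
```

So a tuple can serve as a dictionary key only when all of its elements can.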

At this point, you don't have to worry about why the keys are hashed, or how the hash function works!
Take more advanced CS courses to learn more.
