We have seen four kinds of sequences so far: strings, lists, ranges, and tuples.
Python can distinguish among them via their delimiters: quotes for strings, square brackets for lists, and parentheses for tuples.
phrase = "Quincy's quilters quit quilting quickly" # a string
numbers = [0, 10, 20, 30, 40, 50] # a list of numbers
courses = ["CS 111", "CS 115", "CS 121",
           "CS 204", "CS 220", "CS 230"] # a list of strings
person = ('Frederick', 'Douglass') # a tuple of strings
date = ('November', 8, 20201) # a mixed tuple
odds = range(1, 10, 2) # a range of numbers
print("type of 'phrase':", type(phrase))
print("type of 'numbers':", type(numbers))
print("type of 'courses':", type(courses))
print("type of 'person':", type(person))
print("type of 'date':", type(date))
print("type of 'odds':", type(odds))
Sequences share many operations: indexing, slicing, the membership operator in, and len to indicate length.
a) Examples of indexing
Outputs in this case are a single element of the sequence.
phrase[14] # access element at index 14
numbers[3] # access element at index 3
person[1] # access element at index 1
odds[2] # access element at index 2
b) Examples of slicing
Outputs in this case are subsequences.
phrase[4:13] # slicing - get the elements from index 4 up to, but not including, index 13
date[:2] # slicing - get the first two elements of the tuple
numbers[2:] # slicing - get all elements starting at index 2
odds[1::2] # slicing - get every other odd number starting at index 1
c) Examples of using the membership operator in
Outputs are boolean values.
2015 in date
50 in numbers
'quit' in phrase
4 in odds
Tuples are often used in Python to do multiple assignments in one single statement:
a, b = 0, 1
a, b
0, 1
Python generates tuples whenever we use commas to separate values:
len(phrase), len(numbers), len(courses), len(person), len(date), len(odds)
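Because the right-hand side is packed into a tuple before any name is assigned, the same mechanism can swap two variables without a temporary. A minimal sketch (the names x, y, and single are made up for this illustration):

```python
x, y = 'left', 'right'
x, y = y, x          # the tuple (y, x) is built first, then unpacked
print(x, y)

# A single value followed by a comma is a one-element tuple
single = 42,
print(type(single), len(single))
```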
In many real-life situations, information is naturally described in terms of two-column tables that associate a key in the first column with a value in the second column, where (1) the keys are unique and (2) the key/value association is rather arbitrary (i.e., there's no simple rule that determines the value from the key). For example:
Month (key) | Days (value) |
---|---|
Jan | 31 |
Feb | 28 |
Mar | 31 |
Apr | 30 |
May | 31 |
Jun | 30 |
Jul | 31 |
Aug | 31 |
Sep | 30 |
Oct | 31 |
Nov | 30 |
Dec | 31 |
Letter (key) | Points (value) | Letter (key) | Points (value) | Letter (key) | Points (value) |
---|---|---|---|---|---|
a | 1 | j | 8 | s | 1 |
b | 3 | k | 5 | t | 1 |
c | 3 | l | 1 | u | 1 |
d | 2 | m | 3 | v | 4 |
e | 1 | n | 1 | w | 4 |
f | 4 | o | 1 | x | 8 |
g | 2 | p | 3 | y | 4 |
h | 4 | q | 10 | z | 10 |
i | 1 | r | 1 | | |
Name (key) | Email (value) |
---|---|
Ada Lovelace | ada@babbage.com |
Grace Hopper | grahop@vassar.edu |
Katherine Johnson | johnsonk@nasa.gov |
Margaret Hamilton | mhamiltonm@mit.edu |
We often want to look up the value associated with a given key. There are many ways to do this, but many of the approaches we're familiar with are tedious to express.
def lookupDaysInMonth1(month):
    if month == 'Jan':
        return 31
    elif month == 'Feb':
        return 28
    elif month == 'Mar':
        return 31
    elif month == 'Apr':
        return 30
    elif month == 'May':
        return 31
    elif month == 'Jun':
        return 30
    elif month == 'Jul':
        return 31
    elif month == 'Aug':
        return 31
    elif month == 'Sep':
        return 30
    elif month == 'Oct':
        return 31
    elif month == 'Nov':
        return 30
    elif month == 'Dec':
        return 31
    else:
        return 'Not a valid month'
def testLookupDaysInMonth1(month):
    print(f'{month} has {lookupDaysInMonth1(month)} days')
testLookupDaysInMonth1('Apr')
testLookupDaysInMonth1('Oct')
testLookupDaysInMonth1('Feb')
testLookupDaysInMonth1('March')
def lookupDaysInMonth2(month):
    if month in ['Jan', 'Mar', 'May', 'Jul', 'Aug', 'Oct', 'Dec']:
        return 31
    elif month in ['Apr', 'Jun', 'Sep', 'Nov']:
        return 30
    elif month in ['Feb']:
        return 28
    else:
        return 'Not a valid month'
def testLookupDaysInMonth2(month):
    print(f'{month} has {lookupDaysInMonth2(month)} days')
testLookupDaysInMonth2('Apr')
testLookupDaysInMonth2('Oct')
testLookupDaysInMonth2('Feb')
testLookupDaysInMonth2('March')
daysInMonthPairs = [
('Jan', 31), ('Feb', 28), ('Mar', 31),
('Apr', 30), ('May', 31), ('Jun', 30),
('Jul', 31), ('Aug', 31), ('Sep', 30),
('Oct', 31), ('Nov', 30), ('Dec', 31)
]
def lookupDaysInMonth3(monthToLookup):
    for mon, days in daysInMonthPairs:
        if monthToLookup == mon:
            return days # Early return!
    return 'Not a valid month'
def testLookupDaysInMonth3(month):
    print(f'{month} has {lookupDaysInMonth3(month)} days')
testLookupDaysInMonth3('Apr')
testLookupDaysInMonth3('Oct')
testLookupDaysInMonth3('Feb')
testLookupDaysInMonth3('March')
The final approach has the nice feature that it separates the specification of the key/value pairs from the code that loops through pairs searching for the pair with the desired key, and then returns the value of that key.
Although this final approach is simpler than the earlier complex conditionals, it still has a big downside: the time it takes to find the value associated with a key is proportional to the number of key/value pairs. In practice, we could have millions of keys (or more!), and it might take a long time to find the desired key if we examine every key/value pair one-by-one.
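To make the cost of a one-by-one search concrete, here is a rough sketch that counts how many pairs a linear scan examines before it finds a key. The function lookupWithCount and the lists smallPairs and largePairs are hypothetical names introduced only for this illustration:

```python
def lookupWithCount(pairs, keyToFind):
    """Scan pairs in order; return (value, number of pairs examined)."""
    examined = 0
    for key, value in pairs:
        examined += 1
        if key == keyToFind:
            return value, examined
    return None, examined

smallPairs = [(n, n * n) for n in range(10)]
largePairs = [(n, n * n) for n in range(100_000)]

# Looking up the last key forces the scan to visit every pair
print(lookupWithCount(smallPairs, 9))
print(lookupWithCount(largePairs, 99_999))
```

In the worst case the number of pairs examined equals the length of the list, so the work grows in step with the amount of data.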
daysInMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
'Sep': 30, 'Oct': 31, 'Nov': 30} # Dec is missing on purpose
daysInMonth
Each key/value item is written key: value, and a dictionary is written as a comma-separated collection of items delimited by curly braces.
The type of a dictionary is dict:
type(daysInMonth)
Unlike with sequences (such as strings, lists, and tuples), where the order of elements matters and we can access elements by an index related to their order, the order of key/value items in a dictionary does not matter and we cannot access the items by an index.
Note: Dictionaries behind the scenes do actually maintain the order in which keys/values were inserted into the dictionary. This explains why, when we run the cell above, the months are displayed in the same order they were written. But you should not treat dictionaries as an ordered collection in the same way you would a sequence like a list. You cannot access a dictionary by index. You cannot append to a dictionary because a dictionary does not have a "last" item. Therefore, conceptually you should treat the key/value items as unordered elements in the dictionary.
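One way to see why it is reasonable to think of the items as unordered is to compare equality for lists and for dictionaries. A small sketch with made-up data (listA, listB, dictA, and dictB are illustrative names):

```python
listA = [('Jan', 31), ('Feb', 28)]
listB = [('Feb', 28), ('Jan', 31)]
print(listA == listB)   # False: lists compare element by element, in order

dictA = {'Jan': 31, 'Feb': 28}
dictB = {'Feb': 28, 'Jan': 31}
print(dictA == dictB)   # True: only the key/value items matter, not their order
```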
The most important feature of a dictionary is that we can use the subscripting notation dictionary[key] to look up the value associated with key in dictionary:
daysInMonth['Apr']
daysInMonth['Oct']
daysInMonth['Feb']
Importantly, "under the hood" this lookup does not require searching through the key/value items one-by-one. Instead, it is almost immediate to find the value associated with a key, independent of how many key/value items there are. So looking up a key with millions of key/value items takes about the same time as looking it up with only dozens of items!
When a key does not exist in the dictionary, looking up a key results in a KeyError:
daysInMonth['March']
Because dictionaries are not sequences, attempting to access items by an integer index will also fail with a KeyError:
daysInMonth[3]
However, integer indices are OK when the dictionary keys are themselves integers:
integerNames = {1: 'one', 3: 'three', 4: 'four', 7: 'seven'}
integerNames[3]
integerNames[2]
in to Check if a Key exists in a Dictionary
The in operator is used to check if a key is in a dictionary:
'Apr' in daysInMonth
'March' in daysInMonth
Note that in behaves differently in sequences and dictionaries:
In a sequence, in determines if an element is in the sequence.
In a dictionary, in determines if a key appears in a key/value item in the dictionary.
Using the in operator, we can implement the dictionary version of the lookupDaysInMonth function from above:
def lookupDaysInMonth(monthToLookup):
    if monthToLookup in daysInMonth:
        return daysInMonth[monthToLookup] # No loop is needed!!!
    else:
        return 'Not a valid month'
def testLookupDaysInMonth(month):
    print(f'{month} has {lookupDaysInMonth(month)} days')
testLookupDaysInMonth('Apr')
testLookupDaysInMonth('Oct')
testLookupDaysInMonth('Feb')
testLookupDaysInMonth('March')
IMPORTANT: Note that we do not have to loop through the key/value pairs to find the value associated with the key. We just use the key as the subscript, and the dictionary "magically" returns the corresponding value. This is easy, powerful, and quick!
scrabblePoints
A great use for dictionaries is to store data that can simplify choosing among different values. Here is a scrabblePoints function.
def scrabblePoints(letter):
    "Return the scrabble score associated with a letter."
    if letter in 'aeilnorstu':
        return 1
    elif letter in 'dg':
        return 2
    elif letter in 'bcmp':
        return 3
    elif letter in 'fhvwy':
        return 4
    elif letter in 'k':
        return 5
    elif letter in 'jx':
        return 8
    elif letter in 'qz':
        return 10
    return 0
for letter in 'abdhjkq':
    print(f"{letter} is worth {scrabblePoints(letter)} points.")
We can simplify the scrabblePoints function by storing the letters and their points in a dictionary:
scrabbleDict = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f': 4, 'g': 2,
'h': 4, 'i': 1, 'j': 8, 'k': 5, 'l': 1, 'm': 3, 'n': 1,
'o': 1, 'p': 3, 'q': 10, 'r': 1, 's': 1, 't': 1,
'u': 1, 'v': 4, 'w': 4, 'x': 8, 'y': 4, 'z': 10}
Now let's define a simpler scrabblePoints2 function using scrabbleDict:
def scrabblePoints2(letter):
    "Return the scrabble score associated with a letter."
    # Algorithm
    # 1. If letter is in scrabbleDict, return its points
    # 2. Otherwise return 0
    # Your code here
    if letter in scrabbleDict:
        return scrabbleDict[letter]
    return 0
# Test with different values
for letter in 'abdhjkq7!':
    print(f"{letter} is worth {scrabblePoints2(letter)} points.")
student = {'name': 'Georgia Dome', 'dorm': 'Munger Hall',
'section': 2,
'year': 2023,
'CSMajor?': True}
student
Unlike in our previous examples, the above student dictionary shows that the values in a dictionary can have different types.
In fact, even the keys in a dictionary can have different types:
# a dictionary may have different types as keys and values
mixedLabelDict = {"orange": "fruit",
3: "March",
"even": [2,4,6,8]}
mixedLabelDict
The keys in a dictionary are required to be unique. If you write a dictionary literal that repeats a key, this is not an error, but only the last key/value item with that key will be used.
{'a':1, 'b':2, 'a':3, 'c':4, 'a':5}
Create a dictionary named person that has two keys: 'first' and 'last' and as corresponding values your own first and last names.
Write a print statement to display: Well done, FIRST LAST! (where FIRST and LAST are your first and last names read from the dictionary).
# create dict
# Your code here
person = {'first': 'Katherine', 'last': 'Johnson'}
# print phrase
# Your code here
print("Well done, " + person['first'], person['last'] + "!")
We'll use the following dictionaries in later examples in this notebook, so don't forget to run this cell.
student = {'name': 'Georgia Dome',
'dorm': 'Munger Hall',
'section': 2,
'year': 2023,
'CSMajor?': True}
phones = {'Gal Gadot': 5558671234,
'Trevor Noah': 9996541212,
'Paula A. Johnson': 7811234567}
computerScientists = {('Ada','Lovelace'):['ada@babbage.com', 1815],
('Grace', 'Hopper'):['grahope@vassar.edu', 1906],
('Katherine', 'Johnson'):['johnsonk@nasa.gov', 1918],
('Margaret', 'Hamilton'):['mhamilton@mit.edu', 1936]
}
# these are contributions of edits by Wikipedia editors
contributions = {
'uma52': {2015: 10, 2016: 15},
'setam$3': {2012: 23, 2013: 34, 2014: 17},
'rid12': {2009: 5, 2010: 18, 2012: 4}
}
The dictionary computerScientists contains four important women in the field of computer science. Check out their Wikipedia pages to learn more about them.
President Barack Obama awarded all three American women the Presidential Medal of Freedom.
student
# write the expression to retrieve the value 2023 from student
# Your code here
student['year']
phones
# write the expression to retrieve Gal Gadot's phone number from phones
# Your code here
phones['Gal Gadot']
computerScientists
# write the expression to retrieve Grace Hopper's information from computerScientists
# Your code here
computerScientists[('Grace', 'Hopper')]
# what does this return?
computerScientists[('Ada', 'Lovelace')][0][0]
len gives the Size of a Dictionary (the Number of Key/Value Items)
len(student)
len(scrabbleDict)
len(daysInMonth) # Remember, Dec is missing!
Dictionaries are mutable. One way we can change them is by adding new key/value items. This means we can start with an empty dictionary and grow it in much the same way we do a list. This is a common way to create dictionaries in many of our problems.
cart = {} # The empty dictionary
cart['oreos'] = 3.99 # Add the item 'oreos': 3.99; len(cart) is now 1
cart['kiwis'] = 2.54 # Add the item 'kiwis': 2.54; len(cart) is now 2
cart
Note: Since dictionaries are unordered, the order in which we enter key/value pairs is irrelevant.
Recall that the dictionary above was missing the month Dec, which has 31 days. Add it to the daysInMonth dictionary below.
# Your code here
daysInMonth['Dec'] = 31
When a key is already in a dictionary, assigning a value to that key in the dictionary changes the value at that key.
For example, the key Feb is associated with what value in the daysInMonth dictionary?
daysInMonth
daysInMonth['Feb']
If it's a leap year, we can change the value associated with Feb to be 29 via assignment with the subscript:
daysInMonth['Feb'] = 29 # change value associated with a key
daysInMonth
daysInMonth['Feb']
Although dictionaries are mutable, the keys of dictionaries must be immutable.
daysInMonth[['Feb', 2021]] = 28 # try to use a key that has month and year in a list
But the following works, because a tuple is immutable:
daysInMonth[('Feb', 2023)] = 28
daysInMonth
The computerScientists dictionary is an example of a dictionary with tuples as keys.
computerScientists
pop method
Given a key, the pop method on a dictionary removes the key/value pair with that key from the dictionary and returns the value formerly associated with the key. pop mutates the dictionary.
cart
cart.pop('oreos')
cart
daysInMonth
daysInMonth.pop(('Feb', 2023))
daysInMonth
QUESTION: It looks like the method pop works similarly to the one for lists. Do you think it will behave the same if we don't provide an argument value for it? Explain.
Given text with words, we often want to know how many times each word appears in the text. This is an excellent illustration of the power of dictionaries!
To simplify the problem, assume we want to define a frequencies function that is given a list of words and returns a dictionary that associates each word in the list with the number of times it appears in the list. For example:
frequencies(["house", "bird", "house", "chirp", "feather", "chirp", "chirp"])
=> {'house': 2, 'bird': 1, 'chirp': 3, 'feather': 1}
def frequencies(wordList):
    """Given a list of words, returns a dictionary of word frequencies"""
    # Algorithm
    # 1. create an empty dict
    # 2. iterate through the words of the given list
    # 3. set the value or increment the value for each word
    # 4. return the dict
    # Your code here
    freqDict = {}
    for word in wordList:
        if word in freqDict:
            freqDict[word] += 1
        else:
            freqDict[word] = 1
    return freqDict
frequencies(["house", "bird", "house", "chirp", "feather", "chirp", "chirp"])
.keys(), .values(), and .items()
The .keys(), .values(), and .items() methods return so-called view objects associated with a dictionary. Each of these methods returns an object conceptually containing only the keys, values, and items of the dictionary, respectively.
daysInMonth # the entire dict
daysInMonth.keys()
The keys in a dict_keys object should be treated as unordered, even though they appear in the order in which the key/value pairs were added to the dictionary.
type(daysInMonth.keys())
Programs are not supposed to depend on the order of keys returned by .keys(), and so dict_keys objects are not sequences, and cannot be subscripted with an index:
daysInMonth.keys()[3]
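If a program really does need positional access, one option is to copy the keys into a list first. The sketch below uses a stand-alone dictionary (seasons and keyList are made-up names); note that the list is a snapshot and, unlike the view, will not track later changes to the dictionary:

```python
seasons = {'winter': 'cold', 'spring': 'mild', 'summer': 'hot'}

try:
    seasons.keys()[1]            # a dict_keys object cannot be subscripted
except TypeError as err:
    print('TypeError:', err)

keyList = list(seasons.keys())   # list(seasons) would give the same result
print(keyList[1])
```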
Note that the .keys(), .values(), and .items() methods each return a different type of object.
daysInMonth.values()
Again, the values in a dict_values object should be treated as unordered, but they are guaranteed to have the same ordering as the corresponding keys returned by .keys().
type(daysInMonth.values())
The view returned by .values() is synchronized with the view returned by .keys(): when iterating over both, corresponding months and days appear at the same position.
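One way to check this correspondence is to pair the keys with the values by position using zip and compare the result with .items(). The capitals dictionary and the name paired are invented for this sketch:

```python
capitals = {'France': 'Paris', 'Japan': 'Tokyo', 'Kenya': 'Nairobi'}

# zip walks both views in step, pairing items found at the same position
paired = list(zip(capitals.keys(), capitals.values()))
print(paired)
print(paired == list(capitals.items()))
```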
daysInMonth.items()
type(daysInMonth.items())
The objects of type dict_keys, dict_values, and dict_items are so-called dictionary views that reflect any subsequent changes to the underlying dictionary from which they were made.
numNames = {'one': 1, 'two': 2, 'three': 3}
ks = numNames.keys()
vs = numNames.values()
its = numNames.items()
print('keys:', ks)
print('values:', vs)
print('items:', its)
numNames['four'] = 4
print('keys:', ks)
print('values:', vs)
print('items:', its)
numNames.pop('two')
print('keys:', ks)
print('values:', vs)
print('items:', its)
There are many ways to iterate over a dictionary:
over the keys (.keys())
over the values (.values())
over the items (.items())
phones
# iterate directly (by default Python goes over the keys, because they are unique)
for key in phones:
    print(key, phones[key])
Using for to iterate over a dictionary means iterating over all the keys in the dictionary, so there is no need to use .keys(), which would create an unnecessary object. So we prefer to write for key in phones: rather than for key in phones.keys():.
However, we do need .values() and .items() to iterate over the values and items of a dictionary, respectively.
for val in phones.values():
    print("Call " + str(val) + "!")
# sometimes it is useful to iterate over the items directly
# notice the tuple assignment in the for loop
for name, number in phones.items():
    print(f"Call {name} at {number}.")
daysInMonth = {'Jan': 31, 'Feb': 28, 'Mar': 31, 'Apr': 30,
'May': 31, 'Jun': 30, 'Jul': 31, 'Aug': 31,
'Sep': 30, 'Oct': 31, 'Nov': 30, 'Dec': 31}
for month, days in daysInMonth.items():
    print(f"{month} has {days} days.")
Keys and values in a dictionary are asymmetric, in the sense that going from keys to values is easy, but going from values to keys is hard.
While it is never necessary to use a loop to find the value associated with a key, it is necessary to use a loop to find all of the keys associated with a value (there may be more than one!). For example, consider the following function:
def findMonthsWithDays(targetDays):
    ''' Return a list of months that have targetDays'''
    monthList = []
    for month, days in daysInMonth.items():
        if days == targetDays:
            monthList.append(month)
    return monthList
findMonthsWithDays(30)
findMonthsWithDays(27)
Define the function getKeyWithMaxValue that behaves as shown below:
In [2]: getKeyWithMaxValue({'A': 0.25, 'E': 0.36,
'I': 0.16, 'O': 0.18, 'U': 0.05})
Out[2]: 'E'
Hint: Remember the built-in function max; when given a list, it returns the largest value in the list. Also, remember the method values for a dictionary.
def getKeyWithMaxValue(dct):
    """Given a dict whose values are numbers, return the key that
    corresponds to the highest value.
    """
    # One possible algorithm:
    # 1. find the max with the help of the .values method
    # 2. iterate through the keys to find which key has a value that is equal to max
    # 3. return that key (it can be an early return)
    # Your code here
    maxVal = max(dct.values())
    for letter in dct:
        if dct[letter] == maxVal:
            return letter
getKeyWithMaxValue({'A': 0.25, 'E': 0.36, 'I': 0.16, 'O': 0.18, 'U': 0.05})
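As an aside, the built-in max also accepts a key argument, so the same result can be obtained without an explicit loop. This is an alternative sketch rather than the solution the exercise asks for; the names getKeyWithMaxValue2 and vowelFreqs are hypothetical:

```python
def getKeyWithMaxValue2(dct):
    """Return a key whose value is largest (the first such key on ties)."""
    # Iterating over dct yields its keys; dct.get maps each key to its value,
    # so max compares the keys by their associated values.
    return max(dct, key=dct.get)

vowelFreqs = {'A': 0.25, 'E': 0.36, 'I': 0.16, 'O': 0.18, 'U': 0.05}
print(getKeyWithMaxValue2(vowelFreqs))
```

Like max on an empty list, this version raises a ValueError when given an empty dictionary.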
Define a function reverseDictionary that takes a dict that has many similar values and creates a new dict where the keys are the unique values and the values are lists of the keys.
Example:
reverseDictionary(daysInMonth) =>
{31: ['Jan', 'Mar', 'May', 'Jul', 'Aug', 'Oct', 'Dec'],
28: ['Feb'],
30: ['Apr', 'Jun', 'Sep', 'Nov']}
def reverseDictionary(dct):
    """Given a dict that has many repeating values, returns a new dict where
    the old values become the new keys. The new values are lists containing
    all the old keys with the same value.
    """
    # Algorithm
    # 1. Create an empty dict
    # 2. Iterate over the dictionary
    # 3. If a key exists, append to the corresponding value (which is a list)
    # 4. If not, create the key:value pair, by assigning a list with one element to the new key
    # 5. return dict
    # Your code here
    reverseDct = {}
    for key, value in dct.items():
        if value in reverseDct:
            reverseDct[value].append(key)
        else:
            reverseDct[value] = [key]
    return reverseDct
reverseDictionary(daysInMonth)
dict function
The dict function can create a dictionary from a list of tuples, where every tuple has two elements.
dict([('DEU', 49), ('ALB', 355), ('UK', 44)]) # a list of tuples for country codes
A tuple that is not part of a list will not work:
dict(('USA', 1))
Calling dict with zero arguments creates an empty dictionary:
dict() # creates an empty dict
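Since zip produces two-element tuples, dict can also build a dictionary from two parallel lists. A minimal sketch with made-up lists (countries, codes, and callingCodes are illustrative names):

```python
countries = ['DEU', 'ALB', 'UK']
codes = [49, 355, 44]

# zip pairs the lists position by position; dict turns each pair into an item
callingCodes = dict(zip(countries, codes))
print(callingCodes)

# Wrapping a lone tuple in a list makes it acceptable to dict
print(dict([('USA', 1)]))
```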
get method
The method get is used to avoid the step of checking for a key before updating.
This is possible because this method will return a "default" value when the key is not in the dictionary.
In all other cases, it will return the value associated with the given key.
daysInMonth
daysInMonth.get('Oct', 'unknown')
daysInMonth.get('March', 'unknown')
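To illustrate the "checking before updating" step that get avoids, here is a sketch of a word-counting function written with get. The name frequenciesWithGet is hypothetical, and the default of 0 is chosen so that a word seen for the first time ends up with a count of 1:

```python
def frequenciesWithGet(wordList):
    """Given a list of words, return a dictionary of word frequencies."""
    freqDict = {}
    for word in wordList:
        # get returns the current count, or 0 if word is not yet a key
        freqDict[word] = freqDict.get(word, 0) + 1
    return freqDict

print(frequenciesWithGet(["house", "bird", "house", "chirp", "chirp"]))
```

Compared with an if/else on word in freqDict, the loop body shrinks to a single assignment.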
Remember that if we try to access a non-existing key directly, we'll get a KeyError:
daysInMonth['March']
Using get allows us to avoid that error:
daysInMonth.get('March')
QUESTION: Why don't we see anything?
update method
dict1.update(dict2) mutates dict1 by assigning the key/value pairs of dict2 to dict1.
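A minimal sketch with two small, made-up dictionaries (prices and changes are hypothetical names) shows what this assignment means in practice: keys that are new get added, and keys that already exist receive the value from the argument.

```python
prices = {'oreos': 3.99, 'kiwis': 2.54}
changes = {'kiwis': 2.99, 'milk': 4.25}

prices.update(changes)   # mutates prices in place
print(prices)            # 'kiwis' has a new value and 'milk' was added
print(changes)           # the argument is left unchanged
```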
# let's remind ourselves of the contributions
contributions
newContributions = {'brix4': {2011: 39, 2013: 27, 2015: 41},
'uma52': {2017: 21}}
contributions.update(newContributions)
QUESTION: Why didn't you see any output from running the cell above?
contributions
QUESTION: Why did the 2015 and 2016 contributions for uma52 disappear?
clear method
We can wipe out the content of a dictionary with clear:
letters = {"a" : 1, "b" : 2}
letters.clear()
letters
When Python stores the keys of a dictionary in memory, it stores their hashes, which are integers returned by the hash function. Among the built-in types, only immutable objects can be hashed.
hash("Wellesley")
hash(['Feb', 2015])
hash( ('Feb', 2015) ) # Tuples are hashable even though lists are not
hash(123456) # small integers are their own hash value
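The following sketch, which only considers built-in types, shows the properties that matter for dictionary keys: equal objects have equal hashes, and asking for the hash of a list (or of a tuple that contains a list) raises a TypeError. The names keyA and keyB are made up for this example:

```python
keyA = ('Feb', 2015)
keyB = ('Feb', 2015)
print(hash(keyA) == hash(keyB))   # equal tuples hash to the same integer

try:
    hash(['Feb', 2015])
except TypeError as err:
    print('TypeError:', err)

# A tuple is hashable only if everything inside it is hashable
try:
    hash(('Feb', [2015]))
except TypeError as err:
    print('TypeError:', err)
```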
At this point, you don't have to worry about why the keys are hashed, or how the hash function works!
Take more advanced CS courses to learn more.