Problem Set 8 - Due Tue, Nov 12, 2019 at 23:59pm

Due Fri, Nov 15, 2019 at 6:00pm*

*Updated 11/09/19 at 10:48am

Reading material and Resources

Reading

  1. Slides and notebooks from Lec 15 Sorting & lambda, Lec 16 Dictionaries, Lec 17 Accumulation Pattern with Lists and Dictionaries, Lec 18 Sets + More Testing & Debugging
  2. Lab 09 on Dictionaries
  3. Think Python, Ch. 11: Dictionaries
  4. Python documentation on dictionary operations
  5. Python documentation on set operations

About this Problem Set

This problem set will give you practice with dictionaries, which are an important data structure for the rest of the course. It will also give you practice with testing (in the context of finding differences between large dictionaries), data analysis, and data visualization.

Use this shared Google Doc to find a pair programming partner and record your partner. Remember that you can talk with other people about high-level problem-solving strategies, but sharing code is an Honor Code violation.

Other notes:


Task 0: Scrambled Solutions

This is an individual problem which you must complete on your own, although you can ask for help from the CS111 staff.

This time we have three puzzles for you to solve to review the solutions to problem set 7. Go to:

CS 111 Puzzles

and select the options under Problem Set 7.

As before, please download and submit your solution files, and email pmawhort@wellesley.edu if you run into trouble.


Task 1: Titanic Analysis

This task is an individual problem which you must complete on your own, though you may ask for help from the CS111 staff.

Background

In this task, you will analyze data about passengers from the Titanic, a large ship that tragically sank in the waters of the North Atlantic Ocean in 1912.

Passenger data can be found in the titanicdata.txt file provided in the ps08 folder. Study the structure of the data in titanicdata.txt. Then, study the code in the readtitanic.py file, which reads the contents of titanicdata.txt and transforms it into a list of dictionaries, where every list element is a passenger represented by a dictionary. Because the information for the passengers is incomplete, not all passenger dictionaries have the same keys. Try the examples below in your interactive console after running readtitanic.py. Notice how passenger 2168 has six keys in the dictionary, but passenger 2203 has only four keys.

In [25]: len(passengerList)
Out[25]: 2208

In[26]: passengerList[2168]
Out[26]:
{'age': '31.0',
 'class': '1st Class',
 'group': 'Servant',
 'job': 'Personal Maid',
 'name': 'Miss Helen Alice Wilson',
 'status': 'survivor'}

In[27]: passsengerList[2203]
Out[27]:
{'age': '22.0',
 'class': '3rd Class',
 'name': 'Mr Mapriededer Zakarian',
 'status': 'victim'}

There is even a passenger who has only three keys in the dictionary. UNGRADED CHALLENGE: Can you find this passenger using a single line list comprehension statement? UNGRADED EXTRA CHALLENGE: Can you find both this passenger and their index in the list with one line of code?

Open up titanic.py in your ps08 folder. This file imports the passengerList variable from readtitanic.py.

Your Task: createSurvivalDictionary

In this task, you will define a createSurvivalDictionary function that takes a list of passenger dictionaries (such as passengerList) and creates a dictionary that keeps track of the number of survivors and victims in every cabin class, as well as the calculated survival rate by cabin class. Here is the contract of createSurvivalDictionary, which is provided in titanic.py:

def createSurvivalDictionary(passengers):
   """Given a list of passenger dictionaries, return a dictionary
   whose keys are all the cabin classes that appear in passengers.
   The value associated with each cabin class name should
   itself be another dictionary that has three key/value pairs:
     (1) The key "survivors" maps to the number of survivors in
         the cabin class;
     (2) The key "victims" maps to the number of victims in the
         cabin class;
     (3) The key "survivalRate" maps to the survival rate in the
         cabin class (a floating point number rounded to 3 decimal digits)
    """

Examples

In []: createSurvivalDictionary(passengerList[0:10])
Out[]: 
{'2nd Class': {'survivalRate': 0.5, 'survivors': 1, 'victims': 1},
 '3rd Class': {'survivalRate': 0.571, 'survivors': 4, 'victims': 3},
 'Victualling': {'survivalRate': 0.0, 'survivors': 0, 'victims': 1}}

In []: createSurvivalDictionary(passengerList)
Out[]: 
{'1st Class': {'survivalRate': 0.62, 'survivors': 201, 'victims': 123},
 '2nd Class': {'survivalRate': 0.418, 'survivors': 119, 'victims': 166},
 '3rd Class': {'survivalRate': 0.254, 'survivors': 180, 'victims': 528},
 'A la Carte': {'survivalRate': 0.043, 'survivors': 3, 'victims': 66},
 'Deck': {'survivalRate': 0.652, 'survivors': 43, 'victims': 23},
 'Engine': {'survivalRate': 0.222, 'survivors': 72, 'victims': 253},
 'Victualling': {'survivalRate': 0.218, 'survivors': 94, 'victims': 337}}

Algorithm Notes

Here we sketch two slightly different algorithms for solving this subtask. We recommend that you implement whichever one you find more intuitive. (And if you prefer yet another approach that works, that's fine, too.)

Algorithm 1

  1. Find the set of all cabin classes. Do this by accumulating all the cabin classes into a list, and then apply the set function described below.

    • The built-in function set can convert a list with duplicate elements into a set of unique elements. For example:

      In [50]: fruitList = ['apple', 'orange', 'apple', 'orange', 'lemon']
      
      In [51]: set(fruitList)
      Out[51]: {'apple', 'lemon', 'orange'}
      
      In [52]: print(set(fruitList))
      set(['orange', 'lemon', 'apple'])

    For more on the set function and sets, see the Set Notes of Task 2b or in Lec 18.

    • NOTE: if you feel comfortable with list comprehensions, this is a great place to use one.

    • DO NOT assume that the cabin classes will be the exact ones that appear in passengerList ('1st Class', '2nd Class', 'Deck', etc.). Instead, your code needs to discover what cabin classes are used in the actual passengers list argument. See the NOTES section below for more on this.
  2. Make a survival dictionary that associates each cabin class with a mini-dictionary for cabins , {'survivors': 0, 'victims': 0}.

    • DO NOT create a single mini-dictionary miniDct = {'survivors': 0, 'victims': 0}, and then use the variable miniDict to map it to every key. This will give incorrect results due to the aliasing issues we've been warning you about, because all of your keys will be pointing to the same dictionary value.

      Instead, within a for loop assign the value {'survivors': 0, 'victims': 0} directly to each cabin class (from the set you generated in step 1.)

  3. Iterate through the passenger list, updating the mini-dictionary for the cabin class of each passenger.

  4. Once all the passengers have been processed, the correct survivalRate value can be assigned to each mini-dictionary based on the number of survivors and victims (adding a new key:value item to the existing mini-dicts).


Algorithm 2

Algorithm 1 iterates through all passengers in the passenger list twice (once to find the names of the cabin classes and create mini-dictionaries for each class; and a second time to populate the mini-dictionaries). Algorithm 2 iterates through all passengers only once, creating a mini-dictionary for each cabin class the first time it encounters that cabin class.

  1. Create an initially empty survival dictionary that will eventually associate each cabin class with a mini-dictionary for cabins.

  2. Iterate through the passenger list, performing the following substeps for each passenger:

    a. Determine the cabin class of the passenger.

    b. If this cabin class does not already exists in the survival dictionary, associate it in that dictionary with a new mini-dictionary {'survivors': 0, 'victims': 0}

    c. Update mini-dict for cabin class with survival status of passenger.

  3. Once all the passengers have been processed, the correct survivalRate value can be assigned to each mini-dictionary based on the number of survivors and victims (adding a new key:value item to the existing mini-dicts).

Other Notes


Task 2: Unjumble

This task is a Partner-recommended problem in which it is strongly recommended that you work with a partner as part of a two-person team. Use this shared Google Doc to find a pair programming partner and record your partner

Word Jumble is a popular game that appears in many newspapers and online. The game involves "unjumbling" English words whose letters have been reordered. For instance, the jumbled word ytikt can be unjumbled to kitty. Here is one version of the game online.


You will create a Python program in unjumble.py that is able to successfully unjumble jumbled words. Here is how to produce candidate words from a jumbled word:


We have provided five lists of English words of different sizes.

Flesh out the functions and iteration table as described in the subtasks below. All your code should be in your unjumble.py file in the ps08 folder from the download directory.

Subtask 2a: unjumbleKey

In this part, you will define the unjumbleKey function.

The unjumbleKey function takes a single argument, a string, and returns a string that is the unjumble key of its input -- i.e., a string consisting of the lowercase versions of all characters of the string in alphabetical order. For example:

In [1]: unjumbleKey('argle')
Out[1]: 'aeglr'

In [2]: unjumbleKey('regal')
Out[2]: 'aeglr'

In [3]: unjumbleKey('Star')
Out[3]: 'arst'

In [4]: unjumbleKey('HISTRIONICS')
Out[4]: 'chiiinorsst'

Notes

Subtask 2b: makeUnjumbleDictionary Iteration Table

In this part, you will create an iteration table for the makeUnjumbleDictionary function. This will prepare you to define the makeUnjumbleDictionary function in Subtask 2c.

The makeUnjumbleDictionary function takes a single argument, a list of words, and returns a dictionary that associates the unjumble key of every word in the word list with the set of all words with that unjumble key. All words in the same set will be anagrams--i.e., words that all have exactly the same letters (including repeated ones), but in different orders.

Here we show how makeUnjumbleDicionary should behave on two of the word lists imported by unjumble.py: minisculeWordList (6 words) and tinyWordList (33 words):

In [1]: minisculeWordList
Out[1]: ['arts', 'post', 'rats', 'regal', 'spot', 'star']

In [2]: makeUnjumbleDictionary(minisculeWordList)
Out[2]: 
{'aeglr': {'regal'},
 'arst': {'arts', 'rats', 'star'},
 'opst': {'post', 'spot'}}

In [3]: tinyWordList
# Note: Canopy prints each of the 33 words on separate lines, but here we
# compress them into five lines to save space.
Out[3]: 
['alerting', 'altering', 'arts', 'caster', 'caters', 'crates', 'glare', 
 'histrionics', 'integral', 'lager', 'large', 'opts', 'post', 'pots',
 'rats', 'reacts', 'recast', 'regal', 'relating', 'restrain', 'retrains',
 'spot', 'star', 'strainer', 'stop', 'tars', 'terrains', 'traces', 'trainers',
 'triangle', 'trichinosis', 'tops', 'tsar']

In [4]: makeUnjumbleDictionary(tinyWordList)
Out[4]:
{'acerst': {'caster', 'caters', 'crates', 'reacts', 'recast', 'traces'},
 'aegilnrt': {'alerting', 'altering', 'integral', 'relating', 'triangle'},
 'aeglr': {'glare', 'lager', 'large', 'regal'},
 'aeinrrst': {'restrain', 'retains', 'strainer', 'terrains', 'trainers'},
 'arst': {'arts', 'rats', 'star', 'tars', 'tsar'},
 'chiiinorsst': {'histrionics', 'trichinosis'},
 'opst': {'opts', 'post', 'pots', 'spot', 'stop', 'tops'}}


Notes on Sets

The dictionary values in this problem are sets, not lists. Like lists, sets are mutable collections. Unlike lists, they are unordered and cannot contain duplicate elements. Below are some notes on creating and using sets. See this Python documentation for more details on how to use sets.

Your Subtask: Create an Iteration Table

Before you start to define the makeUnjumbleDictionary function, we ask you to create an iteration table to understand how the loop in this function should process the word list minsculeWordList, which contains six words:

['arts', 'post', 'rats', 'regal', 'spot', 'star']

Assume that the loop has a state variable named unjumbleDictSoFar that will be used to "grow" the resulting dictionary as each word is processed. Initially, unjumbleDictSoFar should be the empty dictionary. The purpose of this subtask is for you to think carefully about how this dictionary should change when each word is processed.

Here is a template for this iteration table that includes the first two rows:

word unjumbleDictSoFar
{}
'arts' {'arst': {'arts'}}
'post'
'rats'
'regal'
'spot'
'star'

Your subtask is to fill out the dictionaries in the remaining five rows. Keep in mind that each dictionary key should be a string that's an unjumble key for a word, and each value in the dictionary should be a set of words with they key as their unjumble key.

As in PS06 Task 3a, you will submit your iteration table as a list of tuples, so that it can be tested by Codder. In this case, you should flesh out the following skeleton of the list of tuples named minisculeDictTable in unjumble.py.

minisculeDictTable = [
    ('word', 'unjumbleDictSoFar'),
    (None, {}),
    ('arts', {'arst': {'arts'}}),
    ('post', None), # Replace None by the correct dictionary 
    ('rats', None), # Replace None by the correct dictionary 
    ('regal', None), # Replace None by the correct dictionary 
    ('spot', None), # Replace None by the correct dictionary 
    ('star', None) # Replace None by the correct dictionary 
    ]

Codder will test the that the structure of the tuple list in your iteration table is valid, but the contents of the tuple will not be checked by Codder until the final grading pass after submission.

Subtask 2c: makeUnjumbleDictionary Function

In this subtask, you will define the makeUnjumbleDictionary function described in Subtask 2b.

unjumble.py imports five word lists: minisculeWordList (6 words), tinyWordList (33 words), smallWordList (1,158 words), mediumWordList (45,420 words), and largeWordList (445,528 words). Test makeUnjumbleDictionary on these lists:

In [1]: minisculeDict = makeUnjumbleDictionary(minisculeWordList)

In [2]: minisculeDict
Out[2]: 
{'aeglr': {'regal'},
 'arst': {'arts', 'rats', 'star'},
 'opst': {'post', 'spot'}}

In [3]: print(minisculeDict)
{'opst': {'spot', 'post'}, 'arst': {'arts', 'star', 'rats'}, 'aeglr': {'regal'}}
# Note that when a unjumble dict is printed rather than returned, 
# the order of both the unjumble keys and the set elements are unpredictable!

In [4]: tinyDict = makeUnjumbleDictionary(tinyWordList)

In [5]: tinyDict
Out[5]:
{'acerst': {'caster', 'caters', 'crates', 'reacts', 'recast', 'traces'},
 'aegilnrt': {'alerting', 'altering', 'integral', 'relating', 'triangle'},
 'aeglr': {'glare', 'lager', 'large', 'regal'},
 'aeinrrst': {'restrain', 'retains', 'strainer', 'terrains', 'trainers'},
 'arst': {'arts', 'rats', 'star', 'tars', 'tsar'},
 'chiiinorsst': {'histrionics', 'trichinosis'},
 'opst': {'opts', 'post', 'pots', 'spot', 'stop', 'tops'}}

In [6]: smallDict = makeUnjumbleDictionary(smallWordList)

# smallDict and the other test dictionaries have too many elements to display
# them all, so let's just print its length.
In [7]: print('len of smallDict:', len(smallDict))
len of smallDict: 483

In [8]: mediumDict = makeUnjumbleDictionary(mediumWordList)

In [9]: print('len of mediumDict:', len(mediumDict))
len of mediumDict: 42270

In [10]: largeDict = makeUnjumbleDictionary(largeWordList)

In [11]: print('len of largeDict:', len(largeDict))
len of largeDict: 388509

Notes

Subtask 2d: unjumble

The unjumble function takes an unjumble dictionary (of the type created by makeUnjumbleDictionary) and a string (that doesn't need to be a meaningful word), and returns a set of all words in the dictionary to which the input string unjumbles. If there are no such words, it returns the empty set. For example,

In [1]: tinyDict = makeUnjumbleDictionary(tinyWordList)

In [2]: unjumble(tinyDict, 'esrat')
Out[2]: set()

In [3]: print(unjumble(tinyDict, 'esrat'))
set()

In [4]: mediumDict = makeUnjumbleDictionary(mediumWordList)

In [5]: unjumble(mediumDict, 'esrat')
Out[5]: {'aster', 'rates', 'stare', 'tears'}

In [6]: print(unjumble(mediumDict, 'esrat'))
{'aster', 'stare', 'tears', 'rates'} # unpredictable order when printed

In [7]: largeDict = makeUnjumbleDictionary(largeWordList)

In [8]: unjumble(largeDict, 'esrat')
Out[8]: {'arest', 'arets', 'aster', 'astre', 'earst', 'rates', 'reast', 'resat', 
'serta', 'stare', 'stear', 'strae', 'tares', 'tarse', 'taser', 'tears', 'teras'}
# In Canopy and Jupyter notebooks, each string would appear on a separate line,  
# but we've condensed them into two lines here. 

For fun, use your unjumble function as an assistant in playing the online jumble game.

Notes

Subtask 2e: mostAnagrams

Which words have the most anagrams? Define a function named mostAnagrams that takes an unjumble dictionary (of the type created by makeUnjumbleDictionary) and returns the largest set of anagrams in that dictionary.

In [1]: tinyDict = makeUnjumbleDictionary(tinyWordList)

In [2]: mostAnagrams(tinyDict)
Out[2]: {'opts', 'post', 'pots', 'spot', 'stop', 'tops'}
# This is one of two correct results for mostAnagrams(tinyDict). 
# The other is {'caster', 'caters', 'crates', 'reacts', 'recast', 'traces'}

In [3]: mediumDict = makeUnjumbleDictionary(mediumWordList)

In [4]: mostAnagrams(mediumDict)
Out[4]: {'carets', 'caster', 'caters', 'crates', 'reacts', 'recast', 'traces'}

In [5]: mostAnagrams(largeDict)
Out[5]: {'arest',
 'arets',
 'aster',
 'astre',
 'earst',
 'rates',
 'reast',
 'resat',
 'serta',
 'stare',
 'stear',
 'strae',
 'tares',
 'tarse',
 'taser',
 'tears',
 'teras'}

Notes

Subtask 2f: compareUnjumbleDicts

Motivation

As noted earlier in Lec 12 Testing and Debugging, a major downside to using Codder for testing code is that it doesn't give you practice with writing your own testing code, which is an important real-world programming skill. For example, if you go on to take CS230 Data Structures, in that course no testing code will be provided; you will be expected to write all of your testing code from scratch by yourself!

This problem is designed to give you some more practice with writing and using your own version of the kind of testing code used in Codder. In particular, you will address the following problem: how do you know whether or not the dictionaries created by makeUnjumbleDictionary in Subtask 2c are correct?

To simplify things, we provide you with with the correct unjumbler dictionaries minisculeDictCorrect, tinyDictCorrect, smallDictCorrect, mediumDictCorrect, and largeDictCorrect. Then your task becomes comparing the correct dictionary with your dictionary that is returned by your makeUnjumbleDictionary.

It turns out it is trivial in Python to compare two dictionaries dict1 and dict2 for equality: just use ==. So dict1 == dict2 is True if and only if:

  1. The dictionaries have exactly the same keys; and

  2. The value at each of the keys is the same (as determined by ==). In other words, it is the case that for all keys k in the dictionary, dict1[k] == dict2[k].

But when two unjumble dictionaries differ, you don't just want to know that they differ, you want to know how they differ, because this provides valuable information on possible bugs in makeUnjumbleDictionary. After all, you would be upset if Codder just reported that two huge dictionaries were different, but didn't give any indication of ways in which they differed.

Your Task

In this task, you will flesh out a function compareUnjumbleDicts that takes two arguments:

  1. the expected (correct) unjumble dictionary

  2. an actual unjumble dictionary returned by makeUnjumbleDictionary

This function should print out information highlighting any differences between the dictionaries, as detailed below.

When it is done, this function should also return True if the dictionaries are the same and False if they differ.

Here is the information that compareUnjumbleDicts should print:

1. Number of Keys:

If the number of keys in the two dictionaries differs, this message should be printed (where of course the correct number of keys should be used):

The expected dict has 483 keys but the actual dict has 509 keys.

But if the dictionaries have the same number of keys, this message should be printed:

Both the expected and actual dicts have 483 keys.

2. Differences in Keys:

If the two sets of keys are exactly the same, then this message should be printed:

Both the expected and actual dicts have exactly the same keys.

But if there are any expected keys that are not found in the actual dict (missing keys), this message should be printed, which should list up to the first five such keys (in sorted order):

Missing keys in the expected dict but not the actual dict:
["'aehor", "'aehos", "'aelory", "'aillnosuv", "'beinor"]

Additionally if there are any actual keys that are not found in the expected dict (extra keys), this message should be printed, which should list up to the first five such keys (in sorted order):

Extra keys in the actual dict but not the expected one:
["'boeinr", "'coellnno", "'connoor", "'doell", "'doellnno"]

3. Differences in Values:

If there are any keys in both dictionaries at which the two unjumble dictionaries have different values, then this message should be printed for the first such key in sorted order:

Both dicts have key "abel", but the sets at this key differ:
* the expected set has sorted values
  ['Abel', 'Bela', 'Elba', 'abel', 'able', 'albe', 'bael', 'bale', 'beal', 'bela', 'blae', 'blea']
* the actual set has sorted values
  ['abel', 'able', 'albe', 'bael', 'bale', 'beal', 'bela', 'blae', 'blea']

But if the dictionaries have exactly the same values at all keys that they share, then this message should be printed:

Both dicts have exactly the same sets for all keys that are in both dicts.

More Set Operations

Implementing the above behavior is simplified by using set operations on the keys and values in the unjumble dictionary. Here we illustrate these operations in the context of some sample sets:

In [1]: s1 = {'cat', 'ferret', 'ant', 'dog'}

In [2]: s2 = {'bunny', 'dog', 'cat', 'emu'}

Even though sets are inherently unordered, Canopy will show their elements in sorted order when they are the result of an expression in the interaction pane:

In [3]: s1
Out[3]: {'cat', 'ferret', 'ant', 'dog'}

However, when print is used on a set in Canopy, the order of the elements is unpredictable:

In [4]: print(s1)
{'ant', 'ferret', 'dog', 'cat'}

The union of sets s1 and s2, written s1.union(s2) or s1 | s2, returns a new set that has all the elements that are in either set:

In [5]: s1.union(s2)
Out[5]: {'ant', 'bunny', 'cat', 'dog', 'emu', 'ferret'}

In [6]: s1 | s2
Out[6]: {'ant', 'bunny', 'cat', 'dog', 'emu', 'ferret'}

The intersection of sets s1 and s2, written s1.intersection(s2) or s1 & s2, returns a new set that has all the elements that are in both sets:

In [7]: s1.intersection(s2)
Out[7]: {'cat', 'dog'}

In [8]: s1 & s2
Out[8]: {'cat', 'dog'}

The difference of sets s1 and s2, written s1.difference(s2) or s1 - s2, returns a all the elements that are in the first set but not the second:

In [9]: s1.difference(s2)
Out[9]: {'ant', 'ferret'}

In [10]: s1 - s2
Out[10]: {'ant', 'ferret'}

In [11]: s2.difference(s1)
Out[11]: {'bunny', 'emu'}

In [12]: s2 - s1
Out[12]: {'bunny', 'emu'}

Sets are tested for equality using ==. The order in which elements are written in a set is irrelevant for determining equality:

In [13]: s1 == s2
Out[13]: False

In [14]: {'cat', 'bunny', 'dog', 'ant'} == {'dog', 'ant', 'cat', 'bunny'} 
Out[14]: True

A set can be created from a list by use the set function, and a list can be create from a set via the list function:

In [15]: set(['cat', 'emu', 'ant'])
Out[15]: {'ant', 'cat', 'emu'}

In [16]: set(['emu', 'cat', 'emu', 'ant', 'emu', 'cat'])
Out[16]: {'ant', 'cat', 'emu'}

In [17]: list({'cat', 'bunny', 'dog', 'ant'})
Out[17]: ['ant', 'cat', 'dog', 'bunny']

Note that set removes duplicates from a list, and the order of elements returned by list from a set is unpredictable.

Testing compareUnjumbleDicts

unjumble.py imports six versions of a small dictionary named smallDict1 through smallDict6. Some of these are the same as the correct dictionary smallDictCorrect, while others differ from the correct dictionary in their keys and/or values.

You can test your compareUnjumbleDicts on these test dictionaries. For example:

In [18]: compareUnjumbleDicts(smallDictCorrect, smallDictCorrect)

Both the expected and actual dicts have 483 keys.

Both the expected and actual dicts have exactly the same keys.

Both dicts have exactly the same sets for all keys that are in both dicts.
Out[18]: True

In [19]: compareUnjumbleDicts(smallDictCorrect, smallDict1)

The expected dict has 483 keys but the actual dict has 509 keys.

Missing keys in the expected dict but not the actual dict:
["'aehor", "'aehos", "'aelory", "'aillnosuv", "'beinor"]

Extra keys in the actual dict but not the expected one:
["'boeinr", "'coellnno", "'connoor", "'doell", "'doellnno"]

Both dicts have key "abel", but the sets at this key differ:
* the expected set has sorted values
  ['Abel', 'Bela', 'Elba', 'abel', 'able', 'albe', 'bael', 'bale', 'beal', 'bela', 'blae', 'blea']
* the actual set has sorted values
  ['abel', 'able', 'albe', 'bael', 'bale', 'beal', 'bela', 'blae', 'blea']
Out[19]: False

In unjumble.py, the if __name__ == '__main__': block contains the following line that you can uncomment and run to test all six test dictionaries.

testCompareUnjumbleDicts(smallDictTestCases)

Here are the outputs expected from these tests:

In [47]: testCompareUnjumbleDicts(smallDictTestCases)

------------------------------------------------------------
Testing compareUnjumbleDicts (on small dictionaries):

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compareUnjumbleDicts(smallDictCorrect, smallDict1)

The expected dict has 483 keys but the actual dict has 509 keys.

Missing keys in the expected dict but not the actual dict:
["'aehor", "'aehos", "'aelory", "'aillnosuv", "'beinor"]

Extra keys in the actual dict but not the expected one:
["'boeinr", "'coellnno", "'connoor", "'doell", "'doellnno"]

Both dicts have key "abel", but the sets at this key differ:
* the expected set has sorted values
  ['Abel', 'Bela', 'Elba', 'abel', 'able', 'albe', 'bael', 'bale', 'beal', 'bela', 'blae', 'blea']
* the actual set has sorted values
  ['abel', 'able', 'albe', 'bael', 'bale', 'beal', 'bela', 'blae', 'blea']

compareUnjumbleDicts returns False

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compareUnjumbleDicts(smallDictCorrect, smallDict2)

Both the expected and actual dicts have 483 keys.

Both the expected and actual dicts have exactly the same keys.

Both dicts have key "'aehor", but the sets at this key differ:
* the expected set has sorted values
  ["O'Hare"]
* the actual set has sorted values
  ['OHare']

compareUnjumbleDicts returns False

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compareUnjumbleDicts(smallDictCorrect, smallDict3)

Both the expected and actual dicts have 483 keys.

Both the expected and actual dicts have exactly the same keys.

Both dicts have exactly the same sets for all keys that are in both dicts.

compareUnjumbleDicts returns True

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compareUnjumbleDicts(smallDictCorrect, smallDict4)

The expected dict has 483 keys but the actual dict has 482 keys.

Missing keys in the expected dict but not the actual dict:
["'aehor", "'aehos", "'aelory", "'aillnosuv", "'beinor"]

Extra keys in the actual dict but not the expected one:
['aaaabdgghitv', 'adlmou', 'aehor', 'aehos', 'aelory']

Both dicts have exactly the same sets for all keys that are in both dicts.

compareUnjumbleDicts returns False

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compareUnjumbleDicts(smallDictCorrect, smallDict5)

Both the expected and actual dicts have 483 keys.

Both the expected and actual dicts have exactly the same keys.

Both dicts have exactly the same sets for all keys that are in both dicts.

compareUnjumbleDicts returns True

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compareUnjumbleDicts(smallDictCorrect, smallDict6)

Both the expected and actual dicts have 483 keys.

Both the expected and actual dicts have exactly the same keys.

Both dicts have key "'aehor", but the sets at this key differ:
* the expected set has sorted values
  ["O'Hare"]
* the actual set has sorted values
  ["o'hare"]

compareUnjumbleDicts returns False

Using compareUnjumbleDicts to test your makeUnjumbleDictionary

The whole point of defining compareUnjumbleDicts is to test if the dictionaries returned by your makeUnjumbleDictionary function are correct.

For example, suppose you created an unjumble dictionary named smallDict as follows:

smallDict = makeUnjumbleDictionary(smallWordList)

Then you can test smallDict against the provided smallDictCorrect dictionary as follows:

In [48]: compareUnjumbleDicts(smallDictCorrect, smallDict)

Both the expected and actual dicts have 483 keys.

Both the expected and actual dicts have exactly the same keys.

Both dicts have exactly the same sets for all keys that are in both dicts.
Out[48]: True

If you get a True result, you pass the test case for smallDict. But if you get a False result, you will get debugging information that should help you track down and fix bugs in your definition of makeUnjumbleDictionary.

In unjumble.py, the if __name__ == '__main__': block contains the following line that you can uncomment and run to use compareUnjumbleDicts to test the dictionaries created for all five wordlists by your makeUnjumbleDictionary function.

testCompareUnjumbleDicts(myDictTestCases)

If all five test cases return True, then your makeUnjumbleDictionary function passes the test cases. Otherwise, you'll need to debug at least one of makeUnjumbleDictionary or compareUnjumbleDicts.


Task 3: Honor Code Form

As in the previous psets, your honor code submission for this pset will involve defining entering values for the variables in the honorcode.py file.


How to turn in this Problem Set