CS111: Problem Set 8.

Softcopy Due Wednesday, April 1, 2015 at 11:59pm (no fooling!)
Hardcopy Due on Thursday, April 2, 2015, in Lecture

Reading

  1. Think Python, Chapter 11: Dictionaries
  2. Python documentation on dictionaries
  3. Python documentation on dictionary operations

About this Problem Set

This problem set is intended to give you practice with dictionaries.

It has been designed so that you do not have to start the pset until after you return from Spring Break. In particular:

  • There are only two tasks, not three.
  • There is no partner task, so there is no need to coordinate with someone else.
  • The assignment is due on a Wednesday night (April 1), not a Tuesday night (March 31).

All code for this assignment is available in the ps08_programs folder in the cs111/download directory within your cs server account.

This problem set will be graded using this grade sheet.

How to turn in this Problem Set

You must submit both soft-copy (electronic) and hard-copy (printed) versions of your problem set.

Softcopy submission

Save your unjumble.py file in the Unjumbler folder and your menu.py file in the WellesleyFresh folder within your ps08_programs folder. Submit the entire ps08_programs folder (renamed to yourname_ps08_programs) to your drop folder on the cs server using Fetch (for Macs) or WinSCP (for PCs).

Hardcopy submission

Print out your unjumble.py and menu.py files. Staple these pages together with a cover page, and submit this hardcopy package in lab on Thursday.


Task 1: Unjumbler

Word Jumble is a popular game that appears in many newspapers and online. The game involves "unjumbling" English words whose letters have been reordered. For instance, the jumbled word ytikt can be unjumbled to kitty. Here is one version of the online game.

In this problem, you will create a Python program that is able to successfully unjumble jumbled words. Your program will start with a file of English words. For each word in the file, it will convert that word to an unjumble key by sorting the lowercase versions of the characters of the word in alphabetical order. For instance, the unjumble key for 'regal' would be 'aeglr'. It will create an unjumble dictionary that associates each such unjumble key with a list of all words that have the same unjumble key. For example, the unjumble dictionary will associate the unjumble key 'aeglr' with the list of words ['glare', 'lager', 'large', 'regal']. Finally, to unjumble a string you simply convert it to its unjumble key and look that up in the unjumble dictionary. For example, to unjumble 'rgael', you convert it to its key 'aeglr', and look this up in the unjumble dictionary to find that it unjumbles to any of the words in the list ['glare', 'lager', 'large', 'regal'].
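
The grouping step described above is an instance of a common dictionary pattern: compute a key for each item, then collect the items that share a key in a list. The sketch below shows only that pattern, using word length as a stand-in key rather than the unjumble key. The names groupByKey and lookup are hypothetical and are not part of the starter code.

```python
def groupByKey(words, keyFunction):
    '''Return a dictionary mapping each key produced by keyFunction
    to the list of words (in their original order) that produce it.'''
    groups = {}
    for word in words:
        key = keyFunction(word)
        if key not in groups:
            groups[key] = []      # first word seen with this key
        groups[key].append(word)
    return groups

def lookup(groups, key):
    '''Return the list stored under key, or the empty list if absent.'''
    return groups.get(key, [])

# Stand-in key: the length of the word (NOT the unjumble key).
byLength = groupByKey(['arts', 'glare', 'post', 'regal', 'tsar'], len)
print(byLength[4])          # ['arts', 'post', 'tsar']
print(lookup(byLength, 7))  # []
```

Using get with a default of [] means a missing key yields an empty list instead of raising a KeyError.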

To complete this task, you will need to define the following four functions in the file unjumble.py, which can be found in the Unjumbler folder within the ps08_programs folder available from the download directory on the cs server.

  1. unjumbleKey takes a single argument, a string, and returns a string that is the unjumble key of its input --- i.e., a string consisting of the lowercase versions of all characters of the string in alphabetical order. For example,
    In[1]: unjumbleKey('argle')
    Out[1]: 'aeglr'
    In[2]: unjumbleKey('regal')
    Out[2]: 'aeglr'
    In[3]: unjumbleKey('Star')
    Out[3]: 'arst'
    In[4]: unjumbleKey('histrionics')
    Out[4]: 'chiiinorsst'

    Notes:

    • When applied to a string, the sorted function returns a list of the characters in sorted order:
      In[4]: sorted('abracadabra')
      Out[4]: ['a','a','a','a','a','b','b','c','d','r','r']
    • The join method on strings can be used to glue a list of strings together, using the string to which join is applied as a separator.
      In[5]: ':'.join(['bunny','cat','dog'])
      Out[5]: 'bunny:cat:dog'
      In[6]: ' '.join(['bunny','cat','dog'])
      Out[6]: 'bunny cat dog'
      In[7]: ''.join(['bunny','cat','dog'])
      Out[7]: 'bunnycatdog'
  2. makeUnjumbleDictionary takes a single argument, the name (a string) of a wordlist file that has one word per line, and returns a dictionary that associates the unjumble key of every word in the wordlist with the list of all words with that unjumble key. All words in the same list are anagrams --- i.e., words that all have exactly the same letters (including repeated ones), but in different orders.

    The Unjumbler folder in the ps08_programs folder contains three wordlist files: tinyWordList.txt (33 words), mediumWordList.txt (45,425 words), and largeWordList.txt (438,712 words). For example, the file tinyWordList.txt contains the following words:

        alerting
        altering
        arts
        caster
        caters
        crates
        glare
        histrionics
        integral
        lager
        large
        rats
        reacts
        recast
        regal
        relating
        restrain
        retrains
        opts
        post
        pots
        spot
        star
        strainer
        stop
        tars
        terrains
        traces
        trainers
        triangle
        trichinosis
        tops
        tsar

    The invocation of makeUnjumbleDictionary on this file should return a dictionary with 7 key/value items:

    In[8]: tinyUnjumbleDict = makeUnjumbleDictionary('tinyWordList.txt')
    In[9]: tinyUnjumbleDict
    Out[9]: {'acerst':['caster','caters','crates','reacts','recast','traces'], 'aegilnrt':['alerting','altering','integral','relating','triangle'], 'aeglr':['glare','lager','large','regal'], 'aeinrrst':['restrain','retrains','strainer','terrains','trainers'], 'arst':['arts','rats','star','tars','tsar'], 'chiiinorsst':['histrionics','trichinosis'], 'opst':['opts','post','pots','spot','stop','tops']}

    Note: Use the following getLinesFromFile function (already defined in unjumble.py) to create a list of words from a file that contains one word per line.

    def getLinesFromFile(fileName):
        '''Returns a list of strings where each string is a line from the
            specified file. The trailing newline character is not included
            as part of a string in the list.'''
        with open(fileName, 'r') as wordFile:  # the file is closed when this block ends
            return [line.strip() for line in wordFile]
    

    For example:

    In[10]: getLinesFromFile('tinyWordList.txt')
    Out[10]: ['alerting', 'altering', 'arts', 'caster', 'caters', 'crates', 'glare', 'histrionics', 'integral', 'lager', 'large', 'rats', 'reacts', 'recast', 'regal', 'relating', 'restrain', 'retrains', 'opts', 'post', 'pots', 'spot', 'star', 'strainer', 'stop', 'tars', 'terrains', 'traces', 'trainers', 'triangle', 'trichinosis', 'tops', 'tsar']

    In order for this to work, make sure that in Canopy you connect to the folder ps08_programs/Unjumbler.

  3. unjumble takes an unjumble dictionary (associating each unjumble key with a list of words) and a string and returns a list of all words in the dictionary to which the input string unjumbles. If there are no such words, it returns the empty list. For example,
    In[11]: tinyUnjumbleDict = makeUnjumbleDictionary('tinyWordList.txt')
    In[12]: unjumble(tinyUnjumbleDict, 'argle')
    Out[12]: ['glare','lager','large','regal']
    In[13]: unjumble(tinyUnjumbleDict, 'arst')
    Out[13]: ['arts','rats','star','tars','tsar']
    In[14]: unjumble(tinyUnjumbleDict, 'foobar')
    Out[14]: []
  4. For fun, use your unjumble function as an assistant in playing the online game.

  5. mostAnagrams takes an unjumble dictionary (associating each unjumble key with a list of words) and returns the longest anagram list in the dictionary. This is the list of all the anagrams of the words with the most anagrams in the dictionary. If more than one list has the same length, it can arbitrarily return one such list.

    For example, mostAnagrams(tinyUnjumbleDict) should return one of the following two lists:

    • ['caster','caters','crates','reacts','recast','traces']
    • ['opts','post','pots','spot','stop','tops']

    In addition to defining the mostAnagrams function, use this function to determine the longest list of anagrams in largeWordList.txt, and include this list in a comment at the bottom of your unjumble.py file.
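
One way to find a longest list among a dictionary's values is a single pass that remembers the longest list seen so far. The sketch below uses a made-up dictionary of pet names. The name longestValue is hypothetical, and when several lists tie for longest, it keeps whichever one the loop reaches first.

```python
def longestValue(dictionary):
    '''Return a longest list among the values of dictionary,
    or the empty list if the dictionary has no values.'''
    longest = []
    for value in dictionary.values():
        if len(value) > len(longest):
            longest = value
    return longest

# Made-up data for illustration only.
pets = {'cat': ['Tom', 'Felix'], 'dog': ['Rex', 'Fido', 'Spot'], 'fish': ['Nemo']}
print(longestValue(pets))   # ['Rex', 'Fido', 'Spot']
```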


Task 2: Wellesley Fresh

    In this task, you will investigate meal options from Wellesley Fresh menu descriptions scraped from the dorm menu web sites. As part of the starter code that you can download, we have provided you with a Python module wellesleyFresh containing a function named getAllEntries that, when invoked on zero arguments, returns a list of all the Wellesley Fresh menu entries for a particular week. Each entry in the returned list is a dictionary with keys 'day', 'hall', 'meal', and 'dish', such as:

    {'day':'Thursday', 'hall':'Bates', 'meal':'Homestyle Lunch', 'dish':'Baked Chicken Bruschetta'}

    For example, here's a sample call to wellesleyFresh.getAllEntries() (where most of the entries have been omitted to save space):

    In[2]: import wellesleyFresh
    In[3]: allEntries = wellesleyFresh.getAllEntries()
    In[4]: allEntries
    Out[4]: [{'dish': 'Waffle Bar', 'meal': 'Continental Breakfast', 'hall': 'Bae Pao Lu Chow', 'day': 'Monday'},
    {'dish': 'Ham Cabbage', 'meal': 'Soup', 'hall': 'Bae Pao Lu Chow', 'day': 'Monday'},
    {'dish': 'Macaroni Cheese', 'meal': 'Homestyle Dinner', 'hall': 'Bae Pao Lu Chow', 'day': 'Monday'},
    ... many entries omitted here ...
    {'dish': 'Roast Turkey Breast', 'meal': 'Homestyle Dinner', 'hall': 'Tower', 'day': 'Sunday'},
    {'dish': 'Roast Turkey Dinner', 'meal': 'Allergy Station', 'hall': 'Tower', 'day': 'Sunday'}]
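
Because each entry is an ordinary dictionary, a list of entries can be filtered by comparing field values. The sketch below works on a three-entry list copied from the sample output above, so it runs without the wellesleyFresh module. The helper name entriesMatching is hypothetical and is not part of that module.

```python
def entriesMatching(entries, field, value):
    '''Return the entries whose given field equals value.'''
    return [entry for entry in entries if entry[field] == value]

# A tiny stand-in for wellesleyFresh.getAllEntries(), copied from the sample.
sampleEntries = [
    {'dish': 'Waffle Bar', 'meal': 'Continental Breakfast',
     'hall': 'Bae Pao Lu Chow', 'day': 'Monday'},
    {'dish': 'Roast Turkey Breast', 'meal': 'Homestyle Dinner',
     'hall': 'Tower', 'day': 'Sunday'},
    {'dish': 'Roast Turkey Dinner', 'meal': 'Allergy Station',
     'hall': 'Tower', 'day': 'Sunday'},
]

towerEntries = entriesMatching(sampleEntries, 'hall', 'Tower')
for entry in towerEntries:
    # Fields are looked up by key, so their order in the dictionary is irrelevant.
    print(entry['meal'] + ':' + entry['dish'])
```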

    Your goal is to define three functions, as specified below, in a new file menu.py that you create from scratch in the folder ps08_programs/WellesleyFresh. In addition to the three specified functions, you are encouraged to define any additional helper functions as you deem appropriate. Make sure that your menu.py file begins with the line import wellesleyFresh.

    1. getMenu has three parameters: (1) a list of dictionaries corresponding to a week's Wellesley Fresh menu entries, (2) a day of the week, and (3) the name of a dining hall. The function should print out a line of the form meal:dish for each entry from the week for the specified dining hall available on the specified day. The order of the lines is unimportant. Here's a sample invocation of getMenu:
      In[15]: getMenu(allEntries, 'Thursday', 'Bates')
      Deli:Egg Salad
      Daily Soups:Fish Chowder
      Breakfast:Scrambled Eggs Or Whites Veg
      Homestyle Lunch:Wilted Kale
      Homestyle Lunch:Green Beans
      Deli:Turkey Ham
      Homestyle Dinner:Steamed Corn
      Breakfast:Hardboiled Eggs Veg
      Fusion:Turkey Ala King
      Homestyle Dinner:Beef Ribs
      Breakfast:Waffle Station Veg
      Homestyle Dinner:Creamed Spinach
      Breakfast:Steel Cut Oatmeal
      Global Grill:Grilled Keilbasa
      Fusion:Vegetarian Ala King
      Homestyle Lunch:Baked Chicken Bruschetta
      Pasta:Pesto
      Homestyle Lunch:Garlic Toast
      Homestyle Dinner:Baked Potato Bar
      Global Grill:Potato Perogies
      Pasta:Marinara
      Deli:Cheeses Veg
      Global Grill:Potato Pancakes
    2. printDishWords has two parameters: (1) a list of dictionaries corresponding to the week's Wellesley Fresh menu entries and (2) an integer indicating the minimum number of times a word must appear in the week's dish descriptions for the function to print it. The function prints out in alphabetical order all (lowercased) words that appear at least the specified number of times anywhere in the "dish" descriptions for the week. The number of occurrences of each of these words is also printed, with a tab character ('\t') separating the word and number of occurrences. For example, in one week, the following 27 words appeared at least 20 times in the week's dish descriptions:
      In[17]: printDishWords(allEntries, 20)
      bar     50
      bean    20
      beans   20
      beef    26
      boiled  24
      cheese  63
      chicken 45
      egg     35
      eggs    63
      fresh   28
      fruit   23
      ham     25
      hard    24
      marinara        20
      oatmeal 26
      potatoes        24
      rice    33
      roasted 25
      salad   36
      sauce   26
      scrambled       34
      sliced  24
      turkey  37
      v       67
      veg     48
      waffle  33
      whites  24
      

      Note: use Python's list sort method or the sorted function to perform sorting. (Check the Python documentation to understand the difference between these!)
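
As a quick illustration of the note above: sorted returns a new sorted list and leaves its argument unchanged, while the list method sort reorders the list in place and returns None. The sketch also shows one common way to count occurrences with a dictionary. The phrase list and the name countWords are made up for illustration.

```python
def countWords(phrases):
    '''Return a dictionary mapping each lowercased word in the
    list of phrases to the number of times it occurs.'''
    counts = {}
    for phrase in phrases:
        for word in phrase.lower().split():
            counts[word] = counts.get(word, 0) + 1
    return counts

counts = countWords(['Red Fish', 'Blue Fish', 'One Fish'])

words = list(counts.keys())
inOrder = sorted(words)    # new list; words itself is not reordered
result = words.sort()      # sorts words in place; result is None

for word in inOrder:
    print(word + '\t' + str(counts[word]))
```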

    3. searchMenu has two parameters: (1) a list of dictionaries corresponding to the week's Wellesley Fresh menu entries and (2) a list of search term strings. The function prints out, in alphabetical order, lines of the form hall:meal:dish:day, where each line corresponds to a menu entry in which each of the search term strings matches at least one of the entry field values (hall, meal, dish, day). A search term matches a field if the lowercase version of the term is a substring of the lowercase version of the field. For example, the search terms Chow and chow match the hall Bae Pao Lu Chow and the dish Seafood Chowder; and the search terms at, At, and AT match the hall Bates, the day Saturday, the meal Allergy Station, and the dish Oatmeal Bar.
      In[19]: searchMenu(allEntries, ['chicken', 'dinner'])
      Bae Pao Lu Chow:Homestyle Dinner:Chicken Broccoli Alfredo:Wednesday
      Bae Pao Lu Chow:Homestyle Dinner:Garlic Chicken:Sunday
      Bae Pao Lu Chow:Homestyle Dinner:Tuscan Rotisserie Chicken:Monday
      Bates:Homestyle Dinner:Chicken Divan:Wednesday
      Bates:Homestyle Dinner:Roasted Chicken:Saturday
      Pomeroy:Dinner:Creamed Chicken:Friday
      Stone Davis:Dinner:Chicken Gumbo:Thursday
      Stone Davis:Dinner:Teriyaki Roti Chicken:Tuesday
      Tower:Allergy Station:Baked Chicken Dinner:Wednesday
      Tower:Allergy Station:Baked Chicken Parm Dinner:Friday
      Tower:Homestyle Dinner:Chicken Sausage Gumbo:Wednesday
      Tower:Homestyle Dinner:Jerk Chicken:Monday
      

      In[19]: searchMenu(allEntries, ['chicken', 'dinner', 'tower'])
      Tower:Allergy Station:Baked Chicken Dinner:Wednesday
      Tower:Allergy Station:Baked Chicken Parm Dinner:Friday
      Tower:Homestyle Dinner:Chicken Sausage Gumbo:Wednesday
      Tower:Homestyle Dinner:Jerk Chicken:Monday
      

      In[20]: searchMenu(allEntries, ['chicken', 'dinner', 'tower', 'wed'])
      Tower:Allergy Station:Baked Chicken Dinner:Wednesday
      Tower:Homestyle Dinner:Chicken Sausage Gumbo:Wednesday
      

      In[21]: searchMenu(allEntries, ['at', 'bar', 'thu'])
      Bae Pao Lu Chow:Continental Breakfast:Oatmeal Bar:Thursday
      Bates:Homestyle Dinner:Baked Potato Bar:Thursday
      Pomeroy:Breakfast:Oatmeal Bar:Thursday
      Stone Davis:Breakfast:Steel Cut Oats Bar V:Thursday
      Tower:Composed Salad Bar:Potato Salad Veg:Thursday
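
The matching rule for search terms can be sketched as a case-insensitive substring test, applied so that every term must match at least one field value. The helper names termMatchesField and entryMatchesAll are hypothetical. The entry is copied from the sample output above.

```python
def termMatchesField(term, field):
    '''Return True if the lowercased term is a substring of the lowercased field.'''
    return term.lower() in field.lower()

def entryMatchesAll(entry, terms):
    '''Return True if every term matches at least one field value of entry.'''
    for term in terms:
        matched = False
        for value in entry.values():
            if termMatchesField(term, value):
                matched = True
        if not matched:
            return False    # this term matched no field at all
    return True

entry = {'dish': 'Oatmeal Bar', 'meal': 'Breakfast', 'hall': 'Pomeroy', 'day': 'Thursday'}
print(termMatchesField('Chow', 'Seafood Chowder'))      # True
print(entryMatchesAll(entry, ['at', 'bar', 'thu']))     # True
print(entryMatchesAll(entry, ['at', 'bar', 'tower']))   # False
```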