Problem Set 10 - Due Tue, Dec 4 at 23:59 EST

Reading

  1. Slides and notebooks from Lec 19: Turtle Recursion, Lec 20: Fruitful Recursion, Lec 22: Recursive File Tree Traversal.

  2. Lab 12: Fruitful Recursion

  3. Think Python, Sec 5.8--5.10: Recursion

  4. Think Python, Ch. 6: Fruitful Functions (includes more on Recursion, plus other useful information)

About this Problem Set

The goal of this problem set is to give you practice with fruitful recursion in several contexts: TurtleWorld, PictureWorld, and file system processing.

There are four tasks in this problem set:

  1. In Task 1 (Individual Task), you will define a fruitful recursive function that draws a pattern using Turtle World and returns information about the pattern.

  2. In Task 2 (Partner-recommended Task), you will generate quilts with a recursive pattern.

  3. In Task 3 (Partner-recommended Task), you will write some testing code to figure out which of several versions of a recursive function works correctly.

  4. In Task 4 (Individual Task), you will write testing code for a file traversing algorithm that you will finish working on next week.

Note that Tasks 2 and 3 are both partner-recommended tasks. You may choose to work with a partner on one or both tasks, but you may not work with two different partners on the different tasks. Use this shared Google Doc to find a partner for this pset.

Notes:


Task 1: Turtle Shrubs

This is an individual problem which you must complete on your own, though you may ask for help from the CS111 staff.

In this task, you'll write a fruitful recursive function named shrub that draws a tree pattern with a turtle and returns a tuple of values (see below for details). Your shrub function should be invariant with respect to the turtle's position and direction. Define your function in the shrub.py file in the ps10 folder.

A turtle shrub is a variation of the turtle tree in Lecture 19 (slides 19-40 through 19-75). It is created by the shrub function, which takes four parameters in order:

  1. trunkLength: the length of the branch at the base of the shrub.
  2. angle: the angle relative to the trunk for the right and left branches
  3. shrinkFactor: the trunk of the right branch is shrinkFactor times trunkLength, and the trunk of the left branch is shrinkFactor*shrinkFactor times trunkLength
  4. minLength: the minimum branch length in the shrub.

The shrub function returns a pair (a tuple with two elements) of:

  1. The total number of branches in the shrub, including the trunk
  2. The total length of branches in the shrub, including the trunk

 

Below are sample invocations of the shrub function, with the magenta text indicating the value returned by the invocation, i.e., the number of branches in the shrub:

shrub(100, 15, 0.8, 60) -> (4, 308.0)
shrub(100, 15, 0.8, 50) -> (7, 461.6)
shrub(100, 15, 0.8, 10) -> (232, 3973.9861913600025)
shrub(100, 30, 0.82, 10) -> (376, 6386.440567704483)
shrub(200, 90, 0.75, 10) -> (232, 5056.675148010254)

Notes:

 


Task 2: Quilts with Recursion

This task is a partner-recommended problem in which it is recommended that you work with a partner as part of a two-person team. If you work with a partner on multiple tasks in this problem set, you must work with the same partner for all tasks.

You will write a program that generates quilts like this:

The ps10 folder contains picture.py, our module built upon cs1graphics where every object is a 200x200 "picture." The module includes several functions to generate and combine pictures, such as upperRightNest that we saw in Lec 20, Fruitful Recursion, fourPics and rotations that you saw in Lec 05, and others.

The recursiveQuilts.py file includes the following predefied pictures that you can use as building blocks for your patterns:

pyTri
gwTri
redLeaves
greenLeaves
emptyPic

Write all your code for this task in the recursiveQuilts.py file.

Subtask 2a. quadRecurse

Define a function, quadRecurse(pic1, pic2, levels), that generates the following pictures:

quadRecurse(pyTri, gwTri, 1)
quadRecurse(pyTri, gwTri, 2)
quadRecurse(pyTri, gwTri, 3)
quadRecurse(pyTri, gwTri, 4)
quadRecurse(pyTri, gwTri, 5)
quadRecurse(pyTri, gwTri, 6)

quadRecurse(pyTri, gwTri, 0) should be an empty picture:

Notes:

Subtask 2b. quilt

Now define a function, quilt(pic1, pic2, levels), that generates the following pictures. quilt is not a recursive function, and is very short. (The function body can be written in one (rather long) line, but you may want to express it in several lines). You will need to invoke quadRecurse, as well as at least two of the functions from picture.py.

Carefully study the relationship between the patterns produced by quadRecurse and quilt for the same number of levels.

quilt(pyTri, gwTri, 1)
quilt(pyTri, gwTri, 2)
quilt(pyTri, gwTri, 3)
quilt(pyTri, gwTri, 4)

(Note: invoking quilt with levels4 may cause your program to freeze.)

quilt(pyTri, gwTri, 0) should be an empty picture as well.

quilt(redLeaves, greenLeaves, 4) generates the picture shown at the beginning of the task.

Uncomment the test invocations at the bottom of recursiveQuilts.py to test your functions. Type closeAllPics() in the interactive pane to close the picture canvases without crashing Canopy.


Task 3: Testing Code

This task is a partner-recommended problem in which it is recommended that you work with a partner as part of a two-person team. If you work with a partner on multiple tasks in this problem set, you must work with the same partner for all tasks.

Throughout the semester, we've helped you write your code by providing extensive testing code via Codder (and various test cases within the code files themselves). After CS111, however, you will be expected to write your own test cases for the code that you produce. In fact, it's often more efficient to start with the test cases, and finish writing the code later. This task provides you with some working (and some non-working) functions, and asks you to write test cases to figure out which of the provided functions work correctly. Define your test cases in the testFindSorted.py file in the ps10 folder (not in the findSorted.py file, which you should not modify).

For this task, we have implemented quite a few different functions that are supposed to do the same thing as Python's in operator: figure out whether a list of strings contains a particular string or not. The catch is that the code is allowed to assume that the list is sorted, which lets it potentially work more efficiently. The findSorted.py file contains the attempted solutions for this problem, but not all of them work correctly. The most basic version is called findReference and it just takes a list of strings (stuff) and a string to search for (string) and returns True or False by using the in operator:

def findReference(stuff, string):
  return string in stuff

Although findReference does not depend on the stuff list being sorted, the other implementations are allowed to do so, which means that all of your test cases should only specify sorted lists for input.

You do not have to change them or implement any new functions for this task. Instead, your job is to write a set of test cases (function inputs paired with expected outputs) that will allow you to figure out exactly which of the implementations are working properly. Each test case should be a tuple containing a tuple of function arguments and the expected output for that input. Because the various findSorted implementations take a list and a string as arguments, the arguments tuple part of each test case will be a tuple containing a list and a string. For example, a simple test case showing that for an empty list and the string "anything", the function should return False would look like this:

(([], "anything"), False)

For each of the five different implementation strategies (loop, simple recursion, splitting recursion, dividing loop, and dividing recursion) there is exactly one correct implementation, so including the findReference reference implementation, there are exactly 6 correct implementations to find. In order to identify the correct implementations, add test cases to the testFindSorted.py file and run it to see which implementations pass. Keep adding test cases until there are only 6 passing functions (and make sure that you don't add any incorrect cases; findReference should always pass). Once you have identified the correct implementations, you're done!

Notes:


Task 4: Preparing for File Traversal

This is an individual problem which you must complete on your own, though you may ask for help from the CS111 staff.

Before starting this task, you should review terminology and examples from Lec 22: Recursive File Traversal.

For this task, you will write test cases for a file traversal problem that we will work on next week. For problem set 11, the first task is to write some fruitful recursive functions that can traverse directories and report the files it finds and their sizes, as well as the sizes of directories. Instead of using Codder to help us figure out if our code is working, you will create test cases this week that you can use to help write a solution next week. For many problems it's much easier to write test cases than to solve the problem, and once you have some test cases, solving the problem becomes easier because you can test your solution as you work on it. Working on test cases first like this is called "test-driven development."

Background

The ps10 directory contains a subdirectory testfiles that you will use for examples. This testfiles directory is similar to the testdir directory used in the slides and notebook of Lec 22: Recursive File Traversal.

Below is an indented list showing the entire file tree rooted at testfiles. The total size of all non-hidden files in this tree is 3000 bytes. The so-called hidden files are files whose names begins with a dot. These hidden files (highlighted below) are often not shown when displaying the contents of a directory, and such hidden files should be ignored by the functions you define. This is especially important, because some operating systems add hidden files to all directories (on Mac OS, every directory gets a hidden .DS_Store file).

The os module supplies a function os.path.getsize that returns the size of a file in bytes. (You can think of "one byte" as the size of a single character, but things are actually a bit more complex than that.) For example:

In [1]: import os.path

In [2]: os.path.getsize('testfiles/hollow/.hidden')
Out[2]: 1000

In [3]: os.path.getsize('testfiles/html/img/100.png')
Out[3]: 995

In [4]: os.path.getsize('testfiles/txt/hello.txt')
Out[4]: 180

The above outputs indicated that the files .hidden, 100.png, and hello.txt have sizes of 1000, 995, and 180 bytes, respectively.

When os.path.getsize is called on a directory, it returns the size of bookkeeping information associated with the directory that is unrelated to the sizes of files contained in the directory.

In [5]: os.path.getsize('testfiles/html')
Out[5]: 204

In this task, you must ignore the size of bookkeeping information associated with a directory.

Also, you must ignore hidden files entirely (use the provided isHidden function to check if a file counts as a hidden file).

All code for this task should be written in the provided fileTreeSizes.py file in the ps10 folder.

The Problem

On problem set 11, you will be asked to define a function called fileTreeListWithSizes that takes as its single argument the pathname (a string) of a file or directory relative to the current working directory. It should return a list of pairs where each pair is a tuple of

  1. a pathname (relative to the current working directory); and
  2. the file tree size of the file or directory with that pathname.

The pairs should be ordered such that the pair for a directory comes directly after the pairs for all directories and/or files that directory contains, and within a directory, files and directories should appear alphabetically.

No hidden files should appear in the list: if a hidden file or directory is targeted, the result should be an empty list, and if a directory containing a hidden file or directory is targeted, the entry for that file or the entries for that directory should not appear in the list at all.

For example, if we had the following file structure:

The output should look like this:

In [0]: fileTreeListWithSizes('a/b/q.txt')
Out[0]: [('a/b/q.txt', 400)]

In [1]: fileTreeListWithSizes('a/b/.hidden')
Out[1]: []

In [2]: fileTreeListWithSizes('a/c')
Out[2]:
[('a/c/hello.txt', 6),
('a/c/world.txt', 6),
('a/c', 12)]

In [3]: fileTreeListWithSizes('a')
Out[3]:
[('a/b/q.txt', 400),
('a/b/z.txt', 30),
('a/b', 430),
('a/c/hello.txt', 6),
('a/c/world.txt', 6),
('a/c', 12),
('a/def.py', 8),
('a', 450)]

You do not have to implement the fileTreeListWithSizes function this week!

Writing Tests

Instead of implementing fileTreeListWithSizes, you have to write tests for it. Each test case should specify the input and corresponding correct output as a pair. Given the file tree example above, one possible test case would be:

(
  'a/c',
  [
    ('a/c/hello.txt', 6),
    ('a/c/world.txt', 6),
    ('a/c', 12)
  ]
)

In the ps10 directory, there is a testfiles directory that you will use to test fileTreeListWithSizes. Your test cases should all specify the path to a file or directory inside testfiles (or specify testfiles itself) as the input.

Use the os.path.getsize function (and if you find it useful, the os.listdir function to list files in a directory including hidden files) to discover the sizes of the various files inside of testfiles. Alternatively, you could use your operating system's file browser instead, but make sure you can find the exact size of each file in bytes. In either case, write down (in a text file or on paper) the paths to each file (relative to the ps10 directory) and the size in bytes of each file (you need the exact size in bytes, not an estimate in KB or MB).

Hint: The total size for all non-hidden files in the testfiles directory and its subdirectories is exactly 3000 bytes. The testfiles/html/img folder alone accounts for 2000 bytes.

Requirements

Once you know which files are where and how big they are, write at least the following seven test cases:

As specified above, each test case should be a pair including:

  1. the path to the target file or directory and,
  2. the list of (path, size) tuples that should be the result for that target.

You should write your test cases in the fileTreeSizes.py file; the length of the testCases list should be at least seven when you are done.

Notes


Task 5: Honor Code Form and Final Checks

As in the previous psets, your honor code submission for this pset will involve defining entering values for the variables in the honorcode.py file.

If you wrote any function invocations or print statements in your Python files to test your code, please remove them, comment them out before you submit, or wrap them in a if __name__=='__main__' block at the end of a .py file. Points will be deducted for isolated function invocations or superfluous print statements.

It's a good idea to run Codder one last time before submission.


How to turn in this Problem Set