CS111 Lecture: Testing and Debugging

Table of Contents:

Overview
A Function Testing Example: countChar
Towards Automated Testing: Printing Test Cases
Digression: Creating complex strings with Python 3's f-strings
Exercise 1: Developing Test Cases That Show countChar is Buggy
Using optimism for input/output testing
Designing Black-box Test Cases
Designing Glass-box Test Cases and Minimal Counterexamples
- 8.1 Exercise 4: Glass-Box Testing of hasCGBlock
- 8.2 Exercise 5: Minimal Counterexamples for Other Buggy hasCGBlock Functions
Debugging Techniques
Exercise 6: Debugging Buggy Versions of hasCGBlock

1. Overview

As you've learned so far in CS111, it can be challenging to create programs that work correctly in all cases.

Programs that don't behave correctly are said to be buggy because they contain bugs, i.e., reasons why they don't work. The process of finding and fixing these bugs is called debugging.

The first step in the debugging process is testing. You don't know whether a program might be incorrect until you have evidence that it's not working as expected. We begin today by discussing how to develop test cases that help us determine cases in which programs misbehave.

Here are the high-level steps in today's lecture:

Running test cases interactively in Thonny is cumbersome. We'll show how to automate input/output testing using test cases that consist of (1) a set of inputs for the function we want to test and (2) the corresponding outputs we expect that function to return for these inputs. We'll then compare the actual results returned by the function to the expected results, and pay attention to the cases where these differ. This approach can be used in any programming language.
Developing our own programs to perform input/output testing has a high overhead. We show how the optimism library can simplify testing in Python. We've used optimism for testing in several previous psets, but haven't really asked you to understand the details. After this lecture, we'll expect that you will know how to use optimism to express input/output test cases going forward. This is an important skill you need to master as you mature as programmers.
What do you do when testing reveals cases in which your program doesn't work? You'll then need to apply debugging techniques to determine why it misbehaves and how to fix it.

The goals of this lecture are to:

Give you an appreciation for the importance of automated input/output testing;
Teach you how to develop effective test cases;
Show you how to use optimism to express those test cases; and
Introduce useful debugging techniques that will help you find and fix bugs in your programs.

2. A Function Testing Example: `countChar`

As a simple example of testing, consider the following buggy version of the countChar function that has been used in previous lecture notebooks.

# BUGGY version of countChar
def countChar(char, word):
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(1, len(word)-1):
        if word[i] == char.lower():
            counter += 1
    return counter

Below are a few test cases involving countChar in which it works as expected. (But it has bugs that will be exhibited by other test cases!)

countChar('s', 'Mississippi')

4

countChar('S', 'Mississippi')

4

countChar('p', 'Mississippi')

2

countChar('a', 'Mississippi')

0

3. Towards Automated Testing: Printing Test Cases

It's tedious to interactively test calls to countChar one at a time. Can we do better?

The following doesn't work, because the notebook displays only the value of the last expression in a cell.

countChar('s', 'Mississippi')
countChar('S', 'Mississippi')
countChar('p', 'Mississippi')
countChar('a', 'Mississippi')

0

Running a file in Thonny is similarly problematic: it never automatically shows the values of any expressions.

For this reason, it's helpful to define a testing function print_countChar that prints the inputs and outputs for a test case:

def print_countChar(char, word):
    """Helper function to test countChar"""
    print("countChar('"     # This is just one long string
          + char + "', '"   # concatenated out of parts
          + word + "') => " # (note the closing parenthesis of the call)
          + str(countChar(char, word))  # This is the number that results from calling countChar
         )

print_countChar('s', 'Mississippi')
print_countChar('S', 'Mississippi')
print_countChar('p', 'Mississippi')
print_countChar('a', 'Mississippi')

countChar('s', 'Mississippi') => 4
countChar('S', 'Mississippi') => 4
countChar('p', 'Mississippi') => 2
countChar('a', 'Mississippi') => 0

4. Digression: Creating complex strings with Python 3's f-strings

The string concatenations in print_countChar are difficult to create and read.

You have encountered similar situations before in tasks like timeProfiler.

Is there a better way? Yes! The version of Python we use in Thonny (Python 3.7) supports a feature called f-strings that greatly simplifies specifying complex strings that have constant parts and parts that are the results of evaluating expressions.

An f-string is a string preceded by the character f that specifies a string template with "holes" (marked by {}) that can be filled by the results of arbitrary Python expressions. The results of the expressions in the holes are automatically converted to strings, so we don't need to explicitly use str to do that.

For example:

def testSum(n1, n2): 
    print(f'{n1} + {n2} => {n1+n2}')
    
testSum(1,2)
testSum(5,3)

1 + 2 => 3
5 + 3 => 8

Take a moment to appreciate the power of f-strings. Without f-strings, we would need to change the line

print(f'{n1} + {n2} => {n1+n2}')

to

print(str(n1) + ' + ' + str(n2) + ' => ' + str(n1+n2))

The f-string version is so much easier to read and write!

Now let's use an f-string to simplify the definition of print_countChar:

# Version of print_countChar that uses f-strings
def print_countChar(char, word): 
    """Helper function to test countChar"""
    print(f"countChar('{char}', '{word}') => {countChar(char, word)}")
    
print_countChar('s', 'Mississippi')
print_countChar('S', 'Mississippi')
print_countChar('p', 'Mississippi')
print_countChar('a', 'Mississippi')

countChar('s', 'Mississippi') => 4
countChar('S', 'Mississippi') => 4
countChar('p', 'Mississippi') => 2
countChar('a', 'Mississippi') => 0
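A related convenience, offered here only as a side note: inside an f-string hole, the `!r` conversion inserts the `repr` of a value, which for strings includes the quotation marks. That removes the need to type the quotes around each hole by hand. The sketch below uses a hypothetical stand-in named countCharDemo (not part of the lecture code) so that it runs on its own.

```python
def countCharDemo(char, word):
    """Hypothetical stand-in for countChar, used only to make this sketch runnable."""
    return word.lower().count(char.lower())

def describeCall(char, word):
    """Return a description of one call, using !r to quote the string arguments."""
    return f"countCharDemo({char!r}, {word!r}) => {countCharDemo(char, word)}"

print(describeCall('s', 'Mississippi'))
# prints: countCharDemo('s', 'Mississippi') => 4
```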

5. Exercise 1: Developing Test Cases That Show `countChar` is Buggy

The fact that the four test cases for countChar give the expected result can lull us into a false sense of confidence that the countChar function is defined correctly.

In fact, it has several bugs. Study the definition of countChar, and below it write several calls to print_countChar whose results are not the expected ones, demonstrating that countChar is buggy. You should write at least one test case for each bug.

# BUGGY version of countChar, repeated for your convenience
def countChar(char, word):
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(1, len(word)-1):
        if word[i] == char.lower():
            counter += 1
    return counter

# Below, write calls to print_countChar(... , ...) 
# that demonstrate bugs in the above countChar definition 
# Your code here
print_countChar('m', 'mississippi') # Doesn't count first char (i.e., at index 0)
print_countChar('i', 'Mississippi') # Doesn't count last char (i.e., at index (len(word) - 1) (same as index -1))
print_countChar('p', 'MISSISSIPPI') # Doesn't correctly handle uppercase characters in word

countChar('m', 'mississippi') => 0
countChar('i', 'Mississippi') => 3
countChar('p', 'MISSISSIPPI') => 0

6. Using `optimism` for input/output testing

Testing functions like print_countChar help us express test cases, but they still leave us to check by hand that each actual result matches the result we expect.

Using our knowledge of loops and lists/tuples, we could develop more sophisticated testing functions that include the expected result along with the arguments, and warn us when the expected result does not match the actual result.
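As a rough sketch of what such a hand-written testing function might look like, consider the code below. The names runCases and cases are made up for illustration, and the buggy countChar is repeated so that the example runs on its own. Each test case pairs a tuple of arguments with the expected result, and only mismatches are reported.

```python
def countChar(char, word):
    """BUGGY version of countChar from above, repeated so this sketch is self-contained."""
    counter = 0
    for i in range(1, len(word)-1):
        if word[i] == char.lower():
            counter += 1
    return counter

def runCases(fcn, cases):
    """Run fcn on each (args, expected) pair; print mismatches and return how many there were."""
    failures = 0
    for (args, expected) in cases:
        actual = fcn(*args)  # unpack the argument tuple into the call
        if actual != expected:
            failures += 1
            print(f"FAILED: {fcn.__name__}{args} => {actual}, expected {expected}")
    print(f"{len(cases) - failures} of {len(cases)} cases passed")
    return failures

cases = [
    (('s', 'Mississippi'), 4),
    (('m', 'mississippi'), 1),  # exposes the skipped first character
    (('i', 'Mississippi'), 4),  # exposes the skipped last character
]
numFailures = runCases(countChar, cases)
```

With the buggy definition, the second and third cases are reported as failures and the first passes.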

However, these more sophisticated testing functions are a lot of work to define! And handling all sorts of special cases is especially challenging. E.g., How do we handle functions that get input from the user? How do we handle functions that print output in addition to (or instead of) returning a result?
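To give a feel for one of these special cases, here is a minimal sketch of checking a function that prints instead of returning a value. It relies on `io.StringIO` and `contextlib.redirect_stdout` from Python's standard library; the helper names capturePrinted and printSum are hypothetical, and this is not necessarily how any particular testing library handles printed output.

```python
import io
from contextlib import redirect_stdout

def capturePrinted(fcn, *args):
    """Call fcn(*args) and return everything it printed, as one string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):  # print output goes into buffer instead of the screen
        fcn(*args)
    return buffer.getvalue()

def printSum(n1, n2):
    """Hypothetical function under test: it prints its result rather than returning it."""
    print(f'{n1} + {n2} => {n1+n2}')

printed = capturePrinted(printSum, 1, 2)
print(printed == '1 + 2 => 3\n')  # compare captured text to the expected text
```

Note that the expected text must include the newline that print adds at the end of the line.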

An alternative approach is to use a library of powerful functions that work together to express a mini-language for testing. One such library is the optimism library developed by CS111's very own Peter Mawhorter.

It turns out that you have already benefited from optimism on almost all your pset tasks, though you may not know it. Most pset tasks involve a testing file that, when run, prints checkmarks to indicate that a test case passes and an X to indicate a test case fails. That testing feedback is being provided by optimism!

You can learn more about optimism from the Reference > Quick Reference menu item on the CS111 web site.

6.1 `optimism` Example: Testing `countChar`

Let's see how optimism can be used to automate the checking of test cases for countChar.

Step 1: First we need to import the optimism library:

import optimism # imports optimism testing library

Step 2: Next, for any function we want to test, we create a so-called test manager for that function by calling optimism.testFunction with the function as its single argument:

test_countChar = optimism.testFunction(countChar)

Note that the argument is a function value from the unquoted function name. It is not a string with the function name.
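The difference between a function value and a string containing a function's name can be seen in plain Python, without any testing library. In the sketch below, double and callWith are hypothetical names chosen for illustration.

```python
def double(n):
    """Hypothetical example function."""
    return 2 * n

def callWith(fcn, arg):
    """Call the function value fcn on arg and return the result."""
    return fcn(arg)

print(callWith(double, 21))  # the unquoted name denotes the function itself
print(callable(double))      # True: a function value can be called
print(callable('double'))    # False: a string is just text, even if it spells a function name
```

Passing the string 'double' to callWith would raise a TypeError, because a string cannot be called.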

Step 3: Using the test manager (test_countChar in this example), create a test case by calling the .case method on the test manager with the function arguments that should be tested. In the following example, case1 names the test case for using ('s', 'Mississippi') as the arguments to countChar.

case1 = test_countChar.case('s', 'Mississippi') # test case for countChar('s', 'Mississippi')

Step 4: Finally, the test case can be run by using the .checkReturnValue method with the expected value as its single argument. For example, for the test case countChar('s', 'Mississippi'), we expect the result to be 4:

case1.checkReturnValue(4) # checks that countChar('s', 'Mississippi') returns 4

✓ 3554504884.py:1

True

When the actual value of the test case matches the expected value (like in this case), .checkReturnValue prints a checkmark (✓) along with the name of the testing file and line number in that testing file where the test was performed. E.g., something like:

✓ test_countChar.py:11

But when .checkReturnValue is run in a Jupyter notebook, there is no testing file, so optimism creates an auto-generated Python test file from the contents of the cell that has the form number.py. E.g., something like:

✓ 3554504884.py:1

In addition to printing information, the call to .checkReturnValue also returns True when the test case succeeds.

If the actual result does not match the expected result, .checkReturnValue prints an ✗ along with the testing file name, line number, and information about the function arguments, actual result, and expected result:

case1.checkReturnValue(3) # checks that countChar('s', 'Mississippi') returns 3

✗ 4163749710.py:1
  Result:
    4
  was NOT equivalent to the expected value:
    3
  Called function 'countChar' with arguments:
    char = 's'
    word = 'Mississippi'

False

In the above example, the actual result 4 is really correct and it's the expected value 3 that is incorrect!

In addition to printing information, the call to .checkReturnValue also returns False when the test case fails.

In practice, the test case resulting from .case isn't named, and instead .checkReturnValue is called directly on the result of .case. For example:

test_countChar.case('s', 'Mississippi').checkReturnValue(4)
test_countChar.case('S', 'Mississippi').checkReturnValue(4)
test_countChar.case('p', 'Mississippi').checkReturnValue(2)
test_countChar.case('a', 'Mississippi').checkReturnValue(0)

✓ 3584681756.py:1
✓ 3584681756.py:2
✓ 3584681756.py:3
✓ 3584681756.py:4

True

6.2 Exercise 2: Expressing your Exercise 1 Tests Using `optimism`

Below, use optimism to express your test cases from Exercise 1, giving correct values for the expected values. Running these test cases should indicate a failure for each case, along with relevant information.

# Write test cases similar to 
#
#   test_countChar.case('s', 'Mississippi').checkReturnValue(4)
#
# for each of your test cases from Exercise 1 above. 

# Your code here
test_countChar.case('m', 'mississippi').checkReturnValue(1) # Doesn't count first char (i.e., at index 0)
test_countChar.case('i', 'Mississippi').checkReturnValue(4) # Doesn't count last char (i.e., at index (len(word) - 1) (same as index -1))
test_countChar.case('p', 'MISSISSIPPI').checkReturnValue(2) # Doesn't correctly handle uppercase characters in word

✗ 1236502778.py:8
  Result:
    0
  was NOT equivalent to the expected value:
    1
  Called function 'countChar' with arguments:
    char = 'm'
    word = 'mississippi'
✗ 1236502778.py:9
  Result:
    3
  was NOT equivalent to the expected value:
    4
  Called function 'countChar' with arguments:
    char = 'i'
    word = 'Mississippi'
✗ 1236502778.py:10
  Result:
    0
  was NOT equivalent to the expected value:
    2
  Called function 'countChar' with arguments:
    char = 'p'
    word = 'MISSISSIPPI'

False

6.3 Testing multiple definitions of the same function

Suppose we had multiple definitions of the same function, such as the different versions of countChar below:

def countChar1(char, word):
    counter = 0
    for i in range(1, len(word)-1):
        if word[i] == char.lower():
            counter += 1
    return counter

def countChar2(char, word):
    counter = 0
    for i in range(0, len(word)):
        if word.lower()[i] == char.lower():
            counter += 1
    return counter

def countChar3(char, word):
    lowerWord = word.lower()
    counter = 0
    for i in range(1, len(word)):
        if lowerWord[i] == char.lower():
            counter += 1
    return counter

def countChar4(char, word):
    counter = 0
    for letter in word:
        if char == letter:
            counter += 1
    return counter

def countChar5(char, word):
    char = char.upper()
    word = word.upper()
    counter = 0
    for letter in word:
        if char == letter:
            counter += 1
    return counter

We can define a test_countCharFunction function that takes any one of these functions as its single argument and tests that function on numerous test cases.

Run the following cell to run all the test cases on all five versions of countChar.

def test_countCharFunction(fcn):
    print('-'*40)
    print(f"Testing countChar function {fcn}")
    tester = optimism.testFunction(fcn) 
    tester.case('s', 'Mississippi').checkReturnValue(4) 
    tester.case('S', 'Mississippi').checkReturnValue(4) 
    tester.case('p', 'Mississippi').checkReturnValue(2) 
    tester.case('a', 'Mississippi').checkReturnValue(0) 
    tester.case('m', 'mississippi').checkReturnValue(1) 
    tester.case('i', 'Mississippi').checkReturnValue(4) 
    tester.case('p', 'MISSISSIPPI').checkReturnValue(2)
          
countCharFunctions = [countChar1, countChar2, countChar3, countChar4, countChar5]
          
for f in countCharFunctions:
    test_countCharFunction(f)

----------------------------------------
Testing countChar function <function countChar1 at 0x1037d37f0>

✓ 100191123.py:5
✓ 100191123.py:6
✓ 100191123.py:7
✓ 100191123.py:8
✗ 100191123.py:9
  Result:
    0
  was NOT equivalent to the expected value:
    1
  Called function 'countChar1' with arguments:
    char = 'm'
    word = 'mississippi'
✗ 100191123.py:10
  Result:
    3
  was NOT equivalent to the expected value:
    4
  Called function 'countChar1' with arguments:
    char = 'i'
    word = 'Mississippi'
✗ 100191123.py:11
  Result:
    0
  was NOT equivalent to the expected value:
    2
  Called function 'countChar1' with arguments:
    char = 'p'
    word = 'MISSISSIPPI'

----------------------------------------
Testing countChar function <function countChar2 at 0x10378c160>

✓ 100191123.py:5
✓ 100191123.py:6
✓ 100191123.py:7
✓ 100191123.py:8
✓ 100191123.py:9
✓ 100191123.py:10
✓ 100191123.py:11

----------------------------------------
Testing countChar function <function countChar3 at 0x1037d3640>

✓ 100191123.py:5
✓ 100191123.py:6
✓ 100191123.py:7
✓ 100191123.py:8
✗ 100191123.py:9
  Result:
    0
  was NOT equivalent to the expected value:
    1
  Called function 'countChar3' with arguments:
    char = 'm'
    word = 'mississippi'
✓ 100191123.py:10
✓ 100191123.py:11

----------------------------------------
Testing countChar function <function countChar4 at 0x1037d3520>

✓ 100191123.py:5
✗ 100191123.py:6
  Result:
    0
  was NOT equivalent to the expected value:
    4
  Called function 'countChar4' with arguments:
    char = 'S'
    word = 'Mississippi'
✓ 100191123.py:7
✓ 100191123.py:8
✓ 100191123.py:9
✓ 100191123.py:10
✗ 100191123.py:11
  Result:
    0
  was NOT equivalent to the expected value:
    2
  Called function 'countChar4' with arguments:
    char = 'p'
    word = 'MISSISSIPPI'

----------------------------------------
Testing countChar function <function countChar5 at 0x1037d3910>

✓ 100191123.py:5
✓ 100191123.py:6
✓ 100191123.py:7
✓ 100191123.py:8
✓ 100191123.py:9
✓ 100191123.py:10
✓ 100191123.py:11

Based on the results of the above test cases, you can be sure that versions with an ✗ test are buggy.

How sure are you that the versions that pass all tests are correct?
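To make the question concrete, here is a hypothetical sixth version, countCharSneaky, written for illustration only (it is not one of the lecture's versions). It passes all seven "Mississippi" test cases used above, yet it still has a bug that those tests never exercise.

```python
def countCharSneaky(char, word):
    """Hypothetical BUGGY version: passes the seven tests above but fails on the empty string."""
    lowerWord = word.lower()
    target = char.lower()
    counter = 0
    for i in range(len(lowerWord) - 1):  # every character except the last
        if lowerWord[i] == target:
            counter += 1
    if lowerWord[-1] == target:          # raises IndexError when word is ''
        counter += 1
    return counter

mississippiCases = [
    ('s', 'Mississippi', 4), ('S', 'Mississippi', 4), ('p', 'Mississippi', 2),
    ('a', 'Mississippi', 0), ('m', 'mississippi', 1), ('i', 'Mississippi', 4),
    ('p', 'MISSISSIPPI', 2),
]
allPassed = all(countCharSneaky(c, w) == expected for (c, w, expected) in mississippiCases)
print(f"Passes all seven tests: {allPassed}")

try:
    countCharSneaky('a', '')
    crashed = False
except IndexError:
    crashed = True
print(f"Crashes on the empty string: {crashed}")
```

Passing a set of tests only shows that the function works on those inputs; an untested kind of input (here, the empty string) can still reveal a bug.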

7. Designing Test Cases: Glass Box vs. Black Box Testing

Going forward in CS111, for some pset tasks, you will be asked to write your own .py testing files that use optimism to test your functions. How should you think about designing your test cases?

Testing in situations like countChar, where you get to see the function definition, is called glass-box testing, because you get to study the details of the function code when designing test cases on which you think the function will succeed or fail.

Testing a function without seeing its definition is called black-box testing, because the testing is purely based on its input/output behavior according to its contract without being able to see the code implementing the function. It's as if it's a mechanical contraption whose internal workings are hidden inside a black box and cannot be viewed.

7.1 Categories of Black-box Test Cases

When designing black-box tests, you must imagine ways in which the function might be implemented and how such implementations could go wrong. Some classes of test cases:

Regular cases: These are "normal" cases that check basic advertised input/output functionality, like tests of counting different letters in "Mississippi" for countChar.
Implied conditional cases: When the contract mentions different categories of an input (e.g., positive vs. negative numbers, vowels vs. nonvowels), it implies that these categories will be checked by conditionals in the function body. Since those conditionals could be wrong, testing all combinations of values from the input categories is prudent.

Edge cases: These are tests of extreme or special cases that the function might not handle properly. For example:
- For numeric inputs, extreme inputs can include 0, large numbers, negative numbers, and floats vs. ints.
- Fencepost errors are off-by-one errors, which are common in programs. E.g., n elements in a list are separated by n-1 commas, not n.
- For inputs that are indices of sequences, test indices near the ends of the sequence, e.g., indices like 0, 1, -1 and len(seq), len(seq)-1, len(seq)+1. Since Python allows negative indices, you should also test -len(seq), -len(seq)-1, -len(seq)+1.
- For functions involving elements of sequences, test elements in the first and last positions of the sequences, e.g., characters at the beginning and end of a string.
- For inputs that are sequences, empty and small sequences are often not handled correctly, so you should always test empty and singleton strings/lists/tuples. When specific numbers are mentioned in the contract (e.g., isBeauteous tests for 3 consecutive vowels), it's important to test strings of length <= 3 as edge cases.
- For inputs expected to be booleans, what happens if other Truthy/Falsey values are supplied? Is it OK to treat other Truthy/Falsey values as True/False?
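The index edge cases listed above can be explored directly. The following sketch uses a hypothetical helper, probeIndices, that tries each index near the ends of a sequence and records whether indexing succeeds or raises an IndexError.

```python
def probeIndices(seq):
    """Return a list of (index, isValid) pairs for indices near the ends of seq."""
    n = len(seq)
    candidates = [0, 1, -1, n - 1, n, n + 1, -n + 1, -n, -n - 1]
    results = []
    for index in candidates:
        try:
            seq[index]       # succeeds only for indices from -n through n-1
            isValid = True
        except IndexError:
            isValid = False
        results.append((index, isValid))
    return results

for (index, isValid) in probeIndices('abc'):
    print(f"'abc'[{index}] valid? {isValid}")
```

For 'abc', the indices 3, 4, and -4 are reported as invalid and the rest as valid, which is exactly the kind of boundary where off-by-one bugs hide.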

7.2 Example: Designing Black-box Tests for `countChar`

In the case of countChar, how confident are we that testing different characters in different capitalizations of "Mississippi" effectively tests the function?

Rather than testing a long string like "Mississippi", it may be more effective to carefully test a combination of shorter strings and characters in those strings.

Some things to keep in mind:

Although the parameter to countChar is named word, it can be any string, so don't get hung up on making it an actual word.
Because the contract looks for a particular character in the word, tests really only need to distinguish between that character and other characters. So we can make the character we're looking for 'a' and use 'b' for all other characters. (This assumes the code doesn't do something crazy like handle particular characters or classes of characters, such as vowels, specially.)
The empty string should be tested as an edge case.
It's important to test characters in the first and last positions of the string.
Because the contract mentions upper and lower case, testing different combinations of case in the character and word is essential.

Based on the above considerations, let's modify the test_countCharFunction from above to do a more careful job at specifying black-box test cases for countChar. We can avoid the tedium of writing tester.case(... , ...).checkReturnValue(...) by first collecting test cases into a list of tuples, and then iterating over that list within test_countCharFunction:

blackBoxCountCharTestCases = [
    
    # Test the empty string
    ('a','', 0),
   
    # Test "negative" singleton string
    ('a','b', 0),
    
    # Test all capitalizations of "positive" singleton string
    ('a','a', 1), ('a','A', 1), ('A','a', 1), ('A','A', 1), 
    
    # Test two-element strings (where char can be at beginning or end of word)
    ('a','Aa', 2), ('a','aA', 2), ('A','Aa', 2), ('A','aA', 2),
    # No need to repeat capitalization combinations here:
    ('a','ab', 1), ('a','ba', 1), ('a','bb', 0), 
   
    # Length-3 strings distinguish ends from middles
    ('a', 'aaA', 3), ('a', 'aAA', 3), ('A', 'aaA', 3), ('A', 'aAA', 3),
    ('a', 'aab', 2), ('a', 'aba', 2), ('A', 'baa', 2), 
    ('a','abb', 1), ('a', 'bab', 1), ('a','bba', 1), ('a', 'bbb', 0),
    
    # Try a few longer strings
    ('a','aAAaA', 5), ('A','aAAaA', 5), 
    ('a','abAbA', 3), ('A','abAbA', 3),
    ('a','babAb', 2), ('A','babAb', 2),
    ('a','bbbbb', 0), 
]

def test_countCharFunction(fcn):
    print('-'*40)
    print(f"Testing countChar function {fcn}")
    tester = optimism.testFunction(fcn) 
    for (char, word, expectedValue) in blackBoxCountCharTestCases: # Behold the power of tuple assignment!
        tester.case(char, word).checkReturnValue(expectedValue)

We can now run our more extensive tests on any version of countChar:

test_countCharFunction(countChar2) # Try any version of countChar here

----------------------------------------
Testing countChar function <function countChar2 at 0x10378c160>

✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
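A complementary approach, sketched here under stated assumptions rather than taken from the lecture, is to generate every short string over the letters 'a', 'A', and 'b' and compute each expected value with a trusted "oracle". The oracle below is Python's built-in str.count applied to lowercased inputs; the name generateCountCharCases is hypothetical.

```python
from itertools import product

def generateCountCharCases(maxLength=3):
    """Return (char, word, expected) tuples for every word over 'a', 'A', 'b' up to maxLength."""
    cases = []
    for length in range(maxLength + 1):
        for letters in product('aAb', repeat=length):  # all letter sequences of this length
            word = ''.join(letters)
            for char in 'aA':
                # Oracle: built-in str.count on lowercased inputs gives the expected value
                expected = word.lower().count(char.lower())
                cases.append((char, word, expected))
    return cases

generatedCases = generateCountCharCases()
print(len(generatedCases))  # 40 words (1 + 3 + 9 + 27) times 2 chars = 80 cases
```

The resulting list has the same (char, word, expectedValue) shape as blackBoxCountCharTestCases, so it could be iterated over in the same way. The trade-off is that a generated list is more thorough but less readable than hand-picked cases.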

7.3 Exercise 3: Black-box Testing of `hasCGBlock`

Recall this specification for the hasCGBlock function from the ps05 genetics task:

def hasCGBlock(RNAseq):
    """
    Given an RNA sequence, this function must return True if the sequence contains a block of 5 consecutive 
    'C' and/or 'G' bases, and False otherwise. The block may be any combination of 'C' and 'G' bases 
    as long as there are 5 in a row with no other bases in between them. But if other bases are present, 
    there might be more than 5 total 'C' or 'G' bases in the sequence without it actually containing a 'CG' 
    block.
    """

Below, develop a list of black-box test cases for hasCGBlock. Think about various ways in which an attempted implementation might behave incorrectly. For example:

Maybe the code just counts the total number of Cs and Gs rather than the longest sequence of Cs and Gs in a row.
Maybe the code tests for five Cs in a row or five Gs in a row, but not a combination of five Cs or Gs in a row.
Maybe the code incorrectly has an early return for False when a base that is not C or G is encountered.
Maybe the code tests for five As or Us in a row rather than Cs or Gs.
Maybe the code uses indexing to test for the next 4 characters after the current one and is not careful about out-of-bounds indexing.
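To illustrate the first of these possibilities, here is a hypothetical buggy implementation, hasCGBlockCountsTotal, written only for this example (it is not claimed to match any of the numbered versions tested later). It counts all Cs and Gs in the sequence, whether or not they are consecutive.

```python
def hasCGBlockCountsTotal(RNAseq):
    """Hypothetical BUGGY version: counts all Cs and Gs, ignoring whether they are in a row."""
    total = 0
    for base in RNAseq:
        if base in 'CG':
            total += 1
    return total >= 5

for (seq, expected) in [('CGGCC', True), ('CGACCG', False), ('CCAGGG', False)]:
    actual = hasCGBlockCountsTotal(seq)
    status = 'ok' if actual == expected else 'MISMATCH'
    print(f"hasCGBlockCountsTotal({seq!r}) => {actual} (expected {expected}) {status}")
```

The first case passes, but 'CGACCG' and 'CCAGGG' each contain five Cs and Gs split by an A, so this version wrongly returns True for them. Test cases of that shape are what expose this kind of bug.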

blackBoxHasCGBlockTestCases = [
    # Test at least the examples from the genetics problem Examples section
    ('CGGCC', True), ('CGACCG', False), ('CGACCGCGU', True),
    
    # Add more test cases below
    # Your code here
    
    # Test the empty string
    ('', False), 
    
    # Test singleton strings
    ('A', False), ('C', False), ('G', False), ('U', False), 
    
    # Test some CG blocks with length 2 to 4
    ('CC', False), ('GG', False), 
    ('CCC', False), ('GGG', False), 
    ('CCCC', False), ('GGGG', False), ('CCGG', False), ('CGCG', False),
    
    # Test some CG blocks with length exactly 5
    ('CCCCC', True), ('GGGGG', True), ('CGCGC', True), ('GCGCG', True), 
    ('CCGGC', True), ('GGCCG', True), 
    
    # Test length-5 CG blocks preceded and/or followed by A or U
    ('ACCCCC', True), ('AGGGGG', True), ('UCCGGC', True), ('UGGCCG', True),  
    ('CCCCCU', True), ('GGGGGU', True), ('CCGGCA', True), ('GGCCGA', True),
    ('ACCCCCU', True), ('AGGGGGU', True), ('UCCGGCA', True), ('UGGCCGA', True),
    
    # Test some False cases where there are 5 or more Cs and Gs, but not in a row. 
    ('CCAGGG', False), ('CCACGUGG', False), ('CCCCAGGGG', False), 
    
    # Test some True cases where a CG block of length 5 (or more) comes after at least one A and/or U
    ('CCCCUGGGGACCGGCCG', True), ('CACGUCGGCAUCCGGC', True) ,
    
    # Test runs of 4 or 5 As and Us in a row (no Cs or Gs at all)
    ('AAAAA', False), ('UUUU', False), ('AAUUA', False),   
]

Below is a function test_hasCGBlockFunction that tests all of the above test cases.

import optimism

def test_hasCGBlockFunction(fcn):
    print('-'*40)
    print(f"Testing hasCGBlockFunction function {fcn}")
    tester = optimism.testFunction(fcn) 
    for (RNAseq, expectedValue) in blackBoxHasCGBlockTestCases: # Behold the power of tuple assignment!
        tester.case(RNAseq).checkReturnValue(expectedValue)

Let's import some black-box definitions of hasCGBlock from a file and test them.

How confident are you that you can tell from the test results which versions are likely to be correct?

from hasCGBlockFunctions import *

for f in [hasCGBlock1, hasCGBlock2, hasCGBlock3, hasCGBlock4,
          hasCGBlock5, hasCGBlock6, hasCGBlock7, hasCGBlock8,
          hasCGBlock9, hasCGBlock10, hasCGBlock11, hasCGBlock12]:
    test_hasCGBlockFunction(f)

----------------------------------------
Testing hasCGBlockFunction function <function hasCGBlock1 at 0x10378c700>

✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8

----------------------------------------
Testing hasCGBlockFunction function <function hasCGBlock2 at 0x1036f2560>

✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    True
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock2' with arguments:
    seq = 'CGACCG'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    True
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock2' with arguments:
    seq = 'CCAGGG'
✗ 3941837263.py:8
  Result:
    True
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock2' with arguments:
    seq = 'CCACGUGG'
✗ 3941837263.py:8
  Result:
    True
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock2' with arguments:
    seq = 'CCCCAGGGG'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8

----------------------------------------
Testing hasCGBlockFunction function <function hasCGBlock3 at 0x1037d3f40>

✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'CGACCGCGU'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = ''
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'C'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'G'
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'CC'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'GG'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'CCC'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'GGG'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'CCCC'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'GGGG'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'CCGG'
✗ 3941837263.py:8
  Result:
    None
  was NOT equivalent to the expected value:
    False
  Called function 'hasCGBlock3' with arguments:
    seq = 'CGCG'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'ACCCCC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'AGGGGG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'UCCGGC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'UGGCCG'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'ACCCCCU'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'AGGGGGU'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'UCCGGCA'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'UGGCCGA'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'CCCCUGGGGACCGGCCG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock3' with arguments:
    seq = 'CACGUCGGCAUCCGGC'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8

----------------------------------------
Testing hasCGBlockFunction function <function hasCGBlock4 at 0x1037d3880>

✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CGGCC'
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CGACCGCGU'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CCCCC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'GGGGG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CGCGC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'GCGCG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CCGGC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'GGCCG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'ACCCCC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'AGGGGG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'UCCGGC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'UGGCCG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CCCCCU'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'GGGGGU'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CCGGCA'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'GGCCGA'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'ACCCCCU'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'AGGGGGU'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'UCCGGCA'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'UGGCCGA'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CCCCUGGGGACCGGCCG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock4' with arguments:
    seq = 'CACGUCGGCAUCCGGC'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8

----------------------------------------
Testing hasCGBlockFunction function <function hasCGBlock5 at 0x1037d36d0>

✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'CGGCC'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'CCCCC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'GGGGG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'CGCGC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'GCGCG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'CCGGC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'GGCCG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'ACCCCC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'AGGGGG'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'UCCGGC'
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'UGGCCG'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8
✗ 3941837263.py:8
  Result:
    False
  was NOT equivalent to the expected value:
    True
  Called function 'hasCGBlock5' with arguments:
    seq = 'CACGUCGGCAUCCGGC'
✓ 3941837263.py:8
✓ 3941837263.py:8
✓ 3941837263.py:8

hasCGBlock2('CCGGC')

True

8. Designing Glass-box Test Cases and Minimal Counterexamples¶

Glass-box testing occurs when you are testing a function/program whose code you can inspect. Because you can see the implementation, you can focus on test cases that take advantage of implementation details in order to attempt to get the function to misbehave.

For example:

You should supply test inputs that force every conditional branch in the code to be executed at least once.
When loops are involved, you should supply inputs that cause the loop to be executed zero, one, and multiple times.
If a loop is executed over a sequence, you should test that it processes all elements of the sequence appropriately. In particular, it should avoid so-called fence-post errors in which it fails to appropriately process the first or last elements of the sequence.
When sequence indices are involved, you should supply test inputs that force these indices to be edge cases.

The main goal in glass-box testing is finding counterexamples = inputs that cause the function to misbehave. Particularly interesting counterexamples are minimal counterexamples, which are the "shortest" counterexamples. E.g., in functions with string inputs, a shortest string that exhibits a bug is a minimal counterexample.

8.1 Exercise 4: Glass-Box Testing of `hasCGBlock2`¶

Below is a buggy version of the hasCGBlock function named hasCGBlock2:

def hasCGBlock2(seq):
    count = 0
    for base in seq:
        if base in 'CG':
            count += 1
            if count == 5:
                return True
    return False

The problem with hasCGBlock2 is that it only checks that the total number of Cs and Gs in seq is at least 5, without checking that they are consecutive. It will behave correctly on strings with fewer than 5 Cs and Gs or with at least 5 consecutive Cs and Gs, but will incorrectly return True for strings that have 5 or more Cs and Gs without having 5 of them in a row.

Below, give a minimal counterexample on which hasCGBlock2 gives the incorrect answer:

# Put your counterexample for hasCGBlock2 here:
# hasCGBlock2(???)
# Your code here
hasCGBlock2('CACCCC') # Any sequence of 5 C/Gs with one A or U not at the ends will work

True

8.2 Exercise 5: Minimal Counterexamples for Other Buggy `hasCGBlock` Functions¶

Below are three other buggy versions of hasCGBlock. Develop a minimal counterexample for each of them.

def hasCGBlock4(seq):
    count = 0
    for base in seq:
        if base in 'CG':
            count += 1
            return count == 5
    return False

# Put your minimal counterexample for hasCGBlock4 here: 
# Your code here
hasCGBlock4('CCCCC')
# hasCGBlock4 returns the value of the boolean expression `count == 5`
# as soon as it encounters a C or G. Since count is 1 after the first
# C or G, it returns False after processing the first C or G in the string.
# So a minimal counterexample is any string consisting of exactly 5 C/Gs,
# because it *should* return True in this case but instead returns False.

False

def hasCGBlock10(seq):
    return 'A' not in seq and 'U' not in seq

# Put your minimal counterexample for hasCGBlock10 here: 
# Your code here
hasCGBlock10('')
# hasCGBlock10 returns True for any string that does not contain A or U.
# So it returns True for any string consisting of only Cs and Gs, *including*
# the empty string. So the empty string is the minimal counterexample.

True

def hasCGBlock11(seq):
    return 'CCCCC' in seq or 'GGGGG' in seq

# Put your minimal counterexample for hasCGBlock11 here: 
# Your code here
hasCGBlock11('CGGGG')
# hasCGBlock11 only returns True when the string contains one of the
# special strings 'CCCCC' or 'GGGGG'. So it will behave incorrectly
# (by returning False) for any string of 5 C/Gs that is not one
# of these two special strings.

False
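
Minimal counterexamples can also be searched for mechanically. The sketch below is not part of the lecture code. It assumes a hypothetical reference implementation, hasCGBlockReference, that returns True exactly when seq contains 5 consecutive C/G bases, and a hypothetical helper, findMinimalCounterexample, that tries all strings over the bases A, C, G, and U in order of increasing length and returns the first one on which a buggy function disagrees with the reference.

```python
from itertools import product

def hasCGBlockReference(seq):
    # ASSUMPTION: the intended behavior is "True iff seq has 5 consecutive C/Gs".
    run = 0  # length of the current run of consecutive C/Gs
    for base in seq:
        if base in 'CG':
            run += 1
            if run == 5:
                return True
        else:
            run = 0  # a non-C/G base breaks the run
    return False

# Buggy versions copied from Exercises 4 and 5.
def hasCGBlock2(seq):
    count = 0
    for base in seq:
        if base in 'CG':
            count += 1
            if count == 5:
                return True
    return False

def hasCGBlock4(seq):
    count = 0
    for base in seq:
        if base in 'CG':
            count += 1
            return count == 5
    return False

def hasCGBlock10(seq):
    return 'A' not in seq and 'U' not in seq

def hasCGBlock11(seq):
    return 'CCCCC' in seq or 'GGGGG' in seq

def findMinimalCounterexample(buggy, reference, alphabet='ACGU', maxLength=6):
    '''Hypothetical helper: return a shortest string (up to maxLength) on which
    buggy and reference disagree, or None if there is no such string.'''
    for length in range(maxLength + 1):           # shortest strings first
        for letters in product(alphabet, repeat=length):
            seq = ''.join(letters)
            if buggy(seq) != reference(seq):
                return seq
    return None

for buggy in [hasCGBlock2, hasCGBlock4, hasCGBlock10, hasCGBlock11]:
    found = findMinimalCounterexample(buggy, hasCGBlockReference)
    print(f'{buggy.__name__}: {found!r}')
```

Minimal counterexamples are usually not unique, so the search reports the first one in its enumeration order, which need not be the one you found by hand. The search is exhaustive, so it is only practical for short strings and small alphabets.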

9. Debugging Techniques¶

Test cases help us determine cases in which functions misbehave. But then how do we determine why they misbehave and how do we fix them?

Here we study some debugging techniques for identifying and fixing bugs in programs. Most of these techniques involve adding print statements to a program.

You should also consult Peter Mawhorter's debugging poster, which is linked from the Reference>Debugging menu item on the CS111 web site.

9.1 Pay Attention to Error Messages¶

Sometimes bugs lead to errors when running a program. In many cases, studying the error messages will help you to identify the location of the bug. For example, can you use the error message to find and fix the bug in the following code?

def area(side):
    """
    return the area of a square with the given side length
    """
    return size^2
    
area(10)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[33], line 7
      2     """
      3     return the area of a square with the given side length
      4     """
      5     return size^2
----> 7 area(10)

Cell In[33], line 5, in area(side)
      1 def area(side):
      2     """
      3     return the area of a square with the given side length
      4     """
----> 5     return size^2

NameError: name 'size' is not defined
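
The NameError points at a misspelled variable: the parameter is named side, but the body uses size. A minimal corrected sketch follows (areaFixed is a hypothetical name, not from the lecture code). It also replaces ^ with **, because in Python ^ is bitwise XOR rather than exponentiation, so side^2 would run without an error message but would not compute a square.

```python
def areaFixed(side):
    """
    Return the area of a square with the given side length.
    """
    # Use the parameter name `side` (not `size`) and `**` for exponentiation.
    return side ** 2

print(areaFixed(10))  # 100
print(10 ^ 2)         # 8, because ^ is bitwise XOR, not "to the power of"
```

The second problem is worth noting because it produces no error message at all; only a test case with a known expected value would reveal it.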

In the case of syntax errors, Thonny often notices the problem only one or more lines after the actual mistake, so you may need to look for the error before the line reported in the error message. For example, where is the bug in the following example?

import random

def randomGesture():
    n = random.randint(1,3)
    if n == 1:
        # Still trying to figure out what to do here. 
        
def testRandomGesture():
    print(f'randomGesture() => {randomGesture()}')
    
testRandomGesture()

# Solution notes go here:
# After every line ending in a colon, Python *requires* at least one indented statement,
# and there is no such statement after `if n == 1:`
# But, confusingly, Python's error message indicates the problem is located
# at the colon after `def testRandomGesture():`, which is the next nonempty
# line Python processes after `if n == 1:`

  Cell In[34], line 8
    def testRandomGesture():
                            ^
IndentationError: expected an indented block after 'if' statement on line 5
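
One way to keep an unfinished branch syntactically valid is to give it a placeholder body. The sketch below uses Python's pass statement for this; it repairs only the IndentationError, and randomGesture still returns None until the branch is actually written.

```python
import random

def randomGesture():
    n = random.randint(1, 3)
    if n == 1:
        # Still trying to figure out what to do here.
        pass  # placeholder statement, so the if has an indented body

def testRandomGesture():
    print(f'randomGesture() => {randomGesture()}')

testRandomGesture()
```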

9.2 Use `print` to show a function call with its arguments¶

It's generally helpful to know when a function is called and what arguments it has been called with.

Study what's printed by beats and play to help figure out why play is not correct.

def beats(gesture1, gesture2):
    '''
    In the rock/paper/scissors game: 
    * rock beats scissors
    * scissors beats paper
    * paper beats rock
    '''
    #*** DEBUGGING PRINT: Print call with args 
    print(f'beats({gesture1}, {gesture2})')  
    return (gesture1 == 'rock' and gesture2 == 'scissors'
            or gesture1 == 'scissors' and gesture2 == 'paper'
            or gesture1 == 'paper' and gesture2 == 'rock')
            
def play(you, opponent):
    #*** DEBUGGING PRINT: Print call with args 
    print(f'play({you}, {opponent})') 
    # Ignore invalid gestures for now. 
    if beats(you, opponent):
        print('You win')
    elif not beats(you, opponent):
        print('Opponent wins')
    else:
        print('Game is a tie')
            
play('scissors', 'paper')
play('paper', 'scissors')
play('paper', 'paper')

# Solution notes go here:
# Since beats('paper', 'paper') returns False, 
# not beats('paper', 'paper') returns True
# and so 'Opponent wins' is printed. 
# This can be fixed by replacing not beats(you, opponent) 
# by beats(opponent, you)

play(scissors, paper)
beats(scissors, paper)
You win
play(paper, scissors)
beats(paper, scissors)
beats(paper, scissors)
Opponent wins
play(paper, paper)
beats(paper, paper)
beats(paper, paper)
Opponent wins
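
The printed calls show that beats(paper, paper) is called twice and that the second test, not beats(you, opponent), is True whenever you do not win, so the else branch can never run. Below is a sketch of one possible repair. The name playFixed is hypothetical, and returning the message in addition to printing it is an assumption made here so that the outcome can be checked in a test.

```python
def beats(gesture1, gesture2):
    '''
    In the rock/paper/scissors game:
    * rock beats scissors
    * scissors beats paper
    * paper beats rock
    '''
    return (gesture1 == 'rock' and gesture2 == 'scissors'
            or gesture1 == 'scissors' and gesture2 == 'paper'
            or gesture1 == 'paper' and gesture2 == 'rock')

def playFixed(you, opponent):
    # Ignore invalid gestures for now.
    if beats(you, opponent):
        result = 'You win'
    elif beats(opponent, you):  # the opponent wins only if their gesture beats yours
        result = 'Opponent wins'
    else:
        result = 'Game is a tie'
    print(result)
    return result  # ASSUMPTION: also return the message so it can be tested

playFixed('scissors', 'paper')
playFixed('paper', 'scissors')
playFixed('paper', 'paper')
```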

9.3 Use `print` to show the return value of a function¶

In addition to showing the arguments to a function when a function is called, it's often a good idea to show both the arguments and the return value when it returns.

In order to do this, it is often necessary to introduce a variable (such as result) to first name the returned value so that it can be printed before it is returned (without recalculating it).

Here are examples of code before/after adding the debugging prints:

import math

def squareBefore(n):
    return n*n

def squareAfter(n):
    result = n*n
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f'square({n}) => {result}')
    return result

def hypotenuseBefore(a,b):
    return math.sqrt(squareBefore(a) + squareBefore(b))

def hypotenuseAfter(a,b):
    result = math.sqrt(squareAfter(a) + squareAfter(b))
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f'hypotenuse({a}, {b}) => {result}')
    return result

hypotenuseAfter(3,4)

square(3) => 9
square(4) => 16
hypotenuse(3, 4) => 5.0

5.0

When returns are performed in conditional branches, you should:

Initialize result to None before the conditionals.
Replace each return Expr by result = Expr
End the function body with return result

Below are examples of some buggy code before/after adding the debugging prints. Use the printed output to help you find and fix the bugs:

def isEvenBefore(n):
    return n%2

def isEvenAfter(n):
    result = n%2
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f'isEven({n}) => {result}')
    return result
          
def chooseColorBefore(index, color1, color2):
    '''If index is even, return color1; otherwise return color2'''
    if isEvenBefore(index):
          return color1
    else:
          return color2
          
def chooseColorAfter(index, color1, color2):
    '''If index is even, return color1; otherwise return color2'''
    result = None
    if isEvenAfter(index):
          result = color1
    else:
          result = color2
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f'chooseColor({index}, {color1}, {color2}) => {result}')
    return result
          
for i in range(4):
    chooseColorAfter(i, 'blue', 'green')
    
# Solution notes go here:
# The bug is that isEven returns n%2, not n%2 == 0.
# Python treats the even remainder 0 as Falsey and the odd remainder 1 as Truthy,
# so chooseColor returns color2 for even indices and color1 for odd ones.

isEven(0) => 0
chooseColor(0, blue, green) => green
isEven(1) => 1
chooseColor(1, blue, green) => blue
isEven(2) => 0
chooseColor(2, blue, green) => green
isEven(3) => 1
chooseColor(3, blue, green) => blue
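
The printed lines show isEven returning 0 and 1 rather than True and False. A minimal sketch of the repaired pair is shown below; the names isEvenFixed and chooseColorFixed are hypothetical.

```python
def isEvenFixed(n):
    # Compare the remainder to 0 so that the result is a boolean.
    return n % 2 == 0

def chooseColorFixed(index, color1, color2):
    '''If index is even, return color1; otherwise return color2'''
    if isEvenFixed(index):
        return color1
    else:
        return color2

for i in range(4):
    print(f'chooseColorFixed({i}, blue, green) => {chooseColorFixed(i, "blue", "green")}')
```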

import random

def randomGestureBefore():
    n = random.randint(1,4)
    if n == 1:
        return 'rock'
    if n == 2:
        return 'paper'
    if n == 3:
        return 'scissors'
    
def randomGestureAfter():
    n = random.randint(1,4)
    result = None
    if n == 1:
        result = 'rock'
    if n == 2:
        result = 'paper'
    if n == 3:
        result = 'scissors'
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f'randomGesture() => {result}')
    return result

# Test randomGestureAfter 10 times
for i in range(10):
    randomGestureAfter()
    
# Solution notes go here:
# random.randint(1,4) returns a number between 1 and 4 *inclusive*
# Since 4 is not handled by the if statements, None is returned when n is 4.

randomGesture() => None
randomGesture() => paper
randomGesture() => scissors
randomGesture() => scissors
randomGesture() => rock
randomGesture() => paper
randomGesture() => None
randomGesture() => None
randomGesture() => None
randomGesture() => paper
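
One possible repair is to restrict the random number to the three values that the function handles. This sketch assumes that exactly three gestures are intended; randomGestureFixed is a hypothetical name.

```python
import random

def randomGestureFixed():
    n = random.randint(1, 3)  # 1, 2, or 3: both endpoints are included
    if n == 1:
        return 'rock'
    if n == 2:
        return 'paper'
    if n == 3:
        return 'scissors'

# Call the function many times and check that None never shows up.
results = [randomGestureFixed() for _ in range(1000)]
print(None in results)  # False
```

Because the function is random, a single passing call says little; repeating the call many times makes a missing case much more likely to appear.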

9.4 Use `print` to show both calling and returning from a function¶

When a function is giving an error, it's a good idea to use print to show both when the function is called and when it returns.

Here's an example; use the printed information to find and fix the bug.

def isBookendsBefore(word):
    '''Returns True if word begins and ends with the same character;
    otherwise returns False'''
    return word[0] == word[-1]

def isBookendsAfter(word):
    '''Returns True if word begins and ends with the same character;
    otherwise returns False'''
    #*** DEBUGGING PRINT: Print call with args
    print(f"Entering isBookends('{word}')")
    result = word[0] == word[-1]
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f"Exiting isBookends('{word}') => {result}")

for w in ['mom', 'cat', 'I', '', 'ee']:
    isBookendsAfter(w)

Entering isBookends('mom')
Exiting isBookends('mom') => True
Entering isBookends('cat')
Exiting isBookends('cat') => False
Entering isBookends('I')
Exiting isBookends('I') => True
Entering isBookends('')

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[39], line 16
     13     print(f"Exiting isBookends('{word}') => {result}")
     15 for w in ['mom', 'cat', 'I', '', 'ee']:
---> 16     isBookendsAfter(w)

Cell In[39], line 11, in isBookendsAfter(word)
      9 #*** DEBUGGING PRINT: Print call with args
     10 print(f"Entering isBookends('{word}')")
---> 11 result = word[0] == word[-1]
     12 #*** DEBUGGING PRINT: Print call with args and return value
     13 print(f"Exiting isBookends('{word}') => {result}")

IndexError: string index out of range
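
The last line printed before the error is Entering isBookends(''), with no matching Exiting line, so the failure happens on the empty string: word[0] does not exist when word is ''. One possible fix is sketched below. Returning False for the empty string is an assumption, since the docstring does not say what should happen in that case, and isBookendsFixed is a hypothetical name.

```python
def isBookendsFixed(word):
    '''Returns True if word begins and ends with the same character;
    otherwise returns False'''
    #*** DEBUGGING PRINT: Print call with args
    print(f"Entering isBookends('{word}')")
    if len(word) == 0:
        result = False  # ASSUMPTION: an empty word has no bookends
    else:
        result = word[0] == word[-1]
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f"Exiting isBookends('{word}') => {result}")
    return result

for w in ['mom', 'cat', 'I', '', 'ee']:
    isBookendsFixed(w)
```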

9.5 Using `print` to display iteration tables¶

We saw in Lec 08 While Loops and Lec 09 Sequences and Loop that adding two print statements to a loop (one right before the loop and one at the end of the loop body) can display an iteration table for the state variables of the loop.

Let's review that technique here in the context of debugging the definition of countChar given at the beginning of this notebook.

In countCharTable below, in addition to displaying the state variables i and counter, we also display word[i], since this is important for debugging.

For completeness, we might also want to print when the function is called and when it returns. But to avoid too much clutter, we will include only the iteration table prints in these examples.

def countCharTable(char, word):
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(1, len(word)-1):
        #*** DEBUGGING PRINT: Print rows of iteration table
        print(f"countChar loop: | i: {i} | word[i]: {word[i]} | counter: {counter} |")
        if word[i] == char.lower():
            counter += 1
    return counter

We know from testing that countChar('m', 'mississippi') returns 0 when 1 is expected. Why is that? Let's see:

countCharTable('m', 'mississippi')

countChar loop: | i: 1 | word[i]: i | counter: 0 |
countChar loop: | i: 2 | word[i]: s | counter: 0 |
countChar loop: | i: 3 | word[i]: s | counter: 0 |
countChar loop: | i: 4 | word[i]: i | counter: 0 |
countChar loop: | i: 5 | word[i]: s | counter: 0 |
countChar loop: | i: 6 | word[i]: s | counter: 0 |
countChar loop: | i: 7 | word[i]: i | counter: 0 |
countChar loop: | i: 8 | word[i]: p | counter: 0 |
countChar loop: | i: 9 | word[i]: p | counter: 0 |

0

Ah, because i starts at 1 rather than 0, the letter starting the word is never counted. Let's fix that:

def countCharTableFix1(char, word):
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(0, len(word)-1): #*** Bug Fix #1: start index should be 0, not 1
        #*** DEBUGGING PRINT: Print rows of iteration table
        print(f"countChar loop: | i: {i} | word[i]: {word[i]} | counter: {counter} |")
        if word[i] == char.lower():
            counter += 1
    return counter

Now countCharTableFix1('m', 'mississippi') works as expected:

countCharTableFix1('m', 'mississippi')

countChar loop: | i: 0 | word[i]: m | counter: 0 |
countChar loop: | i: 1 | word[i]: i | counter: 1 |
countChar loop: | i: 2 | word[i]: s | counter: 1 |
countChar loop: | i: 3 | word[i]: s | counter: 1 |
countChar loop: | i: 4 | word[i]: i | counter: 1 |
countChar loop: | i: 5 | word[i]: s | counter: 1 |
countChar loop: | i: 6 | word[i]: s | counter: 1 |
countChar loop: | i: 7 | word[i]: i | counter: 1 |
countChar loop: | i: 8 | word[i]: p | counter: 1 |
countChar loop: | i: 9 | word[i]: p | counter: 1 |

1

Below, countCharTableFix1('i', 'mississippi') still returns 3 rather than the expected 4. Why?

countCharTableFix1('i', 'mississippi')

countChar loop: | i: 0 | word[i]: m | counter: 0 |
countChar loop: | i: 1 | word[i]: i | counter: 0 |
countChar loop: | i: 2 | word[i]: s | counter: 1 |
countChar loop: | i: 3 | word[i]: s | counter: 1 |
countChar loop: | i: 4 | word[i]: i | counter: 1 |
countChar loop: | i: 5 | word[i]: s | counter: 2 |
countChar loop: | i: 6 | word[i]: s | counter: 2 |
countChar loop: | i: 7 | word[i]: i | counter: 2 |
countChar loop: | i: 8 | word[i]: p | counter: 3 |
countChar loop: | i: 9 | word[i]: p | counter: 3 |

3

Oh, the loop never processes the last letter i at word[10] because the second argument to range is len(word)-1 rather than len(word). Let's fix this second bug:

def countCharTableFix2(char, word):
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(0, len(word)): #*** Bug Fix #1: start index should be 0, not 1
                                  #*** Bug Fix #2: stop index should be len(word), 
                                  #      not len(word)-1
        #*** DEBUGGING PRINT: Print rows of iteration table
        print(f"countChar loop: | i: {i} | word[i]: {word[i]} | counter: {counter} |")
        if word[i] == char.lower():
            counter += 1
    return counter

With this second fix, countCharTableFix2('i', 'mississippi') now works as expected:

countCharTableFix2('i', 'mississippi')

countChar loop: | i: 0 | word[i]: m | counter: 0 |
countChar loop: | i: 1 | word[i]: i | counter: 0 |
countChar loop: | i: 2 | word[i]: s | counter: 1 |
countChar loop: | i: 3 | word[i]: s | counter: 1 |
countChar loop: | i: 4 | word[i]: i | counter: 1 |
countChar loop: | i: 5 | word[i]: s | counter: 2 |
countChar loop: | i: 6 | word[i]: s | counter: 2 |
countChar loop: | i: 7 | word[i]: i | counter: 2 |
countChar loop: | i: 8 | word[i]: p | counter: 3 |
countChar loop: | i: 9 | word[i]: p | counter: 3 |
countChar loop: | i: 10 | word[i]: i | counter: 3 |

4

There's still another bug: countCharTableFix2('I', 'MISSISSIPPI') returns 0 rather than the expected 4. Why is that?

countCharTableFix2('I', 'MISSISSIPPI')

countChar loop: | i: 0 | word[i]: M | counter: 0 |
countChar loop: | i: 1 | word[i]: I | counter: 0 |
countChar loop: | i: 2 | word[i]: S | counter: 0 |
countChar loop: | i: 3 | word[i]: S | counter: 0 |
countChar loop: | i: 4 | word[i]: I | counter: 0 |
countChar loop: | i: 5 | word[i]: S | counter: 0 |
countChar loop: | i: 6 | word[i]: S | counter: 0 |
countChar loop: | i: 7 | word[i]: I | counter: 0 |
countChar loop: | i: 8 | word[i]: P | counter: 0 |
countChar loop: | i: 9 | word[i]: P | counter: 0 |
countChar loop: | i: 10 | word[i]: I | counter: 0 |

0

The reason isn't obvious from the iteration table, but it does give a hint. Why is counter not being incremented when word[i] is the letter I? It's because we're comparing word[i] with char.lower() when we should be using word[i].lower() instead. Let's fix that third bug:

def countCharTableFix3(char, word):
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(0, len(word)): #*** Bug Fix #1: start index should be 0, not 1
                                  #*** Bug Fix #2: stop index should be len(word), 
                                  #      not len(word)-1
        #*** DEBUGGING PRINT: Print rows of iteration table
        print(f"countChar loop: | i: {i} | word[i]: {word[i]} | counter: {counter} |")
        if word[i].lower() == char.lower(): #*** Bug Fix #3: add .lower() to word[i]
            counter += 1
    return counter

Now countCharTableFix3('I', 'MISSISSIPPI') works as expected.

countCharTableFix3('I', 'MISSISSIPPI')

countChar loop: | i: 0 | word[i]: M | counter: 0 |
countChar loop: | i: 1 | word[i]: I | counter: 0 |
countChar loop: | i: 2 | word[i]: S | counter: 1 |
countChar loop: | i: 3 | word[i]: S | counter: 1 |
countChar loop: | i: 4 | word[i]: I | counter: 1 |
countChar loop: | i: 5 | word[i]: S | counter: 2 |
countChar loop: | i: 6 | word[i]: S | counter: 2 |
countChar loop: | i: 7 | word[i]: I | counter: 2 |
countChar loop: | i: 8 | word[i]: P | counter: 3 |
countChar loop: | i: 9 | word[i]: P | counter: 3 |
countChar loop: | i: 10 | word[i]: I | counter: 3 |

4

If we fix all three bugs in countChar, do we resolve all the test case failures?

Below, note that we comment out the debugging prints so they do not interfere with the testing. But we do not delete the debugging prints, since we may want to uncomment them for debugging purposes in the future!

def countCharFixed(char, word): # FIXED VERSION, WITH PRINTS COMMENTED OUT
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(0, len(word)): #*** Bug Fix #1: start index should be 0, not 1
                                  #*** Bug Fix #2: stop index should be len(word), 
                                  #      not len(word)-1
        #*** DEBUGGING PRINT: Print rows of iteration table
        #print(f"countChar loop: | i: {i} | word[i]: {word[i]} | counter: {counter} |")
        if word[i].lower() == char.lower(): #*** Bug Fix #3: add .lower() to word[i]
            counter += 1
    return counter

test_countCharFunction(countCharFixed)

----------------------------------------
Testing countChar function <function countCharFixed at 0x103de2710>

✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34
✓ 981255542.py:34

Great! We now pass all the test cases. Does that mean our function is completely correct?

Not necessarily! Maybe there are some cases in which the function still doesn't work, but they're not in our list of test cases. So it may be too early to declare victory, but we've increased our confidence in the correctness of the countChar function definition.
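
One way to gain a little more confidence is to compare the function against an independent way of computing the same answer on many generated inputs. The sketch below is an illustration only, not part of the lecture's test suite. The helper findCountCharMismatches is hypothetical, and it assumes that Python's built-in str.count applied to lowercased strings is a trustworthy reference for single-letter searches.

```python
import random

def countCharFixed(char, word):
    '''Return the number of times char occurs in word, ignoring case.'''
    counter = 0
    for i in range(0, len(word)):
        if word[i].lower() == char.lower():
            counter += 1
    return counter

def findCountCharMismatches(countCharFunction, numTrials=1000, seed=111):
    '''Hypothetical helper: return a list of (char, word, expected, actual)
    tuples for generated inputs on which countCharFunction disagrees with
    the str.count reference.'''
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    letters = 'abcABC'         # small alphabet, so matches are common
    mismatches = []
    for _ in range(numTrials):
        char = rng.choice(letters)
        word = ''.join(rng.choice(letters) for _ in range(rng.randint(0, 8)))
        expected = word.lower().count(char.lower())  # ASSUMED reference answer
        actual = countCharFunction(char, word)
        if actual != expected:
            mismatches.append((char, word, expected, actual))
    return mismatches

mismatches = findCountCharMismatches(countCharFixed)
print(f'{len(mismatches)} mismatches found')
```

Finding no mismatches still does not prove correctness; it only means that none of the generated inputs exposed a problem.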

9.6 Use the Thonny Debugger¶

Instead of (or in addition to) sprinkling prints in your program, another way to debug is to use the debugging features of Thonny, like setting breakpoints, stepping over and into functions, and examining the state of the program variables.

For example, the hasCGBlock7 function below encounters an index error when run on the input 'CGACCGG'.

def hasCGBlock7(seq):
    for index in range(len(seq)):
        if (seq[index] in 'CG'
            and seq[index+1] in 'CG'
            and seq[index+2] in 'CG'
            and seq[index+3] in 'CG'
            and seq[index+4] in 'CG'):
            return True
    return False

print(hasCGBlock7('CGACCGG'))

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[52], line 11
      8             return True
      9     return False
---> 11 print(hasCGBlock7('CGACCGG'))

Cell In[52], line 7, in hasCGBlock7(seq)
      1 def hasCGBlock7(seq):
      2     for index in range(len(seq)):
      3         if (seq[index] in 'CG'
      4             and seq[index+1] in 'CG'
      5             and seq[index+2] in 'CG'
      6             and seq[index+3] in 'CG'
----> 7             and seq[index+4] in 'CG'):
      8             return True
      9     return False

IndexError: string index out of range

To understand why this happens:

1. Copy the above cell (both the hasCGBlock7 function and the print call) into a new Thonny editor window, and save the file (say, as hasCGBlock7.py).
2. In Thonny, set a breakpoint by double-clicking on line 3 (the if statement).
3. Click on the bug icon to run the file in debugging mode. Execution should stop when it first reaches the breakpoint. At this point, the value of the index variable (shown at the bottom of the function frame for hasCGBlock7('CGACCGG')) will be 0.
4. Click on the green debugging triangle (to the left of the stop sign) to resume execution until the breakpoint is encountered again. Each time you click on the green triangle, index will increase by 1. When index is 3 and you click on the green triangle, an index error is raised, which shows that the error happens when index is 3. The error also ends the debugging session.
5. Repeat steps #3 and #4, except in #4 do not click the green debugging triangle when index is 3. Instead, click the step into icon repeatedly to see the detailed evaluation of the test expression of the conditional (which has four ands), and watch carefully for when the index error occurs. On what line does it occur, and why?
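
What the debugger shows in the last step can also be approximated in plain Python. The sketch below is not part of the original exercise: firstOutOfRangeOffset is a hypothetical helper that mimics how the chained `and` test in hasCGBlock7 is evaluated for one value of index. It reports the offset at which indexing would go out of range, or None if the test ends without an error.

```python
def firstOutOfRangeOffset(seq, index, width=5):
    """Mimic the chained `and` test in hasCGBlock7 for one value of index.

    Returns the offset k at which seq[index + k] would raise IndexError,
    or None if the test finishes (or short-circuits) without an error.
    """
    for offset in range(width):
        position = index + offset
        if position >= len(seq):
            return offset   # seq[position] would raise IndexError here
        if seq[position] not in 'CG':
            return None     # `and` short-circuits; later operands are skipped
    return None             # all five operands were evaluated without error

# Show what would happen for each value of index on the failing input
for index in range(len('CGACCGG')):
    print(index, firstOutOfRangeOffset('CGACCGG', index))
```

For index values 0, 1, and 2 the helper returns None, because a base that is not C or G stops the evaluation early. For index 3 it returns 4, which corresponds to the `seq[index+4]` line that the traceback points to.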

10. Exercise 6: Debugging Buggy Versions of `hasCGBlock`¶

Use the debugging techniques above, particularly printing iteration tables, to identify (but not necessarily fix) the bugs in the following buggy versions of hasCGBlock. Test the print-augmented versions on potential counterexamples to identify bugs.

def hasCGBlock3(seq):
    count = 0
    for base in seq:
        if base in 'CG':
            count += 1
            if count == 5:
                return True
        else:
            return False

# Try the debugging version on potential counterexamples
# hasCGBlock3Table('CGACCGCGU')

# Write your debugging version below:
# Your code here
def hasCGBlock3Table(seq):
    #*** DEBUGGING PRINT: Print call with args
    print(f"Entering hasCGBlock3('{seq}')")
    count = 0
    for base in seq:
        #*** DEBUGGING PRINT: Print row of iteration table
        print(f"hasCGBlock3 loop: | base: {base} | count: {count} |")
        if base in 'CG':
            count += 1
            if count == 5:
                #*** DEBUGGING PRINT: Print call with args and return value
                print(f"Exiting hasCGBlock3('{seq}') => {True}")
                return True
        else:
            #*** DEBUGGING PRINT: Print call with args and return value
            print(f"Exiting hasCGBlock3('{seq}') => {False}")
            return False
        #*** DEBUGGING PRINT: Print remaining rows of iteration table
        print(f"hasCGBlock3 loop: | base: {base} | count: {count} |")
    # Note: function will return None if it reaches this point
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f"Exiting hasCGBlock3('{seq}') => {None}")


# Try the debugging version on potential counterexamples
hasCGBlock3Table('CGACCGCGU')

# Solution notes go here:
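# The counter is correctly incremented for each C/G,
# but the function returns False as soon as the first non-C/G base
# is encountered, without examining the rest of the sequence.
# (It also returns None, not False, if the loop ends without reaching 5.)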

Entering hasCGBlock3('CGACCGCGU')
hasCGBlock3 loop: | base: C | count: 0 |
hasCGBlock3 loop: | base: C | count: 1 |
hasCGBlock3 loop: | base: G | count: 1 |
hasCGBlock3 loop: | base: G | count: 2 |
hasCGBlock3 loop: | base: A | count: 2 |
Exiting hasCGBlock3('CGACCGCGU') => False

False
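
To complement the iteration table, here is a small sketch that runs hasCGBlock3 (copied from above with its bugs left in place) on a few candidate counterexamples and compares each result with the expected one. The expected values are an assumption: they take a CG block to mean five consecutive bases that are each C or G.

```python
def hasCGBlock3(seq):
    # Copied from above, bugs included
    count = 0
    for base in seq:
        if base in 'CG':
            count += 1
            if count == 5:
                return True
        else:
            return False

# (input, expected result under the five-consecutive-C/G assumption)
candidates = [('CGCGC', True), ('ACCGCG', True), ('CG', False), ('CGACCGCGU', True)]

for seq, expected in candidates:
    actual = hasCGBlock3(seq)
    status = 'ok' if actual == expected else 'MISMATCH'
    print(f"hasCGBlock3('{seq}') => {actual} (expected {expected}) {status}")
```

Only 'CGCGC' matches its expected result. 'ACCGCG' and 'CGACCGCGU' return False because of the early return, and 'CG' returns None because the loop ends without any return statement being executed. Short inputs like 'ACCGCG' and 'CG' are good candidates for minimal counterexamples.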

def hasCGBlock7(seq):
    for index in range(len(seq)):
        if (seq[index] in 'CG'
            and seq[index+1] in 'CG'
            and seq[index+2] in 'CG'
            and seq[index+3] in 'CG'
            and seq[index+4] in 'CG'):
            return True
    return False

# Try the debugging version on potential counterexample
# hasCGBlock7Table('CGACCG')

# Write your debugging version below:
# Your code here
def hasCGBlock7Table(seq):
    #*** DEBUGGING PRINT: Print call with args
    print(f"Entering hasCGBlock7('{seq}')")  
    for index in range(len(seq)): 
        #*** DEBUGGING PRINT: Print row of iteration table
        print(f"hasCGBlock7 loop: | index: {index} |")
        if (seq[index] in 'CG'
            and seq[index+1] in 'CG'
            and seq[index+2] in 'CG'
            and seq[index+3] in 'CG'
            and seq[index+4] in 'CG'):
            #*** DEBUGGING PRINT: Print call with args and return value
            print(f"Exiting hasCGBlock7('{seq}') => {True}")
            return True
    #*** DEBUGGING PRINT: Print call with args and return value
    print(f"Exiting hasCGBlock7('{seq}') => {False}")
    return False

#Try the debugging version on potential counterexamples
hasCGBlock7Table('CGCCGA')
hasCGBlock7Table('CGACCG')

# Solution notes go here:
# The function succeeds on 'CGCCGA' by verifying that the 5 indices
# starting at 0 all contain a C or G
#
# The function fails on 'CGACCG' when index = 3, because seq[index+3]
# is seq[6], and index 6 is out of bounds
#
# Why didn't it similarly fail at index = 2 at the line seq[index+4]?
# Because Python has a "short-circuit" `and` construct, and since
# seq[index] = 'A' when index is 2, the `and` returns false immediately
# without testing seq[index+1],  seq[index+2], etc.

Entering hasCGBlock7('CGCCGA')
hasCGBlock7 loop: | index: 0 |
Exiting hasCGBlock7('CGCCGA') => True
Entering hasCGBlock7('CGACCG')
hasCGBlock7 loop: | index: 0 |
hasCGBlock7 loop: | index: 1 |
hasCGBlock7 loop: | index: 2 |
hasCGBlock7 loop: | index: 3 |

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[54], line 36
     34 #Try the debugging version on potential counterexamples
     35 hasCGBlock7Table('CGCCGA')
---> 36 hasCGBlock7Table('CGACCG')
     38 # Solution notes go here:
     39 # The function succeeds on 'CGCCGA' by verifying that the 5 indices
     40 # starting at 0 all contain a C or G
   (...)
     47 # seq[index] = 'A' when index is 2, tha `and` returns false immediately
     48 # without testing seq[index+1],  seq[index+2], etc. 

Cell In[54], line 25, in hasCGBlock7Table(seq)
     19 for index in range(len(seq)): 
     20     #*** DEBUGGING PRINT: Print row of iteration table
     21     print(f"hasCGBlock7 loop: | index: {index} |")
     22     if (seq[index] in 'CG'
     23         and seq[index+1] in 'CG'
     24         and seq[index+2] in 'CG'
---> 25         and seq[index+3] in 'CG'
     26         and seq[index+4] in 'CG'):
     27         #*** DEBUGGING PRINT: Print call with args and return value
     28         print(f"Exiting hasCGBlock7('{seq}') => {True}")
     29         return True

IndexError: string index out of range
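
The exercise only asks you to identify the bug, but for comparison here is one possible repair, offered as a sketch and not necessarily the intended solution. hasCGBlockBounded is a hypothetical name. The sketch assumes that a CG block means five consecutive bases that are each C or G, and it stops the loop early enough that every five-base window fits inside the sequence.

```python
def hasCGBlockBounded(seq):
    # Hypothetical repaired version: index only ranges over positions
    # where a full five-base window seq[index:index+5] exists.
    for index in range(len(seq) - 4):
        if all(base in 'CG' for base in seq[index:index + 5]):
            return True
    return False

print(hasCGBlockBounded('CGCCGA'))   # → True
print(hasCGBlockBounded('CGACCG'))   # → False
print(hasCGBlockBounded('CGACCGG'))  # → False
```

When seq has fewer than five bases, range(len(seq) - 4) is empty, so the loop body never runs and the function returns False without indexing past the end of the string.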

This is the end of the notebook!
