This task is part of project05 which is due at 23:00 EDT on 2024-10-08.
You have the option to work with a partner on this task if you wish. Working with a partner takes more effort to coordinate schedules, but if you work together and make sure that you both understand the code you write, you will make progress faster and learn more.
You can download the starter code for this task using this link.
You can submit this task using this link.
Put all of your work for this task into the file genetics.py (which is provided among the starter files).
This task will give you practice using loops to deal with strings letter-by-letter. It is biology-themed, but no understanding of biology should be required to complete it.
RNA and DNA are molecules that make up our genetic code. Their structure can be represented as a sequence of bases, written as the letters 'A', 'T', 'C', 'G', and 'U', where each base is one letter. In our bodies, DNA gets transcribed into RNA by replacing each 'A' with a 'U', each 'T' with an 'A', each 'C' with a 'G', and each 'G' with a 'C'. In this task we will be dealing with strings containing these letters, and writing functions that process these strings using loops that consider each letter individually. There are 5 different functions you must write:
countOtherBases - This function takes two arguments, (1) a sequence of bases and (2) a particular base to exclude, and returns the number of bases (i.e., letters) in the sequence that are different from the excluded base. These examples for countOtherBases show how it works.
templateSequence - Given a sequence of RNA bases, returns the sequence of DNA bases that must have been used to create that RNA sequence (i.e., the template for that piece of RNA). Within this function, you must use the provided templateBase function to compute the DNA template base that corresponds to a single RNA base. These examples for templateSequence demonstrate how it works.
onlyAU - Given an RNA sequence, returns a new sequence consisting of only the 'A' and 'U' bases from the original sequence, with the 'C' and 'G' bases removed. These examples for onlyAU demonstrate how it must work.
transcriptionErrors - Given two sequences, one of RNA and one of DNA, looks for transcription errors where the base in the RNA sequence doesn't match the base in the DNA sequence. Each RNA base is supposed to be transcribed from a particular DNA base (for example, the DNA base 'A' becomes the RNA base 'U'), but sometimes errors occur in this process. Given an RNA sequence and the DNA it came from, any place where the corresponding bases aren't paired (according to the templateBase function) is a transcription error. This function must return the number of transcription errors evident in the given RNA and DNA pair. These examples of transcriptionErrors demonstrate how this should work. For this function, you may NOT assume that the two strings provided will always be the same length. If one is longer, then in addition to counting errors in the portion where they overlap, each extra base in the longer string should count as an error, although we will only test this as an extra goal.
hasCGBlock - Given a sequence of RNA, this function must return True if it contains a block of 5 consecutive 'C' and/or 'G' bases, and False otherwise. The block may be any combination of 'C' and 'G' bases as long as there are 5 in a row with no other bases in between them. If other bases are present, there might be more than 5 total 'C' or 'G' bases in the sequence without it actually containing a 'CG' block. These examples of hasCGBlock demonstrate how it must work.
Once you have written these five functions and have tested them to make sure they work correctly, you are done with this task. Refer to the examples below in understanding how these functions should work.
The genetics.py file contains a correct definition of the templateBase function, which you must use in your definitions of templateSequence and transcriptionErrors.
Tests that use optimism are provided for this task in the test_genetics.py file. This time around, we have not provided enough tests to really check most of your functions. You are encouraged to edit test_genetics.py and add more test cases (just follow the format of the cases that we've provided) if you want to be sure that your functions work correctly.
countOtherBases Examples
Examples of how countOtherBases works. You may assume that the second argument will always be a single letter.
In []: countOtherBases('GAUUACA', 'A')
Out[]: 4
In []: countOtherBases('GAUUACA', 'G')
Out[]: 6
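The counting pattern described above can be sketched as follows. This is only one possible approach, not the official solution; the parameter names sequence and excluded are assumptions chosen for illustration. It uses a single loop and an if statement, and does not use the built-in .count method.

```python
def countOtherBases(sequence, excluded):
    """Count the letters in sequence that differ from excluded."""
    count = 0
    # Look at each base (letter) one at a time
    for base in sequence:
        # Only bases that are NOT the excluded base are counted
        if base != excluded:
            count += 1
    return count


print(countOtherBases('GAUUACA', 'A'))  # prints 4
print(countOtherBases('GAUUACA', 'G'))  # prints 6
```

The running total starts at zero and grows by one for each base that passes the test, so an empty sequence gives 0.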
templateSequence Examples
Examples of how templateSequence works. You are required to use the provided templateBase function to compute the template version of each RNA base.
In []: templateSequence('GACU')
Out[]: 'CTGA'
In []: templateSequence('AAA')
Out[]: 'TTT'
In []: templateSequence('UUU')
Out[]: 'AAA'
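A minimal sketch of building up a result string base-by-base is shown here. The templateBase defined in this block is a hypothetical stand-in so that the example runs on its own; its mapping is inferred from the examples above, and in your own work you should use the version provided in genetics.py instead. The parameter name rna is also an assumption.

```python
def templateBase(rnaBase):
    """Hypothetical stand-in for the templateBase provided in genetics.py."""
    # Mapping inferred from the examples: RNA base -> DNA template base
    pairs = {'A': 'T', 'U': 'A', 'C': 'G', 'G': 'C'}
    return pairs[rnaBase]


def templateSequence(rna):
    """Return the DNA template for the given RNA sequence."""
    result = ''
    # Convert each RNA base and append it to the growing result
    for base in rna:
        result += templateBase(base)
    return result


print(templateSequence('GACU'))  # prints CTGA
print(templateSequence('UUU'))   # prints AAA
```

Starting from an empty string and appending one converted base per iteration keeps the output in the same order as the input.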
onlyAU Examples
Examples of how onlyAU works.
In []: onlyAU('GUACGU')
Out[]: 'UAU'
In []: onlyAU('AAU')
Out[]: 'AAU'
In []: onlyAU('GCC')
Out[]: ''
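Filtering a string can be sketched by combining the accumulation pattern with an if statement. This is one possible approach under the assumption that the input contains only RNA bases; the parameter name rna is chosen for illustration.

```python
def onlyAU(rna):
    """Return a copy of rna that keeps only its 'A' and 'U' bases."""
    result = ''
    for base in rna:
        # Keep the base only if it is an 'A' or a 'U'
        if base == 'A' or base == 'U':
            result += base
    return result


print(onlyAU('GUACGU'))  # prints UAU
print(onlyAU('AAU'))     # prints AAU
```

When no base passes the test, as with 'GCC', the result stays as the empty string it started as.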
transcriptionErrors Examples
Examples of how transcriptionErrors works. Note that it looks at the first RNA base and the first DNA base together, determines whether they are a correct match (according to the templateBase function), and then repeats this process for each pair of bases, counting the number of mismatched pairs. In the first example, because the templateBase for 'A' is 'T', the first 'A' of the RNA sequence and the first 'T' of the DNA sequence match (i.e., not an error). The second bases also match (they're 'A' and 'T' again), but in the third position, the 'A' in the RNA does not match the 'C' in the DNA (it should have been a third 'T', or the RNA base should have been a 'G'). Thus the number of errors is 1 for those two sequences. The second example shows two sequences that are perfectly matched, while the third example contains two errors: the initial 'C' in the RNA is paired with a 'C' in the DNA, and the second 'A' in the RNA is paired with an 'A' in the DNA. In the fourth example, the 'U' in the third position is paired with a 'T' (one error), and the DNA sequence has two extra bases that each count as an error, for a total of 3.
In []: transcriptionErrors('AAA', 'TTC')
Out[]: 1
In []: transcriptionErrors('AAG', 'TTC')
Out[]: 0
In []: transcriptionErrors('CAGUAGG', 'CTCAACC')
Out[]: 2
In []: transcriptionErrors('GAUA', 'CTTTCG')
Out[]: 3
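Comparing two strings position-by-position can be sketched with an index loop over the overlapping portion. This is a sketch under stated assumptions, not the official solution: the templateBase below is a hypothetical stand-in for the one provided in genetics.py (with its mapping inferred from the examples), and the parameter names rna and dna are chosen for illustration.

```python
def templateBase(rnaBase):
    """Hypothetical stand-in for the templateBase provided in genetics.py."""
    pairs = {'A': 'T', 'U': 'A', 'C': 'G', 'G': 'C'}
    return pairs[rnaBase]


def transcriptionErrors(rna, dna):
    """Count mismatched pairs, plus extra bases in the longer sequence."""
    shorter = min(len(rna), len(dna))
    longer = max(len(rna), len(dna))
    # Each extra base beyond the overlap counts as one error
    errors = longer - shorter
    # Compare paired bases only where both sequences have a base
    for i in range(shorter):
        if templateBase(rna[i]) != dna[i]:
            errors += 1
    return errors


print(transcriptionErrors('AAA', 'TTC'))      # prints 1
print(transcriptionErrors('GAUA', 'CTTTCG'))  # prints 3
```

Looping over indices rather than letters is what lets the same position be read from both strings, and limiting the range to the shorter length avoids indexing past the end of either one.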
hasCGBlock Examples
Examples of how hasCGBlock works. Note that in the second example, there are a total of at least 5 'C's and 'G's, but they don't form a continuous block (the 'A' interrupts them).
In []: hasCGBlock('CGGCC')
Out[]: True
In []: hasCGBlock('CGACCG')
Out[]: False
In []: hasCGBlock('CGACCGCGU')
Out[]: True
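Detecting a run of consecutive letters can be sketched by tracking the length of the current run and resetting it whenever the run is interrupted. This is one possible approach; the variable name run and the parameter name rna are assumptions chosen for illustration.

```python
def hasCGBlock(rna):
    """Return True if rna contains 5 'C'/'G' bases in a row."""
    run = 0  # length of the current unbroken run of 'C'/'G' bases
    for base in rna:
        if base == 'C' or base == 'G':
            run += 1
            # Stop as soon as a long enough block is found
            if run == 5:
                return True
        else:
            # Any other base interrupts the block
            run = 0
    # The loop finished without ever seeing 5 in a row
    return False


print(hasCGBlock('CGGCC'))      # prints True
print(hasCGBlock('CGACCG'))     # prints False
print(hasCGBlock('CGACCGCGU'))  # prints True
```

Returning True from inside the loop ends the function early, while the final return False is only reached when no block was found anywhere in the sequence.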
Each time you create a variable (using = or by defining a parameter for a function) you must also later use that variable as part of another expression. If you need to create a variable that you won't use, it must have the name _, but you should only do this if absolutely necessary.
Each function must return the correct result: the result returned when your countOtherBases, templateSequence, onlyAU, transcriptionErrors, or hasCGBlock function is run must match the solution result.
countOtherBases - Use def to define countOtherBases with 2 parameters. Within that definition, use any kind of loop in at least one place (a separate check requires a loop in exactly one place), use an if statement (possibly accompanied by an elif or else block) in at least one place, and use return _ in at least one place. Strings have a built-in .count method, but you may not use it in this function.
templateSequence - Use def to define templateSequence with 1 parameter. Within that definition, use any kind of loop in at least one place (a separate check requires a loop in exactly one place), call templateBase in at least one place, and use return _ in at least one place.
onlyAU - Use def to define onlyAU with 1 parameter. Within that definition, use any kind of loop in at least one place (a separate check requires a loop in exactly one place), use an if statement (possibly accompanied by an elif or else block) in at least one place, and use return _ in at least one place.
transcriptionErrors - Use def to define transcriptionErrors with 2 parameters. Within that definition, call len in at least one place, use any kind of loop in at least one place (a separate check requires a loop in exactly one place), use an if statement (possibly accompanied by an elif or else block) in at least one place, and use return _ in at least one place.
hasCGBlock - Use def to define hasCGBlock with 1 parameter. Within that definition, use any kind of loop in at least one place (a separate check requires a loop in exactly one place), use an if statement (possibly accompanied by an elif or else block) in at least one place, and use return _ in at least one place. A separate check requires using return _ in at least one place within the if/else block within the loop.