This task is part of project06 which is due at 23:00 EDT on 2024-10-22.
You have the option to work with a partner on this task if you wish. Working with a partner requires more work to coordinate schedules, but if you work together and make sure that you are both understanding the code you write, you will make progress faster and learn more.
You can download the starter code for this task using this link.
You can submit this task using this link.
Put all of your work for this task into the file audioToolkit.py (you will create this file from scratch).
Note that this task is required, in addition to one of the two other tasks this week.
Note: Do not modify any of the starter code in the Python files audio_config.py or waveTools.py. You will need to modify apply_audioToolkit.py if you want to hear the effects in action, and you could modify test_audioToolkit.py if you want to add extra tests.
This task involves writing several functions that will process audio files with digital effects. Our program will work only on .wav audio files. Most of the information in a .wav audio file is simply a list of numbers that represent the audio signals of songs, speech, or noise. Yes, the audio you listen to on your computer is stored as a list of numbers!
Most audio files, including .wav files, have two lists of numbers, one for the left speaker and one for the right. These two lists are called channels and the numbers in each list are called samples. We will design most of our audio effects to work on a single channel and the starter code will handle processing both channels.
The effects you will write just manipulate lists of numbers, but the coolest part is that you can (optionally) apply the effects to some real audio files by using the supplied apply_audioToolkit.py file. You will need to comment out or uncomment lines in the applyEffects function in that file to select which effect(s) to apply to which file(s).
The simpleaudio package to let Python play sounds directly
In order for your computer to be able to play the sound files used in the applyEffects function, you will need to install an extra Python library called simpleaudio. Otherwise no sound will be played directly, but you will still be able to open the processed .wav files manually to hear the audio output.
In Thonny, you can install simpleaudio using the following steps:
1. Open the "Tools" menu and choose "Manage packages...".
2. Type simpleaudio into the search field and press ENTER.
3. The simpleaudio package should be displayed; click on the Install button to install it, which should be a fairly quick process.

We have found that for some people, particularly on Windows computers, the install process fails. In that case, see the details below for some options for trying to deal with that.
Note: before following these instructions, remember that you can always just skip this step. Python won't play sounds directly, but you will be able to open up the .wav files it creates manually and hear the sounds, both for this assignment and for other assignments involving audio.
In Thonny, when the install fails, an error window will pop up (if you already closed it, you can simply try the install again). If the text there matches this example, then there may be a way around the problem:
SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  warnings.warn(
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-310
creating build\lib.win-amd64-cpython-310\simpleaudio
copying simpleaudio\__init__.py -> build\lib.win-amd64-cpython-310\simpleaudio
copying simpleaudio\shiny.py -> build\lib.win-amd64-cpython-310\simpleaudio
copying simpleaudio\functionchecks.py -> build\lib.win-amd64-cpython-310\simpleaudio
creating build\lib.win-amd64-cpython-310\simpleaudio\test_audio
copying simpleaudio\test_audio\c.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
copying simpleaudio\test_audio\e.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
copying simpleaudio\test_audio\g.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
copying simpleaudio\test_audio\left_right.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
copying simpleaudio\test_audio\notes_2_16_44.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
running build_ext
building 'simpleaudio._simpleaudio' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure
× Encountered error while trying to install package.
╰─> simpleaudio
note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
Process returned with code 1
If your error message is substantially different from what's shown above, then these steps might not help (and if you're not on Windows, they definitely won't). The key point is the suggestion in the error message about installing Microsoft Visual C++.
It's a non-trivial process which will take up ~8GB of space, although having basic build tools installed on your computer may come in handy at other points in the future. The link to the build tools in the error message (https://visualstudio.microsoft.com/visual-cpp-build-tools/, duplicated here so you can just click it) is the right place to start. On that page, click the "Download Build Tools" button, which will download an installer. Once that (small) download finishes, open it up to start the installation process.
Once the installer opens, it has the typical licence agreement, but then takes you to a confusing page with a huge number of options for what to install. I've included a screenshot of that page below with the correct option selected, but the only thing you need is the very first one in the upper-left corner: the one titled "Desktop development with C++."
Once you select that option, click the "Install" button, but make sure you have enough free space first or the process might not complete. It will go to a progress screen showing download and installation progress which will take a while. You can leave it going in the background and do other things (although don't shut off or hibernate your computer). Depending on your computer and your internet connection, I'd expect this step to take 10-30 minutes for most people.
Once it's installed, it will recommend that you restart, but I didn't
have to do that: just go back to Thonny and try to install simpleaudio
again. If that doesn't work, I'd try to restart the computer once to see
if that helps things, and if it still doesn't work, feel free to reach
out to an instructor, but at that point it might be best to give up on
simpleaudio
, since it's not strictly necessary.
As usual, you must document each function you write.
As a general rule, all effect functions must return a new list, and you must not modify the original channel or samples that are provided as arguments. Each function must also use at least one loop.
Due to the representation of floating point numbers in computers, certain mathematical calculations will lead to very tiny errors. These tiny errors are okay. DO NOT ROUND YOUR ANSWERS. If your answers differ only ever so slightly from our output below, then your answer is almost certainly correct. The testing code should ignore these errors if they are indeed just issues with floating-point computation and not larger errors.
makeSofter
The function makeSofter
is designed
to make your audio file softer. makeSofter
takes a channel (i.e., a
list of numbers) and must return a new list where each value has been
multiplied by a factor of 0.1. In
general, the numbers representing audio (i.e., samples) should lie in the
range of -1 to 1. Audio files are perceived as louder the closer those
samples get to -1 and 1. Therefore, multiplying all values by 0.1 will
narrow the range of the samples, making the audio softer. This example
of makeSofter
shows how it works.
Notice from the example how some of the answers have very small errors when attempting to multiply by 0.1. Again that is okay; we will be testing the accuracy of your results out to only 3 decimal places.
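If it helps to see the shape of such a function, here is one possible sketch (your solution doesn't have to look exactly like this, and remember to add your own documentation):

def makeSofter(channel):
    """Return a new channel with every sample multiplied by 0.1."""
    softer = []
    for sample in channel:
        softer.append(sample * 0.1)  # scale each sample toward 0
    return softer

Note how it builds and returns a new list rather than changing the channel it was given.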
chipmunk
The function chipmunk
is designed to
make everything sound like it was made by Alvin and the Chipmunks!
Technically, all the pitches and frequencies of the sound become higher.
When applied to the human voice, it gives a chipmunk-like effect.
chipmunk
takes a channel and returns a new channel of only those
samples at even indices. The other
samples are discarded. This example shows how chipmunk
works.
Note that you must use a loop, and you are NOT allowed to just use a slice, even though that would normally be one possible solution.
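One possible way to structure that loop (a sketch, not the only acceptable solution) is to step through the indices two at a time:

def chipmunk(channel):
    """Return a new channel containing only the samples at even indices."""
    result = []
    for i in range(0, len(channel), 2):  # visit indices 0, 2, 4, ...
        result.append(channel[i])
    return result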
removeVocals
The function removeVocals
is
designed to remove the vocal track from a song. The approach is crude
and simple: it only works for certain songs when the vocal track is
evenly split between the left channel and the right channel. Therefore,
we can subtract one channel from the other and the remaining result
should be the rest of the song. Accordingly, removeVocals
requires two
channels as arguments: the left and the right channel. removeVocals
will return a single channel.
To illustrate the approach, let's take two channels as an example: a left
channel of [0.1, 0.5, -0.2, 0.3]
and a right channel of [-0.1, -0.6,
0.3, 0.1]
. To remove the vocals, subtract all the values in the right
channel from the corresponding values of the left
channel. That should yield a result
of [0.2, 1.1, -0.5, 0.2]
. You'll notice that with the right
combination of numbers it is possible for the subtraction to produce a
result outside of our preferred range of -1 to 1. The final step to the
approach is to divide all the results by two. Therefore, the final
result should be [0.1, 0.55, -0.25, 0.1]
.
You may assume that the left channel and right channels are the same length.
This example demonstrates how removeVocals
works.
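To make the arithmetic concrete, here is one possible sketch of the subtract-and-halve approach (assuming the left channel is passed first; this isn't the required structure for your solution):

def removeVocals(left, right):
    """Return a new channel where each sample is (left - right) / 2."""
    result = []
    for i in range(len(left)):  # the two channels are assumed to be the same length
        result.append((left[i] - right[i]) / 2)
    return result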
As a final note about this effect, it works decently well on
the sound file "distance.wav"
for "The Distance" by Cake, but poorly on the others. Even with the Cake example, you can
hear that some of the other instrumentation is degraded, meaning that
those instruments were relatively evenly split between the left and right
channels. It is incredibly difficult to devise a strategy for this that
works universally.
reverse
The function reverse
takes a channel
and reverses the order of the samples.
Like all effect functions, reverse
must return a new channel rather
than modifying the channel it is given. This is an old audio technique
that can be used to create swells. It has also been used in some
contexts to create hidden messages that can be revealed only when the
listener plays the audio backwards. Check out this Wikipedia article on
Backmasking.
For this function, you may not use the reversed
function or the method
.reverse()
, and you must
use a loop even though it's possible
to accomplish this with just a slice. In any other context, you should
use one of those tools for a task like this, since it is usually
inadvisable to try and rewrite something that has already been written
for you. However, for pedagogical purposes, implementing reverse
is
great practice for working with loops and indices.
These examples show how reverse
should work.
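As a sketch of the loop-and-indices approach (again, avoiding reversed, .reverse(), and slices), you might do something like:

def reverse(channel):
    """Return a new channel with the samples in the opposite order."""
    result = []
    for i in range(len(channel) - 1, -1, -1):  # walk from the last index down to 0
        result.append(channel[i])
    return result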
twoSampleDelay
The function twoSampleDelay
creates an underwater effect by removing the higher pitches in the audio
file. In technical terms, this is called a Lowpass Filter. A filter is
a processing technique that removes parts of an audio signal. Filtering
is a common technique in lots of data processing especially for images
and video.
twoSampleDelay
works by adding together the original channel and the
original channel shifted by two indices to the
right. For example, suppose we had
a channel of [0.1, 0.4, -0.1, 0.3]
. A shifted version of that channel
would be [?, ?, 0.1, 0.4]
. We can see that the original 0.1 has now
moved two indices to the right, as has the sample 0.4. Notice how -0.1
and 0.3 "fall off" and are no longer retained when we shift. We want to
ensure that the shifted version is the same length as the original.
There is also a question about how to fill the two spots on the left of
the shifted version, represented above with question marks. Those will
be filled by arguments passed to twoSampleDelay
. The parameter
twoSampleBack
will be used for index 0 and oneSampleBack
will be used
for index 1. Suppose then that twoSampleBack
had a value of -0.9 and
oneSampleBack
had a value of -0.7. Then if channel [0.1, 0.4, -0.1,
0.3]
is shifted, we would get [-0.9, -0.7, 0.1, 0.4]
.
The final step to the strategy is to add the original and shifted version together and divide by two. When we add, we add at each index as shown below to produce a new list.
  |  0.1 |  0.4 | -0.1 |  0.3 |
+ | -0.9 | -0.7 |  0.1 |  0.4 |
  ------------------------------
  | -0.8 | -0.3 |  0.0 |  0.7 |
Then we divide each sample by two. In this example, that would give us a
final result of [-0.4, -0.15, 0, 0.35]
.
twoSampleDelay should work with channels of length zero, one, or two as well!
Some hints:
- Figuring out how twoSampleBack and oneSampleBack fill the first two spots of the shifted version can be a little complicated. Make sure you understand how the shifted version is constructed before you start writing your code.
- You don't need to build a separate shifted list; you can instead keep track of oneSampleBack and twoSampleBack as you progress through the loop to accomplish the same task. Reducing the number of loops in your program can pay big dividends in terms of efficiency, especially when working with large files like audio files.

These examples demonstrate how twoSampleDelay should work.
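Here is one possible sketch of the whole process. We haven't pinned down the exact signature here, so treat the parameter order in this sketch as an assumption and check test_audioToolkit.py for the order we actually expect:

def twoSampleDelay(channel, twoSampleBack, oneSampleBack):
    """Average each sample with the sample two positions earlier.

    twoSampleBack and oneSampleBack stand in for the two samples just
    before the start of this channel. (Parameter order here is a guess.)
    """
    result = []
    for i in range(len(channel)):
        if i == 0:
            shifted = twoSampleBack   # fills index 0 of the shifted version
        elif i == 1:
            shifted = oneSampleBack   # fills index 1 of the shifted version
        else:
            shifted = channel[i - 2]  # the sample from two positions back
        result.append((channel[i] + shifted) / 2)
    return result

With the channel [0.1, 0.4, -0.1, 0.3], twoSampleBack of -0.9, and oneSampleBack of -0.7, this sketch produces [-0.4, -0.15, 0.0, 0.35], matching the walkthrough above, and it also handles channels of length zero, one, or two.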
As usual, we've provided a file named test_audioToolkit.py
which will
test your functions when you run it. It will only test the functions that
you've actually defined, so you should run it each time you finish a
function to make sure you have it correct before moving on to the next
one. If you attempt the challenges below, it will also test those.
If you'd like to challenge yourself, here are a few more functions you could write. These are not part of the rubric and will not be graded; they are purely for your own benefit if you have extra time.
ohYeah (ungraded)
Note: ohYeah is not part of the rubric and does not count towards your grade. It is an optional challenge problem.
The function ohYeah
will be used to drop the pitch, similar to how
chipmunk
raised the pitch. There is no particular name associated with
this effect but a famous example comes from the song "Oh Yeah" by Yello
popularized in Ferris Bueller's Day Off. You can listen
here.
ohYeah
works by retaining all the samples of the original channel and
adding new samples in between. The values of the new samples are always
halfway between the original samples. ohYeah
accepts two parameters:
a channel and a sample called prevSample
.
Suppose we call ohYeah
with the channel [0.1, -0.3, 0.3]
and a
previous sample of 0.6 as in ohYeah([0.1, -0.3, 0.3], 0.6)
.
We'll keep all the samples from the original channel and we will add
values that are halfway between. Halfway between 0.1 and -0.3 is -0.1
and halfway between -0.3 and 0.3 is 0. Therefore the final output will
contain [0.1, -0.1, -0.3, 0.0, 0.3]
.
The final step is to prepend a sample that is halfway between the
previous sample and the 0th sample of the original list. In this
example, 0.35 is halfway between the previous sample of 0.6 and the 0th
index from the original channel of 0.1. Notice that ohYeah
will always
retain the original samples but double the length by adding samples in
between the originals and attaching a new one to the front.
The image below gives a visual depiction of how each sample in the output is generated by the inputs. The blue dotted lines show how the output sample is the halfway point between two of the original samples. The red arrow shows where the sample from the original channel is preserved.
This process of generating new sample points between others is called interpolation and is used often for many audio effects.
If you are curious why we need this previous sample, it is because the
starter code processes an audio file in chunks. In order to connect
each chunk smoothly and without any glitches, some of the effects like
ohYeah
need information from the previous chunk. Here we need the last
sample of the previous chunk in order to properly calculate the halfway
point to the start of the next chunk.
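One possible sketch (just an illustration, not a required structure) keeps a running "previous sample" as it loops:

def ohYeah(channel, prevSample):
    """Return a channel twice as long, with a midpoint inserted before every original sample."""
    result = []
    previous = prevSample
    for sample in channel:
        result.append((previous + sample) / 2)  # halfway between neighbouring samples
        result.append(sample)                   # keep the original sample
        previous = sample
    return result

Called as ohYeah([0.1, -0.3, 0.3], 0.6), this would produce [0.35, 0.1, -0.1, -0.3, 0.0, 0.3], matching the example above.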
crescendo (ungraded)
Note: crescendo is not part of the rubric and does not count towards your grade. It is an optional challenge problem.
The function crescendo
creates a fade-in on the audio track for the
duration of the song length. The simplest way to produce a fade-in is to
generate a list of samples that incrementally grow from some start point
up to some end point, very similar to what the range
function does,
except that they're not integers.
For example, suppose we want to generate 4 samples that grow from 0.2 up
to 0.5. Such a list of samples would be [0.2, 0.3, 0.4, 0.5]
. Note
how we include both 0.2 and 0.4 at the ends, and that with 4 numbers in
the list, there are 3 spaces between them, so the intermediate numbers
are 1/3 and 2/3 of the way in between the endpoints. Also notice that
the rate of increase is always the same (i.e., 0.1) from one value to the
next.
To create a fade-in, then, we multiply our increasing samples by the
original samples at each index and return a new list. For example,
suppose our original samples are [0.2, -0.4, 0.6, 0.1]
and we want
fade-in from 0.2 to 0.5, then we would multiply each index as follows:
  |  0.2  | -0.4  |  0.6  |  0.1  |
* |  0.2  |  0.3  |  0.4  |  0.5  |
  ---------------------------------
  |  0.04 | -0.12 |  0.24 |  0.05 |
The final result is [0.04, -0.12, 0.24, 0.05]
. The length of the
increasing samples should always match the length of the original
samples.
The function crescendo
takes three arguments: channel
, startVolume
,
and endVolume
. It should return a channel of the same length.
startVolume
represents the starting point of the ramping samples and
endVolume
represents the ending point of the ramping samples.
startVolume
and endVolume
are values between 0 and 1 that represent
what fraction of the song's overall volume should be achieved with 1
being full volume.
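Here is one possible sketch of the ramp-and-multiply approach. The behaviour for a single-sample channel isn't spelled out above, so the edge case in this sketch (just using startVolume) is an assumption:

def crescendo(channel, startVolume, endVolume):
    """Return a new channel scaled by a ramp from startVolume to endVolume."""
    result = []
    n = len(channel)
    for i in range(n):
        if n > 1:
            # n samples have n - 1 gaps, so step i/(n - 1) of the way along the ramp
            ramp = startVolume + (endVolume - startVolume) * i / (n - 1)
        else:
            ramp = startVolume  # assumption: a one-sample channel just uses startVolume
        result.append(channel[i] * ramp)
    return result

Called as crescendo([0.2, -0.4, 0.6, 0.1], 0.2, 0.5), this would produce [0.04, -0.12, 0.24, 0.05], up to tiny floating-point errors.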
custom (ungraded)
Note: custom is not part of the rubric and does not count towards
is not part of the rubric and does not count towards
your grade. It is an optional challenge problem.
The function custom allows you to explore your own audio effect. It
can be interesting to see how manipulating
samples produces different audio effects. Remember though to keep your
range of numbers between -1 and 1. Otherwise you could damage your
speakers or your ears!
Do not add any parameters to custom
otherwise it will not work with the
starter code (you could also modify the starter code to compensate for
this, but doing so is complex).
To process an audio file using your custom effect, make sure to call
applyEffect
with the second argument as "custom"
. For example,
applyEffect("poker.wav", "custom")
Show us all the fun ways you can experiment with music!
applyEffect
and applyEffects
The apply_audioToolkit.py
starter file contains applyEffect
and
applyEffects
functions which can apply the effects you write to sound
clips we've provided. The starter code contains a folder called sounds
which contains audio excerpts from the following songs:
"poker.wav"
: "Poker Face" by Lady Gaga "hello.wav"
: "Hello" by Adele"prayer.wav"
: "I Say A Little Prayer" by Aretha Franklin"distance.wav"
: "The Distance" by CakeThe function applyEffect
is called to process one of the four audio
files from above with one of the audio effects you will write. You do
not need to write applyEffect
. It has already been written for you.
The function applyEffect has two parameters: the name of a sound file to process (e.g., "poker.wav") and the name of an effect to apply (e.g., "makeSofter"). applyEffect will process the sound file using the given effect, and store the processed sound file in the sounds folder with _proc after the base file name. For example,
applyEffect("poker.wav", "makeSofter")
will make the volume of the song "Poker Face" by Lady Gaga softer
and place the new file in the sounds
folder under the name
"poker\_proc.wav"
. You could rename this to
"poker\_softer.wav"
if you want to save this file and prevent
it from being overwritten by another call to applyEffect
.
The apply_audioToolkit.py
file provides function calls to applyEffect
to apply effects to the above songs. Uncomment any call to try your
effect! You could also copy-paste them and modify the arguments to apply
different effects to different audio excerpts.
In the code for applyEffect
, you can see how it uses an if/elif
structure to look at the string provided and decide which function to
call.
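You don't need to reproduce that code, but the general shape is something like the following simplified, hypothetical sketch (not the actual starter code), which assumes your effect functions live in audioToolkit.py:

from audioToolkit import makeSofter, chipmunk, reverse  # assumption about where the effects live

def dispatchEffect(channel, effectName):
    """Hypothetical helper illustrating the if/elif dispatch idea."""
    if effectName == "makeSofter":
        return makeSofter(channel)
    elif effectName == "chipmunk":
        return chipmunk(channel)
    elif effectName == "reverse":
        return reverse(channel)
    else:
        raise ValueError("Unknown effect: " + effectName)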
The rubric includes some extra goals which require that your code does
not waste fruit or waste
boxes: you shouldn't ignore the
results of a fruitful function, nor should you create any variables that
you don't use. This applies to loop variables defined by a for
loop as
well as regular variables and parameters, but sometimes you really don't
need to use a loop variable, even though Python forces you to declare one.
If you need to create a for
loop but you don't need to use the loop
variable, use _
(a single underscore) as the name of the variable to
indicate that you are intentionally not using it. The "don't waste boxes"
check will be skipped for any variable with that name.
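For example, this (hypothetical) helper repeats a sample a fixed number of times and never needs the loop counter itself:

def repeatSample(sample, count):
    """Return a list containing sample repeated count times."""
    result = []
    for _ in range(count):  # the loop variable is intentionally unused
        result.append(sample)
    return result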
makeSofter
Examples
These examples show how makeSofter
works. Notice how the number of samples in the output channel is the same as the
number of samples in the input channel, but each number is scaled. Also
note that we will ignore the small rounding errors that Python sometimes
produces.
In []: makeSofter([0, 0.1, 3, -1.3])
Out[]: [0.0, 0.010000000000000002, 0.30000000000000004, -0.13]
In []: makeSofter([-0.2, 0.4, 0.1, 0.2])
Out[]: [-0.020000000000000004, 0.04000000000000001, 0.010000000000000002, 0.020000000000000004]
In []: makeSofter([-0.2, 0.1])
Out[]: [-0.020000000000000004, 0.010000000000000002]
chipmunk
Examples
These examples show how chipmunk
works. Notice that the result has exactly half as many samples (rounded up) as
in the input channel.
In []:chipmunk([0, -0.7, 0.3