Instructions for audioToolkit

(produced at 17:00 UTC on 2024-10-20)

This task is part of project06 which is due at 23:00 EDT on 2024-10-22.

You have the option to work with a partner on this task if you wish. Working with a partner requires more work to coordinate schedules, but if you work together and make sure that you are both understanding the code you write, you will make progress faster and learn more.

You can download the starter code for this task using this link.

You can submit this task using this link.

Put all of your work for this task into the file audioToolkit.py
(you will create this file from scratch)

Note that this task is required, in addition to one of the two other tasks this week.

Note: Do not modify any of the starter code in the python files audio_config.py or waveTools.py. You will need to modify apply_audioToolkit.py if you want to hear the effects in action, and you could modify test_audioToolkit.py if you wanted to add extra tests.

General Overview

This task involves writing several functions that will process audio files with digital effects. Our program will work only on .wav audio files. Most of the information in a .wav audio file is simply a list of numbers that represent the audio signals of songs, speech, or noise. Yes, the audio you listen to on your computer is stored as a list of numbers!

Most audio files, including .wav files, have two lists of numbers, one for the left speaker and one for the right. These two lists are called channels and the numbers in each list are called samples. We will design most of our audio effects to work on a single channel and the starter code will handle processing both channels.

The effects you will write will just manipulate lists of numbers, but the coolest part is that you can (optionally) apply the effects to some real audio files by using the supplied apply_audioToolkit.py file. You will need to comment out or uncomment lines in the applyEffects function in that file to select which effect(s) to apply to which file(s).

(Optional but very cool) Install the simpleaudio package to let Python play sounds directly

In order for your computer to be able to play the sound files used in the applyEffects function, you will need to install an extra Python library called simpleaudio. Otherwise no sound will be played directly, but you will still be able to open the processed .wav files manually to hear the audio output.

In Thonny, you can install simpleaudio using the following steps:

  1. In the "Tools" menu, select the "Manage packages" option.
  2. In the window that pops up, type simpleaudio into the search field and press ENTER.
  3. Information about the simpleaudio package should be displayed; click on the Install button to install it, which should be a fairly quick process.

We have found that for some people, particularly on Windows computers, the install process fails. In that case, click on the details below to see some options for trying to deal with that.

Click here for info about `simpleaudio` install issues.

Note: before following these instructions, remember that you can always just skip this step. Python won't play sounds directly, but you will be able to open up the .wav files it creates manually and hear the sounds, both for this assignment and for other assignments involving audio.

In Thonny when the install fails, an error window will pop up (if you already closed it, you can simply try the install again). If the text there matches this example, then there may be a way around the problem:

SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
        warnings.warn(
      running build
      running build_py
      creating build
      creating build\lib.win-amd64-cpython-310
      creating build\lib.win-amd64-cpython-310\simpleaudio
      copying simpleaudio\__init__.py -> build\lib.win-amd64-cpython-310\simpleaudio
      copying simpleaudio\shiny.py -> build\lib.win-amd64-cpython-310\simpleaudio
      copying simpleaudio\functionchecks.py -> build\lib.win-amd64-cpython-310\simpleaudio
      creating build\lib.win-amd64-cpython-310\simpleaudio\test_audio
      copying simpleaudio\test_audio\c.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
      copying simpleaudio\test_audio\e.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
      copying simpleaudio\test_audio\g.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
      copying simpleaudio\test_audio\left_right.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
      copying simpleaudio\test_audio\notes_2_16_44.wav -> build\lib.win-amd64-cpython-310\simpleaudio\test_audio
      running build_ext
      building 'simpleaudio._simpleaudio' extension
      error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> simpleaudio

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
Process returned with code 1

A screenshot of the error message shown above in the Thonny window where it appears. If you're reading this, there's a good chance you're not using Thonny, so just look for the text "Microsoft Visual C++ 14.0 or greater is required" in the error output from an install attempt for the simpleaudio package.

If your error message is substantially different from what's shown above, then these steps might not help (and if you're not on Windows, they definitely won't). The key point is the suggestion in the error message about installing Microsoft Visual C++.

It's a non-trivial process which will take up ~8GB of space, although having basic build tools installed on your computer may come in handy at other points in the future. The link to the build tools that's in the error message (which I've duplicated in the first part of this sentence so you can just click it) is the right place to start. On that page, click the "Download Build Tools" button, which will download an installer. Once that (small) download finishes, open it up to start the installation process.

Once the installer opens, it has the typical licence agreement, but then takes you to a confusing page with a huge number of options for what to install. I've included a screenshot of that page below with the correct option selected, but the only thing you need is the very first one in the upper-left corner: the one titled "Desktop development with C++."

A screenshot of the installer options page, showing that the "Desktop development with C++" option is the only one selected (it's in the first group of options titled "Desktop & Mobile (4)," on the default "Workloads" tab). The screenshot also highlights the bottom-right corner where the total space required (8.02GB) is listed as well as the "Install" button (shortcut key 'I') after a drop-down menu which is set to "Install while downloading" (that's the default). Actions to take on this screen are to first make sure that "Desktop development with C++" is selected (nothing is selected by default) and then click/activate the "Install" button.

Once you select that option, click the "Install" button, but make sure you have enough free space first or the process might not complete. It will go to a progress screen showing download and installation progress which will take a while. You can leave it going in the background and do other things (although don't shut off or hibernate your computer). Depending on your computer and your internet connection, I'd expect this step to take 10-30 minutes for most people.

Once it's installed, it will recommend that you restart, but I didn't have to do that: just go back to Thonny and try to install simpleaudio again. If that doesn't work, I'd try to restart the computer once to see if that helps things, and if it still doesn't work, feel free to reach out to an instructor, but at that point it might be best to give up on simpleaudio, since it's not strictly necessary.

Audio Effects to Write

As usual, you must document each function you write.

As a general rule, all effect functions must return a new list, and you must not modify the original channel or samples provided as an argument. Each function must also use at least one loop.

Due to the representation of floating point numbers in computers, certain mathematical calculations will lead to very tiny errors. These tiny errors are okay. DO NOT ROUND YOUR ANSWERS. If your answers differ only ever so slightly from our output below, then your answer is almost certainly correct. The testing code should ignore these errors if they are indeed just issues with floating-point computation and not larger errors.
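To see why exact equality is the wrong check for these results, here is a minimal sketch. The helper close_enough and its tolerance of 0.001 are assumptions made for illustration; they are not taken from the provided testing code.

```python
import math

# 0.1 cannot be stored exactly in binary floating point, so a product
# involving it can differ from the "obvious" answer by a tiny amount.
product = 3 * 0.1
print(product)         # prints 0.30000000000000004
print(product == 0.3)  # prints False


def close_enough(actual, expected, tolerance=1e-3):
    """Hypothetical helper: True when two samples agree within tolerance."""
    return math.isclose(actual, expected, abs_tol=tolerance)


print(close_enough(product, 0.3))  # prints True
```

Comparing with a tolerance like this is the usual way to check floating-point results without rounding them first.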

Audio Effect 1: makeSofter

The function makeSofter is designed to make your audio file softer. makeSofter takes a channel (i.e., a list of numbers) and must return a new list where each value has been multiplied by a factor of 0.1. In general, the numbers representing audio (i.e., samples) should lie in the range of -1 to 1. Audio files are perceived as louder the closer those samples get to -1 and 1. Therefore, multiplying all values by 0.1 will narrow the range of the samples, making the audio softer. This example of makeSofter shows how it works.

Notice from the example how some of the answers have very small errors when attempting to multiply by 0.1. Again that is okay; we will be testing the accuracy of your results out to only 3 decimal places.
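The scaling idea can be sketched with a general helper. The name scale_samples and its factor parameter are hypothetical; this shows the technique of building a new list in a loop, not the required function signature.

```python
def scale_samples(channel, factor):
    """Return a new list with every sample multiplied by factor."""
    result = []
    for sample in channel:
        result.append(sample * factor)
    return result


original = [0.5, -1.0, 0.25]
scaled = scale_samples(original, 0.1)
print(scaled)    # each value is roughly one tenth of the original
print(original)  # prints [0.5, -1.0, 0.25] (the input list is untouched)
```

Appending to a fresh list, rather than assigning into the list that was passed in, is what keeps the original channel unmodified.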

Audio Effect 2: chipmunk

The function chipmunk is designed to make everything sound like it was made by Alvin and the Chipmunks! Technically, all the pitches and frequencies of the sound become higher. When applied to the human voice, it gives a chipmunk-like effect. chipmunk takes a channel and returns a new channel of only those samples at even indices. The other samples are discarded. This example shows how chipmunk works.

Note that you must use a loop, and you are NOT allowed to just use a slice, even though that would normally be one possible solution.
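One way to visit only the even indices with a loop is to give range a step of 2. The sketch below uses a hypothetical name, keep_even_indices, and is only one possible approach.

```python
def keep_even_indices(channel):
    """Return a new list holding only the samples at indices 0, 2, 4, ..."""
    result = []
    for index in range(0, len(channel), 2):
        result.append(channel[index])
    return result


print(keep_even_indices([10, 11, 12, 13, 14]))  # prints [10, 12, 14]
```

With five input samples the result has three, which is half the input length rounded up.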

Audio Effect 3: removeVocals

The function removeVocals is designed to remove the vocal track from a song. The approach is crude and simple: it only works for certain songs when the vocal track is evenly split between the left channel and the right channel. Therefore, we can subtract one channel from the other and the remaining result should be the rest of the song. Accordingly, removeVocals requires two channels as arguments: the left and the right channel. removeVocals will return a single channel.

To illustrate the approach, let's take two channels as an example: a left channel of [0.1, 0.5, -0.2, 0.3] and a right channel of [-0.1, -0.6, 0.3, 0.1]. To remove the vocals, subtract each value in the right channel from the corresponding value in the left channel. That should yield a result of [0.2, 1.1, -0.5, 0.2]. You'll notice that with the right combination of numbers it is possible for the subtraction to produce a result outside of our preferred range of -1 to 1. The final step of the approach is therefore to divide all the results by two, which brings them back into range. The final result should be [0.1, 0.55, -0.25, 0.1].

You may assume that the left channel and right channels are the same length.
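The arithmetic described above can be sketched as follows. The name half_difference is hypothetical, and the sketch relies on the stated assumption that both channels have the same length.

```python
def half_difference(left, right):
    """Return a new list of (left - right) / 2 at each index.

    Assumes left and right have the same length.
    """
    result = []
    for index in range(len(left)):
        result.append((left[index] - right[index]) / 2)
    return result


left = [0.1, 0.5, -0.2, 0.3]
right = [-0.1, -0.6, 0.3, 0.1]
# Approximately [0.1, 0.55, -0.25, 0.1], up to tiny floating-point errors.
print(half_difference(left, right))
```

Looping over indices rather than over values is what lets the code reach into both lists at the same position.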

This example demonstrates how removeVocals works.

As a final note about this effect, it works decently well on the sound file "distance.wav" for "The Distance" by Cake, but poorly on the others. Even with the Cake example, you can hear that some of the other instrumentation is degraded, meaning that those instruments were relatively evenly split between the left and right channels. It is incredibly difficult to devise a strategy for this that works universally.

Audio Effect 4: reverse

The function reverse takes a channel and reverses the order of the samples. Like all effect functions, reverse must return a new channel rather than modifying the channel it is given. This is an old audio technique that can be used to create swells. It has also been used in some contexts to create hidden messages that can be revealed only when the listener plays the audio backwards. Check out this Wikipedia article on Backmasking.

For this function, you may not use the reversed function or the method .reverse(), and you must use a loop even though it's possible to accomplish this with just a slice. In any other context, you should use one of those tools for a task like this, since it is usually inadvisable to try and rewrite something that has already been written for you. However, for pedagogical purposes, implementing reverse is great practice for working with loops and indices.
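As a sketch of the loop-and-index practice mentioned here, one possible approach is to count indices downward from the last position to 0. The name backwards_copy is hypothetical.

```python
def backwards_copy(channel):
    """Return a new list with the samples in the opposite order."""
    result = []
    # range(start, stop, step): start at the last index, stop before -1.
    for index in range(len(channel) - 1, -1, -1):
        result.append(channel[index])
    return result


original = [1, 2, 3, 4]
print(backwards_copy(original))  # prints [4, 3, 2, 1]
print(original)                  # prints [1, 2, 3, 4]
```

For an empty list, range(-1, -1, -1) produces no indices, so the loop body never runs and an empty list comes back.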

These examples show how reverse should work.

Audio Effect 5: twoSampleDelay

The function twoSampleDelay creates an underwater effect by removing the higher pitches in the audio file. In technical terms, this is called a Lowpass Filter. A filter is a processing technique that removes parts of an audio signal. Filtering is a common technique in many kinds of data processing, especially for images and video.

twoSampleDelay works by adding together the original channel and the original channel shifted by two indices to the right. For example, suppose we had a channel of [0.1, 0.4, -0.1, 0.3]. A shifted version of that channel would be [?, ?, 0.1, 0.4]. We can see that the original 0.1 has now moved two indices to the right as well as the sample 0.4. Notice how -0.1 and 0.3 "fall off" and are no longer retained when we shift. We want to ensure that the shifted version is the same length as the original.

There is also a question about how to fill the two spots on the left of the shifted version, represented above with question marks. Those will be filled by arguments passed to twoSampleDelay. The parameter twoSampleBack will be used for index 0 and oneSampleBack will be used for index 1. Suppose then that twoSampleBack had a value of -0.9 and oneSampleBack had a value of -0.7. Then if channel [0.1, 0.4, -0.1, 0.3] is shifted, we would get [-0.9, -0.7, 0.1, 0.4].

The final step to the strategy is to add the original and shifted version together and divide by two. When we add, we add at each index as shown below to produce a new list.

   |  0.1 |  0.4 | -0.1 | 0.3 |
 + | -0.9 | -0.7 |  0.1 | 0.4 |
-----------------------------
   | -0.8 | -0.3 |  0.0 | 0.7 |

Then we divide each sample by two. In this example, that would give us a final result of [-0.4, -0.15, 0, 0.35].

twoSampleDelay should work with lists of size zero to two as well!
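The shift-and-average strategy can be sketched without building the shifted list at all: for each index, look up the sample two positions earlier, falling back on the two extra arguments for indices 0 and 1. The names shifted_average, two_back, and one_back are hypothetical, and this is one possible approach rather than the expected solution.

```python
def shifted_average(channel, two_back, one_back):
    """Average each sample with the sample two positions before it."""
    result = []
    for index in range(len(channel)):
        if index == 0:
            earlier = two_back            # fills index 0 of the shifted list
        elif index == 1:
            earlier = one_back            # fills index 1 of the shifted list
        else:
            earlier = channel[index - 2]  # the original, moved right by two
        result.append((channel[index] + earlier) / 2)
    return result


# Approximately [-0.4, -0.15, 0.0, 0.35], up to tiny floating-point errors.
print(shifted_average([0.1, 0.4, -0.1, 0.3], -0.9, -0.7))
print(shifted_average([], -0.9, -0.7))  # prints []
```

Because the loop only runs once per sample, channels of length zero, one, or two need no special handling.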

Some hints:

  • The strategy for twoSampleDelay can be a little complicated. Make sure you understand how the shifted version is constructed before you start writing your code.
  • The above strategy can be implemented exactly as described using several loops. As an extra goal, use exactly one loop. If you attempt this, think about updating the parameters oneSampleBack and twoSampleBack as you progress through the loop to accomplish the same task. Reducing the number of loops in your program can pay big dividends in terms of efficiency, especially when working with large files like audio files.

These examples demonstrate how twoSampleDelay should work.

Testing

As usual, we've provided a file named test_audioToolkit.py which will test your functions when you run it. It will only test the functions that you've actually defined, so you should run it each time you finish a function to make sure you have it correct before moving on to the next one. If you attempt the challenges below, it will also test those.

Ungraded Challenges

If you'd like to challenge yourself, here are a few more functions you could write. These are not part of the rubric and will not be graded; they are purely for your own benefit if you have extra time.

Audio Effect 6: ohYeah (ungraded)

Note: ohYeah is not part of the rubric and does not count towards your grade. It is an optional challenge problem.

The function ohYeah will be used to drop the pitch, similar to how chipmunk raised the pitch. There is no particular name associated with this effect but a famous example comes from the song "Oh Yeah" by Yello popularized in Ferris Bueller's Day Off. You can listen here.

ohYeah works by retaining all the samples of the original channel and adding new samples in between. The values of the new samples are always halfway between the original samples. ohYeah accepts two parameters: a channel and a sample called prevSample.

Suppose we call ohYeah with the channel [0.1, -0.3, 0.3] and a previous sample of 0.6 as in ohYeah([0.1, -0.3, 0.3], 0.6). We'll keep all the samples from the original channel and we will add values that are halfway between. Halfway between 0.1 and -0.3 is -0.1 and halfway between -0.3 and 0.3 is 0. Therefore the final output will contain [0.1, -0.1, -0.3, 0.0, 0.3].

The final step is to prepend a sample that is halfway between the previous sample and the 0th sample of the original list. In this example, 0.35 is halfway between the previous sample of 0.6 and the 0th index from the original channel of 0.1. Notice that ohYeah will always retain the original samples but double the length by adding samples in between the originals and attaching a new one to the front.

The image below gives a visual depiction of how each sample in the output is generated by the inputs. The blue dotted lines show how the output sample is the halfway point between two of the original samples. The red arrow shows where the sample from the original channel is preserved.

A diagram showing how the previous sample and the samples from the original channel are used to create the final output.

This process of generating new sample points between others is called interpolation and is used often for many audio effects.
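Midpoint interpolation of this kind can be sketched by remembering the previous sample as the loop advances. The name interpolate_midpoints is hypothetical, and this is only one way to organize the loop.

```python
def interpolate_midpoints(channel, prev_sample):
    """Return a new list with a halfway sample placed before each original."""
    result = []
    previous = prev_sample
    for sample in channel:
        result.append((previous + sample) / 2)  # new halfway sample
        result.append(sample)                   # original sample is kept
        previous = sample                       # remember it for the next step
    return result


# Approximately [0.35, 0.1, -0.1, -0.3, 0.0, 0.3], up to floating-point errors.
print(interpolate_midpoints([0.1, -0.3, 0.3], 0.6))
```

Each pass through the loop adds two samples, so the output is always twice as long as the input.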

If you are curious why we need this previous sample, the starter code for audioToolkit.py processes an audio file in chunks. In order to connect each chunk smoothly and without any glitches, some of the effects like ohYeah need information from the previous chunk. Here we need the last sample of the previous chunk in order to properly calculate the halfway point to the start of the next chunk.

Audio Effect 7: crescendo (ungraded)

Note: crescendo is not part of the rubric and does not count towards your grade. It is an optional challenge problem.

The function crescendo creates a fade-in on the audio track for the duration of the song length. The simplest way to produce a fade-in is to generate a list of samples that incrementally grow from some start point up to some end point, very similar to what the range function does, except that they're not integers.

For example, suppose we want to generate 4 samples that grow from 0.2 up to 0.5. Such a list of samples would be [0.2, 0.3, 0.4, 0.5]. Note how we include both 0.2 and 0.5 at the ends, and that with 4 numbers in the list, there are 3 spaces between them, so the intermediate numbers are 1/3 and 2/3 of the way in between the endpoints. Also notice that the rate of increase is always the same (i.e., 0.1) from one value to the next.

To create a fade-in, then, we multiply our increasing samples by the original samples at each index and return a new list. For example, suppose our original samples are [0.2, -0.4, 0.6, 0.1] and we want a fade-in from 0.2 to 0.5. Then we would multiply at each index as follows:

   | 0.2  | -0.4  | 0.6  | 0.1  |
 * | 0.2  |  0.3  | 0.4  | 0.5  |
---------------------------------
   | 0.04 | -0.12 | 0.24 | 0.05 |

The final result is [0.04, -0.12, 0.24, 0.05]. The length of the increasing samples should always match the length of the original samples.

The function crescendo takes three arguments: channel, startVolume, and endVolume. It should return a channel of the same length. startVolume represents the starting point of the ramping samples and endVolume represents the ending point of the ramping samples. startVolume and endVolume are values between 0 and 1 that represent what fraction of the song's overall volume should be achieved with 1 being full volume.
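The ramp-then-multiply idea can be sketched in two small steps. The names linear_ramp and fade are hypothetical, and the handling of a one-sample channel (using the start volume) is an assumption, since the text does not say what should happen in that case.

```python
def linear_ramp(count, start, end):
    """Return count evenly spaced values from start to end, inclusive."""
    ramp = []
    for index in range(count):
        if count == 1:
            fraction = 0.0  # assumption: a single value uses the start volume
        else:
            fraction = index / (count - 1)  # 0, 1/(count-1), ..., 1
        ramp.append(start + (end - start) * fraction)
    return ramp


def fade(channel, start_volume, end_volume):
    """Return a new list: each sample times the matching ramp value."""
    ramp = linear_ramp(len(channel), start_volume, end_volume)
    result = []
    for index in range(len(channel)):
        result.append(channel[index] * ramp[index])
    return result


# Approximately [0.04, -0.12, 0.24, 0.05], up to tiny floating-point errors.
print(fade([0.2, -0.4, 0.6, 0.1], 0.2, 0.5))
```

Dividing by count - 1 rather than count is what makes the last ramp value land exactly on the end volume, matching the "3 spaces between 4 numbers" observation.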

For Fun: Custom Audio Effect (ungraded)

Note: custom is not part of the rubric and does not count towards your grade. It is an optional challenge problem.

The function custom lets you explore an audio effect of your own. It can be interesting to see how manipulating samples produces different audio effects. Remember, though, to keep your samples in the range of -1 to 1. Otherwise you could damage your speakers or your ears!
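If an experiment might push samples outside the safe range, one option is to clamp them before returning. The helper clamp_samples below is hypothetical and is not part of the starter code.

```python
def clamp_samples(channel, low=-1.0, high=1.0):
    """Return a new list with every sample forced into [low, high]."""
    result = []
    for sample in channel:
        # min caps the value at high; max then lifts it to at least low.
        result.append(max(low, min(high, sample)))
    return result


print(clamp_samples([0.5, 1.7, -2.0, -0.25]))  # prints [0.5, 1.0, -1.0, -0.25]
```

Clamping changes the sound of any sample it touches, so it works best as a safety net rather than as part of the effect itself.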

Do not add any parameters to custom; otherwise it will not work with the starter code. (You could also modify the starter code to compensate for this, but doing so is complex.)

To process an audio file using your custom effect, make sure to call applyEffect with the second argument as "custom". For example,

applyEffect("poker.wav", "custom")

Show us all the fun ways you can experiment with music!

Using applyEffect and applyEffects

The apply_audioToolkit.py starter file contains applyEffect and applyEffects functions which can apply the effects you write to sound clips we've provided. The starter code contains a folder called sounds which contains audio excerpts from the following songs:

  1. "poker.wav": "Poker Face" by Lady Gaga
  2. "hello.wav": "Hello" by Adele
  3. "prayer.wav": "I Say A Little Prayer" by Aretha Franklin
  4. "distance.wav": "The Distance" by Cake

The function applyEffect is called to process one of the four audio files from above with one of the audio effects you will write. You do not need to write applyEffect. It has already been written for you. The function applyEffect has two parameters:

  1. The name of the audio file to process (e.g., "poker.wav")
  2. The name of the function that will process the audio as a string (e.g., "makeSofter")

applyEffect will process the sound file using the given effect, and store the processed sound file in the sounds folder with _proc after the base file name. For example,

applyEffect("poker.wav", "makeSofter")

will make the volume of the song "Poker Face" by Lady Gaga softer and place the new file in the sounds folder under the name "poker_proc.wav". You could rename this to "poker_softer.wav" if you want to save this file and prevent it from being overwritten by another call to applyEffect.

The apply_audioToolkit.py file provides function calls to applyEffect to apply effects to the above songs. Uncomment any call to try your effect! You could also copy-paste them and modify the arguments to apply different effects to different audio excerpts.

In the code for applyEffect, you can see how it uses an if/elif structure to look at the string provided and decide which function to call.
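The general pattern of choosing a function from a string can be sketched as below. This is not the starter code: run_named_effect and the two toy effects are hypothetical, and the real applyEffect also handles reading and writing the audio files.

```python
def double_all(channel):
    """Toy effect: return a new list with every sample doubled."""
    result = []
    for sample in channel:
        result.append(sample * 2)
    return result


def negate_all(channel):
    """Toy effect: return a new list with every sample's sign flipped."""
    result = []
    for sample in channel:
        result.append(-sample)
    return result


def run_named_effect(channel, effect_name):
    """Call the effect whose name matches the given string."""
    if effect_name == "double_all":
        return double_all(channel)
    elif effect_name == "negate_all":
        return negate_all(channel)
    else:
        raise ValueError("unknown effect: " + effect_name)


print(run_named_effect([1, 2, 3], "double_all"))  # prints [2, 4, 6]
print(run_named_effect([1, -2], "negate_all"))    # prints [-1, 2]
```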

Note on loops and wasting boxes

The rubric includes some extra goals which require that your code does not waste fruit or waste boxes: you shouldn't ignore the results of a fruitful function, nor should you create any variables that you don't use. This applies to loop variables defined by a for loop as well as regular variables and parameters, but sometimes you really don't need to use a loop variable, even though Python forces you to declare one. If you need to create a for loop but you don't need to use the loop variable, use _ (a single underscore) as the name of the variable to indicate that you are intentionally not using it. The "don't waste boxes" check will be skipped for any variable with that name.
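A minimal sketch of the underscore convention: the loop below only needs to run a fixed number of times, so the loop variable is never read. The function make_silence is a hypothetical example, not one of the required effects.

```python
def make_silence(count):
    """Return a new list of count samples, all 0.0."""
    result = []
    for _ in range(count):  # _ signals that the loop variable is unused
        result.append(0.0)
    return result


print(make_silence(3))  # prints [0.0, 0.0, 0.0]
```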

Examples

makeSofter Examples

These examples show how makeSofter works. Notice how the number of samples in the output channel is the same as the number of samples in the input channel, but each number is scaled. Also note that we will ignore the small rounding errors that Python sometimes produces.

In []:
makeSofter([0, 0.1, 3, -1.3])
Out[]:
[0.0, 0.010000000000000002, 0.30000000000000004, -0.13]
In []:
makeSofter([-0.2, 0.4, 0.1, 0.2])
Out[]:
[-0.020000000000000004, 0.04000000000000001, 0.010000000000000002, 0.020000000000000004]
In []:
makeSofter([-0.2, 0.1])
Out[]:
[-0.020000000000000004, 0.010000000000000002]

chipmunk Examples

These examples show how chipmunk works. Notice that the result has exactly half as many samples (rounded up) as in the input channel.

In []:
chipmunk([0, -0.7, 0.3