This task is part of project11 which is due at 23:00 EDT on 2025-04-29.
You have the option to work with a partner on this task if you wish. Working with a partner requires more work to coordinate schedules, but if you work together and make sure that you are both understanding the code you write, you will make progress faster and learn more.
You can download the starter code for this task using this link.
You can submit this task using this link.
Put all of your work for this task into the file
payroll.py
(you will create this file from scratch)
Your company "Franchisia" has hired you to handle some formatting for
payroll reports from its various branch locations in Wolleslu and the
surrounding towns. You could simply do the work by hand in a spreadsheet
app, but instead, you decide to write some Python functions to handle
things automatically. Since the reports are submitted as CSV files, you
decide to use the csv module to process the data.
This example shows how the data is formatted; we have supplied a number of directories full of such CSV files for testing purposes.
In a new file named payroll.py, write each of the functions described
below. Start by using import csv to gain access to the csv module.
NOTE: Throughout your code, when using open, you should add
newline='' as an argument if you are going to use csv.DictReader or
csv.DictWriter to work with the file contents (which should be every
time you use open). If you forget to do this, in some cases your output
files may appear double-spaced.
columnAverageYour boss is always asking you to tell them the averages for specific branch locations, like the average total pay or overtime hours worked. Since it's hard to predict what they'll ask for, you decide to make a function that can handle any request for an average.
columnAverage takes a filename as its first parameter and a string
naming a column as its second. It
reads the specified file and must average the numeric values in the
specified column by adding them up
and then dividing by the number of numeric values it found. If the
specified column doesn't exist in the specified file, or if it exists but
doesn't contain any numeric values (i.e., values which can be converted
to a floating-point number using the float function), columnAverage
must return None.
You must use a csv.DictReader to read the
data.
These examples show how columnAverage should
work.
Hint: use try/except to handle the possibility that a value might not
be a number. Keep your try block as small as possible to avoid skipping
over too much code when an exception does occur.
mergeReportsAnother common task is merging reports from two branch locations into one report file.
mergeReports takes three filenames as
arguments specifying two files to
merge and a file to write the results into. It may assume that both files
being merged contain the same columns, in the same order, and it just
needs to write one file containing all of the rows from both files being
merged, except that the redundant
header row of the second file is skipped. The rows of the second file
should appear in their original order after the rows of the first file.
These examples show how mergeReports should
work. You may assume that the first file includes
at least one row of data.
updateReportSometimes, a branch will send over one payroll file, and then later send an update because their first report was wrong. These updated reports are a pain to deal with because they only have some of the workers: any worker not in the updated report is assumed to have correct data in the original report. You decide to define a function to make updating reports easy.
updateReport takes three filenames as
arguments specifying a original file,
an update file, and a result file to write into. It must write an
updated report based on the original
file by filtering out rows for
employees listed in the update file, and then adding all rows from
the update file (in the same order they appear in that file). All of the
rows of the original file which aren't being updated will come first in
the result file, followed by all of the rows in the update file.
What does it mean for an employee to be "listed in the update file"? The value of the "Id" column represents a worker's unique ID, so only if an ID from the original file is found among the IDs of the update file do we discard the original row. In other words, if the "Id" value from a row in the original file appears in the "Id" column of any row of the update file, that original row should not be written to the result file (because the corresponding row of the update file will include that employee and will be written to the result).
The columns may be in different orders in the different files, but by
using csv.DictReader you shouldn't have to worry about that. The output
file should have the same columns in the same order as the first input
file.
You must use a csv.DictReader to read the
data, and you must use a
csv.DictWriter to write the
data.
These examples show how updateReport should
work. You may assume that the original file
includes at least one row of data.
Note: Think strategically about how to filter out updated rows. Which file(s) do you need to read and in what order? What data structure would make it easiest to find the updated rows and skip them when writing the result?
salariesReportAt most locations, there's a manger who gets a salary instead of an hourly wage. One thing your boss usually asks for is a report on just these salaried employees, from all locations. You decide to write a function that will filter a report to just have the "Name", "Id", "Pay", and "Hours" columns, and to only include rows where the "Wage" value is 'salaried'.
salariesReport takes a report filename and an output filename as
arguments specifying a report
file to read and a result file to write to. It must filter the rows of
the input report to just those where the "Wage" value is 'salaried', and
include only the "Name", "Id", "Pay", and "Hours" columns in the
result.
Salaried employees are any employees whose value for the "Wage" column is the string 'salaried'. Values from the following columns (in this order) should be written into the combined report:
Name, Id, Pay, Hours
Note that the "Wage" column is not written to the report, and that the order of columns differs from the default order in most of the provided files. You may not assume the order of these columns in the CSV file you process, although you may assume that each column exists in the input file.
You must use a csv.DictReader to read the
data, which will also
mean you don't have to worry about column ordering.
These examples show how salariesReport should
work, including what the output file should look
like. You may assume that the specified
file is a valid report with a correct header containing all of the
columns you need. If there are multiple salaried employees in one report
file, they should appear in the same order as they appear in that file.
If you're up for a further challenge, there's one company office in each
of the obvious_reports, subtle_reports, and reports directories
which is doing something bad. Figure out what's going on, and which
office is doing it in each directory.
Format of payroll files
These examples just show the format of a payload file. Note that for easier viewing, you can copy-paste the contents of a CSV file into a spreadsheet (like Google Sheets or Excel) and possibly choose the "split text to columns" option from the paste menu that pops up when you paste (at least in Google Sheets) if things are all in one column.
Note that the column order is the same in almost all of the CSV files, but you cannot assume that it will always be the same.
Although the precise meaning of each column is not important to this task, the columns are:
In []:File%more reports/Wolleslu.csvreports/Wolleslu.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Walter,379325,140,0,14.7,2057.66,617.3 Jolynn,379327,132,0,11.58,1528.22,458.46 Anastasia,379330,135,0,13.7,1849.1,554.73 John,379333,132,0,11.6,1530.63,459.19 Emberlynn,379334,156,0,15.06,2349.61,704.88 Mahdi,379337,97,0,11.33,1099.13,329.74 Lazarus,379339,141,0,14.32,2018.98,605.69 Yonatan,379341,101,0,11.56,1167.64,350.29 Colt,379344,148,0,14.27,2112.16,633.65 Levi,379345,140,0,11.71,1639.92,491.98 Ryder,379346,142,0,14.58,2070.51,621.15 Charlie,379347,148,0,14.96,2214.11,664.23 Genesis,379315,134,0,salaried,5278.66,2903.26In []:File%more reports/Hamingfram.csvreports/Hamingfram.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Eleanor,379351,133,0,12.17,1618.41,485.52 Abigail,379352,146,0,11.41,1666.24,499.87 Mila,379361,137,0,14.09,1930.0,579.0 Ethan,379362,76,0,13.99,1063.35,53.17 Lily,379363,66,0,11.65,768.71,38.44 Sadie,379364,125,0,12.31,1538.81,461.64 Gabriel,379350,160,57,salaried,5691.47,3130.31In []:File%more updates/Newt2.csvupdates/Newt2.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Skye,379243,113,0,11.39,1287.49,383.62 Arlo,379251,137,0,13.91,1919.59,575.92 Jordan Jr.,379267,25,0,25.00,525.00,0.00 Jordan,379240,128,0,salaried,5066.93,2786.81In []:File%more special/Mixedup.csvspecial/Mixedup.csvBenefits,Name,Pay,Id,Hours,Wage,Overtime 404.47,Luna,1348.22,379328,116,11.62,0 351.79,Phoenix,1172.62,379330,88,13.33,0 409.23,Khiara,1364.11,379331,125,10.91,0 322.16,Helene,1073.85,379333,82,13.1,0 355.97,Alexander,1186.57,379334,93,12.76,0 3126.18,Hadley,5683.96,379327,160,salaried,76
columnAverage examples
columnAverage reads a CSV file and averages all numeric values in the
named column. Note that it should work even for files whose columns
aren't in the same order as the payroll files we've given you to work
with, so you'll need to detect which column to read from based on the
first row of the file (or use a csv.DictReader). It will return None
if either the file doesn't have a column with the specified name, or if
there are no numeric values in that column.
The files referenced in this example and others are provided as part of the starter code.
In []:Out[]:columnAverage('reports/Wolleslu.csv', 'Hours')In []:134.30769230769232Out[]:columnAverage('reports/Wolleslu.csv', 'Pay')In []:2070.4869230769227Out[]:columnAverage('reports/Notuck.csv', 'Hours')In []:120.5Out[]:columnAverage('reports/Notuck.csv', 'Pay')1955.8312500000002
mergeReports examples
mergeReports merges two reports into one file, writing just a single
copy of the header row at the top. It is allowed to assume that the
files being merged have the same columns in the same order.
In []:FilemergeReports('reports/Newt.csv', 'reports/Notuck.csv', 'results/Newtuck.csv')results/Newtuck.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Skye,379243,109,0,11.39,1241.93,372.58 Jay,379245,76,0,13.85,1052.89,52.64 Naliyah,379247,186,0,10.34,1923.65,769.46 Ibrahim,379248,145,0,14.3,2073.74,622.12 Londyn,379249,207,0,12.53,2592.75,1037.1 Arlo,379251,145,0,13.91,2016.96,605.09 Jordan,379240,103,0,salaried,5066.93,2786.81 Jazmin,379255,53,0,15.42,817.24,40.86 Isabella,379256,188,0,13.17,2475.62,990.25 Karolyne,379257,60,0,14.25,855.14,42.76 Lucas,379260,146,0,11.2,1634.58,490.37 Palmer,379261,152,0,11.64,1769.49,530.85 Annabelle,379262,65,0,13.91,904.0,45.2 Allison,379263,140,0,14.53,2034.77,610.43 Adrian,379252,160,15,salaried,5155.81,2835.7In []:FilemergeReports( 'reports/Hamingfram.csv', 'reports/Hamneed.csv', 'results/Hamham.csv' )results/Hamham.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Eleanor,379351,133,0,12.17,1618.41,485.52 Abigail,379352,146,0,11.41,1666.24,499.87 Mila,379361,137,0,14.09,1930.0,579.0 Ethan,379362,76,0,13.99,1063.35,53.17 Lily,379363,66,0,11.65,768.71,38.44 Sadie,379364,125,0,12.31,1538.81,461.64 Gabriel,379350,160,57,salaried,5691.47,3130.31 Rane,379267,83,0,12.7,1054.21,316.26 Samuel,379268,169,0,13.29,2246.17,898.47 Charlotte,379271,101,0,11.04,1114.76,334.43 Kai,379272,182,0,14.12,2569.0,1027.6 Ahtziri,379276,140,0,10.49,1468.52,440.56 Allister,379266,160,74,salaried,5391.25,2965.19
updateReport examples
updateReport takes an original report and a second report containing
some updates and/or additions. It merges these two reports into an
updated report, where employee rows from the original report which have
an entry with the same ID among the updates are omitted, and those
updated rows, along with any rows for employees with new IDs (i.e., all
of the rows of updates), are added to the end of the original report.
This function is allowed to assume that the files being combined have the same columns in the same order.
In []:FileupdateReport('reports/Newt.csv', 'updates/Newt2.csv', 'results/NewtUpdated.csv')results/NewtUpdated.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Jay,379245,76,0,13.85,1052.89,52.64 Naliyah,379247,186,0,10.34,1923.65,769.46 Ibrahim,379248,145,0,14.3,2073.74,622.12 Londyn,379249,207,0,12.53,2592.75,1037.1 Skye,379243,113,0,11.39,1287.49,383.62 Arlo,379251,137,0,13.91,1919.59,575.92 Jordan Jr.,379267,25,0,25.00,525.00,0.00 Jordan,379240,128,0,salaried,5066.93,2786.81In []:FileupdateReport( 'reports/Thamwal.csv', 'updates/Thamwal2.csv', 'results/ThamwalUpdated.csv' )results/ThamwalUpdated.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Penelope,379278,138,0,13.14,1812.86,543.86 Dasia,379279,71,0,12.14,861.76,43.09 Leo,379280,61,0,11.12,678.02,33.9 Nicholas,379281,139,0,11.5,1598.71,479.61 Remi,379292,54,0,10.48,566.17,28.31 Lynnea,379295,147,0,12.67,1862.27,558.68 Xavier,379297,150,0,13.82,2072.94,621.88 Tucker,379300,148,0,12.3,1819.77,545.93 Zamir,379304,59,0,11.93,703.63,35.18 Isaiah,379305,139,0,12.75,1772.46,531.74 Mason,379277,160,38,salaried,5676.21,3121.92 Luis,379291,0,0,14.53,0,0 Luisa,379291,57,0,14.53,828.04,41.4
salariesReport examples
salariesReport takes an input report filename and an output filename
and filters the input report, looking for rows where the value of the
"Wage" column is 'salaried', indicating a salaried employee who does
not have an hourly wage. It gathers all of these rows from every file it
inspects, and writes a combined report, which has the following columns:
Name, Id, Pay, Hours
Note that "Wage", "Benefits", and "Overtime" are not included. This function may assume that the input file will include at least these columns, but it may not assume that they'll be in the same order from file to file. It must write them in the order shown above when it creates its output file.
In []:FilesalariesReport('reports/Wolleslu.csv', 'results/salaries_Wolleslu.csv')results/salaries_Wolleslu.csvName,Id,Pay,Hours Genesis,379315,5278.66,134In []:FilesalariesReport('reports/Whyland.csv', 'results/salaries_Whyland.csv')results/salaries_Whyland.csvName,Id,Pay,Hours Ayden,379366,5668.94,109 Fenix,379482,6000.00,85
columnAverage must return the correct result
columnAverage function is run must match the solution result.mergeReports must write the correct data into the appropriate file
mergeReports function is run must match what the solution writes.updateReport must write the correct data into the appropriate file
updateReport function is run must match what the solution writes.salariesReport must write the correct data into the appropriate file
salariesReport function is run must match what the solution writes.columnAverage with 2 parameters
def to define columnAverage with 2 parametersopen
columnAverage with 2 parameters, call open in at least one place.DictReader
columnAverage with 2 parameters, use csv.DictReader.columnAverage with 2 parameters, use any kind of loop in at least one place.return statement
columnAverage with 2 parameters, use return _ in at least one place.mergeReports with 3 parameters
def to define mergeReports with 3 parametersopen
mergeReports with 3 parameters, call open in at least one place.updateReport with 3 parameters
def to define updateReport with 3 parametersopen
updateReport with 3 parameters, call open in at least one place.updateReport with 3 parameters, use any kind of loop in at least one place.DictReader
updateReport with 3 parameters, use csv.DictReader.DictWriter
updateReport with 3 parameters, use csv.DictWriter.salariesReport with 2 parameters
def to define salariesReport with 2 parametersopen
salariesReport with 2 parameters, call open in at least one place.DictReader
salariesReport with 2 parameters, use csv.DictReader.DictWriter
salariesReport with 2 parameters, use csv.DictWriter.= or by defining a parameter for a function) you must also later use that variable as part of another expression. If you need to create a variable that you won't use, it must have the name _, but you should only do this if absolutely necessary.columnAverage must return the correct result
columnAverage function is run must match the solution result.mergeReports must write the correct data into the appropriate file
mergeReports function is run must match what the solution writes.updateReport must write the correct data into the appropriate file
updateReport function is run must match what the solution writes.salariesReport must write the correct data into the appropriate file
salariesReport function is run must match what the solution writes.columnAverage with 2 parameters
def to define columnAverage with 2 parameterscolumnAverage with 2 parameters, use any kind of loop in at most 2 places.columnAverage with 2 parameters, use a with statement in at least one place.mergeReports with 3 parameters
def to define mergeReports with 3 parametersopen
mergeReports with 3 parameters, call open in at least one and at most 3 places.mergeReports with 3 parameters, use a with statement in at least one and at most 3 places.updateReport with 3 parameters
def to define updateReport with 3 parametersopen
updateReport with 3 parameters, call open in at least one and at most 3 places.updateReport with 3 parameters, use a with statement in at least one and at most 3 places.salariesReport with 2 parameters
def to define salariesReport with 2 parametersopen
salariesReport with 2 parameters, call open in at least one and at most 2 places.salariesReport with 2 parameters, use a with statement in at least one and at most 2 places.