This task is part of project11 which is due at 23:00 EST on 2024-12-10.
You have the option to work with a partner on this task if you wish. Working with a partner requires more work to coordinate schedules, but if you work together and make sure that you are both understanding the code you write, you will make progress faster and learn more.
You can download the starter code for this task using this link.
You can submit this task using this link.
Put all of your work for this task into the file
payroll.py
(you will create this file from scratch)
Your company "Franchisia" has hired you to handle some formatting for
payroll reports from its various branch locations in Wolleslu and the
surrounding towns. You could simply do the work by hand in a spreadsheet
app, but instead, you decide to write some Python functions to handle
things automatically. Since the reports are submitted as CSV files, you
decide to use the csv
module to process the data.
This example shows how the data is formatted; we have supplied a number of directories full of such CSV files for testing purposes.
In a new file named payroll.py
, write each of the functions described
below. Start by using import csv
to gain access to the csv
module.
NOTE: Throughout your code, when using open
, you should add
newline=''
as an argument if you are going to use csv.DictReader
or
csv.DictWriter
to work with the file contents (which should be every
time you use open
). If you forget to do this, in some cases your output
files may appear double-spaced.
columnAverage
Your boss is always asking you to tell them the averages for specific branch locations, like the average total pay or overtime hours worked. Since it's hard to predict what they'll ask for, you decide to make a function that can handle any request for an average.
columnAverage
takes a filename as its first parameter and a string
naming a column as its second. It
reads the specified file and must average the numeric values in the
specified column by adding them up
and then dividing by the number of numeric values it found. If the
specified column doesn't exist in the specified file, or if it exists but
doesn't contain any numeric values (i.e., values which can be converted
to a floating-point number using the float
function), columnAverage
must return None
.
You must use a csv.DictReader
to read the
data.
These examples show how columnAverage
should
work.
Hint: use try/except
to handle the possibility that a value might not
be a number. Keep your try block as small as possible to avoid skipping
over too much code when an exception does occur.
mergeReports
Another common task is merging reports from two branch locations into one report file.
mergeReports
takes three filenames as
arguments specifying two files to
merge and a file to write the results into. It may assume that both files
being merged contain the same columns, in the same order, and it just
needs to write one file containing all of the rows from both files being
merged, except that the redundant
header row of the second file is skipped. The rows of the second file
should appear in their original order after the rows of the first file.
These examples show how mergeReports
should
work. You may assume that the first file includes
at least one row of data.
updateReport
Sometimes, a branch will send over one payroll file, and then later send an update because their first report was wrong. These updated reports are a pain to deal with because they only have some of the workers: any worker not in the updated report is assumed to have correct data in the original report. You decide to define a function to make updating reports easy.
updateReport
takes three filenames as
arguments specifying a original file,
an update file, and a result file to write into. It must write an
updated report based on the original
file by filtering out rows for
employees listed in the update file, and then adding all rows from
the update file (in the same order they appear in that file). All of the
rows of the original file which aren't being updated will come first in
the result file, followed by all of the rows in the update file.
What does it mean for an employee to be "listed in the update file"? The value of the "Id" column represents a worker's unique ID, so only if an ID from the original file is found among the IDs of the update file do we discard the original row. In other words, if the "Id" value from a row in the original file appears in the "Id" column of any row of the update file, that original row should not be written to the result file (because the corresponding row of the update file will include that employee and will be written to the result).
The columns may be in different orders in the different files, but by
using csv.DictReader
you shouldn't have to worry about that. The output
file should have the same columns in the same order as the first input
file.
You must use a csv.DictReader
to read the
data, and you must use a
csv.DictWriter
to write the
data.
These examples show how updateReport
should
work. You may assume that the original file
includes at least one row of data.
Note: Think strategically about how to filter out updated rows. Which file(s) do you need to read and in what order? What data structure would make it easiest to find the updated rows and skip them when writing the result?
salariesReport
At most locations, there's a manger who gets a salary instead of an hourly wage. One thing your boss usually asks for is a report on just these salaried employees, from all locations. You decide to write a function that will filter a report to just have the "Name", "Id", "Pay", and "Hours" columns, and to only include rows where the "Wage" value is 'salaried'.
salariesReport
takes a report filename and an output filename as
arguments specifying a report
file to read and a result file to write to. It must filter the rows of
the input report to just those where the "Wage" value is 'salaried', and
include only the "Name", "Id", "Pay", and "Hours" columns in the
result.
Salaried employees are any employees whose value for the "Wage" column is the string 'salaried'. Values from the following columns (in this order) should be written into the combined report:
Name, Id, Pay, Hours
Note that the "Wage" column is not written to the report, and that the order of columns differs from the default order in most of the provided files. You may not assume the order of these columns in the CSV file you process, although you may assume that each column exists in the input file.
You must use a csv.DictReader
to read the
data, which will also
mean you don't have to worry about column ordering.
These examples show how salariesReport
should
work, including what the output file should look
like. You may assume that the specified
file is a valid report with a correct header containing all of the
columns you need. If there are multiple salaried employees in one report
file, they should appear in the same order as they appear in that file.
If you're up for a further challenge, there's one company office in each
of the obvious_reports
, subtle_reports
, and reports
directories
which is doing something bad. Figure out what's going on, and which
office is doing it in each directory.
Format of payroll files
These examples just show the format of a payload file. Note that for easier viewing, you can copy-paste the contents of a CSV file into a spreadsheet (like Google Sheets or Excel) and possibly choose the "split text to columns" option from the paste menu that pops up when you paste (at least in Google Sheets) if things are all in one column.
Note that the column order is the same in almost all of the CSV files, but you cannot assume that it will always be the same.
Although the precise meaning of each column is not important to this task, the columns are:
In []:File%more reports/Wolleslu.csv
reports/Wolleslu.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Walter,379325,140,0,14.7,2057.66,617.3 Jolynn,379327,132,0,11.58,1528.22,458.46 Anastasia,379330,135,0,13.7,1849.1,554.73 John,379333,132,0,11.6,1530.63,459.19 Emberlynn,379334,156,0,15.06,2349.61,704.88 Mahdi,379337,97,0,11.33,1099.13,329.74 Lazarus,379339,141,0,14.32,2018.98,605.69 Yonatan,379341,101,0,11.56,1167.64,350.29 Colt,379344,148,0,14.27,2112.16,633.65 Levi,379345,140,0,11.71,1639.92,491.98 Ryder,379346,142,0,14.58,2070.51,621.15 Charlie,379347,148,0,14.96,2214.11,664.23 Genesis,379315,134,0,salaried,5278.66,2903.26In []:File%more reports/Hamingfram.csv
reports/Hamingfram.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Eleanor,379351,133,0,12.17,1618.41,485.52 Abigail,379352,146,0,11.41,1666.24,499.87 Mila,379361,137,0,14.09,1930.0,579.0 Ethan,379362,76,0,13.99,1063.35,53.17 Lily,379363,66,0,11.65,768.71,38.44 Sadie,379364,125,0,12.31,1538.81,461.64 Gabriel,379350,160,57,salaried,5691.47,3130.31In []:File%more updates/Newt2.csv
updates/Newt2.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Skye,379243,113,0,11.39,1287.49,383.62 Arlo,379251,137,0,13.91,1919.59,575.92 Jordan Jr.,379267,25,0,25.00,525.00,0.00 Jordan,379240,128,0,salaried,5066.93,2786.81In []:File%more special/Mixedup.csv
special/Mixedup.csvBenefits,Name,Pay,Id,Hours,Wage,Overtime 404.47,Luna,1348.22,379328,116,11.62,0 351.79,Phoenix,1172.62,379330,88,13.33,0 409.23,Khiara,1364.11,379331,125,10.91,0 322.16,Helene,1073.85,379333,82,13.1,0 355.97,Alexander,1186.57,379334,93,12.76,0 3126.18,Hadley,5683.96,379327,160,salaried,76
columnAverage
examples
columnAverage
reads a CSV file and averages all numeric values in the
named column. Note that it should work even for files whose columns
aren't in the same order as the payroll files we've given you to work
with, so you'll need to detect which column to read from based on the
first row of the file (or use a csv.DictReader
). It will return None
if either the file doesn't have a column with the specified name, or if
there are no numeric values in that column.
The files referenced in this example and others are provided as part of the starter code.
In []:Out[]:columnAverage('reports/Wolleslu.csv', 'Hours')
In []:134.30769230769232
Out[]:columnAverage('reports/Wolleslu.csv', 'Pay')
In []:2070.4869230769227
Out[]:columnAverage('reports/Notuck.csv', 'Hours')
In []:120.5
Out[]:columnAverage('reports/Notuck.csv', 'Pay')
1955.8312500000002
mergeReports
examples
mergeReports
merges two reports into one file, writing just a single
copy of the header row at the top. It is allowed to assume that the
files being merged have the same columns in the same order.
In []:FilemergeReports('reports/Newt.csv', 'reports/Notuck.csv', 'results/Newtuck.csv')
results/Newtuck.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Skye,379243,109,0,11.39,1241.93,372.58 Jay,379245,76,0,13.85,1052.89,52.64 Naliyah,379247,186,0,10.34,1923.65,769.46 Ibrahim,379248,145,0,14.3,2073.74,622.12 Londyn,379249,207,0,12.53,2592.75,1037.1 Arlo,379251,145,0,13.91,2016.96,605.09 Jordan,379240,103,0,salaried,5066.93,2786.81 Jazmin,379255,53,0,15.42,817.24,40.86 Isabella,379256,188,0,13.17,2475.62,990.25 Karolyne,379257,60,0,14.25,855.14,42.76 Lucas,379260,146,0,11.2,1634.58,490.37 Palmer,379261,152,0,11.64,1769.49,530.85 Annabelle,379262,65,0,13.91,904.0,45.2 Allison,379263,140,0,14.53,2034.77,610.43 Adrian,379252,160,15,salaried,5155.81,2835.7In []:FilemergeReports( 'reports/Hamingfram.csv', 'reports/Hamneed.csv', 'results/Hamham.csv' )
results/Hamham.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Eleanor,379351,133,0,12.17,1618.41,485.52 Abigail,379352,146,0,11.41,1666.24,499.87 Mila,379361,137,0,14.09,1930.0,579.0 Ethan,379362,76,0,13.99,1063.35,53.17 Lily,379363,66,0,11.65,768.71,38.44 Sadie,379364,125,0,12.31,1538.81,461.64 Gabriel,379350,160,57,salaried,5691.47,3130.31 Rane,379267,83,0,12.7,1054.21,316.26 Samuel,379268,169,0,13.29,2246.17,898.47 Charlotte,379271,101,0,11.04,1114.76,334.43 Kai,379272,182,0,14.12,2569.0,1027.6 Ahtziri,379276,140,0,10.49,1468.52,440.56 Allister,379266,160,74,salaried,5391.25,2965.19
updateReport
examples
updateReport
takes an original report and a second report containing
some updates and/or additions. It merges these two reports into an
updated report, where employee rows from the original report which have
an entry with the same ID among the updates are omitted, and those
updated rows, along with any rows for employees with new IDs (i.e., all
of the rows of updates), are added to the end of the original report.
This function is allowed to assume that the files being combined have the same columns in the same order.
In []:FileupdateReport('reports/Newt.csv', 'updates/Newt2.csv', 'results/NewtUpdated.csv')
results/NewtUpdated.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Jay,379245,76,0,13.85,1052.89,52.64 Naliyah,379247,186,0,10.34,1923.65,769.46 Ibrahim,379248,145,0,14.3,2073.74,622.12 Londyn,379249,207,0,12.53,2592.75,1037.1 Skye,379243,113,0,11.39,1287.49,383.62 Arlo,379251,137,0,13.91,1919.59,575.92 Jordan Jr.,379267,25,0,25.00,525.00,0.00 Jordan,379240,128,0,salaried,5066.93,2786.81In []:FileupdateReport( 'reports/Thamwal.csv', 'updates/Thamwal2.csv', 'results/ThamwalUpdated.csv' )
results/ThamwalUpdated.csvName,Id,Hours,Overtime,Wage,Pay,Benefits Penelope,379278,138,0,13.14,1812.86,543.86 Dasia,379279,71,0,12.14,861.76,43.09 Leo,379280,61,0,11.12,678.02,33.9 Nicholas,379281,139,0,11.5,1598.71,479.61 Remi,379292,54,0,10.48,566.17,28.31 Lynnea,379295,147,0,12.67,1862.27,558.68 Xavier,379297,150,0,13.82,2072.94,621.88 Tucker,379300,148,0,12.3,1819.77,545.93 Zamir,379304,59,0,11.93,703.63,35.18 Isaiah,379305,139,0,12.75,1772.46,531.74 Mason,379277,160,38,salaried,5676.21,3121.92 Luis,379291,0,0,14.53,0,0 Luisa,379291,57,0,14.53,828.04,41.4
salariesReport
examples
salariesReport
takes an input report filename and an output filename
and filters the input report, looking for rows where the value of the
"Wage" column is 'salaried'
, indicating a salaried employee who does
not have an hourly wage. It gathers all of these rows from every file it
inspects, and writes a combined report, which has the following columns:
Name, Id, Pay, Hours
Note that "Wage", "Benefits", and "Overtime" are not included. This function may assume that the input file will include at least these columns, but it may not assume that they'll be in the same order from file to file. It must write them in the order shown above when it creates its output file.
In []:FilesalariesReport('reports/Wolleslu.csv', 'results/salaries_Wolleslu.csv')
results/salaries_Wolleslu.csvName,Id,Pay,Hours Genesis,379315,5278.66,134In []:FilesalariesReport('reports/Whyland.csv', 'results/salaries_Whyland.csv')
results/salaries_Whyland.csvName,Id,Pay,Hours Ayden,379366,5668.94,109 Fenix,379482,6000.00,85
=
or by defining a parameter for a function) you must also later use that variable as part of another expression. If you need to create a variable that you won't use, it must have the name _
, but you should only do this if absolutely necessary.columnAverage
must return the correct result
columnAverage
function is run must match the solution result.columnAverage
must return the correct result
columnAverage
function is run must match the solution result.mergeReports
must write the correct data into the appropriate file
mergeReports
function is run must match what the solution writes.mergeReports
must write the correct data into the appropriate file
mergeReports
function is run must match what the solution writes.updateReport
must write the correct data into the appropriate file
updateReport
function is run must match what the solution writes.updateReport
must write the correct data into the appropriate file
updateReport
function is run must match what the solution writes.salariesReport
must write the correct data into the appropriate file
salariesReport
function is run must match what the solution writes.salariesReport
must write the correct data into the appropriate file
salariesReport
function is run must match what the solution writes.columnAverage
with 2 parameters
def
to define columnAverage
with 2 parameterscolumnAverage
with 2 parameters, use any kind of loop in at most 2 places.columnAverage
with 2 parameters, use a with statement in at least one place.columnAverage
with 2 parameters
def
to define columnAverage
with 2 parametersopen
columnAverage
with 2 parameters, call open
in at least one place.DictReader
columnAverage
with 2 parameters, use csv.DictReader
.columnAverage
with 2 parameters, use any kind of loop in at least one place.return
statement
columnAverage
with 2 parameters, use return _
in at least one place.mergeReports
with 3 parameters
def
to define mergeReports
with 3 parametersopen
mergeReports
with 3 parameters, call open
in at least one and at most 3 places.mergeReports
with 3 parameters, use a with statement in at least one and at most 3 places.mergeReports
with 3 parameters
def
to define mergeReports
with 3 parametersopen
mergeReports
with 3 parameters, call open
in at least one place.updateReport
with 3 parameters
def
to define updateReport
with 3 parametersopen
updateReport
with 3 parameters, call open
in at least one and at most 3 places.updateReport
with 3 parameters, use a with statement in at least one and at most 3 places.updateReport
with 3 parameters
def
to define updateReport
with 3 parametersopen
updateReport
with 3 parameters, call open
in at least one place.updateReport
with 3 parameters, use any kind of loop in at least one place.DictReader
updateReport
with 3 parameters, use csv.DictReader
.DictWriter
updateReport
with 3 parameters, use csv.DictWriter
.salariesReport
with 2 parameters
def
to define salariesReport
with 2 parametersopen
salariesReport
with 2 parameters, call open
in at least one and at most 2 places.salariesReport
with 2 parameters, use a with statement in at least one and at most 2 places.salariesReport
with 2 parameters
def
to define salariesReport
with 2 parametersopen
salariesReport
with 2 parameters, call open
in at least one place.DictReader
salariesReport
with 2 parameters, use csv.DictReader
.DictWriter
salariesReport
with 2 parameters, use csv.DictWriter
.