Instructions for payroll

(produced at 17:54 UTC on 2021-11-23)

This task is part of ps09 which is due at 23:59 EST on 2021-11-23.

You have the option to work with a partner on this task if you wish. Working with a partner requires more work to coordinate schedules, but if you work together and make sure that you are both understanding the code you write, you will make progress faster and learn more.

You can submit this task using this link.

Put all of your work for this task into the file payroll.py
(you will create this file from scratch)

Overview

Your company "Franchisia" has hired you to handle some formatting for payroll reports from its various branch locations in Wolleslu and the surrounding towns. You could simply do the work by hand in a spreadsheet app, but instead, you decide to write some Python functions to handle things automatically. Since the reports are submitted as CSV files, you decide to use the csv module to process the data.

This example shows how the data is formatted; we have supplied a number of directories full of such CSV files for testing purposes.

In a new file named payroll.py, write each of the functions described below.

columnAverage

columnAverage takes a filename as its first parameter and a string naming a column as its second. It reads the specified file and must average the numeric values in the specified column by adding them up and then dividing by the number of numeric values it found. If the specified column doesn't exist in the specified file, or if it exists but doesn't contain any numeric values (i.e., values which can be converted to a floating-point number using the float function), columnAverage must return None.

As an extra goal, you should use a csv.DictReader to read the data, although if you don't use a csv.DictReader, you must at least use a normal csv.reader.

These examples show how columnAverage should work.

mergeReports

mergeReports takes three filenames as arguments specifying two files to merge and a file to write the results into. It may assume that both files being merged contain the same columns, in the same order, and it just needs to write one file containing all of the rows from both files being merged, except that the redundant header row of the second file is skipped. The rows of the second file should appear in their original order after the rows of the first file.

These examples show how mergeReports should work.

updateReport

updateReport takes three filenames as arguments specifying a base file, an update file, and a result file to write into. It must write an updated report based on the base file by overwriting rows for employees listed in the update file, and adding rows from the update file that contain new employees.

All of the rows of the base file which aren't being updated will come first in the result file, followed by all of the rows in the update file. The values in the "Id" column in the two files determine which rows from the base file are being replaced: if a base row's "Id" value appears in the "Id" column of any row of the update file, that row should not be written to the result file.

You may assume that the columns in the two files are in the same order, but you may not assume that the order will always match the ordering provided in most of the starter files, so you will need to find the position of the "Id" column, or use csv.DictReaders to read the files. The output file should have the same columns in the same order as both input files.

As an extra goal, you should use a csv.DictReader to read the data, and you should use a csv.DictWriter to write the data, although if you don't use a csv.DictReader and/or csv.DictWriter, you must at least use a normal csv.reader and a normal csv.writer.

These examples show how updateReport should work.

salariesReport

salariesReport takes a directory path and a filename as arguments specifying a directory to search in and a report file to write to. It must find all CSV files in the specified directory and assemble a report with information about each salaried employee listed in those files.

Salaried employees are any employees whose value for the "Wage" column is the string 'salaried'. Values from the following columns (in this order) should be written into the combined report:

Name, Id, Pay, Hours, Overtime

Note that the "Wage" column is not written to the report, and that the order of columns differs from the default order in most of the provided files. You may not assume anything about the order of these columns in the CSV files you process, although you may assume that each column exists in each file. This function should process only files whose filename ends in '.csv', and it does not need to process files in subdirectories of the target directory.

As an extra goal, you should use a csv.DictReader to read the data, although if you don't use a csv.DictReader, you must at least use a normal csv.reader. Using a csv.DictReader to read the data will greatly simplify your work.

These examples show how salariesReport should work, including what the output file should look like.

Notes

  • As usual you must document each function you write, and you must not waste fruit or waste boxes.
  • You should check the procedure goals for each of the functions, as they include specific limitations you must obey, especially if you want to satisfy the extra goals, which are not mentioned in the short descriptions above.
  • Most of the procedure goals in this task can be satisfied by using the required function or construct in a helper function that the original function calls. You may want to take advantage of this to centralize some common code for reading from and/or writing to CSV files, although this is not required.

Ungraded Challenge

If you're up for a further challenge, there's one company office in each of the obvious_reports, subtle_reports, and reports directories which is doing something bad. Figure out what's going on, and which office is doing it in each directory.

Examples

Format of payroll files

These examples just show the format of a payload file. Note that for easier viewing, you can copy-paste the contents of a CSV file into a spreadsheet (like Google Sheets or Excel) and possibly choose the "split text to columns" option from the paste menu that pops up when you paste (at least in Google Sheets) if things are all in one column.

Note that the column order is the same in almost all of the CSV files, but you cannot assume that it will always be the same.

Although the precise meaning of each column is not important to this task, the columns are:

  • Name: Employee's first name.
  • Id: Employee's unique ID number.
  • Hours: Hours worked this month (four weeks). Capped at 4x40 = 160.
  • Overtime: Any hours worked beyond 160 this month.
  • Wage: Either a number for the employee's hourly wage, or the word "salaried" for salaried employees.
  • Pay: The actual amount paid. Should be hours times wage, plus overtime times 1.5x wage for hourly employees.
  • Benefits: Money spent on things like health insurance for that employee, which isn't direct compensation. Benefits vary by the individual, but those working fewer than 40 hours per week receive fewer benefits as they are not full-time employees, and those working fewer than 19 hours per week receive even fewer benefits, as they are not even half-time employees.
In []:
%more reports/Wolleslu.csv
File
reports/Wolleslu.csv
Name,Id,Hours,Overtime,Wage,Pay,Benefits Walter,379325,140,0,14.7,2057.66,617.3 Jolynn,379327,132,0,11.58,1528.22,458.46 Anastasia,379330,135,0,13.7,1849.1,554.73 John,379333,132,0,11.6,1530.63,459.19 Emberlynn,379334,156,0,15.06,2349.61,704.88 Mahdi,379337,97,0,11.33,1099.13,329.74 Lazarus,379339,141,0,14.32,2018.98,605.69 Yonatan,379341,101,0,11.56,1167.64,350.29 Colt,379344,148,0,14.27,2112.16,633.65 Levi,379345,140,0,11.71,1639.92,491.98 Ryder,379346,142,0,14.58,2070.51,621.15 Charlie,379347,148,0,14.96,2214.11,664.23 Genesis,379315,134,0,salaried,5278.66,2903.26
In []:
%more reports/Hamingfram.csv
File
reports/Hamingfram.csv
Name,Id,Hours,Overtime,Wage,Pay,Benefits Eleanor,379351,133,0,12.17,1618.41,485.52 Abigail,379352,146,0,11.41,1666.24,499.87 Mila,379361,137,0,14.09,1930.0,579.0 Ethan,379362,76,0,13.99,1063.35,53.17 Lily,379363,66,0,11.65,768.71,38.44 Sadie,379364,125,0,12.31,1538.81,461.64 Gabriel,379350,160,57,salaried,5691.47,3130.31
In []:
%more updates/Newt2.csv
File
updates/Newt2.csv
Name,Id,Hours,Overtime,Wage,Pay,Benefits Skye,379243,113,0,11.39,1287.49,383.62 Arlo,379251,137,0,13.91,1919.59,575.92 Jordan Jr.,379267,25,0,25.00,525.00,0.00 Jordan,379240,128,0,salaried,5066.93,2786.81
In []:
%more special/Mixedup.csv
File
special/Mixedup.csv
Benefits,Name,Pay,Id,Hours,Wage,Overtime 404.47,Luna,1348.22,379328,116,11.62,0 351.79,Phoenix,1172.62,379330,88,13.33,0 409.23,Khiara,1364.11,379331,125,10.91,0 322.16,Helene,1073.85,379333,82,13.1,0 355.97,Alexander,1186.57,379334,93,12.76,0 3126.18,Hadley,5683.96,379327,160,salaried,76

columnAverage examples

columnAverage reads a CSV file and averages all numeric values in the named column. Note that it should work even for files whose columns aren't in the same order as the payroll files we've given you to work with, so you'll need to detect which column to read from based on the first row of the file (or use a csv.DictReader). It will return None if either the file doesn't have a column with the specified name, or if there are no numeric values in that column.

The files referenced in this example and others are provided as part of the starter code.

In []:
columnAverage('reports/Wolleslu.csv', 'Hours')
Out[]:
134.30769230769232
In []:
columnAverage('reports/Wolleslu.csv', 'Pay')
Out[]:
2070.4869230769227
In []:
columnAverage('reports/Notuck.csv', 'Hours')
Out[]:
120.5
In []:
columnAverage('reports/Notuck.csv', 'Pay')
Out[]:
1955.8312500000002

mergeReports examples

mergeReports merges two reports into one file, writing just a single copy of the header row at the top. It is allowed to assume that the files being merged have the same columns in the same order.

In []:
mergeReports('reports/Newt.csv', 'reports/Notuck.csv', 'results/Newtuck.csv')
File
results/Newtuck.csv
Name,Id,Hours,Overtime,Wage,Pay,Benefits Skye,379243,109,0,11.39,1241.93,372.58 Jay,379245,76,0,13.85,1052.89,52.64 Naliyah,379247,186,0,10.34,1923.65,769.46 Ibrahim,379248,145,0,14.3,2073.74,622.12 Londyn,379249,207,0,12.53,2592.75,1037.1 Arlo,379251,145,0,13.91,2016.96,605.09 Jordan,379240,103,0,salaried,5066.93,2786.81 Jazmin,379255,53,0,15.42,817.24,40.86 Isabella,379256,188,0,13.17,2475.62,990.25 Karolyne,379257,60,0,14.25,855.14,42.76 Lucas,379260,146,0,11.2,1634.58,490.37 Palmer,379261,152,0,11.64,1769.49,530.85 Annabelle,379262,65,0,13.91,904.0,45.2 Allison,379263,140,0,14.53,2034.77,610.43 Adrian,379252,160,15,salaried,5155.81,2835.7
In []:
mergeReports( 'reports/Hamingfram.csv', 'reports/Hamneed.csv', 'results/Hamham.csv' )
File
results/Hamham.csv
Name,Id,Hours,Overtime,Wage,Pay,Benefits Eleanor,379351,133,0,12.17,1618.41,485.52 Abigail,379352,146,0,11.41,1666.24,499.87 Mila,379361,137,0,14.09,1930.0,579.0 Ethan,379362,76,0,13.99,1063.35,53.17 Lily,379363,66,0,11.65,768.71,38.44 Sadie,379364,125,0,12.31,1538.81,461.64 Gabriel,379350,160,57,salaried,5691.47,3130.31 Rane,379267,83,0,12.7,1054.21,316.26 Samuel,379268,169,0,13.29,2246.17,898.47 Charlotte,379271,101,0,11.04,1114.76,334.43 Kai,379272,182,0,14.12,2569.0,1027.6 Ahtziri,379276,140,0,10.49,1468.52,440.56 Allister,379266,160,74,salaried,5391.25,2965.19

updateReport examples

updateReport takes an original report and a second report containing some updates and/or additions. It merges these two reports into an updated report, where employee rows from the original report which have an entry with the same ID among the updates are omitted, and those updated rows, along with any rows for employees with new IDs (i.e., all of the rows of updates), are added to the end of the original report.

This function is allowed to assume that the files being combined have the same columns in the same order.

In []:
updateReport('reports/Newt.csv', 'updates/Newt2.csv', 'results/NewtUpdated.csv')
File
results/NewtUpdated.csv
Jay,379245,76,0,13.85,1052.89,52.64 Naliyah,379247,186,0,10.34,1923.65,769.46 Ibrahim,379248,145,0,14.3,2073.74,622.12 Londyn,379249,207,0,12.53,2592.75,1037.1 Skye,379243,113,0,11.39,1287.49,383.62 Arlo,379251,137,0,13.91,1919.59,575.92 Jordan Jr.,379267,25,0,25.00,525.00,0.00 Jordan,379240,128,0,salaried,5066.93,2786.81
In []:
updateReport( 'reports/Thamwal.csv', 'updates/Thamwal2.csv', 'results/ThamwalUpdated.csv' )
File
results/ThamwalUpdated.csv
Penelope,379278,138,0,13.14,1812.86,543.86 Dasia,379279,71,0,12.14,861.76,43.09 Leo,379280,61,0,11.12,678.02,33.9 Nicholas,379281,139,0,11.5,1598.71,479.61 Remi,379292,54,0,10.48,566.17,28.31 Lynnea,379295,147,0,12.67,1862.27,558.68 Xavier,379297,150,0,13.82,2072.94,621.88 Tucker,379300,148,0,12.3,1819.77,545.93 Zamir,379304,59,0,11.93,703.63,35.18 Isaiah,379305,139,0,12.75,1772.46,531.74 Mason,379277,160,38,salaried,5676.21,3121.92 Luis,379291,0,0,14.53,0,0 Luisa,379291,57,0,14.53,828.04,41.4

salariesReport examples

salariesReport takes a directory path and inspects all of the '.csv' files in that directory, looking for rows where the value of the "Wage" column is 'salaried', indicating a salaried employee who does not have an hourly wage. It gathers all of these rows from every file it inspects, and writes a combined report, which has the following columns:

Name, Id, Pay, Hours, Overtime

Note that neither "Wage" nor "Benefits" are included. This function may assume that each CSV file in the target directory will include at least those columns, but it may not assume that they'll be in the same order from file to file. It must write them in the order shown above when it creates its output file.

In []:
salariesReport('reports', 'results/reports_salaries.csv')
File
results/reports_salaries.csv
Name,Id,Pay,Hours,Overtime Allister,379266,5391.25,160,74 Adrian,379252,5155.81,160,15 Gabriel,379350,5691.47,160,57 Genesis,379315,5278.66,134,0 Ayden,379366,5668.94,109,0 Mason,379277,5676.21,160,38 Jordan,379240,5066.93,103,0
In []:
salariesReport('obvious_reports', 'results/obvious_salaries.csv')
File
results/obvious_salaries.csv
Name,Id,Pay,Hours,Overtime Crew,379293,5865.94,160,3 Keith,379262,5885.87,121,0 Gideon,379353,5580.95,160,64 Lucy,379326,5753.87,133,0 Venus,379376,5537.56,160,99 Connor,379308,5499.36,160,54 Luis,379239,5389.61,160,50

Rubric

 
unknown Style Requirements
How your code is written.
 
unknown Core goals
Complete all core goals for core credit. Get partial credit for completing at least half, and more partial credit for completing at least 90%.
 
unknown All functions are documented
Each function you define must include a non-empty documentation string as the very first thing in the function.
 
unknown Procedure Requirements
What code you use to solve the problem.
 
unknown Core goals
Complete all core goals for core credit. Get partial credit for completing at least half, and more partial credit for completing at least 90%.
 
unknown Define columnAverage with 2 parameters
Use def to define columnAverage with 2 parameters
 
unknown Call open
Within the definition of columnAverage with 2 parameters, call open in at least once place.
 
unknown Use a CSV reader
Within the definition of columnAverage with 2 parameters, use csv.reader or csv.DictReader.
 
unknown Use a loop
Within the definition of columnAverage with 2 parameters, use any kind of loop in at least once place.
 
unknown Use a return statement
Within the definition of columnAverage with 2 parameters, use return _ in at least once place.
 
unknown Define mergeReports with 3 parameters
Use def to define mergeReports with 3 parameters
 
unknown Call open
Within the definition of mergeReports with 3 parameters, call open in at least once place.
 
unknown Define updateReport with 3 parameters
Use def to define updateReport with 3 parameters
 
unknown Call open
Within the definition of updateReport with 3 parameters, call open in at least once place.
 
unknown Use a loop
Within the definition of updateReport with 3 parameters, use any kind of loop in at least once place.
 
unknown Use a CSV reader
Within the definition of updateReport with 3 parameters, use csv.reader or csv.DictReader.
 
unknown Use a CSV writer
Within the definition of updateReport with 3 parameters, use csv.writer or csv.DictWriter.
 
unknown Define salariesReport with 2 parameters
Use def to define salariesReport with 2 parameters
 
unknown Call os.listdir
Within the definition of salariesReport with 2 parameters, call os.listdir or listdir in at least once place.
 
unknown Use a loop
Within the definition of salariesReport with 2 parameters, use any kind of loop in at least once place.
 
unknown Call open
Within the loop within the definition of salariesReport with 2 parameters, call open in at least once place.
 
unknown Use a CSV reader
Within the loop within the definition of salariesReport with 2 parameters, use csv.reader or csv.DictReader.
 
unknown Use a CSV writer
Within the definition of salariesReport with 2 parameters, use csv.writer or csv.DictWriter.
 
unknown Extra goals
Complete all extra goals in addition to the core goals for a perfect score.
 
unknown Do not ignore the results of any fruitful function calls
According to the "Don't waste fruit" principle, every place you call a fruitful function (built-in or custom) you must store the result in a variable, or that function call must be part of a larger expression that uses its return value.
 
unknown Do not create any variables that you never make use of
According to the "Don't waste boxes" principle, every time you create a variable (using = or by defining a parameter for a function) you must also later use that variable as part of another expression. If you need to create a variable that you won't use, it must have the name _, but you should only do this if absolutely necessary.
 
unknown Define columnAverage with 2 parameters
Use def to define columnAverage with 2 parameters
 
unknown Use a CSV DictReader
Within the definition of columnAverage with 2 parameters, use csv.DictReader.
 
unknown Use a loop
Within the definition of columnAverage with 2 parameters, use any kind of loop in at most 2 places.
 
unknown Use a with statement
Within the definition of columnAverage with 2 parameters, use with _ as _: ___ in at least once place.
 
unknown Define mergeReports with 3 parameters
Use def to define mergeReports with 3 parameters
 
unknown Call open
Within the definition of mergeReports with 3 parameters, call open in at least one and at most 3 places.
 
unknown Use a with statement
Within the definition of mergeReports with 3 parameters, use with _ as _: ___ in at least one and at most 3 places.
 
unknown Define updateReport with 3 parameters
Use def to define updateReport with 3 parameters
 
unknown Call open
Within the definition of updateReport with 3 parameters, call open in at least one and at most 3 places.
 
unknown Use a with statement
Within the definition of updateReport with 3 parameters, use with _ as _: ___ in at least one and at most 3 places.
 
unknown Use a CSV DictReader
Within the definition of updateReport with 3 parameters, use csv.DictReader.
 
unknown Use a CSV DictWriter
Within the definition of updateReport with 3 parameters, use csv.DictWriter.
 
unknown Define salariesReport with 2 parameters
Use def to define salariesReport with 2 parameters
 
unknown Call os.path.join
Within the definition of salariesReport with 2 parameters, call os.path.join, path.join, or join in at least once place.
 
unknown Use a loop
Within the definition of salariesReport with 2 parameters, use any kind of loop in at least once place.
 
unknown Use a with statement
Within the loop within the definition of salariesReport with 2 parameters, use with _ as _: ___ in at least one and at most 3 places.
 
unknown Use a CSV DictReader
Within the loop within the definition of salariesReport with 2 parameters, use csv.DictReader.
 
unknown Call open
Within the definition of salariesReport with 2 parameters, call open in at least one and at most 2 places.
 
unknown Product Requirements
Your code's result values.
 
unknown Core goals
Complete all core goals for core credit. Get partial credit for completing at least half, and more partial credit for completing at least 90%.
 
unknown columnAverage must return the correct result
The result returned when your columnAverage function is run must match the solution result.
 
unknown Extra goals
Complete all extra goals in addition to the core goals for a perfect score.
 
unknown columnAverage must return the correct result
The result returned when your columnAverage function is run must match the solution result.
 
unknown Behavior Requirements
What your code does from the user's perspective.
 
unknown Core goals
Complete all core goals for core credit. Get partial credit for completing at least half, and more partial credit for completing at least 90%.
 
unknown mergeReports must write the correct data into the appropriate file
The data written to the appropriate file when your mergeReports function is run must match what the solution writes.
 
unknown updateReport must write the correct data into the appropriate file
The data written to the appropriate file when your updateReport function is run must match what the solution writes.
 
unknown salariesReport must write the correct data into the appropriate file
The data written to the appropriate file when your salariesReport function is run must match what the solution writes.
 
unknown Extra goals
Complete all extra goals in addition to the core goals for a perfect score.
 
unknown mergeReports must write the correct data into the appropriate file
The data written to the appropriate file when your mergeReports function is run must match what the solution writes.
 
unknown updateReport must write the correct data into the appropriate file
The data written to the appropriate file when your updateReport function is run must match what the solution writes.
 
unknown salariesReport must write the correct data into the appropriate file
The data written to the appropriate file when your salariesReport function is run must match what the solution writes.