@extends('template')

@section('title')
    Lab 10, File I/O with JSON
@stop

@section('content')

# Lab 10, File I/O with JSON

Every time the `getNameDictionary` function from Part 1 is invoked, it opens the approriate SSA data file and processes the content into a dictionary:

```py
def getNameDictionary(year, sex):

    # Get the data
    filename = 'names/yob' + str(year) + '.txt'
    lines = linesFromFile(filename)

    nameDictionary = {} # A new, empty dictionary

    rank = 1 # Start with #1 ranked name

    for line in lines: # For every name in the file

        # Break the CSV line (e.g. Isabella, F, 17490) into a list where
        # 0 = name, 1 = sex, 2 = count of babies that had that name
        splitLine = line.strip().split(',')

        if splitLine[1] == sex:
            nameDictionary[splitLine[0]] = rank
            rank += 1

    return nameDictionary
```

In this part of today's lab, we want to expand on this by writing the results of `getNameDictionary` to a file, creating a &ldquo;cache&rdquo; of results for a given year/sex.

>> *&ldquo;In computing, a **cache** is a hardware or software component that **stores data so future requests for that data can be served faster**; the data stored in a cache might be the result of an earlier computation [...]. &rdquo;* [-ref](https://en.wikipedia.org/wiki/Cache_(computing))

Once we do this, we'll have a more efficient way of fetching the same data repeatedly&mdash; rather than invoking `getNameDictionary` repeatedly, we can simply read the data from a cache file (if it exists).

If a cache file doesn't exist, then we will have to invoke `getNameDictionary` to fetch/process the original data, *but* we can then create a cache file with the results so that the next time we need that data we have it.

This exercise will give us practice with the following concepts:

1. Reading/writing files
2. Working with JSON data in Python


## Task 1. *loadNameDictionary*
At the end of `lab10/names.py`, you'll define a new function called `loadNameDictionary` that takes the parameters `year` and `sex`.

This function should return a dictionary mapping baby names to their rank for the given year.

In short, this function needs to return the same thing that `getNameDictionary` does, e.g.

```py
results = loadNameDictionary(2013, 'F')
print results
```

```xml
{'Amyrikal': 16699,
 'Andreah': 11057,
 'Bailea': 14854,
 'Camyra': 14931,
 'Jojo': 9858,
 'Kaegan': 13820,
 'Laryiah': 13961,
 'Lyberti': 14017,
 'Poetry': 8913,
 'Rory': 898
 [...etc...]
}
```


**However**, in contrast to `getNameDictionary`, here's how `loadNameDictionary` function should work:

First, check if there's a cached version of the requested data in the form of a .json file. Assuming the following:
    + The .json cache files are stored in a subdirectory in your lab10 folder called `cache`.
    + The .json cache files are named with the year followed by the sex, e.g. `cache/2015f.json`

<br>
If the .json cache file __does not exist__, the function should...
    + Invoke `getNameDictionary` to build the name dictionary and
    + Write the resulting dictionary as JSON data to a cache file (e.g. `cache/2015f.json`). <small>(See hints below!)</small>

<br>
If the .json cache file __does exist__, the function should...
    + Load the contents of the .json cache file as a dictionary <small>(See hints below!)</small>
<br><br>


Finally, regardless of *how* the data is loaded (be it from the cache or invoking `getNameDictionary`), the resulting dictionary should be returned.

__Read the following hints before starting work on your function.__


### Hint 1 - Does the cache file exist?
Here's an example of how you can check to see if a file exists:
```py
os.path.exists('cache/2014f.json') # Returns boolean as to whether file exists
```

<small markdown=1>ref: [Lecture 15, slide 7: File System Operations: path.exists](http://cs111.wellesley.edu/content/lectures/lecture15/files/15_file-system-ops-solns.pdf)</small>


### Hint 2 - Reading and writing the cache file
As discussed in [Lecture 15, Slide 20](http://cs111.wellesley.edu/content/lectures/lecture16/files/16_files_4up.pdf),
JSON (JavaScript Object Notation) is a standard format we can use to encode lists and dictionaries as strings. These JSON-formatted strings can then be stored in text files, effectively giving us a &ldquo;snapshot&rdquo; of a Python dictionary or list.

To work with JSON in Python, you need to import the `json` module at the top of your file:
```py
import json
```

Here's an example showing how you can **write** a Python dictionary to a .json file:
```py
stateCapitalDict = {
    'MA': 'Boston',
    'NY': 'Albany',
    'NH': 'Concord'
}

with open('stateCapitals.json', 'w') as inputFile:
    json.dump(stateCapitalDict, inputFile)
```

<br>
Here's an example showing how you can **read** a .json file and load the contents as a Python dictionary.
```py
with open('stateCapitals.json', 'r') as inputFile:
    stateCapitalDict = json.load(inputFile)
```


## Task 2. Confirm *loadNameDictionary* is working as expected

A simple test of `loadNameDictionary` would be to invoke it and confirm you're getting a dictionary with the appropriate data back, e.g.:

```py
results = loadNameDictionary(2013, 'F')
print results
```

This confirms *what* this function is producing, but we also want to confirm *how* it's producing it.

To do this, edit your `loadNameDictionary` function so that...

__If the data is being loaded from the original data__ (using `getNameDictionary)`), it should print the message :

>> *Loading dictionary from original data...*

__If the data is being loaded from cache__, it should print the message:

>> *Loading dictionary from cache...*


Test your function out using some year/sex combination you have not used before:

```
# Change year/sex to a combination you have not used before
results = loadNameDictionary(1994, 'F')
````

On first run, it should print *Loading dictionary from original data...*

Then, run the same code again, and this time it should print: *Loading dictionary from cache...*

<br>
(Furthermore, you should see the resulting cache .json files appearing in `lab10/cache` for each test you run.)

@include('/labs/lab10/_toc')

@stop