We can simplify the mapping/filtering patterns with a syntactic device called list comprehension.
# The old way of doing mapping
nums = [17, 42, 6, 23, 38]
result = []
for x in nums:
    result.append(x*2)
result # list value at the end of mapping process
# The new way of doing mapping with a list comprehension:
[x*2 for x in nums]
# The old way of doing filtering
nums = [17, 42, 6, 23, 38]
result = []
for n in nums:
    if n%2 == 0:
        result.append(n)
result # list value at the end of filtering process
# The new way of doing filtering
[n for n in nums if n%2 == 0]
states = ["Alabama", "California", "Illinois", "Massachusetts",
"Michigan", "Ohio", "Oklahoma", "Washington"]
Ex 1a: Use list comprehension to write a single line that creates a list of the lengths of the strings in states:
# Your code here
[len(state) for state in states]
Ex 1b: Use list comprehension to write a single line that creates a list of the abbreviations of the states in states, taking the first two letters of each name in uppercase (i.e., ['AL', 'CA', 'IL', 'MA', 'MI', 'OH', 'OK', 'WA'])
# Your code here
[state[0:2].upper() for state in states]
Ex 1c: Use list comprehension to write a single line that creates a list of all states in states that end in 'a':
# Your code here
[state for state in states if state[-1]=='a']
makeSquarePairs
Define the function makeSquarePairs that, given a single integer num, returns a list of tuples containing all numbers from 1 to num inclusive and their squares. An example is shown below:
makeSquarePairs(5)
[(1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
Use list comprehension!
def makeSquarePairs(num):
    # flesh out the body of this function
    # you only need ONE line of code
    # Your code here
    return [(n, n*n) for n in range(1, num+1)]
makeSquarePairs(5)
It is possible to do both mapping and filtering in a single list comprehension. Examine the example below which filters a list by even numbers and creates a new list of their squares.
# Mapping and filtering together
[(x**2) for x in range(10) if x % 2 == 0]
Note that the expression for mapping still comes before the for keyword, and the filtering with the if keyword still comes after the sequence expression. Below is the equivalent code without list comprehension.
newList = []
for x in range(10):
    if x % 2 == 0:
        newList.append(x**2)
newList
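As a quick sanity check, the sketch below wraps both forms in small functions and confirms that they build the same list. The function names squaresOfEvensLoop and squaresOfEvensComprehension are made up for this illustration.

```python
def squaresOfEvensLoop(limit):
    """Loop version: keep the even numbers, then square each one."""
    newList = []
    for x in range(limit):
        if x % 2 == 0:
            newList.append(x**2)
    return newList

def squaresOfEvensComprehension(limit):
    """Comprehension version of the same mapping and filtering."""
    return [x**2 for x in range(limit) if x % 2 == 0]

print(squaresOfEvensComprehension(10))  # [0, 4, 16, 36, 64]
print(squaresOfEvensLoop(10) == squaresOfEvensComprehension(10))  # True
```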
states2 = ['Arkansas', 'Idaho', 'North Carolina', 'New Mexico',
'Oregon', 'Rhode Island', 'South Dakota', 'Utah']
Ex 3a: Use list comprehension to write a single line that creates a list of the abbreviations of only the one-word states in states2 (i.e., ['AR', 'ID', 'OR', 'UT'])
# Your code here
[state[0:2].upper() for state in states2 if ' ' not in state]
Ex 3b: Use list comprehension to write a single line that creates a list of the abbreviations of only the two-word states in states2 (i.e., ['NC', 'NM', 'RI', 'SD'])
Reminder: the .split() method of strings can split a string into multiple parts wherever spaces occur.
# Your code here
# Solution #1
[state.split()[0][0] + state.split()[1][0] for state in states2 if ' ' in state]
# Solution #2 uses nested list comprehensions to avoid calling .split() twice on a state.
[strings[0][0] + strings[1][0]
for strings in [state.split() for state in states2]
if len(strings) == 2]
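If the nested comprehension in Solution #2 is hard to read, it may help to see one way of unrolling it into an ordinary loop. This is only a sketch for comparison; the variable name abbreviations is introduced here and is not part of the exercise.

```python
states2 = ['Arkansas', 'Idaho', 'North Carolina', 'New Mexico',
           'Oregon', 'Rhode Island', 'South Dakota', 'Utah']

abbreviations = []
for state in states2:
    strings = state.split()      # split each state name only once
    if len(strings) == 2:        # keep only the two-word states
        # first letter of the first word + first letter of the second word
        abbreviations.append(strings[0][0] + strings[1][0])

print(abbreviations)  # ['NC', 'NM', 'RI', 'SD']
```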
Let's start with the simplest case, sorting a list of unordered numbers, positive and negative.
numbers = [35, -2, 17, -9, 0, 12, 19]
We will use Python's built-in function sorted to sort the list. This function always returns a new list.
sorted(numbers)
And we can verify that numbers hasn't changed:
numbers
Note: This means that if we want to reuse the result of sorted later, we must save its returned value in a variable, for example:
sortedNumbers = sorted(numbers)
By default, the list is sorted in ascending order (from the smallest value to the largest), but we can easily reverse the order using the reverse keyword parameter of the function sorted, as shown below:
sorted(numbers, reverse=True)
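The cells above can be gathered into one short sketch that shows both orders side by side and confirms that the original list is untouched. The variable names ascending and descending are introduced here for illustration.

```python
numbers = [35, -2, 17, -9, 0, 12, 19]

ascending = sorted(numbers)                 # default: smallest to largest
descending = sorted(numbers, reverse=True)  # largest to smallest

print(ascending)   # [-9, -2, 0, 12, 17, 19, 35]
print(descending)  # [35, 19, 17, 12, 0, -2, -9]
print(numbers)     # [35, -2, 17, -9, 0, 12, 19] (unchanged)
```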
You have seen keyword parameters to functions before in the context of print. Remember the sep and end keyword parameters for print?
Strings and tuples can also be sorted in the same way. The result is always going to be a new list.
Characters in a string will be ordered in dictionary order:
sorted('facetiously')
phrase = 'Red Code 1'
sorted(phrase)
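Question: Why do we see a space as the first element in the sorted list of characters for phrase?
Answer: Because of the ASCII representation of characters: the space has a smaller code than any digit or letter.
We can use the Python built-in function ord to find the ASCII code of every character (ord returns a Unicode code point, which matches the ASCII code for these characters):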
ord(' ')
We can write a for loop to print the code for every character.
for item in sorted(phrase):
    print(f"'{item}' has ASCII code {ord(item)}")
The above example uses a so-called f-string of the form f"...", where "..." is a template string in which parts delimited by curly braces contain Python expressions whose values are automatically converted to strings and then inserted into the template via concatenation.
For example, f"'{item}' has ASCII code {ord(item)}" is just a more convenient way to write the harder-to-read concatenation expression
"'" + item + "' has ASCII code " + str(ord(item))
Just as in the case of the list numbers in the above example, the string value of phrase hasn't changed:
phrase
This is to be expected, because strings are immutable.
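Returning briefly to the f-string used in the loop above: the following sketch checks, for a single character, that the f-string and the concatenation expression produce the same string. The variable names withFString and withConcatenation are made up for this check.

```python
item = ' '  # the space character, which sorted first in the example above

withFString = f"'{item}' has ASCII code {ord(item)}"
withConcatenation = "'" + item + "' has ASCII code " + str(ord(item))

print(withFString)                       # ' ' has ASCII code 32
print(withFString == withConcatenation)  # True
```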
digits = (9, 7, 5, 3, 1) # this is a tuple
type(digits) # check the type
sorted(digits)
Notice that the result of the sorting is a list, not a tuple. This is because the function sorted always returns a list.
digits
The original tuple value hasn't changed.
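If a sorted tuple is what we actually need, one option (not covered elsewhere in this notebook) is to pass the sorted list to the built-in tuple function. A minimal sketch, with sortedList and sortedTuple as illustrative names:

```python
digits = (9, 7, 5, 3, 1)

sortedList = sorted(digits)      # sorted always gives back a list
sortedTuple = tuple(sortedList)  # convert the list into a new tuple

print(sortedList)   # [1, 3, 5, 7, 9]
print(sortedTuple)  # (1, 3, 5, 7, 9)
print(digits)       # (9, 7, 5, 3, 1) (the original is unchanged)
```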
We can sort lists of sequences, such as lists of strings, lists of tuples, and lists of lists.
Sorting a list of tuples and sorting a list of lists work in similar ways. The same principles apply.
# a long string that we will split into a list of words
phrase = "99 red balloons *floating* in the Summer sky"
words = phrase.split()
words
sorted(words)
Question: Can you explain the results of sorting here? What rules are in place?
Answer: Words that start with special characters come first, then words that start with digits, words starting with uppercase letters, and finally, words with lowercase letters in alphabetical order. This ordering corresponds to the ASCII table numerical representations of each word's first character.
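*String characters are ordered by these rules:*
1. Some punctuation symbols (!, +, ,, ., etc.)
2. Digits (0, 1, 2, etc.)
3. More punctuation symbols (:, <, ?, etc.)
4. Uppercase letters (A, B, C, etc.)
5. More punctuation symbols (^, _, etc.)
6. Lowercase letters (a, b, c, etc.)
7. More punctuation symbols (|, ~, etc.)
sorted(words, reverse=True)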
Remember, the original list is unchanged:
words
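To see the ordering rules at work on this list, the sketch below prints the code of each word's first character, in sorted order. The names sortedWords and firstCodes are introduced only for this illustration.

```python
phrase = "99 red balloons *floating* in the Summer sky"
words = phrase.split()

sortedWords = sorted(words)
firstCodes = [ord(word[0]) for word in sortedWords]

for word in sortedWords:
    print(f"{word} starts with '{word[0]}', which has code {ord(word[0])}")

print(firstCodes)  # [42, 57, 83, 98, 105, 114, 115, 116]
```

The codes increase from left to right, which is why '*floating*' comes first and 'the' comes last.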
Tuples are compared element by element, starting with the one at index 0. This is known as lexicographic order, which is a generalization of dictionary order on strings in which each tuple element generalizes a character in a string.
triples = [(8, 'a', '$'), (7, 'c', '@'),
(7, 'b', '+'), (8, 'a', '!')]
sorted(triples)
Q: What happens in the case of ties for the first elements of tuples?
A: We keep comparing elements with the same indices until we find two that are not the same. (See example for the two tuples that start with 8.)
ord('!') < ord('$')
print(ord('!'), ord('$'))
That is, the reason '!' is less than '$' is that the former has a smaller ASCII code than the latter.
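The element-by-element rule can also be observed directly by comparing pairs of tuples with the < operator. The sketch below uses the tuples from triples; the variable name sortedTriples is introduced here.

```python
# The first elements differ, so they decide the comparison: 7 < 8
print((7, 'c', '@') < (8, 'a', '$'))  # True

# Tie on 7, so the second elements decide: 'b' < 'c'
print((7, 'b', '+') < (7, 'c', '@'))  # True

# Tie on 8 and on 'a', so the third elements decide: '!' < '$'
print((8, 'a', '!') < (8, 'a', '$'))  # True

triples = [(8, 'a', '$'), (7, 'c', '@'),
           (7, 'b', '+'), (8, 'a', '!')]
sortedTriples = sorted(triples)
print(sortedTriples)
# [(7, 'b', '+'), (7, 'c', '@'), (8, 'a', '!'), (8, 'a', '$')]
```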
key keyword parameter
Often we want to sort by an element that is not first in a sequence. For example, given the list of tuples people (below), we want to sort by the age of a person.
people = [('Mary Beth Johnson', 18),
('Ed Smith', 17),
('Janet Doe', 25),
('Bob Miller', 31)]
Simply using sorted as we have done so far will not work, because it would sort by name. But the function sorted was designed with this scenario in mind. Let us read its docstring.
help(sorted)
Notice the phrase: A custom key function can be supplied to customize the sort order. This means that we can specify a function that for each element determines how it should be compared to other elements of the iterable. Let us see an example.
people = [('Mary Beth Johnson', 18),
('Ed Smith', 17),
('Janet Doe', 25),
('Bob Miller', 31)]
We'll create the function age that, given a person tuple (name, age), will return the age value.
def age(personTuple):
    """Helper function to use in sorted"""
    return personTuple[1]
age(('Janet Doe', 25))
Now that we have this function, we will use it as the value for the key keyword parameter in sorted.
sorted(people, key=age)
The list was sorted by the age values! Let's see one more example. We will create a helper function lastName that returns a person's last name.
def lastName(personTuple):
    """Helper function to use in sorted"""
    # first access the whole name (index 0 in the tuple),
    # then split it (this creates a list),
    # then return its last element (index -1)
    return personTuple[0].split()[-1]
lastName(('Bob Miller', 31))
sorted(people, key=lastName)
Important: The value assigned to the keyword parameter key is a function. Functions in Python are values; see the examples below:
age
lastName
We can create a variable, assign it a function value, and then call that variable as if it were a function (because it is indeed an alias for a function).
boo = age
boo(('Janet Doe', 25))
foo = lastName
foo(('Ed Smith', 17))
The variables boo and foo are aliases for the functions age and lastName, which we can easily verify:
boo
foo
sorted(people, key=boo) # boo is an alias for age
sorted(people, key=foo) # foo is an alias for lastName
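Because functions are values, the key parameter is not limited to helper functions we define ourselves. As an illustration that goes slightly beyond this notebook's examples, the sketch below passes the built-in function len and the string method str.lower as key functions. The list and the names byLength and ignoringCase are made up for this example.

```python
words = ['the', 'Summer', 'sky', 'in', 'balloons']

# Sort by word length; words of equal length keep their original order
byLength = sorted(words, key=len)
print(byLength)      # ['in', 'the', 'sky', 'Summer', 'balloons']

# Sort alphabetically while ignoring upper/lower case
ignoringCase = sorted(words, key=str.lower)
print(ignoringCase)  # ['balloons', 'in', 'sky', 'Summer', 'the']
```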
Sort people by the length of names
Suppose we want to sort people in ascending order by the lengths of their names. I.e., the sorted result should be:
[('Ed Smith', 17),
 ('Janet Doe', 25),
 ('Bob Miller', 31),
 ('Mary Beth Johnson', 18)]
Define a helper function nameLength that can be used as the key parameter for sorted to perform sorting by name length.
# define the nameLength function below
# Your code here
def nameLength(personTuple):
    return len(personTuple[0])
sorted(people, key=nameLength)
key functions
Assume we have a new list of person tuples, where there are lots of ambiguities in terms of what comes first. Concretely:
people2 = [('Ed Jones', 18),
('Bob Doe', 25),
('Ed Doe', 18),
('Ana Doe', 25),
('Ana Jones', 18)]
Notice that we have several individuals with the same age, or the same first name, or the same last name. How should we sort elements in this situation?
We can create a function that uses a tuple to break the ties.
def ageLastFirst(person):
    return (age(person), lastName(person), firstName(person))
Your Turn
Define a function firstName that mimics lastName but returns the first name of a person.
# Your code here
def firstName(personTuple):
    """Helper function to use in sorted"""
    return personTuple[0].split()[0]
If you defined firstName, we can now write ageLastFirst:
def ageLastFirst(person):
    """Helper function to use in sorted"""
    return (age(person), lastName(person), firstName(person))
people2 = [('Ed Jones', 18),
('Bob Doe', 25),
('Ed Doe', 18),
('Ana Doe', 25),
('Ana Jones', 18)]
sorted(people2, key=ageLastFirst)
Notice that in the result, the tuples are sorted first by age, then by last name (when the same age), and then by first name (when same age and last name).
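To make the tie-breaking visible, it can help to print the key tuple that ageLastFirst produces for each person before sorting. The sketch below repeats the helper definitions so that it runs on its own; the variable name result is introduced here.

```python
people2 = [('Ed Jones', 18),
           ('Bob Doe', 25),
           ('Ed Doe', 18),
           ('Ana Doe', 25),
           ('Ana Jones', 18)]

def age(personTuple):
    return personTuple[1]

def lastName(personTuple):
    return personTuple[0].split()[-1]

def firstName(personTuple):
    return personTuple[0].split()[0]

def ageLastFirst(person):
    return (age(person), lastName(person), firstName(person))

# Show the tuple that sorted will compare for each person
for person in people2:
    print(person, '->', ageLastFirst(person))

result = sorted(people2, key=ageLastFirst)
print(result)
# [('Ed Doe', 18), ('Ana Jones', 18), ('Ed Jones', 18), ('Ana Doe', 25), ('Bob Doe', 25)]
```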
Suppose we want to sort people2 first by the length of their names, and then sort names with the same length alphabetically. I.e., we want the result to be:
[('Ed Doe', 18),
('Ana Doe', 25),
('Bob Doe', 25),
('Ed Jones', 18),
('Ana Jones', 18)]
Define a helper function nameLengthThenAlphabetic that can be used as the key parameter for sorted to perform sorting first by name length, and then alphabetically by name to break ties.
# Your code here
def nameLengthThenAlphabetic(person):
    # person[0] sorts names of the same length alphabetically
    return (nameLength(person), person[0])
sorted(people2, key=nameLengthThenAlphabetic)
How does key work?
When sorted is called with a key parameter, the first thing it does is invoke the function referred to by key on each element of the sequence. If we call the value returned by the key function keyvalue, then conceptually sorted creates a tuple (keyvalue, value) for each element, sorts the sequence based on these tuples, and then discards the tuples and returns only the sorted values. (In practice, sorted compares only the key values and leaves elements with equal keys in their original order.)
This process is also known as Decorate, Sort, Undecorate and we can try it too:
# Step 1 (Decorate): create a list of tuples (keyvalue, value)
decorated = [(age(person), person) for person in people]
decorated
# Step 2 (Sort): invoke the function sorted without the key function
decoratedSorted = sorted(decorated)
decoratedSorted
# Step 3 (Undecorate): extract now the value from each (keyvalue,value) pair to create the end result
undecoratedResult = [item[1] for item in decoratedSorted]
undecoratedResult
As you might remember, when we include key in sorted the result is the same:
sorted(people, key=age)
Basically, the parameter key works because of the rules for sorting a list of tuples that we saw earlier in the notebook.
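One caveat is worth a sketch: Decorate, Sort, Undecorate is a mental model, and it differs from the built-in behavior when key values tie. Python's sorted compares only the key values and keeps tied elements in their original order, while a plain decorated sort falls back to comparing the values themselves. The example below uses people2 from earlier; the names builtIn, plainDSU, and stableDSU are made up here, and adding the element's index to the decorated tuple (with the built-in enumerate) is one assumed way to imitate the built-in behavior.

```python
people2 = [('Ed Jones', 18),
           ('Bob Doe', 25),
           ('Ed Doe', 18),
           ('Ana Doe', 25),
           ('Ana Jones', 18)]

def age(personTuple):
    return personTuple[1]

# Built-in: people with the same age stay in their original order
builtIn = sorted(people2, key=age)
print(builtIn)
# [('Ed Jones', 18), ('Ed Doe', 18), ('Ana Jones', 18), ('Bob Doe', 25), ('Ana Doe', 25)]

# Plain DSU: ties on age are broken by comparing the person tuples
plainDSU = [item[1] for item in sorted([(age(p), p) for p in people2])]
print(plainDSU)
# [('Ana Jones', 18), ('Ed Doe', 18), ('Ed Jones', 18), ('Ana Doe', 25), ('Bob Doe', 25)]

# DSU with the original index as a tie-breaker matches the built-in result
decorated = [(age(p), index, p) for index, p in enumerate(people2)]
stableDSU = [item[2] for item in sorted(decorated)]
print(stableDSU == builtIn)  # True
```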
The split and join methods
There are two string methods that are quite useful when dealing with lists of strings, including when sorting them. These are split and join. When applied to a string, split will split that string up into pieces, returning a list of strings. Without any argument, the string will be split wherever whitespace is found (i.e., spaces, newlines, tabs, etc.). If an argument is given, the string will be split wherever that specific character or sequence of characters is found. For example:
'This is a sentence'.split()
'one-two-three'.split('-')
'To be or not to be'.split('o')
'I saw the cat befriend the mouse that ate the cheese'.split(' the ')
The join method is the opposite of split, and you also give it information in the opposite order: the separator comes before .join and the things to join together (a list of strings) are provided as an argument. It returns the string that results from concatenating all the strings in the list, separated by the separator. For example:
', '.join(['aardvark', 'bunny', 'cat', 'dingo'])
';'.join(['aardvark', 'bunny', 'cat', 'dingo'])
' '.join(['aardvark', 'bunny', 'cat', 'dingo'])
''.join(['aardvark', 'bunny', 'cat', 'dingo'])
Split and join can be used together to separate off one part of a string and put the rest back together. For example, if you want to chop off the last word of a multi-word string, you can split it, slice the result, and then join that slice back together, like this:
text = 'every word is important'
' '.join(text.split()[:-1])
If we break that down into steps using variables, it would be:
text = 'every word is important'
words = text.split()
allButLast = words[:-1]
' '.join(allButLast)
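Since this notebook is about sorting, here is one more sketch of split and join working together with sorted: it splits a string into words, sorts the words, and joins them back into a single string. The variable name sortedText is introduced for this example.

```python
text = 'every word is important'

words = text.split()            # ['every', 'word', 'is', 'important']
sortedWords = sorted(words)     # a new list of the words in sorted order
sortedText = ' '.join(sortedWords)

print(sortedText)  # every important is word
```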
Lists have two methods for sorting: sort and reverse. These methods mutate the original list and return None rather than returning the mutated list.
sort
numbers = [35, -2, 17, -9, 0, 12, 19]
numbers.sort()
Notice that no value was returned, because sort mutates the original list.
numbers
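A common slip that follows from this behavior is assigning the result of sort to a variable. The sketch below contrasts the sort method with the sorted function; the names result, original, and newList are made up for this illustration.

```python
numbers = [35, -2, 17, -9, 0, 12, 19]

result = numbers.sort()   # sort mutates numbers and returns None
print(result)             # None
print(numbers)            # [-9, -2, 0, 12, 17, 19, 35]

original = [35, -2, 17]
newList = sorted(original)  # sorted returns a new list
print(original)             # [35, -2, 17] (unchanged)
print(newList)              # [-2, 17, 35]
```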
reverse
numbers2 = [35, -2, 17, -9, 0, 12, 19]
numbers2.reverse()
This method also does not return a value, because it too mutates the list.
numbers2
In combination, sort and reverse can sort a list in reverse order:
numbers2.sort()
numbers2.reverse()
numbers2
Like the sorted function, the sort method can also use the keyword parameters key and reverse
people.sort(key=age, reverse=True)
people
This is the end of this notebook!