Selecting and operating on a subset of items from a list or group is a very common idiom in programming.
Python provides several built-in ways to do this task efficiently.
1. Python Filter Function
The built-in filter() function operates on any iterable type (list, tuple, string, etc.).
It takes a function and an iterable as arguments. filter() will invoke the function on each element of the iterable, and return a new iterable composed of only those elements for which the function returned True.
In Python 2, which this article uses, the return type depends on the type of the iterable passed in. If the iterable is a string or a tuple, the return type matches the input type. Otherwise, filter() always returns a list.
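The return-type rule above holds for Python 2. As a minimal sketch, assuming a Python 3 interpreter (where filter() returns a lazy iterator whatever the input type), the result can be converted back into the container you want. The is_even helper and the sample values are hypothetical, chosen only for illustration.

```python
# Assumes Python 3, where filter() returns a lazy iterator.
def is_even(n):  # hypothetical criteria function
    return n % 2 == 0

lazy_result = filter(is_even, [1, 2, 3, 4])  # a filter object, not a list

# Convert explicitly to the container type you need.
as_list = list(lazy_result)
as_tuple = tuple(filter(is_even, (1, 2, 3, 4)))
as_string = "".join(filter(str.isalpha, "a1b2c3"))

print(as_list, as_tuple, as_string)
```

Wrapping the call in list() is harmless in Python 2 as well, so it is a reasonable habit if the code may run on both versions.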
2. Python Filter with Number
Let’s look at a couple of examples. We’ll work in the Python interpreter.
First let’s make a list of numbers:
>>> numbers = [1, 6, 3, 8, 4, 9]
Next, we’ll define a function to act as our criteria to filter on:
>>> def lessThanFive(element):
...     return element < 5
...
Notice that in order to work properly, our criteria function must take a single argument (filter() will call it on each element of the iterable, one at a time). Using our newly defined lessThanFive() criteria function, we expect filter() to return a list of only the elements with a value less than 5, and that’s exactly what we get:
>>> filter(lessThanFive, numbers)
[1, 3, 4]
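For a one-off criterion like this, the function can also be written inline as a lambda. This is only a sketch of an alternative spelling, not something the example above requires; the list() wrapper is an assumption that makes the snippet behave the same on Python 2 and Python 3.

```python
numbers = [1, 6, 3, 8, 4, 9]

# The lambda takes a single argument, just like lessThanFive().
small_numbers = list(filter(lambda element: element < 5, numbers))

print(small_numbers)
```

A named function is usually easier to read and reuse once the test grows beyond a single expression.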
3. Python Filter with String
Let’s look at another example. This time, we’ll make a tuple of names:
>>> names = ('Jack', 'Jill', 'Steve', '')
Notice that the last name in the tuple is the empty string. If None is passed as the first argument to filter(), it falls back to using the identity function (a function that simply returns the element), so each element is kept or dropped according to its own truth value.
Since Python treats empty strings, 0 and None as false in a boolean context, this default behaviour is useful for removing the ‘empty’ elements of an iterable:
>>> filter(None, names)
('Jack', 'Jill', 'Steve')
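To make the "falsy values are dropped" behaviour concrete, here is a small sketch with a hypothetical mixed list. It assumes nothing beyond the built-ins; passing bool explicitly gives the same result as passing None.

```python
# Hypothetical sample data mixing truthy and falsy values.
mixed = [0, 1, '', 'Jack', None, [], [2], False, True]

# None as the first argument keeps only elements that are truthy.
truthy_only = list(filter(None, mixed))

# Passing bool explicitly applies the same truth test.
same_result = list(filter(bool, mixed))

print(truthy_only)
```

Keep in mind that this also removes legitimate values such as 0, so it suits data where every falsy value really does mean "empty".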
4. Python Filter with a Function
Let’s write a function to find only the names that start with ‘J’:
>>> def startsWithJ(element):
...     if len(element) > 0:
...         return element[0] == 'J'
...     return False
...
I check the length of the name first to ensure that there is in fact a first character.
We should expect to get back a tuple containing only ‘Jack’ and ‘Jill’ when we use this as our criteria function:
>>> filter(startsWithJ, names)
('Jack', 'Jill')
Again, notice that the return type is a tuple, and not a list.
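As a possible simplification, the string method str.startswith() already returns False for an empty string, so the length check can be dropped. The starts_with_j name below is hypothetical, and the tuple() wrapper is an assumption so that the sketch also yields a tuple on Python 3.

```python
names = ('Jack', 'Jill', 'Steve', '')

def starts_with_j(element):
    # ''.startswith('J') is False, so no explicit length check is needed.
    return element.startswith('J')

j_names = tuple(filter(starts_with_j, names))

print(j_names)
```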
For more information about the filter function, type help(filter) in a Python interpreter to see its built-in help text, or browse the online Python filter docs.
5. Basic List Comprehension Usage
Another way to approach this idiom is to use a list comprehension. Essentially, a list comprehension is a compact for-loop that builds a list. On each iteration, an element can be appended to the list being built. The syntax is:
[ <output value> for <element> in <list> <optional criteria> ]
This looks like a lot, so let’s start off with a simple example. Remember the ‘numbers’ list we defined earlier?
>>> numbers
[1, 6, 3, 8, 4, 9]
We can use a list comprehension to simply ‘echo’ the list:
>>> [ num for num in numbers ]
[1, 6, 3, 8, 4, 9]
6. Process List as For Loop
Let’s break this expression down:
- The middle part of the comprehension, ‘for num in numbers’, looks exactly like a for loop. It tells us that we are iterating over the ‘numbers’ list, and binds the name ‘num’ to the current element each iteration.
- The leading ‘num’ tells us what to append to the list we are building. We could just as easily have written the following to build a list with each element of ‘numbers’ doubled:
>>> [ num * 2 for num in numbers ]
[2, 12, 6, 16, 8, 18]
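Since a comprehension is described above as a compact for-loop, it may help to see the doubling example written out longhand. This is a sketch of the equivalent loop; the variable names doubled_loop and doubled_comprehension are hypothetical.

```python
numbers = [1, 6, 3, 8, 4, 9]

# Longhand: start with an empty list and append on each iteration.
doubled_loop = []
for num in numbers:
    doubled_loop.append(num * 2)

# Comprehension: the same loop, with the appended value written first.
doubled_comprehension = [num * 2 for num in numbers]

print(doubled_loop == doubled_comprehension)
```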
7. If Condition in Python List
Let’s add a criterion to recreate what we did with the built-in filter function earlier:
>>> [ num for num in numbers if num < 5 ]
[1, 3, 4]
This time, we’re still building our new list from the original values, but we only append an original value to the new list if it satisfies the criterion (‘if num < 5’). I hope you can start to see the similarities between comprehensions and filtering.
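To make the similarity concrete, the sketch below runs the same test through filter() and through a comprehension, then shows one thing the comprehension does in a single step: filtering and transforming together. The less_than_five and doubled_small names are hypothetical, and list() is an assumption for Python 3 compatibility.

```python
numbers = [1, 6, 3, 8, 4, 9]

def less_than_five(element):
    return element < 5

# Same criterion, two spellings.
from_filter = list(filter(less_than_five, numbers))
from_comprehension = [num for num in numbers if num < 5]

# A comprehension can also transform the values it keeps.
doubled_small = [num * 2 for num in numbers if num < 5]

print(from_filter, from_comprehension, doubled_small)
```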
8. Python List like an Array
Our intention doesn’t always have to be to create a new list when using list comprehensions, though. We can make use of side-effects, carrying out behaviour on a subset of a list all in one step. Let’s look at a slightly less trivial example. Let’s say we have a list of customers:
>>> customers = [('Jack', 'email@example.com', True), ('Jill', 'firstname.lastname@example.org', False)]
Customers are represented by a 3-tuple containing their name, email address, and whether they want to receive email notifications, in that order. We can write:
>>> def subscribesForUpdates(customer):
...     return customer[2]
...
to capture the intent of this 3rd field.
Now, let’s say we have a message to send, but we want to send it only to those customers who subscribe for updates. The following function will be responsible for sending the notifications:
>>> def emailUpdate(customer):
...     # stub for actually sending the email
...     print 'emailing update to: %s' % customer[1]
...
This function is only a stub. If we were doing this for real, the logic to send an email would go here, but in our example, we simply print a message to show that the function was called for a given customer. Notice that emailUpdate does not return a customer, but instead carries out behaviour using one.
9. Function, If condition and For loop in Python List
Now, we have everything we need to use a list comprehension to send out our notifications:
>>> [ emailUpdate(customer) for customer in customers if subscribesForUpdates(customer) ]
emailing update to: email@example.com
[None]
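There are similar comprehension syntaxes for dictionaries and sets, and generator expressions offer a lazy counterpart (there is no tuple comprehension; a parenthesized comprehension is a generator expression). To read more about list comprehensions, visit the official Python lists docs.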
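As a brief sketch of those related forms, assuming Python 2.7 or later (where dictionary and set comprehensions are available), the same for/if structure appears inside different brackets. The variable names are hypothetical and the data reuses the ‘names’ tuple from earlier.

```python
names = ('Jack', 'Jill', 'Steve', '')

# Dictionary comprehension: key and value separated by a colon.
name_lengths = {name: len(name) for name in names if name}

# Set comprehension: curly braces, duplicates collapse automatically.
first_letters = {name[0] for name in names if name}

# Generator expression: parentheses, values are produced lazily.
total_length = sum(len(name) for name in names)

print(name_lengths, first_letters, total_length)
```

With that aside covered, the notification comprehension above is worth a closer look.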
I hope you’re getting familiar with the list comprehension syntax, but let’s break this one down:
- The middle of the comprehension, ‘for customer in customers’, tells us that we are iterating over the list ‘customers’. It also tells us that the current list element for each iteration will be bound to the variable ‘customer’.
- The start of the comprehension, ‘emailUpdate(customer)’, tells us that we will invoke the emailUpdate function on each customer (this is where the side-effect behaviour is important), and then append its return value to the list we are building.
- The end of the comprehension, ‘if subscribesForUpdates(customer)’, says to only execute the emailUpdate function if the condition is true (i.e. if the customer subscribed for emails).
From the output, we see that the emailUpdate function was indeed only invoked on Jack (as Jill does not subscribe for emails).
We also see that the return value from the comprehension was [None]. Why is this?
Remember that list comprehensions build lists. This time, we are building a list using the return value of emailUpdate().
Since emailUpdate() doesn’t explicitly return a value, Python makes it implicitly return None. Since only Jack subscribed for email updates, the emailUpdate() function was only called once, and only one None was appended to the initially empty list that the comprehension built for us.
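If the throwaway [None] list is unwanted, a plain for-loop carries out the same side-effect without building anything. The following is a sketch of that alternative rather than a replacement for the article's approach; it assumes Python 3 (hence the print() function), uses hypothetical snake_case names, and adds a hypothetical ‘notified’ list purely so the result can be inspected.

```python
customers = [
    ('Jack', 'email@example.com', True),
    ('Jill', 'firstname.lastname@example.org', False),
]

def subscribes_for_updates(customer):
    # The third field records whether the customer wants notifications.
    return customer[2]

def email_update(customer):
    # Stub for actually sending the email; the second field is the address.
    print('emailing update to: %s' % customer[1])

notified = []  # hypothetical record of who was contacted
for customer in customers:
    if subscribes_for_updates(customer):
        email_update(customer)
        notified.append(customer[0])

print(notified)
```

The loop reads a little longer, but it makes clear that the goal is the side-effect, not the resulting list.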