Home > Articles > Useful Python language features for interviews

Useful Python language features for interviews

collections.namedtuple

namedtuple can make your code a lot more readable. In an interview, that's helpful for a few reasons. First, it can help you demonstrate a good understanding of some of Python's standard libraries. Second, it helps you show off that you place importance on writing readable code. Third, it makes writing your code easier. If you're passing around tuples, it can be easy to forget what the object at each index into the tuple is. Using a namedtuple can help you avoid that.

Consider the case where you need to represent colors. You could choose to do so with a 3-tuple of the form (i, j, k) (where i, j, and k are integers on the range 0-255). This representation seems intuitive and natural enough. i could be the value for red, j for green, and k for blue. A problem with this approach is that you may forget which of the three numbers represents which primary color of light. Using a namedtuple could help with this:

Color = namedtuple('Color', ['red', 'green', 'blue'])

What does this change? Well, building a Color is almost the same as building that tuple you were previously building. Instead of doing (i, j, k), you'll now write Color(i, j, k). This is perhaps a little more readable, and it adds some more semantic meaning to your code. We're no longer just building a tuple; we're building a Color.

The real win for namedtuple is in access to its elements. Before, to get the red value for a color c, we would use brackets: c[0]. By comparison, if we have a Color called c, we could use a more friendly dot syntax: c.red. In my experience, while not having to remember the index of the red element is nice, the real win is in how much more readable c.red is in contrast to c[0].

collections.defaultdict and collections.Counter

Suppose your interviewer asks you to find the most common string in a list of strings. We can solve this problem using a defaultdict (let's call it d). We could loop through the list, incrementing d[elem] for each element. Then, we just find the one we saw most. The implementation would look like this:

def most_common_dd(lst):
    d = defaultdict(int)
    for e in lst:
        d[e] += 1

    return max(d.iteritems(), key=lambda t: t[1])

Apparently, users and maintainers of Python saw this pattern enough that they decided to create Counter. Counter lets us write a much more succinct version of this function, because Counter encapsulates the process of counting the number of ocurrences of elements in an iterable. Implementing this functionality with a `Counter object would look like this:

def most_common_ctr(lst):
    return Counter(lst).most_common(1)[0]

These both have the same result:

from collections import Counter, defaultdict

strings = ['bear', 'down', 'you', 'bears', 'of', 'old', 'baylor', 'u', "we're",
        'all', 'for', 'you', "we're", 'gonna', 'show', 'dear', 'old', 'baylor',
        'spirit', 'through', 'and', 'through', 'come', 'on', 'and', 'fight',
        'them', 'with', 'all', 'your', 'might', 'you', 'bruins', 'bold', 'and',
        'win', 'all', 'our', 'victories', 'for', 'the', 'green', 'and', 'gold']

'''
definitions for most_common_ctr and most_common_dd
'''

assert most_common_dd(strings) == most_common_ctr(strings)

But the version using Counter is more concise.

Comprehensions

I love list comprehensions. They can make code much more concise and readable. Consider a problem where we have a start point and an end point on a grid:

|S|_|_|_|
|_|_|_|_|
|_|_|_|_|
| | | |E|

Let's further say that from a given cell, you can travel up, down, left or right into another cell (but not diagonally). We may want to do a bread-first search to find the minimum cost to get from the start to the end. At some point, we'll need to push the neighbors of the current cell onto the queue we're using for the BFS. This could look something like this:

for neigh in neighbors(cell):
    # validate neigh
    queue.append(neigh)

How should neighbors(cell) work? Well, we could use a double for loop to generate the neighbors:

def neighbors(cell):
    for i in range(-1, 2):
        for j in range(-1, 2):
            if i == 0 and j == 0 or abs(i) + abs(j) > 1:
                continue
            yield (cell[0] + i, cell[0] + j)

This works, but it's ugly. Instead, we could use a list comprehension:

DIRS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
def neighbors(cell):
    return [(cell[0] + d[0], cell[1] + d[1]) for d in DIRS]

We're also probably going to want to keep track of which cells we've already visited (so we don't try to go back through them). We could create a matrix of bools the same size as our original grid (let's call it visited) and set visited[r][c] when we visit the cell located at row r and column c. But how should we initialize this matrix? We could do something like this:

visited = []
for i in range(n):
    visited.append([])
    for j in range(n):
        visited[i].append(False)

But list comprehensions can make this much more concise:

visited = [[False for _ in range(n)] for _ in range(n)]

The possibilities with list comprehensions are just about endless, so I'll leave it at that!