
Build a "function with a memory" in Python

Are you familiar with the __call__ method in Python? If you define this method on a class, its instances can be called as though they were functions. Here's a contrived example solely to demonstrate how it works:

class Foo:
    def __init__(self, name):
        self.name = name
        self.call_ct = 0
        print(self.name, 'initialized')

    def __call__(self):
        self.call_ct += 1
        print(self.name, self.call_ct)

bar = Foo('bar')
bar()

baz = Foo('baz')
baz()
bar()

Output:

bar initialized
bar 1
baz initialized
baz 1
bar 2
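Calling the instance is just syntactic sugar for invoking its __call__ method, and the built-in callable() reflects this. A quick sketch (Counter is a made-up name, not part of the example above):

```python
class Counter:
    """Minimal callable class: each call bumps and returns a count."""
    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count

c = Counter()
c()           # an instance call...
c.__call__()  # ...is equivalent to invoking __call__ directly
print(callable(c), c.count)  # True 2
```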

Brandon Rhodes gives a more interesting use case: swapping out an http_get(url) function for an object that caches pages. Say, for instance, that we are maintaining a project that includes the following web-crawling code:

import urllib.request
from urllib.error import HTTPError, URLError

def http_get(url):
    with urllib.request.urlopen(url) as page:
        return page.getcode(), page.read().decode('utf-8')

def get_links(page_content):
    # Naive link scanner: assumes every 'href' is followed by ="..." or ='...'
    loc = page_content.find('href')
    while loc != -1:
        start = loc + len('href="')
        quote_char = page_content[start - 1]  # the opening quote (" or ')
        end = page_content.find(quote_char, start)
        yield page_content[start:end]

        loc = page_content.find('href', loc + 1)

def crawl(start_page, max_depth, on_page):
    stack = [(0, start_page)]
    while stack:
        depth, url = stack.pop()
        try:
            code, content = http_get(url)
        except (ValueError, HTTPError, URLError):
            continue

        on_page(url, code, content)
        if depth < max_depth and code == 200:
            stack.extend((depth + 1, u) for u in get_links(content))

if __name__ == '__main__':
    crawl('https://www.python.org', max_depth=1, on_page=lambda url, code, content: print(code, url))
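To see what get_links yields without touching the network, here it is run on an inline snippet (the definition is repeated so the sketch is self-contained):

```python
def get_links(page_content):
    loc = page_content.find('href')
    while loc != -1:
        start = loc + len('href="')
        quote_char = page_content[start - 1]
        end = page_content.find(quote_char, start)
        yield page_content[start:end]
        loc = page_content.find('href', loc + 1)

html = '<a href="https://example.com">x</a> <a href=\'/about\'>y</a>'
print(list(get_links(html)))  # ['https://example.com', '/about']
```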

Now say we become dissatisfied with the performance of this code and want to stop fetching the same page multiple times. The standard library provides a caching decorator, functools.lru_cache, that we can apply to our http_get function.

from functools import lru_cache

# ...

@lru_cache()
def http_get(url):
    with urllib.request.urlopen(url) as page:
        return page.getcode(), page.read().decode('utf-8')
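As a side benefit, lru_cache attaches cache_info() and cache_clear() to the decorated function. A sketch with a stand-in fetcher so it runs offline (fake_http_get and its canned response are made up for illustration):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fake_http_get(url):
    # Stand-in for http_get: returns a canned (status, body) pair.
    return 200, '<html>%s</html>' % url

fake_http_get('https://example.com')  # miss: computed and stored
fake_http_get('https://example.com')  # hit: served from the cache
info = fake_http_get.cache_info()
print(info.hits, info.misses)  # 1 1
```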

But another option is an object that implements __call__. What might that look like?

# ...
class CachedHttpGet:
    def __init__(self):
        self.cache = {}

    def __call__(self, url):
        if url not in self.cache:
            with urllib.request.urlopen(url) as page:
                self.cache[url] = (page.getcode(), page.read().decode('utf-8'))
        return self.cache[url]

http_get = CachedHttpGet()
# ...
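One advantage of the class-based version is that its collaborators are ordinary attributes, so they're easy to swap out in tests. A sketch with an injected fetch function (CachedGet is a hypothetical variant; the version above calls urllib directly):

```python
class CachedGet:
    """Hypothetical variant of CachedHttpGet with an injectable fetcher."""
    def __init__(self, fetch):
        self.fetch = fetch
        self.cache = {}
        self.misses = 0

    def __call__(self, url):
        if url not in self.cache:
            self.misses += 1
            self.cache[url] = self.fetch(url)
        return self.cache[url]

# A fake fetcher lets us exercise the cache without touching the network.
http_get = CachedGet(fetch=lambda url: (200, '<html>%s</html>' % url))
http_get('https://example.com')
code, body = http_get('https://example.com')  # second call is a cache hit
print(code, http_get.misses)  # 200 1
```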

While lru_cache is probably the better fit in this contrived example, I hope this article gives you another tool for your toolbox. The official documentation covers __call__ in the Python data model reference. Keep it in mind the next time you're refactoring something; it may be the right choice.