Cache implementation using weakref

Fri 30 April 2021
Bird's cache (Photo credit: Wikipedia)

This article presents a quick example of how to use weakref to implement a home-made cache mechanism.

Let's use the following use case:

  • Let's consider items:
    • items can be stored on a storage
    • items can be retrieved from storage
    • items are identified by an ID
  • All processing on items takes as input an iterable over items

Items considered

As I'm lazy and I don't want to set up a database, I'll use Q-items and reuse some functions from a previous article. Items must expose a method to build themselves from storage and a method to store themselves into storage. I'll call them from_storage() and to_storage().

Let's consider a QItem. The from_storage() method takes the item from an external resource (Wikidata). The to_storage() method is a dummy function, as we don't want to modify Wikidata. In real life, this function should check the storage and either store the item or update it if it already exists.

class QItem:
    def __init__(self, q_num):
        self.q_num = q_num
        self.values = {}

    def from_storage(self):
        """Build item from storage"""
        self.values = wikidata_to_dict(get_item(self.q_num))

    def to_storage(self):
        """Store this item to storage.

        If item already exists in storage, update it
        """
        # nothing to do here, we won't try to modify wikidata
        print(f"storing {self.q_num}")

    def iter_properties(self):
        """Whatever method to pretend this item is not useless."""
        for k, v in self.values.items():
            yield {k: v}

    def get_any_property(self):
        """Whatever method to pretend this item is not useless."""
        return next(self.iter_properties())
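
The helpers get_item() and wikidata_to_dict() come from the previous article and are not reproduced here. To try QItem without network access, stand-ins like the following could be used. This is only a sketch: FAKE_WIKIDATA, the record shape and both function bodies are assumptions made for illustration, not the real helpers.

```python
# Hypothetical offline stand-ins for the helpers of the previous article.
# The real functions query Wikidata; these only read a local dict.
FAKE_WIKIDATA = {
    "Q42": {"label": "sample item 42", "claims": {"P31": "Q5"}},
}


def get_item(q_num):
    """Return the raw record for q_num (assumed shape), or raise KeyError."""
    return FAKE_WIKIDATA[q_num]


def wikidata_to_dict(raw):
    """Flatten a raw record into a plain dict (assumed behaviour)."""
    flat = {"label": raw["label"]}
    flat.update(raw["claims"])
    return flat


# This is the value QItem.from_storage() would assign to self.values
values = wikidata_to_dict(get_item("Q42"))
```

With these two functions defined before the class, QItem("Q42").from_storage() fills values from the local dict instead of calling Wikidata.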

Cache implementation

Let's consider a collection of items. This collection consists of the set of all item IDs and a dict of items. The dict acts as a cache: if the item is in the dict, it is returned; otherwise it is built from the storage (based on its ID) and put into the dict. Memory freeing is handled by weakref: a WeakValueDictionary drops an entry once nothing else references its value. The set of IDs keeps track of every key that has been put in the dict, so that an evicted item can still be rebuilt.

from weakref import WeakValueDictionary


class WikidataCollection:
    def __init__(self):
        self.items = WeakValueDictionary()
        self.ids = set()

    def get(self, q_num):
        """Return item q_num, rebuilding it from storage if needed."""
        if q_num not in self.ids:
            raise ValueError("unknown item")
        try:
            # item still in cache
            return self.items[q_num]
        except KeyError:
            # item was evicted: get it from elsewhere (e.g. database)
            q_item = QItem(q_num)
            q_item.from_storage()
            self.items[q_num] = q_item
            return q_item

    def set_item(self, q_num, q_item):
        """Add q_item with id q_num in cache"""
        q_item.to_storage()
        self.items[q_num] = q_item
        self.ids.add(q_num)

    def iter_items(self):
        for q_num in self.ids:
            yield self.get(q_num)

I tried to keep this implementation as simple as possible in order to be able to adapt it as easily as possible to other objects.
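
The whole mechanism relies on how WeakValueDictionary treats its values, so here is a minimal sketch of that behaviour alone. The Item class is a hypothetical stand-in for QItem. On CPython the entry disappears as soon as the last strong reference is deleted; the gc.collect() call is only there for interpreters that free objects later.

```python
import gc
from weakref import WeakValueDictionary


class Item:
    """Hypothetical stand-in for QItem (plain classes can be weakly referenced)."""

    def __init__(self, name):
        self.name = name


cache = WeakValueDictionary()

item = Item("Q42")
cache["Q42"] = item
# The variable `item` is a strong reference, so the entry stays in the cache
in_cache_while_referenced = "Q42" in cache

del item      # drop the only strong reference
gc.collect()  # not required on CPython, harmless elsewhere
# The cache held only a weak reference, so the entry is gone
in_cache_after_del = "Q42" in cache

print(in_cache_while_referenced, in_cache_after_del)
```

This is why the collection keeps a separate set of IDs: the dict alone cannot tell which items exist once they have been freed.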

Now let's use it. First, let's create a collection and populate it:

from time import sleep


def get_qitem(q_num):
    qitem = QItem(q_num)
    qitem.from_storage()
    sleep(1) # don't overload wikidata
    return qitem


q_collection = WikidataCollection()

# populate collection
for num in range(42, 56):
    if num in (47, 50):
        continue
    q_num = "Q" + str(num)
    q_collection.set_item(q_num, get_qitem(q_num))

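Now you can iterate over the QItems belonging to q_collection. Note that the WeakValueDictionary q_collection.items can hold fewer items than the set of IDs q_collection.ids: an entry vanishes once its item is garbage collected, and iter_items() then rebuilds the item from storage.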

[qitem.get_any_property() for qitem in q_collection.iter_items()]
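
To see the difference between the two containers without querying Wikidata, the same pattern can be sketched with a fake item that counts its storage reads. FakeItem, Collection and the loads counter are assumptions made for this illustration; the logic mirrors WikidataCollection but is not the code above.

```python
import gc
from weakref import WeakValueDictionary


class FakeItem:
    """Hypothetical item whose 'storage' is simulated in memory."""

    loads = 0  # number of simulated storage reads

    def __init__(self, item_id):
        self.item_id = item_id
        self.values = {}

    def from_storage(self):
        FakeItem.loads += 1
        self.values = {"label": f"item {self.item_id}"}


class Collection:
    """Same caching pattern as WikidataCollection, built on FakeItem."""

    def __init__(self):
        self.items = WeakValueDictionary()
        self.ids = set()

    def set_item(self, item_id, item):
        self.items[item_id] = item
        self.ids.add(item_id)

    def get(self, item_id):
        if item_id not in self.ids:
            raise ValueError("unknown item")
        try:
            return self.items[item_id]
        except KeyError:
            # evicted from the cache: rebuild from the simulated storage
            item = FakeItem(item_id)
            item.from_storage()
            self.items[item_id] = item
            return item

    def iter_items(self):
        for item_id in self.ids:
            yield self.get(item_id)


collection = Collection()
for n in range(3):
    item = FakeItem(f"Q{n}")
    item.from_storage()
    collection.set_item(f"Q{n}", item)
del item      # the loop variable held the last strong reference
gc.collect()  # not required on CPython, harmless elsewhere

ids_count = len(collection.ids)       # every ID is remembered
cached_count = len(collection.items)  # nothing keeps the items alive
loads_before = FakeItem.loads
labels = sorted(i.values["label"] for i in collection.iter_items())
loads_after = FakeItem.loads          # each item was read again

print(ids_count, cached_count, loads_after - loads_before, labels)
```

The trade-off is visible here: memory is released as soon as nobody uses an item, at the price of one storage read each time an evicted item is requested again.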

Category: how to Tagged: python cache weakref


Tkinter and Asyncio

Thu 18 February 2021
Asynchronous process results waiting (Photo credit: Wikipedia)

Graphical interfaces are typical candidates for asynchronous programming, as a GUI spends a lot of time waiting for user input.

Tkinter (https://docs.python.org/3/library/tkinter.html#module-tkinter) is a kind of standard for …

Category: how to Tagged: python asyncio

Read More

Travis setup

Tue 12 May 2020
One job in continuous integration pipeline (Photo credit: Wikipedia)

The goal is to set up a CI pipeline based on Travis, with external dependencies, integrated with a GitHub repository.

Travis basics

To enable Travis integration in GitHub, one must edit the ./.travis.yml file.

I won't go into detail. The setup is …

Category: how to Tagged: travis ci how to

Read More

Wikidata crawling

Sun 26 April 2020
Graph database representation (Photo credit: Wikipedia)

I wish to have reliable data about vehicles. I decided to rely on one large source, namely Wikipedia. I chose it because it is reviewable and most of the time reviewed, and regularly updated and completed.

Wikipedia - Wikidata relationship

Wikidata items are made to …

Category: how to Tagged: python wikipedia wikidata html

Read More

awesome global shortcut

Mon 04 January 2016
Multimedia keyboard (Photo credit: Wikipedia)

The awesome window manager does not provide a GUI configuration tool.

Here is a little how-to for providing a feature through a global shortcut, illustrated with volume control.

Defining and identifying the feature and the shortcut

The wanted feature is usually accessible via the CLI. For …

Category: how to Tagged: alsa ArchLinux awesome Configuration file FAQs Help and Tutorials Unix window manager tools unix-like

Read More