How do I clear the cache created by the @cached_property decorator?
I have a property called "value" that performs a heavy calculation.
The result is always the same as long as the dataset for the identifier has not changed.
Once the dataset changes for some identifier, I want to clear the cache and let the function calculate the value again.
You can better understand what I mean by looking at this code:
```python
from functools import cached_property

class Test:
    identifiers = {}
    dataset = ...  # an empty object of the dataset type

    def __init__(self, identifier, ...):
        self.identifier = identifier
        ...
        Test.identifiers[identifier] = self
        ...

    @cached_property
    def value(self):
        result = None
        # heavy calculation based on dataset
        return result

    @classmethod
    def get(cls, identifier):
        if identifier in cls.identifiers:
            return cls.identifiers[identifier]
        else:
            return cls(identifier, ...)

    @classmethod
    def update(cls, dataset):
        for block in dataset:
            # assume there is block['identifier'] in each block
            # here I want to clear the cache of the value() function
            instance = cls.get(block['identifier'])
            # clear the @cached_property of instance
            cls.dataset.append(block)
```
Solution 1:[1]
As you can read in the CPython source, the value of a cached_property in Python 3.8 is stored in an instance attribute of the same name. This is not documented, so it may be an implementation detail that you should not rely upon.
But if you just want to get it done without regard to compatibility, you can remove the cache with `del instance.value`. And who knows, maybe the current behavior will be documented in the future, making it safe to use in any version or interpreter implementation.
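A minimal sketch of the `del` approach (the class and attribute names here are illustrative, not from the question; the trick relies on the undocumented same-name instance attribute described above):

```python
from functools import cached_property

class Report:
    """Illustrative class: 'total' stands in for the heavy computation."""
    def __init__(self, data):
        self.data = data
        self.computations = 0  # track how often the heavy path actually runs

    @cached_property
    def total(self):
        self.computations += 1
        return sum(self.data)

r = Report([1, 2, 3])
assert r.total == 6   # computed and cached
assert r.total == 6   # served from the cache, no recomputation
r.data.append(4)
del r.total           # evict the cached value from the instance dict
assert r.total == 10  # recomputed against the new data
assert r.computations == 2
```

Note that `del instance.value` raises AttributeError if the value has not been cached yet, so only delete after a first access (or catch the exception).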
Solution 2:[2]
(Additional to @Blckknght's answer)
In case you have a mutable object and need to refresh every @cached_property (because the object has been mutated), you can delete the properties that are already cached from the self.__dict__ dictionary (that's where the cached values are stored):
```python
from functools import cached_property
from typing import List

class Test:
    datalist: List[int]

    @cached_property
    def value(self):
        result = None
        # heavy calculation based on datalist
        return result

    def add_element(self, new: int) -> None:
        # clear the cache if the value has already been calculated
        self.__dict__.pop('value', None)  # deletes the cached value if present, otherwise does nothing
        self.datalist.append(new)
```
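A quick self-contained check of this pattern (with an illustrative `__init__` added, since the snippet above annotates `datalist` but never initialises it):

```python
from functools import cached_property
from typing import List

class Numbers:
    """Illustrative stand-in for the Test class above."""
    def __init__(self) -> None:
        self.datalist: List[int] = []

    @cached_property
    def value(self) -> int:
        return sum(self.datalist)

    def add_element(self, new: int) -> None:
        self.__dict__.pop('value', None)  # drop the stale cached value, if any
        self.datalist.append(new)

n = Numbers()
n.add_element(2)
assert n.value == 2  # computed and cached
n.add_element(3)     # pops the cache before mutating
assert n.value == 5  # recomputed from the updated list
```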
Or, if you want to do it more elegantly, you can override the __setattr__ method:
```python
from functools import cached_property
from typing import List

class Test:
    datalist: List[int]

    @cached_property
    def value(self):
        result = None
        # heavy calculation based on datalist
        return result

    def __setattr__(self, name, val):
        self.__dict__[name] = val
        # invalidate the cached value on every attribute assignment
        self.__dict__.pop('value', None)
```
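A self-contained sketch of the `__setattr__` variant (names are illustrative), showing the cache being invalidated whenever any attribute is assigned. This works because cached_property writes its result into the instance dict directly, bypassing `__setattr__`, so the cached value is not popped at caching time:

```python
from functools import cached_property

class Stats:
    """Illustrative: any attribute assignment invalidates the cached 'total'."""
    def __init__(self, values):
        self.values = values  # goes through __setattr__ below

    @cached_property
    def total(self):
        return sum(self.values)

    def __setattr__(self, name, val):
        self.__dict__[name] = val
        # drop the cached value, if any, on every attribute write
        self.__dict__.pop('total', None)

s = Stats([1, 2, 3])
assert s.total == 6   # computed and cached
s.values = [10, 20]   # assignment clears the cache via __setattr__
assert s.total == 30  # recomputed
```

The trade-off is that every attribute write pays the invalidation cost, even writes that do not affect the cached computation.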
Solution 3:[3]
I offer an alternative approach, which might be useful in some cases.
If the type of the dataset you need to run the computation on is hashable, you can use the regular functools.cache or lru_cache decorator, applied to a static method that takes the dataset as input.
Here is an example of what I mean:
```python
from functools import lru_cache

class MyClass():
    def __init__(self, data):
        self.data = data

    @property
    def slow_attribute(self):
        return self._slow_attribute(self.data)

    @staticmethod
    @lru_cache
    def _slow_attribute(data):
        # long computation, using data;
        # here is just an example
        return sum(data)
```
Here there is no need to concern yourself with when to clear the cache: if the underlying dataset changes, the static method automatically knows it cannot use the cached value anymore.
This has the additional perk that, if the dataset is restored to a previously used state, the lookup may still be able to use a cached value.
Here is a demo of the code above working:
```python
from time import perf_counter_ns

def print_time_and_value_of_computation(c):
    t1 = perf_counter_ns()
    val = c.slow_attribute
    t2 = perf_counter_ns()
    print(f'Time taken: {(t2 - t1)/1000} microseconds')
    print(f'Value: {val}')

c = MyClass(range(10_000))
print_time_and_value_of_computation(c)
print_time_and_value_of_computation(c)

print('Changing the dataset!')
c.data = range(20_000)
print_time_and_value_of_computation(c)
print_time_and_value_of_computation(c)

print('Going back to the original dataset!')
c.data = range(10_000)
print_time_and_value_of_computation(c)
```
which returns:
```
Time taken: 162.074 microseconds
Value: 49995000
Time taken: 2.152 microseconds
Value: 49995000
Changing the dataset!
Time taken: 264.121 microseconds
Value: 199990000
Time taken: 1.989 microseconds
Value: 199990000
Going back to the original dataset!
Time taken: 1.144 microseconds
Value: 49995000
```
Solution 4:[4]
I ran across this problem and found this thread while trying to solve it. The data in my case is effectively immutable, except that setting up the object sometimes involves using the properties, leaving them out of date after setup. @Pablo's answer was helpful, but I wanted that process to dynamically reset everything cached.
Here's a generic example:
Setup and broken thing:
```python
from functools import cached_property

class BaseThing:
    def __init__(self, *starting_numbers: int):
        self.numbers = []
        self.numbers.extend(starting_numbers)

    @property
    def numbers_as_strings(self) -> dict[int, str]:
        """This property method will be referenced repeatedly"""

    def process_arbitrary_numbers(self, *arbitrary_numbers: int) -> list[str]:
        return [self.numbers_as_strings.get(number) for number in arbitrary_numbers]

    def extend_numbers(self, *additional_numbers: int):
        self.numbers.extend(additional_numbers)

class BrokenThing(BaseThing):
    @cached_property
    def numbers_as_strings(self) -> dict[int, str]:
        print("Working on:", " ".join(map(str, self.numbers)))
        return {number: str(number) for number in self.numbers}
```
output:
```
>>> thing = BrokenThing(1, 2, 3, 4)
>>> thing.process_arbitrary_numbers(1, 3) == ["1", "3"]
Working on: 1 2 3 4
True
>>> thing.extend_numbers(4, 5, 6)
>>> thing.process_arbitrary_numbers(5, 6) == ["5", "6"]
False
```
@cached_property replaced with @property to make it work, leaving it inefficient:
```python
class InefficientThing(BaseThing):
    @property
    def numbers_as_strings(self) -> dict[int, str]:
        print("Working on:", " ".join(map(str, self.numbers)))
        return {number: str(number) for number in self.numbers}
```
output:
```
>>> thing = InefficientThing(1, 2, 3)
>>> thing.process_arbitrary_numbers(1, 3) == ["1", "3"]
Working on: 1 2 3
Working on: 1 2 3
True
>>> thing.extend_numbers(4, 5, 6)
>>> thing.process_arbitrary_numbers(5, 6) == ["5", "6"]
Working on: 1 2 3 4 5 6
Working on: 1 2 3 4 5 6
True
```
Solution:
```python
class EfficientThing(BaseThing):
    def _clear_cached_properties(self):
        for name in dir(type(self)):
            if isinstance(getattr(type(self), name), cached_property):
                print(f"Clearing self.{name}")
                vars(self).pop(name, None)

    def extend_numbers(self, *additional_numbers: int):
        self._clear_cached_properties()
        return super().extend_numbers(*additional_numbers)

    @cached_property
    def numbers_as_strings(self) -> dict[int, str]:
        print("Working on:", " ".join(map(str, self.numbers)))
        return {number: str(number) for number in self.numbers}
```
output:
```
>>> thing = EfficientThing(1, 2, 3, 4)
>>> thing.process_arbitrary_numbers(1, 3) == ["1", "3"]
Working on: 1 2 3 4
True
>>> thing.extend_numbers(4, 5, 6)
Clearing self.numbers_as_strings
>>> thing.process_arbitrary_numbers(5, 6) == ["5", "6"]
Working on: 1 2 3 4 4 5 6
True
```
This loops through all attributes of the object's class (including inherited ones, since dir covers parent classes). If an attribute is an instance of cached_property, it is popped from the instance dictionary. None is passed to pop as a default, in case the property hadn't been cached yet.
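The same idea can be packaged as a standalone helper, decoupled from any one class. This is a sketch under the same assumption as above (cached values live in the instance dict under the property's name); the class and names are illustrative:

```python
from functools import cached_property

def clear_cached_properties(instance) -> None:
    """Evict every cached_property value stored on this instance."""
    cls = type(instance)
    for name in dir(cls):
        # accessing a cached_property on the class returns the descriptor itself
        if isinstance(getattr(cls, name), cached_property):
            vars(instance).pop(name, None)  # no-op if not yet cached

class Square:
    def __init__(self, side):
        self.side = side

    @cached_property
    def area(self):
        return self.side * self.side

sq = Square(3)
assert sq.area == 9           # computed and cached
sq.side = 4                   # mutation; cache is now stale
clear_cached_properties(sq)   # invalidate all cached_property values
assert sq.area == 16          # recomputed
```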
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Blckknght |
| Solution 2 | |
| Solution 3 | Jacopo Tissino |
| Solution 4 | Richard Dodson |