ApifyCacheStorage
Index
Methods
__init__
Parameters
settings: BaseSettings
Returns None
close_spider
Close the cache storage for a spider.
Runs a best-effort cleanup sweep that deletes expired entries when expiration is enabled, then shuts down the background event-loop thread. The thread is always closed, even if the sweep fails.
Parameters
_: Spider
The spider being closed. Part of Scrapy's storage interface, unused here.
optional current_time: int | None = None
Unix time in seconds used as the current time when deciding which entries have expired. Defaults to the current time.
Returns None
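The cleanup order described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: `SketchStorage`, its `entries` dict, `_sweep`, and `_close_thread` are hypothetical stand-ins, and the expiry rule (an entry is expired once it is older than `expiration_secs`) is an assumption.

```python
import logging
import time
from typing import Dict, Optional

logger = logging.getLogger(__name__)


class SketchStorage:
    """Hypothetical stand-in for illustration; not the real ApifyCacheStorage."""

    def __init__(self, expiration_secs: int) -> None:
        self.expiration_secs = expiration_secs
        # Maps a cache key to the Unix time at which the entry was stored.
        self.entries: Dict[str, int] = {}
        self.thread_closed = False

    def _sweep(self, current_time: int) -> None:
        # Delete every entry older than the expiration window (assumed rule).
        expired = [
            key
            for key, stored_at in self.entries.items()
            if current_time - stored_at > self.expiration_secs
        ]
        for key in expired:
            del self.entries[key]

    def _close_thread(self) -> None:
        # Placeholder for shutting down the background event-loop thread.
        self.thread_closed = True

    def close_spider(self, _spider: object, current_time: Optional[int] = None) -> None:
        if current_time is None:
            current_time = int(time.time())
        try:
            # Only sweep when expiration is enabled.
            if self.expiration_secs > 0:
                self._sweep(current_time)
        except Exception:
            # Best effort: a failed sweep is logged, not raised.
            logger.exception("Cache cleanup sweep failed")
        finally:
            # Runs whether or not the sweep succeeded.
            self._close_thread()
```

Passing an explicit `current_time` makes the sweep deterministic, which is convenient in tests.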
open_spider
Open the cache storage for a spider.
Starts the background event-loop thread and opens the spider's key-value store. If opening the store fails, the freshly started thread is closed so it is not leaked.
Parameters
spider: Spider
The spider the cache storage is being opened for.
Returns None
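The start-then-open sequence, with cleanup on failure, might look roughly like the sketch below. Everything here is an assumption made for illustration: `LoopThread`, `open_store`, and this free-standing `open_spider` function are hypothetical names, and a plain dict stands in for the key-value store.

```python
import asyncio
import threading


class LoopThread:
    """Hypothetical helper that runs an asyncio event loop in a daemon thread."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)

    def start(self) -> None:
        self.thread.start()

    def run(self, coro):
        # Schedule the coroutine on the background loop and block for its result.
        future = asyncio.run_coroutine_threadsafe(coro, self.loop)
        return future.result(timeout=10)

    def close(self) -> None:
        # Ask the loop to stop from the calling thread, then wait for the thread.
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join(timeout=10)
        self.loop.close()


async def open_store(name: str, fail: bool = False) -> dict:
    # Stand-in for opening a key-value store; `fail` simulates an error.
    if fail:
        raise RuntimeError("could not open store")
    return {"name": name}


def open_spider(spider_name: str, fail: bool = False):
    worker = LoopThread()
    worker.start()
    try:
        store = worker.run(open_store(spider_name, fail))
    except Exception:
        # Opening failed: close the freshly started thread so it is not leaked.
        worker.close()
        raise
    return worker, store
```

Re-raising after the cleanup keeps the original error visible to the caller while still releasing the thread.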
retrieve_response
Retrieve a cached response for a request.
A malformed, legacy, or expired cache entry is treated as a miss, so Scrapy re-fetches the request and re-stores it in the current format.
Parameters
_: Spider
The spider making the request. Part of Scrapy's storage interface, unused here.
request: Request
The request to look up in the cache.
optional current_time: int | None = None
Unix time in seconds used as the current time when checking whether the entry has expired. Defaults to the current time.
Returns Response | None
The cached response on a hit, or None on a miss, an expired entry, or an unreadable entry.
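The "treat anything unusable as a miss" rule can be illustrated with a small sketch. The entry layout used here (a dict with `stored_at` and `body` keys) and the `lookup` function are hypothetical and are not the format the storage actually uses. The expiry check assumes that an expiration of 0 disables expiry, as with Scrapy's `HTTPCACHE_EXPIRATION_SECS`.

```python
import time
from typing import Optional


def lookup(
    entry: object,
    expiration_secs: int,
    current_time: Optional[int] = None,
) -> Optional[bytes]:
    """Return the cached body, or None if the entry should count as a miss."""
    if current_time is None:
        current_time = int(time.time())

    # Missing, malformed, or legacy entries: anything that is not the expected dict.
    if not isinstance(entry, dict):
        return None
    stored_at = entry.get("stored_at")
    body = entry.get("body")
    if not isinstance(stored_at, int) or not isinstance(body, bytes):
        return None

    # Expired entries count as a miss when expiration is enabled.
    if expiration_secs > 0 and current_time - stored_at > expiration_secs:
        return None

    return body
```

Because every failure path returns `None`, the caller sees a miss and simply re-fetches the request; the fresh response then overwrites the unusable entry.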
store_response
Store a response in the cache storage.
Parameters
_: Spider
The spider that produced the response. Part of Scrapy's storage interface, unused here.
request: Request
The request the response belongs to. Its fingerprint is used as the cache key.
response: Response
The response to store in the cache.
Returns None
A Scrapy cache storage that uses the Apify KeyValueStore to store responses.
It can be set as the storage for Scrapy's built-in HttpCacheMiddleware, which caches responses to requests. See the HTTPCache middleware settings (prefixed with HTTPCACHE_) in the Scrapy documentation for more information. Requires the asyncio Twisted reactor to be installed.
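A project's Scrapy settings might enable this storage roughly as shown below. The setting names are Scrapy's own, but the import path given for `HTTPCACHE_STORAGE` is an assumption and should be checked against the installed package. The expiration value is an arbitrary example.

```python
# settings.py (sketch)

# Turn on Scrapy's built-in HTTP cache middleware.
HTTPCACHE_ENABLED = True

# Assumed import path for the storage class; verify it for your installed version.
HTTPCACHE_STORAGE = "apify.scrapy.extensions.ApifyCacheStorage"

# Example value: entries older than one hour are treated as expired.
HTTPCACHE_EXPIRATION_SECS = 3600

# Install the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"
```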