scrachy.http_.CachedResponseMixin
- class scrachy.http_.CachedResponseMixin(scrape_timestamp: datetime | None = None, extracted_text: str | None = None, body_length: int | None = None, extracted_text_length: int | None = None, scrape_history: list[ScrapeHistory] | None = None, *args, **kwargs)
Bases: object
A mixin for subclasses of
scrapy.http.HttpResponse
that carries a subset of the extra information stored in the cache.
- Parameters:
scrape_timestamp – The most recent date the request was scraped.
extracted_text – The text extracted from the HTML.
body_length – The total number of bytes of the downloaded HTML.
extracted_text_length – The number of bytes in the extracted plain text.
scrape_history – A list of ScrapeHistory records.
- __init__(scrape_timestamp: datetime | None = None, extracted_text: str | None = None, body_length: int | None = None, extracted_text_length: int | None = None, scrape_history: list[ScrapeHistory] | None = None, *args, **kwargs)
Initializes the mixin with a subset of the extra information stored in the cache.
- Parameters:
scrape_timestamp – The most recent date the request was scraped.
extracted_text – The text extracted from the HTML.
body_length – The total number of bytes of the downloaded HTML.
extracted_text_length – The number of bytes in the extracted plain text.
scrape_history – A list of ScrapeHistory records.
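The constructor signature suggests the usual cooperative mixin pattern: the cache fields are consumed by the mixin and everything else is passed on to the response class it is combined with. The sketch below illustrates that pattern only. It does not import scrachy or scrapy; `BaseResponseStub`, `CachedFieldsMixin`, `ScrapeHistoryStub`, and `CachedPage` are hypothetical stand-ins, and both the forwarding of `*args, **kwargs` and the storing of each argument as a same-named attribute are assumptions rather than documented behavior.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ScrapeHistoryStub:
    """Hypothetical stand-in for ScrapeHistory; the real fields are not documented here."""
    scrape_timestamp: datetime


class BaseResponseStub:
    """Hypothetical stand-in for a response class that takes a URL and a body."""

    def __init__(self, url: str, body: bytes = b""):
        self.url = url
        self.body = body


class CachedFieldsMixin:
    """Stand-in mirroring the documented signature of CachedResponseMixin."""

    def __init__(
        self,
        scrape_timestamp: Optional[datetime] = None,
        extracted_text: Optional[str] = None,
        body_length: Optional[int] = None,
        extracted_text_length: Optional[int] = None,
        scrape_history: Optional[List[ScrapeHistoryStub]] = None,
        *args,
        **kwargs,
    ):
        # Assumption: the remaining arguments go to the next class in the MRO.
        super().__init__(*args, **kwargs)
        # Assumption: each cache field is kept as an attribute of the same name.
        self.scrape_timestamp = scrape_timestamp
        self.extracted_text = extracted_text
        self.body_length = body_length
        self.extracted_text_length = extracted_text_length
        self.scrape_history = scrape_history


class CachedPage(CachedFieldsMixin, BaseResponseStub):
    """The mixin is listed first so that its __init__ runs before the base class."""


body = b"<p>hi</p>"
text = "hi"
scraped_at = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Everything is passed by keyword: the cache fields come first in the
# signature, so positional arguments would be bound to them instead of url.
page = CachedPage(
    url="https://example.com",
    body=body,
    scrape_timestamp=scraped_at,
    extracted_text=text,
    body_length=len(body),
    extracted_text_length=len(text.encode("utf-8")),
    scrape_history=[ScrapeHistoryStub(scrape_timestamp=scraped_at)],
)
```

Because the cache parameters precede `*args` in the signature, keyword arguments are the safer way to call a class built this way; a positional URL would be bound to `scrape_timestamp`.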
Methods
__init__([scrape_timestamp, extracted_text, ...]) – Initializes the mixin with a subset of the extra information stored in the cache.