scrachy.http_.CachedXmlResponse
- class scrachy.http_.CachedXmlResponse(*args: Any, **kwargs: Any)
Bases: CachedResponseMixin, XmlResponse
A subclass of scrapy.http.HttpResponse that contains a subset of the extra information stored in the cache.
- Parameters:
scrape_timestamp – The most recent date the request was scraped.
body_number_of_bytes – The total number of bytes of the downloaded HTML.
text_number_of_bytes – The number of bytes in the extracted plain text.
body_text – The text extracted from the HTML.
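The two size parameters measure different things: body_number_of_bytes describes the raw downloaded payload, while text_number_of_bytes describes only the extracted plain text. The sketch below illustrates that distinction with the standard library alone. `extract_text` is a hypothetical helper introduced for illustration; how scrachy actually performs the extraction is not specified here.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect the character data found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.chunks.append(stripped)


def extract_text(body, encoding="utf-8"):
    """Hypothetical stand-in for the text extraction step."""
    parser = _TextCollector()
    parser.feed(body.decode(encoding))
    parser.close()
    return " ".join(parser.chunks)


body = b"<html><body><h1>Title</h1><p>Hello, world.</p></body></html>"
body_text = extract_text(body)

body_number_of_bytes = len(body)                       # size of the raw payload
text_number_of_bytes = len(body_text.encode("utf-8"))  # size of the text only
```

Because markup is discarded during extraction, the text size is normally smaller than the body size.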
- __init__(*args, **kwargs)
A subclass of scrapy.http.HttpResponse that contains a subset of the extra information stored in the cache.
- Parameters:
scrape_timestamp – The most recent date the request was scraped.
body_number_of_bytes – The total number of bytes of the downloaded HTML.
text_number_of_bytes – The number of bytes in the extracted plain text.
body_text – The text extracted from the HTML.
Methods
__init__(*args, **kwargs) – A subclass of scrapy.http.HttpResponse that contains a subset of the extra information stored in the cache.
copy() – Return a copy of this Response.
css(query) – Shortcut method implemented only by responses whose content is text (subclasses of TextResponse).
follow(url[, callback, method, headers, ...]) – Return a Request instance to follow a link url.
follow_all([urls, callback, method, ...]) – A generator that produces Request instances to follow all links in urls.
jmespath(query, **kwargs) – Shortcut method implemented only by responses whose content is text (subclasses of TextResponse).
json() – New in version 2.2. Deserialize a JSON document to a Python object.
replace(*args, **kwargs) – Create a new Response with the same attributes except for those given new values.
urljoin(url) – Join this Response's url with a possible relative url to form an absolute interpretation of the latter.
xpath(query, **kwargs) – Shortcut method implemented only by responses whose content is text (subclasses of TextResponse).
Attributes
attributes – A tuple of str objects containing the names of all public attributes of the class that are also keyword parameters of the __init__ method.
body
cb_kwargs
encoding
meta
selector
text – Body as unicode
url
- attributes: Tuple[str, ...] = ('url', 'status', 'headers', 'body', 'flags', 'request', 'certificate', 'ip_address', 'protocol', 'encoding')
A tuple of str objects containing the names of all public attributes of the class that are also keyword parameters of the __init__ method.
Currently used by Response.replace().
- copy()
Return a copy of this Response
- css(query)
Shortcut method implemented only by responses whose content is text (subclasses of TextResponse).
- follow(url, callback=None, method='GET', headers=None, body=None, cookies=None, meta=None, encoding=None, priority=0, dont_filter=False, errback=None, cb_kwargs=None, flags=None) → Request
Return a Request instance to follow a link url. It accepts the same arguments as the Request.__init__ method, but url can be not only an absolute URL, but also:
  - a relative URL
  - a Link object, e.g. the result of Link Extractors
  - a Selector object for a <link> or <a> element, e.g. response.css('a.my_link')[0]
  - an attribute Selector (not SelectorList), e.g. response.css('a::attr(href)')[0] or response.xpath('//img/@src')[0]
See "A shortcut for creating Requests" for usage examples.
- follow_all(urls=None, callback=None, method='GET', headers=None, body=None, cookies=None, meta=None, encoding=None, priority=0, dont_filter=False, errback=None, cb_kwargs=None, flags=None, css=None, xpath=None) → Generator[Request, None, None]
A generator that produces Request instances to follow all links in urls. It accepts the same arguments as the Request's __init__ method, except that each urls element does not need to be an absolute URL; it can be any of the following:
  - a relative URL
  - a Link object, e.g. the result of Link Extractors
  - a Selector object for a <link> or <a> element, e.g. response.css('a.my_link')[0]
  - an attribute Selector (not SelectorList), e.g. response.css('a::attr(href)')[0] or response.xpath('//img/@src')[0]
In addition, css and xpath arguments are accepted to perform the link extraction within the follow_all method (only one of urls, css and xpath is accepted).
Note that when passing a SelectorList as argument for the urls parameter or using the css or xpath parameters, this method will not produce requests for selectors from which links cannot be obtained (for instance, anchor tags without an href attribute).
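The skipping rule described in the note can be illustrated without Scrapy. The following is a standard-library sketch, not the library's implementation: `iter_followable_urls` and `_AnchorCollector` are hypothetical names, and plain absolute URLs stand in for the Request objects that follow_all would produce.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _AnchorCollector(HTMLParser):
    """Record the href of every <a> element, or None when it has none."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href"))


def iter_followable_urls(base_url, html):
    """Yield an absolute URL for each anchor that carries an href."""
    collector = _AnchorCollector()
    collector.feed(html)
    collector.close()
    for href in collector.hrefs:
        if href is None:
            continue  # no link can be obtained, so nothing is produced
        yield urljoin(base_url, href)


page = '<a href="/a">A</a><a name="top">no link</a><a href="b.html">B</a>'
urls = list(iter_followable_urls("https://example.com/dir/index.html", page))
```

Three anchors go in, but only two URLs come out, because the middle anchor has no href attribute.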
- jmespath(query, **kwargs)
Shortcut method implemented only by responses whose content is text (subclasses of TextResponse).
- json()
New in version 2.2.
Deserialize a JSON document to a Python object.
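As a rough illustration of what deserializing a response body involves, the sketch below uses the standard json module directly on a bytes payload. The payload is made up for the example, and Scrapy's own json() may do more than this (such as caching the decoded result).

```python
import json

# Hypothetical response body; json.loads accepts bytes as well as str.
body = b'{"name": "scrachy", "tags": ["cache", "scrapy"]}'

data = json.loads(body)
first_tag = data["tags"][0]
```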
- replace(*args, **kwargs)
Create a new Response with the same attributes except for those given new values
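The relationship between the attributes tuple and replace() can be sketched with a small stand-in class. `MiniResponse` is hypothetical and far simpler than a real response class; it only shows the general pattern of copying every listed attribute unless the caller overrides it.

```python
class MiniResponse:
    """Hypothetical, simplified stand-in for a response class."""

    # Names that are both public attributes and __init__ keyword parameters.
    attributes = ("url", "status", "body")

    def __init__(self, url, status=200, body=b""):
        self.url = url
        self.status = status
        self.body = body

    def replace(self, **kwargs):
        # Fill in every attribute the caller did not override.
        for name in self.attributes:
            kwargs.setdefault(name, getattr(self, name))
        return type(self)(**kwargs)

    def copy(self):
        # A copy is a replace with nothing overridden.
        return self.replace()


original = MiniResponse("https://example.com/feed.xml", body=b"<rss/>")
updated = original.replace(status=404)
```

The original object is left untouched; `updated` is a new instance that differs only in `status`.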
- property text: str
Body as unicode
- urljoin(url)
Join this Response’s url with a possible relative url to form an absolute interpretation of the latter.
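The resolution rule can be tried out with urllib.parse.urljoin from the standard library, using a made-up response URL as the base. This is only a sketch of the joining behaviour; Scrapy's implementation for text responses may additionally honor a `<base>` element in the document, which is ignored here.

```python
from urllib.parse import urljoin

# Hypothetical URL of the response being processed.
response_url = "https://example.com/catalog/page1.html"

relative = urljoin(response_url, "item?id=3")            # relative to the page
root_relative = urljoin(response_url, "/about")          # relative to the host
absolute = urljoin(response_url, "https://other.example/x")  # already absolute
```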
- xpath(query, **kwargs)
Shortcut method implemented only by responses whose content is text (subclasses of TextResponse).
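To show the kind of query an XML response is typically asked, here is a standard-library sketch using xml.etree.ElementTree on a made-up document. ElementTree supports only a subset of XPath and returns elements rather than a SelectorList, so this is an approximation of the idea and not a substitute for response.xpath().

```python
import xml.etree.ElementTree as ET

# Hypothetical XML payload.
xml_body = b"""<catalog>
  <book id="b1"><title>First</title></book>
  <book id="b2"><title>Second</title></book>
</catalog>"""

root = ET.fromstring(xml_body)

# All titles, in document order.
titles = [el.text for el in root.findall(".//book/title")]

# The title of the book with a specific id attribute.
second = root.find(".//book[@id='b2']/title").text
```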