Suppose the body element declares a default namespace, like this:
<body xmlns="http://www.w3.org/1999/xhtml">
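To select body with response.xpath, first register a prefix for that namespace on response.selector, then use the prefix in the XPath expression: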
response.selector.register_namespace('w', 'http://www.w3.org/1999/xhtml')
body = response.xpath('//w:body').extract()
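The response.selector used above behaves like scrapy.selector.XmlXPathSelector (a legacy class, deprecated in later Scrapy releases in favor of scrapy.selector.Selector), so the following is equivalent: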
from scrapy.selector import XmlXPathSelector
x = XmlXPathSelector(response)
x.register_namespace('g', 'http://www.w3.org/1999/xhtml')
x.select('//g:body')
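
Why the prefix is needed at all can be checked without Scrapy. The sketch below is only an illustration and uses Python's standard-library xml.etree.ElementTree rather than the Scrapy selector API. The sample document and the variable names are made up for the example. It shows that an unprefixed path finds nothing once a default namespace is declared, while a path using a prefix mapped to the same URI does match.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample document; the default namespace applies to html, body and p.
doc = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>hello</p></body></html>'
root = ET.fromstring(doc)

# An unprefixed name only matches elements in no namespace, so nothing is found.
without_prefix = root.findall(".//body")

# Mapping a prefix (any name works, here "w") to the URI makes the path match.
with_prefix = root.findall(".//w:body", namespaces={"w": XHTML_NS})

print(len(without_prefix), len(with_prefix))
print(with_prefix[0].tag)
```

The prefix name itself is arbitrary ('w' and 'g' in the snippets above). What matters is that it is mapped to the exact namespace URI declared in the document.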