Doridori,
You can use the request_httprepr() function (from scrapy.utils.request) to
print the raw HTTP representation of a request. It used to be a method of the
Request objects, but since there's no need for it to be a method I moved it
to a utils function, so make sure you update your Scrapy code first.
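To give you an idea of what that raw representation looks like, here's a
standalone sketch that builds it by hand (the httprepr() helper below is my
own illustration, not Scrapy code; request_httprepr produces output in
roughly this shape):

```python
# Minimal sketch of a raw HTTP request representation, built from a URL.
# This is an illustration only -- Scrapy's request_httprepr does the real work.
from urllib.parse import urlparse

def httprepr(method, url, headers=None, body=b""):
    """Build the raw HTTP request text for a given method/URL/headers/body."""
    parsed = urlparse(url)
    path = parsed.path or "/"
    if parsed.query:
        path += "?" + parsed.query
    lines = ["%s %s HTTP/1.1" % (method, path),
             "Host: %s" % parsed.netloc]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    return "\r\n".join(lines) + "\r\n\r\n" + body.decode()

print(httprepr("GET", "http://example.com/page?id=1"))
```

For a POST, the form parameters would show up in the body part after the
blank line, which is why dumping the raw representation answers the
"capturing the outgoing POST parameters" question.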
Then you can write a simple downloader middleware to print it:
from scrapy.utils.request import request_httprepr

class DumpRawRequestsMiddleware(object):
    def process_request(self, request, spider):
        print request_httprepr(request)
You'd want to put that middleware as close to the downloader as possible,
since some downloader middlewares (like the User-Agent one) modify the
requests. So, for example:
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.DumpRawRequestsMiddleware': 999,
}
I don't think I've understood the "parse out the URL parameter" part.
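If you mean splitting the query string of the request URL into name/value
pairs, the stdlib can do that inside the same middleware (just a guess at
what you're after; url_params() below is my own helper name):

```python
# Guess: "parse out the URL parameter" = split the query string into a dict.
from urllib.parse import urlparse, parse_qs

def url_params(url):
    """Return the query-string parameters of a URL as a dict of lists."""
    return parse_qs(urlparse(url).query)

print(url_params("http://example.com/search?q=scrapy&page=2"))
# -> {'q': ['scrapy'], 'page': ['2']}
```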
Pablo.
On Fri, Jul 24, 2009 at 10:53:09PM -0700, doridori Jo wrote:
> I guess simply put, my question is
>
> How to make the spider's *dump HTTP header*, and parse out the URL parameter
> information ?
>
> On Fri, Jul 24, 2009 at 9:24 PM, doridori Jo <
dori...@gmail.com> wrote:
>
> > yes, but im looking to see if this can be accomplished via scrapy. just
> > capturing the outgoing POST parameters.
> >
> > 2009/7/24 Aníbal Pacheco <
apach...@gmail.com>