Depending on available memory and the size of your request objects, you can typically hold anywhere from tens of thousands to tens of millions of requests in memory. As a rough back-of-the-envelope: if each pending request (URL, headers, callback references, meta) takes on the order of a kilobyte, a million of them costs roughly a gigabyte of RAM. If you are worried about memory pressure, it's probably wise to test unless your numbers are clearly beyond that range.
The latest Scrapy has support for serializing pending requests to disk during the crawl, which reduces memory usage when the number of outstanding requests is large. See http://scrapy.readthedocs.org/en/latest/topics/jobs.html for more details.
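As a minimal sketch of enabling it: you point the `JOBDIR` setting at a directory (the directory name below is just a placeholder) and Scrapy will keep the request queue on disk there.

```python
# settings.py -- enable Scrapy's persistent, on-disk request queue.
# The directory name is a placeholder; use any writable path.
JOBDIR = 'crawls/myspider-1'

# Equivalent one-off invocation from the command line:
#   scrapy crawl myspider -s JOBDIR=crawls/myspider-1
```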
As Martin mentioned, scrapy-redis is another option. It's particularly useful if you want to share state between more than one scrapy process.
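Wiring scrapy-redis in is mostly a matter of settings; roughly along these lines (check the scrapy-redis README for the current setting names, and note the Redis URL here is a placeholder):

```python
# settings.py -- route scheduling and duplicate filtering through Redis,
# so several Scrapy processes can share a single request queue.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the Redis-backed queue between runs instead of clearing it.
SCHEDULER_PERSIST = True

# Placeholder connection string; point this at your Redis instance.
REDIS_URL = "redis://localhost:6379"
```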
If you do want to implement something yourself, it's worth reading the scrapy-redis code and the persistent job-state support, as they are good examples of how to hook into Scrapy to manage the request queue.
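To give a feel for what that hooking-in looks like, here is a bare-bones skeleton of the scheduler interface you would register via the `SCHEDULER` setting. The class name and module path are made up, the in-memory deque is only a stand-in for whatever disk- or Redis-backed store you'd actually use, and the method names follow the stock scheduler (scrapy/core/scheduler.py), so double-check them against the Scrapy version you run.

```python
# myproject/scheduler.py -- illustrative skeleton, not a drop-in implementation.
# Enable with:  SCHEDULER = "myproject.scheduler.MyScheduler"
from collections import deque


class MyScheduler:
    def __init__(self, stats=None):
        self.queue = deque()        # swap for a persistent backing store
        self.stats = stats

    @classmethod
    def from_crawler(cls, crawler):
        return cls(stats=crawler.stats)

    def open(self, spider):
        self.spider = spider        # called when the spider starts

    def close(self, reason):
        pass                        # flush/persist the backing store here

    def enqueue_request(self, request):
        self.queue.append(request)  # return False here to drop a request
        return True

    def next_request(self):
        return self.queue.popleft() if self.queue else None

    def has_pending_requests(self):
        return len(self.queue) > 0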