I need to do a query that, in its simplest form, would be:
for row in session.query(Message.body).order_by(Message.session_num):
    ...do read-only stuff with row and then discard...
However, as that query will return roughly 20-40 million rows per time,
I'm guessing the above may not be what I need.
What should I use?
cheers,
Chris
--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
> Hi All,
>
> I need to do a query that, in its simplest form, would be:
>
> for row in session.query(Message.body).order_by(Message.session_num):
>     ...do read-only stuff with row and then discard...
>
> However, as that query will return roughly 20-40 million rows per time, I'm guessing the above may not be what I need.
>
> What should I use?
It's a very simple query, so if you were using psycopg2, query(...).yield_per(1000) would pull rows over in chunks of 1000 at a time. It should work with Oracle as well. Other DBAPIs more or less buffer the entire result set at the moment, though some may have ways around that (I have a vague recollection that OurSQL does).
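In DBAPI terms, the chunking that yield_per provides amounts to pulling rows with fetchmany() in batches instead of materializing everything at once. A minimal sketch of that pattern using the stdlib sqlite3 module (the message table and column names here just mirror the original question, and SQLite is only a stand-in for demo purposes):

```python
import sqlite3

# Demo database; table/column names mirror the question, data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (session_num INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO message VALUES (?, ?)",
    [(i, "body-%d" % i) for i in range(10000)],
)

cur = conn.execute("SELECT body FROM message ORDER BY session_num")
total = 0
while True:
    # fetchmany() is the DBAPI-level analogue of yield_per's chunking:
    # only up to 1000 rows are materialized in Python at a time.
    rows = cur.fetchmany(1000)
    if not rows:
        break
    for (body,) in rows:
        total += 1  # ...do read-only stuff with the row, then discard...
print(total)  # 10000
```

Whether the DBAPI also avoids buffering the full result server- or client-side depends on the driver, as noted above; fetchmany() only bounds what lands in Python at once.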
The other recipe is the windowed range query, which is what I normally use for this: http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery. That's also a PostgreSQL/Oracle/MSSQL-only thing.
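The linked recipe is SQLAlchemy-based; the underlying idea is to run one query that uses the ROW_NUMBER() window function to pick every Nth value of the ordering column as a boundary, then fetch each window with a plain range query. A rough sketch of that idea in plain SQL via the stdlib sqlite3 module (names invented for the demo; note SQLite only grew window functions in 3.25, and the recipe itself targets PostgreSQL/Oracle/MSSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (session_num INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO message VALUES (?, ?)",
    [(i, "body-%d" % i) for i in range(10000)],
)

window = 1000
# One query computes the window boundaries: every `window`-th value of
# the ordering column, selected via ROW_NUMBER().
starts = [row[0] for row in conn.execute(
    "SELECT session_num FROM "
    "(SELECT session_num, ROW_NUMBER() OVER (ORDER BY session_num) AS rn "
    "FROM message) WHERE rn % ? = 1",
    (window,),
)]

total = 0
# Each window is then a cheap, bounded range query; no OFFSET involved.
for lo, hi in zip(starts, starts[1:] + [None]):
    if hi is None:
        rows = conn.execute(
            "SELECT body FROM message WHERE session_num >= ? "
            "ORDER BY session_num", (lo,))
    else:
        rows = conn.execute(
            "SELECT body FROM message WHERE session_num >= ? "
            "AND session_num < ? ORDER BY session_num", (lo, hi))
    for (body,) in rows:
        total += 1  # ...process and discard...
print(total)  # 10000
```

Because each window query seeks directly on the indexed ordering column, the cost per chunk stays flat no matter how deep into the result set you are.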
Finally, there's just doing LIMIT/OFFSET, though you might find the results slowing down considerably once you start doing OFFSETs of around 30 million.
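For completeness, the LIMIT/OFFSET approach looks like this as a sketch against the same demo sqlite3 table (names invented); the comment marks why it degrades at large offsets:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (session_num INTEGER, body TEXT)")
conn.executemany(
    "INSERT INTO message VALUES (?, ?)",
    [(i, "body-%d" % i) for i in range(10000)],
)

page, offset, total = 1000, 0, 0
while True:
    rows = conn.execute(
        "SELECT body FROM message ORDER BY session_num LIMIT ? OFFSET ?",
        (page, offset),
    ).fetchall()
    if not rows:
        break
    for (body,) in rows:
        total += 1  # ...process and discard...
    # The database still has to walk past `offset` rows on every page,
    # which is why this slows down badly as offset grows into the millions.
    offset += page
print(total)  # 10000
```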
So, this would be the way to go for Microsoft SQL Server?
> On 02/03/2012 15:59, Michael Bayer wrote:
>>
>> the other recipe is the windowed range query which I normally use for this at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery,
>
> So, this would be the way to go for Microsoft SQL Server?
It's what I use with SQL Server, sure.
It's too bad it can't be easier, though, I agree.