iterselect bug?

156 views
Skip to first unread message

Tim Richardson

unread,
Sep 10, 2015, 8:51:43 AM9/10/15
to web2py-developers
I will ticket this if needed.

this example code drops out of the loop after one iteration. 
with the update_or_insert, it iterates over each row. 


Desired behaviour is only possible with select. 

Do I misunderstand something? 


# iterates only once
for p in db(db.ul_product_cache.id>0).iterselect():
    product
= json.loads(p['jsondata'])
    db
.prototypes.update_or_insert(db.prototypes.Prototype ==
                            product
['ProductGroup']['GroupName'],
   
Prototype = product['ProductGroup']['GroupName'],
            origin
='Unleashed')

# iterates over everything
for p in db(db.ul_product_cache.id>0).iterselect():
    product = json.loads(p['jsondata'])

Paolo Valleri

unread,
Sep 11, 2015, 2:21:08 AM9/11/15
to web2py-d...@googlegroups.com
Hi Tim,
I'm not in the condition to run deep tests, however, I guess the problem is related to the execution of the inner query (update_or_insert).
Have a look at https://github.com/web2py/pydal/blob/master/pydal/objects.py#L2714, the next() method calls self.db._adapter._fetchone(), which calls cursor.fetchone()
Essentially, by running the inner query, you are overriding the original cursor with the unexpected result you got.

I'll try to update IterRows, to save locally the original Cursor, but I'm not sure that in the same connection we can execute a new sql operation while keeping the other on 'hold'.


 Paolo

--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tim Richardson

unread,
Sep 11, 2015, 3:16:20 AM9/11/15
to web2py-d...@googlegroups.com
OK, thanks. 
I'll file a ticket so we can keep track of it

You received this message because you are subscribed to a topic in the Google Groups "web2py-developers" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/web2py-developers/kxGnh1qlpQ8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to web2py-develop...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--


Tim Richardson CPA, Director
GrowthPath, Profit Engineers

Mobile: +61 423 091 732
Office: +61 3 8678 1850

GrowthPath Pty Ltd
ABN 76 133 733 963





Tim Richardson

unread,
Sep 11, 2015, 3:24:06 AM9/11/15
to web2py-developers

Paolo Valleri

unread,
Sep 13, 2015, 3:56:30 AM9/13/15
to web2py-developers
Hi all,
Tim's example fails because the cursor used by the iterselect it's overridden by the inner query.
Given that, while looping over an iterator, the inner query requires its own cursor, which means, as far as I know, to establish a new connection.

I started to work on a fix, implementing a kind of stack to keep track of all new connections.
The idea is the following, as soon as the iterselect is over, the connection used by the inner query is closed (or placed in the connection pool) and the application continues with the original one. 
The workflow, somehow works, however, it can lead to really very weird results due to the fact that the inner query work on a different connection. Since 
there is no visibility of the original transaction, everything that hasn't been committed it isn't -visible-.
For example the following example

        db = DAL(DEFAULT_URI, check_reserved=['all'])
        t0 = db.define_table('t0', Field('name'), Field('name_copy'))
        for n in ['web2py', 'pydal', 'Massimo']:
            t0.insert(name=n)

        for r in db(db.t0).iterselect():
            db.t0.update_or_insert(db.t0.id == r.id, name_copy = r.name)

        print (db(db.t0).count())

prints 9, while the "expected" result is 3, due to the fact that the example is executed on two different transactions.

Should we -really- let the user to follow such approach ? 
Is there an alternative approach?

Paolo

On Friday, September 11, 2015 at 9:24:06 AM UTC+2, Tim Richardson wrote:

Paolo Valleri

unread,
Sep 13, 2015, 10:23:44 AM9/13/15
to web2py-d...@googlegroups.com
I found a workaround, according to https://www.python.org/dev/peps/pep-0249/#cursor
we can create as many cursor we want in the same connection; cursors created from the same connection are not isolated, which is want I was looking for.
@Tim, if you want to test it, the working copy is here https://github.com/ilvalle/pydal/tree/fix-iterparse-innerQuery

 Paolo

--

Tim Richardson

unread,
Sep 15, 2015, 4:02:06 AM9/15/15
to web2py-d...@googlegroups.com
It's not working, but with a different problem. I tried it in some existing code, postgresql 9.4 OS X and the binary driver I think.


I'm getting postresql errors like this:


ProgrammingError: ('ERROR', '34000', 'portal "pg8000_portal_973" does not exist')



This happens after an unpredictable number of iterations:

for row in db(db.olap_sales_fact.id>0).iterselect():
 
(cogs_unit_price,source) = get_product_cogs(unleashed_api=unleashed_api,product_code = row.ProductCode,
 warehouse
=row.Warehouse,cache=cost_cache,db=db)
 
if cogs_unit_price:
     
row.update_record(COGS_Central = cogs_unit_price * -1 * row.cost_qty,COGS_Source=source)
     db
.commit()

Michele Comitini

unread,
Sep 15, 2015, 5:03:57 AM9/15/15
to web2py-developers
Seems a dbapi driver bug, can you try with psycopg2?

2015-09-15 10:02 GMT+02:00 Tim Richardson <t...@tim-richardson.net>:
It's not working, but with a different problem. I tried it in some existing code, postgresql 9.4 OS X and the binary driver I think.


I'm getting postresql errors like this:


ProgrammingError: ('ERROR', '34000', 'portal "pg8000_portal_973" does not exist')



This happens after an unpredictable number of iterations:

for row in db(db.olap_sales_fact.id>0).iterselect():
 
(cogs_unit_price,source) = get_product_cogs(unleashed_api=unleashed_api,product_code = row.ProductCode,
 warehouse
=row.Warehouse,cache=cost_cache,db=db)

 
if not cogs_unit_price is None:
Message has been deleted

Tim Richardson

unread,
Sep 15, 2015, 6:41:33 AM9/15/15
to web2py-developers
Sorry, I deleted my first answer because I forgot that I had reverted to master.


Niphlod

unread,
Sep 15, 2015, 6:43:48 AM9/15/15
to web2py-developers
btw: ok for updating while iterselecting, but it seems you're doing another query to get the price...... the point is multiple cursors being open and fetching the relevant resultsets........isn't "happening random" really when you get back a resutlset from your inner query ?

Tim Richardson

unread,
Sep 15, 2015, 6:57:39 AM9/15/15
to web2py-d...@googlegroups.com

On Tue, Sep 15, 2015 at 7:03 PM, Michele Comitini <michele....@gmail.com> wrote:
Seems a dbapi driver bug, can you try with psycopg2?

So it seems. It completes without error when using psycopg2 2.6.1

Tim Richardson

unread,
Sep 15, 2015, 7:09:55 AM9/15/15
to web2py-d...@googlegroups.com

On Tue, Sep 15, 2015 at 8:43 PM, Niphlod <nip...@gmail.com> wrote:
btw: ok for updating while iterselecting, but it seems you're doing another query to get the price...... the point is multiple cursors being open and fetching the relevant resultsets........isn't "happening random" really when you get back a resutlset from your inner query ?

with pg8000 it crashes an unpredictable number of iterations in the loop, but only when it tries to db.commit().
#db.commit() 
results in no crash.
 

Deferring the db.commit() to after the for loop is also ok as far as no crashing. 

Michele Comitini

unread,
Sep 15, 2015, 8:03:39 AM9/15/15
to web2py-developers
there is no good reason to commit every cycle, it's correct to do it at the end.  Mind that the two (dbapi) cursors are in the same connection and they share that same transaction.
If you want the insert command to work on another transaction you would have to use 2 different connections.  I think you can create
2 DAL instances provided that you have migrations off .
db1 = DAL(' postgres://... /db')
db2 = DAL('postgres:// ... /db')


Paolo Valleri

unread,
Sep 15, 2015, 12:25:13 PM9/15/15
to web2py-d...@googlegroups.com
@Tim, thanks for testing. 
After a commit the cursor used by iterselect is no longer fetching rows correctly, if it works on psycopg2 perhaps it's an issue of pg8000. I'd suggest opening an issue on https://github.com/mfenniak/pg8000 to get in touch with the maintainer.
I'll integrate the unitests with an example of you are trying to achieve to test all the other backends.

 Paolo

Paolo Valleri

unread,
Sep 15, 2015, 2:03:37 PM9/15/15
to web2py-developers
The following:
    def testMultiSelectWithCommit(self):
        db = DAL(DEFAULT_URI, check_reserved=['all'])
        t0 = db.define_table('t0', Field('nn', 'integer'))
        for n in xrange(1, 100, 1):
            t0.insert(nn=n)
        db.commit()
        s = db.t0.nn.sum()
        tot = db(db.t0).select(s).first()[s]
        c = 0
        for r in db(db.t0).iterselect(db.t0.ALL):
            db.t0.update_or_insert(db.t0.id == r.id, nn = r.nn * 2)
            db.commit()
            c += 1

        self.assertEqual(c, db(db.t0).count())
        self.assertEqual(tot * 2, db(db.t0).select(s).first()[s])        

        db._adapter.execute_test_query()
        t0.drop()
        db.close()
        return

worked on all backends but sqlite on python > 2.7

Increasing the number of iterations up to 1000 cause pg8000 to raise an exception.

Paolo

 Paolo


details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-developers+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
-- mail from:GoogleGroups "web2py-developers" mailing list

details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-developers+unsubscribe@googlegroups.com.

Niphlod

unread,
Sep 15, 2015, 4:26:21 PM9/15/15
to web2py-developers
uhm, poor sqlite: concurrent writes :D . Did you try enabling WAL ? (db.executesql("PRAGMA journal_mode=WAL;"))

 Paolo


details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
-- mail from:GoogleGroups "web2py-developers" mailing list

details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.

Paolo Valleri

unread,
Sep 16, 2015, 1:59:47 AM9/16/15
to web2py-d...@googlegroups.com
Same result with WAL enabled, iterselect iterates two times more (tested with 100 and 10000 iterations).
I've added a time.sleep(1) between the update_or_insert and the commit, but I got the same result.


 Paolo

Niphlod

unread,
Sep 16, 2015, 4:54:03 AM9/16/15
to web2py-developers
given that sqlite locks while writing, I'd assume you can't read while updating

Paolo Valleri

unread,
Sep 17, 2015, 5:49:24 AM9/17/15
to web2py-d...@googlegroups.com
It's a sqlite issue, the answer I got in https://github.com/ghaering/pysqlite/issues/88 mentions this bug http://bugs.python.org/issue10513


 Paolo

Niphlod

unread,
Sep 18, 2015, 5:19:01 AM9/18/15
to web2py-developers
nice, I use pysqlite all the time :D

Paolo Valleri

unread,
Oct 16, 2015, 1:10:58 PM10/16/15
to web2py-d...@googlegroups.com
Pydal doesn't use at all the "parametric" cursor.execute() which is the reason of the exception. I've to check if it's feasible in pydal to use cursor.execute() as proposed in the comment.

 Paolo

Niphlod

unread,
Oct 16, 2015, 2:16:56 PM10/16/15
to web2py-developers
we're going to. In 2015, it's silly to prepare our own

Paolo Valleri

unread,
Oct 18, 2015, 7:32:48 AM10/18/15
to web2py-d...@googlegroups.com
I see. the first thing is to decide which paramstyle we want to use as default for the base adapter [0].
There isn't a common approach across drivers: pysqlite supports the qmark and named styles [1], psycopg the format and pyformat styles [2], pymysql at least the format style etc.

Paolo

Niphlod

unread,
Oct 19, 2015, 5:22:28 AM10/19/15
to web2py-developers
in theory any dbapi 2.0 compliant interface has a nifty "paramstyle" property. Each adapter should then override with its own. Base should use a template which adapter translates to the paramstyled version before sending commands over the wire.
Reply all
Reply to author
Forward
0 new messages