limitby and grid pagination

111 views
Skip to first unread message

Anthony

unread,
Oct 9, 2013, 9:42:12 AM10/9/13
to web2py-d...@googlegroups.com
See this discussion on the users group about orderby_on_limitby and pagination in the grid.

Michele makes a good point -- outside of the grid, should orderby_on_limitby default to False instead of True (and in that case, do we even need it)? Why should we implicitly add an orderby if the user doesn't do so? Has this always been the case?

On the other hand, within the grid, I wonder if orderby_on_limitby alone is sufficient for proper pagination. The problem is that if the user specifies an explicit orderby with the grid, then orderby_on_limitby is ignored, and only the explicit orderby is used. However, unless the explicit orderby happens to specify a unique value or combination of values for each record, a fixed order of results will not be guaranteed. So, should the grid go a step further and append the primary keys to the explicit orderby, just to ensure a fixed order. I suppose it could check whether the explicit orderby fields have unique=True and not bother with the primary key for those tables. For joins, it might also attempt to intelligently select the minimum necessary number of primary keys (e.g., with a one-to-many, I think you only need to order on the primary key of the second table).

Anthony

Niphlod

unread,
Oct 9, 2013, 2:26:21 PM10/9/13
to web2py-d...@googlegroups.com
I'm mostly +1 on eliminating any non-explicit orderby on limitby. Any sane user in the need of pagination would need to specify the order explicitely.
On the other hand, on the grid with non-explicit orderby, the current approach is the only one that can guarantee in any backend consistent results.
On the upper-side of things, I agree that order by an id AND by the reference to it (Foreign key) is unuseful, and if it happens it should be stripped from the query only if it's not explicitely passed as orderby but.... at DAL level, not on grid's level.

Anthony

unread,
Oct 9, 2013, 2:37:39 PM10/9/13
to web2py-d...@googlegroups.com
On Wednesday, October 9, 2013 2:26:21 PM UTC-4, Niphlod wrote:
I'm mostly +1 on eliminating any non-explicit orderby on limitby. Any sane user in the need of pagination would need to specify the order explicitely.
On the other hand, on the grid with non-explicit orderby, the current approach is the only one that can guarantee in any backend consistent results.
On the upper-side of things, I agree that order by an id AND by the reference to it (Foreign key) is unuseful, and if it happens it should be stripped from the query only if it's not explicitely passed as orderby but.... at DAL level, not on grid's level.

What are your thoughts on the grid implicitly adding a primary key orderby even if the user explicitly specifies their own orderby (just in case the explicit orderby does not have unique values for all records and therefore won't guarantee a consistent order)? Currently that does not happen (the implicit orderby is only added if the user specifies no orderby at all).

Anthony

Massimo DiPierro

unread,
Oct 9, 2013, 2:37:51 PM10/9/13
to web2py-d...@googlegroups.com
Notice that some backends (I think that was mysql) do not allow limitby without an orderby and error otherwise.

--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Niphlod

unread,
Oct 9, 2013, 2:49:36 PM10/9/13
to web2py-d...@googlegroups.com
mysql doesn't. This is straight from their docs

With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1):

SELECT * FROM tbl LIMIT 5,10;  # Retrieve rows 6-15

Niphlod

unread,
Oct 9, 2013, 2:53:28 PM10/9/13
to web2py-d...@googlegroups.com


On Wednesday, October 9, 2013 8:37:39 PM UTC+2, Anthony wrote:
pped from the query only if it's not explicitely passed as orderby but.... at DAL level, not on grid's level.

What are your thoughts on the grid implicitly adding a primary key orderby even if the user explicitly specifies their own orderby (just in case the explicit orderby does not have unique values for all records and therefore won't guarantee a consistent order)? Currently that does not happen (the implicit orderby is only added if the user specifies no orderby at all).

Anthony

I'm all up for letting the user deal with it. The more DAL is consistent with SQL syntax the less unpredictable things will happen. So...

no explicit --> grid adds it because the pagination will be screwed if no orderby at all is "issued to the backend"
explicit --> grid (and/or DAL) doesn't do anything. If the user choosed a field "unelegible" for pagination, he explicitely wanted a "bad" behaviour and he should have it (then he can deal with it or accept it as it is).

Massimo DiPierro

unread,
Oct 9, 2013, 2:57:46 PM10/9/13
to web2py-d...@googlegroups.com
I do not have a strong opinion. What do others think?

Anthony

unread,
Oct 9, 2013, 3:24:49 PM10/9/13
to
I'm all up for letting the user deal with it. The more DAL is consistent with SQL syntax the less unpredictable things will happen. So...

no explicit --> grid adds it because the pagination will be screwed if no orderby at all is "issued to the backend"
explicit --> grid (and/or DAL) doesn't do anything. If the user choosed a field "unelegible" for pagination, he explicitely wanted a "bad" behaviour and he should have it (then he can deal with it or accept it as it is).

You're saying that when the user isn't explicit, the grid should automatically do an orderby that guarantees proper pagination. But just because the user specifies an orderby does not mean they are being explicit about ordering with regard to pagination. When you specify orderby, there is no way to distinguish whether you are doing so for purposes of pagination or simply because you care about the particular order in which the records are displayed. So, if the user specifies an orderby, we shouldn't necessarily take that as being explicit about pagination any more than when no orderby is specified. As long as we're going to ensure proper pagination in the first case, why not ensure it in the second? Put another way, why should we assume someone who specifies an orderby "explicitly wants bad behavior" any more than we should assume that of someone who doesn't specify an orderby at all? Instead, shouldn't we assume the grid always requires proper pagination?

Anthony

Niphlod

unread,
Oct 9, 2013, 3:53:26 PM10/9/13
to web2py-d...@googlegroups.com
My starting point is "DAL should do what I tell it to do and nothing else".
That being said, one of web2py's leitmotiv is unless it poses a security threat, web2py allows you to do anything you want. It proposes/includes/magically-include sane defaults.

There's a "strange" although widely-spread "wanted-behaviour" that is more or less: "if I do select(limitby=(0,50)) and then select(limitby=(50,100)) I expect to not fetch the same record twice", that is more or less the hard requirement of pagination.

My "dba mind" forces me to do always select(orderby=..., limitby=...), but I do acknowledge that since this was never "hardly required" most of application's code done by "programmers minds" DON'T specify any orderby BUT still expect to not fetch the same record twice while paginating (that is, they rely on the fact that DAL is smart enough to include an implicit orderby).
This is DAL's "adding smart defaults" to somehow fill a gap (let's say to "ease") between db backends and "natural mindset" of programmers.
That being said, if developers want to to shoot theirself in the foot being explicit, it's their fault.

Following this "stream of thoughts", I translate the exact same POV to the grid.... we've seen that the grid is often misused and the code has been adapted to a certain degree of madness to accomodate newbies and pro users to do all sorts of things.... I expect from grid's pagination a consistent behaviour by **default** (again, to "ease" the gap between SQL syntax and "natural mindset").

BUT, if a user wants to get fancy, no web2py code should prevent him to do exactly what he wants. If he choose a "bad" field to orderby to - in the same way he chooses a select(orderby=..., limitby=...) - web2py should not complain, it's the code that is requesting something that doesn't follow the "natural mindset" **explicitely**.
Let me pass an extreme case: if the user wants to select(limitby=..., orderby='<random>') **explicitely**, DAL doesn't complain. Grid shouldn't too.

Anthony

unread,
Oct 9, 2013, 5:25:18 PM10/9/13
to web2py-d...@googlegroups.com
Let's say I have a db.address table storing US addresses.

Scenario 1: I don't care how the addresses are sorted, so I do:

grid = SQLFORM.grid(db.address)

Scenario 2: I want the addresses sorted by state but don't care about order within state, so I do:

grid = SQLFORM.grid(db.address, orderby=db.address.state)

In #1, the orderby should be db.address.id, and in #2, the orderby should be [db.address.state, db.address.id]. In both case, I have made the mistake of omitting db.address.id from the orderby. Yet in #1, the grid automatically corrects my mistake for the sake of proper pagination. Why should the grid not correct my mistake in #2?

You seem to assume that in #2, I am purposely omitting db.address.id from the orderby because I want to break pagination (or am too foolish to know/care). Whether or not that's a reasonable assumption (I don't think so), surely it is no better an assumption in #2 than in #1. So either we ensure proper pagination in both cases, or in neither case. I don't see the rationale for doing so in one case and not the other. I lean toward ensuring proper pagination in both cases because it's generally not desirable to break pagination in a grid (if we can think of a reason to do so, we can always include an option to prevent the implicit ordering).

Anthony

Niphlod

unread,
Oct 10, 2013, 5:17:35 AM10/10/13
to web2py-d...@googlegroups.com
#1 is providing safe defaults, #2 is getting in the way. 
If the user cares enough to specify an orderby, then changing it "adding" to the explicitely provided parameter is "too much magic" (read it as "not safe") in my POV.
One could very well assume that for its case and its backend it is safe to orderby only by state, another instead is a "total noob" tinkering with the grid... web2py can't figure out user minds.

Anthony

unread,
Oct 10, 2013, 10:43:21 AM10/10/13
to web2py-d...@googlegroups.com
Let's say we append db.address.id to the orderby in scenario 2. If db.address.state already provides a single unique ordering, then adding db.address.id will have no effect, so no harm done. But if db.address.state does not provide a single unique ordering, then pagination is broken and db.address.id is needed in the orderby. So, why not append the id field -- in many cases it will prevent broken pagination, and at worst it will simply have no effect? In any case in which it does affect the ordering, it is by definition needed to ensure proper pagination. What's the downside?

If you think it is important for advanced users to be able to take complete control of the orderby, we could add an argument like add_pks_to_orderby=True, and you could set it to False if for some reason you don't want that behavior.

I think the expectations for the grid should be different from those for an explicit select(). When someone uses the grid, they expect pagination to work automatically, without having to explicitly work out the details of the select -- that is one of the benefits and intended purposes of the grid. When someone specifies the orderby in the grid, they are doing so to control the order in which the records are displayed, not so they can take over the manner in which the grid ensures proper pagination.

Anthony

Massimo Di Pierro

unread,
Oct 10, 2013, 11:20:19 AM10/10/13
to web2py-d...@googlegroups.com
Than why not default orderby_on_limitby to false except in the case of the grid? 

Niphlod

unread,
Oct 10, 2013, 11:34:51 AM10/10/13
to web2py-d...@googlegroups.com
@Anthony: I think we should agree on disagreeing in this one ... however I don't have a strong opinion about it, so feel free to do as you say (adding always pk)

@Massimo: last time I checked breaking backward compatibility was an issue. If someone relies on the implicit orderby added right now, you're screwing (badly) their code.

Anthony

unread,
Oct 10, 2013, 11:57:56 AM10/10/13
to web2py-d...@googlegroups.com
On Thursday, October 10, 2013 11:20:19 AM UTC-4, Massimo Di Pierro wrote:
Than why not default orderby_on_limitby to false except in the case of the grid?

The problem is that orderby_on_limitby only takes effect if no orderby is specified at all. However, just because someone specifies an orderby in the grid does not mean the particular field(s) specified are adequate for pagination purposes (i.e., if you order on a field that contains non-unique values, that's not sufficient for pagination). Right now, if you don't specify orderby with the grid, we automatically add the primary key for each table in the grid query (via the orderby_on_limitby argument to .select(), which defaults to True for all queries). I'm suggesting that perhaps we should do the same thing even if the user does specify an orderby, just to ensure that pagination works properly in all cases.

Anthony 

Anthony

unread,
Oct 10, 2013, 12:00:12 PM10/10/13
to web2py-d...@googlegroups.com
On Thursday, October 10, 2013 11:34:51 AM UTC-4, Niphlod wrote:
@Anthony: I think we should agree on disagreeing in this one ... however I don't have a strong opinion about it, so feel free to do as you say (adding always pk)

OK, but do you see any real downside to adding the pk to orderby?

Thanks.

Anthony 

Michele Comitini

unread,
Oct 10, 2013, 12:00:16 PM10/10/13
to web2py-developers

Why not add an optional parameter to DAL __init__() ?

Il 10/ott/2013 17:34 "Niphlod" <nip...@gmail.com> ha scritto:
@Anthony: I think we should agree on disagreeing in this one ... however I don't have a strong opinion about it, so feel free to do as you say (adding always pk)

@Massimo: last time I checked breaking backward compatibility was an issue. If someone relies on the implicit orderby added right now, you're screwing (badly) their code.

Massimo Di Pierro

unread,
Oct 10, 2013, 12:17:19 PM10/10/13
to web2py-d...@googlegroups.com
I am not sure GAE allows it. 

Anthony

unread,
Oct 10, 2013, 12:46:29 PM10/10/13
to web2py-d...@googlegroups.com
On Thursday, October 10, 2013 12:17:19 PM UTC-4, Massimo Di Pierro wrote:
I am not sure GAE allows it. 

What's the objection? We're already adding the pk to orderby in some cases.

Anthony

Massimo Di Pierro

unread,
Oct 10, 2013, 2:29:08 PM10/10/13
to web2py-d...@googlegroups.com
On GAE you can sort by IT but I am not sure can sort by something else and ID always.

Niphlod

unread,
Oct 10, 2013, 3:10:11 PM10/10/13
to web2py-d...@googlegroups.com
I know this is taking long, but giving a lot of thoughts may help to avoid further inconsistencies.

Let's talk for a moment in "postgresql dialect" (feel free to translate to you own preferred) and only in DAL's perspective.....I think the main issue here is that SQL has two kinds of objectives. one is with limit only and the other combined with offset.
Two main "natural mindsets" coming from the programmer's world are:
- fetch only a subset
- divide the set into "pages"

In the current api, this is more or less achieved with limitby=(0,10), orderby_on_limitby=False vs limitby=(0,10), orderby_on_limitby=True

A posteriori, I'd have introduced two different APIs to select(): limit=10 and pages=(orderbyfield, start, stop) (or pages=(orderby, howmany, page_no))

I do think that the current limitby API sucks a little bit (not requiring an explicit "what should I choose as order") but I really don't see a real issue with grid's as it stands. DAL does propose a smart default and grid does it too.
Adding yet another PK to the orderby to "fill the gap" in a potentially screwed code is just complicating the code and separating grid's behaviour from DAL's one.
At this very high demand of accuracy (i.e. adding pk even if and orderby clause has been specified), extremist may start to wonder why grid doesn't lock the table for any user session just to prevent further deletions/additions/updates to achieve a correct pagination while the user jumps to n pages.

Anthony

unread,
Oct 10, 2013, 3:31:00 PM10/10/13
to web2py-d...@googlegroups.com
On Thursday, October 10, 2013 2:29:08 PM UTC-4, Massimo Di Pierro wrote:
On GAE you can sort by IT but I am not sure can sort by something else and ID always.

I'm just saying that orderby_on_limitby is already ignored by GAE, so anything the grid does regarding orderby for the sake of pagination can and should continue to be ignored by GAE (and any adapter that doesn't support it).

Anthony

Tim Richardson

unread,
Oct 10, 2013, 4:53:52 PM10/10/13
to web2py-d...@googlegroups.com


On Thursday, 10 October 2013 20:17:35 UTC+11, Niphlod wrote:
#1 is providing safe defaults, #2 is getting in the way. 


I prefer defaults to operate like this. You touch the brake, cruise control is off. When you override a default, you want control. 

Anthony

unread,
Oct 10, 2013, 6:25:27 PM10/10/13
to
Good heuristic, but doesn't quite apply in this case. Perhaps the problem is that the grid API is a bit too low-level in this case -- the orderby argument is really exposing an implementation detail rather than capturing the user's high-level intentions. At the implementation level, orderby serves two distinct purposes: (1) list the records in a sensible order for user consumption according to business requirements; (2) ensure pagination works properly. Users want control of #1, but they expect #2 to be taken care of no matter what. The problem is that we have only one parameter to deal with both #1 and #2, so the user cannot express intentions regarding #1 without causing #2 to break. You are assuming that if a user takes control of #1, they also want control of #2, but those are two independent goals.

From an API standpoint, pagination should just work, and it should continue to work even if the user wants to specify a particular sort order other than the default. The user should not have to worry about pagination implementation details just because they want to sort the records in the display. We simply haven't provided the proper API to achieve this. Instead of a single orderby argument, perhaps there should be two distinct arguments: (1) sort_fields=None (a list of fields to determine the displayed order of records), and (2) auto_pagination=True (automatically add the PK's to orderby if they are not already in sort_fields in order to ensure proper pagination). In that case, I suspect explicit specification of sort_fields would not be highly correlated with overriding auto_pagination (currently, we assume the correlation is 1.0).

Of course, we need to keep the orderby argument for backward compatibility, so it's probably not worth adding a new sort_fields argument. But we certainly could add something like auto_pagination. With an additional argument like that, if your true intention is to override the default pagination behavior, you can do so unambiguously.

Anthony

Anthony

unread,
Oct 10, 2013, 6:36:29 PM10/10/13
to web2py-d...@googlegroups.com
Also, note that there is a bigger problem. Once the initial grid has loaded, if the end-user clicks on a column header to change the sort order, only that single field is used in the orderby, which will break pagination if that field does not contain unique values. This happens regardless of what is in the orderby argument.

Should we fix that? I think so. But if we're going to fix pagination in that case, why not in the case where the developer has specified an initial orderby that fails to sort on a unique set of values?

Anthony


On Thursday, October 10, 2013 6:22:55 PM UTC-4, Anthony wrote:
On Thursday, October 10, 2013 4:53:52 PM UTC-4, Tim Richardson wrote:

Niphlod

unread,
Oct 11, 2013, 3:03:02 PM10/11/13
to web2py-d...@googlegroups.com
well.... as always, you're right.
orderby in the grid right now serves the purpose of pagination only strictly if sortable=False.
Anyone clicking on the headers may get an incorrect pagination in some backends.
The built query for a grid should always include field_id (it's what it's used to create the links, so it's unique) as the last orderby field.
Doing that we strip away for "power users" (indeed, they are the ones reporting problems when the grid does something "magic") the possibility to avoid appending pks to the orderby query sent to the backend.
I think we have to fix concurrently the two issues:
- orderby in grid won't be anymore a close relative to orderby in select() but it will be an unfortunately named in the same way...in the grid it will represent the order the grid should be displayed in the first page
- we add a "strict_pagination" or a "auto_pagination" (as always, naming apis is not the best job for me) to let "power user" disregard the auto-addition of field_id in the orderby passed to the underlying query

Everbody is on board with this plan?

Tim Richardson

unread,
Oct 11, 2013, 3:25:19 PM10/11/13
to web2py-d...@googlegroups.com
auto_pagination defaults to True sounds good.


--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to a topic in the Google Groups "web2py-developers" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/web2py-developers/nklB1tRh_zU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to web2py-develop...@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.



--
Tim Richardson

Anthony

unread,
Oct 11, 2013, 4:10:23 PM10/11/13
to web2py-d...@googlegroups.com
+1

Massimo DiPierro

unread,
Oct 11, 2013, 5:27:33 PM10/11/13
to web2py-d...@googlegroups.com
If you and Simone agree, this is good enough for me. ;-)

--
-- mail from:GoogleGroups "web2py-developers" mailing list
make speech: web2py-d...@googlegroups.com
unsubscribe: web2py-develop...@googlegroups.com
details : http://groups.google.com/group/web2py-developers
the project: http://code.google.com/p/web2py/
official : http://www.web2py.com/
---
You received this message because you are subscribed to the Google Groups "web2py-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py-develop...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages