Hello,
it has been brought to my attention by a request in IRC #trac today,
that a user is experiencing following issue with report 6 in default
configuration on latest stable 0.12:
'My Tickets' is shown _after_ 'Active Tickets'
Before the changeset it was ok. Work-around is to rename groups or use
ORDER BY (COALESCE(owner, '') = '$USER') DESC, ...
instead of
ORDER BY (owner = '$USER') DESC, ...
Hint: The underlying query seem to not get the DESC rule, when the
column is added the new way, as SQL logging reveals ([2] last line).
Well, looks like there is simply no way to attach a 'DESC' to the
__group__ directive, right?
Since I'm not a SQL guru, I start with discussion here rather than
filing a ticket right-away.
Sincerely,
Steffen Hoffmann
[1] http://trac.edgewall.org/changeset/10892
[2] http://pastebin.com/G3criDRQ
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAk8TOHMACgkQ31DJeiZFuHerswCgt0vEC5hw/eloK/i3UbNcxnG1
jFAAoOJpFbfqY9Qn/YhpkX7hrDH+KTEp
=W5uA
-----END PGP SIGNATURE-----
Well, then it looks like the current way we do the pagination is
flawed. It's strange that we never noticed it so far. I wonder if
we couldn't adopt the approach taken in the custom queries and
simply append LIMIT ... OFFSET ... to the query?
>
> 1) Should we just ignore it? As I jokingly said on IRC, just rename
> report {8} to "Active Tickets, Mine last"? :-)
>
Would be nice if we could fix this for 0.12.3 as apparently
r10892 made the problem more or differently visible.
> 2) Should we try to parse the report SQL and somehow extract any
> underlying order specification?
It seems it could be "enough" to check if the SQL query is ending
with an `ORDER BY`, and in that case simply insert our extra
order columns, if any. But as that clause may contain arbitrary
expressions, we should at least check for an even count of
parentheses between the ORDER BY and the end of the query.
> 3) Rewrite report pagination to not use a subquery?
Yep
> 4) Any other good ideas?
Let's see if we can implement the above...
-- Christian
For me, the most serious problem here is this undefined order.
#216 was implemented on the wrong assumption that the outer query
will just keep the order given by the inner query, but this is
apparently wrong. Now that we know, we can't leave it like that.
>
> b) The order of groups as now reported as a 'regression' remains, but
> only when you re-sort the report. Like our own /demo-0.12/report/8
> ("Active Tickets, Mine first") do this test;
> - open report, observe that it says "My Tickets"
> - click any column header to resort table (like "Component")
> - observe that it now says "Active Tickets" as first group.
> All r10892 did was make this bug apparent as without parsing the query
> we can't know if the order we insert is supposed to be ASC or DESC.
> After r10892 the bug will be clearly visible for all reports using
> ORDER BY __group__ DESC.
We could introduce a __groupdesc__ keyword...
>
> 2. DROP SUBQUERY AND PAGINATE IN PYTHON
> ---------------------------------------
>
> Instead of LIMIT and OFFSET in SQL, return the full resultset and pick
> the rows we want depending on paging status.
>
> a) The sorting would be correct. But, it will come at the cost of
> performance as the cursor will be populated with full data sets for
> each page in the report. The performance must be measured and compared
> for larger datasets as the penalty may be very noticeable.
Pagination was precisely introduced to avoid doing this,
so I think this solution can't be used.
>
> b) Switching to Python paging does not help with detecting ASC or DESC
> sorting of groups. Issue remains.
>
> 3. USE ROW_NUMBER IN SQL
> ------------------------
>
> Supported for MySQL and Postgres, we could mofify the inner SQL to be
> "SELECT ROW_NUMBER(), ...<remaing query>..." to have an ID to follow
> each row that we then could reuse at outer level.
>
> a) The sorting would be correct, but no solution in SQLite that does
> not support it. Would need conditional code that uses initial
> behaviour for SQLite, and other behaviour for those that are known to
> support it. There should be no noticeable performance differences
> compared to current status.
Eek. SQL dialect-hell. I also first looked into this direction but
found nothing convincing.
>
> b) Again, no new knowledge about __group__ ASC or DESC.
>
> 4. PARSE QUERY AND EXTRACT ORDER
> --------------------------------
>
> Somehow parse the query and extract the details we need, really just
> the ORDER BY clause content. If __group__ are part of current columns,
> we would need to split the ORDER BY content into parts (can order by
> more than one item of course) and detect if DESC is mentioned for
> first order item. As the SQL in theory can contain further nested
> subqueries, and "ORDER BY" text may otherwise appear in functions or
> elsewhere, care must be taken to extract this correctly.
Maybe we could analyze further what are the columns inside the
ORDER BY, though that's probably difficult in the general case.
As a first step however, only locating the ORDER BY should be
enough (see below).
>
> a) By appending the subquery ORDER BY to the outer query, the results
> will be ordered correctly.
>
> b) With the contents of the ORDER BY clause in hand, parsed and
> correctly identified ASC/DESC of first item, we can also re-sort
> grouped reports by doing "ORDER BY __group__ ASC|DESC, selected_column
> ASC|DESC". The outer and inner order should also be correct.
>
> 5. JUST APPEND LIMIT AND OFFSET TO THE QUERY
> --------------------------------------------
>
> As Christian suggested, just drop the use of subquery and append
> "LIMIT ... OFFSET .." to the original query.
Not only append LIMIT / OFFSET, but also add an ORDER BY clause
or augment an existing top-level one.
>
> a) The order would be correct. New issue may be that the report would
> fail if the user-entered SQL already contains the same keywords.
> Typically reports that contain the "top number of.." or the "most
> recent.." that itself restricts the dataset. It would be a bug, and we
> would need a fallback strategy - like for instance just use current
> strategy if exception occurs (1.).
Ok, if there's already a LIMIT, we should not add one (and probably
disable pagination controls).
>
> b) Re-sorting columns would not be possible seeing we would not be
> able to change the ORDER BY clause. This would of be much worse than
> the current/initial situation.
Actually not... My solution even has the nice side-effect to
always make it possible to sort with an user selected column.
>
> 6. "YOUR" SUGGESTION?
> ---------------------
>
> Feel free to suggest other approaches - or combination of approaches.
>
Right, so my proposed solution is a kind of combined 4-5 approach.
You can see a "proof of concept" attached to the ticket I created
for tracking this issue:
http://trac.edgewall.org/attachment/ticket/10530/10530-rework-report-pagination-r10914.patch
The pagination is now done using a single query, and it seems
to work well with the current reports and I hope with most
existing ones. We can probably make it more even more robust.
The problem with the "Mine ticket" report is still there, but
starting from that patch it should be possible to try 4.b) or
my __groupdesc__ suggestion.
But other ideas welcome, of course.
-- Christian