Replicating complex query with two dense_rank() functions using ORM

74 views
Skip to first unread message

Brad Buran

unread,
Nov 3, 2020, 11:34:22 PM11/3/20
to Django users
I have a "puzzle of the day" that users can answer. I track the puzzle of the day using the Puzzle model. I track the answers using the PuzzleAnswer model. I would like to calculate the number of consecutive puzzles a particular user (i.e., the author) gets right in a row. The current SQL I use that can calculate the start date of the streak, end date of the streak and the number of days in the streak. As you can see, it does a dens_rank over the puzzles (to count them in order), then does a join with the PuzzleAnswer, then does a second dense rank over the merged tables. I figured out how to use the DenseRank function in the Django ORM on the Puzzle manager, but I cannot figure out how to do the left join next. Any advice?

SELECT min(s.id) AS id,
   count(s.date) AS streak,
   min(s.date) AS start_streak,
   max(s.date) AS end_streak,
   s.author_id
  FROM ( SELECT dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank)) AS id,
           pa.created AS date,
           (pr.rank - dense_rank() OVER (ORDER BY ROW(pa.author_id, pr.rank))) AS g,
           pa.author_id
          FROM (( SELECT "POTD_puzzle".id,
                   dense_rank() OVER (ORDER BY "POTD_puzzle".published) AS rank
                  FROM public."POTD_puzzle") pr
            JOIN public."POTD_puzzleanswer" pa ON ((pr.id = pa.puzzle_id)))
         WHERE pa.correct) s
 GROUP BY s.author_id, s.g
 ORDER BY count(s.date) DESC;

The models are:

class PuzzleAnswer(models.Model):        
   puzzle = models.ForeignKey(Puzzle, editable=True, on_delete=models.CASCADE)
   answer = models.CharField(max_length=64)                          
   correct = models.BooleanField(editable=False)       
   created = models.DateTimeField(auto_now_add=True)
   author = models.ForeignKey(settings.AUTH_USER_MODEL, blank=True, null=True,
                              on_delete=models.SET_NULL)                                                                                          
class Puzzle(models.Model):
   category = models.ForeignKey(PuzzleCategory, on_delete=models.CASCADE, help_text=category_help)   
   notation = models.CharField(max_length=64)            
   correct_answer = models.CharField(max_length=64)
   published = models.DateField(blank=True, null=True, db_index=True, unique=True)                                                                                                                
        

Ramon NHB

unread,
Nov 4, 2020, 12:25:30 PM11/4/20
to Django users
Hi,

In Django queryset you can query PuzzleAnswer, filter by author (assuming that is the user), order_by('created)' and return the created dates only: values_list('created', flat=True). Then take the len() of that list and check the [0] and [-1] entries for first and last date. The rest is date math.

In general I try to keep the number of database accesses as low as possible. A pre-calculated answer takes less resources to present than calculating this over and over again, especially if you have many users. So you could do the above and store the result in a new table every time a new PuzzleAnswer has been added for a user.

Ramon

Op woensdag 4 november 2020 om 05:34:22 UTC+1 schreef Brad Buran:

Simon Charette

unread,
Nov 4, 2020, 6:53:40 PM11/4/20
to Django users
This is unfortunately not possible to do through the ORM right now due to lack of support for filtering against window expressions[0]

Until this ticket is solved you'll have to rely on raw SQL.

Cheers,
Simon

Reply all
Reply to author
Forward
0 new messages