query to return single row for each ID

42 views
Skip to first unread message

Floyd Arguello

unread,
Feb 2, 2023, 7:52:19 PM2/2/23
to Druid User
Hi folks,

I'm trying to write a query that returns the latest row per ID. This works to an extent:

SELECT
  MAX(__time),
  ID,
  Status,
FROM table

It returns something similar to this:
__time,  ID,  Status
08:31,   001, SUCCESS  
08:30,   001, RUNNING  
08:31,   002, RUNNING
08:31,   003, RUNNING
08:29,   004, FAILED
08:28,   004, RUNNING

It returns the latest row per ID & Status.

How do I write the query so that it returns the latest row per ID, regardless of the status (or any other column for that matter)?
__time,  ID,  Status
08:31,   001, SUCCESS 
08:31,   002, RUNNING
08:31,   003, RUNNING
08:29,   004, FAILED

Thanks,
Floyd


John Kowtko

unread,
Feb 2, 2023, 11:06:16 PM2/2/23
to Druid User
Hi Floyd,

Since Druid doesn't support SQL window functions (yet), I think you'll have to do a subquery with a group by to find the max time, and then join it back in.  Here I have written it as a CTE:

with max_time as (
 select ID, 
        max(__time) max_time
   from table
  group by ID
)
 select m.max_time,
        t.ID,
        t.Status
   from table t 
   join max_time m on t.ID = m.ID

If that doesn't cut it let me know.

Thanks.  John

Floyd Arguello

unread,
Feb 16, 2023, 12:49:36 PM2/16/23
to Druid User
Thanks, John - that was really helpful.

John Kowtko

unread,
Feb 17, 2023, 9:19:53 AM2/17/23
to Druid User
Oops I think I forgot to add the _time to the join condition,  so ...


with max_time as (
 select ID, 
        max(__time) max_time
   from table
  group by ID
)
 select m.max_time,
        t.ID,
        t.Status
   from table t 
   join max_time m on t.ID = m.ID and t.__time = m.max_time
Reply all
Reply to author
Forward
0 new messages