Error in Cumulative sum

36 views
Skip to first unread message

Sangeetha Gopinath

unread,
Jan 12, 2023, 5:09:44 AM1/12/23
to Druid User
Hi, I tried to find cumulative sum , this works in small data set only, but it shows error in large data set. Could you please help?

select c1.*,Cum_Sum from "50mbcsvdata_auraa_net_staging" c1
join
(SELECT a.bnx_id as ID,a.tripduration as Student_Trip,SUM(b.tripduration) AS Cum_Sum
FROM "50mbcsvdata_auraa_net_staging" a, "50mbcsvdata_auraa_net_staging" b
WHERE a.bnx_id >= b.bnx_id
GROUP BY a.bnx_id,a.tripduration ORDER BY a.bnx_id) c2
on c2.ID = c1.bnx_id 


Error is :
Error: Resource limit exceeded

Subquery generated results beyond maximum[100000]

org.apache.druid.query.ResourceLimitExceededException

Sergio Ferragut

unread,
Jan 25, 2023, 8:43:28 PM1/25/23
to Druid User
Hi,
Yes, that is limited to only very small datasets because it requires a cartesian product join to do it that way.
Unfortunately Apache Druid does not yet have window functions that will give you a cumulative sum for any data volume.
The good news is that if you look at the github repo and search for Window Functions, you'll see there are folks working on it.

- Sergio

Ben Krug

unread,
Jan 25, 2023, 8:50:07 PM1/25/23
to druid...@googlegroups.com
I don't know if any of these would be helpful or more efficient than your query, but you might be

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/75af5d30-3a14-45cf-ab5d-041a45b6b64bn%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages