Since there are no loops or user defined stored procedure in clickhouse how to perform the following query over a range of items

3,433 views
Skip to first unread message

Mark FM

unread,
Jul 23, 2018, 1:34:41 PM7/23/18
to ClickHouse

So, if I wanted to perform the query below over many ip addresses how would I do that in clickhouse? In another database i might put this in a for loop in a stored procedure

I'm new to clickhouse and columnar databases so forgive my ignorance.


select count() + 1

from (

  SELECT

    address,

    date_visit_exact,

    runningDifference(date_visit_exact) AS delta

  FROM

  (

    SELECT

        address,

        date_visit_exact

    FROM stats

    WHERE date_visit BETWEEN '2018-06-01' and '2018-06-02' and address = '192.168.0.1'

    ORDER BY date_visit_exact ASC

  )

)

where delta > 1800


Alex Zatelepin

unread,
Jul 24, 2018, 8:34:30 AM7/24/18
to ClickHouse
Hi Mark, what do you mean by "the query over many ip addresses"? As far as I can tell the query should return visit times when there is a gap in visits for a certain address.

If you want the same for several addresses, just substitute
address = '192.168.0.1'
for
address IN ('addr1', 'addr2', etc...)
and add the address column to the ORDER BY clause:
ORDER BY address, date_visit_exact

Of course it would be better to use window functions for this task, but they are not available in ClickHouse yet.
Also you can try an array-based approach: group all values that you want to process together into arrays using the groupArray() function and iterate over these arrays using lambda functions.

Mark FM

unread,
Jul 24, 2018, 10:43:07 AM7/24/18
to ClickHouse
Hi Alex, I meant if I had 20,000 addresses for instance that I wanted to iterate over. I could have a inner query like

select distinct (address) from Table

then iterate over the list of returned addresses.
Reply all
Reply to author
Forward
Message has been deleted
0 new messages