Materialized view using countIf - does not work

367 views
Skip to first unread message

michal...@glassboxdigital.com

unread,
Sep 10, 2020, 11:07:37 AM9/10/20
to ClickHouse
Hi. i am using a materialized view with target and i defined some of the columns in materialized view as:
countIf(struggle_severity = 'error') as error_count,
countIf(struggle_severity = 'warn') as warn_count,

on the target table they are defined as:
error_count Nullable(UInt64),
warn_count Nullable(UInt64),

When i try to query these fields from materialized view using:
sum(clicks_count)
I get a very big number.

So if i use select * on materialized view this returns a right value but if i use sum()
it returns an invalid number.

Any ideas?
thanks



Denis Zhuravlev

unread,
Sep 10, 2020, 11:12:38 AM9/10/20
to ClickHouse
1. What is clicks_count ? I don't see this column in your description.
2. What table engine do you use for MV?
3. Can you show simplified DDL for CREATE MV ?

Denis Zhuravlev

unread,
Sep 13, 2020, 4:40:36 PM9/13/20
to ClickHouse
>sum_duration AggregateFunction(sum, Int32),
>error_count Nullable(UInt64)
>ENGINE = AggregatingMergeTree()

>sumState(duration) as sum_focus_duration,
>countIf(severity = 'error') as error_count


Because AggregatingMergeTree does not sums numeric columns by default. It just do any() as documented.

And it's all slightly overengineered and incorrect.
I would not use AggregateFunction I would use SummingMT.

CREATE TABLE test.session_target
( session_ts UInt32,
  session_uuid UUID,
  appid UInt32,
  client_ip String,
  sum_duration Int32,
  error_count UInt64
ENGINE = SummingMergeTree()
PARTITION BY toStartOfWeek(toDateTime(session_ts))
ORDER BY (session_ts, session_uuid, appid, client_ip);

CREATE MATERIALIZED VIEW test.session_agg_MV TO test.session_target
as select session_ts, session_uuid, appid, client_ip,
sumState(duration) as sum_focus_duration,
countIf(severity = 'error') as error_count
from event
where toDateTime(session_ts)>='2020-09-10 00:00:00'
group by session_ts, session_uuid, appid, client_ip

select from MV
select
session_ts, session_uuid, appid, client_ip,
sum(sum_duration) as sum_duration,
sum(error_count) as error_count,
from test.session_agg_MV
group by session_ts, session_uuid, appid, client_ip



If you really need AggregatingMergeTree for something else, I would use SimepleAggregateFunction
SimepleAggregateFunction are twice faster and take less space than AggregateFunction because they don't need state.
and sum -- does not need state.

CREATE TABLE test.session_target
( session_ts UInt32,
  session_uuid UUID,
  appid UInt32,
  client_ip String,
  sum_duration SimepleAggregateFunction(sum, Int32),
  error_count SimepleAggregateFunction(sum, UInt64)
ENGINE = ENGINE = AggregatingMergeTree()
PARTITION BY toStartOfWeek(toDateTime(session_ts))
ORDER BY (session_ts, session_uuid, appid, client_ip);


There are by the way countStateIf and countMerge -- but they useless in this case.

Michal Singer

unread,
Sep 14, 2020, 6:04:12 AM9/14/20
to Denis Zhuravlev, ClickHouse
Great. Thanks, I will test this.

--
You received this message because you are subscribed to the Google Groups "ClickHouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clickhouse+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/clickhouse/f84b5352-e0bb-4f85-b995-1e3e6d9c88c3n%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages