Question about LowCardinality with Kafka

181 views
Skip to first unread message

todd

unread,
Mar 29, 2019, 1:02:49 PM3/29/19
to click...@googlegroups.com
Checking out the new LowCardinality stuff and I am seeing this exception

2019.03.29 12:55:21.737777 [ 41 ] {} <Error> void
DB::StorageKafka::streamThread(): Code: 53, e.displayText() =
DB::Exception: Type mismatch for column Collector_ID. Column has type
LowCardinality(String), got type String: while pushing to view
default.kafka_consumer_new, Stack trace:

Here are some abbreviated examples of the tables I have setup

CREATE TABLE test_local (
timestamp DateTime,
Collector_ID LowCardinality(String)
) ENGINE = MergeTree()


CREATE TABLE test_kafka (
__timestamp String,
Collector_ID String
) ENGINE = Kafka


CREATE MATERIALIZED VIEW kafka_consumer_new to test_local
AS SELECT
parseDateTimeBestEffort(__timestamp) as timestamp,
Collector_ID
FROM test_kafka

Is there something I should be doing differently?

This is on ClickHouse server version 19.3.5 revision 54415.

Thanks,

Todd

Denis Zhuravlev

unread,
Mar 29, 2019, 3:12:19 PM3/29/19
to ClickHouse
You need to cast String to LowCardinality(String) in MV.

CREATE MATERIALIZED VIEW kafka_consumer_new to test_local
  AS SELECT
  parseDateTimeBestEffort(__
timestamp) as timestamp,
  toLowCardinality(Collector_ID) as Collector_ID
  FROM test_kafka

todd

unread,
Mar 30, 2019, 11:30:34 AM3/30/19
to click...@googlegroups.com
Oops, I meant for that reply to go the mailing list.

The good news is, I got my Array(LowCardinality(String)) question
below to work if I used the 'AS' variation of 'CAST'

cast(splitByChar('|',__Destination_Flow_Matches) as
Array(LowCardinality(String))) as Destination_Flow_Matches

This didn't work though

cast(splitByChar('|', __Destination_Flow_Matches),
Array(LowCardinality(String))) AS Destination_Flow_Matches

I got this exception, not sure why, but I have a working path forward,
and am just sharing the results of my investigation in case it helps
anyone else.

Received exception from server (version 19.3.5):

Code: 47. DB::Exception: Received from d3.tb.arbor.net:9001,
172.18.26.190. DB::Exception: Unknown identifier: String, context:
query: 'SELECT parseDateTimeBestEffort(__timestamp) AS timestamp,
cast(splitByChar('|', __Destination_Flow_Matches),
Array(LowCardinality(String))) AS Destination_Flow_Matches FROM
default.test_kafka ' required_names: 'String' '__timestamp'
'__Destination_Flow_Matches' source_tables: 'default.test_kafka'
table_aliases: private_aliases: column_aliases:
'Destination_Flow_Matches' 'timestamp' public_columns:
'Destination_Flow_Matches' 'timestamp' masked_columns:
array_join_columns: source_columns: '__timestamp'
'__Destination_Flow_Matches' .

Todd

---------- Forwarded message ---------
From: todd <odd...@gmail.com>
Date: Fri, Mar 29, 2019 at 10:52 PM
Subject: Re: Question about LowCardinality with Kafka
To: Denis Zhuravlev <denis.z...@gmail.com>


Denis,

Yes, that works, thank you!

The sad part is, I had tried to do a cast, but used
LowCardinality(Collector_ID) instead of
toLowCardinality(Collector_ID).

I'm more of a C programmer than a SQL programmer :)

I did run into a new problem though that I'm not sure how to handle,
when I have an Array(LowCardinality(String))

I see a similar error,

2019.03.30 02:35:56.887299 [ 42 ] {} <Error> void
DB::StorageKafka::streamThread(): Code: 53, e.displayText() =
DB::Exception: Type mismatch for column Destination_Flow_Matches.
Column has type Array(LowCardinality(String)), got type Array(String):
while pushing to view default.kafka_consumer_new, Stack trace:

And my tables look like this.

CREATE TABLE test_local (
timestamp DateTime,
Destination_Flow_Matches Array(LowCardinality(String))
) ENGINE = MergeTree()


CREATE TABLE test_kafka (
__timestamp String,
__Destination_Flow_Matches String
) ENGINE = Kafka


CREATE MATERIALIZED VIEW kafka_consumer_new to test_local
AS SELECT
parseDateTimeBestEffort(__timestamp) as timestamp,
splitByChar('|',__Destination_Flow_Matches) as Destination_Flow_Matches,
FROM test_kafka

I tried

splitByChar('|',toLowCardinality(__Destination_Flow_Matches)) as
Destination_Flow_Matches,

which gives this execption

2019.03.30 02:50:28.352443 [ 33 ] {} <Error> void
DB::StorageKafka::streamThread(): Code: 53, e.displayText() =
DB::Exception: Type mismatch for column Destination_Flow_Matches.
Column has type Array(LowCardinality(String)), got type Array(String):
while pushing to view default.kafka_consumer_new, Stack trace:

and I tried

toLowCardinality(splitByChar('|',__Destination_Flow_Matches)) as
Destination_Flow_Matches,

But that is not legal.

So neither works. How should I handle it?

Thanks ,

Todd
> --
> You received this message because you are subscribed to the Google Groups "ClickHouse" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clickhouse+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clickhouse/9320c116-14d4-4ab5-a3cf-f9e964fab028%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Denis Zhuravlev

unread,
Mar 30, 2019, 2:11:20 PM3/30/19
to ClickHouse
arrayMap(x -> toLowCardinality(x), splitByChar('|', __Destination_Flow_Matches))

select arrayMap(x -> toLowCardinality(x), splitByChar('|', __Destination_Flow_Matches)) AS Destination_Flow_Matches
 from (select 'fsd|fsdfdf' __Destination_Flow_Matches)
┌─Destination_Flow_Matches─┐
│ ['fsd','fsdfdf']         │
└──────────────────────────┘


BTW, yesterday was released CH v19.4.2.7-stable 
with one bugfix: Fixed a bug in the work of Array (LowCardinality (...)) if a whole block of data in a row is filled with empty arrays.


> To unsubscribe from this group and stop receiving emails from it, send an email to click...@googlegroups.com.

todd

unread,
Mar 30, 2019, 3:02:57 PM3/30/19
to ClickHouse
Huh, interesting.

So the CAST vs ArrayMap seem to give the same results.

Is there an advantage of one over the other then?

lead :) select arrayMap(x -> toLowCardinality(x), splitByChar('|',
__Destination_Flow_Matches)) AS Destination_Flow_Matches from
(select 'fsd|fsdfdf' __Destination_Flow_Matches);


SELECT arrayMap(x -> toLowCardinality(x), splitByChar('|',
__Destination_Flow_Matches)) AS Destination_Flow_Matches

FROM

(

SELECT 'fsd|fsdfdf' AS __Destination_Flow_Matches

)


┌─Destination_Flow_Matches─┐

│ ['fsd','fsdfdf'] │

└──────────────────────────┘


1 rows in set. Elapsed: 0.053 sec.


lead :) select CAST(splitByChar('|', __Destination_Flow_Matches) AS
Array(LowCardinality(String))) AS Destination_Flow_Matches from
(select 'fsd|fsdfdf' __Destination_Flow_Matches);


SELECT CAST(splitByChar('|', __Destination_Flow_Matches),
'Array(LowCardinality(String))') AS Destination_Flow_Matches

FROM

(

SELECT 'fsd|fsdfdf' AS __Destination_Flow_Matches

)


┌─Destination_Flow_Matches─┐

│ ['fsd','fsdfdf'] │

└──────────────────────────┘


1 rows in set. Elapsed: 0.055 sec.

On Sat, Mar 30, 2019 at 2:11 PM Denis Zhuravlev
> To unsubscribe from this group and stop receiving emails from it, send an email to clickhouse+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clickhouse/a1e92115-5161-4958-86bb-7428ea6f4948%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages