Problem with collect_set()

David Engel

Apr 16, 2025, 5:40:33 PM

to MR3

While running some comparisons between Hive on MR3 and Hadoop, I ran
into a problem with a moderately large aggregation query. I believe
I've narrowed the problem down to an issue with collect_set(). Here
is the smallest set of queries I've come up with to reproduce the
problem.

drop table if exists t ;
create table t ( k string, b binary, s string ) ;
insert into table t values
('0', unhex('00'), '0'),
('0', unhex('0001'), '0a' ),
('1', unhex('01'), '1') ;
select k, collect_set(b) as bset, collect_set(s) as sset
from t group by k ;
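
For reference, the result the final query is expected to produce can be modeled outside Hive. The following is only a sketch of the collect_set() semantics (distinct values per group), not Hive code: the names `rows` and `collect_sets` are made up for illustration, `bytes.fromhex` stands in for unhex(), and neither the order of elements within each set (which Hive does not guarantee) nor the way Hive prints binary values is modeled.

```python
from collections import defaultdict

# Rows mirror the INSERT statement above; bytes.fromhex stands in for unhex().
rows = [
    ("0", bytes.fromhex("00"), "0"),
    ("0", bytes.fromhex("0001"), "0a"),
    ("1", bytes.fromhex("01"), "1"),
]


def collect_sets(rows):
    """Group by k and gather the distinct b and s values of each group."""
    bsets = defaultdict(set)
    ssets = defaultdict(set)
    for k, b, s in rows:
        bsets[k].add(b)
        ssets[k].add(s)
    return {k: (bsets[k], ssets[k]) for k in bsets}


expected = collect_sets(rows)
for k in sorted(expected):
    bset, sset = expected[k]
    # Binary values are shown as hex strings for readability.
    print(k, sorted(b.hex() for b in bset), sorted(sset))
```

Group '0' should therefore contain two binary values and two strings, and group '1' one of each.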

Attached are the resulting logs from running these right after a
restart. This is on Kubernetes using the
mr3project/hive:4.0.0.mr3.2.0 image.

David
--
David Engel
da...@istwok.net

collect_set_logs.zip

Sungwoo Park

Apr 17, 2025, 5:33:27 AM

to MR3

This problem is related to Java 17 and was introduced by a recent commit to Hive.

Let me upload the fixed version later.

--- Sungwoo

Sungwoo Park

Apr 17, 2025, 11:09:16 AM

to MR3

Problem fixed. I just uploaded a new Docker image: mr3project/hive:4.0.0.mr3.2.0

--- Sungwoo

--
You received this message because you are subscribed to the Google Groups "MR3" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hive-mr3+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/hive-mr3/590956f2-80d3-4c95-ab77-0f4322a638d0n%40googlegroups.com.

David Engel

Apr 17, 2025, 2:48:20 PM

to Sungwoo Park, MR3

I confirmed your fix is good here. Thanks!

David

--
David Engel
da...@istwok.net

Ill

Jun 19, 2025, 11:26:37 AM

to David Engel, Sungwoo Park, MR3

Does Hive 4.0.1 on MR3 1.12 have the same issue?
