Order of rows in query


Juan Emilio Gabito Decuadra

unread,
May 8, 2018, 11:09:14 PM5/8/18
to cstore users
Hi everyone.

I have question about the order of rows in the result of a query.

So, I have this table:

CREATE FOREIGN TABLE public.censo2011
   (cload_id bigint ,
    departamento character varying(14) ,
    genero character varying(9) ,
    anoymesdenacimiento pg_catalog.date ,
    anodenacimiento character varying(80) ,
    mesdenacimiento character varying(80) ,
    edad integer ,
    ascendencia character varying(80) ,
    sabeleeryescribir character varying(80) ,
    maximoniveleducativo character varying(80) ,
    situacionlaboral character varying(80) ,
    hijosnacidosvivos integer ,
    poblacion integer )
   SERVER cstore_server
   OPTIONS (compression 'pglz');

And I COPY the data in descending order by the "departamento" column. If I run a query I can see that the data is in fact sorted (basically a SELECT * FROM censo2011; yes, I know I should not run that type of query on a columnar DBMS).

But when I run the query SELECT departamento, COUNT(*) FROM censo2011 GROUP BY departamento, this is what I get:

"Artigas";73377
"Lavalleja";58815
"San José";108304
"Canelones";520173
"Rocha";68088
"Soriano";82594
"Maldonado";164298
"Río Negro";54765
"Salto";124861
"Cerro Largo";84698
"Florida";67047
"Durazno";57084
"Tacuarembó";90051
"Montevideo";1318755
"Treinta y Tres";48134
"Rivera";103473
"Paysandú";113107
"Colonia";123203
"Flores";25050

As you can see, the data is not sorted. If I add an ORDER BY to the query it works just fine.
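
For completeness, this is the ordered version of the query that returns the rows sorted:

```sql
SELECT departamento, COUNT(*)
FROM censo2011
GROUP BY departamento
ORDER BY departamento;
```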

So the question is: the order benefits compression, but when we get the results back, is that order lost, maybe as a side effect of decompression?

We are using cstore_fdw as the analytical database for a cloud BI platform, so I really want to know how this works.

Thanks!

Murat Tuncer

unread,
May 9, 2018, 5:49:56 AM5/9/18
to Juan Emilio Gabito Decuadra, cstore users
Hello Juan Emilio

The behavior you are seeing comes from PostgreSQL and is outside of cstore_fdw. cstore_fdw only provides the selected columns/rows to PostgreSQL for further processing. Any GROUP BY, aggregate, function call, or window function operation is handled by PostgreSQL.

This is expected PostgreSQL behavior. The planner may use different GROUP BY strategies, and those can alter the ordering. It is probably using the HashAggregate strategy. You can see the executed plan by running the query under EXPLAIN. Try:

explain analyze SELECT departamento, COUNT(*) FROM censo2011 GROUP BY departamento

You should not assume any ordering unless you specifically ask for it.
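
As a side note, you can confirm that hash aggregation is what reorders the groups: disabling it in the session (for testing only, via the standard enable_hashagg planner setting) forces a sort-based GroupAggregate plan, which happens to emit groups in sorted order as a side effect.

```sql
-- For testing only: disable hash aggregation in this session
SET enable_hashagg = off;

EXPLAIN ANALYZE
SELECT departamento, COUNT(*)
FROM censo2011
GROUP BY departamento;
-- The plan should now show GroupAggregate over a Sort node.

RESET enable_hashagg;
```

Even then, an explicit ORDER BY is the only guaranteed way to get sorted output.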

Hope it helps
Murat


--
You received this message because you are subscribed to the Google Groups "cstore users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cstore-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Murat Tuncer
Software Engineer | Citus Data
mtu...@citusdata.com

Juan Emilio Gabito Decuadra

unread,
May 9, 2018, 2:39:51 PM5/9/18
to cstore users
It is indeed using the HashAggregate strategy:

"HashAggregate  (cost=49445.99..49446.00 rows=1 width=186) (actual time=698.762..698.764 rows=19 loops=1)"
"  Group Key: departamento"
"  ->  Foreign Scan on censo5  (cost=0.00..33016.60 rows=3285877 width=178) (actual time=1.682..226.853 rows=3285877 loops=1)"
"        CStore File: /var/lib/postgresql/10/main/cstore_fdw/16384/16411"
"        CStore File Size: 15511714"
"Planning time: 25.549 ms"
"Execution time: 699.194 ms"


What we are doing is translating questions asked in Spanish and English into SQL, so we have a lot of control over what we generate, and we always sort when the question has a dimension in it. E.g., "what's the average age by state" translates into SELECT state, AVG(age) FROM census GROUP BY state ORDER BY state.

Regards!