Parquet columnar storage format for Hadoop

Angel Java Lopez

Mar 12, 2013, 1:30:31 PM
to spain-scala...@googlegroups.com
Hi everyone!

This may be of interest given the topic of this list:

Parquet is a columnar storage format for Hadoop.

We created Parquet to make the advantages of compressed, efficient columnar data representation available to any project in the Hadoop ecosystem, regardless of the choice of data processing framework, data model, or programming language.

Parquet is built from the ground up with complex nested data structures in mind, and uses the repetition/definition level approach to encoding such data structures, as popularized by Google Dremel. We believe this approach is superior to simple flattening of nested name spaces.
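To make the repetition/definition level idea concrete, here is a minimal sketch for a single column whose values sit inside two nested repeated fields (a list of lists). It is not Parquet's actual code or API: the function names `shred` and `assemble` and the sample records are assumptions made up for illustration. The repetition level says at which nesting depth a value continues a list (0 starts a new record), and the definition level says how many of the nested fields on the path are actually present.

```python
def shred(records):
    """Flatten records (lists of lists of values) into (value, rep, def) triples."""
    triples = []
    for record in records:
        if not record:
            # Outer list is empty: nothing on the path is defined.
            triples.append((None, 0, 0))
            continue
        for i, inner in enumerate(record):
            # 0 starts a new record, 1 repeats the outer list.
            outer_rep = 0 if i == 0 else 1
            if not inner:
                # Outer element exists, inner list is empty.
                triples.append((None, outer_rep, 1))
                continue
            for j, value in enumerate(inner):
                # 2 means the value continues the current inner list.
                rep = outer_rep if j == 0 else 2
                triples.append((value, rep, 2))
    return triples


def assemble(triples):
    """Rebuild the nested records from (value, rep, def) triples."""
    records = []
    for value, rep, dfn in triples:
        if rep == 0:
            records.append([])       # start a new record
        record = records[-1]
        if dfn == 0:
            continue                 # empty outer list
        if rep <= 1:
            record.append([])        # start a new inner list
        if dfn == 2:
            record[-1].append(value) # a real value
    return records


# Hypothetical sample data: four records with different shapes.
records = [[[1, 2], [3]], [], [[]], [[4], []]]
triples = shred(records)
```

The flat list of triples is enough to rebuild the original nesting, including empty lists, which is what simple flattening of the values alone would lose.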

Iván de Praadoo

Mar 13, 2013, 5:42:56 AM
to spain-scala...@googlegroups.com
Interesting. It looks like an alternative along the lines of Doug Cutting's Trevni: http://avro.apache.org/docs/current/trevni/spec.html

Iván

