watch tuple of Hyracks during query execution


Keren-Audrey Ouaknine

Nov 14, 2012, 9:45:51 AM
to hyrack...@googlegroups.com
Hello,

I would like to ask about the record format in Hivesterix.

In particular, in which function can I watch data records at runtime?
I put a breakpoint in the nextFrame() function of the IFrameWriter class, as well as on the input of the write operator:
ILogicalOperator src = writeResultOp.getInputs().get(0).getValue();

But I didn't see the tuples.

I also found files in .waf format under /tmp/kereno that contain data which, at first sight, seems relevant to the query.
What are these files for?

Thanks for your help,
Keren

Vinayak Borkar

Nov 14, 2012, 12:40:21 PM
to hyrack...@googlegroups.com
Hi Keren,


Tuple data is moved to and from Hyracks operators using the IFrameWriter
interface. However, all data is moved as bytes placed in ByteBuffer
objects. Tuples are not kept as Java objects that you can "watch" in a
debugger. The bytes placed in the byte buffer are in a binary format
that is not readable in the Java debugger.

One way to look at the tuples flowing in the frames is to have a
FrameTupleAccessor instance that holds the ByteBuffer pretty-print the
frame. Look inside the code of the operators and you should find an
instance of a FrameTupleAccessor object that is used to read tuple data
from the ByteBuffer objects. You can just call prettyPrint on one of
those instances to see a human-readable representation of the tuple data.
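
To illustrate the idea of an accessor that decodes a binary frame into readable text, here is a minimal, self-contained sketch. It does not use any Hyracks classes: ToyFrameDump, its frame layout (tuple bytes at the front, end offsets and a tuple count at the back), and the (id, name) tuple schema are all assumptions made up for this example, not the real Hyracks format. In Hyracks itself you would call prettyPrint on the FrameTupleAccessor you find in the operator, as described above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/**
 * Toy stand-in for a frame accessor. The layout is an assumption for
 * illustration only and is NOT the actual Hyracks frame format:
 *   [tuple bytes ...][free space][end offset of tuple n-1]...[end offset of tuple 0][tuple count]
 * Each tuple is an int id followed by a length-prefixed UTF-8 string.
 */
public class ToyFrameDump {
    static final int FRAME_SIZE = 64;

    /** Writes (id, name) tuples into a fresh frame using the toy layout. */
    public static ByteBuffer buildFrame(int[] ids, String[] names) {
        ByteBuffer frame = ByteBuffer.allocate(FRAME_SIZE);
        int slotsStart = FRAME_SIZE - 4 * (ids.length + 1); // first byte used by the slot area
        int pos = 0;
        for (int i = 0; i < ids.length; i++) {
            byte[] name = names[i].getBytes(StandardCharsets.UTF_8);
            if (pos + 8 + name.length > slotsStart) {
                throw new IllegalStateException("frame full");
            }
            frame.putInt(pos, ids[i]);
            frame.putInt(pos + 4, name.length);
            for (int j = 0; j < name.length; j++) {
                frame.put(pos + 8 + j, name[j]);
            }
            pos += 8 + name.length;
            frame.putInt(FRAME_SIZE - 4 * (i + 2), pos); // end offset slot of tuple i
        }
        frame.putInt(FRAME_SIZE - 4, ids.length); // tuple count lives in the last 4 bytes
        return frame;
    }

    /** Decodes every tuple in the frame into one human-readable line. */
    public static String prettyPrint(ByteBuffer frame) {
        StringBuilder sb = new StringBuilder();
        int count = frame.getInt(FRAME_SIZE - 4);
        int start = 0;
        for (int i = 0; i < count; i++) {
            int end = frame.getInt(FRAME_SIZE - 4 * (i + 2));
            int id = frame.getInt(start);
            int len = frame.getInt(start + 4);
            byte[] name = new byte[len];
            for (int j = 0; j < len; j++) {
                name[j] = frame.get(start + 8 + j);
            }
            sb.append("tuple ").append(i)
              .append(" [").append(start).append(", ").append(end).append("): id=")
              .append(id).append(", name=")
              .append(new String(name, StandardCharsets.UTF_8)).append('\n');
            start = end;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ByteBuffer frame = buildFrame(new int[] {1, 2}, new String[] {"alice", "bob"});
        System.out.print(prettyPrint(frame));
    }
}
```

In a debugger the frame above shows up only as a byte array, whereas the prettyPrint output lists each tuple with its byte range and decoded fields. That is the kind of view a real accessor's pretty-printer is meant to give you.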

The .waf files in /tmp/kereno are temporary files that are created by
spilling operators such as the Sort operator and the Join operators to
hold run-file data. Those are also in a binary format and will not make
much sense when looked at in a text editor.
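
If you still want to peek at the raw bytes of such a file, a hex dump is usually more informative than a text editor. The following is a generic Java sketch, not a Hyracks tool: the class name HexPeek and the 256-byte limit are arbitrary choices for this example, and making sense of the bytes still requires knowing the record layout of the job that wrote them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class HexPeek {
    /** Formats up to limit bytes as rows of 16 hex values, each row prefixed with its offset. */
    public static String hexDump(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, data.length);
        for (int i = 0; i < n; i++) {
            if (i % 16 == 0) {
                if (i > 0) {
                    sb.append('\n');
                }
                sb.append(String.format("%08x ", i));
            }
            sb.append(String.format(" %02x", data[i] & 0xff));
        }
        return sb.append('\n').toString();
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path of the file to inspect; only the first bytes are read,
        // since temporary run files can be large.
        byte[] buf = new byte[256];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.read(buf); // may return fewer than 256 bytes, or -1 for an empty file
        }
        System.out.print(hexDump(Arrays.copyOf(buf, Math.max(n, 0)), 256));
    }
}
```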

Vinayak



Keren-Audrey Ouaknine

Nov 20, 2012, 9:37:40 AM
to hyrack...@googlegroups.com
Thanks Vinayak. I was looking for a FrameTupleAccessor in DataSourceScanPOOperator and didn't find one.
The accessor is built by PartitionDataWriter, which writes tuples to a FrameTupleAppender, and by Hybrid Join (and probably others too; I only checked the h12 query).

So how are tuples sent to DataSourceScanPOOperator? My goal is to change the tuple deserialization from the Hyracks format to a Hadoop format (an implementation of InputFormat). I could then create a JobConf with an InputFormat and pass the tuples to a JobSpecification that merely scans the data and writes the same bytes to an OutputFormat. Does that make sense?

Thanks,
Keren