Unable to allocate frame exception


Pushkar Khadilkar

Sep 21, 2013, 5:41:19 AM
to hyrack...@googlegroups.com, abhishek gupta
 Hi,

We are trying to run the u2_gby_external.hive query on Hyracks (through Hivesterix).

Our setup is a 4-machine cluster with default values for all Hyracks parameters. We are on Hyracks 2.9, and the dataset we are using is TPC-H.

The query throws the following exception and continues running.

edu.uci.ics.hyracks.api.exceptions.HyracksDataException:
edu.uci.ics.hyracks.api.exceptions.HyracksDataException: Unable to allocate frame: Not enough memory
    at edu.uci.ics.hyracks.control.nc.Task.pushFrames(Task.java:320)
    at edu.uci.ics.hyracks.control.nc.Task.run(Task.java:261)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: edu.uci.ics.hyracks.api.exceptions.HyracksDataException: Unable to allocate frame: Not enough memory
    at edu.uci.ics.hyracks.control.nc.Joblet.allocateFrame(Joblet.java:229)
    at edu.uci.ics.hyracks.control.nc.Task.allocateFrame(Task.java:113)
    at edu.uci.ics.hyracks.dataflow.std.structures.SerializableHashTable.insert(SerializableHashTable.java:62)
    at edu.uci.ics.hyracks.dataflow.std.group.HashSpillableTableFactory$1.insert(HashSpillableTableFactory.java:227)
    at edu.uci.ics.hyracks.dataflow.std.group.external.ExternalGroupBuildOperatorNodePushable.nextFrame(ExternalGroupBuildOperatorNodePushable.java:87)
    at edu.uci.ics.hyracks.control.nc.Task.pushFrames(Task.java:304)
    ... 4 more

We tried increasing JVM memory. Based on the stack trace, it appears that a spillable table is used. But when we read the code for HashSpillableTableFactory, we found no check that detects when the memory limit is reached and data should be spilled to disk. The exception is thrown from SerializableHashTable.insert (which is not aware of spilling), and it is not handled in the anonymous class in HashSpillableTableFactory.

We could be wrong here, as we do not know this code well. How can we resolve this error?

Any help appreciated.

Thanks,
Pushkar

Vinayak Borkar

Sep 21, 2013, 10:29:52 AM
to hyrack...@googlegroups.com
This issue is related to a bug in the current Hyracks release.

In order to temporarily avoid this issue, please make this modification
to the allocateFrame method in edu.uci.ics.hyracks.control.nc.Joblet class.

The current method reads:

ByteBuffer allocateFrame() throws HyracksDataException {
    if (appCtx.getMemoryManager().allocate(frameSize)) {
        memoryAllocation.addAndGet(frameSize);
        return ByteBuffer.allocate(frameSize);
    }
    throw new HyracksDataException("Unable to allocate frame: Not enough memory");
}


Please comment out the check and the exception so the new code looks as
follows:

ByteBuffer allocateFrame() throws HyracksDataException {
    // if (appCtx.getMemoryManager().allocate(frameSize)) {
    memoryAllocation.addAndGet(frameSize);
    return ByteBuffer.allocate(frameSize);
    // }
    // throw new HyracksDataException("Unable to allocate frame: Not enough memory");
}


Thanks,
Vinayak

Pushkar Khadilkar

Sep 22, 2013, 5:41:04 AM
to hyrack...@googlegroups.com
Hi Vinayak,

Thanks for your quick response.

We made the changes you suggested. That specific error is gone, but now the query throws an OutOfMemoryError.

Exception in thread "Thread-23" Exception in thread "Thread-25" Exception in thread "Thread-24" Exception in thread "Thread-26" java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
    at edu.uci.ics.hyracks.control.nc.Joblet.allocateFrame(Joblet.java:227)
    at edu.uci.ics.hyracks.control.nc.Task.allocateFrame(Task.java:113)
    at edu.uci.ics.hyracks.dataflow.std.group.HashSpillableTableFactory$1.nextAvailableFrame(HashSpillableTableFactory.java:378)
    at edu.uci.ics.hyracks.dataflow.std.group.HashSpillableTableFactory$1.insert(HashSpillableTableFactory.java:216)
    at edu.uci.ics.hyracks.dataflow.std.group.external.ExternalGroupBuildOperatorNodePushable.nextFrame(ExternalGroupBuildOperatorNodePushable.java:87)
    at edu.uci.ics.hyracks.control.nc.Task.pushFrames(Task.java:304)
    at edu.uci.ics.hyracks.control.nc.Task.run(Task.java:261)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Please find the complete stack trace attached.

Our modified code has:


ByteBuffer allocateFrame() throws HyracksDataException {
    // if (appCtx.getMemoryManager().allocate(frameSize)) {
    memoryAllocation.addAndGet(frameSize);
    return ByteBuffer.allocate(frameSize);
    // }
    // throw new HyracksDataException("Unable to allocate frame: Not enough memory");
}

Thanks,
Pushkar Khadilkar
outofmemoryerror.txt

Vinayak Borkar

Sep 22, 2013, 1:16:40 PM
to hyrack...@googlegroups.com
Hi Pushkar,


This error shows that the amount of memory allocated to the various
memory intensive operators exceeds the amount of memory provided to the
JVM heap. Can you provide the memory configuration that you are running
the JVM with and also the Hivesterix query that you are executing when
you see this error?
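
A rough way to sanity-check this is to add up the frame budgets of the memory-intensive operators in a job and compare the total with the JVM's maximum heap. The sketch below is only an illustration: the class name, the operator names, the per-operator frame counts, and the 32768-byte frame size are all assumptions, not values read from Hyracks or Hivesterix. Only Runtime.getRuntime().maxMemory() is a standard JDK call.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hyracks: compares the sum of operator
// memory budgets with the heap the JVM was started with (-Xmx).
final class OperatorBudgetCheck {

    // Sum of (frames * frameSize) over all operators, in bytes.
    static long totalBudgetBytes(Map<String, Integer> framesLimitByOperator, int frameSizeBytes) {
        long total = 0;
        for (int frames : framesLimitByOperator.values()) {
            total += (long) frames * frameSizeBytes;
        }
        return total;
    }

    // Necessary but not sufficient: the heap also holds everything else the node allocates.
    static boolean fitsInHeap(long totalBudgetBytes, long maxHeapBytes) {
        return totalBudgetBytes <= maxHeapBytes;
    }

    public static void main(String[] args) {
        Map<String, Integer> budgets = new LinkedHashMap<>();
        budgets.put("external-group-by", 8192); // assumed framesLimit
        budgets.put("sort", 4096);              // assumed framesLimit
        int frameSizeBytes = 32768;             // assumed frame size

        long total = totalBudgetBytes(budgets, frameSizeBytes);
        long maxHeap = Runtime.getRuntime().maxMemory(); // effective -Xmx
        System.out.println("operator budgets: " + total + " bytes");
        System.out.println("max heap:         " + maxHeap + " bytes");
        System.out.println("fits in heap:     " + fitsInHeap(total, maxHeap));
    }
}
```

If the total is close to or above the heap size, an OutOfMemoryError like the one reported is to be expected even when every operator stays within its own budget.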

Thanks,
Vinayak

Pushkar Khadilkar

Sep 23, 2013, 6:05:09 AM
to hyrack...@googlegroups.com
Hi Vinayak,

We run Hyracks with default values for all parameters.

Our cluster.properties has the following JAVA_OPTS:

CCJAVA_OPTS="-Xmx1g -Djava.util.logging.config.file=logging.properties"
NCJAVA_OPTS="-Xmx1g -Djava.util.logging.config.file=logging.properties"

Are there any other JVM-related configuration parameters?

Thanks,
Pushkar Khadilkar

Yingyi Bu

Sep 23, 2013, 4:04:24 PM
to hyrack...@googlegroups.com
Maybe you can increase the -Xmx parameter to give the JVM a larger maximum heap size.

Yingyi



Pushkar Khadilkar

Sep 24, 2013, 1:47:47 AM
to hyrack...@googlegroups.com
Hi Yingyi,

We tried setting it to 2GB, which did not help.

We are running on a shared cluster, so we cannot increase the memory limit beyond 1 or 2 GB.

Is there a minimum memory requirement (> 1GB)?

Will Hyracks not spill data to disk when the memory limit is reached?

Thanks,
Pushkar Khadilkar 

Mike Carey

Sep 24, 2013, 1:51:54 AM
to hyrack...@googlegroups.com
It sounds like you need to look at the Hyracks job w.r.t. what you are giving the operators as memory budgets and how the sum of those compares to your available memory.  (Hyracks' operators are spilling, but not because they sense in their environment that it's spilling time - you still need to give them budgets, and then they will stay in bounds and spill if they are locally tempted to go over budget.)
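
To make the idea of a local budget concrete, here is a toy sketch of a grouped count that spills when it would exceed its own frame budget, without consulting the JVM or any global memory manager. It is not the Hyracks external group-by operator: the class, the entriesPerFrame parameter, and the use of in-memory maps in place of run files on disk are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Hyracks implementation.
final class BudgetedGroupCount {
    private final long maxGroupsInMemory; // framesLimit * entriesPerFrame
    private final Map<String, Long> inMemory = new HashMap<>();
    // Stands in for run files written to disk.
    private final List<Map<String, Long>> spilledRuns = new ArrayList<>();

    BudgetedGroupCount(int framesLimit, int entriesPerFrame) {
        if (framesLimit < 1 || entriesPerFrame < 1) {
            throw new IllegalArgumentException("budget must be at least one frame with one entry");
        }
        this.maxGroupsInMemory = (long) framesLimit * entriesPerFrame;
    }

    // Counts one occurrence of key, spilling first if a new group would exceed the budget.
    void insert(String key) {
        boolean isNewGroup = !inMemory.containsKey(key);
        if (isNewGroup && inMemory.size() >= maxGroupsInMemory) {
            spill();
        }
        inMemory.merge(key, 1L, Long::sum);
    }

    private void spill() {
        spilledRuns.add(new HashMap<>(inMemory));
        inMemory.clear();
    }

    int spillCount() {
        return spilledRuns.size();
    }

    // Merges the in-memory table with all spilled runs. A real operator would
    // merge runs within its budget; this sketch merges everything at once.
    Map<String, Long> finish() {
        Map<String, Long> result = new HashMap<>(inMemory);
        for (Map<String, Long> run : spilledRuns) {
            run.forEach((k, v) -> result.merge(k, v, Long::sum));
        }
        return result;
    }
}
```

The decision to spill depends only on the budget passed to the constructor, which is why a budget that is too large for the heap leads to an OutOfMemoryError rather than to spilling.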

Cheers,
Mike

Pushkar Khadilkar

Sep 24, 2013, 2:13:42 AM
to hyrack...@googlegroups.com
Hi,

Thanks for the real-time reply.

I thought that the memory limit applied to all operators in aggregate and that operators spill when they cannot allocate a new memory frame.

Is there a way to specify the memory limits when building a physical DAG for a job to be submitted to Hyracks?

We currently do not create a physical DAG but rely on Hivesterix to do that. We have added a hook in Hivesterix which lets us use our optimizer instead of the default Hivesterix optimizer.

Could you point us to an example of Hyracks code that sets operator memory limits?

Thanks,
Pushkar Khadilkar

Mike Carey

Sep 24, 2013, 2:53:13 AM
to hyrack...@googlegroups.com
Ah!  Got it.  Okay, yes, Hivesterix probably does not have great sophistication there (at least not yet :-)).  I think if you look at the constructor arguments being called as Hivesterix goes through its Hyracks job gen phase you should be able to see where the values are being passed.  (The bounds will be parameters to the operators that are being instantiated in the job graph.)  It would probably be useful to debug this by stopping after job gen for the job that's failing and seeing what these values are set to in that case.  Joins, sorts, and grouped-aggregates have budgets.

Cheers,
Mike

JArod Wen

Sep 25, 2013, 9:23:07 AM
to hyrack...@googlegroups.com
Hi Pushkar,

As Mike suggested, most of these operators take a constructor argument that controls the memory quota (it should be the "framesLimit" parameter). From your error message, I see that the OOM happens in the hash-group-by operator. Currently, the memory quota for this operator is controlled by a configurable parameter called "hive.algebricks.groupby.external.memory", and its default value is 256MB for Hivesterix. You can add an entry for this parameter in MANAGIX_HOME/conf/asterix-configuration.xml and assign a smaller value.

However, this parameter cannot be set to an arbitrarily small value, since this operator creates an in-memory hash table, and this hash table needs a minimum of 2563 frames (around 12MB) to have enough space for the hash table structure. This can be changed too; if you need finer control over the memory budget through your optimizer, you can also change the "tableSize" argument in the constructor of the hash-group-by operator. If you need more information about this, feel free to let me know.
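
As a rough sketch of how a memory budget in bytes could be turned into a frame count and checked against a minimum, assuming a simple division by the frame size: the helper below is hypothetical and is not how Hivesterix computes "framesLimit". The 2563-frame minimum comes from this message; the 32768-byte frame size used in the example call is an assumption.

```java
// Hypothetical helper, not part of Hyracks or Hivesterix.
final class FrameBudget {

    // Converts a byte budget into whole frames and rejects budgets below minFrames.
    static int framesFor(long memoryBytes, int frameSizeBytes, int minFrames) {
        if (frameSizeBytes <= 0) {
            throw new IllegalArgumentException("frame size must be positive");
        }
        long frames = memoryBytes / frameSizeBytes;
        if (frames < minFrames) {
            throw new IllegalArgumentException(
                    "budget of " + frames + " frames is below the minimum of " + minFrames);
        }
        if (frames > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("budget does not fit in an int frame count");
        }
        return (int) frames;
    }

    public static void main(String[] args) {
        int frameSizeBytes = 32768; // assumed frame size
        int minFrames = 2563;       // minimum quoted in this thread
        long budgetBytes = 256L * 1024 * 1024; // the 256MB default quoted in this thread
        System.out.println(framesFor(budgetBytes, frameSizeBytes, minFrames) + " frames");
    }
}
```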

Best,

Pushkar Khadilkar

Sep 26, 2013, 6:32:40 AM
to hyrack...@googlegroups.com
Hi Vinayak, Mike, JArod,

Your comments have cleared up a lot of doubts. Thanks for helping us find the problem.

We are currently discussing internally how best to implement this change within the time limit we have.

The configuration file change is doable. Dynamically setting the operator limits might require some more analysis on our part.

Thanks,
Pushkar Khadilkar

JArod Wen

Sep 26, 2013, 8:27:22 AM
to hyrack...@googlegroups.com
One thing to clarify: the parameter "hive.algebricks.groupby.external.memory" should go in hivesterix-dist/conf/hive-default.xml instead of the Managix configuration file, as it is a Hive-specific parameter. Sorry for the confusion.