java.lang.OutOfMemory Exception


teddius

Jul 28, 2008, 6:30:47 AM
to OpenAnzo
Hi OpenAnzo Team,

When I call anzoClient.getReplicaGraph(...), I get the following
exception. I am using the Nightly Build from 20080727. I started the
Anzo server with -Xmx1024 -Xms1024 and use PostgreSQL. In the database,
the graph takes up about 50 MB in the statements table. Is the graph too
big?

Should I use getServerGraph(...) instead? What is the difference?

Thank you for your help,
Andreas


Here is the exception stacktrace:

org.openanzo.common.exceptions.AnzoException: ErrorCode[66048:16405] CorrelationId[05c371c9-72ac-4c98-95b4-597f559b61b5] [INTERNAL_ERROR] Unknown server exception executing request java.lang.OutOfMemoryError: Java heap space
    at org.openanzo.servicecontainer.backends.JMSConnectionBackend.requestResponse(JMSConnectionBackend.java:760)
    at org.openanzo.servicecontainer.services.proxy.jms.JMSModelServiceProxy.findStatements(JMSModelServiceProxy.java:101)
    at org.openanzo.servicecontainer.services.proxy.jms.JMSModelServiceProxy.findStatements(JMSModelServiceProxy.java:66)
    at org.openanzo.client.ServerQuadStore.find(ServerQuadStore.java:70)
    at org.openanzo.client.TransactionProxy.find(TransactionProxy.java:118)
    at org.openanzo.rdf.NamedGraph.find(NamedGraph.java:150)
    at org.openanzo.rdf.NamedGraph.clear(NamedGraph.java:69)

teddius

Jul 28, 2008, 7:22:24 AM
to OpenAnzo
...and this is the output on the server:

Anzo->Exception in thread "EventPublisherThread" java.lang.OutOfMemoryError: Java heap space
    at org.apache.activemq.util.MarshallingSupport.writeUTF8(MarshallingSupport.java:273)
    at org.apache.activemq.command.ActiveMQTextMessage.beforeMarshall(ActiveMQTextMessage.java:117)
    at org.apache.activemq.openwire.v3.MessageMarshaller.tightMarshal1(MessageMarshaller.java:114)
    at org.apache.activemq.openwire.v3.ActiveMQMessageMarshaller.tightMarshal1(ActiveMQMessageMarshaller.java:77)
    at org.apache.activemq.openwire.v3.ActiveMQTextMessageMarshaller.tightMarshal1(ActiveMQTextMessageMarshaller.java:77)
    at org.apache.activemq.openwire.OpenWireFormat.marshal(OpenWireFormat.java:228)
    at org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:164)
    at org.apache.activemq.transport.InactivityMonitor.oneway(InactivityMonitor.java:233)
    at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:83)
    at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:100)
    at org.apache.activemq.transport.failover.FailoverTransport.oneway(FailoverTransport.java:443)
    at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:40)
    at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
    at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1176)
    at org.apache.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:1170)
    at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1628)
    at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:227)
    at org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:241)
    at org.openanzo.server.repository.publisher.EventPublisher.publishTransactionUpdateMessage(EventPublisher.java:404)
    at org.openanzo.server.repository.publisher.EventPublisher$PublisherThread.run(EventPublisher.java:526)





Thanks, Andreas

Ben Szekely

Jul 28, 2008, 9:48:42 AM
to open...@googlegroups.com
Hi Andreas,
Using a server graph might solve your problem in the short run, but
a replica graph should also work. A server graph never brings the
entire graph down to the client unless you ask it to via a large
find(..) operation. I'd like to see a bit more of the client stack
trace, if you can reproduce it.

- Ben

teddius

Jul 31, 2008, 2:13:11 PM
to OpenAnzo
Hi Ben,

On Jul 28, 3:48 pm, Ben Szekely <b...@cambridgesemantics.com> wrote:
> Using a server graph might solve your problem in the short run, but
> a replica graph should also work. A server graph never brings the
> entire graph down to the client, unless you ask it to via a large
> find(..) operation.

Thank you for your explanation! In which ways is a serverGraph more
lightweight than a replicaGraph? What are the advantages? In my use
case I am storing a lot of data at short intervals. This means that I
need to add about 20-30 statements per second to the graph and need
write and read access to the new data right away. Should I use a
server or replica graph for that?

Do I have to close the graph - graph.close() - every time I add
statements to it?
Do I have to close the AnzoClient - anzoClient.close() - every time I
add statements to the graph?
Is it possible to never close the graph while adding, deleting,
calling updateRepository(), etc. and still have the current data in the
graph? Would it be ok to close the AnzoClient only when the application
closes?

I know this is quite a few questions, but I am looking forward to your
answers and would like to thank you in advance for your time!

>I'd like to see a bit more of the client stack
> trace, if you can reproduce it.

I tried to reproduce it, but I could not. I guess the reason was that
I started the server without the -Xmx512M -Xms512M command-line
arguments. Could that be?

Sincerely,
Andreas

Ben Szekely

Jul 31, 2008, 3:55:22 PM
to open...@googlegroups.com
Hi Andreas,
Those are all very good questions about how the Anzo API is to be
used, and they expose how behind we are on documentation :) I'll
provide some general explanations, and then answer your questions
specifically.

All graphs handed out by the AnzoClient, both replica and server, are
reference counted. Each call to getReplicaGraph(URI) increments the
reference count of the single graph instance, and each call to
graph.close() decrements the count on that instance. The call to
close() that brings the count down to 0 causes the graph to be
destroyed and fully closed. Replica and server graphs are maintained
separately, and so have different reference counts, even for the same URI.
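
To make the counting concrete, here is a minimal sketch of the behaviour described above. It is not Anzo code: SketchClient and SketchGraph are hypothetical stand-ins, URIs are plain strings, and thread safety is ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a graph handle; not the Anzo NamedGraph class. */
final class SketchGraph {
    final String uri;
    int refCount;          // number of outstanding get...Graph() calls
    boolean destroyed;     // set once the count drops to zero
    private final Map<String, SketchGraph> owner;

    SketchGraph(String uri, Map<String, SketchGraph> owner) {
        this.uri = uri;
        this.owner = owner;
    }

    /** Decrements the count; the last close removes and destroys the instance. */
    void close() {
        if (destroyed) {
            throw new IllegalStateException("already fully closed: " + uri);
        }
        if (--refCount == 0) {
            owner.remove(uri);
            destroyed = true;
        }
    }
}

/** Hypothetical stand-in for the client; one registry per kind of graph. */
final class SketchClient {
    private final Map<String, SketchGraph> replicaGraphs = new HashMap<>();
    private final Map<String, SketchGraph> serverGraphs = new HashMap<>();

    SketchGraph getReplicaGraph(String uri) { return acquire(replicaGraphs, uri); }
    SketchGraph getServerGraph(String uri)  { return acquire(serverGraphs, uri); }

    /** Returns the single instance for this URI and bumps its count. */
    private static SketchGraph acquire(Map<String, SketchGraph> registry, String uri) {
        SketchGraph g = registry.computeIfAbsent(uri, u -> new SketchGraph(u, registry));
        g.refCount++;
        return g;
    }
}
```

In this model, two getReplicaGraph calls for the same URI return the same instance with a count of 2, and the instance is only destroyed by the second close(). A server graph for that URI is counted independently.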

A Replica Graph is a graph that is cached in the local replica on the
client. By default, this replica is held in memory. Each replicaGraph
opened by anzoClient.getReplicaGraph(...) is held in the replica until
the last caller calls graph.close() (as above). Until then, the state of
the graph in the replica is automatically maintained by the combus
notification system, and these updates are surfaced in the API via
events on the replica graph objects. All reads performed on the replica
graph *ultimately go against the local replica.

A Server Graph, on the other hand, is not kept locally at all. All
reads *ultimately go against the server. However, updates to the graphs
on the server are communicated via notification and surfaced in the API
by events on the server graph objects.

Now, the reason why I said *ultimately above is that until open
transactions have been committed, and until client.updateRepository()
has been called, all involved writes (to both server and replica graphs)
reside in a transaction queue, waiting to be sent to the server.
Therefore, to have consistent read/write, all reads (to both server and
replica graphs) are filtered through this in-memory transaction queue.
The same transaction queue is shared between replica and server graphs.
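
As a rough illustration of that filtering, the sketch below models a client whose reads replay the pending writes on top of the backing state. Everything in it is hypothetical (QueueingClient, statements as plain strings, a HashSet standing in for the server or replica); only the name updateRepository() is borrowed from the description above, and the real method certainly does more.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a client whose reads pass through a pending-write queue. */
final class QueueingClient {
    /** One queued change: add or remove a statement (modelled as a string). */
    private static final class Change {
        final boolean add;
        final String statement;
        Change(boolean add, String statement) {
            this.add = add;
            this.statement = statement;
        }
    }

    private final Set<String> backingState = new HashSet<>(); // stands in for server/replica
    private final List<Change> pending = new ArrayList<>();   // one queue for the whole client

    void add(String statement)    { pending.add(new Change(true, statement)); }
    void remove(String statement) { pending.add(new Change(false, statement)); }

    /** A read sees the backing state with queued changes replayed in order. */
    boolean contains(String statement) {
        boolean present = backingState.contains(statement);
        for (Change c : pending) {
            if (c.statement.equals(statement)) {
                present = c.add;
            }
        }
        return present;
    }

    /** Flushes every queued change at once, regardless of which graph it touched. */
    void updateRepository() {
        for (Change c : pending) {
            if (c.add) {
                backingState.add(c.statement);
            } else {
                backingState.remove(c.statement);
            }
        }
        pending.clear();
    }

    int pendingCount() { return pending.size(); }
    boolean backingContains(String statement) { return backingState.contains(statement); }
}
```

The point of the model is that a statement added a moment ago is already visible to contains() even though the backing state has not seen it yet, which is why new data can be read right after it is written.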

> Thank you for your explanation! In which ways is a serverGraph more
> lightweight than a replicaGraph?
>

It's not entirely clear which is more lightweight. A server graph is
more lightweight in that it does not require that the entire contents of
the graph reside in memory. A replica graph is more lightweight in that
it requires fewer calls to the server for everyday read operations.

> What are the advantages?

The short answer is that replica graphs provide instant read access, and
server graphs provide access to arbitrarily large graphs. However,
there are more subtle differences we can discuss if you are interested.

> In my use
> case I am storing a lot of data in short intervals. This means that I
> need to add about 20-30 statements a seconds to the graph and need
> write and read access to the new data right away. Should I use a
> server or replica graph for that?
>

With those constraints, either should suffice. But you should decide
based on how big your graphs are, and whether you need them cached
locally in the client.

> Do I have to close the graph - graph.close() - every time I add
> statements to it?
>

No. Simply perform any updates on the graph, preferably within a
transaction boundary, and then call updateRepository() to commit the
changes to the server. Updates are sent to the server on a client-wide
basis, not per graph.

> Do I have to close the AnzoClient - anzoClient.close() - every time I
> add statements to the graph?
>

In general, you should only call anzoClient.close() when your
application is ready to terminate. However, things get a bit
complicated with web applications. Even though AnzoClient is
multi-threaded, only a single request should access an AnzoClient
instance at a time, much like JDBC database connection objects, and so
they should be pooled. We have experimented a bit with such pooling,
and I'd be interested in working with you to see if it is necessary for
what you are doing.
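
For illustration only, a pool of that kind could look roughly like the sketch below. It is not the pooling we experimented with: ClientPool is a hypothetical, generic fixed-size pool built on java.util.concurrent, and T stands for whatever client type the application uses.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

/** Minimal fixed-size pool; T would be the client type in a real application. */
final class ClientPool<T> {
    private final BlockingQueue<T> idle;

    /** Takes ownership of the given, already created clients (at least one). */
    ClientPool(List<T> clients) {
        idle = new ArrayBlockingQueue<>(clients.size(), false, clients);
    }

    /** Borrows a client, runs the work, and always hands the client back. */
    <R> R withClient(Function<T, R> work) {
        T client;
        try {
            client = idle.take();   // blocks while every client is in use
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a client", e);
        }
        try {
            return work.apply(client);
        } finally {
            idle.add(client);       // cannot overflow: capacity equals pool size
        }
    }

    int idleCount() { return idle.size(); }
}
```

Each web request would wrap its work in withClient(...), so no two requests ever touch the same client at once, and the clients themselves are only closed when the application shuts down.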

> Is it possible to never close the graph during adding, deleting,
> updateRepository(), etc. and having still the actual data in the
> graph? Would it be ok to close the AnzoClient only at the closing of
> the application?
>

You may be confusing closing the graph with closing the client. Close
the graph only when that part of the application is finished reading or
writing, or no longer needs access to the graph. In the replica graph
case, close the graph when you no longer want it to exist in the
replica. In other words, if you are done reading/writing for now but
don't want to download the graph again later, keep it open. The client
should really only be closed when access to Anzo as a whole is no
longer required, usually when the application is being stopped or shut
down.


> I know these are not few question but I am looking forward for your
> anwsers and would like to thank you in advance for your time!
>

I hope this helps. Let me know if you want to talk more about your
specific application or architecture.

- Ben

teddius

Aug 23, 2008, 10:46:43 AM
to OpenAnzo
Hi Ben,

Thanks for your long reply. It helped me a lot to better understand how
Anzo works. I think it would be great if you could add some of the
insights below to the wiki as well.

On Jul 31, 9:55 pm, Ben Szekely <b...@cambridgesemantics.com> wrote:
> With those constraints, either should suffice. But you should decide
> based on how big your graphs are, and whether you need it cached locally
> in the client.

I chose to use a replicaGraph, because I have to access the triples in
the graph on a regular basis.

> I hope this helps. Let me know if you want to talk more about your
> specific application or architecture.

Thanks, this helped a lot!

Best regards,
Andreas
