Sesame ValueFactory createLiteral(Date date) catastrophically slow

Aleksandar Stojadinovic

Jan 28, 2014, 9:53:00 AM
to sta...@clarkparsia.com, Roland Stuehmer
Hello, 

I'm programmatically filling up Stardog with some data that should be used for testing (the whole app is basically a test, this is a simulation), and it goes something like this:

    for (int i = 0; i < 15000; i++) {
        BNode hrBnode = factory.createBNode();
        int heartrate = <some calculation>
        Literal value = factory.createLiteral(heartrate);
        Literal time = factory.createLiteral(dt.toDate());

        m.add(factory.createStatement(patient, hasHeartrate, hrBnode));
        m.add(factory.createStatement(hrBnode, hasValue, value));
        m.add(factory.createStatement(hrBnode, observationTime, time));
        dt = dt.plusSeconds(1);

        if (i % 1000 == 0)
            System.out.println(String.valueOf(i));
    }
    connection.begin();
    connection.add(m);
    connection.commit();
    connection.close();


The issue is that the line Literal time = factory.createLiteral(dt.toDate()); is very, very slow. I can literally measure its latency in seconds. I've run it through the profiler and it consumes over 71% of the CPU time. When I substitute a stub (for example createLiteral(long l)) it goes much, much faster. The problem is that I will probably have to use createLiteral(Date d) in production. I can fall back to the (long l) version, but then I lose time zone information and meaningful time comparison; everything comes down to plain numbers, and that's not a really good idea.

Has anyone noticed this before? Any other advice about the overall approach?

Thanks in advance.

Mike Grove

Jan 28, 2014, 9:59:05 AM
to stardog, Roland Stuehmer
On Tue, Jan 28, 2014 at 9:53 AM, Aleksandar Stojadinovic <sal...@gmail.com> wrote:
> The issue is that the Literal time = factory.createLiteral(dt.toDate()); is very, very slow.

You can see the Sesame implementation for this method here [1].

I think your only options are to create the XMLGregorianCalendar yourself, or to construct a string with the correct lexical format (using String.format, for example) and pass it to the createLiteral(String, URI) method on the Sesame ValueFactory.
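
For illustration, here is a minimal sketch of the string approach. It uses only the JDK so it runs on its own; the Sesame call is shown as a comment. The class name XsdDateTime and its helper methods are hypothetical, and SimpleDateFormat is swapped in for String.format because its "XXX" pattern emits the zone offset with a colon (or "Z" for UTC), which is what the xsd:dateTime lexical form expects. The datatype URI is assumed to be XMLSchema.DATETIME from org.openrdf.model.vocabulary.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Sesame or Stardog.
final class XsdDateTime {

    // Returns a formatter for the xsd:dateTime lexical form,
    // e.g. 1970-01-01T01:00:00.000+01:00.
    // SimpleDateFormat is not thread-safe: create one per thread or per loop.
    static SimpleDateFormat newFormatter(TimeZone zone) {
        SimpleDateFormat f =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSXXX", Locale.ROOT);
        f.setTimeZone(zone);
        return f;
    }

    // Convenience wrapper for a single value.
    static String format(Date date, TimeZone zone) {
        return newFormatter(zone).format(date);
    }

    public static void main(String[] args) {
        // Build the formatter once and reuse it inside the loop.
        SimpleDateFormat f = newFormatter(TimeZone.getTimeZone("UTC"));
        Date start = new Date(0L);
        for (int i = 0; i < 3; i++) {
            String lexical = f.format(new Date(start.getTime() + i * 1000L));
            // With Sesame on the classpath, the literal would then be created as:
            //   Literal time = factory.createLiteral(lexical, XMLSchema.DATETIME);
            System.out.println(lexical);
        }
    }
}
```

Reusing one formatter across the loop keeps the per-iteration cost down to a string format plus the createLiteral(String, URI) call.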

Cheers,

Mike 

 
I can litteraly measure the latency with it in seconds. I've ran it through the profiler an it consumes over 71% of the CPU time. When I put some stub instead (for example createLiteral(long l)) it goes much, much faster. The issue is I will probably have to use the createLiteral(Date d) in production. I can fall back to the (long l) version but then I lose time zone information and articulated time comparison, it comes down to numbers, and that's not a really good idea. 

Has anyone noticed this before? Any other advice about the idea in full? 

Tnx in advance.

--
-- --
You received this message because you are subscribed to the C&P "Stardog" group.
To post to this group, send email to sta...@clarkparsia.com
To unsubscribe from this group, send email to
stardog+u...@clarkparsia.com
For more options, visit this group at
http://groups.google.com/a/clarkparsia.com/group/stardog?hl=en

Aleksandar Stojadinovic

Jan 28, 2014, 10:17:59 AM
to sta...@clarkparsia.com, Roland Stuehmer
I already tried creating the Calendar myself, and it is a bit faster, but not dramatically. I will try with the string. My only issue is the URI parameter. Is there a constant I should use for xsd:dateTime? What should the URI look like, and how is it constructed? I'm a bit new, so maybe that's a stupid question.

Markus Stocker

Jan 28, 2014, 10:25:21 AM
to Stardog, Roland Stuehmer
On Tue, Jan 28, 2014 at 5:17 PM, Aleksandar Stojadinovic
<sal...@gmail.com> wrote:
> I already tried with the Calendar myself, and it is a bit faster, but still
> not dramatically. I will try with the string. My only issue is the URI
> parameter. Is there a constant I should put for the xsd:dateTime or not?

I believe you can use XMLSchema.DATETIME from org.openrdf.model.vocabulary.

Cheers, m.

Aleksandar Stojadinovic

Jan 28, 2014, 10:45:28 AM
to sta...@clarkparsia.com, Roland Stuehmer
That did it! Thanks a bunch you all!