TriG cannot be parsed in an executable jar packaged by the Maven Assembly Plugin


Yuanzhe Yang

Sep 26, 2016, 7:09:02 AM
to RDF4J Users
Hi all,

Thanks a lot for your work on the RDF4J project.

I am currently working on a small SPARQL client that runs from the command line as an executable jar, but I am stuck on some strange behavior when parsing RDF in TriG:

Here is a very short code snippet: 

package cn.yyz.rdf4j;
import org.eclipse.rdf4j.model.Model;
import org.eclipse.rdf4j.rio.RDFFormat;
import org.eclipse.rdf4j.rio.Rio;
import java.io.StringReader;

public class TrigTest {
    public static void main(String[] args) throws Exception {
        Model results = Rio.parse(new StringReader("<http://example.org/s> <http://example.org/p> <http://example.org/o>"), "http://example.org/base/", RDFFormat.TRIG);
    }
}

And this is the pom:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>cn.yyz.rdf4j</groupId>
    <artifactId>trig-test</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.eclipse.rdf4j</groupId>
            <artifactId>rdf4j-repository-sparql</artifactId>
            <version>2.0.1</version>
        </dependency>
        <dependency>
            <groupId>org.eclipse.rdf4j</groupId>
            <artifactId>rdf4j-rio-trig</artifactId>
            <version>2.0.1</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.21</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.21</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <archive>
                        <manifest>
                            <mainClass>cn.yyz.rdf4j.TrigTest</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

I have the same piece of Java code in a JUnit test, and the test passes without problems during the build.

And here is the result when I try to run the jar:

java -jar target/trig-test-1.0-SNAPSHOT-jar-with-dependencies.jar 
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.ntriples.NTriplesParserFactory
Exception in thread "main" org.eclipse.rdf4j.rio.UnsupportedRDFormatException: Did not recognise RDF format object TriG (mimeTypes=application/trig, application/x-trig; ext=trig)
        at org.eclipse.rdf4j.rio.Rio.lambda$unsupportedFormat$0(Rio.java:568)
        at java.util.Optional.orElseThrow(Optional.java:290)
        at org.eclipse.rdf4j.rio.Rio.createParser(Rio.java:100)
        at org.eclipse.rdf4j.rio.Rio.createParser(Rio.java:118)
        at org.eclipse.rdf4j.rio.Rio.parse(Rio.java:295)
        at org.eclipse.rdf4j.rio.Rio.parse(Rio.java:212)
        at cn.yyz.rdf4j.TrigTest.main(TrigTest.java:15)


If I use N-Triples, it works. Maybe that is because NTriplesParserFactory is loaded but TriGParserFactory is not?

Interestingly, if I remove the rdf4j-repository-sparql dependency from the pom and rebuild the jar, it works properly again:

java -jar target/trig-test-1.0-SNAPSHOT-jar-with-dependencies.jar 
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.trig.TriGParserFactory
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.datatypes.XMLSchemaDatatypeHandler
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.datatypes.RDFDatatypeHandler
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.datatypes.DBPediaDatatypeHandler
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.datatypes.VirtuosoGeometryDatatypeHandler
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.datatypes.GeoSPARQLDatatypeHandler
DEBUG [main] (ServiceRegistry.java:52) - Registered service class org.eclipse.rdf4j.rio.languages.RFC3066LanguageHandler

My development environment is Mac OS X and Oracle JDK:

java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)


Now I am really confused. Is there anything wrong with my configuration? Any suggestion is welcome and highly appreciated. Thank you very much and have a nice day!

Regards,
Yang


Jeen Broekstra

Sep 26, 2016, 7:30:21 PM
to rdf4j...@googlegroups.com

The problem is caused by your use of the maven assembly plugin to produce a fat jar (or “onejar”). RDF4J’s parsers use the Java Service Provider Interface (SPI), which includes the use of an interface registration file in META-INF/services in the jar. 

RDF4J ships several implementations of the same interface, one per parser. SPI, unfortunately, uses the convention that the registration file is named after the fully-qualified name of the interface. Normally this is not a problem, as each jar file has its own META-INF/services directory. However, because you combine everything in a single jar, you effectively have two or more files with the exact same name. The assembly plugin is not smart enough to merge these files, so only one of them ends up in the fat jar. The end result is that one or more of your parser implementations can no longer be detected by RDF4J.
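
To check which registration files a given classpath actually exposes, a small diagnostic along the following lines may help. It is only a sketch: the class name ServiceFileInspector is made up for illustration, and the default service name org.eclipse.rdf4j.rio.RDFParserFactory is an assumption about which interface the parser factories are registered under (pass a different name as the first argument if needed).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of RDF4J: lists SPI registration files on the classpath.
public class ServiceFileInspector {

    /** Extracts provider class names from the lines of one registration file ('#' starts a comment). */
    public static List<String> parseProviders(List<String> lines) {
        List<String> providers = new ArrayList<>();
        for (String raw : lines) {
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            if (!line.isEmpty()) {
                providers.add(line);
            }
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        // Assumed service interface name; override it with the first command-line argument.
        String service = args.length > 0 ? args[0] : "org.eclipse.rdf4j.rio.RDFParserFactory";
        ClassLoader loader = ServiceFileInspector.class.getClassLoader();

        // Every classpath entry that contains META-INF/services/<service> contributes one URL.
        List<URL> files = Collections.list(loader.getResources("META-INF/services/" + service));
        System.out.println(files.size() + " registration file(s) for " + service);

        for (URL file : files) {
            System.out.println(file);
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(file.openStream(), StandardCharsets.UTF_8))) {
                List<String> lines = reader.lines().collect(Collectors.toList());
                for (String provider : parseProviders(lines)) {
                    System.out.println("  " + provider);
                }
            }
        }
    }
}
```

Run against the separate dependency jars, this should print one registration file per parser jar. Run with the assembled fat jar on the classpath, it should print a single file, and any factory missing from that file is one RDF4J can no longer discover.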

TL;DR: don’t create a fat jar using the assembly plugin. Either use multiple jars to avoid the issue, or else use the Maven Shade Plugin to create your fat jar. The shade plugin has a configuration option to automatically merge META-INF/services files.
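
As a rough sketch of what that could look like in the pom from the original post, the maven-assembly-plugin entry might be swapped for something like the fragment below. The plugin version is an assumption (any reasonably recent release should do); the main class is the one from the original post. ServicesResourceTransformer is the transformer that merges META-INF/services files, and ManifestResourceTransformer sets the Main-Class entry so the jar stays executable.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <!-- Version is an assumption; pick whichever release fits your build. -->
    <version>2.4.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- Concatenates same-named META-INF/services files instead of keeping only one. -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    <!-- Writes Main-Class into the manifest of the shaded jar. -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>cn.yyz.rdf4j.TrigTest</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
```

With the plugin's default settings the shaded jar replaces the project's main artifact, so it would be started with java -jar target/trig-test-1.0-SNAPSHOT.jar rather than the jar-with-dependencies file.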

HTH,

Jeen

--
You received this message because you are subscribed to the Google Groups "RDF4J Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rdf4j-users...@googlegroups.com.
To post to this group, send email to rdf4j...@googlegroups.com.
Visit this group at https://groups.google.com/group/rdf4j-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/rdf4j-users/348ce9d3-fe61-4c25-a21e-ceb577bc039f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

"Yuanzhe Yang (杨远哲)"

Sep 27, 2016, 6:20:21 AM
to rdf4j...@googlegroups.com
Hi Jeen,

Thanks a lot for your prompt reply! Your explanation cleared things up immediately. I had actually tried the Maven Shade Plugin as well, but since I was not aware of the SPI mechanism, I did not use the ServicesResourceTransformer to merge the service entries, which led to the same result as with the Maven Assembly Plugin. Thank you very much for your suggestions!

PS: I switched the order of the dependencies in the pom and it works! Probably because the service entries for TriG are no longer overwritten. But I will follow your advice to avoid other potential problems.

Regards,
Yang
