java.io.EOFException in kryo+blvp error in bulk loading

Elizabeth

Jun 28, 2017, 9:02:27 AM

to JanusGraph users list

Hi all,

I was using the Kryo format and BulkLoaderVertexProgram to load large files into JanusGraph and ran into the following error:

gremlin> hdfs.copyFromLocal('data/test.kryo','data/test.kryo')
==>null
gremlin> graph = GraphFactory.open('conf/hadoop-graph/hadoop-load.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> blvp = BulkLoaderVertexProgram.build().writeGraph('conf/janusgraph-hbase-es.properties').create(graph)
==>BulkLoaderVertexProgram[bulkLoader=IncrementalBulkLoader, vertexIdProperty=bulkLoader.vertex.id, userSuppliedIds=false, keepOriginalIds=true, batchSize=0]
gremlin> result = graph.compute(SparkGraphComputer).program(blvp).submit().get()
20:21:32 ERROR org.apache.spark.executor.Executor - Exception in task 0.0 in stage 0.0 (TID 0)
java.io.EOFException
    at java.io.DataInputStream.readByte(DataInputStream.java:267)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.seekToHeader(GryoRecordReader.java:93)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoRecordReader.initialize(GryoRecordReader.java:85)
    at org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat.createRecordReader(GryoInputFormat.java:38)

Has anyone seen this error before? Please help me with this last step!

Any idea is appreciated!

Best,

Meng

marc.de...@gmail.com

Jul 3, 2017, 9:17:45 AM

to JanusGraph users list

Hi Eliz and Meng,

Did the sequence of gremlin commands work for the tinkerpop-modern.kryo and grateful-dead.kryo example files?

How did you create the test.kryo file?

Marc

Angelasweet

Jul 5, 2017, 10:45:22 PM

to bi...@xs4all.nl, JanusGraph users list

Hi Biko and Marc,

Below is the method I used to create test.kryo, which was converted from a CSV file: vertices.txt

vi vertices.txt

123
11692
11814
12473
14180
14319
14667
112228

vi CsvToJavaObject.java

package mypkg;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class CsvToJavaObject {

    public static void main(String[] args) {
        String csvFile = "/home/dev/wanmeng/kryo/mypkg/vertices.txt";
        List<Long> list = new ArrayList<Long>();

        // Read one vertex ID per line into the list.
        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
            String line;
            while ((line = br.readLine()) != null) {
                // Trim surrounding whitespace before parsing the ID.
                list.add(Long.parseLong(line.trim()));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Serialize the list with plain Kryo.
        // FileOutputStream creates test.kryo if it does not exist yet.
        try (Output output = new Output(new FileOutputStream("test.kryo"))) {
            Kryo kryo = new Kryo();
            kryo.writeObject(output, list);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }

    public void deserialize() {
        // Read back the list written by main(). The type passed to readObject
        // must match what was written (an ArrayList, not a single Long).
        try (Input input = new Input(new FileInputStream("test.kryo"))) {
            Kryo kryo = new Kryo();
            @SuppressWarnings("unchecked")
            List<Long> listIn = kryo.readObject(input, ArrayList.class);
            System.out.println(listIn);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

javac -d . CsvToJavaObject.java

java mypkg.CsvToJavaObject

Thanks,

Eliz

2017-07-02 23:14 GMT+08:00 <bi...@xs4all.nl>:

Hi Elis,

Did the same sequence of gremlin statements work fine for the tinkerpop-modern.kryo or grateful-dead.kryo example files?

If so, your test.kryo file is the problem. How did you create it?

Cheers, Marc

HadoopMarc

Jul 6, 2017, 1:45:05 AM

to JanusGraph users list, bi...@xs4all.nl

Hi Eliz,

You are supposed to use TinkerPop's Gryo writer to create Kryo graph input files (I do not believe there is a formal spec of the format). There is also the BulkDumperVertexProgram, which creates a Kryo output file from a HadoopGraph. If you have a large dataset on HDFS and want a distributed load into JanusGraph, you can also use Spark's mapPartitions and have each Spark task open its own connection to the same JanusGraph.

HTH, Marc
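
Editor's note: the stack trace in the first message fits Marc's diagnosis. The frames at DataInputStream.readByte and GryoRecordReader.seekToHeader suggest that the reader scans the input for a Gryo header and runs off the end of a file that does not contain one. The sketch below illustrates that failure mode using only the JDK. It is an assumption about the mechanism, not the TinkerPop source: the class name HeaderScanSketch, the method seekToHeader as written here, and the marker DEMO_HEADER are all hypothetical, and DEMO_HEADER is not the real Gryo header.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class HeaderScanSketch {

    // Hypothetical marker for illustration only; NOT the real Gryo header bytes.
    static final byte[] DEMO_HEADER = "DEMO-HDR".getBytes(StandardCharsets.US_ASCII);

    /**
     * Scans the stream byte by byte and returns the offset at which the
     * header starts. If the stream ends first, DataInputStream.readByte()
     * throws EOFException, which propagates to the caller.
     */
    static long seekToHeader(InputStream raw, byte[] header) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] window = new byte[header.length]; // the last header.length bytes seen
        long bytesRead = 0;
        while (true) {
            byte b = in.readByte(); // throws EOFException at end of stream
            // Slide the window left by one byte and append the new byte.
            System.arraycopy(window, 1, window, 0, window.length - 1);
            window[window.length - 1] = b;
            bytesRead++;
            if (bytesRead >= header.length && Arrays.equals(window, header)) {
                return bytesRead - header.length;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A stream that contains the marker: the scan succeeds.
        byte[] withHeader = "xxDEMO-HDRpayload".getBytes(StandardCharsets.US_ASCII);
        System.out.println("header at offset "
                + seekToHeader(new ByteArrayInputStream(withHeader), DEMO_HEADER));

        // A stream without the marker (like a plain Kryo-serialized list):
        // the scan reaches the end of the data and fails.
        byte[] withoutHeader = {1, 2, 3, 4, 5};
        try {
            seekToHeader(new ByteArrayInputStream(withoutHeader), DEMO_HEADER);
        } catch (EOFException e) {
            System.out.println("EOFException: no header found");
        }
    }
}
```

If that reading is right, a test.kryo written with plain `kryo.writeObject` can never satisfy the scan, whatever its contents. That is why the file has to come from TinkerPop's own Gryo writer or from BulkDumperVertexProgram.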
