How to set up offline data push

OfflinedatPush

Jun 27, 2018, 8:58:51 PM
to Pinot Users
Hello Team,

I'm trying to set up the offline data push, and I get the error below:

 hadoop jar pinot-hadoop-0.016.jar SegmentCreation job.properties
Error: Could not find or load main class org.apache.hadoop.util.RunJar


I'm in the pinot folder and built the project using

mvn clean install -DskipTests

When I try to run this: hadoop jar pinot-hadoop-1.0-SNAPSHOT.jar SegmentCreation job.properties
I get the error.

Can you please guide me?

Thanks

kishore g

Jun 27, 2018, 10:46:22 PM
to OfflinedatPush, Pinot Users
Are you running this on a Hadoop node? Can you provide the following info:
- Hadoop Version
- yarn classpath
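
Both items can be collected from a shell on the machine where the job is launched. The following is a minimal sketch, assuming the `hadoop` and `yarn` launchers are on the PATH; `report_hadoop_env` is a made-up helper name, not part of Hadoop or Pinot.

```shell
#!/usr/bin/env bash
# report_hadoop_env is a hypothetical helper: it prints the Hadoop version and
# the YARN classpath, or reports which launcher is missing from the PATH.
report_hadoop_env() {
  local tool
  for tool in hadoop yarn; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: not found on PATH"
      continue
    fi
    case "$tool" in
      hadoop) hadoop version ;;   # prints the Hadoop release and build info
      yarn)   yarn classpath ;;   # prints the classpath used by YARN
    esac
  done
}

report_hadoop_env
```

If either launcher is reported as missing, the machine is probably not set up as a Hadoop client, which would also be consistent with the `RunJar` class not being found.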

--
You received this message because you are subscribed to the Google Groups "Pinot Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pinot_users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pinot_users/e36bb1b2-7d1b-44c0-9219-6d89442d7e55%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nandini Choudary

Jun 28, 2018, 12:58:15 AM
to kishore g, Pinot Users
No. I'm not sure how to get this up and running. I built the entire project, and everything builds fine in the pinot folder. Can you please help me with the next steps?


OfflineDatPush

Jun 28, 2018, 1:33:49 PM
to Pinot Users
Now I'm running into this error:

Exception in thread "main" java.lang.ClassNotFoundException: SegmentCreation
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:237)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:158)


Hadoop version installed is 3.0.3. 

I downloaded the flights data set, removed all the files except the .avro files, and am using them as input.

Any help would be much appreciated.

Thanks

jennife...@gmail.com

Jun 28, 2018, 5:46:12 PM
to Pinot Users
Thanks for reporting this! We've added an option to create a fat jar so that you shouldn't run into dependency problems. Please pull the latest version of Pinot.

Could you try the following command? 
hadoop jar pinot-hadoop-0.016.jar com.linkedin.pinot.hadoop.PinotHadoopJobLauncher SegmentCreation job.properties
 
Make sure you also have a job.properties file, schema file, and input files. Please feel free to reach out if you run into any more problems. Thank you so much!
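
Before launching, it can help to confirm that the files the command refers to actually exist in the current directory. The sketch below is only an illustration: `check_job_inputs` is a hypothetical helper, and the file names are the ones used in this thread. The schema file and input files are configured in your own job.properties, so add their paths to the check as needed.

```shell
#!/usr/bin/env bash
# check_job_inputs is a hypothetical helper: it reports every path that does
# not exist and returns non-zero if any of them is missing.
check_job_inputs() {
  local missing=0 path
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}

# File names as used in this thread; append your schema file and input files.
if check_job_inputs pinot-hadoop-0.016.jar job.properties; then
  hadoop jar pinot-hadoop-0.016.jar \
    com.linkedin.pinot.hadoop.PinotHadoopJobLauncher SegmentCreation job.properties
else
  echo "fix the missing files before launching the job" >&2
fi
```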

jennife...@gmail.com

Jun 28, 2018, 6:59:31 PM
to Pinot Users
The fat jar should be created under pinot-hadoop/target/pinot-hadoop-0.016-jar-with-dependencies.jar

nandumal...@gmail.com

Jun 28, 2018, 7:03:54 PM
to Pinot Users
Thank you for the quick response, Jennifer. I now see that a fat jar has been created. I'm in the pinot-hadoop/target folder, with my job.properties file there, and I tried to run:

hadoop jar pinot-hadoop-0.016.jar com.linkedin.pinot.hadoop.PinotHadoopJobLauncher SegmentCreation job.properties

and I get this error now:

Exception in thread "main" java.lang.NoClassDefFoundError: com/linkedin/pinot/common/Utils
        at com.linkedin.pinot.hadoop.job.SegmentCreationJob.<init>(SegmentCreationJob.java:73)
        at com.linkedin.pinot.hadoop.PinotHadoopJobLauncher.kickOffPinotHadoopJob(PinotHadoopJobLauncher.java:54)
        at com.linkedin.pinot.hadoop.PinotHadoopJobLauncher.main(PinotHadoopJobLauncher.java:90)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: com.linkedin.pinot.common.Utils
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 9 more


When I try to use pinot-hadoop-0.016-jar-with-dependencies.jar instead, I get:

Exception in thread "main" java.io.IOException: Mkdirs failed to create C:\cygwin64\tmp\hadoop-unjar3221557010695647844\META-INF\license
        at org.apache.hadoop.util.RunJar.ensureDirectory(RunJar.java:140)
        at org.apache.hadoop.util.RunJar.unJar(RunJar.java:109)
        at org.apache.hadoop.util.RunJar.unJar(RunJar.java:85)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:222)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:148)


Thanks

nandumal...@gmail.com

Jun 28, 2018, 7:48:31 PM
to Pinot Users
I deleted the META-INF\LICENSE file from the jar, and now the job runs fine. But when I query the table, I still get 0 as the count. I see that an output segment file was created in the specified output location. How can I debug this?
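
For anyone hitting the same `Mkdirs failed` error, one way to delete that entry without unpacking the jar is `zip -d`, since a jar is a zip archive. This is a sketch under assumptions: the `zip` and `unzip` tools are installed, the fat jar is in the current directory, and `strip_jar_entry` is a hypothetical helper name. The likely cause is that the jar holds both a `META-INF/LICENSE` file and a `META-INF/license` directory, which cannot coexist on a case-insensitive filesystem.

```shell
#!/usr/bin/env bash
# strip_jar_entry is a hypothetical helper: it copies the jar to <jar>.bak and
# then deletes one entry from the original in place with `zip -d`.
strip_jar_entry() {
  local jar="$1" entry="$2"
  cp "$jar" "$jar.bak" || return 1   # keep the untouched jar as a backup
  zip -q -d "$jar" "$entry"          # a jar is a zip archive, so zip can edit it
}

jar=pinot-hadoop-0.016-jar-with-dependencies.jar
if [ -f "$jar" ] && command -v zip >/dev/null 2>&1; then
  strip_jar_entry "$jar" 'META-INF/LICENSE'
  # Show which license-related entries remain in the jar, if any.
  unzip -l "$jar" | grep -i 'META-INF/license' || true
fi
```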

Jennifer Dai

Jun 28, 2018, 8:07:30 PM
to nandumal...@gmail.com, pinot...@googlegroups.com
Thanks for trying that! I will edit the wiki, but essentially, you want to run the same command and replace SegmentCreation with SegmentTarPush. The SegmentCreation job will help you to create your data, but it doesn't push it to the cluster. 
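
Put together, the create-then-push sequence might look like the sketch below. It reuses the fat jar name, launcher class, and job names quoted in this thread; `pinot_job_cmd` is a hypothetical helper that only prints each command so it can be reviewed before running.

```shell
#!/usr/bin/env bash
# pinot_job_cmd is a hypothetical helper that prints the launcher command for
# one job type; the jar, class, and job names are the ones quoted in this thread.
pinot_job_cmd() {
  local job_type="$1" props="${2:-job.properties}"
  echo "hadoop jar pinot-hadoop-0.016-jar-with-dependencies.jar" \
       "com.linkedin.pinot.hadoop.PinotHadoopJobLauncher $job_type $props"
}

# Step 1 creates the segments; step 2 pushes them to the cluster.
for job_type in SegmentCreation SegmentTarPush; do
  pinot_job_cmd "$job_type"
done
```

Run the printed commands in that order; the push step only has something to upload once the creation step has finished successfully.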


9239...@qq.com

Nov 7, 2018, 6:15:30 AM
to Pinot Users
SegmentCreation generates a .txt file, so why does SegmentTarPush require a .tar.gz?

On Friday, June 29, 2018 at 8:07:30 AM UTC+8, Jennifer Dai wrote:

Jennifer Dai

Nov 7, 2018, 8:50:38 AM
to 9239...@qq.com, Pinot Users
SegmentCreation should generate a .tar.gz file since we store data in a Pinot generated format and that includes more than one file (thus the .tar.gz.) Could you send over your job.properties file, the command you ran, and the txt file you got?

Best,
Jennifer
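
To check what the job actually produced, the output file can be listed with `tar`. This is a small sketch; `list_segment_files` is a hypothetical helper, and the archive path is a placeholder for a file in your configured output directory.

```shell
#!/usr/bin/env bash
# list_segment_files is a hypothetical helper: it lists the entries of a
# gzip-compressed tar archive, or fails if the file cannot be read as one.
list_segment_files() {
  local archive="$1"
  if ! tar -tzf "$archive" 2>/dev/null; then
    echo "$archive is not a readable .tar.gz archive" >&2
    return 1
  fi
}

# Placeholder path: point this at a file in your job's output directory.
segment="${1:-/path/to/output/segment.tar.gz}"
if [ -e "$segment" ]; then
  list_segment_files "$segment"
fi
```

A listing with several entries is what a segment archive is expected to contain; a failure here suggests the file is something other than a finished segment.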



9239...@qq.com

Nov 8, 2018, 12:33:49 AM
to Pinot Users
Thanks, I understand now. I have solved the problem: the SegmentCreation job had no resources to execute on YARN, so it failed to generate the .tar.gz file.
Best,
LiYang

On Wednesday, November 7, 2018 at 9:50:38 PM UTC+8, Jennifer Dai wrote:

9239...@qq.com

Nov 8, 2018, 2:41:20 AM
to Pinot Users
Hello, do you know which version of Pinot this is (1.0 or 2.0), and how to access the multi-tenancy interface at http://lva1-pinot-controller-vip-1.corp.linkedin.com:11984/dataresources? I tried the following request:
curl -i -X POST -H 'Content-Type: application/json' -d '{"requestType":"create", "resourceName":"XLNT","tableName":"T1", "timeColumnName":"daysSinceEpoch", "timeType":"daysSinceEpoch","numberOfDataInstances":4,"numberOfCopies":2,"retentionTimeUnit":"DAYS", "retentionTimeValue":"700","pushFrequency":"daily", "brokerTagName":"XLNT", "numberOfBrokerInstances":1, "segmentAssignmentStrategy":"BalanceNumSegmentAssignmentStrategy", "resourceType":"OFFLINE", "metadata":{"d2.name":"xlntBetaPinot"}}' http://lva1-pinot-controller-vip-1.corp.linkedin.com:11984/dataresources
Response: content-type application/json, status code 404 Not Found

On Thursday, November 8, 2018 at 1:33:49 PM UTC+8, 9239...@qq.com wrote: