Adding lucene jars to build.gradle


natalia.v...@gmail.com

Apr 1, 2014, 3:40:12 PM
to mongod...@googlegroups.com
Hello,

   I would like to add some Lucene parsing functionality to my application. I started with the historicalYield example in mongo-hadoop and rewrote the mapper and reducer files. Now I would like to tokenize certain inputs in the mapper using Lucene, so I added the relevant Lucene imports (such as org.apache.lucene.analysis.ngram.NGramTokenizer) to the code. I also copied lucene-core-4.6.1.jar and lucene-analyzers-common-4.6.0.jar into hadoop-binaries, and did the following in build.gradle:

    dependencies {
        compile "org.mongodb:mongo-java-driver:2.11.4"
        compile "org.apache.lucene:lucene-core:4.6.1"
        compile "org.apache.lucene:lucene-analyzers-common:4.6.0"

        testCompile 'junit:junit:4.11'
        testCompile 'org.hamcrest:hamcrest-all:1.3'
        testCompile 'org.apache.lucene:lucene-analyzers-common:4.6.0'
    }

 and 

    copy {
        from "core/build/libs/mongo-hadoop-core-${project(':core').version}-hadoop_${hadoop_version}.jar"
        into hadoopLib
        rename { "mongo-hadoop-core.jar" }

        from "/Users/hadoop/hadoop-binaries/lucene-core-4.6.1.jar"
        into hadoopLib
        rename { "lucene-core.jar" }

        from "/Users/hadoop/hadoop-binaries/lucene-analyzers-common-4.6.0.jar"
        into hadoopLib
        rename { "lucene-analyzers-common.jar" }
    }

 When I run "./gradlew jar", everything compiles without any errors. When I run "./gradlew historicalYield", however, I get:

14/04/01 15:29:04 WARN mapred.LocalJobRunner: job_local1807224390_0001
java.lang.Exception: java.lang.NoClassDefFoundError: org/apache/lucene/analysis/ngram/NGramTokenizer
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.lang.NoClassDefFoundError: org/apache/lucene/analysis/ngram/NGramTokenizer
    at com.mongodb.hadoop.examples.treasury.TreasuryYieldMapper.map(TreasuryYieldMapper.java:76)

    I admit that I am completely new to Gradle builds, so I don't really know what I am doing. Suggestions would be much appreciated.

    Thank you!

    Natalia Connolly


Justin Lee

Apr 1, 2014, 3:48:45 PM
to mongod...@googlegroups.com
First things first, make sure your lucene jars are actually in the hadoop lib directory.  Second, try this:
    copy {
        from ("core/build/libs/mongo-hadoop-core-${project(':core').version}-hadoop_${hadoop_version}.jar") {
            rename "mongo-hadoop-core.jar"
        }

        from ("/Users/hadoop/hadoop-binaries/lucene-core-4.6.1.jar") {
            rename "lucene-core.jar"
        }

        from ("/Users/hadoop/hadoop-binaries/lucene-analyzers-common-4.6.0.jar") {
            rename "lucene-analyzers-common.jar"
        }

        into hadoopLib
    }
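
For the first check (confirming that a jar in the Hadoop lib directory really contains the missing class), a small standalone Java sketch along these lines may help. The class name JarClassFinder and the method findJarsContaining are hypothetical names made up for illustration, and the directory to scan is assumed to be passed on the command line; only JDK classes are used.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper, not part of mongo-hadoop or Gradle.
class JarClassFinder {

    // Returns the jars directly under `dir` that contain the given class.
    public static List<Path> findJarsContaining(Path dir, String className) throws IOException {
        // A class a.b.C is stored inside a jar as the entry a/b/C.class
        String entryName = className.replace('.', '/') + ".class";
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    if (jarFile.getJarEntry(entryName) != null) {
                        hits.add(jar);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumed usage: java JarClassFinder <libDir> [fully.qualified.ClassName]
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        String cls = args.length > 1 ? args[1]
                : "org.apache.lucene.analysis.ngram.NGramTokenizer";
        List<Path> hits = findJarsContaining(dir, cls);
        if (hits.isEmpty()) {
            System.out.println("No jar in " + dir + " contains " + cls);
        } else {
            for (Path hit : hits) {
                System.out.println(cls + " found in " + hit);
            }
        }
    }
}
```

Pointing it at the directory that hadoopLib resolves to shows whether the copy step actually delivered a jar containing NGramTokenizer. It does not prove that the running job puts that directory on its classpath.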





natalia.v...@gmail.com

Apr 1, 2014, 4:13:27 PM
to mongod...@googlegroups.com
Hi Justin,

   The jars are definitely there; I copied them just about everywhere I could think of:

> ls -l /Users/hadoop/hadoop-binaries/
total 226336
drwxr-xr-x  12 hadoop  staff        408 Mar 25 13:39 hadoop-2.2.0
-rw-r--r--   1 hadoop  staff  109229073 Oct  7 02:46 hadoop-2.2.0.tar.gz
drwxr-xr-x  12 hadoop  staff        408 Mar 25 09:05 hadoop-2.3.0
-rw-r--r--   1 hadoop  staff    2707127 Mar 24 15:35 hadoop-2.3.0.tar.gz
-rw-r--r--@  1 hadoop  staff    1589250 Apr  1 15:06 lucene-analyzers-common-4.6.0.jar
-rw-r--r--@  1 hadoop  staff    2348962 Apr  1 14:31 lucene-core-4.6.1.jar

> ls -l ../hadoop-binaries/hadoop-2.3.0/lib
total 8440
-rw-r--r--@  1 hadoop  staff  1589250 Apr  1 16:05 lucene-analyzers-common-4.6.0.jar
-rw-r--r--@  1 hadoop  staff  2348962 Apr  1 16:05 lucene-core-4.6.1.jar
-rw-r--r--   1 hadoop  staff   291946 Mar 25 09:08 mongo-2.7.3.jar
-rw-r--r--   1 hadoop  staff    79740 Mar 25 09:08 mongo-hadoop-core_2.2.0-1.2.0.jar
drwxr-xr-x  10 hadoop  staff      340 Mar 25 09:05 native

> ls -l ../hadoop-binaries/hadoop-2.3.0/share/hadoop/common/
total 11824
-rw-r--r--@ 1 hadoop  staff  1589250 Apr  1 16:06 lucene-analyzers-common-4.6.0.jar
-rw-r--r--  1 hadoop  staff  1589250 Apr  1 16:08 lucene-analyzers-common.jar
-rw-r--r--@ 1 hadoop  staff  2348962 Apr  1 16:06 lucene-core-4.6.1.jar
-rw-r--r--  1 hadoop  staff    93997 Apr  1 13:10 mongo-hadoop-core.jar
-rw-r--r--  1 hadoop  staff   419108 Jan 27 12:17 mongo-java-driver.jar


When I rewrote the copy … into clause the way you suggested, I got the following error:


 > ./gradlew historicalYield
:installHadoop
:historicalYield
connected to: 127.0.0.1
Tue Apr  1 16:01:49.247 dropping: mongo_hadoop.yield_historical.in
Tue Apr  1 16:01:49.257 imported 2 objects
:historicalYield FAILED

FAILURE: Build failed with an exception.

* Where:
Build file '/Users/hadoop/mongo-hadoop/build.gradle' line: 342

* What went wrong:
Execution failed for task ':historicalYield'.
> No signature of method: org.gradle.api.internal.file.copy.CopySpecWrapper_Decorated.rename() is applicable for argument types: (java.lang.String) values: [mongo-hadoop-core.jar]
  Possible solutions: rename(java.util.regex.Pattern, java.lang.String), rename(groovy.lang.Closure), rename(java.lang.String, java.lang.String), getAt(java.lang.String), any(), each(groovy.lang.Closure)

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

Justin Lee

Apr 1, 2014, 4:37:42 PM
to mongod...@googlegroups.com
Sorry. I misread the Gradle pages slightly. This works:
    copy {
        from("core/build/libs/mongo-hadoop-core-${project(':core').version}-hadoop_${hadoop_version}.jar") {
            rename { "mongo-hadoop-core.jar" }
        }
        from("${hadoopBinaries}/lucene-core-4.6.1.jar") {
            rename { "lucene-core.jar" }
        }

        from("${hadoopBinaries}/lucene-analyzers-common-4.6.0.jar") {
            rename { "lucene-analyzers-common.jar" }
        }
        into hadoopLib
    }


That should be sufficient for those classes. It's possible that you're missing a dependent class further down the line. Double-check for any other "Caused by" sections or other ClassNotFoundException (CNFE) entries.
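
To chase a missing dependent class, it can help to ask the JVM directly where a class is (or is not) being loaded from. The following is a minimal sketch using only JDK calls; ClassOriginCheck and describe are hypothetical names, and it assumes the check is run with the same classpath as the job (for example, by logging the result from inside the mapper).

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of mongo-hadoop or Hadoop.
class ClassOriginCheck {

    // Describes where a class is loaded from, or reports that it cannot be loaded.
    public static String describe(String className) {
        try {
            // initialize=false: locate the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClassOriginCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself typically have no code source
            return className + " <- " + (src == null ? "JDK/bootstrap" : src.getLocation().toString());
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so a broken dependency chain lands here too
            return className + " NOT FOUND (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names to check are passed as arguments; the default is the class from the stack trace
        String[] names = args.length > 0 ? args
                : new String[] {"org.apache.lucene.analysis.ngram.NGramTokenizer"};
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```

If a class is reported as found but the job still fails, the jar location printed here can be compared with the jars the copy block places in hadoopLib.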

