working through problems with ogpIngest


Garey Mills

Oct 1, 2014, 2:51:34 PM
to opengeop...@googlegroups.com
Hi -

     We are trying to get ogpIngest installed and working here. For a sample FGDC record from Esri, when we tried to upload it to Solr only, we got

11:28:10.969 [http-bio-8080-exec-27] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: id
        at org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:93)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:880)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:690)
        at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
        at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)


The log file then goes on to say that the record was ingested:

2014-10-01 11:28:11 SolrJClient [ERROR] SolrException: SolrRecord values =Publisher: Esri,DataType: Undefined,Originator: Esri,Bounds: -96.0,20.0,-83.0,34.0,LayerName: Downloadable Data,Access: Public,LayerId: Berkeley.Downloadable Data,Title: Custom Link Sample,ContentDate: 2011-01-01T01:01:01Z,
2014-10-01 11:28:11 SolrJClient [INFO] committing add to Solr
2014-10-01 11:28:11 OwsSolrIngest [INFO] Successfully added layer to solr.


But when we tried to search for the record through the Solr admin interface, using the Publisher field, we got

11:32:33.483 [http-bio-8080-exec-33] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: undefined field Publisher
        at org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1267)
        at org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:433)
        at org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:74)
        at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:175)


Any suggestions? Is there a configuration we are missing to put LayerIds in, or is there an FGDC field we should use?

Does the failure to find Publisher (we used publisher as well, with the same results) suggest that our schema.xml is improperly placed in the Solr directories?

Garey Mills
Library Systems Office
UC Berkeley

--
Generate messages about directories that cannot  be
read,  files  that  cannot be opened ... rather than being silent ... 
(from `man du`)

Chris Barnett

Oct 1, 2014, 3:30:19 PM
to opengeop...@googlegroups.com
Hi Garey,

> Does the failure to find Publisher (we used publisher as well, with the same results) suggest that our schema.xml is improperly placed in the Solr directories?

I suspect that this is true. 

> 11:32:33.483 [http-bio-8080-exec-33] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: undefined field Publisher
>         at org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1267)

It looks here as if solr is trying to get “Publisher” as a dynamic field type, presumably since it’s not found in the schema.

> 11:28:10.969 [http-bio-8080-exec-27] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: id
>         at org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:93)

Also, Solr is complaining that the document has no id field. The uniqueKey field is defined in the schema, so this too suggests that the loaded schema is not the one you expect.

In the Solr admin interface, there should be a schema browser for each core; looking at it will tell you for sure which schema is loaded. Another possibility is that you are pointing to another Solr core with a different schema.
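
If the admin page is awkward to get at, a rough alternative is to ask the core for its field list and compare it with the fields ogpIngest sends. This is only a sketch: it assumes Solr's read-only Schema API is reachable at a URL like the one below (host, port, and core name are guesses for your setup), and the helper names are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumed location of the core's Schema API; adjust host, port, and core name.
FIELDS_URL = "http://localhost:8080/solr/collection1/schema/fields?wt=json"


def fetch_json(url):
    """Fetch a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def missing_fields(fields_response, expected):
    """Return the expected field names that the schema does not define."""
    defined = {f["name"] for f in fields_response.get("fields", [])}
    return sorted(set(expected) - defined)


# Usage against a live server (not run here):
#   missing_fields(fetch_json(FIELDS_URL), ["LayerId", "Publisher", "Originator"])
```

An empty list would mean every expected field is defined; a list containing Publisher and LayerId would point to the wrong schema being loaded.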

You will also run into problems with the LayerId, I think. ogpIngest just concatenates the Institution name and LayerName (from the ftname tag in FGDC) to form the LayerId. In this specific case, the LayerName is likely problematic, since it has a space.
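
To make the concatenation concrete, here is a small Python sketch of the idea. It is not ogpIngest's actual Java code; the function names are hypothetical, and the "." separator is inferred from the "LayerId: Berkeley.Downloadable Data" value in the log above.

```python
import re


def build_layer_id(institution, layer_name):
    """Join institution and layer name with a dot, as the logged LayerId suggests."""
    return f"{institution}.{layer_name}"


def has_whitespace(layer_id):
    """Return True if the id contains any whitespace character."""
    return re.search(r"\s", layer_id) is not None


layer_id = build_layer_id("Berkeley", "Downloadable Data")
print(layer_id, has_whitespace(layer_id))
```

A check like has_whitespace could be run over ftname values before ingest to flag records whose LayerId would contain a space.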

Also, are you guys still using EZIDs for your identifiers? If so, I think you will need to add code to look for LayerId in another field. It shouldn’t be more than a few lines in 2 or 3 classes, if you can identify an acceptable location in the metadata to put the LayerId.

thanks,
Chris


Garey Mills

Oct 1, 2014, 7:12:34 PM
to opengeop...@googlegroups.com
With 

    <env-entry>
       <env-entry-name>solr/home</env-entry-name>
       <env-entry-value>/opt/tomcat/webapps/solr-4.10.0/solr</env-entry-value>
       <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

in the web.xml of our Solr instance,

and /opt/tomcat/webapps/solr-4.10.0/solr containing

drwxrwxr-x 4 tcatmgr tcatmgr 4096 Sep 24 13:44 collection1

./collection1:
total 32
drwxr-xr-x 6 tcatmgr tcatmgr 4096 Oct  1 12:33 conf
-rw-r--r-- 1 tcatmgr tcatmgr   16 Sep 24 13:44 core.properties
drwxrwxr-x 4 tcatmgr tcatmgr 4096 Sep 24 13:44 data
-rw-r--r-- 1 tcatmgr tcatmgr 2146 Sep 24 13:44 README.txt

./collection1/conf:
total 520
-rw-r--r-- 1 tcatmgr tcatmgr  1068 Sep 24 13:44 admin-extra.html
-rw-r--r-- 1 tcatmgr tcatmgr   928 Sep 24 13:44 admin-extra.menu-bottom.html
-rw-r--r-- 1 tcatmgr tcatmgr   926 Sep 24 13:44 admin-extra.menu-top.html
drwxr-xr-x 3 tcatmgr tcatmgr  4096 Sep 24 13:44 clustering
-rw-r--r-- 1 tcatmgr tcatmgr  3974 Sep 24 13:44 currency.xml
-rw-r--r-- 1 tcatmgr tcatmgr  1348 Sep 24 13:44 elevate.xml
drwxr-xr-x 2 tcatmgr tcatmgr  4096 Sep 24 13:44 lang
-rw-r--r-- 1 tcatmgr tcatmgr 78514 Sep 24 13:44 mapping-FoldToASCII.txt
-rw-r--r-- 1 tcatmgr tcatmgr  2868 Sep 24 13:44 mapping-ISOLatin1Accent.txt
-rw-r--r-- 1 tcatmgr tcatmgr   873 Sep 24 13:44 protwords.txt
-rw-r--r-- 1 tcatmgr tcatmgr    33 Sep 24 13:44 _rest_managed.json
-rw-r--r-- 1 tcatmgr tcatmgr   450 Sep 24 13:44 _schema_analysis_stopwords_english.json
-rw-r--r-- 1 tcatmgr tcatmgr   172 Sep 24 13:44 _schema_analysis_synonyms_english.json
-rw-r--r-- 1 tcatmgr tcatmgr 74910 Oct  1 12:23 schema.xml
-rw-r--r-- 1 tcatmgr tcatmgr 60689 Sep 24 13:44 schema.xml.default
-rw-r--r-- 1 tcatmgr tcatmgr   921 Sep 24 13:44 scripts.conf
-rw-r--r-- 1 tcatmgr tcatmgr 74697 Sep 24 13:44 solrconfig.xml
-rw-r--r-- 1 tcatmgr tcatmgr    13 Sep 24 13:44 spellings.txt
-rw-r--r-- 1 tcatmgr tcatmgr   781 Sep 24 13:44 stopwords.txt
-rw-r--r-- 1 tcatmgr tcatmgr 10121 Oct  1 12:27 synonymsIso.txt
-rw-r--r-- 1 tcatmgr tcatmgr  6765 Oct  1 12:27 synonymsLcsh.txt
-rw-r--r-- 1 tcatmgr tcatmgr  1470 Oct  1 12:27 synonymsState.txt
-rw-r--r-- 1 tcatmgr tcatmgr  1148 Oct  1 12:27 synonyms.txt
-rw-r--r-- 1 tcatmgr tcatmgr  1119 Sep 24 13:44 synonyms.txt.default
-rw-r--r-- 1 tcatmgr tcatmgr  1416 Sep 24 13:44 update-script.js
drwxr-xr-x 2 tcatmgr tcatmgr  4096 Sep 24 13:44 velocity
drwxr-xr-x 2 tcatmgr tcatmgr  4096 Sep 24 13:44 xslt


no cores are found. If we swap schema.xml.default for schema.xml, we have a core, but schema.xml.default is the default solr schema.
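
A quick way to confirm which directory Solr is being told to use is to read the solr/home entry straight out of web.xml. The following is only a sketch: the function name is made up, and it assumes the env-entry layout shown above (namespaces on the web-app element are ignored).

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def solr_home_from_web_xml(xml_text):
    """Return the env-entry-value for solr/home, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for entry in root.iter():
        if local_name(entry.tag) != "env-entry":
            continue
        # Collect the child elements of this env-entry by their local names.
        values = {local_name(c.tag): (c.text or "").strip() for c in entry}
        if values.get("env-entry-name") == "solr/home":
            return values.get("env-entry-value")
    return None
```

Passing it the contents of the Solr webapp's WEB-INF/web.xml should return the path that Solr will treat as its home directory.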

OGP's schema.xml validates, so I am not sure where to go from here.

Garey

Chris Barnett

Oct 2, 2014, 11:58:31 AM
to opengeop...@googlegroups.com
Hi Garey,

Are you using the schema.xml from your Solr 1.4?  If so, grab a new copy from 

One thing that I know has changed is that LayerId is for some reason defined as a multivalued field in the original, which is not allowed as of Solr 3.x.
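
If you want to check a schema.xml for this without starting Solr, something like the sketch below might help. It assumes LayerId is the field named in the uniqueKey element (an assumption about the OGP schema), the function name is invented, and it only looks at a multiValued attribute set directly on the field, not at defaults inherited from the field type.

```python
import xml.etree.ElementTree as ET


def multivalued_unique_key(schema_xml):
    """Return the uniqueKey name if that field is declared multiValued, else None."""
    root = ET.fromstring(schema_xml)
    key = (root.findtext("uniqueKey") or "").strip()
    if not key:
        return None
    for field in root.iter("field"):
        # Only the attribute on the <field> itself is checked here.
        if field.get("name") == key and field.get("multiValued", "").lower() == "true":
            return key
    return None
```

A non-None result names the field that would need multiValued="false" (or the attribute removed) before Solr will accept it as the unique key.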

If that’s not the case, do you see anything amiss in the tomcat logs when you start up solr?

In the near future, we should, as a group, start talking about changes we can make to the solr schema. Just in terms of leveraging new Solr features and better experience/knowledge of how Solr works, there are a lot of things that we can do better without even changing what fields we are capturing. Changing the schema as far as information captured is another issue that should also be explored, likely in concert with the Metadata Working Group.

thanks,
Chris

Garey Mills

Oct 2, 2014, 6:52:41 PM
to opengeop...@googlegroups.com
Chris - I had the new schema.xml. Here is what I see in the logs:

ERROR org.apache.solr.servlet.SolrDispatchFilter - null:java.lang.LinkageError: loader constraint violation: when resolving method "java.lang.invoke.MethodHandle.invokeExact()Lorg/apache/lucene/util/AttributeImpl;" the class loader (instance of org/apache/catalina/loader/WebappClassLoader) of the current class, org/apache/lucene/util/AttributeFactory$1, and the class loader (instance of <bootloader>) for resolved class, java/lang/invoke/MethodHandle, have different Class objects for the type andle.invokeExact()Lorg/apache/lucene/util/AttributeImpl; used in the signature
        at org.apache.lucene.util.AttributeFactory$1.createInstance(AttributeFactory.java:140)
        at org.apache.lucene.util.AttributeFactory$StaticImplementationAttributeFactory.createAttributeInstance(AttributeFactory.java:103)
        at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:222)
        at org.apache.lucene.analysis.util.CharTokenizer.<init>(CharTokenizer.java:115)
        at org.apache.lucene.analysis.core.WhitespaceTokenizer.<init>(WhitespaceTokenizer.java:58)
        at org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory$1.createComponents(FSTSynonymFilterFactory.java:98)
        at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:182)
        at org.apache.lucene.analysis.synonym.SynonymMap$Parser.analyze(SynonymMap.java:313)
        at org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:99)
        at org.apache.lucene.analysis.synonym.SolrSynonymParser.parse(SolrSynonymParser.java:70)
        at org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.loadSynonyms(FSTSynonymFilterFactory.java:142)
        at org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.inform(FSTSynonymFilterFactory.java:112)
        at org.apache.lucene.analysis.synonym.SynonymFilterFactory.inform(SynonymFilterFactory.java:89)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:675)
        at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:166)
        at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
        at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
        at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:90)
        at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:489)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:466)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:575)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:199)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:188)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:258)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:313)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)

Does this suggest anything to you? Are you configuring the SolrDispatchFilter somewhere?


I agree about having to review the schema; but until I get it working that's all I can think about.

Garey

Chris Barnett

Oct 3, 2014, 10:27:18 AM
to opengeop...@googlegroups.com
Hi Garey,

I’m not doing any configuration outside of the files in the repository and setting solr home. I have solr 4.10 on localhost…let me try running it with a fresh copy of the config docs from the repository, and I’ll let you know how it goes.

-Chris

Chris Barnett

Oct 3, 2014, 10:56:00 AM
to opengeop...@googlegroups.com
Hi Garey,

The schema works ok for me on 4.10. Googling the error it seems related to loading cores, but you already knew that! 

The only thing that I can see here that I wonder about is your directory structure.  For me, I have /usr/local/tomcat7/webapps/solr4_10, which contains:

drwxr-xr-x   5 cbarne02  staff        170 Sep 24 06:07 META-INF
-rw-r--r--   1 cbarne02  staff       2536 Oct  1 21:00 README.txt
drwxr-xr-x   7 cbarne02  staff        238 Oct  3 10:42 WEB-INF
-rw-r--r--   1 cbarne02  staff       5990 Sep  8 03:56 admin.html
drwxr-xr-x   2 cbarne02  staff         68 Oct  1 21:00 bin
drwxr-xr-x   6 cbarne02  staff        204 Oct  2 11:37 collection1
drwxr-xr-x   4 cbarne02  staff        136 Sep  8 03:56 css
-rw-r--r--   1 cbarne02  staff       1146 Sep  8 03:56 favicon.ico
drwxr-xr-x  14 cbarne02  staff        476 Sep  8 03:56 img
drwxr-xr-x   6 cbarne02  staff        204 Sep  8 03:56 js
drwxr-xr-x   3 cbarne02  staff        102 Oct  1 20:46 logs
-rw-r--r--   1 cbarne02  staff   29730295 Oct  1 20:26 solr-4.10.1.war
-rw-r--r--   1 root      staff  155952894 Oct  1 20:17 solr-4.10.1.zip
-rw-r--r--   1 cbarne02  staff       1760 Oct  1 21:00 solr.xml
drwxr-xr-x  16 cbarne02  staff        544 Sep  8 03:56 tpl
-rw-r--r--   1 cbarne02  staff        518 Oct  1 21:00 zoo.cfg


My collection1 and conf are similar to yours. What I don’t see in your solr home directory are WEB-INF, etc., which may explain why there are problems with class loading.

I find the download package from the solr site to be strange and hard to work with.  I take the WAR file from the dist directory, then copy in “collection1” from example/solr, then copy in the logging jars from example/lib/ext and the log4j properties from example/resources.

thanks,
-Chris

Garey Mills

Oct 3, 2014, 11:21:03 AM
to opengeop...@googlegroups.com
Chris -

    I can't apologize enough. In the heat of the moment and the torpor of the afternoon I mistakenly thought that I was dealing with ogpIngest instead of Solr. Of course it's not your problem, and thank you so much for continuing and looking at it anyway. I'll let you know how it goes.

Garey

Chris Barnett

Oct 3, 2014, 12:18:21 PM
to opengeop...@googlegroups.com
Hi Garey,

Not a problem! I wanted a 4.10 install to play with anyway, and now I know that our schema works with it! Plus, issues with solr are certainly relevant.

+1 for the use of the word “torpor”

thanks,
Chris

Garey Mills

Oct 3, 2014, 12:39:24 PM
to opengeop...@googlegroups.com
Chris - moving solr/home up to solr-4.10.0 (instead of solr-4.10.0/solr), moving collection1 up with it, and updating solr/home in the web.xml to match worked. (Here solr-4.10.0 is the directory found in <tomcat>/webapps.) I can see collection1 as a core and I can browse the OGP Solr schema from the Solr admin page. I suppose that this means that core directories have to be siblings of WEB-INF (as you suggested), which I wish the documentation had said.
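
For anyone checking a similar layout, here is a small sketch that lists the candidate core directories directly under a given solr home by looking for a core.properties file, as in the collection1 listing earlier in this thread. The function name is hypothetical, and it assumes cores are discovered through core.properties; it is a sanity check on the directory layout only and would not by itself have caught the class-loading error above.

```python
from pathlib import Path


def find_core_dirs(solr_home):
    """Return names of direct subdirectories that contain a core.properties file."""
    home = Path(solr_home)
    return sorted(
        child.name
        for child in home.iterdir()
        if child.is_dir() and (child / "core.properties").is_file()
    )
```

Called with the directory that solr/home points to, it should return something like a list containing collection1 if the core directory is where Solr expects it.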

I can attest that the documentation on setting up the solr/home directory is quite poor.

Thanks for your help;

Garey