I'm using the latest version of solrmarc by building it from the
trunk. I have found out that I will need to have my own custom
indexer (to define a custom id for each record) and I'm a bit confused
on how to set this up.
I read the thread called "Making Configuration of SolrMarc easier",
where it's mentioned that it will be possible to create a separate jar
with custom-built code in it. What is the status of this, and how do I
configure it? I tried adding my compiled Indexer.class to the
Custom_SolrMarc.jar but this results in errors.
Thanks for any advice,
Willem
The European Library
Den Haag, The Netherlands.
I have been hard at work making SolrMarc 2.1 which (except for finishing
the documentation) is ready to go. It currently resides in the SolrMarc
SVN repo at branches/solrmarc-2.1 but will soon be moved to replace the
trunk. One of the main improvements of version 2.1 over version 2.0
(which is the current trunk) is that it is easier to create custom java
indexing routines, and it is much easier to use them.
The initial step of:
ant init
will create a directory named local_build. Within this created
directory there is a src directory, in which you can place the java
files for your custom indexing routines.
Subsequently if you run:
ant dist
it will compile your source into a separate jar file and modify the
config.properties file to reference this jar, and your created class,
and place the resulting files in the dist directory.
The config.properties file required for SolrMarc 2.1 is largely the same
as what had been used in SolrMarc 2.0 (and earlier), and the
index.properties file is unchanged.
If you'd like to be an early adopter, I can help you get the new branch
up and running as quickly as possible, partly because doing so will then
make it easier to make the documentation as complete as possible.
-Bob Haschart
Sounds great Bob,
I'll give it a go on Monday and let you know about the results. I'm currently in prototype phase of our project anyway.
Best,
Willem
============================================
Migration notes SOLRMARC 2.0 to 2.1
These are my notes of the steps I had to take to upgrade a working 2.0
version of solrmarc to the latest version 2.1.
1. check out the 2.1 branch from svn
>> cd <my dev dir>
>> svn co http://solrmarc.googlecode.com/svn/branches/solrmarc-2.1/ solrmarc21
2. run ant init
>> cd solrmarc21
>> ant init
This creates a subdirectory called local_build which contains all the
local configuration files as well as any custom java code. Previously
ant init would create a subdirectory using your custom prefix for its
subdirectory name. In 2.1 the subdirectory is always called
local_build and the config files in it are prefixed with the custom
prefix.
During this initialization procedure I selected [none] for the example
configuration, as we will use the base version of solrmarc only; entered
our site-specific prefix; left the heap size at its default; set the
encoding to BESTGUESS (as our source may contain either UNIMARC or
MARC8); selected a custom solr configuration; and entered the URL of our
SOLR server, the full path to the solr home directory containing the
conf subdirectory, and finally the full path to the solr war file
location.
3. add custom indexer
>> cd local_build
>> mkdir -p src/org/solrmarc/index
>> cp <dev location>/org/solrmarc/index/MyIndexer.java src/org/solrmarc/index
4. modify indexer properties
I basically reused the indexer property file from the old installation:
>> cp <solrmarc 2.0>/<my site prefix>/custom_index.properties local_build/<my site prefix>_index.properties
5. run ant dist
>> cd ..
>> ant dist
Running this target requires no manual intervention. It creates a
(top-level) dist directory containing the general SolrMarc.jar and the
site-specific MyIndexer.jar as well as the site-specific indexing
properties.
6. installation is ready
Solrmarc has detected that the local_build/src directory contained a
custom indexer and has set the properties solrmarc.custom.jar.path and
solr.indexer in <site-prefix>_config.properties accordingly.
7. start indexing
Previously the indexing scripts lived in the dist directory but now
they have been moved to local_build/script_templates. There is also an
index_scripts directory which contains a small README_SCRIPTS
suggesting to put the custom index scripts into that directory.
When trying to run the script_templates/indexfile script I first
needed to add execution permission to the script file:
>> chmod +x script_templates/indexfile
Then I could run indexfile from the local_build directory:
>> cd <solrmarc21 home>/local_build
>> script_templates/indexfile my_marc_file.mrc
except that it immediately crashes with the error message:
Exception in thread "main" java.lang.NoClassDefFoundError: @MEM_ARGS@
So I first copied the template to the index_scripts directory:
>> cp script_templates/indexfile index_scripts
and then replaced the unresolved @MEM_ARGS@ placeholder with a hardcoded
value of -Xmx256m, just to get things going. After that, however, it
complains that it cannot find SolrMarc.jar. So I tried to run indexfile
from the dist directory:
>> cd <solrmarc21>/dist
>> cp <solrmarc21>/local_build/index_scripts/indexfile index_scripts
>> index_scripts/indexfile my_marc_file.mrc
The SolrMarc indexer can now be found but my custom indexer can't.
Relevant section of my arrow_config.properties:
solrmarc.solr.war.path=/Users/willem/dev/apache-tomcat-6.0.18/webapps/apache-solr-1.4-dev.war
# solrmarc.custom.jar.path - Jar containing custom java code to use in indexing.
# If solr.indexer below is defined (other than the default of org.solrmarc.index.SolrIndexer)
# you MUST define this value to be the Jar containing the class listed there.
solrmarc.custom.jar.path=ArrowIndexer
# Path to your solr instance
solr.path = /Users/willem/kb/solrarrow
# - solr.indexer - full name of java class with custom indexing functions. This
# class must extend SolrIndexer; Defaults to SolrIndexer.
solr.indexer = org.solrmarc.index.ArrowIndexer
# - solr.indexer.properties -indicates how to populate Solr index fields from
# marc data. This is the core configuration file for solrmarc.
solr.indexer.properties = arrow_index.properties
In /dist I have the file ArrowIndexer.jar with the following contents:
<[Mon Jan 11 20:50:52][willem@/Users/willem/dev/solrmarc21/dist]>jar tf ArrowIndexer.jar
META-INF/
META-INF/MANIFEST.MF
org/
org/solrmarc/
org/solrmarc/index/
org/solrmarc/index/ArrowIndexer.class
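The same check as the jar tf listing above (is the class named in solr.indexer actually inside the jar?) can also be done from Java. What follows is only a sketch built on the standard java.util.jar API; the class JarClassCheck and its method names are made up for illustration and are not part of SolrMarc.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of SolrMarc.
class JarClassCheck {

    // Maps a fully qualified class name to its entry name inside a jar,
    // e.g. org.solrmarc.index.ArrowIndexer -> org/solrmarc/index/ArrowIndexer.class
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True if the jar at jarPath contains the compiled class.
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarClassCheck ArrowIndexer.jar org.solrmarc.index.ArrowIndexer
        System.out.println(containsClass(args[0], args[1]));
    }
}
```

This only confirms that the class file is present under the expected package path. It says nothing about whether the indexing script puts the jar on the classpath.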
My indexer class:
public class ArrowIndexer extends SolrIndexer {
    public ArrowIndexer(String indexingPropsFile, String propertyDirs[]) {
        super(indexingPropsFile, propertyDirs);
        System.out.println("[ARROW]ArrowIndexer instantiated");
    }
    public String getTelid() {
        System.out.println("[ARROW]getTelid invoked");
        return "id" + System.currentTimeMillis();
    }
}
I don't see any reference to the custom jar in the indexfile script. I
guess I'm still missing something. If you want I can try to run the
whole again from scratch.
Thanks again,
Willem
In the line:
solrmarc.custom.jar.path=ArrowIndexer
it needs the .jar at the end, thusly:
solrmarc.custom.jar.path=ArrowIndexer.jar
This is likely a problem in the new build script where it fills the
necessary value in the config file as the config file is being created.
But simply changing that value should get things working for you.
-Bob Haschart
Now moving right along to the next problem :-)
In my arrow_index.properties I have:
id = custom, getTelid
and my custom indexer looks like:
public class ArrowIndexer extends SolrIndexer {
    public ArrowIndexer(String indexingPropsFile, String propertyDirs[]) {
        super(indexingPropsFile, propertyDirs);
        System.out.println("[ARROW]ArrowIndexer instantiated");
    }
    public String getTelid(Record record) {
        System.out.println("[ARROW]getTelid invoked");
        return "id" + System.currentTimeMillis();
    }
}
but when running it I get:
<[Mon Jan 11 21:16:47][willem@/Users/willem/dev/solrmarc21/dist/bin]>./indexfile ~/kb/arrow/testdata/tel117_utf8.mrc
INFO [main] (MarcImporter.java:637) - Starting SolrMarc indexing.
INFO [main] (Utils.java:188) - Opening file: /Users/willem/dev/solrmarc21/dist/arrow_config.properties
INFO [main] (MarcHandler.java:286) - Attempting to open data file: /Users/willem/kb/arrow/testdata/tel117_utf8.mrc
ERROR [main] (SolrIndexer.java:347) - Unable to find custom indexing function getTelid
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
at org.solrmarc.marc.MarcHandler.loadIndexer(MarcHandler.java:449)
at org.solrmarc.marc.MarcHandler.init(MarcHandler.java:103)
at org.solrmarc.marc.MarcImporter.main(MarcImporter.java:643)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at com.simontuffs.onejar.Boot.run(Boot.java:334)
at com.simontuffs.onejar.Boot.main(Boot.java:170)
Caused by: java.lang.IllegalArgumentException: Unable to find custom indexing function getTelid
at org.solrmarc.index.SolrIndexer.verifyCustomMethodExists(SolrIndexer.java:349)
at org.solrmarc.index.SolrIndexer.verifyCustomMethodsAndTransMaps(SolrIndexer.java:282)
at org.solrmarc.index.SolrIndexer.fillMapFromProperties(SolrIndexer.java:265)
at org.solrmarc.index.SolrIndexer.<init>(SolrIndexer.java:101)
at org.solrmarc.index.ArrowIndexer.<init>(ArrowIndexer.java:11)
... 13 more
ERROR [main] (MarcHandler.java:471) - Unable to load Custom indexer: org.solrmarc.index.ArrowIndexer
ERROR [main] (MarcImporter.java:647) - Error configuring Indexer from properties file. Exiting...
Error configuring Indexer from properties file. Exiting...
Is it allowed to define a custom function for the 'id' field? Should
the name of the method match exactly how it's referred to in
_index.properties?
Thanks again,
Willem
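For readers wondering how a method named in a properties file is located at all: the stack trace above (verifyCustomMethodExists) suggests a reflective lookup, though the thread does not show SolrMarc's actual code. The sketch below is a generic illustration of how Java reflection matches a method by exact name and parameter types. StubRecord is a stand-in for org.marc4j.marc.Record, and all class names here are invented for the example. It does not diagnose the error reported above.

```java
import java.lang.reflect.Method;

// Stand-in for org.marc4j.marc.Record so the sketch compiles on its own.
class StubRecord {}

// Two hypothetical indexers that differ only in the method signature.
class NoArgIndexer {
    public String getTelid() { return "id-1"; }
}

class RecordArgIndexer {
    public String getTelid(StubRecord record) { return "id-2"; }
}

class CustomMethodLookup {

    // True if the class has a public method with exactly this name
    // that takes a single StubRecord argument.
    static boolean hasRecordMethod(Class<?> indexerClass, String name) {
        try {
            Method m = indexerClass.getMethod(name, StubRecord.class);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasRecordMethod(NoArgIndexer.class, "getTelid"));     // false
        System.out.println(hasRecordMethod(RecordArgIndexer.class, "getTelid")); // true
        System.out.println(hasRecordMethod(RecordArgIndexer.class, "gettelid")); // false
    }
}
```

With a lookup of this kind, both the spelling (including case) and the parameter list have to match. A method with the right name but no Record parameter would not be found.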
If you have any more questions as you move forward, please feel free to
contact the solrmarc group. Your project's plans to handle both USMarc
and Unimarc records are very interesting.
-Bob Haschart
Just to wrap this thread up: I have updated my migration notes, and they
now describe what to do to use the 2.1 branch with a custom indexer
function. Feel free to use them as you like.
=========================================================
Migration notes SOLRMARC 2.0 to 2.1 with a custom built indexer
Running ant dist will copy these scripts to /dist/bin. It is
suggested to run the index scripts from that directory
/dist/bin. Before we can start the indexing we need to work around one
minor bug in the generation of the properties files (present at the
time of writing): in <site-prefix>_config.properties you need to add
.jar to the name of the custom jar file, i.e. replace:
solrmarc.custom.jar.path=MyIndexer
by:
solrmarc.custom.jar.path=MyIndexer.jar
This will probably be fixed in the final 2.1 version.
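Since the missing .jar suffix is easy to overlook, a small sanity check of the generated config file can catch it before indexing. The following is a sketch only: CustomIndexerConfigCheck is a hypothetical helper, not something SolrMarc ships, and it checks nothing beyond the two properties discussed in this thread (solrmarc.custom.jar.path and solr.indexer).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration; not part of SolrMarc.
class CustomIndexerConfigCheck {

    // Returns a description of the problem, or null if nothing looks wrong.
    static String check(Properties config) {
        String jarPath = config.getProperty("solrmarc.custom.jar.path", "").trim();
        String indexer = config.getProperty("solr.indexer", "").trim();
        if (indexer.isEmpty() || indexer.equals("org.solrmarc.index.SolrIndexer")) {
            return null; // default indexer, so no custom jar is required
        }
        if (jarPath.isEmpty()) {
            return "solr.indexer is custom but solrmarc.custom.jar.path is not set";
        }
        if (!jarPath.endsWith(".jar")) {
            return "solrmarc.custom.jar.path should end in .jar: " + jarPath;
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Inline copy of the two relevant lines as the build generated them.
        Properties config = new Properties();
        config.load(new StringReader(
            "solrmarc.custom.jar.path=ArrowIndexer\n"
          + "solr.indexer = org.solrmarc.index.ArrowIndexer\n"));
        // prints: solrmarc.custom.jar.path should end in .jar: ArrowIndexer
        System.out.println(check(config));
    }
}
```

To check a real file, the StringReader could be replaced with a FileReader on the site-specific config properties file.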
Now we can start the indexing:
>> cd <devlocation>/dist/bin
>> ./indexfile my_marc.mrc
One note about the custom indexer: the example provided on the
ConfiguringSolrMarc page is missing an import that is needed to
successfully compile the indexer. It should read:
// Set and LinkedHashSet also need importing for this class to compile
import java.util.LinkedHashSet;
import java.util.Set;
import org.solrmarc.index.SolrIndexer;
import org.marc4j.marc.Record;
public class BlacklightIndexer extends SolrIndexer
{
    public BlacklightIndexer(String propertiesMapFile, String propertyPaths[])
    {
        super(propertiesMapFile, propertyPaths);
    }
    public Set<String> getRecordingAndScore(Record record)
    {
        Set<String> result = new LinkedHashSet<String>();
        // content omitted, see ConfiguringSolrMarc page
        return result;
    }
}
During the compilation phase of ant dist the import is resolved by
adding marc4j.jar to the classpath, so as a developer you need not
worry about that.
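A side note on the getTelid examples earlier in this thread: they build the id from System.currentTimeMillis() alone, and two records handled within the same millisecond would then receive the same id. One possible way around this, sketched below, is to pair a start timestamp with a per-run counter. TelIdGenerator is a hypothetical class for illustration only, and whether ids that change on every run suit a given index is a separate design question.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical id generator for illustration; not part of SolrMarc.
class TelIdGenerator {
    private final long startMillis;
    private final AtomicLong counter = new AtomicLong();

    TelIdGenerator(long startMillis) {
        this.startMillis = startMillis;
    }

    // Combines the run's start time with a sequence number, so ids stay
    // distinct within a run even when many are requested per millisecond.
    String nextId() {
        return "id" + startMillis + "-" + counter.incrementAndGet();
    }

    public static void main(String[] args) {
        TelIdGenerator gen = new TelIdGenerator(System.currentTimeMillis());
        Set<String> seen = new HashSet<String>();
        for (int i = 0; i < 100000; i++) {
            seen.add(gen.nextId());
        }
        System.out.println(seen.size()); // prints 100000: no duplicates
    }
}
```

Inside a custom indexer, a single generator instance could be held in a field and called from the id method. That wiring is left out here because it depends on the SolrIndexer base class.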