Linking records in ReDBox is causing an exception during curation.


Jay van Schyndel

Oct 30, 2012, 5:53:32 PM
to redbo...@googlegroups.com
Hi Everyone,

After trying to solve this one on my own it's time to ask for some assistance.
All help is appreciated.

At JCU, when I harvest details on a bird species, I am creating two entries in ReDBox, 'suitability' and 'occurrence'.
A relationship is set up between the two entries for each bird species.

Here is the setup of the ReDBox data for the two entries.

Bird: Pale-headed Rosella - Platycercus (Violania) adscitus - current and future species distribution models
dc:relation.vivo:Dataset.1.dc:identifier =  jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/occurrences
dc:relation.vivo:Dataset.1.dc:title = Pale-headed Rosella - Platycercus (Violania) adscitus - current and future species distribution models
dc:relation.vivo:Dataset.1.redbox:publish = on
dc:relation.vivo:Dataset.1.vivo:Relationship.rdf:PlainLiteral = isDerivedFrom
dc:relation.vivo:Dataset.1.vivo:Relationship.skos:prefLabel = Derived from:


Bird: Pale-headed Rosella - Platycercus (Violania) adscitus - occurrence records filtered for species distribution modelling

dc:relation.vivo:Dataset.1.dc:identifier = jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/suitability
dc:relation.vivo:Dataset.1.dc:title = Pale-headed Rosella - Platycercus (Violania) adscitus - current and future species distribution models
dc:relation.vivo:Dataset.1.redbox:publish = on
dc:relation.vivo:Dataset.1.vivo:Relationship.rdf:PlainLiteral = hasDerivedCollection
dc:relation.vivo:Dataset.1.vivo:Relationship.skos:prefLabel = Has derivation:

From the above, the 'suitability' entry (current and future models) is derived from the 'occurrence' entry, and the 'occurrence' entry has a derivation, namely the 'suitability' entry.

During curation I receive an exception on this relationship.
Here is an example of the exception.

2012-10-31 08:16:37,093 transactionManager DEBUG  CurationManager     
{
    "task": "curation-request",
    "oid": "jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/occurrences",
    "relationships": [
        {
            "identifier": "jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/suitability",
            "curatedPid": "jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/suitability",
            "broker": "tcp://localhost:9101",
            "isCurated": true,
            "relationship": "hasDerivedCollection",
            "oid": "3a4132b1e51731bf0efb4be5ac571aa6"
        }
    ],
    "respond": {
        "broker": "tcp://localhost:9101",
        "oid": "3a4132b1e51731bf0efb4be5ac571aa6",
        "task": "curation-pending"
    }
}
2012-10-31 08:16:37,095 transactionManager ERROR  CurationManager      Error accessing object 'jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/occurrences' in storage:
com.googlecode.fascinator.api.storage.StorageException: oID 'jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/occurrences' doesn't exist in storage.
    at com.googlecode.fascinator.storage.filesystem.FileSystemStorage.getObject(FileSystemStorage.java:206) ~[plugin-storage-filesystem-1.1.2.jar:na]
    at com.googlecode.fascinator.redbox.plugins.curation.redbox.CurationManager.getConfigFromStorage(CurationManager.java:1672) [plugin-transaction-curation-redbox-1.5.2.jar:na]
    at com.googlecode.fascinator.redbox.plugins.curation.redbox.CurationManager.curation(CurationManager.java:523) [plugin-transaction-curation-redbox-1.5.2.jar:na]
    at com.googlecode.fascinator.redbox.plugins.curation.redbox.CurationManager.parseMessage(CurationManager.java:1383) [plugin-transaction-curation-redbox-1.5.2.jar:na]
    at com.googlecode.fascinator.common.transaction.GenericTransactionManager.parseMessage(GenericTransactionManager.java:172) [fascinator-common-1.1.2.jar:na]
    at com.googlecode.fascinator.messaging.TransactionManagerQueueConsumer.onMessage(TransactionManagerQueueConsumer.java:382) [fascinator-core-1.1.2.jar:na]
    at org.apache.activemq.ActiveMQMessageConsumer.dispatch(ActiveMQMessageConsumer.java:1088) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.ActiveMQSessionExecutor.dispatch(ActiveMQSessionExecutor.java:127) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.ActiveMQSessionExecutor.iterate(ActiveMQSessionExecutor.java:197) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43) [activemq-all-5.3.0.jar:5.3.0]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_37]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_37]
    at java.lang.Thread.run(Thread.java:680) [na:1.6.0_37]
2012-10-31 08:16:37,095 transactionManager ERROR  CurationManager      Error accessing item configuration!

As you can see from the above exception, the code is trying to use an "identifier" value (for example "jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/suitability") as an OID to read from storage. An OID is normally a hash value, so the lookup fails and an exception is generated.
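
A quick way to confirm this kind of mix-up is to check whether a value has the shape of a storage OID before it is used. This is only a sketch: it assumes OIDs are 32-character lowercase hex strings, as the ones in the log above appear to be, and OidShape and looksLikeOid are hypothetical names, not part of ReDBox.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ReDBox or The Fascinator.
class OidShape {
    // Assumption: storage OIDs look like 32 lowercase hex characters,
    // e.g. "3a4132b1e51731bf0efb4be5ac571aa6" from the log above.
    private static final Pattern HEX_32 = Pattern.compile("[0-9a-f]{32}");

    static boolean looksLikeOid(String value) {
        return value != null && HEX_32.matcher(value).matches();
    }

    public static void main(String[] args) {
        String fromLog = "3a4132b1e51731bf0efb4be5ac571aa6";
        String identifier = "jcu.edu.au/tdh/collection/Pale-headed Rosella"
                + " - Platycercus (Violania) adscitus/occurrences";
        System.out.println(looksLikeOid(fromLog));    // true
        System.out.println(looksLikeOid(identifier)); // false
    }
}
```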

After looking at the code in CurationManager.java and the setup in the .json for this relationship, I made a few changes to my harvest rules file.
Here is the relation setup from system-config.json

        "relations": {
            "dc:relation.vivo:Dataset.0.dc:identifier": {
                "path": ["dc:relation", "vivo:Dataset"],
                "identifier": ["dc:identifier"],
                "relationship": ["vivo:Relationship", "rdf:PlainLiteral"],
                "excludeCondition": {
                    "path": ["redbox:publish"],
                    "value": ""
                },
                "system": "redbox",
                "optional": true
            },

This is the only relationship setup that has system set to redbox; by default, in CurationManager, the other relationships use mint.
The method that does the work in CurationManager is 'lookForRelation'.
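
To make the link between this config and the harvested fields concrete, here is a minimal sketch. It assumes the field name is simply the path, a list index and the key joined with dots (which matches the dc:relation.vivo:Dataset.1.dc:identifier fields shown earlier), and that a relation with no "system" entry falls back to mint as described above. RelationFields, fieldName and systemFor are made-up names for illustration, not ReDBox code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not ReDBox code.
class RelationFields {
    // Assumption: field name = path, then list index, then key, dot-joined.
    static String fieldName(List<String> path, int index, List<String> key) {
        return String.join(".", path) + "." + index + "." + String.join(".", key);
    }

    // Assumption: a relation with no "system" entry is treated as "mint".
    static String systemFor(Map<String, String> relationConfig) {
        return relationConfig.getOrDefault("system", "mint");
    }

    public static void main(String[] args) {
        List<String> path = Arrays.asList("dc:relation", "vivo:Dataset");
        System.out.println(fieldName(path, 1, Arrays.asList("dc:identifier")));
        System.out.println(fieldName(path, 1,
                Arrays.asList("vivo:Relationship", "rdf:PlainLiteral")));
        System.out.println(systemFor(Collections.singletonMap("system", "redbox")));
        System.out.println(systemFor(Collections.<String, String>emptyMap()));
    }
}
```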

In an attempt to resolve this exception myself, I altered my harvest script so that the identifier generated was the same as the hash OID (I borrowed some code from my harvester to do this).
This did resolve the above exception, but the curation process ended up in a loop, going on forever just filling up logs.

How can I get curation to work properly when I am linking two entries in ReDBox?

To help, I have included a copy of my log. This includes the harvesting and curation. I am harvesting directly into 'Published', and curation is then kicked off by the harvest rules file.

I'm at a dead end on this one, so any help is greatly appreciated.

Thanks,
              Jay.
Curation exception log.rtf

Greg Pendlebury

Oct 30, 2012, 8:54:56 PM
to redbo...@googlegroups.com
Hey Jay, is it possibly this line that is the real culprit?


2012-10-31 08:16:37,093 transactionManager DEBUG  CurationManager     
{
    "task": "curation-request",
    "oid": "jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/occurrences",
    "relationships": [
        {
            "identifier": "jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/suitability",
            "curatedPid": "jcu.edu.au/tdh/collection/Pale-headed Rosella - Platycercus (Violania) adscitus/suitability",
            "broker": "tcp://localhost:9101",
            "isCurated": true,
            "relationship": "hasDerivedCollection",
            "oid": "3a4132b1e51731bf0efb4be5ac571aa6"
        }
    ],
    "respond": {
        "broker": "tcp://localhost:9101",
        "oid": "3a4132b1e51731bf0efb4be5ac571aa6",
        "task": "curation-pending"
    }
}

I suspect (without reading any code) that the curation manager should try to resolve an 'identifier' to an 'oid' before using it, but the 'oid' that has been directly provided will go straight to storage. If I am wrong about the resolving step it may simply be that Mint does this and ReDBox does not, which would require a deeper alteration.

Ta,
Greg



Greg Pendlebury

Nov 5, 2012, 12:41:05 AM
to redbo...@googlegroups.com
Hi Jay,

It looks like this message is coming back from Mint:

{
    "task": "curation-pending",
    "oid": "5863d7f96f78a5799ebb6f19a06e324d",
    "originId": "jcu.edu.au/parties/people/CEAAD8CB24C56047F844D4D1E8A7BA87",
    "originOid": "f661c1b94f165e63bf662bac38bc100a",
    "curatedPid": "http://127.0.0.1:9001/mint/published/detail/f661c1b94f165e63bf662bac38bc100a"
}

The 'curation-pending' messages are sent from children to parents after curation finishes on the child. This record looks like it is coming from a Party record in Mint. It would be worth checking the party record for any stale relationship data from previous dev work. I used to see this myself when I was hacking away at ReDBox and forgot to keep my dev Mint in sync with my dev ReDBox.

You can also ignore it. I also did that until they annoyed me enough to reset both systems.
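
If you want to confirm that such a message is stale before ignoring it, something along these lines would do. It is a rough sketch only: it assumes the message has already been parsed into a map and that you can list the OIDs held in local storage. PendingCheck and isStale are hypothetical and do not exist in ReDBox or Mint.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical check, not part of ReDBox or Mint.
class PendingCheck {
    // Treat a 'curation-pending' message as stale when it targets an OID
    // that is not present in local storage.
    static boolean isStale(Map<String, String> message, Set<String> localOids) {
        return "curation-pending".equals(message.get("task"))
                && !localOids.contains(message.get("oid"));
    }

    public static void main(String[] args) {
        Map<String, String> message = new HashMap<>();
        message.put("task", "curation-pending");
        message.put("oid", "5863d7f96f78a5799ebb6f19a06e324d");
        message.put("originOid", "f661c1b94f165e63bf662bac38bc100a");

        // Assumption: this set stands in for whatever OIDs storage holds.
        Set<String> localOids =
                Collections.singleton("3a4132b1e51731bf0efb4be5ac571aa6");

        System.out.println(isStale(message, localOids)); // true
    }
}
```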

Ta,
Greg

On 1 November 2012 09:52, Jay van Schyndel <oze...@gmail.com> wrote:
Hi Greg,

Yes, you are correct, the line below is the culprit.

I have resolved the exception by making changes to CurationManager.java.

During curation, when parsing "curation-confirm" to create the "curation-request" message, the method checkChildren() causes the problem.
Here is the logic from checkChildren():
            // We need to find OIDs to match IDs (only for local records)
            String relatedOid = json.getString(null, "oid");
            if (relatedOid == null && localRecord) {
                String identifier = json.getString(null, "identifier");
                if (identifier == null) {
                    throw new TransactionException(
                            "NULL identifer provided!");
                }
                relatedOid = idToOid(identifier);
                if (relatedOid == null) {
                    throw new TransactionException(
                            "Cannot resolve identifer: " + identifier);
                }
                ((JsonObject) relation).put("oid", relatedOid);
                saveData = true;
            }

The above can convert the identifier with idToOid(), but it never does here: the branch only runs when the OID is null, and for this local record the OID has already been set (to the identifier).
To fix this, when the relation is written to the .tfpackage, I have modified the method lookForRelation()
as follows:
        // ** -6- ** SYSTEM / BROKER
        String system = config.getString("mint", "system");
        if (system != null && system.equals("mint")) {
            newRelation.put("broker", mintBroker);
        } else {
            newRelation.put("broker", brokerUrl);
            log.info("Jay: this should be redbox and no OID, broker: {}", brokerUrl);
            // ReDBox records should also be told that the ID is an OID
            // Jay commented this one out; it causes an exception in CurationManager.
            // checkChildren() will convert the identifier to an oid when a
            // 'curation-confirm' is processed
            //newRelation.put("oid", id);
        }

I just commented out the line that adds the oid.
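
The effect of that one line can be reproduced outside ReDBox. The sketch below is a simplified stand-in for the checkChildren() branch quoted above. It assumes relations are plain maps and that idToOid() behaves like a lookup in an in-memory index; ResolveSketch, resolve, the shortened identifier and the index contents are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the checkChildren() branch; not the real class.
class ResolveSketch {
    // Assumption: idToOid() acts like a lookup from identifier to OID.
    static final Map<String, String> INDEX = new HashMap<>();
    static {
        INDEX.put("jcu.edu.au/tdh/collection/example/suitability",
                "3a4132b1e51731bf0efb4be5ac571aa6");
    }

    // Returns the value that would later be handed to storage as the OID.
    static String resolve(Map<String, String> relation, boolean localRecord) {
        String relatedOid = relation.get("oid");
        if (relatedOid == null && localRecord) {
            String identifier = relation.get("identifier");
            if (identifier == null) {
                throw new IllegalStateException("NULL identifier provided!");
            }
            relatedOid = INDEX.get(identifier);
            if (relatedOid == null) {
                throw new IllegalStateException(
                        "Cannot resolve identifier: " + identifier);
            }
            relation.put("oid", relatedOid);
        }
        return relatedOid;
    }

    public static void main(String[] args) {
        String id = "jcu.edu.au/tdh/collection/example/suitability";

        // Before the change: "oid" is pre-filled with the identifier,
        // so the lookup is skipped and the identifier reaches storage.
        Map<String, String> before = new HashMap<>();
        before.put("identifier", id);
        before.put("oid", id);
        System.out.println(resolve(before, true));

        // After the change: no "oid", so the identifier is resolved.
        Map<String, String> after = new HashMap<>();
        after.put("identifier", id);
        System.out.println(resolve(after, true));
    }
}
```

The first call prints the identifier unchanged and the second prints the hash, which is the difference between the failing and the working curation-request.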

After testing, the exception was resolved. :)
However, I am now receiving some newer ones. I don't believe they are due to my fix; now that the curation process runs further, I am simply encountering more.
Here is an example.

2012-11-01 08:38:02,013 transactionManager DEBUG  CurationManager     
{
    "task": "curation-pending",
    "oid": "5863d7f96f78a5799ebb6f19a06e324d",
    "originId": "jcu.edu.au/parties/people/CEAAD8CB24C56047F844D4D1E8A7BA87",
    "originOid": "f661c1b94f165e63bf662bac38bc100a",
    "curatedPid": "http://127.0.0.1:9001/mint/published/detail/f661c1b94f165e63bf662bac38bc100a"
}
2012-11-01 08:38:02,015 transactionManager ERROR  CurationManager      Error accessing object '5863d7f96f78a5799ebb6f19a06e324d' in storage:
com.googlecode.fascinator.api.storage.StorageException: oID '5863d7f96f78a5799ebb6f19a06e324d' doesn't exist in storage.
    at com.googlecode.fascinator.storage.filesystem.FileSystemStorage.getObject(FileSystemStorage.java:206) ~[plugin-storage-filesystem-1.1.2.jar:na]
    at com.googlecode.fascinator.redbox.plugins.curation.redbox.CurationManager.getConfigFromStorage(CurationManager.java:1677) [plugin-transaction-curation-redbox-1.5.2.jar:na]
    at com.googlecode.fascinator.redbox.plugins.curation.redbox.CurationManager.curation(CurationManager.java:528) [plugin-transaction-curation-redbox-1.5.2.jar:na]
    at com.googlecode.fascinator.redbox.plugins.curation.redbox.CurationManager.parseMessage(CurationManager.java:1388) [plugin-transaction-curation-redbox-1.5.2.jar:na]
    at com.googlecode.fascinator.common.transaction.GenericTransactionManager.parseMessage(GenericTransactionManager.java:172) [fascinator-common-1.1.2.jar:na]
    at com.googlecode.fascinator.messaging.TransactionManagerQueueConsumer.onMessage(TransactionManagerQueueConsumer.java:382) [fascinator-core-1.1.2.jar:na]
    at org.apache.activemq.ActiveMQMessageConsumer.dispatch(ActiveMQMessageConsumer.java:1088) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.ActiveMQSessionExecutor.dispatch(ActiveMQSessionExecutor.java:127) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.ActiveMQSessionExecutor.iterate(ActiveMQSessionExecutor.java:197) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43) [activemq-all-5.3.0.jar:5.3.0]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_37]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_37]
    at java.lang.Thread.run(Thread.java:680) [na:1.6.0_37]
2012-11-01 08:38:02,015 transactionManager ERROR  CurationManager      Error accessing item configuration!
2012-11-01 08:38:02,015 transactionManager ERROR  ManagerQueueConsumer Error indexing OID '5863d7f96f78a5799ebb6f19a06e324d'
com.googlecode.fascinator.api.indexer.IndexerException: com.googlecode.fascinator.api.storage.StorageException: oID '5863d7f96f78a5799ebb6f19a06e324d' doesn't exist in storage.
    at com.googlecode.fascinator.indexer.SolrIndexer.index(SolrIndexer.java:517) ~[plugin-indexer-solr-1.1.2.jar:na]
    at com.googlecode.fascinator.messaging.TransactionManagerQueueConsumer.index(TransactionManagerQueueConsumer.java:553) [fascinator-core-1.1.2.jar:na]
    at com.googlecode.fascinator.messaging.TransactionManagerQueueConsumer.processOrders(TransactionManagerQueueConsumer.java:429) [fascinator-core-1.1.2.jar:na]
    at com.googlecode.fascinator.messaging.TransactionManagerQueueConsumer.onMessage(TransactionManagerQueueConsumer.java:395) [fascinator-core-1.1.2.jar:na]
    at org.apache.activemq.ActiveMQMessageConsumer.dispatch(ActiveMQMessageConsumer.java:1088) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.ActiveMQSessionExecutor.dispatch(ActiveMQSessionExecutor.java:127) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.ActiveMQSessionExecutor.iterate(ActiveMQSessionExecutor.java:197) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122) [activemq-all-5.3.0.jar:5.3.0]
    at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43) [activemq-all-5.3.0.jar:5.3.0]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [na:1.6.0_37]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [na:1.6.0_37]
    at java.lang.Thread.run(Thread.java:680) [na:1.6.0_37]
Caused by: com.googlecode.fascinator.api.storage.StorageException: oID '5863d7f96f78a5799ebb6f19a06e324d' doesn't exist in storage.
    at com.googlecode.fascinator.storage.filesystem.FileSystemStorage.getObject(FileSystemStorage.java:206) ~[plugin-storage-filesystem-1.1.2.jar:na]
    at com.googlecode.fascinator.indexer.SolrIndexer.index(SolrIndexer.java:507) ~[plugin-indexer-solr-1.1.2.jar:na]
    ... 11 common frames omitted

I don't know where the OID in this exception comes from. I am currently investigating this one.
I have attached a new copy of the logs as they currently stand for a curation.

Cheers,
               Jay.

Jay van Schyndel

Nov 11, 2012, 6:54:13 PM
to redbo...@googlegroups.com
Hi Greg,

Thanks for the information.
I'll do some more testing, but this seems to take care of my curation issues.

Cheers,
               Jay.