Strategy for Synchronization with Institutional Database
Subject: Re: [islandora-dev] Strategy for Synchronization with Institutional Database
From: Mark Leggott <mlegg...@mac.com>
Date: Fri, 10 Aug 2012 13:01:38 -0300
To: islandora-dev@googlegroups.com
Hi Roger,
Not sure if you got an answer to this offline or not, so I thought I would add a couple of comments.
- We have some examples of sync with external/enterprise systems, but as you can imagine they are highly customized to the local systems.
- We have an example here at UPEI where we built integration with the locally developed financial system. When a scanned set of PO documents was uploaded, we queried the financial system for that PO record and it sent back a package of XML, using a custom program stored on their side.
- We also have numerous examples of Fedora records being populated via queries of external systems (typically via an API) with creation/updating of the record whenever the existing object is accessed.
- We have another system developed via DiscoveryGarden which takes an export of the organization's ERP data (coming from multiple enterprise systems) each night and does an insertion/sync depending on the existence of new/modified records. This is the trickiest, since the organization can edit the records in the repository (they have information there that is not present in the ERP systems), so the system has to sync only specific fields so that data on the repo side is not lost.
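To make the last case a bit more concrete, here is a minimal sketch of a field-level sync in Python. It is not the DiscoveryGarden code: the function name `sync_export`, the `source_fields` parameter, and the use of plain dictionaries in place of Fedora objects are all assumptions made for illustration. The idea is only that the nightly export may overwrite the fields the ERP owns, while everything else on the repository record is left alone, and records that have dropped out of the export are not deleted.

```python
from typing import Dict, Iterable, Set

Record = Dict[str, str]


def sync_export(repo: Dict[str, Record], export: Iterable[Record],
                source_fields: Set[str], key: str = "id") -> Dict[str, int]:
    """Merge one export into the repository records (hypothetical helper).

    Only fields named in source_fields are written, so edits made on the
    repository side survive. Records absent from the export are untouched.
    """
    stats = {"inserted": 0, "updated": 0, "unchanged": 0}
    for row in export:
        rec_id = row[key]
        # Keep only the fields the source system is allowed to control.
        incoming = {f: row[f] for f in source_fields if f in row}
        existing = repo.get(rec_id)
        if existing is None:
            # New record: create it from the source-owned fields.
            repo[rec_id] = {key: rec_id, **incoming}
            stats["inserted"] += 1
        elif any(existing.get(f) != v for f, v in incoming.items()):
            # Modified record: overwrite source-owned fields only.
            existing.update(incoming)
            stats["updated"] += 1
        else:
            stats["unchanged"] += 1
    return stats
```

In a real installation the dictionary would be replaced by reads and writes against the repository, and the list of source-owned fields would come from whatever the organization agrees the ERP is authoritative for.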
Anyway, there are lots of examples that I can think of, so if you have a more detailed description of your requirements we might be able to provide some. I don't believe there are specific examples of code in Git or anywhere else, although we are actively working on a section of the islandora.ca site that would provide this kind of service. We should be making it available in the Fall.
Re the Active Directory piece, you can integrate AD with Islandora via the Drupal LDAP module, which provides basic sync between systems. You can also add additional processing on the Drupal/Islandora side to enhance what is done. Re Drupal module vs free-standing, we have done both, and it depends again on the specific requirements and therefore on which makes better business or technical sense. In the financial system example above there was a business interest in not having us get into their back-end database, so they built a small Java app that we sent our query to and that sent us back what we needed.
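The query-through-an-intermediary pattern from the financial system example might look roughly like the sketch below. Everything specific in it is an assumption: the function name `fetch_po_metadata`, the `send_query` callable, and the flat XML layout are invented for illustration, since the actual XML package and transport are local to that system. The transport is passed in as a function so that the same code works whether the other side is reached over HTTP, a socket, or a command-line call.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict


def fetch_po_metadata(po_number: str,
                      send_query: Callable[[str], str]) -> Dict[str, str]:
    """Ask the remote service for one PO and flatten its XML reply.

    send_query is a hypothetical callable that takes a PO number and
    returns the XML text produced by the service on the other side.
    """
    xml_text = send_query(po_number)
    root = ET.fromstring(xml_text)
    # Assumes a flat layout: one child element per field under the root.
    return {child.tag: (child.text or "").strip() for child in root}


def fake_service(po_number: str) -> str:
    """Stand-in for the remote app, used only to exercise the sketch."""
    return f"<po><number>{po_number}</number><vendor> Acme </vendor></po>"


metadata = fetch_po_metadata("PO-1", fake_service)
```

The resulting dictionary could then be attached to the uploaded object as metadata, without the repository side ever touching the back-end database directly.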
Mark
On 2012-07-19, at 10:31 AM, Roger Hyam <rogerh...@googlemail.com> wrote:
> Hi,
>
> We have several institutional databases that contain both core business objects (e.g. specimen records) and organizational objects (e.g. people in ActiveDirectory and other places).
>
> We want to have these objects represented by objects in our Islandora repository and have cron jobs run periodically to keep the repository updated with changes in the source dbs.
>
> I have done something similar to this before by just indexing multiple resources using Solr but Islandora would act as a "sticky index" where objects would remain in their final state even if they drop out of the institutional databases. The repository will also contain a lot of stuff that doesn't reside anywhere else.
>
> What is the best strategy for developing these synchronization scripts? Would it be better to develop a Drupal module for each script or have free standing command line programs? I am comfortable hacking in PHP and Java.
>
> I'm sure this kind of thing is done quite commonly. Is there any documentation or any example code?
>
> Many thanks,
>
> Roger
>