Want to Migrate from EPrints to Dspace

zen zenitram

Feb 7, 2020, 4:56:47 AM

to DSpace Technical Support

Good day,

Is there a way to migrate from EPrints to DSpace and keep all the data that is stored in EPrints?

Thank you!

Tim Donohue

Feb 7, 2020, 10:10:44 AM

to zen zenitram, DSpace Technical Support

Hello,

Unfortunately, there is not an official way to move content directly from EPrints to DSpace. However, DSpace has a variety of import options. So, if you can get data out of EPrints into a format that DSpace understands, then you can import it into DSpace.

DSpace's ingest options are all in the documentation at:
https://wiki.lyrasis.org/display/DSDOC6x/Ingesting+Content+and+Metadata

The simplest format that DSpace uses is its Simple Archive Format, which just involves putting each file in a directory next to a corresponding metadata file: https://wiki.lyrasis.org/display/DSDOC6x/Importing+and+Exporting+Items+via+Simple+Archive+Format
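
To make that layout concrete, here is a minimal Python sketch that writes one item directory containing a dublin_core.xml file, a contents file listing the bitstreams, and the files themselves. The function name write_saf_item and the tuple-based metadata argument are hypothetical choices for illustration and are not part of DSpace; check the linked documentation for the full set of options the format supports.

```python
from pathlib import Path
import shutil
import xml.etree.ElementTree as ET


def write_saf_item(archive_dir, item_name, metadata, files=()):
    """Write one Simple Archive Format item directory.

    metadata is a list of (element, qualifier, value) tuples; use "none"
    as the qualifier for unqualified fields. files is an iterable of paths
    to copy into the item directory. (Hypothetical helper, not a DSpace API.)
    """
    item_dir = Path(archive_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # dublin_core.xml: one <dcvalue> per metadata value.
    root = ET.Element("dublin_core")
    for element, qualifier, value in metadata:
        dcvalue = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # Copy each file next to the metadata and list its name in "contents".
    names = []
    for src in files:
        src = Path(src)
        shutil.copy(src, item_dir / src.name)
        names.append(src.name)
    (item_dir / "contents").write_text(
        "".join(name + "\n" for name in names), encoding="utf-8"
    )
    return item_dir
```

Calling this once per record, with item names such as item_000, item_001, and so on, should produce a directory tree that can be handed to the importer described on the page above.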

There are also community tools for transforming an Excel spreadsheet of metadata into that Simple Archive Format (SAF), for example: https://github.com/DSpace-Labs/SAFBuilder

If you require more direct help in migrating from EPrints to DSpace, we have a number of service providers who can be hired to do the migration for you. See https://duraspace.org/dspace/resources/service-providers/

Hopefully that gives a few hints on how to get started. If you have more specific questions, feel free to ask them on this list.

Tim

Joshua Allan Westgard

Feb 8, 2020, 9:21:17 AM

to DSpace Technical Support

Hi Zen,

We have done this migration recently, taking a standalone subject-based EPrints repository and importing it into our DSpace instance, where it now lives as one community in a campus-wide IR.

We have some scripts on GitHub [1], but these were not created with the intention that they would be anything more than a one-off solution to our particular problem, so I'm not sure how much help they would be to you. I suppose it depends on the requirements of your migration project. Our approach was to do what Tim described, essentially following the Extract-Transform-Load pattern.

(1) For extraction, for the most part we just pulled data from the Dublin Core representation of the metadata that was exposed by the existing EPrints instance. If I remember correctly there was just one thing we needed to pull from the database directly, and that was easy to integrate by matching on the EPrints ID.
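
As a rough illustration of the extraction step, the sketch below parses one Dublin Core record into a dictionary of field names and values. It assumes the record arrives as oai_dc-style XML using the standard Dublin Core element namespace; that is only an assumption for the example, and the sample record and the parse_dc_record name are made up rather than taken from our scripts.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard namespace for the 15 simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Made-up sample record for illustration only.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:type>Article</dc:type>
</oai_dc:dc>"""


def parse_dc_record(xml_text):
    """Return {field name: [values]} for every non-empty DC element."""
    root = ET.fromstring(xml_text)
    fields = defaultdict(list)
    prefix = "{" + DC_NS + "}"
    for el in root.iter():
        if el.tag.startswith(prefix) and el.text and el.text.strip():
            name = el.tag[len(prefix):]
            fields[name].append(el.text.strip())
    return dict(fields)


record = parse_dc_record(SAMPLE)
```

Anything pulled separately from the database can then be merged into the same dictionary, keyed on the EPrints ID.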

(2) Next, we wrote some scripts that performed a custom crosswalk of this Dublin Core metadata to the metadata format required by our DSpace. This was pretty straightforward because DSpace also uses DC metadata, though there were things that needed to be mapped over (specifically the type vocabulary did not match up exactly between the two versions of DC). We also used this opportunity to bring the data into closer alignment with the standards and practices of our IR. One of the challenges for us was that the existing EPrints repository had included many external links (without the original files), and a large number of those links were broken. We ended up updating the links that were permanent redirects but excluding broken links from our import package unless the object also had a copy of the binary.
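
A simplified sketch of what such a crosswalk can look like is below. The TYPE_MAP entries, the function names, and the use of precomputed HTTP status codes are all hypothetical and chosen for illustration; the real mapping depends on the vocabularies in use at each end, and this is not the code from our repository.

```python
# Hypothetical mapping from source type values to target type values.
TYPE_MAP = {
    "Conference or Workshop Item": "Presentation",
    "Monograph": "Book",
}


def crosswalk(fields, type_map=TYPE_MAP):
    """Turn {name: [values]} into (element, qualifier, value) tuples,
    remapping the type vocabulary and passing other fields through."""
    out = []
    for name, values in fields.items():
        for value in values:
            if name == "type":
                value = type_map.get(value, value)
            out.append((name, "none", value))
    return out


def resolve_link(url, status, redirect_target=None):
    """Return the link to keep, or None if it should be treated as broken.

    status is an HTTP status code collected beforehand by a link checker.
    """
    if status in (301, 308) and redirect_target:
        return redirect_target  # permanent redirect: store the new location
    if 200 <= status < 300:
        return url
    return None


def include_item(has_file, link):
    """Keep an item if it has a binary or at least a working link."""
    return has_file or link is not None
```

Keeping the link check as a pure function over precomputed status codes makes the decision easy to test without touching the network.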

(3) Finally, we had scripts to assemble a package in Simple Archive Format for import to DSpace.

If any of this is of interest I'm happy to answer specific questions.

Best,

Josh Westgard

University of Maryland Libraries

[1] https://github.com/jwestgard/eprints2dspace
