bitstore-migrate fails on deleted items

Gary Browne

Jul 30, 2019, 12:57:47 AM
to DSpace Technical Support
Hi all,

On DSpace 6, I'm attempting to migrate my filesystem bitstreams to S3.

S3 (bitstore.xml) is correctly configured; however, when I run:

$ [dspace]/bin/dspace bitstore-migrate -a 0 -b 1

I get the following error:

Exception during BitStoreMigrate: java.io.FileNotFoundException: /srv/dspace/assetstore/13/51/71/135171749737414497487002785424297776835 (No such file or directory)

I don't see any flag for handling deleted items. I have verified that this file does not exist on the filesystem, so why is bitstore-migrate trying to migrate it?

Thanks,
Gary

Mark H. Wood

Jul 30, 2019, 9:35:38 AM
to DSpace Technical Support
Apparently there is a Bitstream record in the database which claims
that this file exists. It is possible that the file *should* exist,
or that the Bitstream record should *not*. I would try to find the
object holding this Bitstream -- if there is none, then the Bitstream
should be deleted; otherwise I'd try to figure out where to get a copy
of the file and restore it.

--
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu

Gary Browne

Jul 30, 2019, 9:58:20 AM
to dspac...@googlegroups.com

Hi Mark,

Thanks for your reply.

You're right that there is a record in the database for this file, but the "deleted" field value is "true". I'm assuming in this case that there should be no file in the assetstore and that the bitstore migrator should ignore it?

Regards,
Gary

Gary Browne | Technical Manager, Developments
Online Services
University of Sydney Library
THE UNIVERSITY OF SYDNEY
Level 1, Fisher Library F03, The University of Sydney NSW 2006
T +61 2 9351 5946 | M +61 405 647 868
E gary....@sydney.edu.au
Sent from my plain old desktop computer

    --

    All messages to this mailing list should adhere to the DuraSpace Code of Conduct: https://protect-au.mimecast.com/s/Z2H-C0YZWVF9pw67hw81WX?domain=duraspace.org

    ---

    You received this message because you are subscribed to a topic in the Google Groups "DSpace Technical Support" group.

    To unsubscribe from this topic, visit https://protect-au.mimecast.com/s/956gCgZowLHjDZVoToASWP?domain=groups.google.com.

    To unsubscribe from this group and all its topics, send an email to dspace-tech...@googlegroups.com.

    To view this discussion on the web visit https://protect-au.mimecast.com/s/hxXbCjZrzqHwVQpBh5OKWM?domain=groups.google.com.

   

Mark H. Wood

Jul 30, 2019, 10:20:51 AM
to dspac...@googlegroups.com
On Tue, Jul 30, 2019 at 01:58:15PM +0000, Gary Browne wrote:
> You're right that there is a record in the database for this file, but the "deleted" field value is "true". I'm assuming in this case that there should be no file in the assetstore and that the bitstore migrator should ignore it?

No, a Bitstream marked "deleted", with no corresponding file in the
assetstore, is damaged. The "deleted" flag just means that the
Bitstream record *and* assetstore file should be deleted when we run
'bin/dspace cleanup' ("cleanup: Remove deleted bitstreams from the
assetstore").

In a good conservative design, the migrator probably *should* attempt
to move a "deleted" Bitstream with its content. The Bitstream in
question here is something that should not happen. I would argue that
the migrator, when it cannot move a Bitstream, should skip over it and
continue without altering it, giving as much information as it can
about the nature of the failure.
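
The behaviour argued for above (skip a Bitstream that cannot be moved, leave it unaltered, carry on, and report as much as possible) could look roughly like the sketch below. It is written in Python for brevity and is not the DSpace implementation; `migrate_all`, `copy_one`, `MigrationReport`, and the record fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MigrationReport:
    moved: list = field(default_factory=list)
    skipped: list = field(default_factory=list)  # (internal_id, deleted, reason)


def migrate_all(bitstreams, copy_one):
    """Try to move every bitstream; never abort on a single failure.

    bitstreams: iterable of dicts with 'internal_id' and 'deleted' keys
                (hypothetical stand-ins for Bitstream records).
    copy_one:   callable that copies one bitstream's content to the
                destination store and raises OSError on failure.
    """
    report = MigrationReport()
    for bs in bitstreams:
        try:
            copy_one(bs)
        except OSError as exc:
            # Leave the record untouched and note why it could not be moved.
            report.skipped.append((bs["internal_id"], bs["deleted"], str(exc)))
            continue
        report.moved.append(bs["internal_id"])
    return report


def fake_copy(bs):
    # Stand-in for the real copy: fails when the source file is missing.
    if not bs["present"]:
        raise FileNotFoundError(f"{bs['internal_id']} (No such file or directory)")


records = [
    {"internal_id": "111", "deleted": False, "present": True},
    {"internal_id": "222", "deleted": True, "present": False},  # damaged
    {"internal_id": "333", "deleted": True, "present": True},   # still moved
]
report = migrate_all(records, fake_copy)
for internal_id, deleted, reason in report.skipped:
    print(f"skipped {internal_id} (deleted={deleted}): {reason}")
```

Note that the "deleted" flag plays no part in the decision to copy: a deleted Bitstream whose file is present is moved like any other, and only the record whose file is missing ends up in the skipped list.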

Gary Browne

Jul 30, 2019, 10:36:11 AM
to dspac...@googlegroups.com
Hi Mark,

Ah, now I get it.

Thanks very much for clearing that up.

Cheers,
Gary


Gary Browne

Jul 31, 2019, 12:45:18 AM
to DSpace Technical Support
Hi Mark,

One more question, sorry: what about metadata-only records? Do they also point to a dummy or blank file, or does the bitstore migrator not look for files for metadata-only records?

Actually that was two questions.

Thanks,
Gary

Mark H. Wood

Jul 31, 2019, 9:02:59 AM
to DSpace Technical Support
On Tue, Jul 30, 2019 at 09:45:18PM -0700, Gary Browne wrote:
> One more question sorry - what about metadata-only records? do they also
> point to a dummy or blank file, or does the bitstore-migrator not look for
> files from metadata-only records?

A metadata-only *Item* has no Bitstreams, so no Bitstore is involved.
There's nothing to migrate in this case.

Too much detail:

o an Item (what you submit) has metadata and zero or more Bitstreams.
o Item, Bitstream, and metadata are all stored in the database, not
  directly on the filesystem.
o a Bitstream has fields which specify an assetstore (a directory tree
  on some storage medium) and a path within that assetstore, which
  together locate a file in a filesystem. The file is a copy of an
  actual content file which was submitted.
o a Bitstore is code which knows how to access the files in an
  assetstore. An instance of a specific kind of Bitstore is created
  for each configured assetstore. They are indexed by number.

There's also a Bundle object which isn't relevant to your question and
would only confuse discussion. It's confusing enough when it *is*
relevant.

When a user requests one of the content files of an Item, DSpace looks
up the corresponding Bitstream and calls the indicated Bitstore to get
a connection to the assetstore file at the indicated path.

So, if an Item has no content, only metadata, there is nothing in any
assetstore which is associated with that Item.
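
The relationships described above can be modelled in a few lines. This is only a toy sketch in Python, not DSpace code: the class names mirror the concepts in this message, while `DictBitstore`, `retrieve`, and the field names `store_number` and `internal_id` are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Bitstream:
    store_number: int   # which assetstore (and so which Bitstore) to use
    internal_id: str    # locates the file within that assetstore
    deleted: bool = False


@dataclass
class Item:
    metadata: dict
    bitstreams: list = field(default_factory=list)  # zero or more Bitstreams


class DictBitstore:
    """Toy Bitstore: knows how to fetch content from one assetstore."""

    def __init__(self, files):
        self._files = files  # maps internal_id -> file content

    def get(self, internal_id):
        return self._files[internal_id]


def retrieve(bitstream, bitstores):
    """Pick the Bitstore by number, then ask it for the file content."""
    return bitstores[bitstream.store_number].get(bitstream.internal_id)


# One configured assetstore, indexed as number 0.
bitstores = {0: DictBitstore({"1351": b"content"})}

with_file = Item({"title": "Thesis"}, [Bitstream(0, "1351")])
metadata_only = Item({"title": "Citation only"})

print(retrieve(with_file.bitstreams[0], bitstores))
print(len(metadata_only.bitstreams))
```

The metadata-only Item simply has an empty list of Bitstreams, so there is nothing to look up in any Bitstore and nothing for a migration to touch.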

Gary Browne

Jul 31, 2019, 9:53:56 PM
to DSpace Technical Support
Hi Mark,

Thanks again. Yes, I was forgetting that we are talking about different levels of record (item-level versus bitstream-level).

And not too much detail - that's exactly why I come to this list, to get (and hopefully at times to provide) the lowdown.

Cheers,
Gary

Gary Browne

Aug 4, 2019, 10:49:13 PM
to DSpace Technical Support
Hi Mark,

An addendum to this conundrum: I've found that we have several hundred bitstream records that have:
- No associated file on the filesystem
- No associated bundle record

So these are "orphan" records, i.e. I cannot reliably trace the bitstream record back to an item record. I've looked into a couple of them that I can match on the (representational) filename, taken from the "name" field in the bitstream table, rather than on the disk filename in the "internal_id" field. Matching this way (only), I can see that there are two records for each of these bitstreams: one with deleted=t and one with deleted=f.

I don't have enough provenance information to know how this happened.
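
For anyone wanting to run a similar check, here is a rough Python sketch of the two tests described above. It is not a DSpace tool: `find_orphans`, `asset_path`, and the record layout are hypothetical, and the directory layout (three two-digit levels taken from the start of the internal_id) is inferred only from the path in the error message at the top of this thread.

```python
import os


def asset_path(assetstore_root, internal_id):
    """Build the assumed on-disk path: three two-digit directories, then the id."""
    return (
        f"{assetstore_root}/{internal_id[0:2]}/{internal_id[2:4]}"
        f"/{internal_id[4:6]}/{internal_id}"
    )


def find_orphans(bitstreams, bundled_ids, assetstore_root, exists=os.path.exists):
    """Return records that have neither a bundle link nor a file on disk.

    bitstreams:  iterable of dicts with 'id' and 'internal_id' keys.
    bundled_ids: set of bitstream ids that appear in some bundle.
    exists:      file-existence check, replaceable for testing.
    """
    orphans = []
    for bs in bitstreams:
        has_bundle = bs["id"] in bundled_ids
        has_file = exists(asset_path(assetstore_root, bs["internal_id"]))
        if not has_bundle and not has_file:
            orphans.append(bs)
    return orphans


root = "/srv/dspace/assetstore"
records = [
    {"id": 1, "internal_id": "135171749737414497487002785424297776835", "deleted": True},
    {"id": 2, "internal_id": "999999", "deleted": False},
]
# Pretend only the second record's file is present on disk.
on_disk = {asset_path(root, "999999")}
orphans = find_orphans(records, {2}, root, exists=on_disk.__contains__)
print([bs["id"] for bs in orphans])
```

In practice the records and the set of bundled ids would come from database queries, and the default `os.path.exists` check would be used against the real assetstore.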

Cheers,
Gary

