[Dspace-tech] RE: [Dspace-general] DSpace assetstore


Tansley, Robert

Aug 24, 2015, 3:35:00 PM
to Grant Johnson, DSpace General, dspac...@lists.sourceforge.net
Hi Grant,

For future reference, the 'dspace-tech' list
(https://sourceforge.net/mail/?group_id=19984) is probably the best
forum for mails of a technical nature like this.

> I have withdrawn (and expunged) ALL of my test communities and
> collections.
> Organizing the archivists to "build" their space.
> However, "old" documents, images, etc. still exist in the
> ../dspace/assetstore.

When a bitstream is 'deleted' in DSpace, it's actually just tagged
'deleted'; the original file isn't immediately removed from the file
system. This allows 'rollback' functionality and is intended to help
prevent some nasty mistakes being made!

In order to actually delete files, the 'cleanup' script (by default,
/dspace/bin/cleanup) should be run. This should clear out the 'deleted'
files. If desired, this can be configured to run automatically every
night, weekend, month etc. as a 'cron' job on Unix/Linux or a Scheduled
Task on Windows XP.
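
A minimal sketch of a wrapper that a cron job or Scheduled Task could call, assuming Python is available on the server. The function name run_cleanup is hypothetical, and /dspace/bin/cleanup is only the default path mentioned above; adjust it to your installation.

```python
import subprocess
import sys


def run_cleanup(command):
    """Run the given cleanup command and return its exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.stdout:
        # Pass the script's own output through so the scheduler can log it.
        print(result.stdout, end="")
    if result.returncode != 0:
        print(f"cleanup exited with {result.returncode}: {result.stderr}",
              file=sys.stderr)
    return result.returncode


# Example call (assumes the default script location from the message above):
# sys.exit(run_cleanup(["/dspace/bin/cleanup"]))
```

Returning the exit code lets the scheduler notice a failed run instead of silently leaving 'deleted' files in place.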

Robert TANSLEY / HP Labs / MIT Visiting Researcher
http://www.hpl.hp.com/personal/Robert_Tansley/

Grant Johnson

Aug 24, 2015, 3:35:01 PM
to Tansley, Robert, dspac...@lists.sourceforge.net
Thanks Robert,

Would the proper "Delete" process be to
1) Withdraw, find the item (by its item number) and expunge,
then run ./cleanup?
or
2) Just withdraw and run ./cleanup?

I have been doing #1 and then performing "maintenance" on the DSpace
Database itself in PGAdmin.

Tansley, Robert wrote:

>In order to actually delete files, the 'cleanup' script (by default,
>/dspace/bin/cleanup) should be run. This should clear out the 'deleted'
>files. If desired, this can be configured to run automatically every
>night, weekend, month etc. as a 'cron' job on Unix/Linux or a Scheduled
>Task on Windows XP.
>
-

F. Grant Johnson
566-0630 / fgjo...@upei.ca

Systems/Web Coordinator
RM 285 - Robertson Library
University of Prince Edward Island

***************
Attitude is IT!


Tansley, Robert

Aug 24, 2015, 3:35:03 PM
to Grant Johnson, dspac...@lists.sourceforge.net
> Would the proper "Delete" process be to
> 1) Withdraw, find the item (by its item number) and expunge,
> then run ./cleanup?
> or
> 2) Just withdraw and run ./cleanup?

Withdrawing doesn't actually delete the bitstreams (even by tagging).
All 'withdraw' is intended to do is remove the item from public display,
while keeping the content and metadata 'in' and managed by the
repository.

Expunging is a distinct and separate operation, which completely blows
away the item, no need to 'withdraw' first.

> I have been doing #1 and then performing "maintenance" on the DSpace
> Database itself in PGAdmin.

Probably just 'expunge' then dspace/bin/cleanup is what you need.

If you manually deleted any rows from the 'bitstream' table for
bitstreams that were tagged 'deleted', the 'cleanup' tool won't know to
clean them up. It's possible to create a script to hunt the 'orphaned'
files in the asset store directory, but one isn't included in the DSpace
distribution. Someone may have already created such a script.
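
A minimal sketch of such a script, not something shipped with DSpace. It assumes each file in the asset store is named after an identifier recorded in the 'bitstream' table, and that you have already exported those identifiers into a set; both are assumptions to verify against your own installation. The name find_orphans is hypothetical. It only lists candidates and deletes nothing.

```python
import os


def find_orphans(assetstore_dir, known_ids):
    """Return sorted paths of files whose names are not in known_ids."""
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(assetstore_dir):
        for name in filenames:
            # Assumption: the file name is the identifier stored in the database.
            if name not in known_ids:
                orphans.append(os.path.join(dirpath, name))
    return sorted(orphans)


# Example call (the ids would come from your own database export):
# for path in find_orphans("/dspace/assetstore", known_ids):
#     print(path)
```

Reviewing the printed list by hand before removing anything keeps a bad export from turning into lost files.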

Rob

Grant Johnson

Aug 24, 2015, 3:35:12 PM
to Tansley, Robert, dspac...@lists.sourceforge.net
Thank you Robert!
Your presence on this list is great.

I'll be adding this to my FAQ list!

Expunge in the GUI, then ./cleanup to clear the tables and clear it out of
the datastore.
Perfect!

Tansley, Robert wrote:

> Withdrawing doesn't actually delete the bitstreams (even by tagging).
> All 'withdraw' is intended to do is remove the item from public display,
> while keeping the content and metadata 'in' and managed by the
> repository.

Right - so we can re-instate it if required - as discussed.

> If you manually deleted any rows from the 'bitstream' table for
> bitstreams that were tagged 'deleted', the 'cleanup' tool won't know to
> clean them up.

By tagged "deleted", do you mean a withdrawn submission, or a "removed"
bitstream?

I don't have too many "leftovers" at this point so I should be able to
do it manually for the few I have, I think! Any other tables that should
be cleaned up other than the bitstream table and the collection2bitstream?

Grant Johnson

Aug 24, 2015, 3:36:15 PM
to dspac...@lists.sourceforge.net
How about it? Has anyone created a script?

> It's possible to create a script to hunt the 'orphaned'
> files in the asset store directory, but one isn't included in the DSpace
> distribution. Someone may have already created such a script.

> Rob

--

Grant Johnson

Aug 24, 2015, 3:39:05 PM
to dspac...@lists.sourceforge.net
Hi all,
Still having issues "removing" bitstreams from the assetstore.

Tried two things:

1) Went to the Item,
went to edit,
"Removed" the bitstream,
then expunged the item ID containing the bitstream,
ran ./cleanup on the server.
- The folder structure and file still exist in the assetstore.
(Although it's not in the postgres tables anymore)

2) Went to a different Item,
went to edit,
expunged the item ID containing the bitstream, (didn't remove the
bitstream first)
ran ./cleanup
- The folder structure and file still exist in the assetstore.
(Although it's not in the postgres tables anymore)

Nothing I do seems to actually "remove" the items submitted.
These are "tests" and I'd like to get the space back as they are quite
large.

Where do I look to try and discern what's going on?
Does this work from the collections level only?

Thanks in advance.


Tansley, Robert wrote:
>> I have withdrawn (and expunged) ALL of my test communities and
>> collections.
>> Organizing the archivists to "build" their space.
>> However, "old" documents, images, etc. still exist in the
>> ../dspace/assetstore.
> In order to actually delete files, the 'cleanup' script (by default,
> /dspace/bin/cleanup) should be run. This should clear out the 'deleted'
> files. If desired, this can be configured to run automatically every
> night, weekend, month etc. as a 'cron' job on Unix/Linux or a Scheduled
> Task on Windows XP.

Grant Johnson

Aug 24, 2015, 3:39:05 PM
to dspac...@lists.sourceforge.net
OK - The "root" of assetstore is owned by dspace.
- All folders and sub-folders (the numbers) are owned by "root".
- Correct me if I'm wrong - since only the owner can modify these,
"dspace" can't!

Is this what's going on?

Should I chmod all files and folders to allow everyone to modify them?
Or should I chown them so they are owned by dspace?
"New" Communities and submissions created in the assetstore should be
owned by dspace. Right?
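
One way to check the ownership theory before changing anything is to list what is not owned by the expected user. This is a minimal sketch; the name not_owned_by is hypothetical, and whether a non-owner can actually delete a file also depends on the permission bits of its parent directory, which this does not inspect.

```python
import os


def not_owned_by(root, uid):
    """Return sorted paths under root (root included) not owned by uid."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself plus every file directly inside it.
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return sorted(mismatched)


# Example call on Unix (assumes the account is named "dspace", as above):
# import pwd
# print(not_owned_by("/dspace/assetstore", pwd.getpwnam("dspace").pw_uid))
```

An empty result would rule ownership out as the reason the files survive cleanup.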

Thanks

Katy Earl

Sep 10, 2025, 7:37:12 PM
to DSpace Technical Support
Did anyone find a solution to that? I'm on DSpace 9.x and I am also finding that expunge and dspace cleanup do not blow away contents in the assetstore as I thought they would. There are still many, many bitstreams in there.

I had thought that expunge plus running the command line dspace cleanup, plus forcing a reindex would eliminate everything in the assetstore, but it doesn't. What is the expected functionality here? What is the proper way to completely clean out bitstreams that should be gone?

Katy