filter-media has been running for 4 days: is this normal?


Lewatle Johannes Phaladi

Nov 4, 2022, 7:22:25 AM
to DSpace Technical Support
Dear DSpace Team,

I started ./dspace filter-media 4 days ago and it is still running. Is it normal for it to take this long? Our DSpace repository has roughly 35,000 items.

Regards,
Lewatle 

Mark H. Wood

Nov 4, 2022, 8:51:35 AM
to dspac...@googlegroups.com
On Fri, Nov 04, 2022 at 04:22:25AM -0700, Lewatle Johannes Phaladi wrote:
> I have run ./dspace filter-media for the past 4 days, it is still running,
> I am asking if this is how long it should run. our DSpace repository have
> +- 35 000 items.

That depends on many factors, but it does seem a bit long.

Is the CPU very busy? Is the machine doing much swapping? Or perhaps
the command doesn't have enough heap and is spending most of its time
garbage-collecting.
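
If heap turns out to be the problem, here is a minimal sketch of how more memory might be given to the command-line tools. It assumes the stock bin/dspace launcher, which as far as I know reads the JAVA_OPTS environment variable. The 2048m figure is only an illustration, and the diagnostic commands in the comments are the usual Linux/JDK tools, not anything DSpace-specific.

```shell
# Assumption: the bin/dspace launcher honours JAVA_OPTS (check your launcher script).
# The 2048m value is illustrative; size it to the RAM that is actually free.
export JAVA_OPTS="-Xmx2048m -Dfile.encoding=UTF-8"

# Then, in the same shell, start the job again:
#   ./dspace filter-media
#
# While it runs, the questions above can be checked from another terminal:
#   top                          # is the java process using much CPU?
#   vmstat 5                     # non-zero si/so columns suggest swapping
#   jstat -gcutil <pid> 5000     # old generation near 100% suggests too little heap
```
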

--
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu

Lewatle Johannes Phaladi

Nov 7, 2022, 8:40:26 AM
to DSpace Technical Support
Hello Mark,

I have shared your response with my colleagues to see if we can find a solution. Is there any way to run filter-media on a single community or collection?

Maybe splitting the work up that way would let the task complete.

Regards,
Lewatle 



Mark H. Wood

Nov 7, 2022, 9:21:23 AM
to dspac...@googlegroups.com
On Mon, Nov 07, 2022 at 05:40:25AM -0800, Lewatle Johannes Phaladi wrote:
> I have shared your respond with my colleagues to see if we can have
> solution. Is there anyway to run filter-media on community or collection.
>
> maybe by breaking that index can make it complete the task.

'bin/dspace filter-media -h' shows that it can be run on individual
items, but I see no option to process a single community or
collection.

You could make lists of item IDs in each collection and create
scripts for 'bin/dspace read':

filter-media -i 12345/1
filter-media -i 12345/2
...

This should be faster than running each item separately.
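
A minimal sketch of that approach, assuming you already have a plain-text file with one item handle per line. The names handles.txt, filter-media.cmds and make_command_file are made up for illustration, and how you obtain the handles for a collection is left open.

```shell
# Turn a list of item handles (one per line) into a command file
# for 'bin/dspace read'. make_command_file is a hypothetical helper.
make_command_file() {
    handles_file=$1
    while IFS= read -r handle; do
        # Skip blank lines
        [ -n "$handle" ] || continue
        printf 'filter-media -i %s\n' "$handle"
    done < "$handles_file"
}

# Example input using the two placeholder handles from above
printf '12345/1\n12345/2\n' > handles.txt
make_command_file handles.txt > filter-media.cmds
```

The resulting file would then be handed to the launcher, for example 'bin/dspace read filter-media.cmds', so that all the items are processed in one run.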

Tim Donohue

Nov 8, 2022, 10:15:33 AM
to DSpace Technical Support
Hi,

Based on my reading of the "filter-media" code, I think the "-i" flag can also be used to reference a Community or Collection.  In that scenario, all Items in that Community or Collection will be processed.


Tim

emilio lorenzo

Nov 8, 2022, 11:17:39 AM
to dspac...@googlegroups.com

At least in versions 5 and 6, the -i option works flawlessly.

Example:

dspace filter-media -f -i xxxxxxxxx/1685 -p "ImageMagick Image Thumbnail","ImageMagick PDF Thumbnail"   

where xxxxxxxxx/1685 can be a top-level community, a collection, or any other handle.

We haven't tested this in version 7 yet.

Best of luck,

Emilio


Lewatle Johannes Phaladi

Nov 9, 2022, 4:12:46 AM
to DSpace Technical Support
Hello Tim,

Thanks a lot. I will definitely follow this and run it at the community level.

Regards,
Lewatle  

Lewatle Johannes Phaladi

Nov 9, 2022, 4:14:42 AM
to DSpace Technical Support
Hello Emilio,

Much appreciated, this will help.

Regards,
Lewatle 

Lewatle Johannes Phaladi

Nov 10, 2022, 3:27:02 AM
to DSpace Technical Support

Hi DSpace Team,

I am getting the following error when running filter-media on an item or collection:

ERROR filtering, skipping bitstream:
        Item Handle: 10539332/20434665       Bundle Name: ORIGINAL   File Size: 984696       Checksum: 3bb9367338a7d40a5e76ec4e5056e8c6 (MD5)        Asset Store: 0
org.im4java.core.CommandException: convert-im6.q16: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/421.
The script has completed