ingesting HTML archives


John Rodriquez (praaks)

Jan 11, 2023, 8:13:06 AM
to DSpace Technical Support
Hi,

The documentation on how to handle HTML archives that contain relative links hasn't been updated for DSpace 7 (https://wiki.lyrasis.org/display/DSDOC7x/Ingesting+HTML+Archives). In DSpace 7 (I'm running 7.4), I see that bitstreams now have UUIDs instead of sequence IDs in their URLs. This seems to break the relative links in the HTML file: links that worked in DSpace 6 no longer work in DSpace 7. The relative link is converted to a link containing the main HTML file's UUID. For example:

The main HTML file link in the item record: 
http://dspace.xxxx.ee:8080/server/api/core/bitstreams/412b6597-1b7a-4edf-8ca1-1effd3185276/content

The relative link in the main HTML file for a "goto.html" becomes:
http://dspace.xxxx.ee:8080/server/api/core/bitstreams/412b6597-1b7a-4edf-8ca1-1effd3185276/goto.html   ... resulting in a 404 error ...

... though of course, this second file has its own UUID, and its direct address is:
http://testdspace.tktk.ee:8080/server/api/core/bitstreams/fe3299dc-327e-429f-9ec8-8652e84af8f3/content
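
The behaviour above matches standard relative URL resolution, where the last path segment of the page URL ("content") is replaced by the relative path. A minimal sketch that reproduces it with Python's standard library, using the main file URL quoted above (the variable names are only illustrative):

```python
from urllib.parse import urljoin

# URL of the main HTML bitstream, copied from the example above.
main_html = (
    "http://dspace.xxxx.ee:8080/server/api/core/bitstreams/"
    "412b6597-1b7a-4edf-8ca1-1effd3185276/content"
)

# A relative href is resolved against the URL of the page that contains it:
# everything after the last "/" in the base URL is dropped and the relative
# path is appended in its place.
resolved = urljoin(main_html, "goto.html")
print(resolved)
```

This prints the same ".../412b6597-1b7a-4edf-8ca1-1effd3185276/goto.html" address that returns the 404, so the main file's UUID ends up in the link even though "goto.html" is stored under a different UUID.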

Is there a way to get around this so relative links aren't broken? Maybe I've misunderstood something? Maybe it's a configuration issue?

Regards,

John Rodriquez
Educational Technologist
Tallinna Tehnikakõrgkool, UAS




Tim Donohue

Jan 13, 2023, 1:29:18 PM
to John Rodriquez (praaks), DSpace Technical Support
Hi John,

It appears that this is not yet supported in DSpace 7. I'm not seeing a way to get this to work either, because the structure of Bitstream URLs in DSpace 7 has changed from the structure in DSpace 6.

I've created a bug ticket for this (https://github.com/DSpace/DSpace/issues/8635). I'm not sure offhand how to resolve this, as I think it needs more detailed analysis and prioritization.

Tim


John Rodriquez (praaks)

Jan 16, 2023, 1:53:16 AM
to DSpace Technical Support
Thanks for the reply. I'll keep an eye on the issue. Maybe something will be worked out in the next version.

Regards,
John
