On Thu, Mar 23, 2023 at 01:24:16AM -0700, Agustina Martinez wrote:
> Thanks a lot for chipping in. All that you have described is exactly what
> is happening. To give some more details:
>
> - There is the single big transaction being built up, and it does not
> save until all the items in the collection have been processed.
> - My task implementation is annotated with the "@Distributive"
> annotation as I need for it to only report its status one it has iterated
> over all of its items (the task implements a custom report method that
> generates a csv file with items results and emails that to administrators)
> - I couldn't come up with a better way of doing it. Not having that
> annotation causes for the individual tasks to report their status, and
> hence send an email each.
> - In testing when / how commits happened, I was trying to get the
> individual item task to commit its changes, but as you report, there is the
> single context and after the first task commit I was getting a SQLException
> concerning the resultset being closed.
>
> For the time being, until I come up with a better approach to this bulk
> operation, the small change to the DOIConsumer (replace commit with
> uncacheEntity) has helped with the lock issue, and the long task completes
> successfully and all changes are committed to the DB as expected. Do you
> see any issue with this change?
>
> Is there any way of solving the issue with the single context? Curation
> tasks are a good framework for allowing bulk operations, but they should
> allow for changes to be committed frequently in cases where there is no
> dependency between single tasks being run. Although I can see this being
> tricky.
>
> I did try the route of creating a Script using the DSpaceRunnable classes
> which would use an iterator, but I was still facing the issue with the
> transaction lock, and this script was much more innefficient in terms of
> running times.
I made a quick and dirty tool to fix a metadata problem that we had,
which manages to get two Sessions and use them as I've suggested. I
had to ignore the relevant DAO and build my own Query, so that I could
use the second Session. Curator would have to be similarly hacked to
do it this way. I would rather see a proper solution, but for
illustration, my tool looks like this:
diff --git a/dspace/modules/additions/src/main/java/edu/iupui/ulib/dspace/util/FixThumbnailPolicies.java b/dspace/modules/additions/src/main/java/edu/iupui/ulib/dspace/util/FixThumbnailPolicies.java
new file mode 100644
index 0000000000..6e7b475238
--- /dev/null
+++ b/dspace/modules/additions/src/main/java/edu/iupui/ulib/dspace/util/FixThumbnailPolicies.java
@@ -0,0 +1,130 @@
+/*
+ * Copyright 2023 Indiana University.
+ */
+package edu.iupui.ulib.dspace.util;
+
+import java.sql.SQLException;
+import java.util.List;
+import java.util.function.Consumer;
+import java.util.stream.Stream;
+import javax.persistence.criteria.CriteriaBuilder;
+import javax.persistence.criteria.CriteriaQuery;
+import javax.persistence.criteria.Root;
+
+import org.dspace.authorize.AuthorizeException;
+import org.dspace.authorize.ResourcePolicy;
+import org.dspace.authorize.factory.AuthorizeServiceFactory;
+import org.dspace.authorize.service.ResourcePolicyService;
+import org.dspace.content.Bitstream;
+import org.dspace.content.Bundle;
+import org.dspace.content.Item;
+import org.dspace.core.Context;
+import org.dspace.core.ContextHelper;
+import org.hibernate.Session;
+import org.hibernate.query.Query;
+
+/**
+ * Find and fix thumbnail bitstreams having no policies. Copy the Bundle's
+ * policies, or the Item's policies if none.
+ *
+ * @author mwood
+ */
+public class FixThumbnailPolicies {
+ static ResourcePolicyService resourcePolicyService;
+ static Context context;
+
+ private FixThumbnailPolicies() {}
+
+ /**
+ * Check all Items' thumbnail bitstreams for missing policies.
+ *
+ * @param argv unused.
+ */
+ static public void main(String[] argv) {
+ resourcePolicyService = AuthorizeServiceFactory.getInstance()
+ .getResourcePolicyService();
+ context = new Context();
+
+ try (Session driverSession = ContextHelper.getReadOnlySession(context)) {
+ CriteriaBuilder qBuilder = driverSession.getCriteriaBuilder();
+ CriteriaQuery criteria = qBuilder.createQuery(Item.class);
+ Root<Item> root = criteria.from(Item.class);
+ criteria.select(root);
+ Query<Item> itemQuery = driverSession.createQuery(criteria);
+ try (Stream<Item> items = itemQuery.getResultStream()) {
+ items.forEachOrdered(new ItemConsumer());
+ }
+ } catch (SQLException e) {
+ // TODO do something
+ }
+ }
+
+ /**
+ * Update the policies of an Item's THUMBNAIL Bitstreams as needed.
+ */
+ static private class ItemConsumer
+ implements Consumer<Item> {
+ /**
+ * Add parent policies to THUMBNAIL Bitstreams if missing.
+ *
+ * @param item Update bitstream policies for this Item.
+ */
+ @Override
+ public void accept(Item item) {
+ for (Bundle bundle : item.getBundles("THUMBNAIL")) {
+ for (Bitstream bitstream : bundle.getBitstreams()) {
+ List<ResourcePolicy> bitstreamPolicies
+ = bitstream.getResourcePolicies();
+ if (bitstreamPolicies.isEmpty()) {
+ List<ResourcePolicy> bundlePolicies
+ = bundle.getResourcePolicies();
+ if (bundlePolicies.isEmpty()) {
+ List<ResourcePolicy> itemPolicies
+ = item.getResourcePolicies();
+ if (itemPolicies.isEmpty()) {
+ // do nothing
+ } else {
+ copyPolicies(itemPolicies, bitstream);
+ }
+ } else {
+ copyPolicies(bundlePolicies, bitstream);
+ }
+ }
+ }
+ }
+ }
+
+ /**
+ * Copy a list of policies to a Bitstream. Only copy custom and
+ * inherited policies.
+ *
+ * @param parentPolicies the policies to be copied.
+ * @param bitstream the Bitstream to receive new policies.
+ */
+ private static void copyPolicies(List<ResourcePolicy> parentPolicies,
+ Bitstream bitstream) {
+ for (ResourcePolicy parentPolicy : parentPolicies) {
+ try {
+ if (ResourcePolicy.TYPE_CUSTOM.equals(parentPolicy.getRpType())
+ || ResourcePolicy.TYPE_INHERITED.equals(parentPolicy.getRpType())) {
+ ResourcePolicy newPolicy = resourcePolicyService.clone(
+ context, parentPolicy);
+ newPolicy.setdSpaceObject(bitstream);
+ newPolicy.setRpType(ResourcePolicy.TYPE_INHERITED);
+ resourcePolicyService.update(context, newPolicy);
+ }
+ } catch (SQLException | AuthorizeException e) {
+ System.err.format("Failed to copy policy %d to Bitstream %s: %s%n",
+ parentPolicy.getID(), bitstream.getID().toString(),
+ e.getMessage());
+ }
+ }
+ try {
+ context.commit();
+ } catch (SQLException e) {
+ System.err.format("Failed to commit updates to Bitstream %s: %s%n",
+ bitstream.getID().toString(), e.getMessage());
+ }
+ }
+ }
+}
diff --git a/dspace/modules/additions/src/main/java/org/dspace/core/ContextHelper.java b/dspace/modules/additions/src/main/java/org/dspace/core/ContextHelper.java
new file mode 100644
index 0000000000..fde3df16a0
--- /dev/null
+++ b/dspace/modules/additions/src/main/java/org/dspace/core/ContextHelper.java
@@ -0,0 +1,38 @@
+/*
+ * Copyright 2023 Indiana University.
+ */
+package org.dspace.core;
+
+import java.sql.SQLException;
+import javax.persistence.FlushModeType;
+
+import org.hibernate.Session;
+
+/**
+ * Cheat methods to expose package-private DSpace {@link Context} internals
+ * without hacking Context.
+ *
+ * @author mwood
+ */
+public class ContextHelper {
+ /**
+ * Get an additional JPA {@link Session} from a DSpace {@link Context},
+ * set default-read-only.
+ * Be sure to {@code close()} the Session when finished -- Context does not
+ * track it.
+ *
+ * @param context the DSpace session to provide database context.
+ * @return a new read-only JPA {@link Session}.
+ * @throws SQLException passed through.
+ */
+ static public Session getReadOnlySession(Context context)
+ throws SQLException {
+ DBConnection<Session> connection = context.getDBConnection();
+ Session newSession = connection.getSession()
+ .getSessionFactory()
+ .openSession();
+ newSession.setDefaultReadOnly(true);
+ newSession.setFlushMode(FlushModeType.COMMIT);
+ return newSession;
+ }
+}
In hope of seeing a proper fix for the transaction lifetime conflict,
I've filed an Issue for consideration relative to DSpace 8:
https://github.com/DSpace/DSpace/issues/8731