Batch loads create 4 works records for each citation
From: John <fur...@gmail.com>
Date: Mon, 31 Oct 2011 08:53:00 -0700 (PDT)
Subject: Batch loads create 4 works records for each citation
Hello,

We just updated to the trunk version.  All appeared fine the last
couple of days.

Just tried a batch load of works.  Only 7 citations in Refworks XML
format.  No errors during the batch process, but the results screen
shows all works have duplicate records.  Upon further review, the dups
are there because during the add batch process each work was created 4
times.

I have been able to duplicate this bug.

Anyone seen this or have any ideas on what may be causing it?


 
From: Howard Ding <hadi...@gmail.com>
Date: Mon, 31 Oct 2011 10:54:21 -0500
Local: Mon, Oct 31 2011 11:54 am
Subject: Re: [bibapp-dev] Batch loads create 4 works records for each citation
If you send me the information to duplicate it or put it in an issue on
Github I'll take a look at it.

Howard

On 10/31/2011 10:53 AM, John wrote:


 
From: John <fur...@gmail.com>
Date: Mon, 31 Oct 2011 08:59:23 -0700 (PDT)
Local: Mon, Oct 31 2011 11:59 am
Subject: Re: [bibapp-dev] Batch loads create 4 works records for each citation
Thanks, just reposted to Github.


 
From: Claudia Jürgen <Claudia.Juer...@ub.tu-dortmund.de>
Date: Thu, 03 Nov 2011 17:37:19 +0100
Local: Thurs, Nov 3 2011 12:37 pm
Subject: Re: [bibapp-dev] Batch loads create 4 works records for each citation
Hi,

I reproduced this. It seems the delayed job is processed 4 times, failing 3
times with
AASM::InvalidTransition: Event 'review' cannot transition from
'reviewable' - 1 failed attempts

Each time, though, the works in the job are processed. The first time the
works get state 3; the duplicates created afterwards get state 2. You can
see the works_added column changing in the imports table.

Claudia

On 31.10.2011 16:59, John wrote:

--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/

 
From: Howard Ding <hadi...@gmail.com>
Date: Thu, 03 Nov 2011 17:24:45 -0500
Local: Thurs, Nov 3 2011 6:24 pm
Subject: Re: [bibapp-dev] Batch loads create 4 works records for each citation
Hi,

I've not had time to look at this, but thanks to both of you for putting
the information up on Github and to you Claudia for proposing a
different workflow to go along with this. I hope to get to it soon, but
I've been spending more time on another project recently.

Howard

On 11/3/2011 11:37 AM, Claudia Jürgen wrote:


 
From: Claudia Jürgen <Claudia.Juer...@ub.tu-dortmund.de>
Date: Fri, 04 Nov 2011 12:47:48 +0100
Local: Fri, Nov 4 2011 7:47 am
Subject: Re: [bibapp-dev] Batch loads create 4 works records for each citation
Hello,

One other thing: no import review notification email is sent, although
the state changes from processing to reviewable. This was all tested on
the i18n branch. The email settings are configured properly and the
system is able to send mail on user registration etc.

Claudia

On 03.11.2011 23:24, Howard Ding wrote:

--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/

 
From: John <fur...@gmail.com>
Date: Mon, 7 Nov 2011 06:51:30 -0800 (PST)
Local: Mon, Nov 7 2011 9:51 am
Subject: Re: Batch loads create 4 works records for each citation
Thanks Howard.  Do you have a rough idea of when this might get
resolved?

-John

On Nov 3, 6:24 pm, Howard Ding <hadi...@gmail.com> wrote:


 
From: Howard Ding <hadi...@gmail.com>
Date: Mon, 07 Nov 2011 10:06:12 -0600
Local: Mon, Nov 7 2011 11:06 am
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
I should get to it sometime this week.

Howard

On 11/7/2011 8:51 AM, John wrote:


 
From: Howard Ding <hadi...@gmail.com>
Date: Mon, 07 Nov 2011 11:13:03 -0600
Local: Mon, Nov 7 2011 12:13 pm
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Hi John,

Would it be possible for you to send me the file that was causing
duplicates to be generated? I've been trying to duplicate the bug with a
small (one record) file of my own creation, but I'm not seeing it.

Thanks,
Howard


 
From: John <fur...@gmail.com>
Date: Mon, 7 Nov 2011 13:05:01 -0800 (PST)
Local: Mon, Nov 7 2011 4:05 pm
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Just emailed it.


 
From: hading2 <hadi...@gmail.com>
Date: Mon, 7 Nov 2011 13:29:33 -0800 (PST)
Local: Mon, Nov 7 2011 4:29 pm
Subject: Re: Batch loads create 4 works records for each citation
Thanks.

What you sent me imports fine on my development machine. There are 33
works, it shows 32 new works and one duplicate in the accept/reject
view. There is a duplicate in the file, so this is fine.

I go ahead and accept it and then look at the resulting import and it
is as expected still, 32 accepted works and one duplicate.

I do get a lot of orphans, but that's to be expected as I don't have
the relevant people to match these up.

Given what Claudia has uncovered I think there must be something
subtle going on here, but I'm not sure what. I'm a little bit
suspicious about delayed job, though.

Are you able to reproduce the problem on your system at will? If so I
wonder if you might:

 a) reproduce it
 b) do a 'bundle exec rake bibapp:restart' to restart delayed job
 c) confirm that you can reproduce it again

I'll keep thinking about it and look at Claudia's comments more
closely as well. Unfortunately it's going to be hard to do anything if
I can't reproduce it.

This is probably irrelevant, but what database system are you using?

Thanks,
Howard

On Nov 7, 3:05 pm, John <fur...@gmail.com> wrote:


 
From: hading2 <hadi...@gmail.com>
Date: Mon, 7 Nov 2011 13:39:13 -0800 (PST)
Local: Mon, Nov 7 2011 4:39 pm
Subject: Re: Batch loads create 4 works records for each citation
Just to record it here, this is how it appears the import is supposed
to work:

1. User goes to import/new view and uploads a file
2. create action in controller creates a new import, attaches the
uploaded file to it, and saves it. This triggers an after_save
callback


 
From: hading2 <hadi...@gmail.com>
Date: Mon, 7 Nov 2011 13:43:49 -0800 (PST)
Local: Mon, Nov 7 2011 4:43 pm
Subject: Re: Batch loads create 4 works records for each citation
Sorry, accidentally sent that before it was complete.

Just to record it here, this is how it appears the import is supposed
to work:

1. User goes to import/new view and uploads a file.
2. The create action in the controller creates a new import, attaches the
uploaded file to it, and saves it. This triggers an after_save callback
on the import.
3. The after_save callback transitions from recieved (sic) to processing
and runs queue_import, which just runs batch_import via delayed job.
4. batch_import parses the import file, creates works from it, saves them,
and transitions from processing to reviewable, which runs notify_user,
which should email the user that the batch is ready to review.
5. User goes to the show view and accepts/rejects the batch.
6a. On reject there is a transition to the rejected state and all works
are destroyed (via delayed job).
6b. On accept there is a transition to the accepted state and
process_accepted_import is run via delayed job. This creates
contributorships, auto-verifies if the imports were for a specific
person, indexes, etc.
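
The state changes in the steps above can be sketched as a tiny state machine. This is a hand-rolled illustration in plain Ruby, not the aasm gem's API and not the actual Import model; ImportSketch, fire and the event names other than 'review' are made up for the example. The state names follow the thread, including the 'recieved' spelling.

```ruby
# Hand-rolled stand-in for the Import states described in this thread.
# NOT the aasm API; class, method and most event names are illustrative.
class ImportSketch
  class InvalidTransition < StandardError; end

  # event => [required current state, resulting state]
  TRANSITIONS = {
    process: [:recieved,   :processing],  # state spelled as in the thread
    review:  [:processing, :reviewable],
    accept:  [:reviewable, :accepted],
    reject:  [:reviewable, :rejected]
  }.freeze

  attr_reader :state

  def initialize
    @state = :recieved
  end

  # Apply an event, refusing it if the import is not in the expected state.
  def fire(event)
    from, to = TRANSITIONS.fetch(event)
    unless @state == from
      raise InvalidTransition, "Event '#{event}' cannot transition from '#{@state}'"
    end
    @state = to
  end
end

import = ImportSketch.new
import.fire(:process)  # step 3: after_save callback
import.fire(:review)   # step 4: batch_import has finished
import.fire(:accept)   # step 6b: user accepts the batch
puts import.state      # accepted
```

Firing :review a second time on an import that is already reviewable raises the same kind of message Claudia saw in last_error.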

Howard


 
From: Howard Ding <hadi...@gmail.com>
Date: Mon, 07 Nov 2011 16:08:18 -0600
Local: Mon, Nov 7 2011 5:08 pm
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Another longshot, but what version of the aasm gem are you using?

bundle show aasm

I don't have any version information for this in the Gemfile, but I do
recall that I had to rewrite some of the Import model code as a result
of going to a new version of aasm. I'm currently on 2.3.1. Based on the
history of aasm it was probably a 2.1 (or prior) => 2.2 problem.

Regardless, I'll be more specific in the Gemfile.

Thanks,
Howard


 
From: Claudia Jürgen <Claudia.Juer...@ub.tu-dortmund.de>
Date: Tue, 08 Nov 2011 13:30:26 +0100
Local: Tues, Nov 8 2011 7:30 am
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Hello Howard,

the aasm gem used on the i18n instance with the same behaviour is 2.3.1.

Claudia

On 07.11.2011 23:08, Howard Ding wrote:

> Another longshot, but what version of the aasm gem are you using?

> bundle show aasm

> I don't have any version information for this in the Gemfile, but I do
> recall that I had to rewrite some of the Import model code as a result
> of going to a new version of aasm. I'm currently on 2.3.1. Based on the
> history of aasm it was probably a 2.1 (or prior) => 2.2 problem.

> Regardless, I'll be more specific in the Gemfile.

> Thanks,
> Howard

--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/

 
From: Claudia Jürgen <Claudia.Juer...@ub.tu-dortmund.de>
Date: Tue, 08 Nov 2011 14:26:02 +0100
Local: Tues, Nov 8 2011 8:26 am
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Hi Howard,

I think it's step 4 which causes the trouble.
The 'last_error' from the table delayed_jobs is always the same:
Event 'review' cannot transition from 'reviewable'
...

One other question, which might be related to the issue or might just be me
being new to Rails, Ruby etc.

The email notifications are not sent.

Is this due to the fact that I'm running in development mode, as
config/environments/development.rb indicates?
Other emails (registration, forgotten password etc.) are sent, though.

Claudia

On 07.11.2011 22:43, hading2 wrote:

--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/

 
From: John <fur...@gmail.com>
Date: Tue, 8 Nov 2011 06:53:21 -0800 (PST)
Local: Tues, Nov 8 2011 9:53 am
Subject: Re: Batch loads create 4 works records for each citation
Hi Howard,

aasm gem is at 2.3.1

I restarted as you said above and still getting the same result during
the batch import.

Also, just noticed that when I choose to reject the batch, the works
are not destroyed.  They remain in the database.

Database system is mysql

On Nov 7, 5:08 pm, Howard Ding <hadi...@gmail.com> wrote:


 
From: John <fur...@gmail.com>
Date: Tue, 8 Nov 2011 07:05:36 -0800 (PST)
Local: Tues, Nov 8 2011 10:05 am
Subject: Re: Batch loads create 4 works records for each citation
Scratch that.  Records are being destroyed, just in small chunks and
very slowly.

On Nov 8, 9:53 am, John <fur...@gmail.com> wrote:


 
Discussion subject changed to "Deduplication and Delayed Jobs Configuration was [Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation]" by Claudia Jürgen
From: Claudia Jürgen <Claudia.Juer...@ub.tu-dortmund.de>
Date: Tue, 08 Nov 2011 17:19:26 +0100
Local: Tues, Nov 8 2011 11:19 am
Subject: Deduplication and Delayed Jobs Configuration was [Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation]
Hi Howard,

one other question about step 4. When the work is saved, isn't the
deduplication run? This sets the state to either accepted or duplicate.
There is no option for in process at this point.

And a question about the delayed_jobs. There is no configuration like
config/initializers/delayed_job.rb

So the default settings are taken and if I got it right a job is tried
25 times.

Creating
config/initializers/delayed_job.rb
with just the line
Delayed::Worker.max_attempts = 1

stopped the perpetual creation of duplicate works during import.

Nonetheless the work created during import still got STATE_ACCEPTED
instead of STATE_IN_PROCESS.

Claudia

On 07.11.2011 22:43, hading2 wrote:

--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/

 
Discussion subject changed to "Batch loads create 4 works records for each citation" by Howard Ding
From: Howard Ding <hadi...@gmail.com>
Date: Tue, 08 Nov 2011 11:53:34 -0600
Local: Tues, Nov 8 2011 12:53 pm
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
On 11/8/2011 9:05 AM, John wrote:
> Scratch that.  Records are being destroyed, just in small chunks and
> very slowly.

At present on rejection the works are destroyed via delayed_job, so this
(at least) is as expected.

Howard


 
From: Howard Ding <hadi...@gmail.com>
Date: Tue, 08 Nov 2011 15:32:24 -0600
Local: Tues, Nov 8 2011 4:32 pm
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
I still don't have a fix for this, although I do have some thoughts.

Based on the delayed_job errors that Claudia reported it looks like
somehow batch_import in the Import model is getting called more than
once on an Import. batch_import is supposed to take an import in the
processing state, create works from it, and then save it and transition
it to the reviewable state. The error that she reports seems to indicate
that it is finding an import already in the reviewable state and hence
throwing an error. However, the attempt to transition occurs at the end
of the method, so the work creation happens regardless, which could
explain the creation of multiples.

A dirty fix might be to a) have this method check that the import it is
working on is in the right state and/or b) have it work inside a
transaction so that the extra works get thrown away if there is a
problem. I don't consider this ideal because it still doesn't give me
any idea _why_ batch_import is getting called more than once. I
speculate that this could be because an error occurs and
delayed_job retries the job, but things aren't getting rolled back
correctly - but this explanation doesn't really help to understand why
it eventually succeeds instead of failing eternally.
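
For illustration, a minimal sketch of what (a) and (b) could look like. It is plain Ruby with an in-memory stand-in for the database, not BibApp's Import model and not the code that was later pushed. FakeDb, GuardedImport and the notify hook are hypothetical names; in a Rails app the all-or-nothing part would normally be an ActiveRecord transaction.

```ruby
# Illustrative stand-in only: an in-memory "database" with a crude transaction.
class FakeDb
  attr_reader :works

  def initialize
    @works = []
  end

  # Restore the previous contents if the block raises, then re-raise.
  def transaction
    snapshot = @works.dup
    yield
  rescue StandardError
    @works = snapshot
    raise
  end
end

class GuardedImport
  attr_reader :state

  def initialize(db, citations)
    @db = db
    @citations = citations
    @state = :processing
  end

  # notify is a hypothetical hook for whatever runs after the transition.
  def batch_import(notify: -> {})
    return :skipped unless @state == :processing  # (a) state guard

    @db.transaction do                             # (b) all-or-nothing
      @citations.each { |c| @db.works << c }
      @state = :reviewable
      notify.call
    end
    :imported
  rescue StandardError
    @state = :processing  # mimic the state column being rolled back too
    raise
  end
end

db = FakeDb.new
import = GuardedImport.new(db, %w[a b c])
begin
  import.batch_import(notify: -> { raise "mailer failed" })
rescue RuntimeError
  # first attempt failed; the works it created were thrown away
end
import.batch_import   # a retry starts from a clean slate and succeeds
import.batch_import   # any further call is skipped by the guard
puts db.works.size    # 3
```

As noted above, this only contains the damage; it does not explain why the method runs more than once.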

I'm still thinking about it, but I may implement my dirty fix, as I
don't see any harm in it regardless. If I do that I'll write again when
I push it and you both can let me know if it fixes things or at least
changes what goes wrong. I still haven't managed to duplicate the issue
here.

Howard


 
From: Howard Ding <hadi...@gmail.com>
Date: Tue, 08 Nov 2011 16:11:30 -0600
Local: Tues, Nov 8 2011 5:11 pm
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Hi,

I've pushed a possible dirty fix of the nature I described to master. If
John and/or Claudia (it's not in i18n yet, though) would be willing to
give this a try and let me know if it fixes or changes anything I'd
appreciate it.

Note that since it affects delayed_job tasks in addition to restarting
your rails server you also need to 'bundle exec rake bibapp:restart' so
that delayed_job will run with the new code.

Thanks,
Howard


 
From: Claudia Jürgen <Claudia.Juer...@ub.tu-dortmund.de>
Date: Wed, 09 Nov 2011 08:41:41 +0100
Local: Wed, Nov 9 2011 2:41 am
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Hi all,

As for the why: it is the delayed_job being retried. Setting the maximum
attempts to 1 via a config/initializers/delayed_job.rb containing
Delayed::Worker.max_attempts = 1
will result in only 1 work being created. Still, the transition seems to
fail, as notify_user is not reached.

As you can't reproduce the error, it might be the environment. I'm working
on Suse Linux 10 with Postgres 8.4.

There were other problems e.g. starting Solr, which were solved by
replacing localhost with 127.0.0.1 in
app/models/publisher.rb
config/initializers/solr.rb
lib/tasks/solr.rake

Claudia

On 08.11.2011 22:32, Howard Ding wrote:

--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/

 
From: Claudia Jürgen <Claudia.Juer...@ub.tu-dortmund.de>
Date: Wed, 09 Nov 2011 09:26:40 +0100
Local: Wed, Nov 9 2011 3:26 am
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
A bit more information:

The transition error only occurs on the second attempt. With delayed job
set to 1 attempt, the last_error in the delayed_jobs table is

  {No route matches {:controller=>"imports", :action=>"show",
:user_id=>34, :locale=>#<User id: 2, email: "claudia.juer...@udo.edu",
crypted_password: "0909b8a03
acc14caf0ab333b07557f9d8d2e3ed0", salt: "KqFX4JzsdFWcvdJHTLh",
created_at: "2011-10-25 12:41:16", updated_at: "2011-10-25 12:41:37",
remember_token: nil, remember_token_expires_a
t: nil, activation_code: nil, activated_at: "2011-10-25 12:41:37",
persistence_token:
"9f76012b609391560adef7c3ef1c46558b6ba33730bd5e9be86...">}
...

Claudia

On 09.11.2011 08:41, Claudia Jürgen wrote:

--
Claudia Juergen
Universitaetsbibliothek Dortmund
Eldorado
0231/755-4043
https://eldorado.tu-dortmund.de/

 
From: Howard Ding <hadi...@gmail.com>
Date: Wed, 09 Nov 2011 10:23:21 -0600
Local: Wed, Nov 9 2011 11:23 am
Subject: Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation
Well that might actually be a bit of a clue.

So perhaps the first time through it is actually making it most of the
way through, including creating the work and transitioning the state of
the Import, but then errors out, leaving the other attempts to fail on
attempting the transition.

My change from yesterday might partially fix this, or not, depending on
where the error below is happening. I'm not sure where it's coming from,
but I will look at it today. If it does partially fix it what should
happen now is that the delayed job will still fail, I think, but all the
db state will be rolled back, so it would keep retrying from the same
place.

I think it's possible that the error below comes from the notifier - it
may be trying to generate the URL for the user to get back to review the
import, but the parameters are obviously wrong here. I'm not sure why
the master branch would fail in this place, though.
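
To make that hypothesis concrete, here is a toy model of it in plain Ruby. It is not BibApp code: UnguardedImport and run_with_retries are invented for the sketch, the notifier failure is faked with a plain raise, and the number of attempts is just a parameter (4 and 7 are the counts reported earlier in the thread). It does not explain why the real job stops retrying.

```ruby
# Toy model of the hypothesised failure; none of this is BibApp code.
class UnguardedImport
  class InvalidTransition < StandardError; end

  attr_reader :state, :works

  def initialize(citations)
    @citations = citations
    @works = []
    @state = :processing
  end

  def batch_import
    @works.concat(@citations)  # works are saved first ...
    review!                    # ... the transition comes last
    raise "No route matches"   # stand-in for the notifier failing
  end

  private

  def review!
    unless @state == :processing
      raise InvalidTransition, "Event 'review' cannot transition from '#{@state}'"
    end
    @state = :reviewable
  end
end

# Minimal retry loop; max_attempts is a parameter of this sketch only.
def run_with_retries(import, max_attempts)
  errors = []
  max_attempts.times do
    begin
      import.batch_import
      break
    rescue StandardError => e
      errors << e.message
    end
  end
  errors
end

import = UnguardedImport.new(%w[c1 c2 c3 c4 c5 c6 c7])
errors = run_with_retries(import, 4)
puts import.works.size  # 28
puts errors.uniq
```

In this model the first attempt fails in the notifier after the state has already changed, and every retry adds another copy of the works before failing on the transition.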

Ah well, I'll give it a look soon.

Howard

On 11/9/2011 2:26 AM, Claudia Jürgen wrote:


 