We just updated to the trunk version. All appeared fine the last
couple of days.
Just tried a batch load of works. Only 7 citations in Refworks XML
format. No errors during the batch process, but the results screen
shows all works have duplicate records. Upon further review, the dups
are there because during the add batch process each work was created 4
times.
I have been able to duplicate this bug.
Has anyone seen this or have any ideas on what may be causing it?
Howard Ding wrote:
> If you send me the information to duplicate it or put it in an issue on
> Github I'll take a look at it.
> Howard
> On 10/31/2011 10:53 AM, John wrote:
> > Hello,
> > We just updated to the trunk version. All appeared fine the last
> > couple of days.
> > Just tried a batch load of works. Only 7 citations in Refworks XML
> > format. No errors during the batch process, but the results screen
> > shows all works have duplicate records. Upon further review, the dups
> > are there because during the add batch process each work was created 4
> > times.
> > I have been able to duplicate this bug.
> > Anyone seen this or have any ideas on what may be causing it?
I did reproduce this. It seems the delayed job is processed 4 times, failing 3 times with AASM::InvalidTransition: Event 'review' cannot transition from 'reviewable' - 1 failed attempts
The works in the job are nevertheless processed each time. The first time they get state 3; the duplicates that follow get state 2. You can see the works_added column changing in the imports table.
I've not had time to look at this, but thanks to both of you for putting the information up on Github and to you Claudia for proposing a different workflow to go along with this. I hope to get to it soon, but I've been spending more time on another project recently.
> On 31.10.2011 16:59, John wrote:
>> Thanks, just reposted to Github.
One other thing: no import review notification email is sent, although the state changes from processing to reviewable. This was all tested on the i18n branch. The email settings are configured properly and the system is able to send mail on user registration etc.
> On 11/3/2011 11:37 AM, Claudia Jürgen wrote:
> > Hi,
> > did reproduce this, seems the delayed job is processed 4 times failing
> > 3 times with
> > AASM::InvalidTransition: Event 'review' cannot transition from
> > 'reviewable' - 1 failed attempts
> > Whereas each time the works in the job are processed. The first time
> > they got the state 3 the following duplicates 2. You can see the
> > changing works_added column in the table imports.
> > Claudia
> Thanks Howard. Do you have a rough idea of when this might get resolved?
> -John
> On Nov 3, 6:24 pm, Howard Ding <hadi...@gmail.com> wrote:
>> Hi,
>> I've not had time to look at this, but thanks to both of you for putting the information up on Github and to you Claudia for proposing a different workflow to go along with this. I hope to get to it soon, but I've been spending more time on another project recently.
>> Howard
Would it be possible for you to send me the file that was causing duplicates to be generated? I've been trying to duplicate the bug with a small (one record) file of my own creation, but I'm not seeing it.
What you sent me imports fine on my development machine. There are 33
works, it shows 32 new works and one duplicate in the accept/reject
view. There is a duplicate in the file, so this is fine.
I go ahead and accept it and then look at the resulting import and it
is as expected still, 32 accepted works and one duplicate.
I do get a lot of orphans, but that's to be expected as I don't have
the relevant people to match these up.
Given what Claudia has uncovered I think there must be something
subtle going on here, but I'm not sure what. I'm a little bit
suspicious about delayed job, though.
Are you able to reproduce the problem on your system at will? If so I
wonder if you might:
a) reproduce it
b) do a 'bundle exec rake bibapp:restart' to restart delayed job
c) confirm that you can reproduce it again
I'll keep thinking about it and look at Claudia's comments more
closely as well. Unfortunately it's going to be hard to do anything if
I can't reproduce it.
This is probably irrelevant, but what database system are you using?
Just to record it here, this is how it appears the import is supposed
to work:
1. User goes to import/new view and uploads a file
2. create action in controller creates a new import, attaches the
uploaded file to it, and saves it. This triggers an after_save
callback
Sorry, accidentally sent that before it was complete.
Just to record it here, this is how it appears the import is
supposed to work:
1. User goes to import/new view and uploads a file
2. create action in controller creates a new import, attaches the
uploaded file to it, and saves it. This triggers an after_save
callback on the import.
3. After save callback transitions from recieved (sic) to processing
and runs queue_import, which just runs batch_import via delayed job.
4. batch_import parses import file, creates works from it, saves them,
transitions from processing to reviewable, which runs notify_user,
which should email the user that the batch is ready to review.
5. User goes to show view and accepts/rejects batch
6a. On reject there is a transition to rejected state and all works
are destroyed (via delayed job)
6b. On accept there is a transition to accepted state and
process_accepted_import is run via delayed job. This creates
contributorships, auto-verifies if the imports were for a specific
person, indexes, etc.
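The lifecycle above can be sketched as a small state machine. This is a plain-Ruby illustration only, not the BibApp Import model and not the aasm API: the class name `ImportLifecycle` and the event names `process`, `accept` and `reject` are made up for the example, while `review`, `reviewable` and the other state names come from the thread (including the `recieved` spelling).

```ruby
# Plain-Ruby sketch of the import lifecycle; hypothetical stand-in,
# not the real Import model and not built on the aasm gem.
class ImportLifecycle
  class InvalidTransition < StandardError; end

  # Event => allowed source state and resulting state.
  # :review and the state names follow the thread; the other event
  # names are assumptions made for this example.
  TRANSITIONS = {
    process: { from: :recieved,   to: :processing },
    review:  { from: :processing, to: :reviewable },
    accept:  { from: :reviewable, to: :accepted },
    reject:  { from: :reviewable, to: :rejected }
  }.freeze

  attr_reader :state

  def initialize
    @state = :recieved # spelled as in the thread
  end

  # Fire an event, raising if the current state does not allow it.
  def fire(event)
    rule = TRANSITIONS.fetch(event)
    unless @state == rule[:from]
      raise InvalidTransition, "Event '#{event}' cannot transition from '#{@state}'"
    end
    @state = rule[:to]
  end
end

import = ImportLifecycle.new
import.fire(:process)  # step 3: the after_save callback queues the batch job
import.fire(:review)   # step 4: works created, batch ready for review
begin
  import.fire(:review) # running step 4 a second time is rejected
rescue ImportLifecycle::InvalidTransition => e
  puts e.message
end
import.fire(:accept)   # step 6b
puts import.state
```

Firing `review` twice yields the same kind of message that shows up in the delayed_jobs table, which suggests step 4 is being run again on an import that has already reached reviewable.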
Another longshot, but what version of the aasm gem are you using?
bundle show aasm
I don't have any version information for this in the Gemfile, but I do recall that I had to rewrite some of the Import model code as a result of going to a new version of aasm. I'm currently on 2.3.1. Based on the history of aasm it was probably a 2.1 (or prior) => 2.2 problem.
The aasm gem used on the i18n instance, which shows the same behaviour, is 2.3.1.
Claudia
On 07.11.2011 23:08, Howard Ding wrote:
> Another longshot, but what version of the aasm gem are you using?
> bundle show aasm
> I don't have any version information for this in the Gemfile, but I do recall that I had to rewrite some of the Import model code as a result of going to a new version of aasm. I'm currently on 2.3.1. Based on the history of aasm it was probably a 2.1 (or prior) => 2.2 problem.
I think it's step 4 which causes the trouble. The 'last_error' from the table delayed_jobs is always the same: Event 'review' cannot transition from 'reviewable' ...
One other question, which might be related to the issue, or might just be me being new to Rails, Ruby etc.
The email notifications are not sent.
Is this because I'm running in development mode, as config/environments/development.rb indicates? Other emails (registration, forgotten password, etc.) are sent, though.
> I restarted as you said above and still getting the same result during
> the batch import.
> Also, just noticed that when I choose to reject the batch, the works
> are not destroyed. They remain in the database.
> Database system is MySQL
> On Nov 7, 5:08 pm, Howard Ding <hadi...@gmail.com> wrote:
> > Regardless, I'll be more specific in the Gemfile.
Discussion subject changed to "Deduplication and Delayed Jobs Configuration was [Re: [bibapp-dev] Re: Batch loads create 4 works records for each citation]" by Claudia Jürgen
One other question about step 4: when the work is saved, isn't the deduplication run? This sets the state to either accepted or duplicate. There is no option for in process at this point.
And a question about the delayed_jobs: there is no configuration like config/initializers/delayed_job.rb,
so the default settings are taken and, if I got it right, a job is tried 25 times.
Creating config/initializers/delayed_job.rb with just the line Delayed::Worker.max_attempts = 1
stopped the perpetual creation of duplicate works during import.
Nonetheless, the work created during import still got STATE_ACCEPTED instead of STATE_IN_PROCESS.
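For reference, the initializer described above would look roughly like this. It is a config fragment, not a standalone program. The `defined?` guard is an addition for this example so the file loads harmlessly where the delayed_job gem is absent; the thread's version contained only the assignment.

```ruby
# config/initializers/delayed_job.rb
# Config fragment: only has an effect inside the Rails app, where
# Bundler has already loaded the delayed_job gem.
# The guard is an assumption added for this sketch, not part of the thread.
if defined?(Delayed::Worker)
  # Run each job once instead of retrying it on failure.
  Delayed::Worker.max_attempts = 1
end
```

This only stops the retries. It does not address whatever makes the first attempt fail.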
I still don't have a fix for this, although I do have some thoughts.
Based on the delayed_job errors that Claudia reported it looks like somehow batch_import in the Import model is getting called more than once on an Import. batch_import is supposed to take an import in the processing state, create works from it, and then save it and transition it to the reviewable state. The error that she reports seems to indicate that it is finding an import already in the reviewable state and hence throwing an error. However, the attempt to transition occurs at the end of the method, so the work creation happens regardless, which could explain the creation of multiples.
A dirty fix might be to a) have this method check that the import it is working on is in the right state and/or b) have it work inside a transaction so that the extra works get thrown away if there is a problem. I don't consider this ideal because it still doesn't give me any idea _why_ batch_import is getting called more than once. I speculate that this could be because an error occurs and delayed_job retries the job, but things aren't getting rolled back correctly. That explanation doesn't really help to understand why it eventually succeeds instead of failing eternally, though.
I'm still thinking about it, but I may implement my dirty fix, as I don't see any harm in it regardless. If I do that I'll write again when I push it and you both can let me know if it fixes things or at least changes what goes wrong. I still haven't managed to duplicate the issue here.
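A minimal sketch of the two ideas, (a) a state check and (b) rollback on error, in plain Ruby. It is not the change pushed to BibApp: `SketchImport`, `with_rollback` and the `fail_on_review` switch are hypothetical names for the example, and the snapshot-and-restore helper only stands in for a database transaction.

```ruby
# Plain-Ruby sketch of the two ideas; no Rails or ActiveRecord involved.
# SketchImport and its methods are hypothetical stand-ins, not BibApp code.
class SketchImport
  attr_reader :state, :works

  def initialize(citations, fail_on_review: false)
    @citations = citations
    @fail_on_review = fail_on_review # simulates an error late in the method
    @state = :processing
    @works = []                      # stands in for rows in the works table
  end

  def batch_import
    # (a) Refuse to run unless the import is in the expected state.
    return false unless @state == :processing

    # (b) Undo partial work if anything inside the block raises.
    with_rollback do
      @citations.each { |c| @works << c }
      raise "review transition failed" if @fail_on_review
      @state = :reviewable
    end
    true
  end

  private

  # Stand-in for a database transaction: snapshot, yield, restore on error.
  def with_rollback
    saved_state = @state
    saved_works = @works.dup
    yield
  rescue StandardError
    @state = saved_state
    @works = saved_works
    raise
  end
end

ok = SketchImport.new(%w[a b c])
ok.batch_import          # first call creates the works
second = ok.batch_import # second call is refused by the state check

broken = SketchImport.new(%w[a b c], fail_on_review: true)
begin
  broken.batch_import
rescue RuntimeError
  # the error still propagates, but the works were rolled back
end

puts "ok: #{ok.works.size} works, second call ran: #{second}"
puts "broken: #{broken.works.size} works, state: #{broken.state}"
```

In a Rails model the rollback would presumably come from wrapping the work in an ActiveRecord `transaction` block rather than from a hand-written helper.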
I've pushed a possible dirty fix of the nature I described to master. If John and/or Claudia (it's not in i18n yet, though) would be willing to give this a try and let me know if it fixes or changes anything I'd appreciate it.
Note that since it affects delayed_job tasks in addition to restarting your rails server you also need to 'bundle exec rake bibapp:restart' so that delayed_job will run with the new code.
As for the why: it is the delayed_job being retried. Setting the maximum attempts to 1 via a config/initializers/delayed_job.rb containing Delayed::Worker.max_attempts = 1 will result in only 1 work being created. Still, the transition seems to fail, as notify_user is not reached.
As you can't reproduce the error, it might be the environment. I'm working on Suse Linux 10 with Postgres 8.4.
There were other problems, e.g. starting Solr, which were solved by replacing localhost with 127.0.0.1 in app/models/publisher.rb, config/initializers/solr.rb and lib/tasks/solr.rake.
> On 08.11.2011 22:32, Howard Ding wrote:
>> I still don't have a fix for this, although I do have some thoughts.
So perhaps the first time through it is actually making it most of the way through, including creating the work and transitioning the state of the Import, but then errors out, leaving the other attempts to fail on attempting the transition.
My change from yesterday might partially fix this, or not, depending on where the error below is happening. I'm not sure where it's coming from, but I will look at it today. If it does partially fix it what should happen now is that the delayed job will still fail, I think, but all the db state will be rolled back, so it would keep retrying from the same place.
I think it's possible that the error below comes from the notifier - it may be trying to generate the URL for the user to get back to review the import, but the parameters are obviously wrong here. I'm not sure why the master branch would fail in this place, though.
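The scenario described above can be simulated without Rails. The sketch assumes, purely for illustration, that the step after the state change (here a `notify_user` that always raises) is what fails on the first run. `RetriedImport` and `run_job` are hypothetical names, and the retry loop is only a rough stand-in for delayed_job.

```ruby
# Plain-Ruby simulation of a job that is retried after failing late.
# All names are hypothetical; nothing here is BibApp or delayed_job code.
class RetriedImport
  class InvalidTransition < StandardError; end

  attr_reader :state, :works_added

  def initialize(citation_count)
    @citation_count = citation_count
    @state = :processing
    @works_added = 0
  end

  # Works are created first; the transition is attempted last.
  def batch_import
    @works_added += @citation_count
    review!
  end

  private

  def review!
    unless @state == :processing
      raise InvalidTransition, "Event 'review' cannot transition from '#{@state}'"
    end
    @state = :reviewable
    notify_user
  end

  # Assumption: something after the state change raises on the first run.
  def notify_user
    raise "notification failed"
  end
end

# Run the job up to max_attempts times, collecting the error class names.
def run_job(import, max_attempts)
  errors = []
  max_attempts.times do
    begin
      import.batch_import
      break # finished without raising
    rescue StandardError => e
      errors << e.class.name
    end
  end
  errors
end

four = RetriedImport.new(7)
errors = run_job(four, 4)
once = RetriedImport.new(7)
run_job(once, 1)

puts "4 attempts: #{four.works_added} works, errors: #{errors.inspect}"
puts "1 attempt:  #{once.works_added} works"
```

With 7 citations and 4 attempts the simulation ends up with 28 works and three InvalidTransition failures after the initial error, which matches the shape of the reported behaviour. With a single attempt it creates 7.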
> On 09.11.2011 08:41, Claudia Jürgen wrote:
>> Hi all,
>> as for the why: it is the delayed_job being retried. Setting the maximum attempts to 1 via a config/initializers/delayed_job.rb containing Delayed::Worker.max_attempts = 1 will result in only 1 work being created. Still, the transition seems to fail, as notify_user is not reached.