Hi Amber,
As part of our migration, we did a pre-migration step of merging accounts that had the same e-mail address. In the dataverse/scripts/migration directory there is a file, scrub_duplicate_emails.
Part of what we need to do for the migration is better explain how to use this, so here are some quick notes:
- the first three commented-out queries are there to help you see the extent of the duplicate e-mails you have*
* this doesn't account for e-mails that differ only in case, e.g.
gdu...@iq.harvard.edu and
gdu...@IQ.harvard.edu. In 4.3 we made changes to disallow this (i.e. they would be considered duplicates) and again had to do some merging. BUT it would be easier/better to do that as part of this merging stage, so also on our to-do list is to modify these queries to merge those accounts as well. My guess is it is just a matter of adding some strategic "lower()" calls to the queries.
- the next query is the delete query that removes the duplicates. This will not work right away; it can only succeed after the accounts have been "merged" by the update queries.
- lastly you have the update queries - these go through and modify references so that they all point to the user account with the lowest id (based on the assumption that this is the account the user originally created). E.g., if I have two users with one dataset each, this will "move" the reference from the second user's dataset to the first user; you then have one user with two datasets and a second user with none, so the second user can now be deleted.
- some of those updates can fail - say you have granted permission on the same dataset to both user accounts. When doing the transfer, you would end up with two identical rows, which would violate a unique constraint. In this case you don't need to transfer anything, since the permission is already there for the original account (the one you will still have when all is said and done). If you're lucky none of these will fail, but if they do, there is a commented-out section in the script to help with this:
" if any of the below fail because of duplicate constraints, you will need to first delete the duplicates
here is a sample query for deleting the duplicate entries from studyfile_vdcuser (the most likey to fail))"
As you can see, this is a complicated process, so it was done fairly manually here. I hope the above gives you some guidance, but let me know how else I can help.
One word of advice: make sure you have a backup copy of your data before you start the cleanup above, in case you need to go back to it and start over.
Re: passwords, we don't copy them over in the original script, but we did copy them over after we were done with all the migration steps and were ready to go live (we didn't want anyone to try to log in before we were ready for them). I don't recall if we have that documented anywhere (Kevin may know), but it's a fairly straightforward script, especially compared to all of the above. That said, even with the old passwords copied over, users will be prompted for new passwords, since 4.0 switches to stronger password encryption.
Gustavo