how to recover corrupted project?


John Chung

Aug 5, 2013, 5:02:09 PM
to openr...@googlegroups.com
Hi all,

After an unexpected shutdown, I lost all my Google Refine projects (Ubuntu 13.04, Google Refine 2.5).

I looked at the workspace folder. All projects have been renamed to {number}.project.corrupted.

The metadata.json file in each project folder is empty.
The workspace.json is empty.

But the data.zip file and the history folder still exist.

Is there any way to recover these projects?

Thanks

Best,
John

Tom Morris

Aug 6, 2013, 9:29:27 AM
to openr...@googlegroups.com
First, before doing anything else, make a backup of that entire directory tree.

If you're willing to share the data with the development team, it would be very useful for us to get a copy, along with any information you have related to the unexpected shutdown (e.g. were you running low on memory/swap, etc).

The projects are renamed to .corrupted after attempts to recover them have failed.  If this was a  transient problem, you may be able to get them back by removing the .corrupted extension (AFTER making and saving your backup) and restarting Refine.  Do this from a shell so you can see what it says during startup.  You should see messages related to the recovery attempts.  If that doesn't do the trick, it'll probably require manual intervention.
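
The backup-then-rename step above can be sketched in Python. This is only an illustration, not part of Refine: the function name `restore_corrupted_names` and both path arguments are hypothetical, and the backup location must be outside the workspace directory.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def restore_corrupted_names(workspace: Path, backup: Path) -> list[Path]:
    """Back up the workspace, then strip the .corrupted suffix from project folders."""
    # Copy the whole tree first. copytree raises if the backup already exists,
    # so an earlier backup is never overwritten. Keep `backup` outside `workspace`.
    shutil.copytree(workspace, backup)

    restored = []
    for folder in sorted(workspace.glob("*.project.corrupted")):
        target = folder.with_suffix("")  # drops only the trailing ".corrupted"
        if target.exists():
            continue  # never clobber a project that already has that name
        folder.rename(target)
        restored.append(target)
    return restored
```

After running it with Refine shut down, start Refine from a shell so the recovery messages are visible.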

Another thing to look for is variants of the workspace.json.  There will potentially be a workspace.temp.json and/or a workspace.old.json which can be used for recovery (the writing process does a write new, rename old, rename new, delete old dance to attempt to make sure there's always a valid file)
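
A quick way to check which of those workspace file variants is still usable, as a rough Python sketch. The three file names come from the paragraph above; the function name `usable_workspace_files` and the "non-empty and parses as JSON" test are assumptions about what counts as usable, not Refine's own check.

```python
from __future__ import annotations

import json
from pathlib import Path

CANDIDATES = ("workspace.json", "workspace.temp.json", "workspace.old.json")


def usable_workspace_files(workspace: Path) -> list[Path]:
    """Return the workspace file variants that exist, are non-empty, and parse as JSON."""
    usable = []
    for name in CANDIDATES:
        path = workspace / name
        if not path.is_file() or path.stat().st_size == 0:
            continue  # missing or zero-length
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except ValueError:
            continue  # truncated or garbage content (covers decode errors too)
        usable.append(path)
    return usable
```

If only a `.temp` or `.old` variant comes back, copying it over workspace.json (after the backup) is the obvious thing to try.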

Interestingly, I was working on this yesterday in an attempt to make it more robust.  We've only seen this kind of problem very rarely, but obviously we don't ever want to see it.  Hopefully, 2.6 will be even more robust.

Tom

John Chung

Aug 6, 2013, 11:43:20 AM
to openr...@googlegroups.com
Thanks, Tom

Your reply is always very helpful.

Unfortunately, I had already tried renaming the folders from project.corrupted back to project, and no miracle happened. :( The console showed messages such as "fail to recover".

The link ( http://d.pr/f/p6Ty ) is the broken projects' folder. I have already backed it up on my local machine, so please feel free to use the data.

Best,
John


--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Tom Morris

Aug 9, 2013, 7:26:26 PM
to openr...@googlegroups.com
Hi John.  Sorry it took me so long to look at this.  It doesn't appear that that tar file includes a workspace.json file.  Was it missing entirely? That contains your preferences and your expression history (for reuse, not undo), so those things are gone without it.

Many of the projects also have zero length metadata.json files which means that things like the project name, creation date, etc are unavailable.  I've written a little recovery algorithm which will be in 2.6 which attempts to patch things together on whatever information it can find.  It creates a descriptive name for the project such as "<recovered project> - 2 cols X 147 rows - precinct|geometry."  Hopefully the list of column names plus the fact that it was last modified a week ago will be enough to help remember what the name was or at least come up with a new name.
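
To illustrate the naming scheme only (this is not the code that will ship in 2.6), a descriptive name in that format could be built from the column list and row count like this. The function name `recovered_name` is hypothetical.

```python
from __future__ import annotations


def recovered_name(columns: list[str], row_count: int) -> str:
    """Build a placeholder project name from whatever the data itself reveals."""
    column_list = "|".join(columns)
    return f"<recovered project> - {len(columns)} cols X {row_count} rows - {column_list}"


print(recovered_name(["precinct", "geometry"], 147))
# <recovered project> - 2 cols X 147 rows - precinct|geometry
```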

I've also revised the saving algorithm to hopefully make it more robust.  It won't write unchanged metadata and it leaves the old version of the file around to increase the odds of a usable file being available in the case of a problem.
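
The general idea (skip unchanged writes, keep the previous file around) can be sketched as follows. This is a simplified Python illustration under assumed names, not the actual Refine code: `save_metadata`, `metadata.temp.json`, and `metadata.old.json` are hypothetical, and a real implementation would also need to make sure the data is flushed to disk.

```python
import json
from pathlib import Path


def save_metadata(folder: Path, metadata: dict) -> bool:
    """Write metadata.json only if it changed, keeping the previous version on disk."""
    current = folder / "metadata.json"
    temp = folder / "metadata.temp.json"  # hypothetical name
    old = folder / "metadata.old.json"    # hypothetical name

    new_text = json.dumps(metadata, sort_keys=True)
    if current.is_file() and current.read_text(encoding="utf-8") == new_text:
        return False  # unchanged: skip the write entirely

    temp.write_text(new_text, encoding="utf-8")  # 1. write the new file
    if current.exists():
        current.replace(old)                     # 2. rename current to old, and keep it
    temp.replace(current)                        # 3. rename the new file into place
    return True
```

At every point in that sequence at least one complete file exists, so an interrupted save leaves something to recover from.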

Let me know if you want the recovered projects and I can put them someplace for you.

Tom

John Chung

Aug 9, 2013, 8:59:40 PM
to openr...@googlegroups.com
Thanks, Tom! You saved my life!

I am sure the workspace.json and metadata.json files disappeared, which is why Google Refine could not recover the projects automatically. I don't know why it happened; my guess is that the unexpected shutdown interrupted the save process.

To share a file, droplr.com should be a good choice.

Best,
John

Steve W

Nov 13, 2013, 2:59:56 PM
to openr...@googlegroups.com, jno...@gmail.com
I'm wondering if anyone can help with this same problem?  I had an unexpected shut down and 40 of my projects converted to {number}.project.corrupted

Scenario: left computer in Hibernate Mode.  Opened Windows the next day (I'm running Win7, 64-bit / Refine Version 2.5 [r2407]) ... and accidentally powered down while PC was rebooting out of Hibernate.  Restarted, boot as normal.  Opened refine and could not find latest files I've been working on.  Visited Workspace directory and saw that over 40 of my files were showing as corrupted.

Based on this forum post, I tried renaming a file from Windows Explorer (not sure how to do it from the shell directly) called: 1607117493017.project.  Notice in the attachment that when restarting Refine, the shell states "FileProjectManager Failed to Recover Project in Directory" for the project of the same name.

Other attachment shows file directory: it doesn't appear to have any additional workspace.json files.  Also notice that ALL of the files in the directory seem to have been "last modified" at the time of the shutdown.


On a whim, 2 days ago I had made a backup of that folder, so I hopefully can recover a good portion of the files from the backup, but I'd like to know how to try and save the corrupted files and/or share them with your development team so that it's less likely to happen to others in the future. 

I'm new to coding and had been experimenting with a couple of things the night before which may or may not be related: 
1) unsuccessful attempt of tutorial for ScraperWiki: https://scraperwiki.com/help/make-your-own-tool/ which required SSH into tool from GitBash 
2) had installed and been using this extension for 1 or 2 days before corruption (vib-bits): https://www.bits.vib.be/index.php/software-overview/openrefine
3) I also noticed at the time of the shutdown many "TEMP" files were saved to my C:\Users folder.

Any help would be greatly appreciated.  My latest work won't be in the backup so I'd like to salvage the corrupted files if possible.  In the meantime, to use my backup files, do I simply have to copy and paste them into this directory or do I need to do something with shell commands so that it knows to open these new files?   Like I say, not much of a hacker...still learning. 


Thanks,
Steve

Steve Woolsey

Nov 13, 2013, 3:26:41 PM
to openr...@googlegroups.com, jno...@gmail.com
Forgot attachments...


--
You received this message because you are subscribed to a topic in the Google Groups "OpenRefine" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/openrefine/2IPs0CXTKY4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to openrefine+...@googlegroups.com.
11-13-2013 11-40-20 AM.png
11-13-2013 11-45-35 AM.png

Martin Magdinier

Nov 13, 2013, 9:34:58 PM
to openrefine, jno...@gmail.com
Quick answer regarding loading your backup. You should use the option named import project. You will see it on the start page of Refine (once launched). 

You can try to import your project in OpenRefine 2.6 and see if the import is more robust than 2.5 (as per previous email from Tom).

If you are willing to, you can share your project with the team so we can look into it.


Martin

Steve Woolsey

Nov 14, 2013, 2:47:02 PM
to openr...@googlegroups.com, jno...@gmail.com
Thanks Martin,

I was able to open my backups by removing all of the existing corrupted files and  workspace.json file from the Workspace dir and moving all of the backup files to that directory.  When I launched OpenRefine, the back-up files loaded correctly without the Import step.

For the files that were not in my backup folder, I tried to Import the data.zip folder from my old workspace directory but encountered an error that it was the wrong file format (I also tried importing the metadata.txt and data.txt files that are within the .zip with no success).  As a last attempt, I ran a conversion tool on the .zip file to .tar.gz format, and the import failed for "reason unknown".

I'd be happy to send a few files for the team to look at; what's a good email to send to (some sensitive data so hopefully I can share outside of forum).

Regards,
Steve


Steve W

Nov 14, 2013, 3:31:10 PM
to openr...@googlegroups.com, jno...@gmail.com
New development on same issue:
After closing Open Refine properly (Ctrl+C), restarting computer, and re-opening Refine... 3 random projects failed to load correctly.  See attachment.
These are not projects I've opened for awhile so problem appears to be random. 
-Steve

11-14-2013 12-21-10 PM.png
11-14-2013 12-26-28 PM.png

Jorge F. Chavez

Feb 18, 2014, 6:15:05 PM
to openr...@googlegroups.com, jno...@gmail.com
Hi
I just had the same issue (an unexpected shutdown that corrupted the metadata.json files) and found a way to recover my files.
My problem was that the metadata files were the ones that got corrupted (lots of "nulls").
So I made a backup of all the files, as suggested here.
Then I took one metadata file from another computer that also runs OpenRefine and copied it into each folder. In each case I only needed to edit the first field of the metadata file, which holds the name of the project (I used the date of the data.zip to name my recovered projects). Then I removed the .corrupted extension from the folders, ran OpenRefine, and it worked perfectly.
Maybe it is not an elegant solution, but at least I could recover my data!
Best,
Jorge
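
For anyone who prefers to script the workaround described in this post, here is a minimal Python sketch. It assumes the project name is stored under a `"name"` key in metadata.json and that the folder ends in `.corrupted`; the function name `patch_metadata` and the "recovered <date>" naming are hypothetical. Make a backup first and run it with Refine shut down.

```python
import json
from datetime import datetime
from pathlib import Path


def patch_metadata(donor: Path, corrupted_folder: Path) -> Path:
    """Copy a healthy metadata.json into a corrupted project and un-rename the folder."""
    if corrupted_folder.suffix != ".corrupted":
        raise ValueError(f"not a .corrupted folder: {corrupted_folder}")

    metadata = json.loads(donor.read_text(encoding="utf-8"))

    # Name the project after the modification date of its data.zip.
    stamp = datetime.fromtimestamp((corrupted_folder / "data.zip").stat().st_mtime)
    metadata["name"] = f"recovered {stamp:%Y-%m-%d}"  # "name" key is an assumption

    (corrupted_folder / "metadata.json").write_text(
        json.dumps(metadata), encoding="utf-8"
    )

    # Drop the .corrupted suffix so the folder is picked up again on restart.
    target = corrupted_folder.with_suffix("")
    corrupted_folder.rename(target)
    return target
```

The other fields copied from the donor (creation date and so on) will belong to the donor project, so only the data itself should be trusted afterwards.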


Steve Woolsey

Feb 18, 2014, 9:40:59 PM
to openr...@googlegroups.com, jno...@gmail.com
Thanks Jorge, 
I'll give your suggestion a try as there are still several files I was never able to recover from the past.
-Steve

Abigail Rumsey

May 16, 2014, 11:02:22 AM
to openr...@googlegroups.com, jno...@gmail.com
This solution worked for me. It saved a whole lot of headache! Thanks Jorge!

James Kim

Jul 3, 2015, 4:19:15 PM
to openr...@googlegroups.com, jno...@gmail.com
Jorge's solution worked for me. You don't even have to find a metadata file on another computer. As long as one of the metadata files in your Refine workspace is not in a .corrupted folder, you can copy it into all the .corrupted folders, replacing their metadata files. Then open each .json file and change the name field to whatever you want.
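
Finding a healthy metadata file to copy from can be scripted too. A rough Python sketch, assuming a usable file is one that parses as JSON and has a non-empty `"name"` entry; the function name `find_donor` is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def find_donor(workspace: Path) -> Optional[Path]:
    """Return the first metadata.json from a non-corrupted project that still parses."""
    # "*.project" matches only folders that do NOT carry the .corrupted suffix.
    for path in sorted(workspace.glob("*.project/metadata.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable, empty, or filled with nulls
        if isinstance(data, dict) and data.get("name"):
            return path
    return None
```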