Failing to import CSV file

Viewed 210 times

John Harris

Jan 18, 2017, 05:40:03
to OpenRefine
On Mac OS X 10.11.2, OpenRefine 2.6-rc2 will not import a 49 MB CSV file, giving a "Memory usage 100% (954/954MB)" error message.
Importing a small 4 MB file doesn't succeed either, with reading hanging at "Memory usage 75% (717/954MB)".
Can anyone help? I was generating "Named entities" and getting some great preliminary results!

John Little

Jan 18, 2017, 12:30:09
to OpenRefine
John:

From my experience, import problems are sometimes related to file size, sometimes to idiosyncrasies of the format of the data being imported, and sometimes to available memory.  Owen Stephens has mentioned on this list his observation that as a dataset becomes wider (more columns), the odds of having a data import problem increase.  That said, I don't have an immediate answer to your problem.  Maybe you can give a bit more information.

So a few questions...

Is the small 4 MB file a derivative of the big file?  I'm trying to understand whether you can open any file, or whether it's just this file, regardless of size.
Can you import a very plain CSV file?  
How many columns exist in the problem dataset?  
Can you import or open the problem dataset in Excel?

There is also the possibility of increasing the default RAM setting.  Some information about that is available in the FAQ.
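As a rough sketch of what that involves (the exact file and value vary by platform and OpenRefine version, so treat the number below as illustrative; on the Mac app the equivalent is the -Xmx JVM option in the app bundle's Info.plist):

```
# refine.ini (command-line release) -- raise the JVM heap; 2048M is illustrative
REFINE_MEMORY=2048M
```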

--John Little

--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Owen Stephens

Jan 19, 2017, 06:21:15
to OpenRefine
I'd definitely increase the memory available to OR - 1000 MB just isn't enough. As John L has pointed out, directions on how to do this are available at https://github.com/OpenRefine/OpenRefine/wiki/FAQ:-Allocate-More-Memory

If this doesn't help... I've said in a previous post (https://groups.google.com/forum/#!searchin/openrefine/columns$20owen%7Csort:date/openrefine/LAP8W3zz41I/kzmyR-tnBAAJ):
----
My general experience is that if you hit issues importing data into OpenRefine and you've already upped the memory to your maximum, the only thing you can do is some pre-processing on the data to get it ready to put into OR.

Sometimes changing format can help (e.g. xls may be imported more easily than the equivalent csv)
Sometimes making sure you are only importing the data you really need can help (e.g. empty columns in spreadsheets can cause problems on import - removing those columns in Excel beforehand is highly advisable and will make the import work without problems)
Sometimes you have to import only a partial set of data (either fewer records, or only selected fields from each record)
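The pre-processing steps above can be sketched in a few lines of Python using only the standard library (a hypothetical helper of my own, not part of OpenRefine - the function names and the 2048M-style limits are assumptions): drop columns that are blank in every row, and optionally keep only the first N data rows, before handing the file to OR.

```python
import csv

def drop_empty_columns(rows):
    """Remove columns that are blank in every data row.

    rows: list of lists, where the first row is the header.
    Returns a new list of rows with the all-empty columns removed.
    """
    header, data = rows[0], rows[1:]
    # Keep a column index only if some data cell in it is non-blank.
    keep = [i for i in range(len(header))
            if any(i < len(r) and r[i].strip() for r in data)]
    return [[row[i] if i < len(row) else "" for i in keep]
            for row in rows]

def trim_csv(src, dst, max_rows=None):
    """Write a trimmed copy of src to dst, optionally keeping
    only the header plus the first max_rows data rows."""
    with open(src, newline="") as f:
        rows = list(csv.reader(f))
    if max_rows is not None:
        rows = rows[:max_rows + 1]  # header + max_rows data rows
    with open(dst, "w", newline="") as f:
        csv.writer(f).writerows(drop_empty_columns(rows))
```

For a file that hangs on import, something like `trim_csv("big.csv", "small.csv", max_rows=10000)` would also let you test whether the problem is sheer size or a quirk in the data itself.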

Hope some of this helps. Increasing memory is definitely the first thing I'd do.

Owen