How to increase memory in OpenRefine


Rohan Ray

Aug 11, 2022, 3:38:34 AM
to OpenRefine
Hello! I am struggling with this. Everything is okay until I press the "Create Project" button; then it fails and shows "Error Code: out of memory". Please help.
I have tried to increase the memory as stated in the manual, but it is not working. Can anyone please help me increase the memory? I can't understand the manual.
Screenshot 2022-08-10 144647.png

Parthasarathi Mukhopadhyay

Aug 11, 2022, 12:53:40 PM
to openr...@googlegroups.com
Hello Rohan

As you are on Windows, open the openrefine.l4j.ini file (in your OpenRefine folder) in a text editor, and add or uncomment the following line, setting the value to about half of your machine's RAM (the example here is for a machine with 8 GB of RAM):

# max memory memory heap size
-Xmx4096M

Restart OpenRefine.
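
The "half of RAM" rule of thumb above is simple arithmetic. As a minimal sketch (the function name `xmx_line` and the default fraction are illustrative assumptions, not part of OpenRefine), the value for the `-Xmx` line could be worked out like this:

```python
def xmx_line(total_ram_gb: float, fraction: float = 0.5) -> str:
    """Return a JVM max-heap flag sized to a fraction of physical RAM.

    Hypothetical helper: the 0.5 default mirrors the "half of RAM"
    suggestion in the reply above; it is a rule of thumb, not a limit.
    """
    if total_ram_gb <= 0 or not 0 < fraction <= 1:
        raise ValueError("RAM must be positive and fraction in (0, 1]")
    heap_mb = int(total_ram_gb * 1024 * fraction)  # GB -> MB, then scale
    return f"-Xmx{heap_mb}M"


print(xmx_line(8))  # -Xmx4096M
```

The result is the line to place in the settings file; the operating system, the browser, and other programs still need the remaining memory.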

Hope this helps ... Best



--
You received this message because you are subscribed to the Google Groups "OpenRefine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openrefine+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openrefine/3aa7fd40-8b17-465f-9d1f-9ad8a3baaea4n%40googlegroups.com.

Rohan Ray

Aug 11, 2022, 2:39:50 PM
to OpenRefine
Thank you for your precious reply. A screenshot is attached below for your reference. Kindly confirm.
Screenshot 2022-08-12 000638.png

Rohan Ray

Aug 11, 2022, 3:28:15 PM
to OpenRefine
I followed your instructions, but the 4096 MB of memory filled up again, even though the file I uploaded was only 308 MB.
Kindly check the screenshots for better clarity.
Screenshot 2022-08-12 005119.png
Screenshot 2022-08-12 005252.png

Walton Goga

Aug 11, 2022, 6:01:22 PM
to openr...@googlegroups.com
Hi Rohan,

With a file size of 308 MB, I think you will need a machine with a reasonable amount of RAM, assuming the OutOfMemory error is not caused by a memory-leak bug.

Regards,
Walton



Rohan Ray

Aug 12, 2022, 5:41:33 AM
to openr...@googlegroups.com
Thank you so much for your reply, Walton. I have a machine with 8 GB of RAM. Isn't that enough to perform such tasks?

Thad Guidry

Aug 12, 2022, 8:47:37 AM
to openr...@googlegroups.com
I have been working with the user through LinkedIn messaging. The issue is that many PMC XML files [hepatic failure - PMC - NCBI (nih.gov)] generate a considerable number of record rows AND columns, which ends up needing a huge amount of memory in both the backend and the frontend of OpenRefine. It also breaks every browser I have tried it with. Firefox was the only one that displayed a column header, after 3 minutes, but it ultimately locked up when I clicked on any column menu.
See my notes to them below, from the LinkedIn chat.

  • Ideally, you would have clicked on the <article> element so that you have a record in OpenRefine for each article and all its data.

  • Even if you did that, the XML format is not ideal for OpenRefine when many columns need to be created, which is unfortunately the case with this PMC XML file.

  • One option, if you don't need full citation data, is to choose the MEDLINE format, which is plain text. OpenRefine will see it as line-based and autodetect the column break. Then on Column1 choose Transpose > Columnize by key/value columns, use Column1 as the key column and Column2 as the value column, and click OK. Wait a minute, and you will have tidy rows. You can then do a fill down on Column1 if you wish. The MEDLINE format has most of the columns but is not as detailed as the XML file, so it depends on what you are trying to achieve.

  • Another tool, Apache Hop, has a plugin step called "XML Input Stream (StAX)" that could be used and then connected to a JSON or Text output plugin step. Apache Hop can work on extremely large files.

  • We hope that in OpenRefine 4.0 we can get a better stream-based XML parser in place, and also better views for hierarchical data such as what you are dealing with.
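
The streaming idea behind the StAX suggestion above can also be tried outside Apache Hop. The following is only a sketch, assuming Python's standard-library `xml.etree.ElementTree.iterparse` as a stand-in for a StAX reader. It is not how Hop or OpenRefine work internally. The function name `articles_to_jsonl` is hypothetical, the record tag `article` is taken from the notes above, and the flattening (leaf text grouped by tag name) is a deliberate simplification that drops attributes and mixed inline text.

```python
import io
import json
import xml.etree.ElementTree as ET


def articles_to_jsonl(source, out, record_tag="article"):
    """Stream an XML source and write one JSON line per record element.

    Assumption: record elements are not namespaced; with a default
    namespace the tag would look like "{uri}article" instead.
    """
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        # Flatten the record: group the text of each leaf element by tag.
        record = {}
        for node in elem.iter():
            text = (node.text or "").strip()
            if len(node) == 0 and text:
                record.setdefault(node.tag, []).append(text)
        out.write(json.dumps(record) + "\n")
        count += 1
        elem.clear()  # drop the processed record's children to save memory
    return count


# Tiny in-memory stand-in for a large file.
sample = io.BytesIO(
    b"<set><article><title>A</title><kwd>x</kwd><kwd>y</kwd></article>"
    b"<article><title>B</title></article></set>"
)
out = io.StringIO()
n = articles_to_jsonl(sample, out)
print(n)
print(out.getvalue(), end="")
```

Because each record is written out and cleared as soon as its closing tag is seen, the whole document is never expanded in memory at once. The resulting JSON lines file could then be imported into OpenRefine or processed further.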

Rohan Ray

Aug 13, 2022, 11:27:24 AM
to OpenRefine
Thank you so much for your input Thad.

Rohan Ray

Aug 25, 2022, 3:29:30 AM
to openr...@googlegroups.com
Yes! I did that, but unfortunately it makes the data messy. It merged some of the data into a single row.
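
One possible reason for merged values is that MEDLINE records may repeat a tag (for example several AU lines) and wrap long values onto indented continuation lines. The sketch below assumes that layout: a tag in the first four characters, a hyphen in the fifth, the value from the seventh, continuation lines indented by six spaces, and a blank line between records. The function name `parse_medline` and the sample data are hypothetical, and the actual export should be checked against these assumptions.

```python
def parse_medline(lines):
    """Parse MEDLINE-style text into a list of {tag: [values]} records."""
    records, current, last_tag = [], {}, None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line ends the current record.
            if current:
                records.append(current)
            current, last_tag = {}, None
        elif line.startswith("      ") and last_tag is not None:
            # A continuation line extends the most recent value.
            current[last_tag][-1] += " " + line.strip()
        elif len(line) >= 5 and line[4] == "-":
            # New field: repeated tags are kept as separate list items.
            last_tag = line[:4].strip()
            current.setdefault(last_tag, []).append(line[6:].strip())
    if current:
        records.append(current)
    return records


# Made-up sample in the assumed layout.
sample = [
    "PMID- 1",
    "TI  - Hepatic failure in",
    "      adults.",
    "AU  - Smith J",
    "AU  - Doe A",
    "",
    "PMID- 2",
    "TI  - Second title",
]
records = parse_medline(sample)
print(records)
```

Keeping repeated tags as lists avoids forcing several values into one cell. Joining them with a separator of your choice before loading into OpenRefine keeps one row per record.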

Screenshot 2022-08-25 125841.png