My general experience is that if you hit issues importing data into OpenRefine and you've already upped the memory to your maximum, the only thing left to do is some pre-processing on the data to get it ready to put into OR.
Sometimes changing format can help (e.g. xls may be imported more easily than the equivalent csv)
Sometimes making sure you are only importing the data you really need can help (e.g. empty columns in spreadsheets can cause problems on import, so removing those columns in Excel beforehand is highly advisable and can make the import go much more smoothly)
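(If you'd rather script that clean-up than do it by hand in Excel, here's a quick pandas sketch - this assumes you've exported to csv and have pandas installed, and the filenames are made up:

import pandas as pd

df = pd.read_csv("export.csv")            # hypothetical export from your spreadsheet
df = df.dropna(axis=1, how="all")         # drop columns that are completely empty
df.to_csv("export_clean.csv", index=False)

)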
Sometimes you have to import only a partial set of the data (either fewer records, or only selected fields from each record)
My feeling (I can't quantify this, but it's based on my general experience) is that hierarchical data causes more performance issues than tabular data, even for similar amounts of data - so the first thing I'd look at is whether you can flatten the XML out to csv outside OR. Of course this may not be possible.
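For what it's worth, here's a rough sketch of that flattening step in Python. I'm assuming each record is a <record> element with flat child fields - the tag and field names here are made up, so adjust them to your actual schema. It uses iterparse so the whole file never has to sit in memory at once:

import csv
import xml.etree.ElementTree as ET

fields = ["id", "title", "date"]  # whichever fields you actually need

with open("records.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(fields)
    # stream through the XML one element at a time rather than loading it all
    for event, elem in ET.iterparse("records.xml", events=("end",)):
        if elem.tag == "record":
            writer.writerow([elem.findtext(f, default="") for f in fields])
            elem.clear()  # free the element's memory as we go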
Even if this is possible, I'd say 5GB of data is likely to cause OR issues anyway.
If so, then you'd have to fall back on working with a subset of the data - so the next question would be whether you could extract just the relevant parts of each XML record to create a smaller import file you can work on.
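If you go down that route, the same streaming approach can trim each record down to just the fields you care about before import - again just a sketch, with made-up tag names:

import xml.etree.ElementTree as ET

keep = {"id", "title"}  # the child elements worth keeping

with open("records_small.xml", "w", encoding="utf-8") as out:
    out.write("<records>\n")
    for event, elem in ET.iterparse("records.xml", events=("end",)):
        if elem.tag == "record":
            slim = ET.Element("record")
            for child in elem:
                if child.tag in keep:
                    slim.append(child)
            out.write(ET.tostring(slim, encoding="unicode") + "\n")
            elem.clear()
    out.write("</records>\n")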
Or whether it makes sense to split the data into smaller sets of records and work through these smaller record sets one by one
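Splitting can be scripted in the same streaming style - e.g. writing out a new file every 50,000 records and then loading each part into OR as its own project (file and tag names are assumptions again):

import xml.etree.ElementTree as ET

CHUNK = 50_000
count, part, out = 0, 0, None

for event, elem in ET.iterparse("records.xml", events=("end",)):
    if elem.tag != "record":
        continue
    if count % CHUNK == 0:
        if out:                 # close the previous chunk, if any
            out.write("</records>\n")
            out.close()
        part += 1
        out = open(f"records_part{part}.xml", "w", encoding="utf-8")
        out.write("<records>\n")
    out.write(ET.tostring(elem, encoding="unicode") + "\n")
    elem.clear()
    count += 1

if out:                         # close the final chunk
    out.write("</records>\n")
    out.close()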
If none of this is possible or helps, I suspect you are at the point where OR is not going to do the job for you and it's time to look at alternative tools.
Owen