OutOfMemoryException when streaming large CSV File (2 mil records)

430 views
Skip to first unread message

rabah...@gmail.com

unread,
Mar 20, 2013, 12:04:35 PM3/20/13
to csv...@googlegroups.com
We're using the CsvWriter to stream the content of the file to a http server to be uploaded into SQL, this works fine for relatively small files but when we try to do the same thing with a large file with 2 million records, the CsvWriter.Write blows out with out of memory exception, below is a code snippet 

HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(requestInfo.AddressToUse + "/" + ContactListCommand.ImportURI(columnMapping.Values, this, true) + "&requestKey=" + requestInfo.RequestKey);
                request.Method = WebRequestMethods.Http.Put;
                request.ContentType = "text/plain;charset=utf-16";
                request.ReadWriteTimeout = int.MaxValue;
                request.Timeout = int.MaxValue;
                request.SendChunked = true;

                CsvWriter csvWriter = new CsvWriter(request.GetRequestStream(), ',', new UnicodeEncoding(false, false));

                //write the column names
                foreach (DBColumn contactListField in columnMapping.Values)
                    csvWriter.Write(contactListField.Name.Trim());
                csvWriter.EndRecord();

                source.OpenConnection();
                foreach (Dictionary<DBColumn, object> sourceRecord in source)
                {
                    //only pull values from columns that are mapped
                    foreach (DBColumn sourceField in columnMapping.Keys)
                    {
                        object sourceValue;
                        if (sourceRecord.TryGetValue(sourceField, out sourceValue))
                        {
                            if (sourceValue != null)
                                csvWriter.Write(sourceValue.ToString().Trim());
                            else
                                csvWriter.Write(string.Empty);
                        }
                    }
                    csvWriter.EndRecord();
                    totalImported++;
                }

                csvWriter.Flush();
                csvWriter.Close();

                HttpWebResponse response = (HttpWebResponse)request.GetResponse();

What are we doing wrong here?

shriop

unread,
Mar 20, 2013, 9:14:14 PM3/20/13
to csv...@googlegroups.com
That's pretty strange. Internally, CsvWriter is wrapping a StreamWriter around the Stream object that you're passing into the constructor, and then making calls to Write on the StreamWriter. I THOUGHT StreamWriter did some kind of internal management and eventually flushed. Apparently I was wrong. I need to research some more, but I'm fairly sure that the simple fix would be to call csvWriter.Flush() after every call to csvWriter.EndRecord(). If you don't like the performance implications, you can do a mod operation against your totalImported variable and only call flush every thousand or so records. Now I can't say that your web server is going to handle that size of a request very well, so hopefully you have some streaming concept already figured out there.

Bruce Dunwiddie

rabah...@gmail.com

unread,
Mar 21, 2013, 2:07:03 PM3/21/13
to csv...@googlegroups.com
I actually already tried that but I get another error "System.Net.Sockets.SocketException (0x80004005): An existing connection was forcibly closed by the remote host", if I take the flushing out of my code, the upload works fine (with any file not the 2Mil record one).

Any other ideas?

shriop

unread,
Mar 22, 2013, 12:32:22 AM3/22/13
to csv...@googlegroups.com
Ok, this looks promising, http://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.allowwritestreambuffering.aspx . It appears that the default setting for write stream buffering is true, which would obviously cause the entire request stream to be slammed into memory. Take the flushes back out and try setting that property on the request object to false.

Bruce Dunwiddie
Reply all
Reply to author
Forward
0 new messages