Michael Gates
Jul 24, 2012, 7:46:27 PM
to rhino-t...@googlegroups.com
I'm very new to the code, but one thing I noticed while SqlBulkCopying a very large CSV with Rhino ETL was that the "read" thread (the CSV reading operation) was getting backed up because the "write" thread (the SqlBulkInsert operation, which writes in batches of 10,000) couldn't keep up with its duties. In other words, the CSV reading operation was reading rows faster than SqlBulkInsert could write them to the database. I believe this causes the ThreadSafeEnumerator between the two to grow without bound; it had reached 2 GB of data before I killed the process.
So, I went into the code and changed ThreadSafeEnumerator's AddItem method to back off when too many items have already queued up:
public void AddItem(T item)
{
    lock (cached)
    {
        if (cached.Count > 200000)
        {
            Thread.Sleep(100);
        }
        cached.Enqueue(item);
        Monitor.Pulse(cached);
    }
}
This seems to work fine now, and memory has stopped growing out of control. Is there a better way to fix the problem other than my hack?
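For comparison, the usual shape of this fix is a bounded blocking queue: instead of a fixed Thread.Sleep (which, taken inside the lock, also keeps the write thread from dequeuing during the pause, assuming the dequeue side takes the same lock), the producer waits on the queue's monitor while the queue is full, and the consumer pulses it after removing an item. In C# that would be Monitor.Wait/Monitor.Pulse, or a BlockingCollection<T> constructed with a bounded capacity on .NET 4. A minimal sketch of the pattern, written in Java since the idea is language-agnostic (the class name, method names, and capacity here are illustrative, not Rhino ETL's API):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Bounded blocking queue: addItem blocks while the queue is full, so the
// reader can never run more than `capacity` items ahead of the writer.
// (Illustrative sketch -- not Rhino ETL's actual ThreadSafeEnumerator.)
class BoundedQueue<T> {
    private final Queue<T> cached = new ArrayDeque<>();
    private final int capacity;

    BoundedQueue(int capacity) { this.capacity = capacity; }

    public synchronized void addItem(T item) {
        while (cached.size() >= capacity) {
            try {
                wait();   // block the producer until the consumer frees space
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        cached.add(item);
        notifyAll();      // wake any consumer waiting on an empty queue
    }

    public synchronized T takeItem() {
        while (cached.isEmpty()) {
            try {
                wait();   // block the consumer until an item arrives
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
        }
        T item = cached.remove();
        notifyAll();      // wake any producer waiting on a full queue
        return item;
    }

    public synchronized int size() { return cached.size(); }
}
```

With this shape, memory is capped at roughly capacity items no matter how far the reader outpaces the writer, and the producer wakes as soon as space frees up rather than after a fixed 100 ms.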