High memory usage ends up killing my box

oliwa

Oct 4, 2012, 12:49:31 PM10/4/12
to rhino-t...@googlegroups.com
I have a process that I would typically use SSIS for, but used RETL instead. The process runs every row of a 190M-row table through a transformation routine, executes some .NET code on the data, and then updates the row. This was very easy to set up, and I'm using a SqlBatchOperation on the update side. The issue is that I can read the data from the source table very fast, so within a few seconds my memory usage starts to skyrocket (6 GB), and if I just let it go, the throughput on the update operation eventually grinds to a halt. I believe the issue is with the ThreadSafeEnumerator and its use of a cache, because if I change the default PipelineExecutor to a SingleThreadedNonCachedPipelineExecuter, my memory usage stays below 50 MB and everything works — but since I'm no longer multithreaded, performance is slower. Is there something I can do to curtail the memory usage while keeping performance at its peak?

Thanks

Mike G

Oct 15, 2012, 9:40:33 PM10/15/12
to rhino-t...@googlegroups.com
I went into the ThreadSafeEnumerator code and made the current thread sleep if the number of items in the collection was greater than 10,000. I changed the AddItem code to the following:

        public void AddItem(T item)
        {
            // Throttle the producer: spin-sleep until the consumer has
            // drained the cache below the threshold. Note: Count is read
            // outside the lock here, so this is only an approximate bound.
            while (cached.Count > 10000)
            {
                Thread.Sleep(100);
            }

            lock (cached)
            {
                cached.Enqueue(item);
                Monitor.Pulse(cached); // wake a consumer waiting on the queue
            }
        }

I don't know if this is safe or performant, but I never found a better solution and it seemed to do the trick. A better change would be to measure the number of bytes 'cached' actually holds and sleep based on that, but I think you'd need some tricks to make that perform well.
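The sleep-and-poll loop above can also be replaced with a proper blocking bound, where the producer parks on the same monitor until a consumer drains the queue — no fixed 100 ms polling delay. Here is a minimal sketch of that idea in Java (the class and field names are illustrative, not from Rhino ETL; the C# equivalent would use Monitor.Wait/Monitor.PulseAll inside the lock):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// A bounded producer/consumer buffer: addItem blocks when the queue is
// full, takeItem blocks when it is empty. This gives backpressure
// without busy-waiting.
class BoundedBuffer<T> {
    private final Queue<T> cached = new ArrayDeque<>();
    private final int limit;

    BoundedBuffer(int limit) { this.limit = limit; }

    public synchronized void addItem(T item) throws InterruptedException {
        while (cached.size() >= limit) {
            wait();          // producer parks until a consumer drains space
        }
        cached.add(item);
        notifyAll();         // wake any consumer waiting for an item
    }

    public synchronized T takeItem() throws InterruptedException {
        while (cached.isEmpty()) {
            wait();          // consumer parks until an item arrives
        }
        T item = cached.poll();
        notifyAll();         // wake any producer waiting for space
        return item;
    }
}
```

With a limit of, say, 10,000, memory stays bounded by the queue capacity regardless of how much faster the reader is than the updater, and the producer resumes immediately when space frees up instead of waiting out a sleep interval.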

Good luck!