Rhino DSL: ThreadSafeEnumerator "backing up", causing out of control memory growth


Michael Gates

Jul 24, 2012, 7:46:27 PM
to rhino-t...@googlegroups.com
I'm very new to the code, but one thing I noticed while SqlBulkCopying a very large CSV with Rhino ETL is that the "read" thread (the CSV reading operation) seems to get backed up because the "write" thread (the SqlBulkInsert operation, which writes in batches of 10,000) can't complete its duties fast enough. In other words, the CSV reading operation reads rows faster than SqlBulkInsert can write them to the database. I believe this causes the ThreadSafeEnumerator between the two to keep getting larger and larger; it eventually reached 2 GB of data before I killed the process.

So I went into the code and changed ThreadSafeEnumerator's AddItem method to pause before adding an item when too many items are already queued up:

        public void AddItem(T item)
        {
            lock (cached)
            {
                if (cached.Count > 200000)
                {
                    Thread.Sleep(100);
                }

                cached.Enqueue(item);
                Monitor.Pulse(cached);
            }
        }

This seems to work fine now, and memory has stopped growing out of control. Is there a better way to fix the problem other than my hack?
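
One possible refinement of the hack above, offered only as a sketch: in the snippet, Thread.Sleep runs while the lock on cached is held, so a consumer that takes the same lock cannot dequeue during those 100 ms, and the size check runs only once per item, so the queue can still creep past the limit. A bounded producer/consumer queue built on Monitor.Wait avoids both issues, because Monitor.Wait releases the lock while waiting and the while loop re-checks the condition after each wake-up. The class name BoundedQueueSketch, the Take method and the maxItems parameter are hypothetical stand-ins, not part of Rhino ETL; the real ThreadSafeEnumerator would need the equivalent pulse wherever it dequeues.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for ThreadSafeEnumerator<T>: only the queueing logic is sketched.
public class BoundedQueueSketch<T>
{
    private readonly Queue<T> cached = new Queue<T>();
    private readonly int maxItems; // assumed upper bound, e.g. 200000

    public BoundedQueueSketch(int maxItems)
    {
        this.maxItems = maxItems;
    }

    public int Count
    {
        get { lock (cached) { return cached.Count; } }
    }

    // Producer side: blocks while the queue is full.
    public void AddItem(T item)
    {
        lock (cached)
        {
            // Monitor.Wait releases the lock while waiting, so the consumer can drain the queue.
            while (cached.Count >= maxItems)
            {
                Monitor.Wait(cached);
            }

            cached.Enqueue(item);
            Monitor.PulseAll(cached); // wake a consumer waiting for data
        }
    }

    // Consumer side: blocks while the queue is empty.
    public T Take()
    {
        lock (cached)
        {
            while (cached.Count == 0)
            {
                Monitor.Wait(cached);
            }

            T item = cached.Dequeue();
            Monitor.PulseAll(cached); // wake a producer waiting for free space
            return item;
        }
    }
}
```

With this shape the reader thread simply stalls inside AddItem once maxItems rows are pending, so memory use is capped by the limit rather than by how far the reader can run ahead of the writer.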

Simone Busoli

Jul 25, 2012, 2:46:24 AM
to rhino-t...@googlegroups.com
RhinoETL does not provide this mechanism, although you may try using the single-threaded pipeline executor, which keeps upstream operations from running ahead: an upstream operation only produces an item when a downstream operation pulls it.
Implementing the feature for the multi-threaded scenario is a general programming question.
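
The pull-based behaviour described above can be illustrated with plain C# iterators. This is a minimal sketch of the general idea, not Rhino ETL's actual executor code; the class LazyPipelineSketch and its Read, Write and Log members are hypothetical names chosen for the example.

```csharp
using System.Collections.Generic;

public static class LazyPipelineSketch
{
    // Records the order in which the two stages run.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical "read" stage: yields one row at a time, only when asked.
    public static IEnumerable<int> Read(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Log.Add("read " + i);
            yield return i;
        }
    }

    // Hypothetical "write" stage: each loop iteration pulls exactly one row from upstream.
    public static void Write(IEnumerable<int> rows)
    {
        foreach (int row in rows)
        {
            Log.Add("write " + row);
        }
    }
}
```

Calling LazyPipelineSketch.Write(LazyPipelineSketch.Read(3)) leaves Log as read 0, write 0, read 1, write 1, read 2, write 2. The reader never gets more than one row ahead of the writer, so nothing accumulates between the two stages; the trade-off is that reading and writing no longer overlap in time.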


