Performance issues over time (and memory consumption)

poser...@gmail.com

May 31, 2013, 2:56:40 AM
to cs-s...@googlegroups.com
Hi!

We have a framework that processes documents, and we are trying to use CS-Script to manipulate the documents before we send them to another service. I ran some performance tests comparing it to IronPython, and CS-Script's performance seems to degrade a lot over time.

Throughput goes from about 240 documents per second when processing 25000 documents down to about 150 documents per second when processing 50000 documents (memory is at about 600 MB when it finishes). IronPython, on the other hand, stays pretty stable at around 250 documents per second (and finishes with about 50 MB of memory usage).

The code that uses the scripts looks like this:

foreach (var command in commandBatch)
{
    try
    {
        var scope = GetScriptScope(command.Domain);
        var script = CSScript.Evaluator.LoadFile<IScript>(scope);

        var processedDocument = command.IsAdd
            ? script.ProcessDocument((Document)command.InitialDocument.Clone())
            : script.ProcessDelete((Document)command.InitialDocument.Clone());

        command.State = State.Processed;
        command.ProcessedDocument = processedDocument;
        command.ProcessDate = DateTime.Now;
    }
    catch (Exception e)
    {
        command.Error = new Error
        {
            ScriptRan = command.Domain + ScriptSuffix,
            CreateDate = DateTime.Now,
            ErrorMessage = string.Format("{0}\n{1}", e.Message, e.StackTrace)
        };
        command.State = State.Error;
    }
}

So every command belongs to a domain, and the name of the domain controls which script is loaded. The tests I have run used the same domain name for all commands.

The IScript interface looks like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Findwise.Hades.Api;

namespace Findwise.Hades.CsScriptService
{
    public interface IScript
    {
        Document ProcessDocument(Document document);
        Document ProcessDelete(Document document);
    }
}

And the script I'm testing with looks like this:

using Findwise.Hades.Api;
using Findwise.Hades.Api.Attributes;
using Findwise.Hades.CsScriptService;

namespace Findwise.Hades.Tests.Scripts
{
    public class CsScriptTestScript : IScript
    {
        public Document ProcessDocument(Document document)
        {
            document.Fields.Add(new Field("Test", "Test value"));

            return document;
        }

        public Document ProcessDelete(Document document)
        {
            return document;
        }
    }
}

Is there any way that I could do this that would make it perform better and not consume so much memory?


Oleg Shilo

May 31, 2013, 5:25:33 AM
to cs-s...@googlegroups.com
It looks like you have a memory leak.

Despite its many obvious advantages, CSScript.Evaluator is not immune to memory leaks.

Have a look at cs-script\Samples\Hosting\CompilerAsService\MemoryManagement.cs, which demonstrates cases with and without memory leaks. It can also help you understand how to address the problem.

In the profiling I have done (MemoryManagement.cs), CSScript.Evaluator was able to run without any performance or memory degradation. Note that some test cases there were deliberately designed to leak memory.

If this does not help, you can consider CS-Script CodeDom hosting, as it offers a more "orthodox" memory management model. However, it is less flexible, and I always recommend starting with CSScript.Evaluator by default.

Cheers,
Oleg 


 

Jens Bengtsson

May 31, 2013, 5:56:29 AM
to cs-s...@googlegroups.com
I did look at that example but wasn't really able to get anything out of it. I can try with CodeDom and see if there is a difference.




 

--
You received this message because you are subscribed to a topic in the Google Groups "CS-Script" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/cs-script/NgzgxMKEfvQ/unsubscribe?hl=en.
To unsubscribe from this group and all its topics, send an email to cs-script+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

poser...@gmail.com

May 31, 2013, 6:27:28 AM
to cs-s...@googlegroups.com, poser...@gmail.com
So I ran it using the CodeDom instead with the following code:

 var script = CSScript.Load(scope).CreateInstance("Script").AlignToInterface<IScript>();

Memory usage seemed much better, around 35-40 MB, but the performance was weak: 127 documents per second.

poser...@gmail.com

May 31, 2013, 6:56:38 AM
to cs-s...@googlegroups.com, poser...@gmail.com
My bad, it was actually throwing an error. Changed the code to

AsmHelper helper = new AsmHelper(CSScript.Load(scope));
IScript script = (IScript) helper.CreateAndAlignToInterface<IScript>("*");

But this is also leaking memory.

Oleg Shilo

May 31, 2013, 7:11:31 AM
to cs-s...@googlegroups.com
Before you completely give up on CSScript.Evaluator, tell me more about your hosting scenario:

- Do you always execute the same script?
- What is the number of cycles (executions) before you notice the memory/performance degradation?


Oleg Shilo

May 31, 2013, 7:18:59 AM
to cs-s...@googlegroups.com, poser...@gmail.com
CSScript.Load() is expected to leak in your scenario. You will need to use "Remote Execution" instead of the "Local" one. I will guide you through it, but first please answer the questions from my previous email.

Jens Bengtsson

May 31, 2013, 7:45:03 AM
to cs-s...@googlegroups.com
In this particular test it's the same script. In the framework itself there might be different scripts, but probably only about 2-5.

Here's the trace output from my unit test, which runs 100 batches of 500 commands each through the processing function. The Evaluator.LoadFile function is called for every single command. The formatting is broken, so the batch number sits next to the docs/second number with only a space to separate them:

http://pastebin.com/wBLb4jhJ


Oleg Shilo

May 31, 2013, 9:16:18 AM
to cs-s...@googlegroups.com
After a quick look at your code, I got the impression that it is expected to leak, but not because of scripting.

I will prepare a proper test case for you tomorrow. It is almost midnight here in Australia ;o)

poser...@gmail.com

May 31, 2013, 12:36:32 PM
to cs-s...@googlegroups.com, osh...@gmail.com
OK. :)

I don't really know why it's expected to leak, since the IronPython implementation doesn't have any issues.

Oleg Shilo

Jun 1, 2013, 3:57:48 AM
to cs-s...@googlegroups.com
What I meant is that I saw your code cloning the Document, and it was not clear whether the Batch (containing the Documents) is ever released. However, that is irrelevant now: I have prepared a test case for you, and I am confident that the problem is not in your code but in Mono.Evaluator (part of CS-Script).

The problem with CodeDom script compilation is that it is expensive at compilation time and is not immune to the CLR assembly unloading problem. The compiler-as-service (Mono.Evaluator) was expected to solve this problem, and it did. However, Mono.Evaluator introduced some ambiguity with respect to memory leaks.
The MemoryManagement.cs sample demonstrates various hosting scenarios implemented with CSScript.Evaluator (a very light API adapter/wrapper around Mono.Evaluator). All of these scenarios exhibited memory leaks except one: "Fixed script". And this scenario is the one that CSScript.Evaluator is recommended for.

To my surprise, your use case is consistent with the "Fixed script" scenario, and yet when it is profiled with the same technique as in the MemoryManagement.cs sample, it also leaks. This means that I will need to investigate in depth what is triggering the problem and log a report in the Mono defect tracking system (unless, of course, I discover some CS-Script flaw).

It also means that you cannot use CSScript.Evaluator for your hosting scenario.

However, this is not so bad, because in your case you do not need to execute an infinite number of constantly changing scripts. For you, CodeDom compilation is preferred anyway.

So I have prepared a comprehensive profiling test case for you: https://dl.dropboxusercontent.com/u/2192462/Support/Jens%20Ben..son/Jens%20Ben..son.7z

This sample demonstrates how to compensate for the poor performance of the default CodeDom compilation with appropriate caching. For your case I would suggest the "CodeDom_InMemoryCaching" sample. It has no memory leaks and is ~30 times faster than CSScript.Evaluator.

Cheers,
Oleg 


Jens Bengtsson

Jun 1, 2013, 4:37:55 AM
to cs-s...@googlegroups.com, cs-s...@googlegroups.com
Thanks for your very comprehensive answer! I'll check out your test on Monday and change the code accordingly, then I will get back with my results.

Jens Bengtsson


poser...@gmail.com

Jun 1, 2013, 2:02:12 PM
to cs-s...@googlegroups.com, poser...@gmail.com
Couldn't wait until Monday. :)

I don't know what kind of black magic that way of loading the script did, but the results are great. It went from memory leak and 148 docs / second to stable memory and 6137 docs / second. No errors on the documents and they all contain the field that the script is supposed to add.

http://pastebin.com/0ybiFMmt




Oleg Shilo

Jun 1, 2013, 9:28:35 PM
to cs-s...@googlegroups.com, poser...@gmail.com
:-)
I am glad I could help. 

And no, there is no black magic :). 

Unfortunately, because of the infamous CLR assembly loading problem (once loaded, an assembly cannot be unloaded), CS-Script cannot provide one hosting model that fits all scenarios. And the hope I had that Mono.Evaluator would solve this problem is fading now that you have discovered and reported your problem. Unfortunately, the Mono team is not responsive at all. For example, a question I asked them has gone unanswered for ~6 months, and the implication for CS-Script is "debugging scripts with Mono.Evaluator is not possible". It looks like I will need to add another disclaimer: "not having memory leaks with Mono.Evaluator is not possible".

The CodeDom approach, in contrast to Mono.Evaluator, is straightforward, simple and very much orthodox. But sadly, it is also slow. In the vast majority of cases this can be fully compensated for by a carefully chosen caching model, which is not necessarily a simple task. There is also a limit to what caching can do: it is useless if you need to host an infinite number of constantly changing scripts.

I tried to help the users by providing some guidance: http://www.csscript.net/help/Script_hosting_guideline_.html. But still it may not be enough.

In your case this is what made the performance so great:
- You are no longer loading the file but the script code.
- This allows the CS-Script runtime to "CRC" the code and keep the compiled script in memory.
- The next time you call LoadCode, the runtime looks up the CRC and, if it is found, returns the compiled script immediately without any compilation.

Because you have only a few scripts, the caching makes a dramatic difference.
Note that the described caching mechanism is enabled by default. Caching with LoadFile works similarly, but it has more overhead because of file operations.
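
The lookup described above can be sketched roughly as follows. This is only an illustration of the idea, not the CS-Script implementation: CompiledScriptCache, GetOrCompile and the compile delegate are hypothetical names, and a SHA-256 hash of the source text stands in for the "CRC" mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Illustrative sketch only: caches "compiled" scripts by a checksum of their source text.
// The compile delegate is a hypothetical stand-in for a real (expensive) compiler call.
public sealed class CompiledScriptCache<TScript>
{
    private readonly Dictionary<string, TScript> cache = new Dictionary<string, TScript>();
    private readonly Func<string, TScript> compile;

    // Number of times the expensive compile step actually ran.
    public int CompileCount { get; private set; }

    public CompiledScriptCache(Func<string, TScript> compile)
    {
        this.compile = compile;
    }

    // Returns the cached script for this source text, compiling it only on first use.
    public TScript GetOrCompile(string code)
    {
        string key = Checksum(code);
        TScript script;
        if (!cache.TryGetValue(key, out script))
        {
            script = compile(code);
            CompileCount++;
            cache[key] = script;
        }
        return script;
    }

    // SHA-256 is used here as an assumed substitute for the CRC mentioned in the thread.
    private static string Checksum(string code)
    {
        using (var sha = SHA256.Create())
        {
            return Convert.ToBase64String(sha.ComputeHash(Encoding.UTF8.GetBytes(code)));
        }
    }
}
```

With only a handful of distinct scripts, almost every call is a dictionary hit and the compile step runs once per script. If every call passed new source text, each call would miss and the dictionary would keep growing, which is the limit of caching mentioned earlier.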

Cheers,
Oleg

Jens Bengtsson

Jun 1, 2013, 10:50:37 PM
to Oleg Shilo, cs-s...@googlegroups.com
Yeah, I read your guide and tried to adjust for it, but as you noted there are several different scenarios and it was hard to know which one would fit. I'm very appreciative of your help! 


Jens Bengtsson
