Predicting the slowdown of SGX‏


Avi

Mar 27, 2021, 5:19:51 AM3/27/21
to sup...@graphene-project.io
Hi Graphene team,

My name is Avi and I'm an M.Sc. student in Computer Science at the Open University of Israel. I'm doing a thesis on predicting the slowdown of running unmodified applications in a TEE (compared to native), specifically in SGX.
I am planning to build a dataset, machine learning models and release everything as Open Source.

I would be very glad if you could answer these questions:
1) I am not an expert in TEEs. Do you think this is a good research question, i.e. one that is interesting and not too difficult to achieve (predicting the slowdown, and maybe the runtime, of an application on SGX/SEV)? Do you have a suggestion for a more interesting or more difficult machine learning research direction in this domain?
2) Do you have any recommendations on how to do it using Graphene, and how to get the right features for each application (e.g., by recording specific system calls)?

Thanks,
Avi

Chia-Che Tsai

Mar 29, 2021, 11:19:13 AM3/29/21
to Avi, sup...@graphene-project.io
Hi, Avi,

Personally, I think that's a super intriguing problem, maybe because I am an academic. Performance prediction has been proposed in many domains, and I think it can be particularly useful for the cloud. Suppose you need to replicate an enclaved application to one or multiple hosts: how soon do you expect one of the replicas to complete? When can you expect the resources to be reclaimable? These are all useful questions from a scalability (elasticity) or fault-tolerance perspective.

However, looking at it from a pragmatic side, I would say that it can be difficult to come up with an accurate prediction for a nontrivial workload. The main issue is that there are too many factors to consider when it comes to enclave performance. You have the nondeterminism inside the program, like user inputs, randomness, and thread interleaving. Outside the enclaves you have microarchitecture-level interference, system call latencies, and scheduling and paging decisions. Whatever the program, the difficulty of predicting it inside the enclave(s) will be at least as high as predicting it outside the enclave(s). You also have the problem that you may have to re-train the model frequently to adapt to system changes. Even if all of that is possible, you need to ask yourself whether it's worthwhile to spend CPU/GPU/TPU cycles on these kinds of predictions. It's a lot.

Anyhow, if you really want to do this, my suggestion is to set up a testbed where you can automatically run Graphene against various inputs and system parameters to generate the ground truth. You should also collect samples of performance counters from the CPUs. In addition, using fuzzing or symbolic execution can help you increase your coverage and reduce the chance of overfitting. That's pretty much all I can think of right now.
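A minimal sketch of such a testbed loop, in case it helps you get started. The workload list and the `graphene-sgx` command lines it would contain are hypothetical placeholders (here a no-op Python process stands in); a real harness would also capture performance counters, e.g. by wrapping each run in `perf stat`:

```python
import subprocess
import sys
import time

def measure(cmd, runs=3):
    """Mean wall-clock runtime of `cmd` over `runs` executions."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)

def sweep(workloads, runs=3):
    """Collect (label, mean_runtime) ground-truth samples for each workload."""
    return [(label, measure(cmd, runs)) for label, cmd in workloads]

if __name__ == "__main__":
    # Hypothetical: each entry would be a `graphene-sgx <app> <input>`
    # invocation with a different input or manifest; a no-op stands in here.
    workloads = [("noop", [sys.executable, "-c", "pass"])]
    for label, runtime in sweep(workloads, runs=1):
        print(f"{label}: {runtime:.3f}s")
```

Pairing each sample with the native (non-enclave) runtime of the same command would give you the slowdown labels directly.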

Good luck on your research path. If you are interested, please feel free to reach out to us, or to me personally, for any follow-up questions or discussion.

Thanks!
Chia-Che 

--
You received this message because you are subscribed to the Google Groups "Graphene Support Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to graphene-suppo...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/graphene-support/CAN5t9ZDLgnnAvD%3Dc-jy4nUA8Lv1rr%2BumXwvX4yFXGZPcyt60hQ%40mail.gmail.com.


--
Chia-Che Tsai
Assistant Professor, CSE, Texas A&M University

Avi levin

Jun 20, 2021, 3:56:15 PM6/20/21
to Graphene Support Mailing List
Thanks Chia-Che!

I created a testbed for Graphene with different workloads.
I decided to go in a different direction: to build a smart auto-tuner, like SGXTuner.
The main idea is to find optimal parameters for running Graphene with different databases/applications (in a minimum amount of time/number of iterations).
I am using the YCSB benchmark with different workloads, and I am trying to improve throughput and latency.
My hope is that this tool will help you and others find better parameters.
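As a baseline before anything smarter, the tuner can be sketched as an exhaustive search over candidate (sgx.thread_num, sgx.rpc_thread_num) pairs. Here `measure_throughput` is a hypothetical callback that would run one YCSB iteration under Graphene and return ops/sec; a synthetic function stands in below:

```python
from itertools import product

def grid_search(thread_nums, rpc_thread_nums, measure_throughput):
    """Try every (thread_num, rpc_thread_num) pair and keep the best."""
    best_params, best_tp = None, float("-inf")
    for params in product(thread_nums, rpc_thread_nums):
        tp = measure_throughput(params)  # e.g. one YCSB run under Graphene
        if tp > best_tp:
            best_params, best_tp = params, tp
    return best_params, best_tp

if __name__ == "__main__":
    # Synthetic stand-in whose throughput peaks at (8, 8).
    fake = lambda p: -abs(p[0] - 8) - abs(p[1] - 8)
    print(grid_search([4, 8, 16], [4, 8, 16], fake))  # ((8, 8), 0)
```

Since each measurement is a full benchmark run, replacing the grid with a budget-limited search (hill climbing, Bayesian optimization) is the natural next step to cut the number of iterations.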

The question now is: which configuration options are worth tuning in Graphene?
Using the Graphene docs I found two that are relevant:
1) sgx.thread_num = [NUM]
2) sgx.rpc_thread_num = [NUM]
We could say it's effectively one (because Graphene recommends setting sgx.rpc_thread_num = sgx.thread_num).
Are there other relevant parameters?
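For reference, a minimal manifest fragment setting both options (a sketch; the value 8 is an arbitrary example, and only the option names are taken from the Graphene docs):

```toml
# Graphene manifest fragment: enclave worker threads and exitless RPC threads
sgx.thread_num = 8
sgx.rpc_thread_num = 8   # docs recommend matching sgx.thread_num
```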

Do you think it's worth trying to tune glibc configurations as well?
For example, tuning specific glibc parameters: https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior
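If glibc tuning turns out to matter, those knobs are plain environment variables, so they could be swept by the tuner through the manifest as well (a sketch; the variable names are standard glibc malloc tunables, but the values are arbitrary examples, not recommendations):

```toml
# Pass glibc malloc tunables into the enclave via the manifest environment
loader.env.MALLOC_ARENA_MAX = "2"
loader.env.MALLOC_MMAP_THRESHOLD_ = "131072"
```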

Miguel Guirao

Oct 26, 2021, 2:24:29 AM10/26/21
to Graphene Support Mailing List
Avi, 

I am using Graphene-SGX to run ML models (for inference only) using TF Lite. Right now I am focused on model reduction, but your point of view is also interesting. I would like to collaborate with you, since our goals are kind of similar.
