Struggling with basic Rust app hanging issue that only exists when running workload on Graphene


Chris Cassano

Jul 29, 2021, 7:24:56 PM
to sup...@graphene-project.io
I am seeing some weird issues with Rust + Graphene that I think are related to locking or race conditions of some kind, where the Rust process ends up hanging forever.

This is a link to the proof of concept repo to reproduce: https://github.com/glitch003/graphene_rust_poor_performance_poc

The repo is designed to be run using the .sh scripts, which spawn 3 graphene-sgx processes that each run the binary "poc". This mimics a distributed system with 3 "nodes".

Inside src/main.rs is an HTTP server with 2 endpoints. The endpoint at "/" simply returns the string "Hello, World". The endpoint at "/test" hits the "/" endpoint of all 3 servers (including itself) 3 times.

Running curl.sh calls the "/test" endpoint on all the servers. Without Graphene, this works fine and completes instantly. Under graphene-sgx, the behavior is mixed: sometimes it all works fine; sometimes it works but takes 5 seconds to run (it should be nearly instantaneous); sometimes it works on 2 of the 3 servers and the 3rd one hangs forever; sometimes all 3 servers hang forever.
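
These outcomes can be told apart mechanically by timing each request against a deadline. Below is a minimal std-only sketch of that idea; the `Outcome` enum, the `classify` helper and the thresholds are hypothetical and not part of the repo, and the sleeps in `main` stand in for real requests to "/test".

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum Outcome {
    Fast(Duration), // finished before `slow_after`
    Slow(Duration), // finished, but took at least `slow_after`
    Hung,           // no completion signal within `give_up_after`
}

/// Run `work` on its own thread and classify how long it took.
/// A panic inside `work` drops the sender, which is also reported as `Hung`.
fn classify<F>(work: F, slow_after: Duration, give_up_after: Duration) -> Outcome
where
    F: FnOnce() + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let started = Instant::now();
    thread::spawn(move || {
        work();
        let _ = tx.send(()); // receiver may already have given up
    });
    match rx.recv_timeout(give_up_after) {
        Ok(()) => {
            let elapsed = started.elapsed();
            if elapsed < slow_after {
                Outcome::Fast(elapsed)
            } else {
                Outcome::Slow(elapsed)
            }
        }
        Err(_) => Outcome::Hung,
    }
}

fn main() {
    let slow_after = Duration::from_millis(200);
    let give_up_after = Duration::from_millis(800);

    // Placeholders for "call /test on one server".
    let instant = classify(|| {}, slow_after, give_up_after);
    let delayed = classify(
        || thread::sleep(Duration::from_millis(300)),
        slow_after,
        give_up_after,
    );
    let stuck = classify(
        || thread::sleep(Duration::from_secs(5)),
        slow_after,
        give_up_after,
    );

    println!("{:?}\n{:?}\n{:?}", instant, delayed, stuck);
}
```

Logging the outcome per server and per run would show whether the 5-second delay and the permanent hang always hit the same node or move around between runs.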

I am running the 1.2-rc1 release of Graphene on Ubuntu 20.04.

Things I tried that did not work:
* changing the "rpc_thread_num" variable to match "thread_num" to enable exitless mode.
* running strace on the graphene-sgx process (I may not have understood how to use the output to find the problem).
* checking htop to see whether the CPU is doing work when the process hangs. It is not.
* setting "insecure__allow_eventfd" to false, which stops the app from running at all.

Any ideas what might be causing this, or what I should investigate next?  How can I debug where the "hang" is happening?

Chris Cassano

Aug 2, 2021, 1:12:41 PM
to sup...@graphene-project.io
I am willing to pay for help with this, if anyone can help or knows someone who can.  I added a little more info to the repo so that it should be enough to get anyone fully up to speed on the problem: https://github.com/glitch003/graphene_rust_poor_performance_poc

