Made golang GRPC non-blocking across AWS multi-regions - High CPU usage impacting performance

Deepal Tennakoon

Nov 5, 2021, 7:12:03 AM
to grpc.io
Hi,

I have implemented Golang grpc on a blockchain for message passing. My client side and server side code looks as follows:

1. Client code
/////////////////////code that establishes the grpc connection - grpc dial//////////////////////////////
func (net DbftNetwork) EstablishConnection() {
if connected == false {
for _, p := range net.Peers {
client, err := rpc.Dial("tcp", p.Address)
fmt.Println("Making a dial")
if err != nil {
log.Printf("Dial Error : %v\n", err)
return
}
ClientsList = append(ClientsList, client)

}
connected = true
}
}

///////////////////////code that makes the grpc call to the server/////////////////////////////////

for _, p := range net.Peers {
go p.Call(method, msg, new(interface{}), ClientsList[count])
count++
}

func (rpcClient *DbftRpcClient) Call(method string, args messages.ConsensusMsg, reply *interface{}, Client *rpc.Client) {
Client.Go(method, args, reply, nil)
}

2. Server code

func (server *DbftRpcServer) Start() {

handler := rpc.NewServer()
handler.Register(server)
l, err := net.Listen("tcp", fmt.Sprintf("0.0.0.0:%d", server.port))
if err != nil {
log.Fatal("HTTP listen error: ", err)
}

go func() {
for {
cxn, err := l.Accept()
if err != nil {
log.Printf("Error Accept Request: %s\n", err)
return
}
go handler.ServeConn(cxn)
}
}()
}

Problem - I have made the client code non-blocking by adding a goroutine for each client request. I had to do this because, in AWS multi-region, the latency was high and the number of blocking grpc connections made the CPU stall and performance suffer. After adding goroutines and increasing the GOGC value, I was able to get it to work up to 50 peers across 10 AWS regions. But when I increase the number of peers further, the number of non-blocking goroutines increases further and starts stalling the CPU again. I was wondering if there is any solution, considering the above code, to make my gRPC scalable. Any optimizations that anyone can think of would be welcome.

Machines used - AWS c5.4xlarge, 16 vCPUs, 16GB RAM, Ubuntu 18, go version - 1.13.1 (cannot upgrade the version because the code base depends on it)

Thank you for your time.
Highly appreciated.
Deepal

dfa...@google.com

Dec 20, 2021, 5:09:42 PM
to grpc.io
On Friday, November 5, 2021 at 4:12:03 AM UTC-7 deepal.t...@gmail.com wrote:
Hi,

I have implemented Golang grpc on a blockchain for message passing. My client side and server side code looks as follows:

1. Client code
/////////////////////code that establishes the grpc connection - grpc dial//////////////////////////////
func (net DbftNetwork) EstablishConnection() {
if connected == false {
for _, p := range net.Peers {
client, err := rpc.Dial("tcp", p.Address)

This doesn't look like gRPC to me.
 
2. Server code

Neither does this.
 
Problem - I have made the client code non-blocking by adding a goroutine for each client request. I had to do this because, in AWS multi-region, the latency was high and the number of blocking grpc connections made the CPU stall and performance suffer. After adding goroutines and increasing the GOGC value, I was able to get it to work up to 50 peers across 10 AWS regions. But when I increase the number of peers further, the number of non-blocking goroutines increases further and starts stalling the CPU again. I was wondering if there is any solution, considering the above code, to make my gRPC scalable. Any optimizations that anyone can think of would be welcome.

Have you considered using a worker pool so you only have a fixed number of RPCs outstanding at any point in time?