Importing protobuf data in Python

Daniel

Jun 15, 2022, 5:59:14 PM
to Protocol Buffers
I am having the following problem:

I have implemented a protobuf message to send data from a Go client to a Python server over a gRPC stream. The data needs to be loaded quickly on the Python server and processed. It is a composite message whose largest part is a repeated sint32 field, defined as: repeated sint32 array = 4 [packed=true];

This field contains around 18,000,000 entries, and loading it with the line data = np.array(array_obj, dtype=np.int8) takes around 1.5 seconds. I have tried alternatives such as first reading the data into a list (which is no faster) and passing copy=False to NumPy. I just want to access the memory where these values are stored.
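Here is a minimal, self-contained sketch of what I am measuring. My real message is generated from an internal .proto, so as a stand-in I am using FileDescriptorProto.public_dependency, which is a plain repeated int32 field and should go through the same repeated scalar container code path:

import time
import numpy as np
from google.protobuf import descriptor_pb2

# Stand-in for my real field: public_dependency is a plain
# `repeated int32` field on a message that ships with protobuf.
msg = descriptor_pb2.FileDescriptorProto()
msg.public_dependency.extend([0] * 18_000_000)

start = time.perf_counter()
data = np.array(msg.public_dependency, dtype=np.int8)  # my current line
print(f"np.array:    {time.perf_counter() - start:.2f} s")

start = time.perf_counter()
data = np.fromiter(msg.public_dependency, dtype=np.int8,
                   count=len(msg.public_dependency))
print(f"np.fromiter: {time.perf_counter() - start:.2f} s")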

I would like to try something such as Numba or Cython, but both of those would require me to reimplement the complete container type defined in https://github.com/protocolbuffers/protobuf/blob/main/python/google/protobuf/internal/containers.py. Is there some way this process could be accelerated?
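As far as I can tell, the repeated scalar container exposes the sequence protocol but not the buffer protocol, which seems to be why NumPy (and presumably Cython) cannot take the underlying memory directly. Again using the stand-in field from above:

from google.protobuf import descriptor_pb2

msg = descriptor_pb2.FileDescriptorProto()
msg.public_dependency.extend([1, 2, 3])

# No buffer protocol on the repeated scalar container, so every
# consumer has to fall back to element-by-element Python access.
try:
    memoryview(msg.public_dependency)
except TypeError as err:
    print("no buffer protocol:", err)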

Thankful for any help

Jie Luo

Aug 1, 2022, 3:28:34 PM
to Protocol Buffers
Are you using the protobuf Python C++ extension or pure Python?
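You can check which backend your installation uses like this (note that api_implementation lives under internal, so treat it as a debugging aid rather than a stable API):

from google.protobuf.internal import api_implementation

# Prints 'python', 'cpp', or 'upb' depending on the protobuf
# version and how the package was installed.
print(api_implementation.Type())

You can also force a backend by setting the PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable (e.g. to cpp or python) before importing protobuf.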

We do have a design for "Optimized Python NumPy Bindings for Repeated Fields" in the C++ extension, but the implementation broke too many users, so it was blocked. I will check with the owner to see if we can go forward.