support RDMA in seastar network stack

Honggang(Joseph) Yang

<eagle.rtlinux@gmail.com>
May 18, 2020, 10:30:46 PM
to seastar-dev
Hello everyone,

After checking the Seastar docs and source code, I found there is no RDMA support yet.
Are there plans to support RDMA in the Seastar network stack,
or is this the wrong direction to take?

Regards,

Joseph

Avi Kivity

<avi@scylladb.com>
May 19, 2020, 1:24:45 AM
to Honggang(Joseph) Yang, seastar-dev
It's a good direction, but I don't know of anyone who plans to do it.

Certainly RDMA fits with the Seastar philosophy of asynchronous operations.
--
You received this message because you are subscribed to the Google Groups "seastar-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to seastar-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/seastar-dev/46d51670-9103-4e14-acf5-2fe050ef1368%40googlegroups.com.


Joren Wu

<jorenwu@gmail.com>
May 22, 2020, 12:41:58 AM
to seastar-dev
We want to add a rdma_stack inherited from the network_stack. Is that OK?


Avi Kivity

<avi@scylladb.com>
May 23, 2020, 10:31:28 AM
to Joren Wu, seastar-dev
Depends on what you want. network_stack's purpose is to provide multiple implementations of a network stack.

If you want to implement TCP/IP using RDMA, then inheriting from network_stack is the right thing (and we'll have three stacks).

If you want to expose RDMA APIs to Seastar applications, it should be added directly to the reactor.
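
As a rough illustration of the first option (a new stack behind the existing socket-style interface), here is a minimal, self-contained C++ sketch. It does not use Seastar's real network_stack class, whose interface is different and larger. toy_network_stack, toy_stack_registry, make_registry and the other names are hypothetical and exist only to show how several implementations can sit behind one abstract interface and be selected by name.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-in for an abstract network stack.
// This is NOT Seastar's network_stack; it only models the idea.
class toy_network_stack {
public:
    virtual ~toy_network_stack() = default;
    // Describes which transport carries the socket traffic.
    virtual std::string transport() const = 0;
};

class toy_posix_stack : public toy_network_stack {
public:
    std::string transport() const override { return "kernel sockets"; }
};

class toy_native_stack : public toy_network_stack {
public:
    std::string transport() const override { return "userspace TCP/IP"; }
};

// The third stack discussed in the thread: same interface, RDMA underneath.
class toy_rdma_stack : public toy_network_stack {
public:
    std::string transport() const override { return "RDMA verbs"; }
};

// Name -> factory map, so an application can pick a stack at startup
// without changing any code that talks to toy_network_stack.
class toy_stack_registry {
    using factory = std::function<std::unique_ptr<toy_network_stack>()>;
    std::map<std::string, factory> _factories;
public:
    void add(std::string name, factory f) {
        assert(f != nullptr);  // an empty factory could never create a stack
        _factories[std::move(name)] = std::move(f);
    }

    std::unique_ptr<toy_network_stack> create(const std::string& name) const {
        auto it = _factories.find(name);
        if (it == _factories.end()) {
            throw std::invalid_argument("unknown network stack: " + name);
        }
        return it->second();
    }
};

// Registers the three toy stacks under illustrative names.
inline toy_stack_registry make_registry() {
    toy_stack_registry r;
    r.add("posix", []() -> std::unique_ptr<toy_network_stack> {
        return std::make_unique<toy_posix_stack>();
    });
    r.add("native", []() -> std::unique_ptr<toy_network_stack> {
        return std::make_unique<toy_native_stack>();
    });
    r.add("rdma", []() -> std::unique_ptr<toy_network_stack> {
        return std::make_unique<toy_rdma_stack>();
    });
    return r;
}
```

In this model, code written against toy_network_stack keeps working when make_registry().create("rdma") is substituted for create("posix"). The second option (exposing RDMA operations directly) would not fit this shape, since applications would call new RDMA-specific functions instead of the shared interface.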

Joren Wu

<jorenwu@gmail.com>
May 25, 2020, 1:41:11 AM
to seastar-dev
Thanks for your suggestion. We want to hide the details of RDMA verbs behind the Seastar API, so that this feature can be a common facility and support more applications.
We will dig deeper into it and start implementing this feature.

Avi Kivity

<avi@scylladb.com>
May 25, 2020, 3:33:48 AM
to Joren Wu, seastar-dev
From this I understand you want to expose RDMA directly, not IP-over-IB. Please post patches and I'll be happy to review.

Avi Kivity

<avi@scylladb.com>
Jun 3, 2020, 5:18:21 AM
to Joren Wu, seastar-dev

I had a look at libfabric, it looks like it would be easy to integrate with Seastar.

Joren Wu

<jorenwu@gmail.com>
Jun 10, 2020, 6:04:04 AM
to seastar-dev
I reviewed the UCX library; it is too heavy for Seastar. I think OFI (libfabric) is also too heavy, because it duplicates a lot of functionality: it defines interfaces similar to Seastar's API_V2, and it has its own POSIX stack. Let's not make this complicated and hard to operate in a production environment.

Avi Kivity

<avi@scylladb.com>
Jun 10, 2020, 6:18:10 AM
to Joren Wu, seastar-dev

Can you explain? I'm thinking about wrapping fi_read() with future<> seastar::rdma_read(), and adding a poller that calls fi_cq_read() and completes any futures returned from seastar::rdma_read(). It looks fairly simple.
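
The idea described above can be modeled in self-contained C++ as follows. This is a minimal sketch under stated assumptions, not libfabric or Seastar code: mock_cq stands in for the completion queue that fi_cq_read() would drain, toy_future stands in for seastar::future<>, and rdma_poller is a hypothetical name. No real fi_read() or fi_cq_read() call is made.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <unordered_map>

// Mock completion queue (hypothetical). It plays the role of the queue
// that fi_cq_read() would drain in a libfabric-based integration.
struct mock_cq {
    std::deque<std::uint64_t> in_flight;  // posted, not yet finished
    std::deque<std::uint64_t> completed;  // finished, not yet reaped

    void post(std::uint64_t id) { in_flight.push_back(id); }

    // Simulates the hardware finishing the oldest posted operation.
    void hardware_completes_one() {
        if (in_flight.empty()) {
            return;
        }
        completed.push_back(in_flight.front());
        in_flight.pop_front();
    }

    // Non-blocking reap: returns true and sets `id` if a completion exists.
    bool try_read(std::uint64_t& id) {
        if (completed.empty()) {
            return false;
        }
        id = completed.front();
        completed.pop_front();
        return true;
    }
};

// Toy stand-in for seastar::future<>: a shared flag the poller flips.
struct toy_future {
    std::shared_ptr<bool> state = std::make_shared<bool>(false);
    bool available() const { return *state; }
};

class rdma_poller {
    mock_cq& _cq;
    std::unordered_map<std::uint64_t, std::shared_ptr<bool>> _pending;
    std::uint64_t _next_id = 0;
public:
    explicit rdma_poller(mock_cq& cq) : _cq(cq) {}

    // Posts a read and returns a future that poll() resolves later.
    toy_future rdma_read() {
        const std::uint64_t id = _next_id++;
        toy_future fut;
        _pending.emplace(id, fut.state);
        _cq.post(id);
        return fut;
    }

    // One poller pass: drain the queue and complete the matching futures.
    // Returns how many operations were completed in this pass.
    std::size_t poll() {
        std::size_t completed = 0;
        std::uint64_t id = 0;
        while (_cq.try_read(id)) {
            auto it = _pending.find(id);
            assert(it != _pending.end());  // every completion matches a posted read
            *it->second = true;
            _pending.erase(it);
            ++completed;
        }
        return completed;
    }
};
```

In this model, each call to rdma_read() returns a toy_future that stays unavailable until the mock hardware reports the completion and poll() has run. A real integration would presumably have the reactor run the poll step on every loop iteration, and would carry the buffer, remote address and key that an actual RDMA read needs.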


Joren Wu

<jorenwu@gmail.com>
Jun 10, 2020, 10:13:56 AM
to seastar-dev
Let me explain the background. We are optimizing the performance of a piece of software that is based on Seastar and uses the API_V2 semantics for network I/O. We don't want to change that code, so we want to add an RDMA feature to Seastar as shown below:


Why add a new stack to Seastar? Because RDMA sends and receives application buffers, not raw Ethernet packets, so it cannot be added underneath the native stack the way DPDK is.

Why did I reject libfabric at first? First, my intuition is that it would introduce more stability risks and require more communication with the OFA community to extend or optimize the library. Second, I think the library's FI API definitions overlap with API_V2 in Seastar, so why not use ibverbs directly in Seastar?

Right now, we are working on this feature as in the diagram above, using native ibverbs. We are all newcomers to the Seastar world, and we have already solved some puzzles. However, I believe there are other obstacles we haven't hit yet; if we do, we will ask you for help.

Looking forward to your feedback on this.


Avi Kivity

<avi@scylladb.com>
Jun 10, 2020, 11:04:44 AM
to Joren Wu, seastar-dev

To be clear, you want new connected_socket_impl and server_socket_impl based on libibverbs? Not a completely new rdma API?


I'll be happy to review patches or provide guidance.


wrt. ibverbs or libfabric, I have no experience with either of them. I saw that libfabric supports infiniband as a provider so I assumed it does API translation and calls verbs immediately. I also saw that libfabric has a shared memory provider which can be useful for testing. If rdma-core provides an equivalent, we can use that.


btw, do not build on api_v2, that is used for deprecated APIs. Build on the main seastar:: namespace.


Joren Wu

<jorenwu@gmail.com>
Jun 10, 2020, 10:53:41 PM
to seastar-dev
> To be clear, you want new connected_socket_impl and server_socket_impl based on libibverbs? Not a completely new rdma API?

Yes, not a new RDMA API; that may be added in the future. Existing programs just need to be rebuilt with the RDMA stack, which is derived from these socket classes.

> btw, do not build on api_v2, that is used for deprecated APIs. Build on the main seastar:: namespace.

Thanks for the reminder. We are writing this code in the seastar::net namespace; is that OK?

Avi Kivity

<avi@scylladb.com>
Jun 11, 2020, 3:18:39 AM
to Joren Wu, seastar-dev
Ok. seastar::net is a good place for the new stack.

黎明犇

<ben741863140@gmail.com>
Mar 8, 2021, 3:13:28 AM
to seastar-dev
Is there any progress on RDMA support in Seastar? I'm looking into RDMA in Seastar, and I'd be glad to help build this.

Avi Kivity

<avi@scylladb.com>
Mar 8, 2021, 4:16:05 AM
to 黎明犇, seastar-dev
No, we were waiting for you. I'm happy to accept patches for both an RDMA API and for a traditional socket based network stack using libibverbs.

ka mof

<mofhejia@gmail.com>
Apr 14, 2022, 7:29:30 AM
to seastar-dev

Is there any progress for seastar supporting RDMA?

Avi Kivity

<avi@scylladb.com>
Apr 14, 2022, 7:33:51 AM
to ka mof, seastar-dev

I did not see any patches.

ka mof

<mofhejia@gmail.com>
Apr 17, 2022, 6:52:52 AM
to seastar-dev
Is there any roadmap?

Avi Kivity

<avi@scylladb.com>
Apr 17, 2022, 7:01:26 AM
to ka mof, seastar-dev

The roadmap is that when patches are posted, they will be reviewed and merged.


If you are interested in Seastar RDMA, I recommend trying to write the required patches.
