Separating the core SSH implementation from any specific I/O or concurrency models


Nicholas Chammas

Feb 5, 2016, 8:55:55 AM
to asyncssh-dev

Hey Ron,

There was an interesting discussion on Reddit recently about asyncio and how many projects are reimplementing various stacks on top of it. One of the core developers of Requests chimed in with an interesting comment.

The money quote from that comment, to me, was this:

What we as a community need to be doing is to write synchronous stacks that don’t do I/O, and then write thin wrapper libraries around them that work well in the specific concurrency and I/O model being used.

That got me thinking: Does it make sense to somehow take AsyncSSH and separate the core SSH implementation from the parts that depend on asyncio for I/O? Perhaps there could be a core sshlib library that, like Cory described, doesn’t do any I/O. AsyncSSH would then be a wrapper around that library that uses asyncio as its concurrency model and I/O interface.

If I understood Cory’s comment correctly, that would open the door for other wrappers around this new sshlib core that use different I/O or concurrency models. Perhaps then there could be a wrapper that uses curio, and another one that offers a strictly synchronous interface.

What are your thoughts on this? I remember you mentioned wanting to separate parts of the core implementation from asyncio the last time we discussed curio.

Nicholas Chammas

Feb 5, 2016, 12:38:27 PM
to asyncssh-dev
There is an interesting discussion on Reddit about that comment by Cory:


One of the interesting links from there is to this:


Basically, examples of the same core HTTP/2 server using different I/O frameworks.

Perhaps AsyncSSH can evolve into something like this, with an I/O-agnostic core and separate I/O integrations.

Ron Frederick

Feb 5, 2016, 11:49:07 PM
to Nicholas Chammas, asyncssh-dev
Hi Nick,
I think akoumjian’s comment a few hours ago best sums up my opinion on this:

This likely only works in scenarios where the authors of the non-I/O portions are also authoring the various I/O models to work with them. 

The problem is that in order to provide an API for the non-I/O portions of your code, it helps tremendously to know what the possible I/O or concurrency models are going to be! 80% of the code ends up being the I/O portion anyway, and will be opinionated and distinct from its sync/async counterpart.

The result is that you don't end up benefiting as much as you might expect from reusing non-I/O modules. What does make sense about this is the idea that the module authors should be authoring the sync/async interfaces simultaneously.

I could definitely imagine separating out the I/O and non-I/O portions of the code to allow for multiple I/O models to be shimmed in place underneath AsyncSSH. In fact, the first version of AsyncSSH was written on top of asyncore (before “tulip”, which later became asyncio, was available). It was 100% callback-based with no coroutines, and my initial version on top of asyncio used only a few coroutines but still had the majority of the I/O event processing handled through non-blocking callbacks. Some of that code is still in AsyncSSH today, but over time I moved more and more of the code to take advantage of coroutines and other asyncio design principles.

There’s another fundamental point here as well. The library needs to decide what kind of API it wants to provide to its caller. One of the things I chose to do with AsyncSSH was to mirror the asyncio API as closely as I could. So, someone familiar with how to open a TCP connection with asyncio would be able to use almost exactly the same style of code to open an SSH connection with AsyncSSH, or a channel within that connection. If I were to provide a version of AsyncSSH that worked on another framework like curio, I would probably want to think about providing a curio-like version of the AsyncSSH API instead of (or in addition to) the asyncio-like version.

There are a number of other insightful comments in the thread, such as the points about dealing with incomplete input. The last thing you want to do is to complicate the non-I/O code by making it manually keep track of things like parsing state so that it can pick up where it left off each time a new block of data is fed to it. You can try to use things like coroutines to maintain that state, but building a library that can take data chunks of arbitrary size at any time and hold onto that data until it needs it is much more complicated than writing a library that controls when it reads/writes data and how many bytes it wants to send or receive each time. You can abstract that interface so there are several different ways to perform the I/O, but you generally want the non-I/O code to still control what data (and/or how much data) is being input or output.
-- 
Ron Frederick



Nicholas Chammas

Feb 6, 2016, 11:13:55 AM
to Ron Frederick, asyncssh-dev
> There are a number of other insightful comments in the thread, such as the points about dealing with incomplete input. The last thing you want to do is to complicate the non-I/O code by making it manually keep track of things like parsing state so that it can pick up where it left off each time a new block of data is fed to it.

I've never implemented a protocol, but I can definitely see how this would be the case.

Thanks for sharing your thoughts! Your comments here as well as the discussion on Reddit were very enlightening.

Nick