timeout not respected sometimes


xof...@gmail.com

Sep 22, 2020, 7:56:16 PM
to asyncssh-users
I wonder if I'm doing this wrong. I have a timeout set like this:
    tasks = (asyncio.wait_for(run_client(inst, cmd, sshAgent=sshAgent,
                scpSrcFilePath=scpSrcFilePath, dlDirPath=dlDirPath, dlFileName=dlFileName,
                knownHostsOnly=knownHostsOnly),
        timeout=timeLimit)
        for inst in instances )
    results = await asyncio.gather(*tasks, return_exceptions=True)

In my async run_client coro, I do this after a connect():
                async with conn.create_process(cmd) as proc:
                    async for line in proc.stdout:
                        logger.info('stdout[%s] %s', iidAbbrev, line.strip() )
                        logResult( 'stdout', line.rstrip(), iid )

                    async for line in proc.stderr:
                        logger.info('stderr[%s] %s', iidAbbrev, line.strip() )
                        logResult( 'stderr', line.rstrip(), iid )
                await proc.wait_closed()

Usually, the timeout is respected, but on rare occasions the CancelledError exception does not occur until much later than it should.

Do I need to add explicit code to break out of those async loops?



Ron Frederick

Sep 22, 2020, 9:47:57 PM
to xof...@gmail.com, asyncssh-users
Hello,
No, you shouldn't need to, as long as you aren't explicitly catching (and not re-raising) the CancelledError.
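
For example, something along these lines (purely illustrative, with no names taken from your code) would show the exact symptom you describe, since on current Python versions asyncio.wait_for() waits for the cancelled task to actually finish before raising TimeoutError:

    import asyncio

    async def worker():
        try:
            await asyncio.sleep(10)       # stand-in for the remote command
        except asyncio.CancelledError:
            await asyncio.sleep(5)        # "cleanup" work that keeps running
            return 'finished anyway'      # cancellation swallowed, not re-raised

    async def main():
        try:
            # worker() is cancelled after 1 second, but since it swallows the
            # CancelledError, wait_for() can't raise until worker() finishes
            await asyncio.wait_for(worker(), timeout=1)
        except asyncio.TimeoutError:
            print('timed out, about 5 seconds late')

    asyncio.run(main())

Re-raising the CancelledError (or not catching it at all) makes the timeout fire on schedule.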

Are you actually doing the top-level SSH connect inside run_client as well, or is that happening outside of your call to wait_for() with the timeout? Is there any chance the longer delays could be happening when doing this initial connect?

If the initial connect is inside run_client, one thing to be careful about is opening too many simultaneous connections to the same server. AsyncSSH has no problem with that, but some servers will limit the number of connections that they allow to be performing authentication in parallel, and that can lead to problems. By default, I think the limit is something like 10 connections. You may need to space out the connects if you don’t want to hit that limit.
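
For OpenSSH servers, the relevant limit is the MaxStartups setting, which by default begins dropping new unauthenticated connections once 10 are in progress. One simple way to space out the connects (purely a sketch, reusing the run_client coroutine from this thread; MAX_PARALLEL is an arbitrary assumed value) is to gate the calls with an asyncio.Semaphore:

    import asyncio

    MAX_PARALLEL = 8   # assumed cap, chosen to stay under a ~10-connection limit

    async def run_all(instances, cmd, timeLimit, **kwargs):
        gate = asyncio.Semaphore(MAX_PARALLEL)

        async def limited(inst):
            async with gate:
                # apply the per-host timeout only after a slot is acquired,
                # so time spent waiting in line doesn't count against it
                return await asyncio.wait_for(
                    run_client(inst, cmd, **kwargs), timeout=timeLimit)

        return await asyncio.gather(*(limited(inst) for inst in instances),
                                    return_exceptions=True)

Note this holds a slot for the whole run_client call, which is coarser than limiting just the connect/auth phase, but it keeps the sketch simple.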

As an aside, you shouldn’t need the “await proc.wait_closed()” since you are using “async with” when creating the proc. However, that wouldn’t explain the problem you mention here.
--
Ron Frederick
ro...@timeheart.net



xof...@gmail.com

Sep 23, 2020, 2:02:00 PM
to asyncssh-users
I do the ssh connect inside run_client, and I do one connect for each remote server. I don't have full control over the remote servers, so it's fair to say crazy things may be happening out there. However, I know the problem is not happening at connect time, because I see many lines of stdout output in the logs before the timeout should have happened. In the bad case, I was addressing only about 10 remote servers.

The command I send includes an apt-get that produces very verbose output from the remote, so I wonder if some buffer gets full enough to cause problems. And yet, this works 90+% of the time.

Here is a fuller listing of my run_client code. The timeLimit was passed as 720, but over 32 minutes elapsed during the "async for line in proc.stdout" loop before the CancelledError occurred. 300+ lines of stdout came from the affected server, as well as a couple thousand lines of stdout from other servers.

async def run_client(inst, cmd, sshAgent=None, scpSrcFilePath=None, dlDirPath='.', 
        dlFileName=None, knownHostsOnly=False ):
    #logger.info( 'inst %s', inst)
    sshSpecs = inst['ssh']
    #logger.info( 'iid %s, ssh: %s', inst['instanceId'], inst['ssh'])
    host = sshSpecs['host']
    port = sshSpecs['port']
    user = sshSpecs['user']
    iid = inst['instanceId']
    iidAbbrev = iid[0:16]

    try:
        if knownHostsOnly:
            known_hosts = os.path.expanduser( '~/.ssh/known_hosts' )
        else:
            known_hosts = None
        logResult( 'operation', ['connect', host, port], iid )
        async with asyncssh.connect(host, port=port, username=user,
              keepalive_interval=15, keepalive_count_max=4,
              known_hosts=known_hosts, agent_path=sshAgent ) as conn:
            serverHostKey = conn.get_server_host_key()
            serverPubKey = serverHostKey.export_public_key(format_name='openssh')
            serverPubKeyStr = str(serverPubKey,'utf8')
            inst['returnedPubKey'] = serverPubKeyStr

            if scpSrcFilePath:
                logger.info( 'uploading %s to %s', scpSrcFilePath, iidAbbrev )
                await asyncssh.scp( scpSrcFilePath, conn, preserve=True, recurse=True )
                logResult( 'operation', ['upload', scpSrcFilePath], iid )
            proc = None
            # execute cmd on remote, if non-null cmd given
            if cmd:
                # substitute actual instanceId for '<<instanceId>>' in cmd
                cmd = cmd.replace( '<<instanceId>>', iid )
                logResult( 'operation', ['command', cmd], iid )
                async with conn.create_process(cmd) as proc:
                    async for line in proc.stdout:
                        logger.info('stdout[%s] %s', iidAbbrev, line.strip() )
                        logResult( 'stdout', line.rstrip(), iid )

                    async for line in proc.stderr:
                        logger.info('stderr[%s] %s', iidAbbrev, line.strip() )
                        logResult( 'stderr', line.rstrip(), iid )
                await proc.wait_closed()
                logResult( 'returncode', proc.returncode, iid )
                if proc.returncode is None:
                    logger.warning( 'returncode[%s] NONE', iidAbbrev )



Ron Frederick

Sep 23, 2020, 10:36:10 PM
to xof...@gmail.com, asyncssh-users
Nothing jumps out at me as being a problem here.

If you can reproduce this, have you tried enabling debug logging in AsyncSSH to see if there’s any activity showing up in the log at the time the timeout was supposed to go off, like anything showing it closing sessions or connections?
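
For example (a sketch using AsyncSSH's logging API; adjust the level to taste):

    import logging
    import asyncssh

    # route AsyncSSH's logger through the standard logging machinery
    logging.basicConfig(level=logging.DEBUG)
    asyncssh.set_log_level(logging.DEBUG)

    # debug level 2 adds per-connection/channel detail;
    # level 3 adds raw packet dumps
    asyncssh.set_debug_level(2)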