Re: [ansible-project] Make parsing of result [Baby] JSON more resilient to garbage?

144 views
Skip to first unread message

Michael DeHaan

unread,
Aug 11, 2012, 8:53:39 AM8/11/12
to ansible...@googlegroups.com
Before we consider this, what kind of SSH daemon or setting is this that sends MOTD every single time?

I have some ideas, but it seems so non-standard that it's almost not worth including the code hack. But I could be persuaded...


On Saturday, August 11, 2012 at 6:32 AM, Dietmar Schinnerl wrote:

> Hi,
>
> unfortunately I have some hosts[1] which always send MOTD back to the client which -- of course -- breaks parsing of [Baby] JSON (and also the parsing of the temp path during "setup" which is empty because of that). Sure, one solution would be to remove MOTD or install "better" SSH daemon. (But both solutions I don't like.) But on the other hand I wonder if it would make sense to make ansible's parsing of stdout/-err more resilient to "garbage". E.g. at least ignore lines which start with '#'. (Or just use the last line of the stdout/-err for parsing of [Baby] JSON result?)
>
> A hackish solution for the case "line starts with '#'" you find at the end of this post (which works for me, YMMV).
>
> Anyways, maybe the code has some use for someone if it is not feasible to include this into ansible core (controllable by flag per host/task?). (I'm pretty sure there are some odd implications I don't see. Although I like the idea of ignoring such lines. But I'm a bit biased on that. :) )
>
> Thanks,
> Dietmar
>
> [1] Dropbear 0.48.1-1 does that.
>
> diff --git a/lib/ansible/runner/__init__.py b/lib/ansible/runner/__init__.py
> index 6a43a59..cb2c586 100644
> --- a/lib/ansible/runner/__init__.py
> +++ b/lib/ansible/runner/__init__.py
> @@ -673,7 +673,7 @@ class Runner(object):
> cmd += ' && echo %s' % basetmp
>
> result = self._low_level_exec_command(conn, cmd, None, sudoable=False)
> - return result.split("\n")[0].strip() + '/'
> + return utils.filter_comment_lines(result).split("\n")[0].strip() + '/'
>
> # *****************************************************
>
> diff --git a/lib/ansible/utils.py b/lib/ansible/utils.py
> index b8db010..2851ddb 100644
> --- a/lib/ansible/utils.py
> +++ b/lib/ansible/utils.py
> @@ -106,9 +106,11 @@ def json_loads(data):
>
> return json.loads(data)
>
> +
> def parse_json(data):
> ''' this version for module return data only '''
>
> + data = filter_comment_lines(data)
> try:
> return json.loads(data)
> except:
> @@ -353,4 +355,10 @@ def base_parser(constants=C, usage="", output_opts=False, runas_opts=False, asyn
> return parser
>
>
> +def filter_comment_lines(data):
> + filtered_data = ''
> + for line in data.splitlines():
> + if (not line.startswith('#')):
> + filtered_data += line + '\n'
> + return filtered_data



Dietmar Schinnerl

unread,
Aug 11, 2012, 9:04:46 AM8/11/12
to ansible...@googlegroups.com
Hi,

Before we consider this, what kind of SSH daemon or setting is this that sends MOTD every single time?

Dropbear 0.48.1-1 (Debian) 
(It seems that 0.45 doesn't have that issue.)

But AFAIK it can be configured to never send MOTD, but that needs recompilation. :(
And there seems to be a way to suppress MOTD by having ~/.hushlogin which didn't work for me.
And especially I want MOTD. (I know it's a corner case.)
 
Thanks,
Dietmar 

Michael DeHaan

unread,
Aug 11, 2012, 9:08:52 AM8/11/12
to ansible...@googlegroups.com
So this is an embedded device or is some version of Debian using it by default now? I know some tablets or things have used it.

My suggestion is we modify the parse_json function in utils to discard any lines up until the first line that includes the character '{', '[', ' or '='. I think this would also allow us to remove some other hacks in the code like trying to find "tcgetattr" error message noise, and would probably be good for ansible in general. Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.

This should throw any any MOTD contents out until we encounter any JSON or key=value "baby JSON" formats (which are in there to make
it easier to code ansible modules in Bash).

If that makes sense, I'd take a pull request to add this.

--Michael

Dietmar Schinnerl

unread,
Aug 11, 2012, 9:30:40 AM8/11/12
to ansible...@googlegroups.com
Hi Michael,


So this is an embedded device or is some version of Debian using it by default now?  I know some tablets or things have used it.

It's a server machine with Debian (no embedded device). And to my knowledge it's not (or was ever) default on Debian. The guy who did the administration some ages ago was a bit of a fan of Dropbear. (Well, IIRC it was more secure(?) than OpenSSH sshd since it's much smaller.)
 
My suggestion is we modify the parse_json function in utils to discard any lines up until the first line that includes the character '{', '[', ' or '='.  I think this would also allow us to remove some other hacks in the code like trying to find "tcgetattr" error message noise, and would probably be good for ansible in general.   Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.

This should throw any any MOTD contents out until we encounter any JSON or key=value "baby JSON" formats (which are in there to make
it easier to code ansible modules in Bash).

If that makes sense, I'd take a pull request to add this.

Yes, your suggestion makes sense and that's surely the better way to do it. (I wasn't sure if we can just throw away lines not matching the above characters since I didn't know all internals.)

> Since all modules are supposed to return a hash or key/value data, this only breaks if your MOTD contains an equals sign.

This would be solvable I think if we try to parse the next line if one fails or (which I think makes more sense) parse only the _last_ (non blank) line of the output. Since for my understanding the last (non blank) output should be the (module) result anyways. That would make it more resilient I think.
 
Thanks,
Dietmar 

Michael DeHaan

unread,
Aug 11, 2012, 9:35:14 AM8/11/12
to ansible...@googlegroups.com
We can't do this last part, because there is nothing illegal about multi-line JSON. Trying to throw away leading junk is ok though.

>
> Thanks,
> Dietmar



Dietmar Schinnerl

unread,
Aug 11, 2012, 9:38:17 AM8/11/12
to ansible...@googlegroups.com
Hi again,

I missed one point: 

It's not only parse_json but also _make_tmp_path (runner/__init__py) where this filtering needs to be done. And at this point lines starting with '/' are allowed (or even essential). (But we could here also apply the rule: last non-blank line only).

Thanks,
Dietmar

Dietmar Schinnerl

unread,
Aug 11, 2012, 9:41:37 AM8/11/12
to ansible...@googlegroups.com
Hi Michael,

> > This would be solvable I think if we try to parse the next line if one fails or (which I think makes more sense) 
> > parse only the _last_ (non blank) line of the output. Since for my understanding the last (non blank)
> > output should be the (module) result anyways. That would make it more resilient I think. 

> We can't do this last part, because there is nothing illegal about multi-line JSON.    Trying to throw away leading junk is ok though. 

Understood, thank you for the clarification.

Let's see if I can make your wish come true about pull request.

Thanks,
Dietmar
 

Michael DeHaan

unread,
Aug 11, 2012, 9:45:25 AM8/11/12
to ansible...@googlegroups.com
This would work there, yes.

The remote md5 function will also need modifications. I would recommend adding a "last_non_blank_line" function to utils.py to do this.
>
> Thanks,
> Dietmar



Dietmar Schinnerl

unread,
Aug 11, 2012, 10:31:50 AM8/11/12
to ansible...@googlegroups.com
Hi again,

for anyone who needs this: It can be found at https://github.com/SleeplessAnnoyedNerd/ansible.
So far it seems to work pretty well.

Not sure if I like the current implementation... (well, it's not that bad, but maybe it can be improved, time will show).
Pull request comes as soon as I'm confident about the implementation.

Thanks,
Dietmar 

Reply all
Reply to author
Forward
0 new messages