stream a file to shell script


Dominique MARTINET

Mar 3, 2021, 4:11:04 AM
to swup...@googlegroups.com
Hi,
(thanks for swupdate!)

Some background first:
I'd like to use swupdate to update podman container images from data
inside the .swu (this comes as a tar file that can be piped to 'podman
load' for example)
The use case would be updating devices without internet, or just a way
of verifying images without going through the trouble of setting up a
registry.


As far as I can see, I can:
- use the rawfile handler to store the update tarball, then a postinstall
script to run the load command.
This has the advantage of being simple, but it also requires storing the
tarball as a file, which can be quite an overhead for large images.

- write a new handler that spawns the podman import process, stream the
data to its stdin, and collect the return code like the shell script
handler (probably extend run_system_cmd -- by the way calling execl like
that rubs me wrong, if you're ok with it I'll submit something to change
the "cmd" argument to argv format regardless of the outcome of this
discussion...)
I can't think of something else that would be streamed data right now
but I think that could be quite useful in general; for users who can
rely on a minimum of userspace tools actually most handlers could be
trivially reimplemented like that (piping to dd, tar, whatever)
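
A minimal sketch of the second option, assuming a POSIX system. The helper
name `pipe_to_command()` is hypothetical and not part of swupdate; it takes
the command in argv form, streams a buffer to the child's stdin and returns
the child's exit code. A real handler would read the data from the update
stream rather than from a buffer in memory.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not swupdate API): run argv[0] with argv, write
 * buf to its stdin, and return its exit code, or -1 on error. */
int pipe_to_command(char *const argv[], const char *buf, size_t len)
{
    int fds[2];
    int status;
    pid_t pid;

    assert(argv && argv[0]);
    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* child: the read end of the pipe becomes stdin */
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);  /* argv form: no shell parsing involved */
        _exit(127);             /* only reached if exec failed */
    }

    /* parent: note that this changes the process-wide SIGPIPE handling,
     * so that a child exiting early yields EPIPE instead of killing us */
    close(fds[0]);
    signal(SIGPIPE, SIG_IGN);
    while (len > 0) {
        ssize_t n = write(fds[1], buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        buf += n;
        len -= (size_t)n;
    }
    close(fds[1]);              /* EOF for the child */

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With an argv such as `{ "podman", "load", NULL }` this would correspond to
piping the tarball into 'podman load'. The sketch ignores the child's
stdout and stderr entirely.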


From what I've seen on the list archives and documentation however I
don't think that'll get welcomed -- swupdate obviously wants to be as
much self-contained as possible.
In my case the base OS is read-only and only container images get
updated so I can trust the base - or rather if anything in it got
overwritten there's high odds the whole device can be considered done
for, so piping image data to a system command would be a great way to
keep development/test cost low and be useable as is.


With that being said, what do you suggest?
I can obviously make a handler specific to podman if that has more
chances of being accepted, even if it's a bit of a shame for me, but
would there be any interest for one?

Or did I miss another way of doing this that wouldn't require an extra
temporary buffer? The remote handler could more or less work but
implementing one sounds like more work than implementing a handler.


Thanks,
--
Dominique Martinet

Stefano Babic

Mar 3, 2021, 5:32:06 AM
to Dominique MARTINET, swup...@googlegroups.com
Hi Dominique,

On 03.03.21 10:10, Dominique MARTINET wrote:
> Hi,
> (thanks for swupdate!)
>

You're welcome.

> Some background first:
> I'd like to use swupdate to update podman container images from data
> inside the .swu (this comes as a tar file that can be piped to 'podman
> load' for example)

It is a nice feature.

> The use case would be updating devices without internet, or just a way
> of verifying images without going through the trouble of setting up a
> registry.
>

Fully agree, feature understood. It will be nice to get it into SWUpdate.

>
> As far as I can see, I can:
> - use the rawfile handler to put the update tarball, then a postinstall
> script to run the load command.
> This has the advantage to be simple, but requires to store the tarball
> as a file as well so for large images can be quite the overhead.

Right. This is simple, but it is nasty, it cannot use streaming and it
eats a lot of resources.

>
> - write a new handler that spawns the podman import process, stream the
> data to its stdin, and collect the return code like the shell script
> handler (probably extend run_system_cmd -- by the way calling execl like
> that rubs me wrong, if you're ok with it I'll submit something to change
> the "cmd" argument to argv format regardless of the outcome of this
> discussion...)
> I can't think of something else that would be streamed data right now
> but I think that could be quite useful in general; for users who can
> rely on a minimum of userspace tools actually most handlers could be
> trivially reimplemented like that (piping to dd, tar, whatever)
>
>
> From what I've seen on the list archives and documentation however I
> don't think that'll get welcomed -- swupdate obviously wants to be as
> much self-contained as possible.

That is correct - I can accept the implementation above, but only after
discussion and if we have not found any other suitable way. That said,
linking a Go library with another language is a nightmare.

The preferred solution would be to link "libpod" in some way and then
call it from a new handler. I would like to check first whether this is
practicable, and fall back to the implementation based on
run_system_cmd if no other solution is possible. Have you already
investigated this?


> In my case the base OS is read-only and only container images get
> updated so I can trust the base - or rather if anything in it got
> overwritten there's high odds the whole device can be considered done
> for, so piping image data to a system command would be a great way to
> keep development/test cost low and be useable as is.

Your use case is clear, thanks.

>
>
> With that being said, what do you suggest?
> I can obviously make a handler specific to podman if that has more
> chances of being accepted, even if it's a bit of a shame for me, but
> would there be any interest for one?

I would try to investigate whether there is some chance of getting a
library to drive podman. I would appreciate it if you could check this,
and we can then discuss the results here. And if there is really no
chance of doing it another way, I can accept the solution above with a
handler calling run_system_cmd().

>
> Or did I miss another way of doing this that wouldn't require an extra
> temporary buffer?

No, everything you say is correct. We can avoid temporary buffers only
if we have a handler that streams into the container (and then we get
all the general features, like compression, encryption, hash
verification,..., at no extra cost).

> The remote handler could more or less work but
> implementing one sounds like more work than implementing a handler.

Frankly speaking: I do not think it is a solution. I implemented the
remote handler just for one customer, because he did not want to publish
under GPL the code he used to update some microcontrollers. It is
overkill in most cases, as it was for that project, and SWUpdate loses
full control, which makes it more difficult to debug issues when
something goes wrong. Please stay with the handler solution, it is
technically the best choice.

Best regards,
Stefano Babic


--
=====================================================================
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-53 Fax: +49-8142-66989-80 Email: sba...@denx.de
=====================================================================

Dominique MARTINET

Mar 3, 2021, 9:09:39 PM
to Stefano Babic, swup...@googlegroups.com
Thanks for the prompt answer.

Stefano Babic wrote on Wed, Mar 03, 2021 at 11:31:59AM +0100:
> > From what I've seen on the list archives and documentation however I
> > don't think that'll get welcomed -- swupdate obviously wants to be as
> > much self-contained as possible.
>
> That is correct - I can accept the implementation above, but only after
> discussion and if we have not found any other suitable way. Rather, linking
> a go library with another language is a nightmare.
>
> The preferred solution is that we can link in some way "libpod" and then
> call it from a new handler. I will like to check first if this way is
> praticable, and to fall back to the implementation based on run_system_cmd
> if no other solution is possible. Do you have already investigated ?

hm, libpod itself does not seem very friendly to use from C as you've
pointed out.
There's an issue[1] asking about it and general agreement that go is not
the best to link with from other languages -- and suggestion of a
varlink API instead?

[1] https://github.com/containers/podman/issues/935


I've never heard of varlink before but it looks like spawning another
process and feeding it json on a specified file descriptor; a usage
example I found[2] suggests systemd socket-activation to spawn the
process (we won't have systemd on our boards so more realistically
busybox inetd would probably work if the command is not tied to the
systemd way of handing the socket fd...)

[2] https://podman.io/blogs/2019/01/16/podman-varlink.html


There's also a REST api[3] that works the same way, on my laptop I have a
systemd-activated socket that can be used with `curl --unix-socket...`
that I could get to work here.
I see the service can also be run permanently as a regular service so it
is not as much a problem as the varlink one either.

... which looked great until now, but the REST API will store the
tar it receives as is before processing it, so in effect needs to store
the image twice in a temporary dir while direct load command processes
it on the fly, so that's pretty much a deal breaker for me.

[3] https://docs.podman.io/en/latest/_static/api.html#operation/libpodImagesLoad


> I would try to investigate if there are some chances to get a library to
> drive podman. I appreciate if you can check this, and we can discuss then
> here the results. And if there will be really no chance to do in another
> way, I can accept the solution above with a handler calling
> run_system_cmd().

Thanks. I'm afraid I don't see much alternative right now, but will keep
looking a bit more.
varlink might be more useable than I believe... (although from my point
of view if both are spawning a process we might as well do it ourselves)

--
Dominique

Dominique MARTINET

Mar 4, 2021, 12:25:02 AM
to Stefano Babic, swup...@googlegroups.com
Dominique MARTINET wrote on Thu, Mar 04, 2021 at 11:09:22AM +0900:
> varlink might be more useable than I believe... (although from my point
> of view if both are spawning a process we might as well do it ourselves)

For that one: it looks like varlink support got removed with podman 3.0,
so that was a dead end. It was probably superseded by the HTTP API, so it
would likely have had the same problem.

--
Dominique

Stefano Babic

Mar 4, 2021, 4:34:20 AM
to Dominique MARTINET, Stefano Babic, swup...@googlegroups.com
Hi Dominique,

On 04.03.21 03:09, Dominique MARTINET wrote:
> Thanks for the prompt answer.
>
> Stefano Babic wrote on Wed, Mar 03, 2021 at 11:31:59AM +0100:
>>> From what I've seen on the list archives and documentation however I
>>> don't think that'll get welcomed -- swupdate obviously wants to be as
>>> much self-contained as possible.
>>
>> That is correct - I can accept the implementation above, but only after
>> discussion and if we have not found any other suitable way. Rather, linking
>> a go library with another language is a nightmare.
>>
>> The preferred solution is that we can link in some way "libpod" and then
>> call it from a new handler. I will like to check first if this way is
>> praticable, and to fall back to the implementation based on run_system_cmd
>> if no other solution is possible. Do you have already investigated ?
>
> hm, libpod itself does not seem very friendly to use from C as you've
> pointed out.

Right.

> There's an issue[1] asking about it and general agreement that go is not
> the best to link with from other languages -- and suggestion of a
> varlink API instead?
>
> [1] https://github.com/containers/podman/issues/935

I researched this myself yesterday and found the same link; I had never
heard of varlink before either.

>
>
> I've never heard of varlink before but it looks like spawning another
> process and feeding it json on a specified file descriptor;

This is also my understanding. The increased complexity makes no sense
then, because it makes it more likely that an update does not work.

> an usage
> example I found[2] suggests systemd socket-activation to spawn the
> process (we won't have systemd on our boards so more realistically
> busybox inetd would probably work if the command is not tied to the
> systemd way of handing the socket fd...)

This is not an issue, but then, if we have to spawn a process (and it is
not a service), it is like spawning with run_system_cmd()

>
> [2] https://podman.io/blogs/2019/01/16/podman-varlink.html
>
>
> There's also a REST api[3] that works the same way, on my laptop I have a
> systemd-activated socket that can be used with `curl --unix-socket...`
> that I could get to work here.

I read the same (then I stopped), it seemed promising to me...

> I see the service can also be run permanently as a regular service so it
> is not as much a problem as the varlink one either.
>
> . . . . which looked great until now, but the REST API will store the
> tar it receives as is before processing it, so in effect needs to store
> the image twice in a temporary dir while direct load command processes
> it on the fly, so that's pretty much a deal breaker for me.
>
> [3] https://docs.podman.io/en/latest/_static/api.html#operation/libpodImagesLoad

Ouch... well, yes, streaming is a nice feature, but it is not so easy to
implement; it is much easier to have temporary copies. But I fully agree
with you: with containers this is a no go.

>
>
>> I would try to investigate if there are some chances to get a library to
>> drive podman. I appreciate if you can check this, and we can discuss then
>> here the results. And if there will be really no chance to do in another
>> way, I can accept the solution above with a handler calling
>> run_system_cmd().
>
> Thanks. I'm afraid I don't see much alternative right now, but will keep
> looking a bit more.
> varlink might be more useable than I believe...

I read your follow-up mail, it is a no go, too. We should find a way
that does not disappear soon.

> (although from my point
> of view if both are spawning a process we might as well do it ourselves)

I think the same, and we keep better control because the output of the
tool is read by SWUpdate. We do not really need a shell script, the
handler can spawn podman-load, right?

Dominique MARTINET

Mar 4, 2021, 4:59:33 AM
to Stefano Babic, swup...@googlegroups.com
Hi,

Stefano Babic wrote on Thu, Mar 04, 2021 at 10:34:12AM +0100:
> > (although from my point
> > of view if both are spawning a process we might as well do it ourselves)
>
> I think the same, and we can better get under control because the output of
> tool is read by SWUpdate. We do not really need a shell script, the handler
> can spawn podman-load, right ?

I've run into issues with podman after a power cut or interruption in
the middle of a "podman load" (it was with podman pull, but I could
reproduce it with load), which makes me want to create a snapshot of the
podman storage before running the load -- but I can do that in a preinst
script just fine.

I have no problem making this specific and running podman load directly
at this stage, no need for this to be an arbitrary script.
(I just bet that if we do this someone else will want to pipe data to
something else at some point, but you can deal with them when they come
:D)




Jumping ahead, for implementation details, I had a quick look at the
code earlier, and using run_system_cmd directly might not be as obvious
as I thought it'd be -- I was planning on just passing an extra fd as
parameter and if set create an extra pipe for stdin and adjust poll, but
I don't think copyfile() can be called incrementally... And we can't
just spawn the process, pipe all data from the input fd through copyfile
and then watch stdout/stderr in case the process blocks on its own output pipe.

At this point a callback to copyfile that periodically checks for
poll and waitpid without waiting seems to be the most practical,
but that would need to be a different function altogether (and require a
global variable as the callback doesn't have any custom parameter, which
I wouldn't mind adding but it is intrusive)
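
To illustrate the blocking problem described above, here is a minimal
sketch, assuming a POSIX system: a single poll() loop writes to the child's
stdin (non-blocking) and drains its stdout in the same pass, so neither
side can stall on a full pipe. The name `feed_and_collect()` is
hypothetical and not swupdate code; it only captures stdout into a caller
buffer and leaves stderr inherited.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: feed data to argv's stdin while collecting its
 * stdout into out (NUL-terminated, truncated to outsz - 1 bytes).
 * Returns the child's exit code, or -1 on error. */
int feed_and_collect(char *const argv[], const char *data, size_t len,
                     char *out, size_t outsz)
{
    int in[2], outp[2], status, wfd, rfd;
    size_t got = 0;
    pid_t pid;

    assert(argv && argv[0] && out && outsz > 0);
    out[0] = '\0';
    if (pipe(in) < 0)
        return -1;
    if (pipe(outp) < 0) {
        close(in[0]); close(in[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(in[0]); close(in[1]); close(outp[0]); close(outp[1]);
        return -1;
    }
    if (pid == 0) {
        dup2(in[0], STDIN_FILENO);
        dup2(outp[1], STDOUT_FILENO);
        close(in[0]); close(in[1]); close(outp[0]); close(outp[1]);
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }
    close(in[0]);
    close(outp[1]);
    wfd = in[1];
    rfd = outp[0];
    signal(SIGPIPE, SIG_IGN);           /* process-wide: EPIPE, not a signal */
    fcntl(wfd, F_SETFL, fcntl(wfd, F_GETFL) | O_NONBLOCK);
    if (len == 0) {
        close(wfd);
        wfd = -1;
    }

    while (rfd >= 0) {                  /* until the child closes stdout */
        struct pollfd p[2] = {
            { .fd = wfd, .events = POLLOUT },   /* fd -1 is ignored */
            { .fd = rfd, .events = POLLIN },
        };
        if (poll(p, 2, -1) < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        if (p[1].revents) {             /* output available or EOF */
            char tmp[4096];
            ssize_t n = read(rfd, tmp, sizeof(tmp));
            if (n > 0) {
                size_t room = outsz - 1 - got;
                size_t keep = (size_t)n < room ? (size_t)n : room;
                memcpy(out + got, tmp, keep);
                got += keep;
                out[got] = '\0';
            } else if (n == 0 || errno != EINTR) {
                close(rfd);
                rfd = -1;
            }
        }
        if (wfd >= 0 && p[0].revents) { /* room in the stdin pipe, or error */
            ssize_t n = write(wfd, data, len);
            if (n > 0) {
                data += n;
                len -= (size_t)n;
            }
            if (len == 0 || (n < 0 && errno != EAGAIN && errno != EINTR)) {
                close(wfd);             /* done or child gone: send EOF */
                wfd = -1;
            }
        }
    }
    if (wfd >= 0)
        close(wfd);
    if (rfd >= 0)
        close(rfd);
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Writing everything first and reading afterwards would hang as soon as the
child produces more output than its pipe can hold; interleaving both in one
loop avoids that.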


I haven't started as such so am open to suggestions to keep things as
manageable as possible for you

(It's unfortunate that my work day ends when yours starts, so it's a bit
difficult to discuss this; if you're on IRC I can show up in the
evenings, or maybe I'll resubscribe here with a personal address..)


Thanks,
--
Dominique

Stefano Babic

Mar 4, 2021, 5:30:11 AM
to Dominique MARTINET, Stefano Babic, swup...@googlegroups.com
Hi Dominique,

On 04.03.21 10:59, Dominique MARTINET wrote:
> Hi,
>
> Stefano Babic wrote on Thu, Mar 04, 2021 at 10:34:12AM +0100:
>>> (although from my point
>>> of view if both are spawning a process we might as well do it ourselves)
>>
>> I think the same, and we can better get under control because the output of
>> tool is read by SWUpdate. We do not really need a shell script, the handler
>> can spawn podman-load, right ?
>
> I've run into issues with podman where a power cut or interruption in
> the middle of a "podman load" (was with podman pull but I could
> reproduce with load), which make me want to create a snapshot of the
> podman storage before running the load -- but I can do that in a preinst
> script just fine.

It was just a question from my side.

IMHO the new handler should be very general, and it should be possible
to set in sw-description (in properties, I think) what should be called
/ spawned. It could be a binary or a script.

>
> I have no problem making this specific and running podman load directly
> at this stage, no need for this to be an arbitrary script.

No, I prefer to have a general solution, but my question was specific
for podman.

> (I just bet that if we do this someone else will want to pipe data to
> something else at some point, but you can deal with them when they come
> :D)

Right - so IMHO the new handler should:

- get all information from sw-description, including optional parameters
for external command
- have a pipe with external command
- use copyfile() and a callback to fill the pipeline
- obtain the output from stdout and stderr of external command and
convert them to TRACE and ERROR.
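
For the last point in the list, one possible approach (an assumption, the
thread does not specify it) is to split whatever arrives on each pipe into
lines before logging. The sketch below uses a hypothetical `struct linebuf`
and `linebuf_push()`; the caller would keep one buffer per pipe and log
`buf` with TRACE for stdout or ERROR for stderr whenever `complete` is set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical line accumulator, one instance per pipe. */
struct linebuf {
    char buf[256];   /* current line, always NUL-terminated after a push */
    size_t used;     /* bytes currently stored in buf */
    int complete;    /* set when buf holds a full (or over-long) line */
};

/* Consume bytes from data until a line is complete or data is exhausted.
 * Returns the number of bytes consumed; call again with the remainder.
 * Lines longer than the buffer are reported in pieces. */
size_t linebuf_push(struct linebuf *lb, const char *data, size_t len)
{
    size_t i = 0;

    assert(lb && (data || len == 0));
    if (lb->complete) {          /* previous line was handed out: reset */
        lb->used = 0;
        lb->complete = 0;
    }
    while (i < len && !lb->complete) {
        char c = data[i++];
        if (c != '\n')
            lb->buf[lb->used++] = c;
        if (c == '\n' || lb->used == sizeof(lb->buf) - 1)
            lb->complete = 1;
    }
    lb->buf[lb->used] = '\0';
    return i;
}
```

A partial line left in the buffer when the pipe reaches EOF would still
need to be logged by the caller.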

>
>
>
>
> Jumping ahead, for implementation details, I had a quick look at the
> code earlier, and using run_system_cmd directly might not be as obvious
> as I thought it'd be

It could be that it should be completely implemented in the new handler.

> -- I was planning on just passing an extra fd as
> parameter and if set create an extra pipe for stdin and adjust poll, but
> I don't think copyfile() can be called incrementally... And we can't
> just spawn the process, pipe all data from the input fd through copyfile
> and then watch stdout/stderr in case the process blocks on its own output pipe.
>
> At this point a callback to copyfile

See other examples with callback. copyfile() is called, and the callback
should fill the pipe. If the process crashes, the pipe is closed by the
kernel.

> that periodically checks for
> poll and waitpid without waiting seems to be the most practical,
> but that would need to be a different function altogether (and require a
> global variable as the callback doesn't have any custom parameter, which
> I wouldn't mind adding but it is intrusive)

The callback has no user data, so you need global data, but defined
static inside the handler. See other examples (swuforward_handler is one
of them).
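
The pattern can be sketched as follows. Everything here is a stand-in for
illustration: `chunk_cb` is a hypothetical callback type without a
user-data argument and `copy_chunks()` merely imitates a copy routine; they
are not the real swupdate signatures. The point is only that the state
lives in a file-scope static that the callback can reach.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: no user-data parameter. */
typedef int (*chunk_cb)(const void *buf, size_t len);

/* State shared between the handler and its callback; static, so it is
 * only visible inside this file. */
static struct {
    int fd;         /* write end of the pipe to the external command */
    size_t total;   /* bytes forwarded so far */
} pipe_state = { .fd = -1, .total = 0 };

/* Callback: forward one chunk to the pipe recorded in pipe_state. */
static int write_to_pipe(const void *buf, size_t len)
{
    const char *p = buf;

    assert(pipe_state.fd >= 0);
    while (len > 0) {
        ssize_t n = write(pipe_state.fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
        pipe_state.total += (size_t)n;
    }
    return 0;
}

/* Stand-in for a copy routine that hands data to cb chunk by chunk. */
static int copy_chunks(const char *data, size_t len, size_t chunk, chunk_cb cb)
{
    while (len > 0) {
        size_t n = len < chunk ? len : chunk;
        if (cb(data, n) < 0)
            return -1;
        data += n;
        len -= n;
    }
    return 0;
}

/* Handler-side entry point: set up the static state, run the copy, and
 * return the number of bytes forwarded (or -1 on error). */
int forward_via_static_state(int fd, const char *data, size_t len)
{
    int ret;

    pipe_state.fd = fd;
    pipe_state.total = 0;
    ret = copy_chunks(data, len, 4, write_to_pipe);
    pipe_state.fd = -1;
    return ret < 0 ? -1 : (int)pipe_state.total;
}
```

The obvious cost of this design is that only one such transfer can run at a
time per process.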

>
>
> I haven't started as such so am open to suggestions to keep things as
> manageable as possible for you
>
> (It's unfortunate my work days ends when yours start so it's a bit
> difficult to discuss this, if you're on IRC I can show up on evenings or
> I'll resubscribe on a personal address here maybe..)

Your name suggests to me that you are in France, but your company's
website (ouch, Japanese only?) tells something else...

Best regards,
Stefano

Dominique MARTINET

Mar 4, 2021, 7:19:49 PM
to Stefano Babic, swup...@googlegroups.com
Hi,

Stefano Babic wrote on Thu, Mar 04, 2021 at 11:30:04AM +0100:
> > (I just bet that if we do this someone else will want to pipe data to
> > something else at some point, but you can deal with them when they come
> > :D)
>
> Right - so IMHO the new handler should:
>
> - get all information from sw-description, including optional parameters for
> external command
> - have a pipe with external command
> - use copyfile() and a callback to fill the pipeline
> - obtain the output from stdout and stderr of external command and convert
> them as TRACE and ERROR.

Ok, I think we pretty much agree on the implementation (this corresponds
to what I was saying toward the end)

I have other things planned for today but will send a patch when it's
done/tested a bit, probably early next week.

> > (It's unfortunate my work days ends when yours start so it's a bit
> > difficult to discuss this, if you're on IRC I can show up on evenings or
> > I'll resubscribe on a personal address here maybe..)
>
> Your name suggests me you are in France, but your company's website (ouch ?
> just japanize ?) tells something else...

I'm quite French, but very much not in France yes.
If you look hard enough there is one(!) page in English -- probably
something else worth working on at some point...

--
Dominique