CRI: How silly is it to have a systemd-cri?

the.m...@gmail.com

Feb 27, 2017, 5:46:25 AM
to Kubernetes developer/contributor discussion
Hi,

Fleet is no longer being developed or maintained. This led me to the idea of implementing a systemd-cri to schedule systemd units in my cluster. No need for containers: no runc, no docker, no rkt, etc. I want to use k8s for everything except the actual execution, and execute the way we already do: no namespaces, just shelling out. Even data distribution is out of scope, as everything required is always available on the AMIs.

I did my research and it sounds possible. It looks like all I am missing is a custom CRI implementation that uses systemd as the runtime backend. I did take a look at cri-o, but I'm not sure I need that much wrapping just to drop it later and run simple systemd units.
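To make the idea concrete, here is a minimal sketch of how a container request could be turned into a transient systemd unit by shelling out to `systemd-run`. Everything in it is an assumption for illustration: `ContainerSpec` is a hypothetical, trimmed-down stand-in for what a CRI runtime receives, and the `k8s-<pod>-<name>.service` naming scheme is invented. Only the `systemd-run` options (`--unit=`, `--setenv=`, `--property=`) and the `CPUQuota=` / `MemoryMax=` resource-control properties are real systemd features.

```python
import shlex
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContainerSpec:
    """Hypothetical, trimmed-down stand-in for a CRI container config."""
    pod: str
    name: str
    command: List[str]
    env: Dict[str, str] = field(default_factory=dict)
    cpu_quota_percent: Optional[int] = None
    memory_max_bytes: Optional[int] = None


def unit_name(spec: ContainerSpec) -> str:
    # The naming scheme is an assumption; any unique, stable name would do.
    return f"k8s-{spec.pod}-{spec.name}.service"


def systemd_run_argv(spec: ContainerSpec) -> List[str]:
    """Build a systemd-run command line that starts the spec as a transient unit."""
    argv = ["systemd-run", f"--unit={unit_name(spec)}"]
    for key, value in sorted(spec.env.items()):
        argv.append(f"--setenv={key}={value}")
    if spec.cpu_quota_percent is not None:
        argv.append(f"--property=CPUQuota={spec.cpu_quota_percent}%")
    if spec.memory_max_bytes is not None:
        argv.append(f"--property=MemoryMax={spec.memory_max_bytes}")
    # No image and no namespaces: the command runs directly on the host.
    return argv + ["--"] + spec.command


if __name__ == "__main__":
    spec = ContainerSpec(
        pod="web",
        name="server",
        command=["/usr/local/bin/server", "--port", "8080"],
        env={"MODE": "prod"},
        cpu_quota_percent=50,
    )
    # Print the command instead of running it, so the sketch is safe to try anywhere.
    print(shlex.join(systemd_run_argv(spec)))
```

A real CRI implementation would also have to report status, stream logs, and stop units; this only covers the "start" step.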

1. Am I missing something that would prevent this from working?
2. Is there any ongoing work in the same direction?
3. What do you think?

Just to be clear: I do understand that this is an attempt to use k8s in a way it is not supposed to be used. I do understand what containers are and everything. I do understand the security implications. I just want to use this amazing system to solve my own problems, and I don't need the whole package.


--
Alex

Michał Rostecki

Feb 27, 2017, 9:27:53 AM
to the.m...@gmail.com, Kubernetes developer/contributor discussion
The definition of a container inside a pod requires the name of an image. I can hardly imagine how you could use the concept of images for running systemd units. Would the image be the name of the command?
Also, when using only systemd units, adding new software means installing packages on the system, e.g. by using a configuration management system. Container images allow you to pull software without having to install it on the host.

On the other hand, a custom CRI for systemd-nspawn is an idea I have been loosely thinking about. Assuming that such a CRI implementation would use Docker images, the image service would just need to store them (most probably as a tarball for each layer) and untar them. Maybe that would be a better idea for you, if what you mostly care about is using only the systemd ecosystem under Kubernetes?
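As a rough sketch of the "store layers as tarballs and untar them" part: the function below unpacks an ordered list of layer tarballs into one root filesystem directory, with later layers overwriting earlier ones. The function name and the directory layout are hypothetical, and it deliberately ignores things a real image service must handle (whiteout files that delete paths from lower layers, ownership, untrusted archives).

```python
import tarfile
from pathlib import Path
from typing import Iterable


def unpack_layers(layer_tarballs: Iterable[Path], rootfs: Path) -> Path:
    """Extract layer tarballs in order into rootfs; later layers win on conflicts."""
    rootfs.mkdir(parents=True, exist_ok=True)
    for layer in layer_tarballs:
        with tarfile.open(layer) as tar:
            # Assumption: layers come from a trusted store. Whiteout entries
            # (".wh." files) are NOT interpreted in this sketch.
            tar.extractall(rootfs)
    return rootfs
```

The resulting directory is the kind of tree that `systemd-nspawn --directory=<rootfs>` can boot or run a command in.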

--
You received this message because you are subscribed to the Google Groups "Kubernetes developer/contributor discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kubernetes-de...@googlegroups.com.
To post to this group, send email to kuberne...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kubernetes-dev/49443b90-185a-48dd-9431-a506a4891281%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alexander Krasnukhin

Feb 27, 2017, 10:17:48 AM
to Michał Rostecki, Kubernetes developer/contributor discussion
Hi,

Yes, I think you got it.

It is an interesting idea to make a slightly more generic solution and go with systemd-nspawn. I guess I can always turn off all namespaces if needed. Docker image distribution is also something I thought about: there is no need to run docker to use images from its registry. I can even pre-pull images into every AMI to make sure there is no unnecessary network load.

Hm... systemd-nspawn is an interesting suggestion, but I was planning to take containers out of the picture entirely. On the other hand, systemd-nspawn will take care of some bookkeeping that must be done anyway for every container or container-like system, so I can avoid writing that code. I need to think about it.

Thanks for your input!

--
Regards,
Alexander

David Aronchick

Feb 27, 2017, 12:53:20 PM
to Alexander Krasnukhin, Michał Rostecki, Kubernetes developer/contributor discussion
FWIW, the ability to run arbitrary binaries (not just containers) is fairly regularly requested. If this could be generalized in this way, that'd be pretty sweet.

Alexander Krasnukhin

Feb 28, 2017, 3:34:23 AM
to David Aronchick, Michał Rostecki, Kubernetes developer/contributor discussion
What would be the preferred way to distribute binaries in this case? RPM?

Ilya Dmitrichenko

Feb 28, 2017, 4:02:17 AM
to Alexander Krasnukhin, David Aronchick, Michał Rostecki, Kubernetes developer/contributor discussion
I think you'd get a similar effect by running a pod with a dummy image that sets `hostPID`, `hostNetwork` and `hostIPC` and mounts the host's root FS. It's probably going to be quite easy to build.
I'm not sure whether you can mount the host's root directly as the container root and make it recursive, but in the worst case you'd just have to mount the required directories individually.
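A sketch of what such a pod could look like, built as a plain dictionary and dumped as JSON (which `kubectl` accepts as well as YAML). The pod-spec fields `hostPID`, `hostNetwork`, `hostIPC`, `hostPath`, `volumeMounts` and `securityContext.privileged` are real; the rest is assumption: the function name, the `/host` mount point, running privileged, and wrapping the command in `chroot /host`, which only works if the dummy image ships a `chroot` binary (busybox does).

```python
import json
from typing import List


def host_pod_manifest(name: str, image: str, command: List[str]) -> dict:
    """Build a pod that shares the host's PID, network and IPC namespaces."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "hostPID": True,
            "hostNetwork": True,
            "hostIPC": True,
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Assumption: run the host's binary via chroot into the
                    # mounted host root instead of using the image's filesystem.
                    "command": ["chroot", "/host"] + command,
                    "securityContext": {"privileged": True},
                    "volumeMounts": [
                        {"name": "host-root", "mountPath": "/host"}
                    ],
                }
            ],
            "volumes": [{"name": "host-root", "hostPath": {"path": "/"}}],
        },
    }


if __name__ == "__main__":
    manifest = host_pod_manifest("host-job", "busybox", ["/usr/local/bin/job"])
    print(json.dumps(manifest, indent=2))
```

The mount namespace is still separate here, which is why the host root is mounted at `/host` rather than used as the container root.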

But I think that there are very few features of Kubernetes that you will be able to use with something like this anyway.

FYI, Nomad already supports something very similar to what you have described, though I don't think it does it via systemd. As far as I'm aware, it is able to sandbox (namespace and chroot) binaries found on nodes, and I believe you can disable the sandboxing functionality. I am pretty sure that the feature set of Nomad is much smaller compared to that of Kubernetes.

What exactly are you looking for? Is it just scheduling with cgroups but without namespace-based isolation?
If you don't mind me asking, are you fully aware of all the benefits namespace isolation gives you around network security, general host security, runtime dependencies, etc.?

Alexander Krasnukhin

Feb 28, 2017, 5:20:24 AM
to Ilya Dmitrichenko, David Aronchick, Michał Rostecki, Kubernetes developer/contributor discussion
Yes, you got it.

Indeed, we currently use Nomad to run similar tasks. It supports raw binaries and doesn't require any namespacing at all. These are exactly the requirements we have: a cluster-wide scheduler with no namespacing, no isolation, etc. You are right here.

Yes, I'm well aware of what namespaces are and what you get from them. I even understand that I am going against the very idea of Kubernetes: "Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications." You can think of this systemd-cri idea as the ultimate test for CRI: is it robust enough to fake all isolation while still keeping the cluster rolling?
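To illustrate what "faking all isolation" could mean, here is a toy in-memory model of the pod-sandbox half of a runtime. It is not the real CRI gRPC API: the method names only mirror the CRI calls `RunPodSandbox`, `StopPodSandbox`, `RemovePodSandbox`, `PodSandboxStatus` and `ListPodSandbox`, and the class itself is hypothetical. Where a container runtime would set up namespaces, this one only does the bookkeeping the kubelet expects to see.

```python
import uuid
from typing import Dict, List


class FakeSandboxRuntime:
    """Toy stand-in for a runtime's sandbox handling: bookkeeping only, no isolation."""

    def __init__(self) -> None:
        self._sandboxes: Dict[str, dict] = {}

    def run_pod_sandbox(self, pod_name: str) -> str:
        # A container runtime would create network/PID/IPC namespaces here.
        sandbox_id = uuid.uuid4().hex
        self._sandboxes[sandbox_id] = {"name": pod_name, "state": "SANDBOX_READY"}
        return sandbox_id

    def stop_pod_sandbox(self, sandbox_id: str) -> None:
        self._sandboxes[sandbox_id]["state"] = "SANDBOX_NOTREADY"

    def remove_pod_sandbox(self, sandbox_id: str) -> None:
        # Removing an unknown sandbox is treated as a no-op.
        self._sandboxes.pop(sandbox_id, None)

    def pod_sandbox_status(self, sandbox_id: str) -> dict:
        return dict(self._sandboxes[sandbox_id])

    def list_pod_sandboxes(self) -> List[str]:
        return list(self._sandboxes)
```

Whether the rest of the cluster keeps rolling with a sandbox like this is exactly the open question.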


--
Regards,
Alexander

David Aronchick

Feb 28, 2017, 12:44:09 PM
to Alexander Krasnukhin, Ilya Dmitrichenko, Michał Rostecki, Kubernetes developer/contributor discussion

On Tue, Feb 28, 2017 at 2:20 AM, Alexander Krasnukhin <the.m...@gmail.com> wrote:
Yes, I'm well aware of what namespaces are and what you get from them. I even understand that I am going against the very idea of Kubernetes: "Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications." You can think of this systemd-cri idea as the ultimate test for CRI: is it robust enough to fake all isolation while still keeping the cluster rolling?

We've always thought of Kubernetes as a way to run distributed apps - containerization is an implementation detail. An important one, but an implementation detail nonetheless. The two _most_ common alternate scenarios are exactly what you described: a binary with no dependencies (e.g. a Go binary, whatever) that just needs to run, and alternate packaging formats (e.g. .jar).

Klaus Ma

Mar 26, 2017, 11:00:03 PM
to Kubernetes developer/contributor discussion
We have similar requirements to run some applications without containers. I'd like to see what we can do to move forward.

Klaus Ma

Mar 26, 2017, 11:01:49 PM
to Kubernetes developer/contributor discussion, aron...@google.com, mic...@kinvolk.io
No special way :). One of our customers mounts a read-only FS for all related binaries.

the.m...@gmail.com

Mar 27, 2017, 9:23:31 AM
to Kubernetes developer/contributor discussion
Hi,

I started some work, but it is not moving fast right now. Poke me if you are interested in moving it forward together.

Michał Rostecki

Mar 27, 2017, 9:29:31 AM
to the.m...@gmail.com, Kubernetes developer/contributor discussion
Can you share a link to the repo? :)

Klaus Ma

Mar 27, 2017, 11:34:53 PM
to Kubernetes developer/contributor discussion, the.m...@gmail.com
I'm starting the work at https://github.com/k82cn/sysletd and expect to have a prototype in the next few weeks.

I'd also like to hear the community's suggestions on how to move forward after the prototype.

Alexander Krasnukhin

Mar 28, 2017, 8:00:03 AM
to Klaus Ma, Kubernetes developer/contributor discussion
You may want to take a look at https://github.com/kubernetes-incubator/cri-o. The code is easy to follow and understand, and you can reuse a lot of its logic to implement something for systemd. For example, the client there is very generic, and you can use it almost as is to test CRIs.
--
Regards,
Alexander

Klaus Ma

Mar 28, 2017, 8:10:34 AM
to Kubernetes developer/contributor discussion, klaus1...@gmail.com
Thanks for your suggestion; yes, I'm building the systemd-based CRI implementation using https://github.com/kubernetes-incubator/cri-o as a reference :).