Thanks Kevin, that's helpful.
Can you be more specific about what you mean by "packets not destined for my chosen IP out to the public internet"? It would be helpful to understand your setup in more detail.
I am trying to create a program that can run another program in a new network namespace and filter packets produced by that program using rules implemented in Go. I don't want the subprocess to be isolated from the rest of the system in terms of filesystem, process ID, user ID, etc, but it should be isolated in terms of network. My approach is to create a new network namespace, create a tun device within that namespace, route everything to that tun, then exec the subprocess. The parent then receives all the packets generated by the subprocess using an fdbased LinkEndpoint on the file descriptor for the tun device, and feeds those packets into a tcpip.Stack on which I have a tcpip.Endpoint listening on one particular IP address.

The issue I'm facing is that after ingesting the packets into a tcpip.Stack, I then want to send some of them out to the host's network to be routed to the public internet (or elsewhere). I'm wondering how I can set that second part up with a tcpip.Stack: what kind of LinkEndpoint I would use for that.
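To make the split described above concrete, here is a minimal sketch of the per-packet decision the parent has to make on whatever it reads from the tun device: deliver to the one address served inside the tcpip.Stack, forward towards the host, or drop. It is not gVisor code. The names Verdict, ipv4Dst and classify are made up for illustration, and it assumes raw IPv4 packets with no extra packet-info header in front of them (i.e. a tun opened with IFF_NO_PI).

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// Verdict is a hypothetical name for what the parent decides to do with a packet.
type Verdict int

const (
	Drop    Verdict = iota // malformed, or rejected by a filtering rule
	Local                  // destined for the address served inside the tcpip.Stack
	Forward                // candidate for being sent out via the host
)

// ipv4Dst extracts the destination address from a raw IPv4 packet.
// Assumption: the buffer starts at the IPv4 header.
func ipv4Dst(pkt []byte) (netip.Addr, error) {
	if len(pkt) < 20 || pkt[0]>>4 != 4 {
		return netip.Addr{}, errors.New("not an IPv4 packet")
	}
	var a [4]byte
	copy(a[:], pkt[16:20]) // bytes 16..19 of the IPv4 header hold the destination
	return netip.AddrFrom4(a), nil
}

// classify applies the rules: the chosen local address is served by the
// userspace stack, blocked prefixes are dropped, everything else is forwarded.
func classify(pkt []byte, local netip.Addr, blocked []netip.Prefix) Verdict {
	dst, err := ipv4Dst(pkt)
	if err != nil {
		return Drop
	}
	if dst == local {
		return Local
	}
	for _, p := range blocked {
		if p.Contains(dst) {
			return Drop
		}
	}
	return Forward
}

func main() {
	// Build a bare 20-byte IPv4 header addressed to 93.184.216.34.
	pkt := make([]byte, 20)
	pkt[0] = 0x45 // version 4, header length 5 words
	copy(pkt[16:20], []byte{93, 184, 216, 34})

	local := netip.MustParseAddr("10.0.0.2") // illustrative address
	fmt.Println("forward:", classify(pkt, local, nil) == Forward)
}
```

Only the destination address is inspected here; real rules would presumably also look at the protocol and ports.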
The fdbased LinkEndpoint is typically run inside a network namespace and uses the IP address of the namespace('s device).
Makes sense
But those packets are typically routed via iptables and come out of the host interface looking like packets from this host interface.
Interesting. When you say "those packets are typically routed via iptables", do you mean that the network device visible to the container is set up, using some combination of iptables primitives, to route via the host network interface? I thought that when using runsc, sentry would take care of routing packets to/from the container in userspace, not using iptables, or perhaps making some use of iptables but still passing all packets through userspace. What I'm thinking of here is the netstack package in sentry.
If you mean that you want to send packets with source address a.b.c.d from your host interface that has address w.x.y.z, I believe you could create an fdbased LinkEndpoint hooked up to an AF_PACKET socket that's been opened on the host interface. You could send packets with an arbitrary source address, and you'd receive a copy of all incoming packets.
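For reference, opening an AF_PACKET socket on a host interface can be sketched with the Go standard library alone. This is an illustration under assumptions, not code from gVisor: htons and openPacketSocket are hypothetical helper names, "eth0" is a placeholder interface name, the byte swap assumes a little-endian host, and the process needs CAP_NET_RAW (typically root) on Linux. The resulting file descriptor is the kind of thing one would then hand to an fdbased LinkEndpoint.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"syscall"
)

// htons swaps the two bytes of a 16-bit value. AF_PACKET expects the protocol
// in network byte order; a plain swap is correct on little-endian hosts
// (x86, arm64), which is an assumption of this sketch.
func htons(v uint16) uint16 { return v<<8 | v>>8 }

// openPacketSocket opens a raw AF_PACKET socket and binds it to the named
// interface, so that reads and writes carry whole Ethernet frames.
func openPacketSocket(ifname string) (int, error) {
	iface, err := net.InterfaceByName(ifname)
	if err != nil {
		return -1, err
	}
	proto := htons(syscall.ETH_P_ALL) // receive every protocol
	fd, err := syscall.Socket(syscall.AF_PACKET, syscall.SOCK_RAW, int(proto))
	if err != nil {
		return -1, fmt.Errorf("socket: %w", err)
	}
	sll := syscall.SockaddrLinklayer{Protocol: proto, Ifindex: iface.Index}
	if err := syscall.Bind(fd, &sll); err != nil {
		syscall.Close(fd)
		return -1, fmt.Errorf("bind: %w", err)
	}
	return fd, nil
}

func main() {
	name := "eth0" // placeholder; pass the real interface name as the first argument
	if len(os.Args) > 1 {
		name = os.Args[1]
	}
	fd, err := openPacketSocket(name)
	if err != nil {
		fmt.Println("could not open packet socket:", err)
		return
	}
	defer syscall.Close(fd)
	fmt.Println("packet socket fd:", fd)
}
```

Because the socket is bound with ETH_P_ALL, it sees a copy of all traffic on the interface, which matches the "you'd receive a copy of all incoming packets" remark above.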
Makes sense. Now that you write this, I understand that this won't work in my case, since the source address a.b.c.d won't be a public internet IP address, so return packets won't arrive at w.x.y.z. Is there any way forward with the fdbased LinkEndpoint hooked up to an AF_PACKET socket? Can I have some kind of NAT layer in userspace within tcpip.Stack? If I do that, how do I correctly send these packets out of the host interface in a way that lets me identify the return packets and leaves my host interface usable by other processes running on the host?

Many thanks for taking the time to help with this.
Kevin

On Monday, September 23, 2024 at 2:28:46 PM UTC-7 Kōshin wrote:

Hi -

Based on the tun_tcp_echo sample (https://github.com/google/gvisor/blob/master/pkg/tcpip/sample/tun_tcp_echo/main.go), I have created a tun device and a tcpip.Endpoint that receives packets for a certain IP address and communicates back to the sender. Wonderful!

At the moment, though, the tun device is the only way for packets to get in and out of my system. I want to send packets not destined for my chosen IP out to the public internet, using my host's network interface. How, roughly speaking, do I do that?

My guess is that I need to create a second NIC with the kind of LinkEndpoint that knows how to send packets out via native linux kernel syscalls. Which LinkEndpoint implementation would I use for that? And are there any samples showing how this could be done at the level of tcpip.Stack?

Many thanks,
Kōshin
--
You received this message because you are subscribed to a topic in the Google Groups "gVisor Users [Public]" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/gvisor-users/512c28b8-7e6e-4bd2-a893-853aa55f60d6n%40googlegroups.com.
To answer the routing question: gVisor/runsc is usually run via Docker or Kubernetes, which takes care of routing packets between the host network interface and the device the sentry is hooked up to. This is a good high-level diagram of how Docker does things: veth devices are used along with iptables rules (notably NAT rules) to isolate and also support port publishing and the like. It sounds like Docker won't work for what you want to do, although you could probably hack something together with iptables rules to sort of make it work.
     |            A                          B                C host netns   |       D     E  subprocess netns
<--->|[host NIC]<--->[TAP/TUN or veth pair]<--->[controller]<---->[veth0]<---|--->[veth1]<--->[subprocess]
     |                                                                       |

A here is some iptables rules that do something like: forward all traffic (minus maybe SSH?) to a TAP/TUN device (or veth pair) that at the other end is listened to by the controller via fdbased+AF_PACKET (B). That controller has another fdbased+AF_PACKET endpoint attached at C to a veth device whose counterpart (D) is inside the network namespace and used in E by the subprocess. In this setup every single packet goes through the controller, which can forward or filter or do whatever to it. You would probably want to use NAT at point A, as if both the host and the controller are using the same IP you might get weird behavior (e.g. your controller knows how to route packets to the subprocess, but the host doesn't and it returns ICMP errors).
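On the NAT point, and the earlier question about identifying return packets: whether the translation is done by iptables at A or in the controller, the underlying bookkeeping is a table that maps each outbound flow to a distinct host port and maps replies on that port back to the flow. Below is a rough sketch of that table only. It is not how gVisor or iptables implement NAT, all names (flow, natTable, outbound, inbound, ap) are invented for the example, and it leaves out packet rewriting, checksums, and entry expiry.

```go
package main

import (
	"fmt"
	"net/netip"
)

// flow identifies one outbound connection as seen inside the namespace.
type flow struct {
	proto uint8          // IP protocol number: 6 = TCP, 17 = UDP
	src   netip.AddrPort // private source address and port
	dst   netip.AddrPort // remote destination
}

// natTable is a hypothetical source-NAT table. Each flow is given its own
// port from a reserved range on the host address.
type natTable struct {
	next, last int
	byFlow     map[flow]uint16
	byPort     map[uint16]flow
}

func newNATTable(first, last uint16) *natTable {
	return &natTable{
		next:   int(first),
		last:   int(last),
		byFlow: make(map[flow]uint16),
		byPort: make(map[uint16]flow),
	}
}

// outbound returns the host port to use for f, allocating one on first use.
// The boolean is false when the reserved range is exhausted.
func (t *natTable) outbound(f flow) (uint16, bool) {
	if p, ok := t.byFlow[f]; ok {
		return p, true
	}
	if t.next > t.last {
		return 0, false
	}
	p := uint16(t.next)
	t.next++
	t.byFlow[f] = p
	t.byPort[p] = f
	return p, true
}

// inbound maps a reply that arrived on hostPort from the given remote back to
// the original flow. Packets that match no entry belong to someone else.
func (t *natTable) inbound(proto uint8, hostPort uint16, from netip.AddrPort) (flow, bool) {
	f, ok := t.byPort[hostPort]
	if !ok || f.proto != proto || f.dst != from {
		return flow{}, false
	}
	return f, true
}

// ap is a small helper for writing address:port literals in examples.
func ap(s string) netip.AddrPort { return netip.MustParseAddrPort(s) }

func main() {
	nat := newNATTable(40000, 40999) // illustrative port range
	f := flow{proto: 6, src: ap("10.0.0.2:5555"), dst: ap("93.184.216.34:443")}
	port, _ := nat.outbound(f)
	back, ok := nat.inbound(6, port, f.dst)
	fmt.Println(port, back.src, ok)
}
```

The inbound lookup failing is what lets traffic for other processes on the host pass through untouched; keeping the reserved port range away from the host's own ephemeral ports is a separate problem that this sketch does not solve.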
// ... create tun device, fdbased LinkEndpoint, tcpip stack with 1 NIC ...
mystack.AddProtocolAddress(1, myAddress, stack.AddressProperties{})
// ...
ep, e := mystack.NewEndpoint(tcp.ProtocolNumber, proto, &wq)
// ...
ep.Bind(tcpip.FullAddress{Port: myPort})
// ...
ep.Listen(10)
for {
	// Accept returns the new connection's endpoint and its own waiter queue.
	conn, wq, err := ep.Accept(nil)
	// ...
	// Handle each connection in its own goroutine so the loop can keep accepting.
	go func() {
		defer conn.Close()
		// ... communicate here with conn ...
	}()
}