filter instances in bridge mode with iptables

ricardo

May 12, 2010, 10:29:50 AM

to ganeti

Hello,

I have been playing around a bit with iptables, as we sometimes have
customers who require access to their instance. Since they have root
access they can change their IP, and this can lead to problems, such as
changing their IP to poke around the network.

I believe this can be solved using routing instead of bridging, but
that leads to a somewhat more complex setup. So I tried keeping the bridge
and filtering by MAC and IP address, for example:

iptables -A PREROUTING -t mangle -m physdev --physdev-in tap+ -m mac --mac-source 00:XX:XX:XX:XX:XX -s 10.11.12.1 -j ACCEPT
... more rules for other instances ...
iptables -A PREROUTING -t mangle -m physdev --physdev-in tap+ -j DROP

In this case, the only way for an instance to go "out" is if the MAC
matches the IP. I only use paravirtualized network cards (virtio)
and, as far as I know, KVM is the one controlling the MAC address and it
cannot be changed inside the instance.
To deploy the changes, which so far live on each node, gnt-cluster
copyfile and gnt-cluster command seem to do the trick pretty well, even if
you have a huge number of nodes. Of course, if you have that many, you
probably have the time (and money) to try a more complex setup.

Has anyone tried this? Or does anyone know whether the MAC address can be
changed inside a KVM instance?

Ricardo.

Guido Trotter

May 12, 2010, 11:11:49 AM

to gan...@googlegroups.com

On Wed, May 12, 2010 at 3:29 PM, ricardo <ricardo....@gmail.com> wrote:

> Anyone have tried this? Or know if the mac address can be changed
> inside a KVM instance?

ifconfig ethX hw ether NEW_MAC
seems to do the trick. Must be done with the interface down, but I
suppose one can write a script that downs the interface, changes that,
and ups it again.

The safest way imho is to inject yourself in the kvm network script,
and dynamically associate with the interface when it comes up,
dropping packets with the wrong mac or ip from the relevant tapX.

Thanks,

Guido

Roberto Espinoza

May 12, 2010, 11:16:47 AM

to gan...@googlegroups.com

You are right. There is a chance that by trying the right MAC address plus the IP they will get access. I will look into attaching the rule to the KVM script; in that case, say tap2 gets associated with a MAC and an IP address, then even if they use a valid address, the tap interface will be wrong.

Is there a way to get the IP address set in the instance with ganeti?

I checked the creation scripts and they don't actually set up the IP inside the instance.

2010/5/13 Guido Trotter <ultr...@gmail.com>

Guido Trotter

May 12, 2010, 11:26:27 AM

to gan...@googlegroups.com

On Wed, May 12, 2010 at 4:16 PM, Roberto Espinoza <kamu...@yahoo.com> wrote:
> You are right. There is a chance that trying the right Mac Address + the IP
> they will get access. I will take a look to associate the rule the kvm
> script, in this case lets say tap2 gets associated with a mac and address,
> even if they use a valid address, the tap interface will be wrong.
> Is there a way to get the IP address set in the instance with ganeti?
> I checked the creation scripts and actually it doesn't setup the IP inside
> the instance.

You can associate each NIC with an IP (though it's optional for bridged NICs).
If you do, it will be available in the vif script, which you can hook into.

The creation scripts also get the ip, in the environment, if it's set
at create time.

Thanks,

Guido

Iustin Pop

May 12, 2010, 1:23:50 PM

to gan...@googlegroups.com

On Wed, May 12, 2010 at 07:29:50AM -0700, ricardo wrote:
> Hello,
>
> I have been playing around a bit with iptables as sometimes we have
> customers who require access to their instance. Of course they can
> change their IP as they have root access and this can lead to some
> problems as changing their IPs to poke around the network.
>
> This I believe can be solved using routing instead of bridging but
> this lead to a bit more complex setup. So, I tried keeping the bridge
> but filter by mac and ip address, for example
>
> iptables -A PREROUTING -t mangle -m physdev --physdev-in tap+ -m mac --
> mac-source 00:XX:XX:XX:XX:XX -s 10.11.12.1 -j ACCEPT
> ... more rules for other instances ...
> iptables -A PREROUTING -t mangle -m physdev --physdev-in tap+ -j DROP

No, this is wrong. You cannot do filtering decisions (DROP) in the mangle
table.

Doing proper bridge filtering in the filter table, in the FORWARD
chain, is doable as long as CONFIG_BRIDGE_NETFILTER is enabled in your
kernel configuration. After that it's a simple matter of using physdev,
as you show above.
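
A minimal sketch of what such filter-table rules could look like, assuming one tap device per instance NIC. The function name print_forward_rules and the sample MAC/IP values are hypothetical, and the sketch is untested on a real bridge. It only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) FORWARD-chain rules that tie one tap device to one
# MAC/IP pair. Assumes bridged traffic is passed to iptables
# (CONFIG_BRIDGE_NETFILTER) and that the node runs these as root.
print_forward_rules() {
    tap=$1
    mac=$2
    ip=$3
    # Instance -> network: only the expected MAC and source IP on this tap.
    echo "iptables -A FORWARD -m physdev --physdev-in $tap -m mac --mac-source $mac -s $ip -j ACCEPT"
    # Network -> instance: only traffic addressed to the instance's IP.
    echo "iptables -A FORWARD -m physdev --physdev-out $tap -d $ip -j ACCEPT"
    # Anything else entering the bridge from this tap is dropped.
    echo "iptables -A FORWARD -m physdev --physdev-in $tap -j DROP"
}

print_forward_rules tap0 00:16:3e:00:00:01 10.11.12.1
```

Once the printed rules look right, the output could be piped to sh as root. Because the rules are appended in order, the ACCEPT for the matching MAC/IP pair is evaluated before the per-tap DROP.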

Plain Xen (non-ganeti) has more advice/scripts about this, by the way.

regards,
iustin

Roberto Espinoza

May 12, 2010, 9:16:54 PM

to gan...@googlegroups.com

> Doing proper bridge filtering in the filter table, in the FORWARD
> chain, is doable as long as CONFIG_BRIDGE_NETFILTER is enabled in your
> kernel configuration. After that it's a simple matter of using physdev,
> as you show above.

Actually, my first try used the FORWARD chain, as I know that is how it should be done, but the traffic I checked in the FORWARD chain didn't show the MAC address (ARP traffic?). It sounds like the kernel option you mention is what enables that.

Hopefully this mixed with the KVM scripts should be more secure.

Thanks Iustin!

ricardo

May 13, 2010, 2:52:13 AM

to ganeti

Hello Iustin,

I tried this script initially

#!/bin/sh

ifconfig $INTERFACE 0.0.0.0 up
brctl addif $BRIDGE $INTERFACE

eval `iptables-save | fgrep $INTERFACE | sed 's/-A FORWARD/-D FORWARD/g' | awk '{print "iptables " $0 "; "}'`
iptables -I FORWARD -m physdev --physdev-in $INTERFACE -s $IP -j ACCEPT
iptables -I FORWARD -m physdev --physdev-out $INTERFACE -d $IP -j ACCEPT
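
The cleanup line in the script above turns every saved FORWARD rule that mentions the interface into a matching delete. Here is a sketch of the same idea as a filter that only prints the delete commands, so it can be tried on saved output first. The function name stale_rule_deletes is hypothetical. Unlike the original line, it ignores rules from other chains and matches the interface as a whole word, so tap1 does not also match tap10.

```shell
#!/bin/sh
# Read iptables-save output on stdin and print one "iptables -D FORWARD ..."
# command for every FORWARD rule that mentions the given interface.
stale_rule_deletes() {
    iface=$1
    # -w: whole-word match (GNU/BSD grep), -F: treat the name as a fixed string.
    # sed -n ... p keeps only lines that really are FORWARD rules.
    grep -wF -- "$iface" | sed -n 's/^-A FORWARD /iptables -D FORWARD /p'
}

# Typical use on a node, as root:
#   iptables-save -t filter | stale_rule_deletes tap0
```

The printed commands can be inspected and then piped to sh, which avoids running eval on unchecked output.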

And it works pretty well, except that the instance can still reach
the node.
For example, let's say the IP of the node is 10.10.10.1.

Inside the instance, if you change the IP (let's say to 10.10.10.200), you
cannot get out of the node but you can still reach 10.10.10.1.

I tried
iptables -I INPUT -m physdev --physdev-in $INTERFACE ! -s $IP -j DROP

It works, in that it stops all the traffic going from the instance to the
node, but the node can still send packets even if the instance cannot
reply.

I checked the OUTPUT chain but there is no reliable way (that I know of)
to identify this IP, mostly because the interface being used is br0, so
you only see IPs. Suddenly the idea of a transparent bridge gets
complicated.
Also, dropping everything in the OUTPUT chain adds more complexity to
the rules.

Do you have any ideas?

Guido Trotter

May 13, 2010, 5:00:13 AM

to gan...@googlegroups.com

On Thu, May 13, 2010 at 7:52 AM, ricardo <ricardo....@gmail.com> wrote:
>
> Hello Iustin,
>
> I tried this script initially
>
> #!/bin/sh
>
> ifconfig $INTERFACE 0.0.0.0 up
> brctl addif $BRIDGE $INTERFACE
>
> eval `iptables-save | fgrep $INTERFACE | sed 's/-A FORWARD/-D FORWARD/
> g' | awk '{print "iptables " $0 "; "}'`
> iptables -I FORWARD -m physdev --physdev-in $INTERFACE -s $IP -j
> ACCEPT
> iptables -I FORWARD -m physdev --physdev-out $INTERFACE -d $IP -j
> ACCEPT
>
>
> And it works pretty good except that if the instance can still search
> the node.
> for example, lets say the IP for the node is 10.10.10.1
>
> inside the instance if you change the ip (lets say 10.10.10.200), you
> cannot get out of the node but you reach 10.10.10.1.
>
> I tried
> iptables -I INPUT -m physdev --physdev-in $INTERFACE ! -s $IP -j DROP
>
> It works as it stops all the traffic going from the instance to the
> node, but the node can still send packets even if the instance cannot
> reply.
>

Well, but what's the problem with that? What you're concerned about is
a rogue instance changing its IP to something invalid, right?
If it does, it won't be able to communicate, neither with the node nor
with the network, so you should be OK.

> I checked the output chain but there is no reliable way (that I know)
> to identify this IP. Mostly because the interface being used is br0 so
> you only see IPs. Suddenly the idea of transparent bridge gets
> complicated.
> Also dropping everything in the OUTPUT chain add more complexity to
> the rules.
>
> Do you have any ideas?

Well, you could try (untested):
-I OUTPUT -m physdev --physdev-out $INTERFACE ! -d $IP -j DROP
But are you sure you actually need to go this far? What's the threat
you're trying to defend against, here?

Thanks,

Guido

Ricardo Espinoza

May 13, 2010, 6:45:58 AM

to gan...@googlegroups.com

Guido,

Thanks for the comments. I actually tried that rule but it didn't work. Personally I think this is enough, but it is a requirement because from time to time we accept instances from customers for a brief period of time.

So I can secure the nodes and all the infrastructure I handle.

But for example:

Let's say you have an instance running on 10.10.10.1, on node1. If someone with an instance on the same node changes its IP to 10.10.10.30, they cannot reach the internet but they can reach 10.10.10.1.

The physdev match doesn't work, I suppose because the traffic is local, so it never leaves the bridge. I can see the traffic in the OUTPUT chain but I cannot tell in any way whether the traffic from 10.10.10.30 comes from the allowed tap.

I am now trying routed instances and they seem to work when not using a table ($LINK is blank). I tried changing the IP and, as expected, you cannot get out of the instance. This should solve my problem, right?

Do you have any information about using $LINK with tables? (It's my first time doing it.)

Any tips to point me in the right direction would be awesome.

Ricardo

2010/5/13 Guido Trotter <ultr...@gmail.com>

Guido Trotter

May 13, 2010, 7:54:26 AM

to gan...@googlegroups.com

On Thu, May 13, 2010 at 11:45 AM, Ricardo Espinoza
<ricardo....@gmail.com> wrote:
> Guido,
> Thanks for the comments. I actually tried that rule but didn't work. I per
> see think this is enough but it is a requirement because from time to time
> we accept instances from customers for a brief period of time,
> So I can secure the nodes all the infrastructure I handle.
> But for example.
> Lets say you have an instance running in 10.10.10.1, this is in node1. So,
> if someone with an instance in the same node changes the IP to 10.10.10.30,
> they cannot reach the internet but they can reach 10.10.10.1
> The physdev doesn't work because I suppose the traffic is local, so it never
> leaves the bridge. I can see the traffic in the OUTPUT chain but I cannot
> tell in anyway if the traffic from 10.10.10.30 is from the allowed tap.

I'm a bit lost here. Could you document what entity has what IP, and
what it is trying to do?
Let's make a supposition:
node (node1) has address 10.10.10.1
instance (instance1) has address 10.10.10.101

The filter rules are in place.

Now when the instance changes its address to 10.10.10.30 it can no longer
talk to the internet, nor can it talk to the node, due to the
rules. The node, though, not having an iptables OUTPUT rule, could
ping the instance on 10.10.10.30. So what you might want is to take a
look at arptables, to filter the ARP packets the instance can send.
If its ARP replies for wrong IPs are filtered, the node won't even try
to contact it on the wrong address, and you should be fine.
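
An untested sketch of that ARP-filtering idea. Note that it uses ebtables rather than arptables, as an assumption: ebtables can match on the bridge port (-i tapN) directly. The function name print_arp_filter and the sample values are hypothetical, and the commands are only printed, not run.

```shell
#!/bin/sh
# Print (do not run) ebtables rules that let one tap device send ARP only
# with its assigned sender IP. Swapped-in tool: ebtables, not arptables.
print_arp_filter() {
    tap=$1
    ip=$2
    # INPUT sees ARP delivered to the node itself; FORWARD sees ARP that is
    # bridged on to other ports (other instances, the physical network).
    for chain in INPUT FORWARD; do
        # ARP from this tap whose sender IP is the instance's own address.
        echo "ebtables -A $chain -i $tap -p ARP --arp-ip-src $ip -j ACCEPT"
        # Every other ARP packet from this tap is dropped.
        echo "ebtables -A $chain -i $tap -p ARP -j DROP"
    done
}

print_arp_filter tap0 10.10.10.101
```

One side effect to be aware of: ARP probes sent with a 0.0.0.0 sender address (duplicate address detection) would also be dropped by the second rule.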

> I am trying now with routed instances and they seem to work when not using a
> table ($LINK is blank). I tried changing the IP and as expected, you cannot
> get out of the instance. This should solve my problem, right?

Yes, it should. Then you need to use proxy arp on your main ethernet
or some other way to make sure the traffic is routed to the right
node, for the right instance (routing daemon, ganeti nbma project,
etc).

> Do you have any information about using the $LINK with tables? (first time
> doing it).
> Any tips to point me to the right direction would be awesome.
>

The link parameter is useful if you have different sets of instances
which should not talk to each other.
Each of them gets isolated in its own routing table. The routing
table can then define a different output path (perhaps tunnelling their
traffic to an endpoint using GRE encapsulation, which is what the nbma
project does, though a separate ethernet interface could also be used).
Please note that by enabling it you get into policy routing, which is at
least "tricky" to get right (for example, the nodes won't be able to talk
to the instances unless you make ip rule changes to allow it).

Basically you have three ways to use routed instances:
- normal routing (no link), the node has full communication with all instances
- policy routing (link), each routing table can direct instance
traffic its own way. different instances won't see each other. you can
add policies to make node traffic talk with all instances, if they
have separate ip spaces
- policy routing (link) and overlapping ip spaces: you can also run
this way, but this will break path mtu discovery from the node to the
instance, so you have to make sure not to need it, or to circumvent
the issue somehow.
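
To make the policy-routing variant concrete, here is a rough sketch using plain iproute2 commands. It is not Ganeti's own network script. The function name print_link_routing, table number 100, the gateway and the dedicated uplink eth1 are all hypothetical, and it assumes the uplink already has an address in the gateway's subnet. The commands are only printed.

```shell
#!/bin/sh
# Print (do not run) iproute2 commands that give one link its own routing table.
print_link_routing() {
    tap=$1      # instance tap device
    ip=$2       # instance address
    table=$3    # routing table number used for this link (assumption)
    gw=$4       # next hop for this link's outbound traffic (assumption)
    uplink=$5   # interface dedicated to this link (assumption)
    # Inside the link's table: a host route to the instance and a default route out.
    echo "ip route add $ip/32 dev $tap table $table"
    echo "ip route add default via $gw dev $uplink table $table"
    # Traffic entering from the instance or from the link's uplink uses that table.
    echo "ip rule add iif $tap lookup $table"
    echo "ip rule add iif $uplink lookup $table"
}

print_link_routing tap0 10.1.2.51 100 10.1.2.1 eth1
```

Because the host route lives only in table 100, traffic looked up in the main table (including the node's own traffic) has no route to the instance, which matches the remark above that extra ip rule entries are needed before the node can talk to it.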

Thanks,

Guido

ricardo

May 13, 2010, 9:03:43 AM

to ganeti

Wow. routed mode seems like a different kind of monster :)

>
> I'm a bit lost here. Could you document what entity has what ip, and
> what it is trying to do.
> Let's make a supposition:
> node (node1) has address 10.10.10.1
> instance (instance1) has address 10.10.10.101
>
> the filter rules are in place.
>
> Now when the instance changes its address 10.10.10.30 it can't talk
> anymore to the internet, nor it can talk to the node, due to the
> rules. The node, though, not having an iptables OUTPUT rule, could
> ping the instance on 10.10.10.30. So what you might want is to take a
> look at arp tables, to filter the arp requests the instance can give.
> If it arp replies for wrong IPs are filtered, the node won't even try
> to contact it on the wrong address, and you should be fine.
>

I just setup some testing instances to try out (if you are still
there).

node1 is 10.1.1.11 (gateway 10.1.1.1)

instance1 is 10.1.2.51 (gateway 10.1.2.1)
instance2 is 10.1.3.51 (gateway 10.1.3.1)

Now in the node

#ip route show
10.1.1.0/24 dev br0 proto kernel scope link src 10.1.1.11
default via 10.1.1.1 dev br0

#iptables -L -n
Chain INPUT (policy ACCEPT)
target  prot opt source     destination
DROP    all  --  10.1.2.51  0.0.0.0/0    PHYSDEV match --physdev-in tap0
DROP    all  --  10.1.3.51  0.0.0.0/0    PHYSDEV match --physdev-in tap1

Chain FORWARD (policy DROP)
target  prot opt source     destination
ACCEPT  all  --  0.0.0.0/0  10.1.2.51    PHYSDEV match --physdev-out tap0
ACCEPT  all  --  10.1.2.51  0.0.0.0/0    PHYSDEV match --physdev-in tap0
ACCEPT  all  --  0.0.0.0/0  10.1.3.51    PHYSDEV match --physdev-out tap1
ACCEPT  all  --  10.1.3.51  0.0.0.0/0    PHYSDEV match --physdev-in tap1

Now, both instances can ping outside. They can also ping each other
because the gateway (router) allows it.

For example, tracing the route from instance1 to instance2:

~# traceroute -n test2
traceroute to test2 (10.1.3.51), 30 hops max, 40 byte packets
1 10.1.2.1 0.675 ms 0.647 ms 0.637 ms
2 10.1.3.51 2.206 ms 2.204 ms 2.196 ms

Now I will change the IP on test1 from 10.1.2.51 to 10.1.3.100.
As expected, with the iptables rules in place I cannot reach the internet.
But then, I can reach 10.1.3.51 and

Likewise, if I change to 10.1.1.100, I can reach the nodes (10.1.1.11
and so on).

For testing routed mode I enabled proxy ARP on my eth0.
I just added this to my sysctl.conf:

net.ipv4.ip_forward=1
net.ipv4.conf.eth0.forwarding=1
net.ipv4.conf.eth0.proxy_arp=1

Should that be it?
Just one question, as I can't test right away: what happens when you
move the instance to another node?
Isn't the static route left on the old node?

Thanks.

Guido Trotter

May 13, 2010, 9:11:14 AM

to gan...@googlegroups.com

On Thu, May 13, 2010 at 2:03 PM, ricardo <ricardo....@gmail.com> wrote:
> Wow. routed mode seems like a different kind of monster :)
>

Yes. It can be very entertaining though, once you got the hang of it. :)

Don't you want ! -s 10.1.2.51 here, in the INPUT rule for tap0?

> DROP all -- 10.1.3.51 0.0.0.0/0 PHYSDEV
> match --physdev-in tap1
>

Same here.

> Chain FORWARD (policy DROP)
> target prot opt source destination
> ACCEPT all -- 0.0.0.0/0 10.1.2.51 PHYSDEV
> match --physdev-out tap0
> ACCEPT all -- 10.1.2.51 0.0.0.0/0 PHYSDEV
> match --physdev-in tap0
> ACCEPT all -- 0.0.0.0/0 10.1.3.51 PHYSDEV
> match --physdev-out tap1
> ACCEPT all -- 10.1.3.51 0.0.0.0/0 PHYSDEV
> match --physdev-in tap1
>
>
> Now, both instances can ping outside. They can also ping between them
> because the gateway(router) allows it.
>
> For example in the instance1 pinging instance2
>
> ~# traceroute -n test2
> traceroute to test2 (10.1.3.51), 30 hops max, 40 byte packets
> 1 10.1.2.1 0.675 ms 0.647 ms 0.637 ms
> 2 10.1.3.51 2.206 ms 2.204 ms 2.196 ms
>
>
> Now, I will change in test1 the IP from 10.1.2.51 to 10.1.3.100
> As expected, with the iptables tool I cannot reach the internet.
> But then, I can reach 10.1.3.51 and
>

That's why you should investigate arptables to block those arp replies.

> Same, if i change to 10.1.1.100, then I can reach the nodes (10.1.1.11
> and so on).
>
>
>
> For testing routed I actually enabled proxy arp on my eth0.
> Just added this to my sysctl.conf
>
> net.ipv4.ip_forward=1
> net.ipv4.conf.eth0.forwarding=1
> net.ipv4.conf.eth0.proxy_arp=1
>
> that should it?

Yes.

> just one question as I can't test right away, what happens when you
> move the instance to other node?
> Isn't the static route left in the old node?
>

No, the routes disappear as the interface goes down. Handy.
Any iptables rule (or ip policy routing rule) needs special handling, though.

Thanks,

Guido
