Packets are dropped between some of my OVS switches.

gonçalo Semedo

May 8, 2014, 3:18:18 PM
to geni-...@googlegroups.com
Hi,

I don't know what is happening, but it seems that packets are being dropped between some of my OVS switches.

I attached a screenshot of my topology and a screenshot of a Wireshark capture.

When I ping a host, the ARP request and reply are flooded, and they arrive at their destination.

ICMP packets are routed using a random algorithm, but I know that the switches send them out the correct port and avoid loops.

Using Wireshark, I notice that the switches are exchanging OpenFlow messages with the controller, but there is always one switch that doesn't receive the packet forwarded by the previous switch.

I also notice, in Wireshark, an error in every OpenFlow packet, as shown in the attached screenshot. What does this error mean?

The Wireshark capture is an example of a switch that received a Packet Out decision from the controller, but the next switch did not receive any packet.

Thanks 
Gonçalo
wiresharkCapture.png
myTopo.png

Niky Riga

May 8, 2014, 3:50:52 PM
to geni-...@googlegroups.com
Hi Goncalo,

As general advice: in a topology with loops you should NEVER flood packets, even if the destination MAC address is the broadcast address (as is the case for an ARP request). Even if only a few packets are flooded, they get multiplied depending on the number of paths you have and can cause a broadcast storm in the network. Remember that in an OpenFlow network nobody is running STP for you, so you have to make sure that packets do not get multiplied by flooding on loops.

I believe that the error you see in Wireshark is a dissector error and not an OpenFlow error, i.e. Wireshark has a problem parsing the packet; there is no problem with the communication between the controller and the switch.

I am not sure exactly what is happening, but you might want to run tcpdump directly on the data interfaces of your switches (not the interface to the controller) to track the packets that actually go out and see whether the packet ever left the switch.
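
The tcpdump check described here could be scripted roughly as follows. This is only a sketch: the interface names eth1 and eth2 and the helper name capture_data_iface are assumptions for illustration, and the capture has to run as root on the switch node itself.

```shell
# Capture ARP and ICMP on one data-plane interface of an OVS node.
# -n: no name resolution, -e: print MAC addresses, -i: interface to watch.
capture_data_iface() {
    tcpdump -n -e -i "$1" arp or icmp
}

# Usage (as root, one terminal per interface; eth1/eth2 are assumed names):
#   capture_data_iface eth1
#   capture_data_iface eth2
```

Running the same capture on the outgoing interface of one switch and the incoming interface of the next shows on which side of the link the packet was last seen.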

Also, as you try to debug this, it might make sense to use a simpler topology:
 host1 -- ovs1 -- ovs2 -- host2
just to see whether this is a problem with the loops or something else. In a simpler topology you can also use a learning switch to verify that connectivity is there before introducing your custom controller (don't use a learning switch in a looped topology).

Cheers,
niky

PS: I see that you are using EGRE tunnels within one site. I assume this is fine, since you are able to exchange ARPs; I am just mentioning it in case someone on the list thinks that this would not work.
--
GENI Users is a community supported mailing list, so please help by responding to questions you know the answer to.
 
If this is your first time posting a question to this list, please review http://groups.geni.net/geni/wiki/GENIExperimenter/CommunityMailingList
---
You received this message because you are subscribed to the Google Groups "GENI Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to geni-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

gonçalo Semedo

May 10, 2014, 8:39:37 AM
to geni-...@googlegroups.com
Hi Niky,

With smaller topologies everything works fine. The problem is when the topology is more complicated.

I was trying to use stitching instead of egre-tunnels, but every time I try to stitch more than one link, it fails.

Is there a problem with the stitching service, or am I doing something wrong? I attached my rspec.

Thanks
Gonçalo


starStitch.rspec

Xi Yang

May 10, 2014, 10:31:19 AM
to geni-...@googlegroups.com
Hi Goncalo, 

I tried your rspec with the stitching computation service (SCS). The expanded request rspec looks good.
Can you be specific about the problem you had? What did you see from stitcher.py output?

—Xi

gonçalo Semedo

May 10, 2014, 1:17:29 PM
to geni-...@googlegroups.com
It was something like "vlantag xxxx not available".

But I tried again and I was able to reserve my topology.

Niky, I did what you said: I ran tcpdump on the switches' interfaces, and the ICMP request leaves the switch but doesn't arrive at the next switch.

In the topology in this rspec, this happens when the packet is sent to utahOVS. Communication between stanOVS and gpoOVS works fine.

The configuration on the utahOVS is the following:

ovs-vsctl add-br test
ovs-vsctl add-port test eth1
ovs-vsctl add-port test eth2
ovs-vsctl set-controller test tcp:controller_ip:6633
ovs-vsctl set-fail-mode test secure
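
A sketch of commands that could be used to double-check this configuration on utahOVS, assuming the bridge is named test as above. The helper name inspect_bridge is hypothetical; the ovs-vsctl and ovs-ofctl subcommands it calls are standard Open vSwitch tools.

```shell
# Print the state of one OVS bridge: ports, controller, flows, port counters.
inspect_bridge() {
    br=$1
    ovs-vsctl show              # bridges, their ports, and controller connection status
    ovs-ofctl show "$br"        # OpenFlow port numbers mapped to interface names
    ovs-ofctl dump-flows "$br"  # flow entries currently installed on the bridge
    ovs-ofctl dump-ports "$br"  # per-port rx/tx packet, drop, and error counters
}

# Usage (as root on the switch node):
#   inspect_bridge test
```

With fail-mode set to secure the bridge does not set up flows on its own, so it may be worth comparing the OpenFlow port numbers printed by ovs-ofctl show with the ports the controller uses in its Packet Out messages.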


Could you please check whether something is wrong with my rspec? Do I need any additional configuration?

Thanks
Gonçalo
starStitch.rspec

Niky Riga

May 12, 2014, 6:17:53 PM
to geni-...@googlegroups.com
Hi Goncalo,

I have tried several times to create the topology, but I keep getting errors at reservation time. The most recent one is that I failed to get resources
at Stanford (https://www.instageni.stanford.edu/spewlogfile.php3?logfile=111f25227d2a3a05f74ecbe35129600a).

I think it would be easier for me to help you debug if you just install my key on all your nodes. If you are familiar
with ssh keys, you should just append the contents of the attached public key to the ~/.ssh/authorized_keys file on each of your nodes.
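
One way to do the append step, as a sketch. The function name add_pubkey and the key file name in the usage line are assumptions; use whatever name the attachment was saved under.

```shell
# Append the first line of a public key file to authorized_keys, once.
# $1: public key file; $2: authorized_keys path (defaults to the current user's).
add_pubkey() {
    pubkey_file=$1
    auth_file=${2:-$HOME/.ssh/authorized_keys}
    key=$(head -n 1 "$pubkey_file")
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"   # sshd may ignore the file if others can write to it
    # Add the key only if that exact line is not already present.
    grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}

# Usage on each node (file name is an assumed example):
#   add_pubkey id_geni_ssh_rsa.pub
```

Because the function checks for the exact key line first, running it twice on the same node does not add a duplicate entry.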

If you are using the linux version of stitcher/omni and you are using a portal account then there is probably a better
way to do this. Let me know and I can guide you through this.

Cheers,
Niky

id_geni_ssh_rsa (1).pub

gonçalo Semedo

May 14, 2014, 11:57:47 AM
to geni-...@googlegroups.com
Hi Niky,

I am no longer able to request my stitched topology, so I gave up and used egre-tunnels instead, but the problem is exactly the same.


I attached my new rspec.


I am using Linux and I have a portal account, so how can I add your public key? Maybe now that I am using egre-tunnels you can reserve this topology?


Thanks
Gonçalo



test.rspec

Leigh Stoller

May 14, 2014, 12:05:32 PM
to geni-...@googlegroups.com
> I am no longer able to request my stitched topology, so I gave up and used egre-tunnels instead, but the problem is exactly the same.

Well, it can’t be *exactly* the same, right? :-)

You will need to tell us what the errors were, which aggregates, the
output of omni, etc.

Leigh





gonçalo Semedo

May 14, 2014, 12:16:56 PM
to geni-...@googlegroups.com
I was able to reserve my topology with egre-tunnels; the problem I was referring to was the dropped packets between the OVS switches ;)

