*What's new*
We're glad to announce an entirely new version of RouteFlow, with many new
features in response to our first year's experiences and the requests from
users and developers!
The version has been in an experimental branch for some time now and is
stable enough to become mainstream.
In this new version, we have introduced:
- Centralized database and IPC
   - We leverage MongoDB for storing the core system's state and the
   OpenFlow network statistics. A JSON-based IPC service (aka RouteFlow
   protocol) is also implemented on top of it.
- Cleaner code base
   - Much of the code was rewritten and reorganized, making it easier for
   developers to play with RouteFlow.
   This includes the renaming of some components: RF-Slave becomes
   RFClient, RF-Controller becomes RFProxy.
- POX support
   - Support for the new POX controller was added.
- Web monitoring interface (requires POX)
   - Inspect network topology, RouteFlow internal messages and network
   state.
- Open vSwitch v1.4
   - Attaches the virtual interfaces (eth1 to ethX) of the VMs.
   - Also used in the control network (attaching eth0); running it in
   bridge mode removes the requirement of a second controller instance to
   act as a simple L2 switch.
- Tools for testing
   - A new module (rftest) introduces several scripts to facilitate
   testing and environment creation.
- SNMP support
   - Export OpenFlow stats via SNMP. [Contribution by Joe Stringer
   <https://github.com/joestringer>]
- Foremost, we want to make RouteFlow more configurable and easier to
configure. Currently, there's no trivial way to associate VMs and datapaths
statically, but we want to solve this through a new configuration approach.
- RouteFlow with NOX requires Ubuntu 11.04 (POX users should be fine in
newer versions). We will be adding support for Ubuntu 11.10 and 12.04.
- Embrace OpenFlow v1.X. We have working prototypes of NOX and a
software-based reference switch using OpenFlow 1.1 and 1.2.
- Extensions to support LDP label information.
- Exploration of possibilities opened by the use of a central database
(e.g., keep state history and allow queries like "show me flow table at
timestamp x").
- Address High Availability.
- New routing abstractions implemented as Services on top of the
RF-Server.
- ... a number of additions under investigation by students and project
collaborators.
Stay tuned for further news!
Thank you all!
-- Christian Esteve Rothenberg, Ph.D.
Converged Networks Business Unit
CPqD - Center for Research and Development in Telecommunications
Tel. (+55 19) 3705 4479 / Cel. (+55 19) 8193-7087
Nice to see! Definitely fewer moving pieces in the new design, which is
great! Being able to use POX as well will also be helpful.
Unfortunately the new setup doesn't appear to work for me with either
NOX or POX - no packets are received by the control VM rfvm1. I used
Ubuntu 11.04 (x64).
I can build fine (though I had to add CPPFLAGS=-fPIC, LDFLAGS=-fPIC).
It's also a bit scary that mongodb is built with prefix /usr which
would overwrite any packaged distribution - but no big deal.
Also the README says to use --nox vs --pox for testing. The scripts
though are actually rftest1 and rftest1_pox and they don't take
command line arguments - and the default user password is root/root
not routeflow.
Anyway - on to the actual problem. :-) switch1 is correctly set up and
when I ping from b1 or b2, ARP packets are tunnelled to the
controller. However rfvm1 never sees them. I uncommented the logging
line in RouteFlow/pox/ext/rfproxy.py:
    results = rftable.find({VS_ID: str(event.dpid)})
    if results.count() == 0 or results[0][VM_ID] == "":
        results = rftable.find({VS_ID: {"$ne": ""}, DP_ID: str(event.dpid),
                                DP_PORT: str(event.port)})
        if results.count() == 0:
            log.info("Datapath not associated with a VM")
            return
And I see now "Datapath not associated with a VM" on each ping
attempt.
Please let me know if you have any suggestions. Otherwise I will
continue to troubleshoot.
Thanks.
# ./rftest1_pox
Resetting and stopping LXC VMs...
Stopping any running instances and data of rfserver, POX, OVS and MongoDB...
Starting MongoDB...
all output going to: /dev/null
Starting the rfvm1 virtual machine...
Starting the management network (br0)...
Starting POX and the RouteFlow network controller...
POX 0.0.0 / Copyright 2011 James McCauley
2012-06-04 01:57:41,092 - ext.rfproxy - INFO - RFProxy running.
2012-06-04 01:57:42,538 - openflow.topology - INFO - Switch 60-eb-69-21-5b-92 connected
DP is up, installing config flows... `?i![
Starting the RouteFlow server...
Starting the control plane network (dp0 OVS)...
2012-06-04 01:57:46,241 - openflow.topology - INFO - Switch 76-73-72-66-76-73|29286 connected
DP is up, installing config flows... rfvsrfv
Starting the sample network...
2012-06-04 01:57:49,744 - openflow.topology - INFO - Switch f6-39-ee-55-73-49 connected
DP is up, installing config flows... ?9?Us
Now we'll open this test's log.
Try pinging b1 from b2:
$ sudo lxc-console -n b1
Login and run:
$ ping 172.31.2.2
Jun 04 01:57:22|00042|bridge|WARN|bridge dp0: using default bridge Ethernet address c6:04:a7:73:d2:45
Jun 04 01:57:22|00043|ofproto|INFO|datapath ID changed to 7266767372667673
Jun 04 01:57:22|00044|rconn|INFO|dp0<->tcp:127.0.0.1:6633: connecting...
Jun 04 01:57:22|00045|rconn|WARN|dp0<->tcp:127.0.0.1:6633: connection failed (Connection refused)
Jun 04 01:57:22|00046|rconn|INFO|dp0<->tcp:127.0.0.1:6633: waiting 1 seconds before reconnect
Jun 04 01:57:22|00047|bridge|WARN|bridge switch1: using default bridge Ethernet address 9a:2c:80:a6:9b:47
Jun 04 01:57:22|00048|ofproto|INFO|datapath ID changed to 00009a2c80a69b47
Jun 04 01:57:22|00049|rconn|INFO|switch1<->tcp:127.0.0.1:6633: connecting...
Jun 04 01:57:22|00050|rconn|WARN|switch1<->tcp:127.0.0.1:6633: connection failed (Connection refused)
Jun 04 01:57:22|00051|rconn|INFO|switch1<->tcp:127.0.0.1:6633: waiting 1 seconds before reconnect
About Mongo in /usr: yep, that's not ideal. We plan to support Ubuntu
12.04 soon, and then it will be a matter of apt-getting mongo :)
As for the testing, the default user/password for the prebuilt VM is
routeflow/routeflow. The LXC containers are root/root.
Anyhow, I changed the scripts a lot in the last commits a few days ago. Try
pulling from the repository. I believe they're much better, though not
quite flawless :) You might have run into some troublesome commit.
There's a small glitch: in the first run of the tests after booting, OVS
behaves badly sometimes. You have to close the script and try again, then
it should work.
Please let me know if you run into any problems :)
On Mon, Jun 4, 2012 at 6:14 AM, joshb <jo...@google.com> wrote:
> Hi Christian;
> [...]
On Mon, 4 Jun 2012, Allan Vidal wrote:
> Hi Josh,
> [...]
OK, I found the problem. Firstly, there are a lot of startup races in rftest1. For example, if MongoDB fails to start in time, nothing else works and everything just crashes. I added some checks to make sure MongoDB starts before proceeding. I'll send a patch.
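A readiness check of the kind described here can be sketched as follows. This is only an illustration, not the patch mentioned above: it assumes Python 3, and the helper name `wait_for_port` and its parameters are hypothetical. It simply retries a TCP connect until the service accepts connections or a deadline passes.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connect to (host, port) succeeds.

    Returns False if no connection could be made before `timeout`
    seconds have elapsed. (Hypothetical helper, not RouteFlow code.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the daemon is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test script could call `wait_for_port("127.0.0.1", 27017)` right after launching mongod (27017 is MongoDB's default port) and abort with a clear message if it returns False, rather than starting the remaining components against a database that is not up yet.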
But the actual problem is that RFServer gets a datapath join event for "0x60eb69215b92", which happens to be br0. The control plane datapath joins without a problem, but then switch1 tries to join. By then br0 has already claimed the control plane VM, so nothing works.
I just hacked rfserver.cc to never associate 0x60eb69215b92 with a VM. Then when switch1 comes up the VM is free and everything works.
I'm not sure how the code as currently checked in can work at all, unless I am missing something - e.g., br0 just happens to be very slow to come up, so it comes up last. Probably the right solution is to add a more robust check that ignores br0 no matter when it comes up.
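The failure mode and the proposed fix can be illustrated with a toy model. This is a sketch in Python 3 under stated assumptions: the real logic lives in rfserver.cc and is not reproduced here, the class `VMAssociator` and its methods are hypothetical, and switch1's datapath ID is assumed to be the one POX reported while the sample network was starting.

```python
class VMAssociator:
    """Toy model of first-come datapath-to-VM association (hypothetical)."""

    def __init__(self, free_vms, ignored_dpids=()):
        self.free_vms = list(free_vms)
        self.ignored = set(ignored_dpids)
        self.table = {}  # dpid -> VM id

    def datapath_join(self, dpid):
        """Give a joining datapath the next free VM, unless it is ignored."""
        if dpid in self.ignored:
            return None  # e.g. the management bridge: never hand it a VM
        if dpid in self.table:
            return self.table[dpid]
        if not self.free_vms:
            return None  # nothing left to associate with
        vm = self.free_vms.pop(0)
        self.table[dpid] = vm
        return vm


BR0 = 0x60EB69215B92      # br0, as identified in the message above
SWITCH1 = 0xF639EE557349  # assumed to be switch1 (from the POX log)

# Without a check, whichever datapath joins first claims the only VM.
naive = VMAssociator(["rfvm1"])
naive_result = [naive.datapath_join(BR0), naive.datapath_join(SWITCH1)]

# With br0 ignored, the join order no longer matters.
guarded = VMAssociator(["rfvm1"], ignored_dpids=[BR0])
guarded_result = [guarded.datapath_join(BR0), guarded.datapath_join(SWITCH1)]

print(naive_result)    # ['rfvm1', None]
print(guarded_result)  # [None, 'rfvm1']
```

Since br0 gets a default Ethernet address, its datapath ID may differ between runs, so a hardcoded value like the one above is only good for a quick experiment. A more robust check would have to learn the ID at startup or identify the bridge some other way.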
br0 shouldn't be connecting to the controller, as it should be managed by
OVS itself. But that's what seems to be happening. I'll look into it and get
back to you ASAP.
On Tue, Jun 5, 2012 at 9:29 AM, Josh Bailey <jo...@google.com> wrote:
> Hi Allan;
> [...]
> On Mon, 4 Jun 2012, Josh Bailey wrote:
>> Hi Allan;
>> Thanks, things are a bit closer now but still not working. The rfvm1 VM
>> now receives packets but eth1 (for example) drops them all:
>> I'll keep troubleshooting but any suggestions appreciated.
I tested here, and br0 never connects to either NOX or POX. Which version
of OVS are you using?
What happens here is the following:
-> Starting the control plane network (dp0 VS)...
INFO:openflow.of_01:[Con 1/8243406406160905843] Connected to 76-73-72-66-76-73|29286
[...]
-> Starting the sample network...
INFO:openflow.of_01:[Con 2/34042236714307] Connected to 1e-f6-13-6d-3d-43
The DP IDs are:
~$ sudo ovs-vsctl get Bridge br0 datapath-id
"0000b69ab5a1b348"
~$ sudo ovs-vsctl get Bridge switch1 datapath-id
"00001ef6136d3d43"
And OVS says the configured controllers are:
~$ sudo ovs-vsctl get-controller br0
~$ sudo ovs-vsctl get-controller switch1
tcp:127.0.0.1:6633
I'm very curious as to why your br0 is connecting to the controller...
Based on the OVS man pages, I believe this shouldn't be happening (man
ovs-vsctl):
    ovs-vswitchd can perform all configured bridging and switching locally,
    or it can be configured to communicate with one or more external
    OpenFlow controllers. The switch is typically configured to connect to a
    primary controller that takes charge of the bridge's flow table to
    implement a network policy. In addition, the switch can be configured
    to listen to connections from service controllers. Service controllers
    are typically used for occasional support and maintenance, e.g. with
    ovs-ofctl.
Does this behavior persist? Do you have any theories?
As for the Mongo and other race conditions, you're right, we're still too
dependent on the order of the events. Your patch will be very helpful :)
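Comparing datapath IDs across this thread is awkward because they appear in three notations: `0x`-prefixed hex (RFServer), dashed hex with an optional `|` suffix (POX logs), and a quoted 16-digit hex string (`ovs-vsctl get Bridge ... datapath-id`). The sketch below normalizes them to integers. It assumes Python 3, the function names are hypothetical, and the reading of the `|` suffix as the upper 16 bits in decimal is inferred from the values in this thread rather than taken from POX documentation.

```python
def parse_dpid(text):
    """Parse a datapath ID in any notation seen in this thread into an int."""
    s = text.strip().strip('"')
    if "|" in s:
        # POX log style: low 48 bits as dashed hex, then what appears to be
        # the upper 16 bits as a decimal number after the '|'.
        low, high = s.split("|", 1)
        return (int(high) << 48) | int(low.replace("-", ""), 16)
    s = s.replace("-", "")
    if s.lower().startswith("0x"):
        s = s[2:]
    return int(s, 16)


def format_dpid(dpid):
    """Render a datapath ID as 16 hex digits, as ovs-vsctl prints it."""
    return "%016x" % dpid


# The same bridge written in RFServer and POX notation.
print(parse_dpid("0x60eb69215b92") == parse_dpid("60-eb-69-21-5b-92"))
# The control plane datapath, converted from POX to OVS notation.
print(format_dpid(parse_dpid("76-73-72-66-76-73|29286")))
```

With these helpers, the output of `ovs-vsctl get Bridge br0 datapath-id` can be checked directly against the IDs in the controller log to see whether br0 really is the datapath that connected. As a side note, the bytes of the control plane datapath ID 7266767372667673 are the ASCII text "rfvsrfvs", which may be why the test script prints `rfvsrfv` after "installing config flows...".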
On Tue, Jun 5, 2012 at 10:04 AM, Allan Vidal <all...@cpqd.com.br> wrote:
> Hi Josh,
> br0 shouldn't be connecting to the controller, as it should be managed by
> OVS itself. But it's what seems to be happening. I'll look into it and get
> back to you ASAP.
> Allan
> On Tue, Jun 5, 2012 at 9:29 AM, Josh Bailey <jo...@google.com> wrote:
>> Hi Allan;
>> OK, I found the problem. Firstly, there are a lot of startup races in
>> rftest1. For example, if MongoDB fails to start in time, nothing else works
>> and everything just crashes. I added some checks to make sure MongoDB
>> starts before proceeding. I'll send a patch.
>> But the actual problem is that RFServer gets a datapath join event for
>> "0x60eb69215b92", which happens to be br0. The control plane datapath joins
>> with no problem, but then switch1 tries to join. br0 has already claimed the
>> control plane VM, so nothing works.
>> I just hacked rfserver.cc to never associate 0x60eb69215b92 with a VM.
>> Then when switch1 comes up the VM is free and everything works.
>> I'm not sure how the code as currently checked in can work at all,
>> unless I am missing something, e.g. br0 just happens to be very slow to come
>> up so it comes up last. Probably the right solution is to add a more robust
>> check for br0 to ignore it no matter when it comes up.
>> Thanks,
>> On Mon, 4 Jun 2012, Josh Bailey wrote:
>>> Hi Allan;
>>> Thanks, things are a bit closer now but still not working. The rfvm1 VM
>>> now receives packets but eth1 (for example) drops them all:
>>> I'll keep troubleshooting but any suggestions appreciated.
>>> On Mon, 4 Jun 2012, Allan Vidal wrote:
>>> Hi Josh,
>>>> About Mongo in /usr: yep, that's not ideal. We plan to support Ubuntu
>>>> 12.04 soon, and then it will be a matter of apt-getting mongo :)
>>>> As for the testing, the default user/password for the prebuilt VM is
>>>> routeflow/routeflow. The LXC containers are root/root.
>>>> Anyhow, I changed the scripts a lot in the last commits a few days ago.
>>>> Try pulling from the repository. I believe they're much better, though
>>>> not quite flawless :) You might have run into some troublesome commit.
>>>> There's a small glitch: in the first run of the tests after booting, OVS
>>>> behaves badly sometimes. You have to close the script and try again,
>>>> then it should work.
>>>> Please let me know if you run into any problems :)
>>>> Allan
>>>> On Mon, Jun 4, 2012 at 6:14 AM, joshb <jo...@google.com> wrote:
>>>> Hi Christian;
>>>> Nice to see! Definitely fewer moving pieces in the new design, which is
>>>> great! Being able to use pox as well will also be helpful.
>>>> Unfortunately the new setup doesn't appear to work for me with either
>>>> nox or pox - no packets are received by the control VM rfvm1. I used
>>>> Ubuntu 11.04 (x64).
>>>> I can build fine (though I had to add CPPFLAGS=-fPIC, LDFLAGS=-fPIC).
>>>> It's also a bit scary that mongodb is built with prefix /usr, which
>>>> would overwrite any packaged distribution - but no big deal.
>>>> Also the README says to use --nox vs --pox for testing. The scripts
>>>> though are actually rftest1 and rftest1_pox and they don't take
>>>> command line arguments - and the default user/password is root/root,
>>>> not routeflow.
>>>> Anyway - on to the actual problem. :-) switch1 is correctly set up and
>>>> when I ping from b1 or b2, ARP packets are tunnelled to the
>>>> controller. However rfvm1 never sees them. I uncommented the logging
>>>> line in RouteFlow/pox/ext/rfproxy.py:
>>>>     results = rftable.find({VS_ID: str(event.dpid)})
>>>>     if results.count() == 0 or results[0][VM_ID] == "":
>>>>         results = rftable.find({VS_ID: {"$ne": ""},
>>>>                                 DP_ID: str(event.dpid),
>>>>                                 DP_PORT: str(event.port)})
>>>>         if results.count() == 0:
>>>>             log.info("Datapath not associated with a VM")
>>>>             return
>>>> And I now see "Datapath not associated with a VM" on each ping
>>>> attempt.
>>>> Please let me know if you have any suggestions. Otherwise I will
>>>> continue to troubleshoot.
>>>> Thanks.
>>>> # ./rftest1_pox
>>>> Resetting and stopping LXC VMs...
>>>> Stopping any running instances and data of rfserver, POX, OVS and MongoDB...
>>>> Starting MongoDB...
>>>> all output going to: /dev/null
>>>> Starting the rfvm1 virtual machine...
>>>> Starting the management network (br0)...
>>>> Starting POX and the RouteFlow network controller...
>>>> POX 0.0.0 / Copyright 2011 James McCauley
>>>> 2012-06-04 01:57:41,092 - ext.rfproxy - INFO - RFProxy running.
>>>> 2012-06-04 01:57:42,538 - openflow.topology - INFO - Switch 60-eb-69-21-5b-92 connected
>>>> DP is up, installing config flows... `?i![
>>>> Starting the RouteFlow server...
>>>> Starting the control plane network (dp0 OVS)...
>>>> 2012-06-04 01:57:46,241 - openflow.topology - INFO - Switch 76-73-72-66-76-73|29286 connected
>>>> DP is up, installing config flows... rfvsrfv
>>>> Starting the sample network...
>>>> 2012-06-04 01:57:49,744 - openflow.topology - INFO - Switch f6-39-ee-55-73-49 connected
>>>> DP is up, installing config flows... ?9?Us
>>>> Now we'll open this test's log.
>>>> Try pinging b1 from b2:
>>>> $ sudo lxc-console -n b1
>>>> Login and run:
>>>> $ ping 172.31.2.2
>>>> Jun 04 01:57:22|00042|bridge|WARN|bridge dp0: using default bridge Ethernet address c6:04:a7:73:d2:45
>>>> Jun 04 01:57:22|00043|ofproto|INFO|datapath ID changed to 7266767372667673
>>>> Jun 04 01:57:22|00044|rconn|INFO|dp0<->tcp:127.0.0.1:6633: connecting...
>>>> Jun 04 01:57:22|00045|rconn|WARN|dp0<->tcp:127.0.0.1:6633: connection failed (Connection refused)
>>>> Jun 04 01:57:22|00046|rconn|INFO|dp0<->tcp:127.0.0.1:6633: waiting 1 seconds before reconnect
>>>> Jun 04 01:57:22|00047|bridge|WARN|bridge switch1: using default bridge Ethernet address 9a:2c:80:a6:9b:47
>>>> Jun 04 01:57:22|00048|ofproto|INFO|datapath ID changed to 00009a2c80a69b47
>>>> Jun 04 01:57:22|00049|rconn|INFO|switch1<->tcp:127.0.0.1:6633: connecting...
>>>> Jun 04 01:57:22|00050|rconn|WARN|switch1<->tcp:127.0.0.1:6633: connection failed (Connection refused)
>>>> Jun 04 01:57:22|00051|rconn|INFO|switch1<->tcp:127.0.0.1:6633: waiting 1 seconds before reconnect
OK. I am now able to swap out "switch1" with a hardware OpenFlow switch and have pings work. However I had to fix another problem along the way.
This problem is in RFServer.cc. My hardware switch has lots of ports, of course, so I want to add them all. However, RFServer.cc only handles VM_MAP messages when there are no existing entries with VS_ID set. So it handles the first ports and then silently drops the rest, because the first ports added create entries with VS_ID set.
I just commented out query[VS_ID] = "" (see below) and now all VM_MAP messages are processed.
Would it be possible to commit a fix for this?
Thanks,
else if (type == VM_MAP) {
    VMMap *mapmsg = dynamic_cast<VMMap*>(&msg);
    syslog(LOG_INFO, "Mapping message arrived from vm=0x%llx", mapmsg->get_vm_id());
    // Search for VM's with no mapping
    map<string, string> query;
    vector<RFEntry> results;
    query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
    // query[VS_ID] = "";
    ^^^^^^^^^^^^^^^^^^^^^
    // Querying for VS_ID is enough, but we could play it safe and query all mapping attributes
    results = this->rftable->get_entries(query);
I'm not sure I understand the problem. You said RFServer handles the first
ports and then silently drops the rest, but that's not what should happen.
When a switch joins RFServer and there's an idle VM to connect, it will
create N entries, where N is the number of switch ports. The format of
these entries will be:
vm_id, -, -, -, dp_id, -
After that, the VM will be instructed to send mapping packets on each of
its interfaces. When a mapping message arrives at the controller and is
redirected to RFServer, we check if there's an unmapped entry in the format
above (thus, the check for VS_ID=""). When there is one, we make it an active
entry:
vm_id, vm_port, vs_id, vs_port, dp_id, dp_port
Removing the check for VS_ID="" could potentially cause valid entries to be
overwritten in the association table.
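The behaviour described above can be modelled with a small in-memory table. This is a sketch under assumptions, not RFServer's actual code: the table is a plain list of dicts instead of a MongoDB collection, the function names and the `require_unmapped` flag are hypothetical, and `dp_port` is left unassigned because the thread does not say how it is chosen.

```python
def on_datapath_join(table, vm_id, dp_id, n_ports):
    """Create one idle entry per switch port: (vm_id, -, -, -, dp_id, -)."""
    for _ in range(n_ports):
        table.append({"vm_id": vm_id, "vm_port": "", "vs_id": "",
                      "vs_port": "", "dp_id": dp_id, "dp_port": ""})


def on_vm_map(table, vm_id, vm_port, vs_id, vs_port, require_unmapped=True):
    """Activate one entry for this VM; return it, or None if none qualifies.

    With require_unmapped=True only entries whose vs_id is still empty are
    candidates, mirroring the VS_ID="" check discussed in the thread.
    """
    for entry in table:
        if entry["vm_id"] != vm_id:
            continue
        if require_unmapped and entry["vs_id"] != "":
            continue
        entry.update(vm_port=vm_port, vs_id=vs_id, vs_port=vs_port)
        return entry
    return None


# With the check, three mapping messages fill three distinct entries.
checked = []
on_datapath_join(checked, "vm1", "dp1", 3)
for port in ("1", "2", "3"):
    on_vm_map(checked, "vm1", port, "vs0", port)
print([e["vm_port"] for e in checked])    # -> ['1', '2', '3']

# Without it, every message matches (and overwrites) the first entry.
unchecked = []
on_datapath_join(unchecked, "vm1", "dp1", 3)
for port in ("1", "2", "3"):
    on_vm_map(unchecked, "vm1", port, "vs0", port, require_unmapped=False)
print([e["vm_port"] for e in unchecked])  # -> ['3', '', '']
```

In this simplified model, dropping the check makes a later mapping message clobber an already active entry, which is the overwrite risk mentioned above. Whether the real query picks entries in the same order is not something this sketch can show.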
Are you running RouteFlow under POX?
If you are, it will be interesting to see what happens (with and without
your modification) in the association table through the web interface. You
can start it by going to rfweb, running "python rfweb_server.py" and then
accessing http://localhost:8080/index.html. You will need pymongo, as
instructed in the README file.
And thanks for the patch! We really appreciate your efforts :)
On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com> wrote:
> OK. I am now able to swap out "switch1" with a hardware OpenFlow switch
> and have pings work. However I had to fix another problem along the way.
> This problem is in RFServer.cc. My hardware switch has lots of ports, of
> course, so I want to add them all. However, RFServer.cc only handles VM_MAP
> messages where there are no existing entries with VS_ID set. So it handles
> the first ports and then silently drops the rest because the first ones to
> add, add entries with VS_ID...
> I just commented out query[VS_ID] = "" (see below) and now all VM_MAP
> messages are processed.
> // Search for VM's with no mapping
> map<string, string> query;
> vector<RFEntry> results;
> query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
> // query[VS_ID] = "";
> ^^^^^^^^^^^^^^^^^^^^^
> // Querying for VS_ID is enough, but we could play it safe
> // and query all mapping attributes
> results = this->rftable->get_entries(query);
Thanks Josh for your efforts in debugging and the patch!
We shall extend our tests with hardware OpenFlow switches, beyond the
4-port NetFPGAs.
What is really strange is that OVS br0 uses the OpenFlow channel to
connect to the OpenFlow controller. That should not be the case when
starting OVS in bridge mode...
We will work on a stable fix for the port mapping issue once we can
reproduce it.
On Wed, Jun 6, 2012 at 2:49 PM, Allan Vidal <all...@cpqd.com.br> wrote:
> Hi Josh,
> I'm not sure I understand the problem. You said RFServer handles the first
> ports and then silently drops the rest, but that's not what should happen.
> When a switch joins RFServer and there's an idle VM to connect, it will
> create N entries, where N is the number of switch ports. The format of these
> entries will be:
> vm_id, -, -, -, dp_id, -
> After that, the VM will be instructed to send mapping packets on each of its
> interfaces. When a mapping message arrives at the controller and is
> redirected to RFServer, we check if there's an unmapped entry in the format
> above (thus, the check for VS_ID=""). When there is one, we make it an active
> entry:
> vm_id, vm_port, vs_id, vs_port, dp_id, dp_port
> Removing the check for VS_ID="" could potentially cause valid entries to be
> overwritten in the association table.
> Are you running RouteFlow under POX?
> If you are, it will be interesting to see what happens (with and without
> your modification) in the association table through the web interface. You
> can start it by going to rfweb, running "python rfweb_server.py" and then
> access http://localhost:8080/index.html
> You will need pymongo as instructed in the README file.
> And thanks for the patch! We really appreciate your efforts :)
> Allan
> On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com> wrote:
>> OK. I am now able to swap out "switch1" with a hardware OpenFlow switch
>> and have pings work. However I had to fix another problem along the way.
>> This problem is in RFServer.cc. My hardware switch has lots of ports, of
>> course, so I want to add them all. However, RFServer.cc only handles VM_MAP
>> messages where there are no existing entries with VS_ID set. So it handles
>> the first ports and then silently drops the rest because the first ones to
>> add, add entries with VS_ID...
>> I just commented out query[VS_ID] = "" (see below) and now all VM_MAP
>> messages are processed.
>>     // Search for VM's with no mapping
>>     map<string, string> query;
>>     vector<RFEntry> results;
>>     query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
>>     // query[VS_ID] = "";
>>     ^^^^^^^^^^^^^^^^^^^^^
>>     // Querying for VS_ID is enough, but we could play it safe and query all mapping attributes
>>     results = this->rftable->get_entries(query);
>> On Tue, 5 Jun 2012, Josh Bailey wrote:
>>> Hi Allan;
>>> OK, I found the problem. Firstly there are a lot of startup races in
>>> rftest1. For example, if MongoDB fails to start in time, nothing else works
>>> and everything just crashes. I added some checks to make sure MongoDB starts
>>> before proceeding. I'll send a patch.
>>> But the actual problem is that RFServer gets a datapath join event for
>>> "0x60eb69215b92", which happens to be br0. The control plane datapath joins
>>> with no problem, but then switch1 tries to join. br0 has already claimed the
>>> control plane VM so nothing works.
>>> I just hacked rfserver.cc to never associate 0x60eb69215b92 with a VM.
>>> Then when switch1 comes up the VM is free and everything works.
>>> I'm not sure how the code as currently checked in can work at all,
>>> unless I am missing something, e.g. br0 just happens to be very slow to come
>>> up so it comes up last. Probably the right solution is to add a more robust
>>> check for br0 to ignore it no matter when it comes up.
>>> Thanks,
>>> On Mon, 4 Jun 2012, Josh Bailey wrote:
>>>> Hi Allan;
>>>> Thanks, things are a bit closer now but still not working. The rfvm1 VM
>>>> now receives packets but eth1 (for example) drops them all:
>>>> I'll keep troubleshooting but any suggestions appreciated.
>>>> On Mon, 4 Jun 2012, Allan Vidal wrote:
>>>>> Hi Josh,
>>>>> About Mongo in /usr: yep, that's not ideal. We plan to support Ubuntu
>>>>> 12.04 soon, and then it will be a matter of apt-getting mongo :)
>>>>> As for the testing, the default user/password for the prebuilt VM is
>>>>> routeflow/routeflow. The LXC containers are root/root.
>>>>> Anyhow, I changed the scripts a lot in the last commits a few days ago.
>>>>> Try pulling from the repository. I believe they're much better, though not
>>>>> quite flawless :) You might have run into some troublesome commit.
>>>>> There's a small glitch: in the first run of the tests after booting, OVS
>>>>> behaves badly sometimes. You have to close the script and try again,
>>>>> then it should work.
>>>>> Please, let me know if you run into any problems :)
>>>>> Allan
>>>>> On Mon, Jun 4, 2012 at 6:14 AM, joshb <jo...@google.com> wrote:
>>>>> Hi Christian;
>>>>> Nice to see! Definitely fewer moving pieces in the new design, which is
>>>>> great! Also being able to use pox as well will be helpful.
>>>>> Unfortunately the new setup doesn't appear to work for me with either
>>>>> nox or pox - no packets are received by the control VM rfvm1. I used
>>>>> Ubuntu 11.04 (x64).
>>>>> I can build fine (though I had to add CPPFLAGS=-fPIC, LDFLAGS=-fPIC).
>>>>> It's also a bit scary that mongodb is built with prefix /usr, which
>>>>> would overwrite any packaged distribution - but no big deal.
>>>>> Also the README says to use --nox vs --pox for testing. The scripts
>>>>> though are actually rftest1 and rftest1_pox and they don't take
>>>>> command line arguments - and the default user password is root/root
>>>>> not routeflow.
>>>>> Anyway - on to the actual problem. :-) switch1 is correctly set up and
>>>>> when I ping from b1 or b2, ARP packets are tunnelled to the
>>>>> controller. However rfvm1 never sees them. I uncommented the logging
>>>>> line in RouteFlow/pox/ext/rfproxy.py:
>>>>>     results = rftable.find({VS_ID: str(event.dpid)})
>>>>>     if results.count() == 0 or results[0][VM_ID] == "":
>>>>>         results = rftable.find({VS_ID: {"$ne": ""}, DP_ID:
>>>>>             str(event.dpid), DP_PORT: str(event.port)})
>>>>>         if results.count() == 0:
>>>>>             log.info("Datapath not associated with a VM")
>>>>>             return
>>>>> And I now see "Datapath not associated with a VM" on each ping
>>>>> attempt.
>>>>> Please let me know if you have any suggestions. Otherwise I will
>>>>> continue to troubleshoot.
>>>>> Thanks.
>>>>> # ./rftest1_pox
>>>>> Resetting and stopping LXC VMs...
>>>>> Stopping any running instances and data of rfserver, POX, OVS and MongoDB...
>>>>> Starting MongoDB...
>>>>> all output going to: /dev/null
>>>>> Starting the rfvm1 virtual machine...
>>>>> Starting the management network (br0)...
>>>>> Starting POX and the RouteFlow network controller...
>>>>> POX 0.0.0 / Copyright 2011 James McCauley
>>>>> 2012-06-04 01:57:41,092 - ext.rfproxy - INFO - RFProxy running.
>>>>> 2012-06-04 01:57:42,538 - openflow.topology - INFO - Switch 60-eb-69-21-5b-92 connected
>>>>> DP is up, installing config flows... `?i![
>>>>> Starting the RouteFlow server...
>>>>> Starting the control plane network (dp0 OVS)...
>>>>> 2012-06-04 01:57:46,241 - openflow.topology - INFO - Switch 76-73-72-66-76-73|29286 connected
>>>>> DP is up, installing config flows... rfvsrfv
>>>>> Starting the sample network...
>>>>> 2012-06-04 01:57:49,744 - openflow.topology - INFO - Switch f6-39-ee-55-73-49 connected
>>>>> DP is up, installing config flows... ?9?Us
>>>>> Now we'll open this test's log.
>>>>> Try pinging b1 from b2:
>>>>> $ sudo lxc-console -n b1
>>>>> Login and run:
>>>>> $ ping 172.31.2.2
>>>>> Jun 04 01:57:22|00042|bridge|WARN|bridge dp0: using
So I have some good news on that one - the br0 problem was due to a leftover OVS config database. I got rid of that and now br0 no longer tries to connect.
On Wed, 6 Jun 2012, Christian Esteve Rothenberg wrote:
> Thanks Josh for your efforts in debugging and the patch!
> We shall extend our tests with hardware OpenFlow switches, beyond the
> 4-port NetFPGAs.
> What is really strange is that OVS br0 uses the OpenFlow channel to
> connect to the OpenFlow controller. That should not be the case when
> starting OVS in bridge mode...
> We will work on a stable fix for the port mapping issue once we can
> reproduce it.
> Cheers,
> Christian
> On Wed, Jun 6, 2012 at 2:49 PM, Allan Vidal <all...@cpqd.com.br> wrote:
>> Hi Josh,
>> I'm not sure I understand the problem. You said RFServer handles the first
>> ports and then silently drops the rest, but that's not what should happen.
>> When a switch joins RFServer and there's an idle VM to connect, it will
>> create N entries, where N is the number of switch ports. The format of these
>> entries will be:
>> vm_id, -, -, -, dp_id, -
>> After that, the VM will be instructed to send mapping packets on each of its
>> interfaces. When a mapping message arrives at the controller and is
>> redirected to RFServer, we check if there's an unmapped entry in the format
>> above (thus, the check for VS_ID=""). When there is one, we make it an active
>> entry:
>> vm_id, vm_port, vs_id, vs_port, dp_id, dp_port
>> Removing the check for VS_ID="" could potentially cause valid entries to be
>> overwritten in the association table.
>> Are you running RouteFlow under POX?
>> If you are, it will be interesting to see what happens (with and without
>> your modification) in the association table through the web interface. You
>> can start it by going to rfweb, running "python rfweb_server.py" and then
>> access http://localhost:8080/index.html
>> You will need pymongo as instructed in the README file.
>> And thanks for the patch! We really appreciate your efforts :)
>> Allan
>> On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com> wrote:
>>> OK. I am now able to swap out "switch1" with a hardware OpenFlow switch
>>> and have pings work. However I had to fix another problem along the way.
>>> This problem is in RFServer.cc. My hardware switch has lots of ports, of
>>> course, so I want to add them all. However, RFServer.cc only handles VM_MAP
>>> messages where there are no existing entries with VS_ID set. So it handles
>>> the first ports and then silently drops the rest because the first ones to
>>> add, add entries with VS_ID...
>>> I just commented out query[VS_ID] = "" (see below) and now all VM_MAP
>>> messages are processed.
>>> Would it be possible to commit a fix for this?
>>>     // Search for VM's with no mapping
>>>     map<string, string> query;
>>>     vector<RFEntry> results;
>>>     query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
>>>     // query[VS_ID] = "";
>>>     ^^^^^^^^^^^^^^^^^^^^^
>>>     // Querying for VS_ID is enough, but we could play it safe and query all mapping attributes
>>>     results = this->rftable->get_entries(query);
>>> On Tue, 5 Jun 2012, Josh Bailey wrote:
>>>> Hi Allan;
>>>> OK, I found the problem. Firstly there are a lot of startup races in
>>>> rftest1. For example, if MongoDB fails to start in time, nothing else works
>>>> and everything just crashes. I added some checks to make sure MongoDB starts
>>>> before proceeding. I'll send a patch.
>>>> But the actual problem is that RFServer gets a datapath join event for
>>>> "0x60eb69215b92", which happens to be br0. The control plane datapath joins
>>>> with no problem, but then switch1 tries to join. br0 has already claimed the
>>>> control plane VM so nothing works.
>>>> I just hacked rfserver.cc to never associate 0x60eb69215b92 with a VM.
>>>> Then when switch1 comes up the VM is free and everything works.
>>>> I'm not sure how the code as currently checked in can work at all,
>>>> unless I am missing something, e.g. br0 just happens to be very slow to come
>>>> up so it comes up last. Probably the right solution is to add a more robust
>>>> check for br0 to ignore it no matter when it comes up.
>>>> Thanks,
>>>> On Mon, 4 Jun 2012, Josh Bailey wrote:
>>>>> Hi Allan;
>>>>> Thanks, things are a bit closer now but still not working. The rfvm1 VM
>>>>> now receives packets but eth1 (for example) drops them all:
>>>>> I'll keep troubleshooting but any suggestions appreciated.
>>>>> On Mon, 4 Jun 2012, Allan Vidal wrote:
>>>>>> Hi Josh,
>>>>>> About Mongo in /usr: yep, that's not ideal. We plan to support Ubuntu
>>>>>> 12.04 soon, and then it will be a matter of apt-getting mongo :)
>>>>>> As for the testing, the default user/password for the prebuilt VM is
>>>>>> routeflow/routeflow. The LXC containers are root/root.
>>>>>> Anyhow, I changed the scripts a lot in the last commits a few days ago.
>>>>>> Try pulling from the repository. I believe they're much better, though not
>>>>>> quite flawless :) You might have run into some troublesome commit.
>>>>>> There's a small glitch: in the first run of the tests after booting, OVS
>>>>>> behaves badly sometimes. You have to close the script and try again,
>>>>>> then it should work.
>>>>>> Please, let me know if you run into any problems :)
>>>>>> Allan
>>>>>> On Mon, Jun 4, 2012 at 6:14 AM, joshb <jo...@google.com> wrote:
>>>>>> Hi Christian;
>>>>>> Nice to see! Definitely fewer moving pieces in the new design, which is
>>>>>> great! Also being able to use pox as well will be helpful.
>>>>>> Unfortunately the new setup doesn't appear to work for me with either
>>>>>> nox or pox - no packets are received by the control VM rfvm1. I used
>>>>>> Ubuntu 11.04 (x64).
>>>>>> I can build fine (though I had to add CPPFLAGS=-fPIC, LDFLAGS=-fPIC).
>>>>>> It's also a bit scary that mongodb is built with prefix /usr, which
>>>>>> would overwrite any packaged distribution - but no big deal.
>>>>>> Also the README says to use --nox vs --pox for testing. The scripts
>>>>>> though are actually rftest1 and rftest1_pox and they don't take
>>>>>> command line arguments - and the default user password is root/root
>>>>>> not routeflow.
>>>>>> Anyway - on to the actual problem. :-) switch1 is correctly set up and
>>>>>> when I ping from b1 or b2, ARP packets are tunnelled to the
>>>>>> controller. However rfvm1 never sees them. I uncommented the logging
>>>>>> line in RouteFlow/pox/ext/rfproxy.py:
As you can see ports 10 and 11 among others are missing. So I don't think my patch is the correct solution but it does illustrate the problem.
When I ping from port 11, without the patch, I see:
DEBUG:ext.rfproxy:Datapath not associated with a VM
Which is true - it's not in rftable. Even though clearly the VM_MAP messages are arriving.
I tweaked the logging messages slightly to make it more clear:
Jun 6 17:04:58 project-w-2 rf-server[14199]: Mapping message arrived from vm=0x1ed68ed5199055
Jun 6 17:04:58 project-w-2 rf-server[14199]: Adding entry for vs_id=1919317619, vs_port=1, vm_port=1, dp_port=1
Jun 6 17:04:58 project-w-2 rf-server[14199]: Mapping message arrived from vm=0x1ed68ed5199055
Jun 6 17:04:58 project-w-2 rf-server[14199]: Adding entry for vs_id=1919317619, vs_port=2, vm_port=2, dp_port=2
Jun 6 17:04:58 project-w-2 rf-server[14199]: Mapping message arrived from vm=0x1ed68ed5199055
Jun 6 17:06:02 project-w-2 rf-server[14199]: last message repeated 8 times
As you can see the VM_MAP messages are arriving but RFServer only adds the first two and ignores the rest.
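For reference, here is a minimal Python sketch of the behaviour being discussed, written from the description in this thread rather than from RFServer.cc. The functions datapath_join and vm_map and the dictionary layout are hypothetical; only the field names (vm_id, vm_port, vs_id, vs_port, dp_id, dp_port) come from the entry format quoted below. It assumes VM port N maps to datapath port N, as in the log above. If only two placeholders exist for the VM, the third and later VM_MAP messages find no entry with an empty vs_id and are dropped:

```python
# Sketch only: an in-memory stand-in for the association table.
# Function names and the dict layout are hypothetical, not RFServer's API.

def datapath_join(table, vm_id, dp_id, n_ports):
    """Create one unmapped placeholder per port reported by the switch."""
    for _ in range(n_ports):
        table.append({"vm_id": vm_id, "vm_port": "", "vs_id": "",
                      "vs_port": "", "dp_id": dp_id, "dp_port": ""})


def vm_map(table, vm_id, vm_port, vs_id, vs_port):
    """Activate the first placeholder (vs_id == "") belonging to this VM.

    Returns True if an entry was activated, False if the message is dropped.
    """
    for entry in table:
        if entry["vm_id"] == vm_id and entry["vs_id"] == "":
            # Assumption: VM port N corresponds to datapath port N.
            entry.update(vm_port=vm_port, vs_id=vs_id, vs_port=vs_port,
                         dp_port=vm_port)
            return True
    return False


table = []
datapath_join(table, vm_id="vm1", dp_id="dp1", n_ports=2)  # only 2 ports recorded
results = [vm_map(table, "vm1", port, "vs1", port) for port in range(1, 5)]
print(results)  # prints [True, True, False, False]
```

In this model the fix is not to drop the vs_id == "" check but to make sure the number of placeholders matches the number of ports the switch really has.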
> I'm not sure I understand the problem. You said RFServer handles the first ports and
> then silently drops the rest, but that's not what should happen.
> When a switch joins RFServer and there's an idle VM to connect, it will create N
> entries, where N is the number of switch ports. The format of these entries will be:
> vm_id, -, -, -, dp_id, -
> After that, the VM will be instructed to send mapping packets on each of its
> interfaces. When a mapping message arrives at the controller and is redirected to
> RFServer, we check if there's an unmapped entry in the format above (thus, the check
> for VS_ID=""). When there is one, we make it an active entry:
> vm_id, vm_port, vs_id, vs_port, dp_id, dp_port
> Removing the check for VS_ID="" could potentially cause valid entries to be
> overwritten in the association table.
> Are you running RouteFlow under POX?
> If you are, it will be interesting to see what happens (with and without your
> modification) in the association table through the web interface. You can start it by
> going to rfweb, running "python rfweb_server.py" and then access
> http://localhost:8080/index.html
> You will need pymongo as instructed in the README file.
> And thanks for the patch! We really appreciate your efforts :)
> Allan
> On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com> wrote:
> OK. I am now able to swap out "switch1" with a hardware OpenFlow switch
> and have pings work. However I had to fix another problem along the way.
> This problem is in RFServer.cc. My hardware switch has lots of ports, of
> course, so I want to add them all. However, RFServer.cc only handles
> VM_MAP messages where there are no existing entries with VS_ID set. So it
> handles the first ports and then silently drops the rest because the first
> ones to add, add entries with VS_ID...
> I just commented out query[VS_ID] = "" (see below) and now all VM_MAP
> messages are processed.
>     // Search for VM's with no mapping
>     map<string, string> query;
>     vector<RFEntry> results;
>     query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
>     // query[VS_ID] = "";
>     ^^^^^^^^^^^^^^^^^^^^^
>     // Querying for VS_ID is enough, but we could play it safe and query all mapping attributes
>     results = this->rftable->get_entries(query);
> On Tue, 5 Jun 2012, Josh Bailey wrote:
> Hi Allan;
> OK, I found the problem. Firstly there are a lot of startup races
> in rftest1. For example, if MongoDB fails to start in time,
> nothing else works and everything just crashes. I added some checks
> to make sure MongoDB starts before proceeding. I'll send a patch.
> But the actual problem is that RFServer gets a datapath join
> event for "0x60eb69215b92", which happens to be br0. The
> control plane datapath joins with no problem, but then switch1
> tries to join. br0 has already claimed the control plane VM so
> nothing works.
> I just hacked rfserver.cc to never associate 0x60eb69215b92
> with a VM. Then when switch1 comes up the VM is free and
> everything works.
> I'm not sure how the code as currently checked in can work
> at all, unless I am missing something, e.g. br0 just happens
> to be very slow to come up so it comes up last. Probably the
> right solution is to add a more robust check for br0 to ignore
> it no matter when it comes up.
> Thanks,
> On Mon, 4 Jun 2012, Josh Bailey wrote:
> Hi Allan;
> Thanks, things are a bit closer now but still not
> working. The rfvm1 VM now receives packets but
> eth1 (for example) drops them all:
> I'll keep troubleshooting but any suggestions
> appreciated.
> On Mon, 4 Jun 2012, Allan Vidal wrote:
> Hi Josh,
> About Mongo in /usr: yep, that's not
> the ideal. We plan to support Ubuntu
> 12.04 soon, and then it will be a
> matter of apt-getting mongo :)
> As for the testing, the default
> user/password for the prebuilt VM is
> routeflow/routeflow. The LXC
> containers are root/root.
> Anyhow, I changed the scripts a lot in
> the last commits a few days ago. Try
> pulling from the repository. I believe
> they're much better, though not quite
> flawless :) You might have run into
> some troublesome commit.
> There's a small glitch: in the first
> run of the tests after booting, OVS
> behaves badly sometimes. You have to
> close the script and try again, then
> it should work.
> Please, let me know if you run into
> any problems :)
> Allan
> On Mon, Jun 4, 2012 at 6:14 AM, joshb
> <jo...@google.com> wrote:
> Hi Christian;
> Nice to see! Definitely fewer
> moving pieces in the new design
What's the number of ports reported by the switch? It's in the DatapathJoin
message that goes through the rfserver<->rfproxy channel (the collection is
named the same).
A possibility is that the config from previous runs is being kept. The
Mongo database is deleted in each run of the script so we have a clean
testing environment each time, but if it isn't and the switch reported it had
two ports in a previous run, that will remain as the expected value. Again,
we plan to change that soon in a new scheme of manual, updatable
configuration.
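One way to check this, sketched in Python under assumptions: given the association-table entries (for example, as read from MongoDB) and the port count from the DatapathJoin message, list the datapath ports that have no active entry. The function name missing_ports and the plain-dict entry layout are hypothetical; values are kept as strings because rfproxy.py queries the table with str(event.dpid) and str(event.port).

```python
def missing_ports(entries, dp_id, n_ports):
    """Return the datapath ports in 1..n_ports that have no active entry."""
    mapped = {
        int(e["dp_port"])
        for e in entries
        if e["dp_id"] == dp_id and e["vs_id"] != "" and e["dp_port"] != ""
    }
    return [port for port in range(1, n_ports + 1) if port not in mapped]


# Hypothetical table contents: only ports 1 and 2 were ever mapped.
entries = [
    {"vm_id": "vm1", "vm_port": "1", "vs_id": "vs1", "vs_port": "1",
     "dp_id": "dp1", "dp_port": "1"},
    {"vm_id": "vm1", "vm_port": "2", "vs_id": "vs1", "vs_port": "2",
     "dp_id": "dp1", "dp_port": "2"},
]
print(missing_ports(entries, "dp1", 4))  # prints [3, 4]
```

If the list is non-empty even though the switch reports more ports than there are entries, stale state from an earlier run is one plausible suspect.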
> The "wild card" VS_ID = "" is gone. Because it's gone no new entries can
> be added.
> On Wed, 6 Jun 2012, Allan Vidal wrote:
> Hi Josh,
>> I'm not sure I understand the problem. You said RFServer handles the first
>> ports and then silently drops the rest, but that's not what should happen.
>> When a switch joins RFServer and there's an idle VM to connect, it will
>> create N entries, where N is the number of switch ports. The format of these
>> entries will be:
>> vm_id, -, -, -, dp_id, -
>> After that, the VM will be instructed to send mapping packets on each of its
>> interfaces. When a mapping message arrives at the controller and is
>> redirected to RFServer, we check if there's an unmapped entry in the format
>> above (thus, the check for VS_ID=""). When there is one, we make it an active
>> entry:
>> vm_id, vm_port, vs_id, vs_port, dp_id, dp_port
>> Removing the check for VS_ID="" could potentially cause valid entries to be
>> overwritten in the association table.
>> Are you running RouteFlow under POX?
>> If you are, it will be interesting to see what happens (with and without
>> your modification) in the association table through the web interface. You
>> can start it by going to rfweb, running "python rfweb_server.py" and then
>> access http://localhost:8080/index.html
>> You will need pymongo as instructed in the README file.
>> And thanks for the patch! We really appreciate your efforts :)
>> Allan
>> On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com> wrote:
>> OK. I am now able to swap out "switch1" with a hardware OpenFlow switch
>> and have pings work. However I had to fix another problem along the way.
>> This problem is in RFServer.cc. My hardware switch has lots of ports, of
>> course, so I want to add them all. However, RFServer.cc only handles
>> VM_MAP messages where there are no existing entries with VS_ID set. So it
>> handles the first ports and then silently drops the rest because the first
>> ones to add, add entries with VS_ID...
>> I just commented out query[VS_ID] = "" (see below) and now all VM_MAP
>> messages are processed.
>>     // Search for VM's with no mapping
>>     map<string, string> query;
>>     vector<RFEntry> results;
>>     query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
>>     // query[VS_ID] = "";
>>     ^^^^^^^^^^^^^^^^^^^^^
>>     // Querying for VS_ID is enough, but we could play it safe and query all mapping attributes
>>     results = this->rftable->get_entries(query);
>> On Tue, 5 Jun 2012, Josh Bailey wrote:
>> Hi Allan;
>> OK, I found the problem. Firstly there are a lot of startup races
>> in rftest1. For example, if MongoDB fails to start in time,
>> nothing else works and everything just crashes. I added some checks
>> to make sure MongoDB starts before proceeding. I'll send a patch.
>> But the actual problem is that RFServer gets a datapath join
>> event for "0x60eb69215b92", which happens to be br0. The
>> control plane datapath joins with no problem, but then switch1
>> tries to join. br0 has already claimed the control plane VM so
>> nothing works.
>> I just hacked rfserver.cc to never associate 0x60eb69215b92
>> with a VM. Then when switch1 comes up the VM is free and
>> everything works.
>> I'm not sure how the code as currently checked in can work
>> at all, unless I am missing something, e.g. br0 just happens
>> to be very slow to come up so it comes up last. Probably the
>> right solution is to add a more robust check for br0 to ignore
>> it no matter when it comes up.
>> Thanks,
>> On Mon, 4 Jun 2012, Josh Bailey wrote:
>> Hi Allan;
>> Thanks, things are a bit closer now but still not
>> working. The rfvm1 VM now receives packets but
>> eth1 (for example) drops them all:
>> I'll keep troubleshooting but any suggestions
>> appreciated.
>> On Mon, 4 Jun 2012, Allan Vidal wrote:
>> Hi Josh,
>> About Mongo in /usr: yep, that's not
>> the ideal. We plan to support Ubuntu
>> 12.04 soon, and then it will be a
>> matter of apt-getting mongo :)
>> As for the testing, the default
>> user/password for the prebuilt VM is
>> routeflow/routeflow. The LXC
>> containers are root/root.
>> Anyhow, I changed the scripts a lot in
>> the last commits a few days ago. Try
>> pulling from the repository. I believe
>> they're much better, though not quite
>> flawless :) You might have run into
>> some troublesome commit.
>> There's a small glitch: in the first
>> run of the tests after booting, OVS
>> behaves badly sometimes. You have to
>> close the script and try again, then
>> it should work.
>> Please, let me know if you run into
>> any problems :)
>> Allan
>> On Mon, Jun 4, 2012 at 6:14 AM, joshb
>> <jo...@google.com> wrote:
>> Hi Christian;
>> Nice to see! Definitely fewer moving pieces in the new design, which is
>> great! Also being able to use pox as well will be helpful.
>> Unfortunately the new setup doesn't appear to work for me with either
>> nox or pox - no packets are received by the control VM rfvm1. I used
>> Ubuntu 11.04 (x64).
>> I can build fine (though I had to add CPPFLAGS=-fPIC, LDFLAGS=-fPIC).
>> It's also a bit scary that mongodb is built with prefix /usr, which
>> would overwrite any packaged distribution - but no big deal.
>> Also the README says to use --nox vs --pox for testing. The scripts
>> though are actually rftest1 and rftest1_pox and
On Mon, 11 Jun 2012, Allan Vidal wrote:
> Hi Josh,
> What's the number of ports informed by the switch? It's in the DatapathJoin message that goes through the rfserver<->rfproxy channel
> (the collection is named the same).
> A possibility is that the config from previous runs is being kept. The Mongo database is deleted in each run of the script so we
> have a clean testing environment each time, but if it isn't and the switch reported two ports in a previous run, that will remain
> the expected value. Again, we plan to change that soon in a new scheme of manual, updatable configuration.
> Allan
> On Wed, Jun 6, 2012 at 10:58 PM, Josh Bailey <jo...@google.com> wrote:
> So just running rftest1 --pox with unmodified code, here's what happens:
> VM has registered, but switch1 is not up yet. register_vm() adds this entry.
> The "wild card" VS_ID = "" is gone. Because it's gone no new entries can be added.
> On Wed, 6 Jun 2012, Allan Vidal wrote:
> Hi Josh,
> I'm not sure I understand the problem. You said RFServer handles the first ports and
> then silently drops the rest, but that's not what should happen.
> When a switch joins RFServer and there's an idle VM to connect, it will create N
> entries, where N is the number of switch ports. The format of these entries will be:
> vm_id, -, -, -, dp_id, -
> After that, the VM will be instructed to send mapping packets on each of its
> interfaces. When a mapping message arrives at the controller and is redirected to
> RFServer, we check if there's an unmapped entry in the format above (thus, the check
> for VS_ID=""). When there is one, we make it an active entry:
> vm_id, vm_port, vs_id, vs_port, dp_id, dp_port
> Removing the check for VS_ID="" could potentially cause valid entries to be
> overwritten in the association table.
> Are you running RouteFlow under POX?
> If you are, it will be interesting to see what happens (with and without your
> modification) in the association table through the web interface. You can start it by
> going to rfweb, running "python rfweb_server.py" and then access
> http://localhost:8080/index.html
> You will need pymongo as instructed in the README file.
> And thanks for the patch! We really appreciate your efforts :)
> Allan
> On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com> wrote:
>      OK. I am now able to swap out "switch1" with a hardware OpenFlow switch
>      and have pings work. However I had to fix another problem along the way.
>      This problem is in RFServer.cc. My hardware switch has lots of ports, of
>      course, so I want to add them all. However, RFServer.cc only handles
>      VM_MAP messages where there are no existing entries with VS_ID set. So it
>      handles the first ports and then silently drops the rest because the first
>      ones to add, add entries with VS_ID...
>      I just commented out query[VS_ID] = "" (see below) and now all VM_MAP
>      messages are processed.
>      Would it be possible to commit a fix for this?
>           OK, I found the problem. Firstly there are a lot of startup races
>           in rftest1. For example, if MongoDB fails to start in time,
>           nothing else works and everything just crashes. I added some checks
>           to make sure MongoDB starts before proceeding. I'll send a patch.
>           But the actual problem is that RFServer gets a datapath join
>           event for "0x60eb69215b92", which happens to be br0. The
>           control plane datapath joins with no problem, but then switch1
>           tries to join. br0 has already claimed the control plane VM so
>           nothing works.
>           I just hacked rfserver.cc to never associate 0x60eb69215b92
>           with a VM. Then when switch1 comes up the VM is free and
>           everything works.
>           I'm not sure how the code as currently checked in can work
>           at all, unless I am missing something - e.g., br0 just happens
>           to be very slow to come up so it comes up last. Probably the
>           right solution is to add a more robust check for br0 to ignore
>           it no matter when it comes up.
>                 Thanks, things are a bit closer now but still not
>                 working. The rfvm1 VM now receives packets but
>                 eth1 (for example) drops them all:
I tried to set up a similar test based on your scenario, using OVS. It's
basically rftest1 modified to include more ports (1 and 2 are connected to
hosts b1 and b2, respectively, and the rest are inactive).
I had to change a few things:
1) The config file for rfvm1, to include the new interfaces (now
rfvm1.0-rfvm1.11)
2) /etc/network/interfaces in rfvm1, to set up the new interfaces
3) rftest1, to include the new ports. This is the interesting part:
For the linked interfaces in switch1 (b1.0 and b2.0), I just add them.
For the inactive interfaces in switch1 (b3.0 to b11.0), I needed to add "--
set interface b*.0 type=internal" to the add-port command so that they're
created even though nothing is linked to them.
After we do that, we have 11 ports in the switch, so RFServer creates 11
entries in RFTable. When the mapping packets come from the VM, they should
fill the table appropriately.
I also needed to add the ports one command at a time, so that they are
added in order: we want port 1 in the switch to be associated
with port 1 in the control plane (dp0). If we add the ports in a single
command, b2.0 might end up as port 5 in the switch, for example, breaking
what we configured in rfvm1. Actually, this is an issue I identified while
modifying rftest1 and I will fix it soon.
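The port setup described above can be sketched roughly as follows. This is only an illustration in Python, not the actual rftest1 change: the bridge name `switch1`, the helper name `build_add_port_commands` and the port lists are assumptions taken from the description. It builds one `ovs-vsctl add-port` command per port, so the ports are requested in order, and appends `-- set interface <port> type=internal` for the inactive ones. It does not run anything.

```python
def build_add_port_commands(bridge, linked, inactive):
    """Return one ovs-vsctl command (as an argument list) per port, in order.

    Hypothetical helper: linked ports are added as-is; inactive ports also
    get "-- set interface <port> type=internal" so that they are created
    even though nothing is attached to them.
    """
    commands = []
    for port in linked:
        commands.append(["ovs-vsctl", "add-port", bridge, port])
    for port in inactive:
        commands.append(["ovs-vsctl", "add-port", bridge, port,
                         "--", "set", "interface", port, "type=internal"])
    return commands


# Port names follow the message: b1.0 and b2.0 are linked, b3.0..b11.0 are not.
linked_ports = ["b1.0", "b2.0"]
inactive_ports = ["b%d.0" % i for i in range(3, 12)]

# "switch1" as the bridge name is an assumption for this sketch.
commands = build_add_port_commands("switch1", linked_ports, inactive_ports)
for cmd in commands:
    print(" ".join(cmd))
```

Each argument list could then be handed to something like `subprocess.check_call`, one call per port, which is what keeps the additions sequential.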
I'm very puzzled by your issue. Could you try this modified version of
rftest1 to see what happens?
Are you using a hardware switch? If it's reporting 11 ports in the datapath
join event, that many entries should have been created, even if they're
blank at first.
What might be happening is that only some (the active?) ports are being
reported by the switch. In this case, fewer entries will be created,
and so mapping messages will be discarded when they arrive from the VM.
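To make the suspected failure mode concrete, here is a small Python model of the association logic as described in this thread. It is a sketch under assumptions, not the RFServer code: the function names and the `vm1`/`dp1`/`vs1` identifiers are made up, and setting `dp_port` equal to `vm_port` is a simplification. A datapath join creates one blank entry per reported port, and a mapping message is accepted only while an entry with an empty `vs_id` remains.

```python
def create_entries(table, vm_id, dp_id, num_ports):
    """On datapath join: create one blank entry per port the switch reported."""
    for _ in range(num_ports):
        table.append({"vm_id": vm_id, "vm_port": "", "vs_id": "",
                      "vs_port": "", "dp_id": dp_id, "dp_port": ""})


def handle_mapping(table, vm_id, vm_port, vs_id, vs_port):
    """On a mapping message: fill the first entry for this VM whose vs_id is
    still blank. Returns False when no blank entry is left, i.e. the message
    is discarded."""
    for entry in table:
        if entry["vm_id"] == vm_id and entry["vs_id"] == "":
            # Simplification for this sketch: dp_port mirrors vm_port.
            entry.update(vm_port=vm_port, vs_id=vs_id,
                         vs_port=vs_port, dp_port=vm_port)
            return True
    return False


# The switch reports only 2 ports, but the VM maps 11 interfaces.
table = []
create_entries(table, "vm1", "dp1", num_ports=2)
results = [handle_mapping(table, "vm1", str(p), "vs1", str(p))
           for p in range(1, 12)]
accepted = results.count(True)
discarded = results.count(False)
print("accepted:", accepted, "discarded:", discarded)
```

In this model, once the blank entries are used up every further mapping message is dropped, which matches the "handles the first ports and then silently drops the rest" symptom.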
On Mon, Jun 11, 2012 at 10:11 PM, Josh Bailey <jo...@google.com> wrote:
> It's 11. And I confirm that the database is being deleted.
>> On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com>
>> wrote:
>> OK. I am now able to swap out "switch1" with a hardware
>> OpenFlow switch
>> and have pings work. However I had to fix another problem
>> along the way.
>> This problem is in RFServer.cc. My hardware switch has
>> lots of ports, of
>> course, so I want to add them all. However, RFServer.cc
>> only handles
>> VM_MAP messages where there are no existing entries with
>> VS_ID set. So it
>> handles the first ports and then silently drops the rest
>> because the first
>> ones to add, add entries with VS_ID...
>> I just commented out query[VS_ID] = "" (see below) and
>> now all VM_MAP
>> messages are processed.
>> // Search for VM's with no mapping
>> map<string, string> query;
>> vector<RFEntry> results;
>> query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
>> // query[VS_ID] = "";
>> ^^^^^^^^^^^^^^^^^^^^^
>> // Querying for VS_ID is enough, but we could play it safe and query all mapping attributes
>> results = this->rftable->get_entries(query);
>> On Tue, 5 Jun 2012, Josh Bailey wrote:
>> Hi Allan;
>> OK, I found the problem. Firstly there are a lot of startup races
>> in rftest1. For example, if MongoDB fails to start in time,
>> nothing else works and everything just crashes. I added some checks
>> to make sure MongoDB starts before proceeding. I'll send a patch.
>> But the actual problem is that RFServer gets a datapath join
>> event for "0x60eb69215b92", which happens to be br0. The
>> control plane datapath joins with no problem, but then switch1
>> tries to join. br0 has already claimed the control plane VM so
>> nothing works.
>> I just hacked rfserver.cc to never associate 0x60eb69215b92
>> with a VM. Then when switch1 comes up the VM is free and
>> everything works.
>> I'm not sure how the code as currently checked in can work
>> at all, unless I am missing something - e.g., br0 just happens
>> to be very slow to come up so it comes up last. Probably the
>> right solution is to add a more robust check for br0 to ignore
>> it no matter when it comes up.
>> Thanks,
>> On Mon, 4 Jun 2012, Josh Bailey wrote:
>> Hi Allan;
>> Thanks, things are a bit closer now but still
>> not
>> working. The rfvm1 VM now receives packets but
>> eth1 (for example) drops them all:
Ah! Thanks! I'll follow your suggestions and update you with the result. Makes sense to me that there will be a mismatch based on whether ports are active or not.
On Tue, 12 Jun 2012, Allan Vidal wrote:
> Hi Josh,
> I tried to setup a similar test based on your scenario, using OVS. It's
> basically rftest1 modified to include more ports (1 and 2 are connected to hosts
> b1 and b2, respectively, and the rest is inactive).
> I had to change a few things:
> 1) The config file for rfvm1 to include the new interfaces (now
> rfvm1.0-rfvm1.11)
> 2) /etc/network/interfaces in rfvm1 to setup the new interfaces
> 3) rftest1 to include the new ports. This is the interesting part:
> For the linked interfaces in switch1 (b1.0 and b2.0), I just add them.
> For the inactive interfaces in switch1 (b3.0 to b11.0), I needed to add "-- set
> interface b*.0 type=internal" to the add-port command so that they're created
> regardless of being linked.
> After we do that, we have 11 ports in the switch, so RFServer creates 11 entries
> in RFTable. When the mapping packets come from the VM, they should fill the
> table appropriately.
> I also needed to add the ports a command at a time. This is so that the ports
> are added in order: we want port 1 in the switch to be associated with port 1 in
> the control plane (dp0). If we add the ports in a single command, b2.0 might end
> up as port 5 in the switch for example, messing with what we configured in
> rfvm1. Actually, this is an issue I identified modifying rftest1 and I will fix
> it soon.
> I'm very puzzled by your issue. Could you try this modified version of rftest1
> to see what happens?
> Are you using a hardware switch? If it's informing 11 ports in the datapath join
> event, this number of entries should've been created, even if they're blank at
> first.
> What might be happening is that only some (the active?) ports are being
> communicated by the switch. In this case, fewer entries will be created, and so
> mapping messages will be discarded when they arrive from the VM.
> Allan
I was finally able to test this (adding all ports on the hardware OpenFlow switch, one at a time, in order, and doing the same with the RouteFlow VM's interfaces). With that, I am able to pass traffic.
Great! Thanks for all the help.
I think the key thing now is to identify, via configuration, which ports on which switch are associated with which VM/interface. Are there any specific plans in this area?
On Tue, 12 Jun 2012, Allan Vidal wrote:
> Hi Josh,
> I tried to setup a similar test based on your scenario, using OVS. It's basically rftest1 modified to include more ports (1 and 2 are connected to hosts b1 and b2,
> respectively, and the rest is inactive).
> I had to change a few things:
> 1) The config file for rfvm1 to include the new interfaces (now rfvm1.0-rfvm1.11)
> 2) /etc/network/interfaces in rfvm1 to setup the new interfaces
> 3) rftest1 to include the new ports. This is the interesting part:
> For the linked interfaces in switch1 (b1.0 and b2.0), I just add them.
> For the inactive interfaces in switch1 (b3.0 to b11.0), I needed to add "-- set interface b*.0 type=internal" to the add-port command so that they're created
> regardless of being linked.
> After we do that, we have 11 ports in the switch, so RFServer creates 11 entries in RFTable. When the mapping packets come from the VM, they should fill the table
> appropriately.
> I also needed to add the ports a command at a time. This is so that the ports are added in order: we want port 1 in the switch to be associated with port 1 in the
> control plane (dp0). If we add the ports in a single command, b2.0 might end up as port 5 in the switch for example, messing with what we configured in rfvm1.
> Actually, this is an issue I identified modifying rftest1 and I will fix it soon.
> I'm very puzzled by your issue. Could you try this modified version of rftest1 to see what happens?
> Are you using a hardware switch? If it's informing 11 ports in the datapath join event, this number of entries should've been created, even if they're blank at
> first.
> What might be happening is that only some (the active?) ports are being communicated by the switch. In this case, fewer entries will be created, and so mapping
> messages will be discarded when they arrive from the VM.
> Allan
Currently, the only way to configure this is to manually edit RFTable through MongoDB. But even then, the only thing we can change is the VM-DP association; the ports and interfaces are still associated automatically. We plan a new algorithm in which a static configuration file (a CSV file, maybe) is read when RFServer starts. Association will then only be performed when there's a full match in the config file. The plan is to implement it ASAP, but we can't give any hard deadlines. I expect we will have something in the next two or three months.
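As a rough idea of what "full match in the config file" could look like, here is a minimal Python sketch. Everything in it is hypothetical: no such file format exists yet, and the column names (`vm_id`, `vm_port`, `dp_id`, `dp_port`), the sample rows and the function names are assumptions made up for illustration.

```python
import csv
import io

# Hypothetical static configuration: one row per allowed association.
SAMPLE_CONFIG = """vm_id,vm_port,dp_id,dp_port
vm1,1,dp1,1
vm1,2,dp1,2
"""


def load_config(text):
    """Parse the CSV text into a set of (vm_id, vm_port, dp_id, dp_port)."""
    rows = csv.DictReader(io.StringIO(text))
    return {(r["vm_id"], r["vm_port"], r["dp_id"], r["dp_port"])
            for r in rows}


def should_associate(config, vm_id, vm_port, dp_id, dp_port):
    """Allow an association only when all four fields match a config row."""
    return (vm_id, vm_port, dp_id, dp_port) in config


config = load_config(SAMPLE_CONFIG)
print(should_associate(config, "vm1", "1", "dp1", "1"))  # row is listed
print(should_associate(config, "vm1", "1", "dp1", "2"))  # partial match only
```

With a table like this loaded at startup, an unexpected datapath (br0 in the earlier messages, for example) would never match a row and so would never claim a VM.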
On Mon, Jun 25, 2012 at 7:55 AM, Josh Bailey <jo...@google.com> wrote:
> Hi Allan;
> I finally was able to test this (add all ports on the hardware OpenFlow
> switch, one at a time, in order, and do the same with the RouteFlow VM's
> interfaces). Then I am able to pass traffic.
> Great! Thanks for all the help.
> I think the key thing now is to identify via configuration, which ports on
> what switch are associated with what VM/interface. Are there any specific
> plans in this area?
> Thanks,
>> On Wed, Jun 6, 2012 at 10:58 PM, Josh Bailey <jo...@google.com> wrote:
>> So just running rftest1 --pox with unmodified code, here's what happens:
>> VM has registered, but switch1 is not up yet. register_vm() adds this entry.
>> db.rftable.find()
>> { "_id" : ObjectId("4fd00a19a784d5bdd7305713"), "vm_id" : "1142171972022649", "vm_port" : "", "vs_id" : "", "vs_port" : "", "dp_id" : "", "dp_port" : "" }
>> Now switch1 has come up.
>> db.rftable.find()
>> { "_id" : ObjectId("4fd00a38a784d5bdd730571c"), "dp_id" : "270793501641551", "dp_port" : "1", "vm_id" : "1142171972022649", "vm_port" : "1", "vs_id" : "1919317619", "vs_port" : "1" }
>> { "_id" : ObjectId("4fd00a38a784d5bdd730571d"), "dp_id" : "270793501641551", "dp_port" : "2", "vm_id" : "1142171972022649", "vm_port" : "2", "vs_id" : "1919317619", "vs_port" : "2" }
>> The "wild card" VS_ID = "" is gone. Because it's gone no new entries can be added.
>> On Wed, Jun 6, 2012 at 12:48 AM, Josh Bailey <jo...@google.com> wrote:
>> OK. I am now able to swap out "switch1" with a hardware OpenFlow switch
>> and have pings work. However I had to fix another problem along the way.
>> This problem is in RFServer.cc. My hardware switch has lots of ports, of
>> course, so I want to add them all. However, RFServer.cc only handles
>> VM_MAP messages where there are no existing entries with VS_ID set. So it
>> handles the first ports and then silently drops the rest because the first
>> ones to add, add entries with VS_ID...
>> I just commented out query[VS_ID] = "" (see below) and now all VM_MAP
>> messages are processed.
>> Would it be possible to commit a fix for this?
>> Thanks,
>> else if (type == VM_MAP) {
>>     VMMap *mapmsg = dynamic_cast<VMMap*>(&msg);
>>     syslog(LOG_INFO, "Mapping message arrived from vm=0x%llx",
>>            mapmsg->get_vm_id());
>>     // Search for VM's with no mapping
>>     map<string, string> query;
>>     vector<RFEntry> results;
>>     query[VM_ID] = to_string<uint64_t>(mapmsg->get_vm_id());
>>     // query[VS_ID] = "";
>>     ^^^^^^^^^^^^^^^^^^^^^
>>     // Querying for VS_ID is enough, but we could play it safe and query all mapping attributes
>>     results = this->rftable->get_entries(query);
>> On Tue, 5 Jun 2012, Josh Bailey wrote: