1. Paper Title:
Where is the Debugger for my SDN?
2. Date:
HotSDN 2012
3. Novel Idea:
The paper observes that there is still no gdb-like debugger for SDN applications, so it is worthwhile to build one for SDN programmers. The behavior of the debugger is inspired by gdb, primarily its ideas of breakpoints and backtraces.
4. Main Results:
The authors developed a prototype SDN debugger (ndb) that initially supports breakpoints and backtraces. While developing a load balancer, they encountered three bugs (no matching flow entry, a server-location bug, and a connection failure), and the backtrace feature did a good job of diagnosing all three.
5. Impact
The emergence of ndb could affect the future design of SDN applications and protocols. The paper includes a wishlist advocating that future versions of OpenFlow add features to support the debugger, such as atomic flow-table updates, layer-2 support, and additional forwarding actions such as writing arbitrary metadata into packet headers.
6. Evidence
The main idea behind implementing a backtrace for any packet is to have switches send “postcards” to the controller. The “stamp” model is not used because commodity switches cannot add this state inside a packet. To implement postcards, the proxy intercepts flow-entry modification messages and appends an extra “send postcard” action. To group postcards belonging to the same packet, the collector maintains a path table whose key is the packet’s header fields and whose value is the list of collected postcards; the backtrace route can then be reconstructed using topology information.
Problems with ambiguity appear, however: flow-table ambiguity and packet ambiguity. For flow-table ambiguity, the debugger may have to attach a version number to the entire flow table. As for packet ambiguity, where packets with identical headers could take conflicting routes, the paper argues we need not worry much about it.
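To make the path-table idea concrete, here is a minimal Python sketch of how such a collector might group postcards and rebuild a backtrace from topology information. The Postcard record, field names, and topology map are assumptions for illustration only; they are not taken from the paper.

    from collections import defaultdict, namedtuple

    # Hypothetical postcard record a switch might report; field names are illustrative.
    Postcard = namedtuple("Postcard", ["header", "switch_id", "flow_version", "out_port"])

    class PathTable:
        """Groups postcards by packet header and rebuilds an ordered backtrace."""

        def __init__(self, topology):
            # topology: {(switch_id, out_port): next_switch_id}
            self.topology = topology
            self.table = defaultdict(list)   # header fields -> collected postcards

        def add(self, postcard):
            # Key on the header fields; the flow-table version carried in each
            # postcard is what would disambiguate flow-table ambiguity.
            self.table[postcard.header].append(postcard)

        def backtrace(self, header):
            """Order the collected postcards into a path using topology info."""
            cards = {p.switch_id: p for p in self.table[header]}
            if not cards:
                return []
            succ = {p.switch_id: self.topology.get((p.switch_id, p.out_port))
                    for p in cards.values()}
            # The first hop is the switch no other postcard points to
            # (assumes a single linear path, which is enough for a sketch).
            cur = (set(cards) - set(succ.values())).pop()
            path = []
            while cur in cards:
                path.append(cards[cur])
                cur = succ[cur]
            return path

A real collector would also need the timeout-based eviction and version bookkeeping discussed in the paper.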
7. Related Work
If SDN becomes mainstream in the networking community, more and more papers on debugging SDN applications will emerge. Also at HotSDN 2012, another paper on verifying SDN networks is VeriFlow. VeriFlow takes effect before errors happen, whereas the debugger does its work after a bug is discovered; both methods are dynamic. There are also publications on statically checking SDNs, such as NICE and Anteater.
8. Criticism / Question
I understand this is a short paper for HotSDN, but the results are not convincing because more detailed experiments are absent. One of the few data points in the paper suggests that in a 5-hop network, postcards increase traffic by 31%. I think that number is not negligible, so methods should be developed to limit collection to the packets we are interested in, in order to reduce the extra burden on the whole network.
ndb monitors and records the history (and future) of each packet. This might not be necessary. The idea of equivalence classes (ECs) in VeriFlow could be a good way to reduce the traffic and resource consumption of maintaining information about all packets.
Paper Title: VeriFlow: Verifying Network-Wide Invariants in Real Time
Authors: Ahmed Khurshid, Xuan Zou, Wenxuan Zhou, Matthew Caesar, P. Brighten Godfrey.
Date: 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI), April 2013
Novel Idea: To demonstrate that the goal of real-time verification of network-wide invariants is achievable.
Specifically, “the paper presents Veriflow, a network debugging tool to find faulty rules issued by SDN applications, and optionally prevent them from reaching the network and causing anomalous network behavior”.
Main Results: VeriFlow verifies network-wide invariants within hundreds of microseconds when new rules are introduced into the network (i.e., it is fast enough to compute all the conflicting rules within hundreds of microseconds for 99% of the updates).
Importantly, the verification performed by VeriFlow after receiving each flow rule from the controller inflates the end-to-end connection setup latency to some extent.
As the number of OpenFlow packet header fields increases, VeriFlow's overhead grows gradually but remains low enough to ensure a real-time response.
Impact: Packet forwarding is a complex process. It takes a big effort to achieve network correctness, security, and fault tolerance. Unfortunately, errors do occur in the form of loops, suboptimal routing, black holes, and access-control violations.
Software-defined networking makes development easier via centralized network programmability, but software complexity tends to be high.
Also, running multiple SDN applications simultaneously on the same physical network causes coherence/reliability problems.
Furthermore, pre-deployment static software checks are offline checks that only find bugs after they happen.
Hence, VeriFlow addresses these situations by using an SDN component to obtain a picture of the network and then applying incremental search algorithms to find violations, aided by OpenFlow and custom IP forwarding rules. All of these steps happen within a real-time response window. VeriFlow looks to identify, in time, the:
- Availability of a path to the destination
- Absence of forwarding loops
- Isolation between virtual networks
- Enforcement of access-control policies.
It is infeasible to check the entire network state every time a new flow rule is inserted; doing so is wasteful and fails to provide a real-time response. Instead, the authors exploit the fact that a forwarding rule affects only a small portion of all possible packets: they slice the network into equivalence classes and verify only the affected ones.
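A rough, one-dimensional Python sketch of that slicing idea follows. VeriFlow itself uses a multi-dimensional trie over several header fields; the prefixes, function names, and values here are assumptions made only to illustrate why a new rule forces re-verification of just the overlapping packet space.

    # When a new rule arrives, only packets whose destination prefix overlaps
    # the new rule need to be re-verified.

    def overlaps(p1, l1, p2, l2):
        """True if prefixes p1/l1 and p2/l2 (32-bit ints plus lengths) overlap."""
        shared = min(l1, l2)
        mask = 0 if shared == 0 else ((1 << shared) - 1) << (32 - shared)
        return (p1 & mask) == (p2 & mask)

    def affected_rules(existing_rules, new_rule):
        """Return only the installed rules whose match space intersects the new rule."""
        p, l = new_rule
        return [(q, m) for (q, m) in existing_rules if overlaps(p, l, q, m)]

    # Example: a new 10.0.0.0/8 rule overlaps 10.1.0.0/16 but not 192.168.0.0/16.
    rules = [(0x0A010000, 16), (0xC0A80000, 16)]
    print(affected_rules(rules, (0x0A000000, 8)))   # keeps only the 10.1.0.0/16 rule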
Criticism: VeriFlow has difficulty verifying invariants in real time when large swaths of the network's forwarding behavior are altered in one operation, and when there are link failures.
Question: In Figure 3d, what causes the big jump in the CDF right after 0.15 ms (i.e., why does the CDF increase as a step function there)?
Future work: In a multi-controller scenario, attaining a global view in real time could be very complex. The paper leaves this for future study.
Rodrigo--
Title: VeriFlow: Verifying Network-Wide Invariants in Real Time
Authors: Ahmed Khurshid, Xuan Zou, Wenxuan Zhou, Matthew Caesar, P. Brighten Godfrey
Novel Idea: This paper proposes VeriFlow, a network debugging tool that dynamically checks for network-wide invariant violations at low latency, so that network performance is not significantly affected. VeriFlow performs real-time verification in the context of SDNs and aims to detect, and potentially prevent, bugs by blocking changes to forwarding behavior that might violate important invariants.
Main Result: The authors found that VeriFlow can verify network-wide invariants within hundreds of microseconds per rule insertion or deletion. They also found that it has a small impact on network performance, increasing TCP connection setup latency by 15.5% on average.
Evidence: The authors used a stream of updates from a simulated IP network constructed with Rocketfuel topology data and real BGP traces to evaluate VeriFlow's performance. Its overhead on the NOX controller was also evaluated, using Mininet to emulate an OpenFlow network. It was found that the total verification time of VeriFlow remained below 1 ms for 97.8% of the updates verified. The mean verification time reported was 0.38 ms, and the query phase took 0.01 ms on average. After evaluating the effect of VeriFlow's operations on TCP connection setup latency and network throughput, the authors reported results on the number of TCP connections that were successfully completed per second for different workloads, with and without VeriFlow. In all cases, VeriFlow was found to cause negligible overhead on the TCP connection setup throughput (the largest reduction observed was only 0.74%).
Prior Work: VeriFlow leverages Software-Defined Networking (SDN) to create a picture of the evolving network, but it does not rely only on SDNs: it also uses novel algorithms to search for violations of key network invariants in order to ensure a real-time response. The authors built two versions of VeriFlow: one is a proxy process [25] that sits between the controller and the network and is independent of the particular controller, and one is integrated with the NOX OpenFlow controller [14] to improve performance.
Competitive Work: The paper mentions plenty of related and competitive work on debugging networks and SDNs. Some of it focuses on detecting network anomalies [10, 19]. Others, such as NICE [12], focus on checking OpenFlow applications; NICE, though, is not designed for checking network properties in real time. Others aim at ensuring data-plane consistency [20, 24] or allowing multiple applications to run side by side in a non-conflicting way [22, 23, 25]. FlowVisor [25], in particular, differs from VeriFlow in that it does not verify the rules that applications send to the switches and does not check for violations of key network invariants. To mention some other related work: Anteater [19] uses data-plane network information and checks for violations of key network invariants. ConfigChecker [10] and FlowChecker [9] also check network invariants, using BDDs to model the network state and running queries with CTL, while VeriFlow uses graph search. One important characteristic of VeriFlow compared to competitive work is that it can prevent problems from reaching the forwarding plane, which many others cannot do.
Reproducibility: The results of this paper are reproducible.
Criticism: The results of the paper are good. The authors paid particular attention to making VeriFlow different from other existing works. One of the differences that makes the results particularly interesting is that VeriFlow is the first tool that can dynamically verify network-wide invariants in an evolving network in real time. Furthermore, it is also capable of preventing faulty rules issued by SDN applications from reaching and affecting the network. These two characteristics make VeriFlow a very interesting approach.
Rodrigo
Paper Title: VeriFlow: Verifying Network-Wide Invariants in Real Time
Authors: Ahmed Khurshid, Xuan Zou, Matthew Caesar, P. Brighten Godfrey
Date: 2013
Novel idea:
Rodrigo
Where is the Debugger for my Software-Defined Network?
by Nikhil Handigol, Brandon Heller, Vimalkumar Jeyakumar, David Mazières, and Nick McKeown
Novel Idea: A debugger for networks that is inspired by gdb. The debugger provides several functionalities: breakpoint, watch, backtrace, step, and continue. This paper only goes over the implementation of breakpoint and backtrace.
Main Results: When compared against common bugs encountered by SDN developers, ndb would help significantly with correctness issues, while not doing much for performance bugs. It does well in helping to identify race conditions, logic errors, and errors in switch implementations. ndb uses a postcard model in which the switch sends a “postcard” containing the switch ID, a version number, and an output port. These values are sent to the collector, which hashes them into the path table. After the maximum time it takes to traverse the network has elapsed, the packet's postcards are ejected from the table.
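The hash-and-evict behavior described above could look roughly like the following Python sketch. The names and the 50 ms traversal bound are assumptions made for illustration; the paper does not give these specifics.

    import time
    from collections import defaultdict

    MAX_TRAVERSAL_SECS = 0.050            # assumed upper bound on network traversal time

    path_table = defaultdict(list)         # packet header -> [(arrival_time, postcard)]

    def record(header, postcard, now=None):
        """Hash an incoming postcard into the path table, keyed by packet header."""
        now = time.monotonic() if now is None else now
        path_table[header].append((now, postcard))

    def flush_expired(now=None):
        """Eject packet groups whose first postcard is older than the traversal bound."""
        now = time.monotonic() if now is None else now
        expired = [h for h, cards in path_table.items()
                   if now - cards[0][0] > MAX_TRAVERSAL_SECS]
        return {h: [c for _, c in path_table.pop(h)] for h in expired}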
Evidence: There wasn’t much evidence other than a couple of bugs mentioned in the implementation of a load balancer that could be solved by ndb.
Impact: I believe this is the start of something very good for the networking industry. As systems get more complex, we need better and better ways of debugging them with ease.
Reproducibility: This work is fairly reproducible. It outlines the basic implementation of traces, but is otherwise lacking in detail. With enough time, it could be reproduced.
Prior Work: gdb, Anteater, Header Space Analysis, OFRewind, Frenetic, Nettle, and NICE.
VeriFlow: Verifying Network-Wide Invariants in Real Time
Novel Idea: A layer in between the SDN controller and the network hardware that verifies, in real time, that there are no invariant violations. Currently, many of the systems in place to analyze network-wide invariants can only be used offline, and thus catch bugs after they happen.
Main Results: To do real-time verification, you must monitor all network update events in the live network as they are created by the applications, the devices, and the operators of the network. To achieve this, they add a shim layer between the controller and the network. To save time, verification only covers the parts of the network influenced by the new update. To find them, they use equivalence classes, i.e., sets of packets that experience the same forwarding behavior, which they compute with a multi-dimensional prefix tree. They also use forwarding graphs, which represent connectivity throughout the network; this data structure allows VeriFlow to determine reachability, cycles, and rule consistency. VeriFlow is able to check new rules for invariant violations within hundreds of microseconds and has very little impact on network performance. In cases where a new rule affects many others on the network, VeriFlow will first install the rule and then verify its correctness, but this only happens about 1% of the time.
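The forwarding-graph checks could be sketched in Python as below. This is only an illustration of the idea under assumed names and a toy graph, not VeriFlow's actual implementation; a real check would build one such graph per equivalence class from the installed rules.

    def next_hops(graph, node):
        return graph.get(node, [])

    def has_loop(graph, start):
        """Detect a forwarding loop reachable from `start` with an iterative DFS."""
        visiting, done = {start}, set()
        stack = [(start, iter(next_hops(graph, start)))]
        while stack:
            node, it = stack[-1]
            nxt = next(it, None)
            if nxt is None:
                stack.pop()
                visiting.discard(node)
                done.add(node)
            elif nxt in visiting:
                return True               # back edge -> forwarding loop
            elif nxt not in done:
                visiting.add(nxt)
                stack.append((nxt, iter(next_hops(graph, nxt))))
        return False

    def reaches(graph, src, dst):
        """Check the 'path to destination' invariant for this equivalence class."""
        seen, stack = set(), [src]
        while stack:
            n = stack.pop()
            if n == dst:
                return True
            if n not in seen:
                seen.add(n)
                stack.extend(next_hops(graph, n))
        return False

    # Toy forwarding graph for one equivalence class: A -> B -> C (egress).
    fg = {"A": ["B"], "B": ["C"]}
    print(has_loop(fg, "A"), reaches(fg, "A", "C"))   # False True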
Evidence: They tested VeriFlow with Rocketfuel simulation data and real BGP traces replayed using an OSPF simulator. 94.5% of the updates affected a single rule, and 99% affected 10 or fewer. In these cases, the average verification time was around 0.38 ms. It took VeriFlow on average 1 second to identify a link failure, and in the worst case 4 seconds. To test TCP performance with and without VeriFlow, they used 172 switches in Mininet. The only time there is significant overhead is when there are lots of flow modifications happening, and in that case the reduction was around 12%.
Impact: The impact of VeriFlow is not yet clear, but it seems like something many network admins would want on their networks.
Other Work: FlowChecker, Anteater, Header Space Analysis
Question: They mention that for IPv4 the trie is 32 levels deep. Does that mean with IPv6 it is 128? Wouldn’t that have a non-negligible effect on the runtime?