Reviews for Koponen10Onix

Rodrigo Fonseca

Mar 2, 2013, 6:46:47 PM

to csci2950u-...@googlegroups.com

Hi,

I posted a list (some details missing) of all of the remaining papers we will read through the end of the semester.

The paper for next class is Onix, a paper about a scalable SDN controller related to a design used at Google.

Please post your review as a group reply to this message.

I also need a volunteer who has not presented yet to lead the discussion; please send me a message.

Thanks,

Rodrigo

Jeff Rasley

Mar 4, 2013, 9:52:05 PM

to csci2950u-...@googlegroups.com

Authors from Nicira, Google & NEC.
Context: OSDI 2010

Onix is an SDN implementation from a few popular companies. The interesting ideas discussed in the paper seem to be implementation details that the authors found useful when building SDNs in the "real world". I found both the paper and the accompanying talk on USENIX's site quite interesting. I found two sentences in the conclusion somewhat telling: "…this paper is not about the ideology of SDN, but about its implementation." and "In fact, Onix required no novel mechanisms, but instead involves only the judicious use of standard distributed system design practices." I wonder how many other papers could get away with this? It seems the true novelty of this paper is that it is (I believe?) the first major SDN implementation in real use.

The core of Onix is its control API, which models the network as a collection of graph objects that are part of the Network Information Base (NIB). Applications such as Ethane, the subject of our previous paper, can be written on top of Onix. The Onix implementation is roughly 150k lines of C++, which seems potentially large.

The authors state that they build heavily on previous work in this area, such as 4D, RCP, SANE, Ethane, and NOX. They single out NOX as the most similar to Onix, in that it exposed a control platform with a general-purpose API. However, they state that NOX did not address issues like reliability and scalability, which are necessary for a production deployment.

Shu Zhang

Mar 4, 2013, 10:16:57 PM

to csci2950u-...@googlegroups.com

1. Paper Title:
Onix: A Distributed Control Platform for Large-scale Production Networks

2. Authors
Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

3. Novel Idea
The paper introduces Onix, a kind of middleware for production-level networks that provides distributed management of network devices and APIs for programming control logic.
Onix is stated to overcome five important challenges in building a production-quality control platform (generality, scalability, reliability, simplicity, and control plane performance). Because there has been little published work on how to build a network control platform, the authors wrote this paper to fill the void by describing the design and implementation of the system in fine-grained detail.

4. Main Results
Onix is shown to be a good base for SDN applications. Ethane, a distributed virtual switch (DVS), multi-tenant virtualized data centers, and a scale-out carrier-grade IP router are all shown to be feasible to build on Onix.
In Section 7, the paper evaluates Onix both with micro-benchmarks and with end-to-end performance measurements of Onix applications.
In the micro-benchmark experiments, the paper first tests the throughput of NIB modifications, which shows that the throughput is proportional to the number of attributes modified. Second, by measuring the relationship between memory usage and the number of entities in a single Onix instance, it shows that a single Onix instance can easily handle millions of entities. Third, it tests packet forwarding when a single Onix instance connects to a cloud of switches.
Multi-node performance is tested next, by measuring RPC performance between two Onix instances. The high RPC throughput shows that the DHT should be capable of handling very dynamic network state.
In addition, the paper tests the performance of Onix under three types of failures. Applications built on the Onix platform are also tested to measure how long they take to recover from failure. The results show that Onix can achieve reactive properties on par with traditional routing implementations.

5. Impact
The paper fills a void: no previously published work covered the design and implementation of a network control platform. The impact of Onix itself is that it meets the five challenges in building a production-quality control platform. The four-layer architecture of Onix is a paradigm for control plane design. The NIB is also a crucial part of Onix. Onix puts a lot of emphasis on the distribution properties of the platform. It also proposes some models for dealing with scalability and reliability.
As a result, some SDN applications are built on top of Onix using the APIs it provides. The platform looks fairly promising to use.

6. Evidence
The paper is an object-centric article. Its logic chain is to first present an artifact, then show why we need it, then introduce its design and concerns, and then run experiments to show why it is good. Alongside the introduction, it presents some additional applications that use the artifact.

7. Prior Work
Onix derives from a long history of related contributions:
[1] GREENBERG, A., HJALMTYSSON, G., MALTZ, D. A., MYERS, A., REXFORD, J., XIE, G., YAN, H., ZHAN, J., AND ZHANG, H. A Clean Slate 4D Approach to Network Control and Management. SIGCOMM CCR 35, 5 (2005), 41–54.
[2] CAESAR, M., CALDWELL, D., FEAMSTER, N., REXFORD, J., SHAIKH, A., AND VAN DER MERWE, K. Design and Implementation of a Routing Control Platform. In Proc. NSDI (April 2005).
[3] CASADO, M., GARFINKEL, T., AKELLA, A., FREEDMAN, M. J., BONEH, D., MCKEOWN, N., AND SHENKER, S. SANE: A Protection Architecture for Enterprise Networks. In Proc. Usenix Security (August 2006).
[4] CASADO, M., FREEDMAN, M. J., PETTIT, J., LUO, J., MCKEOWN, N., AND SHENKER, S. Ethane: Taking Control of the Enterprise. In Proc. SIGCOMM (August 2007).
[5] GUDE, N., KOPONEN, T., PETTIT, J., PFAFF, B., CASADO, M., MCKEOWN, N., AND SHENKER, S. NOX: Towards an Operating System for Networks. In SIGCOMM CCR (July 2008).

8. Competitive work
One competing work is NOX, which can be considered a control platform offering a general-purpose API. However, NOX did not adequately address reliability, nor did it give the application designer enough flexibility to achieve scalability. In contrast, Onix provides a far more general API than previous systems, and it also provides flexible distribution primitives, which ensure scalability, performance, and reliability.

9. Reproducibility
The control plane has been implemented. The paper states that it consists of roughly 150,000 lines of C++ and integrates a number of third-party libraries. To reproduce the experimental results, as long as we can obtain the required number of switches and cluster them as a cloud, the rest of the work is simply deploying the control plane onto the cloud, carrying out the experiments, and re-collecting the data.

10. Question & Criticism
I doubt that Onix is necessary to build an SDN network. The core data structure of the control platform is the NIB, which collects information about the devices and topology of the network. However, this functionality has already been implemented in some controllers, like Ethane and Floodlight. Apart from this functionality, what Onix leaves us is the API for SDN programmers and its distribution primitives. However, the functionality of the NIB API could also be achieved by other controllers and, with a certain design, other SDN controllers could also be distributed and cooperate with each other.
So, in one sentence: is there a circumstance in which we must use a distributed control platform? In other words, how large must a “production level” network be before we should distribute the control platform across it?

Zhiyuan "Eric" Zhang

Mar 4, 2013, 10:46:25 PM

to csci2950u-...@googlegroups.com

Paper

Onix: A Distributed Control Platform for Large-scale Production Networks

Authors

Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

Date

OSDI '10

Novel Idea

This paper proposes Onix, a platform on top of which a network control plane can be implemented. The authors argue that the basic network control primitives should be implemented once and then reused by multiple control tasks. Based on this idea, Onix provides a set of APIs that allow users to implement network control functions on top of a high-level abstraction. This general API also provides scalability and reliability at a low level and keeps them separate from the control logic. The authors also define a data model called the NIB that represents the network as a graph of objects. With all the information about the network topology, applications can control the network by reading and altering the state of the network elements, or by registering for notifications of state changes on them. With all these abstractions, they present this network control paradigm as Software-Defined Networking.
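
To make the "graph of objects plus read, write, and notify" idea concrete, here is a minimal Python sketch. It is not the Onix API: the class `ToyNIB` and every method name are hypothetical, and it ignores distribution entirely. It only shows how an application might read an attribute, change it, and be called back when it changes.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Set

# Hypothetical types and names: an illustration only, not the Onix API.
Callback = Callable[[str, str, Any], None]


class ToyNIB:
    """A single-process stand-in for a network information base."""

    def __init__(self) -> None:
        self._attrs: Dict[str, Dict[str, Any]] = {}
        self._links: Dict[str, Set[str]] = defaultdict(set)
        self._listeners: Dict[str, List[Callback]] = defaultdict(list)

    def add_entity(self, entity_id: str, **attrs: Any) -> None:
        # Each entity (switch, port, link, ...) is a bag of attributes.
        self._attrs[entity_id] = dict(attrs)

    def connect(self, a: str, b: str) -> None:
        # Edges of the graph, stored in both directions.
        self._links[a].add(b)
        self._links[b].add(a)

    def neighbors(self, entity_id: str) -> Set[str]:
        return set(self._links[entity_id])

    def read(self, entity_id: str, key: str) -> Any:
        return self._attrs[entity_id][key]

    def write(self, entity_id: str, key: str, value: Any) -> None:
        # Update the attribute, then notify everyone registered on the entity.
        self._attrs[entity_id][key] = value
        for callback in self._listeners[entity_id]:
            callback(entity_id, key, value)

    def register(self, entity_id: str, callback: Callback) -> None:
        self._listeners[entity_id].append(callback)


nib = ToyNIB()
nib.add_entity("switch-1", kind="node")
nib.add_entity("port-1", kind="port", status="up")
nib.connect("switch-1", "port-1")

events = []
nib.register("port-1", lambda e, k, v: events.append((e, k, v)))
nib.write("port-1", "status", "down")
```

After the last line, `events` holds the single notification for `port-1`, which is the hook a control application would use to react to state changes.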

Main Results

SDN brings a paradigm shift in network architectures: it simplifies network control tasks by providing a general platform. The authors also caution that SDN does not solve all the problems of network management by itself. It only provides an abstraction so that each problem can be solved more easily at its own level. Problems like scalability still limit the design of control applications.

Evidence

The paper discusses four applications built on top of Onix: Ethane, a distributed virtual switch, a multi-tenant virtualized data center, and a scale-out carrier-grade IP router. The authors also evaluate the performance of Onix as a platform and of applications running on top of it.

Prior Work

The idea of this paper derives from a long line of work, including the 4D project, RCP, SANE, Ethane, and NOX. Onix extends this existing work by providing a more general API as well as distribution primitives, which can be reused by network control applications.

Reproducibility

Their evaluation is based on a working implementation in C++. Therefore I guess it wouldn't be difficult to reproduce their result.

Question

The paper doesn't discuss security issues. I can see that there are huge benefits for security; however, it seems that the centralized controller could also be vulnerable. I'm wondering whether security should be one of the "low-level" features hidden behind the API, or whether we should run security software/hardware on top of the platform. A traditional OS seems to do both: the OS provides low-level security, while security applications run in user mode.

DTrejo

Mar 4, 2013, 11:59:27 PM

to csci2950u-...@googlegroups.com

Paper: Onix: A Distributed Control Platform for Large-scale Production Networks, by Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

Novel Idea: A customizable and reusable control plane with an API that one can use to implement any management functions one wishes, without having to solve distributed systems problems (discovery, CAP). Onix is much more general, API-wise. Also, it provides customizable distribution primitives (DHT and group storage).

Results: A general, scalable, reliable, simple, performant control plane platform.

Impact: A second-generation take on the ideas proposed in NOX.

Evidence: Measurements showing favorable pkt/s, throughput, and memory usage figures under various benchmarks (single-node, multi-node, link failures, Onix instance failure).

Prior work: 4D, RCP, SANE, Ethane, NOX.

Reproducibility: High, as they've coded it up. Took 150k SLOC of C++. Reuses FML as a policy language. Adding an Onix element in a different language means writing ~2k SLOC, depending on the language. Their code supports Python, Java, and C++.

Criticism: Only a small amount of deployment experience, as noted by the authors.

Tan "Charles" Zhang

Mar 5, 2013, 12:19:15 AM

to csci2950u-...@googlegroups.com

Onix: A Distributed Control Platform for Large-scale Production Networks

Authors: Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

Date: OSDI, Oct, 2010

Novel Idea:
A new design and implementation for a platform on top of which a network control plane can be implemented as a distributed system, providing a general API for control plane implementations while allowing them to make their own tradeoffs among consistency, durability and scalability.

Main Results:
A detailed description of the design and implementation of the platform and the rationale behind it. They also implemented the system in C++ with 150,000 lines of code and did some benchmarking to evaluate it.

Impact:
A step further into the realm of software-defined networking. It is a full implementation of a distributed controller, and a precursor to all subsequent implementations.

Evidence:
They ran tests of the scalability and reliability of Onix in multiple scenarios, combining a single Onix instance, multiple Onix instances, and different failure situations.

Prior work: Onix descends from a long line of work that features a centralized controller, treats network control as a distributed systems problem, and separates the forwarding plane from the control plane.

Reproducibility:
They said they have Onix implemented with 150,000 lines of code and have tested the performance. So if we follow the same logic we should be able to reproduce the work and results.

Question and criticism:
Overall, it is a great work on the specific implementation of an SDN controller. I’d be interested to know how the flow rate would change compared with traditional network flows in a similar network setting.

Christopher Picardo

Mar 5, 2013, 1:19:43 AM

to csci2950u-...@googlegroups.com

Paper Review - Christopher B. Picardo

Paper Title:

Onix: A Distributed Control Platform for Large Scale Production Networks

Author(s):

Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

Date:

OSDI 2010

Novel Idea:

Onix, a platform on top of which a network control plane can be implemented as a distributed system.

The control platform handles the lower-level issues and allows developers to program their control logic on a high-level API. In so doing, Onix essentially turns networking problems into distributed systems problems, resolvable by concepts and paradigms familiar to distributed systems developers.

Main Results:

We can imagine an application on top of Onix that allows the creation of tenant-specific L2 networks. These networks provide a standard Ethernet service model, can be configured independently of each other, and can span physical network subnets.

Onix is reliable in the face of failures.

Impact:

A system will stabilize only if the traffic sources throttle back at least as quickly as the queues are growing. Handles link failures, switch failures, and Onix instance failures.

Onix instances monitor their connections to switches using aggressive keepalives. Once a link or switch failure is reported to the control application, the latencies involved in disseminating the failure-related state updates throughout the Onix cluster become essential; they define the absolute minimum time the control application will take to react to the failure throughout the network.
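
As an illustration of keepalive-based failure detection in general, here is a minimal Python sketch. It is not Onix's implementation: the class `KeepaliveMonitor`, its methods, and the 0.5 second timeout are all assumptions made for the example, and time is passed in explicitly so the logic is easy to follow.

```python
from typing import Dict, List


class KeepaliveMonitor:
    """Hypothetical detector: a switch is suspected once it has been silent too long."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._last_seen: Dict[str, float] = {}

    def heard_from(self, switch_id: str, now: float) -> None:
        # Record the time of the most recent keepalive reply.
        self._last_seen[switch_id] = now

    def suspected_failed(self, now: float) -> List[str]:
        # Anything silent for longer than the timeout is reported.
        return sorted(
            switch_id
            for switch_id, seen in self._last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = KeepaliveMonitor(timeout_s=0.5)  # assumed value, not from the paper
monitor.heard_from("sw-a", now=0.0)
monitor.heard_from("sw-b", now=0.0)
monitor.heard_from("sw-a", now=0.4)        # only sw-a keeps answering
failed = monitor.suspected_failed(now=0.6)
```

A shorter timeout detects failures sooner, but the reaction time across the cluster is still bounded below by how quickly the resulting state update spreads to the other instances.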

Onix is currently being used by a number of organizations as the platform for building commercial applications. While scaling work and testing is ongoing, applications have managed networks of up to 64 switches with a single Onix instance, and Onix has been tested in clusters of up to 5 instances.

Rather than forcing developers to deal directly with the details of the physical infrastructure, the control platform handles the lower-level issues and allows developers to program their control logic on a high-level API. In so doing, Onix essentially turns networking problems into distributed systems problems, resolvable by concepts and paradigms familiar to distributed systems developers.

Evidence:

The essence of the SDN philosophy is that basic primitives for state distribution should be implemented once in the control platform rather than separately for individual control tasks, and should use well-known and general-purpose techniques from the distributed systems literature rather than the more specialized algorithms found in routing protocols and other network control mechanisms.

The SDN paradigm allows network system implementors to use a single control platform to implement a range of control functions (e.g., routing, traffic engineering, access control, VM migration) over a spectrum of control granularities (from individual flows to large traffic aggregates) in a variety of contexts (e.g., enterprises, datacenters, WANs).

Prior work:

Onix descends from a long line of work in which the control plane is separated from the dataplane, but Onix’s focus on being a production-quality control platform for large-scale networks led us to focus more on reliability, scalability, and generality than previous systems.

[3] CAESAR, M., CALDWELL, D., FEAMSTER, N., REXFORD, J., SHAIKH, A., AND VAN DER MERWE, K. Design and Implementation of a Routing Control Platform. In Proc. NSDI (April 2005).

[4] CAI, Z., DINU, F., ZHENG, J., COX, A. L., AND NG, T. S. E. The Preliminary Design and Implementation of the Maestro Network Control Platform. Tech. rep., Rice University, Department of Computer Science, October 2008.

[5] CASADO, M., FREEDMAN, M. J., PETTIT, J., LUO, J., MCKEOWN, N., AND SHENKER, S. Ethane: Taking Control of the Enterprise. In Proc. SIGCOMM (August 2007).

[6] CASADO, M., GARFINKEL, T., AKELLA, A., FREEDMAN, M. J., BONEH, D., MCKEOWN, N., AND SHENKER, S. SANE: A Protection Architecture for Enterprise Networks. In Proc. Usenix Security (August 2006).

[15] GREENBERG, A., HJALMTYSSON, G., MALTZ, D. A., MYERS, A., REXFORD, J., XIE, G., YAN, H., ZHAN, J., AND ZHANG, H. A Clean Slate 4D Approach to Network Control and Management. SIGCOMM CCR 35, 5 (2005), 41–54.

Question:

Why is the control platform not designed to allow multiple applications to control the network simultaneously? Why are we limited to a single application per deployment?

Criticism:

In one of our upcoming deployments, if a single-instance application took one second to analyze the statistics of a single Port and compute a result (e.g., for billing purposes), that application would take two months to process all Ports in the NIB. Therefore the emphasis is on light weight analysis/monitoring.
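
As a rough sanity check on that figure, here is a back-of-the-envelope calculation in Python. The assumptions are mine, not the paper's: "two months" is taken as 60 days, and the multi-instance case assumes the ports partition perfectly across instances with no coordination overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ports_scanned(days: float, seconds_per_port: float = 1.0) -> int:
    """How many ports one instance gets through in the given number of days."""
    return int(days * SECONDS_PER_DAY / seconds_per_port)


def days_to_scan(ports: int, instances: int = 1, seconds_per_port: float = 1.0) -> float:
    """Days needed when the ports are split evenly across instances (idealized)."""
    return ports * seconds_per_port / (instances * SECONDS_PER_DAY)


implied_ports = ports_scanned(days=60)           # assumed reading of "two months"
print(implied_ports)                             # 5184000
print(days_to_scan(implied_ports, instances=5))  # 12.0
```

So the quoted example implies a NIB on the order of five million ports, and even an ideal five-way partition would still need about twelve days at one second per port.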

Shao, Tuo

Mar 5, 2013, 12:48:57 AM

to csci2950u-...@googlegroups.com

Paper Title

Onix: A Distributed Control Platform for Large-scale Production Networks

Authors

Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

Novel Idea

This paper presents Onix, a distributed network control platform. By gathering information from switches and managing this information both within a single instance and in a distributed fashion, Onix provides general APIs to applications, allowing them to build different services according to their different requirements for durability, consistency, and scalability.

Main Results

The major takeaway from this paper is the design of the NIB, which maps the entities in the network and gives the control plane access to them. To scale the network, the paper explores the use of the NIB to allow partitioning and aggregation. Furthermore, referential inconsistency detection logic and conflict resolution logic are proposed to ensure consistency and allow coordination among applications. Since the NIB is designed for durability and consistency, a second store, a DHT for volatile information, is also provided as a complement.
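
To illustrate what application-supplied conflict resolution can look like on top of a weakly consistent store, here is a small Python sketch. It is not Onix's DHT or its API: `ToyEventualStore`, its methods, and the choice to model conflicts as a list of sibling values are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical: a resolver picks one winner out of the conflicting values.
Resolver = Callable[[List[str]], str]


class ToyEventualStore:
    """Keeps every conflicting value for a key until the application resolves them."""

    def __init__(self, resolver: Optional[Resolver] = None) -> None:
        self._values: Dict[str, List[str]] = {}
        self._resolver = resolver

    def put_from_replica(self, key: str, value: str) -> None:
        # Writes arriving from different replicas may disagree; keep them all.
        siblings = self._values.setdefault(key, [])
        if value not in siblings:
            siblings.append(value)

    def get(self, key: str) -> List[str]:
        siblings = self._values.get(key, [])
        if len(siblings) > 1 and self._resolver is not None:
            # Let the application-supplied logic decide, then remember its choice.
            winner = self._resolver(siblings)
            self._values[key] = [winner]
            return [winner]
        return list(siblings)


# Without a resolver, the reader sees both conflicting values.
raw = ToyEventualStore()
raw.put_from_replica("port-7/owner", "instance-b")
raw.put_from_replica("port-7/owner", "instance-a")
unresolved = raw.get("port-7/owner")

# With a deterministic resolver (here: lowest name wins), every reader converges.
resolved_store = ToyEventualStore(resolver=min)
resolved_store.put_from_replica("port-7/owner", "instance-b")
resolved_store.put_from_replica("port-7/owner", "instance-a")
resolved = resolved_store.get("port-7/owner")
```

The point of the sketch is the division of labor: the store only detects that values disagree, while the policy for picking a winner lives in the application.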

Impact

This paper provides us with a feasible control platform for scalable SDN.

Evidence

The ideas in this paper are derived from previous SDN research and share similar design principles. However, there are several requirements that previous work didn't satisfy and that this paper wants to meet. To meet the requirement of scalability, the paper describes strategies for using the NIB, each of which is followed by several examples and demonstrations. For reliability, the paper describes four kinds of network failure and solutions to them. To distribute the NIB so that it satisfies the different requirements of applications while remaining consistent, the paper also presents several strategies. By implementing Onix and conducting experiments, the paper evaluates the performance and reliability of the system.

Prior Work

This paper follows the idea of separating the control plane from the data plane presented in work like the 4D project, RCP, SANE, Ethane, and NOX;

This paper refers to work of DIFANE to reduce load of controller;

This paper follows the path of previous distributed system design like Bayou, PRACTI, WheelFS and PNUTS.

Competitive Work

This paper is a more practical implementation and helps applications to achieve better scalability than previous SDN design work;

This paper is complementary to forwarding-plane-focused systems;

Criticism and Questions

It seems that the designer of an application still has to know how the underlying network works. For example, as mentioned in Section 4.4, the application has to deal with conflicting sources of data. However, the goal of SDN is to hide the details of the network and provide an abstraction. So is there any better way to deal with this problem?

Zhou, Rui

Mar 4, 2013, 11:08:12 PM

to csci2950u-...@googlegroups.com

Paper Title :

Onix: A Distributed Control Platform for Large-scale Production Networks

Author(s) :

Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

Date :

OSDI '10

Novel Idea :

Onix provides a general API for SDN control plane implementations, while allowing designers to make their own trade-offs among consistency, durability, and scalability.

Main Result(s) :

The evaluation of Onix covers three key scalability-related aspects: throughput of the NIB, memory usage of the NIB, and bandwidth in the presence of many connections, with a single Onix instance or multiple nodes. The test results show that the throughput holds fairly steady as the number of operations increases; the memory usage is proportional to the number of NIB entities and the number of operations; and the bandwidth is almost constant up to 1k OpenFlow connections. Reliability tests on Onix also produced reasonably good results. The authors therefore claim that Onix provides a solid foundation that is also general enough, and that it is up to the designers of upper-layer applications (the control logic) to maintain scalability and performance.

Impact :

It seems many distributed network systems have been built on Onix or on its ideas.

Evidence :

The performance evaluation shows that Onix is a solidly built API for building a network operating system.

Prior Work & Competitive work :

Onix is inspired by and based on many previous SDN projects, but it adds a focus on generality, scalability, reliability, simplicity, and control plane performance.

Reproducibility :

Yes

Question & Criticism:

The paper mentioned :

"We note that if the control logic implements distributed coordination, race-conditions in state updates will either not exist or will be transient in nature."

I am unsure about this: how would such "distributed coordination" be implemented? Wouldn't we need to solve race conditions first in order to achieve such "distributed coordination"?

kmdent

Mar 4, 2013, 10:12:54 PM

to csci2950u-...@googlegroups.com, Rodrigo Fonseca

Onix: A Distributed Control Platform for Large-scale Production Networks

By: Teemu Koponen, Martin Casado, Natasha Gude, Jeremy Stribling, Leon Poutievski, Min Zhu, Rajiv Ramanathan, Yuichiro Iwata, Hiroaki Inoue, Takayuki Hama, Scott Shenker

Novel Idea: An extremely general API for an SDN, as well as other flexible distribution primitives, i.e. DHT and group membership. It’s useful because the programmer using the API doesn’t have to implement methods of information distribution. It does not make guarantees about consistency or ordering; that is left up to the application. The idea is that it can be customized as the application sees fit.

Main Results: Using a NIB as the control model and distribution model allowed them to build a scalable and customizable system. It could also be extended to store logical elements. The system allowed for possible addition of hierarchical networks, notification on state changes, as well as other things like subclassing of the default structure to accommodate abstract concepts like tunneling.

Impact: It has not had much impact yet, as it is in testing phases, but it is set to roll out to corporations soon. It is the most customizable SDN currently available. It is one of the first papers to outline implementation of SDNs as opposed to “What our SDNs should do” papers.

Prior Work: 4D, RCP, SANE, Ethane

Competitive Work: NOX was the only other system offering a general-purpose API. It lacked reliability, and didn’t allow for enough flexibility when scaling systems.

Crit/Comment: Seems like a tradeoff between hassle and customisation. An example is the “no ordering guarantees” policy, as well as having to implement distributed locking yourself.

Ideas: Extend Onix to use vector clocks.
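
For readers unfamiliar with vector clocks, here is a minimal Python sketch of the standard technique. Nothing in it comes from Onix: the function names and the instance names `onix-1` and `onix-2` are hypothetical. It shows how two updates can be classified as ordered or concurrent, which is the information an application would need to detect conflicting writes.

```python
from typing import Dict

# A clock maps a node name to the number of events seen from that node.
Clock = Dict[str, int]


def tick(clock: Clock, node: str) -> Clock:
    """Return a copy of the clock with this node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a: Clock, b: Clock) -> Clock:
    """Element-wise maximum, used when one node learns of another's updates."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def happened_before(a: Clock, b: Clock) -> bool:
    """True if a is causally before b: no counter larger, at least one smaller."""
    nodes = a.keys() | b.keys()
    return all(a.get(n, 0) <= b.get(n, 0) for n in nodes) and any(
        a.get(n, 0) < b.get(n, 0) for n in nodes
    )


def concurrent(a: Clock, b: Clock) -> bool:
    """True if neither update is before the other and the clocks differ."""
    nodes = a.keys() | b.keys()
    differ = any(a.get(n, 0) != b.get(n, 0) for n in nodes)
    return differ and not happened_before(a, b) and not happened_before(b, a)


# Two instances update the same entry independently: a genuine conflict.
a = tick({}, "onix-1")
b = tick({}, "onix-2")

# onix-1 then sees both updates and writes again: ordered after a and b.
c = tick(merge(a, b), "onix-1")
```

Here `concurrent(a, b)` is true, while `happened_before(a, c)` and `happened_before(b, c)` are both true, so only the first pair would need conflict resolution.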

--

kmdent
