Paper Title:
MicroTE: Fine Grained Traffic Engineering for Data Centers.
Authors
Theophilus Benson, Ashok Anand, Aditya Akella, and Ming Zhang.
Date
In Proceedings of the 7th CoNEXT, 2011, pages 8:1--8:12
Novel Idea
Implementation of MicroTE, a fine-grained traffic engineering mechanism that uses a central controller to build a global view of network conditions and traffic demands, and exploits short-term traffic predictability when routing.
Main Results
- Under realistic workloads, finer granularity yields better performance at the cost of more control traffic.
- Under varying predictability, MicroTE shifts smoothly between performing like ECMP when predictability is low and performing like Optimal when predictability is high.
- MicroTE supports networks of various sizes, imposes low overhead on the network, and can compute and install network paths in less than 1 second.
Impact
MicroTE accommodates various data center traffic patterns and provides a substitute for ECMP when predictability is high.
Question
MicroTE relies on short-term predictability and falls back to ECMP when traffic is unpredictable. How common are unpredictable flows? Is it possible to build a history of them and optimize MicroTE for when a known unpredictable flow recurs? Maybe we could add an extra buffer or component within the network or the routing controller?
The paper introduces a novel concept to data center traffic engineering, the predictability of flows, and optimizes routing based on it. When almost no flows are predictable, the TE scheme degenerates to ECMP. That is the basic behavior of MicroTE.
In more detail, the three major contributions of the paper are (1) an evaluation and analysis of existing and recently proposed traffic engineering techniques under real data center traces; (2) an empirical study of the predictability of traffic in current data centers; and (3) a new fine-grained, adaptive, load-sensitive traffic engineering approach that virtually eliminates losses and reduces congestion inside data centers. The first contribution has also been made by other papers, so the interesting parts are (2) and (3). Here, predictability means: will a flow maintain its behavior without significant change over the next one or two seconds? Although this may not seem significant, the property can be used to optimize the behavior of TE schemes.
The paper claims that the problem with existing data center topologies is their inability to provide enough capacity between the servers they interconnect. So again, ECMP becomes the main target of criticism. I understand why so many papers criticize ECMP: when it was proposed (the earliest appearance I found is RFC 2991, in 2000), SDN and centralized network controllers were nowhere to be found. Yet data centers today still use a routing algorithm proposed 10 years ago, despite the arrival of SDN, which is really a tragedy. But as long as ECMP is in use, it remains a good starting point for publications, because it is easy to open an argument by attacking the weaknesses of ECMP.
Back to this paper. Besides ECMP, other TE schemes such as Fat-Tree, VL2 and Hedera are evaluated, and the paper finds performance gaps in all of them. Although some of them, such as Hedera, already use a central controller and SDN concepts, they do not perform well because some flows are bursty. To fix this, if we could see into the near future and assign flows that will not be bursty over a short period to paths that will be stable enough, efficiency would certainly improve.
What makes MicroTE different is that it introduces the idea of flow predictability. The paper claims that 35% of flows in the networks it investigated are predictable. A predictable flow is defined as one that is relatively stable over some time period. Why are some flows predictable? The paper attributes this property to the applications the data center runs, for example a MapReduce application in a data center or a multi-tiered web application running in a university data cluster. Another important part of the definition is that a predictable flow must be inter-ToR, not within a rack, because inter-ToR flows are the major source of congestion.
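To make the "relatively stable over some time period" definition concrete, here is a minimal sketch in Python. It assumes per-ToR-pair byte counts sampled at a fixed interval and treats a pair as predictable when its latest sample is within a relative threshold of the mean of its earlier samples. The function name, the data layout and the 20% default are illustrative assumptions, not taken from the paper's implementation.

```python
from statistics import mean

def is_predictable(samples, threshold=0.2):
    """Return True if the latest sample is within `threshold` (relative)
    of the mean of the earlier samples. Hypothetical helper; the 0.2
    default is an assumed value."""
    if len(samples) < 2:
        return False  # not enough history to judge stability
    *history, latest = samples
    avg = mean(history)
    if avg == 0:
        return latest == 0
    return abs(latest - avg) / avg <= threshold

# Per-ToR-pair traffic samples (bytes per interval); made-up numbers.
traffic = {
    ("tor1", "tor2"): [100, 105, 98, 110],  # stays near its mean
    ("tor1", "tor3"): [100, 5, 300, 20],    # bursty
}
predictable = {pair for pair, s in traffic.items() if is_predictable(s)}
print(predictable)
```

With these numbers only the ("tor1", "tor2") pair is classified as predictable; the bursty pair would be left to the ECMP-style fallback.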
I think two things we could borrow from this paper are the design requirements and the architecture of the system. The three design requirements could also be applied to other systems that aim to solve TE-related problems. To outperform ECMP, a next-generation TE system should use (1) multipath routing, (2) a global view of traffic, and (3) short-term predictability for adaptation. When describing the MicroTE architecture, the paper also discusses some design considerations. One example is the choice between a controller-active, polling-based monitoring method and a controller-passive, server-notification method. Polling is not encouraged in SDN because it produces a lot of control traffic. But then how does the controller learn the condition of the network? The paper's solution is to designate a server that helps the controller collect data. Some code must therefore run on the server to work with the controller. The paper, for example, adds a kernel module to Linux.
Besides the server-side monitoring module and the traditional controller module, the routing component is the main computation and reasoning building block that selects paths for flows. It uses an LP formulation and bin-packing heuristics.
I only skimmed the implementation and evaluation parts. Judging from the figures, MicroTE generally beats ECMP. In Figure 5, however, when flows are randomized, the performance of MicroTE and ECMP is close, because without predictable flows MicroTE becomes ECMP. When the authors adjust the amount of predictable flows, as shown in Figure 6, MicroTE clearly beats ECMP and moves close to the optimal solution. MicroTE also performs well when deployed in large data centers: Figure 7 shows that when predictability is higher, the maximum link utilization (MLU) under MicroTE stays below 100%, whereas ECMP causes loss in large data centers.
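As an illustration of what a bin-packing style heuristic for path selection can look like, the sketch below places the largest demands first and puts each one on the candidate path whose most-utilized link would end up least loaded. This is a generic greedy heuristic written under my own assumptions (the `place_flows` name, the dictionary layout, and the tie-breaking are hypothetical); the paper's actual heuristic and its LP formulation may differ.

```python
def place_flows(demands, paths, capacity):
    """Greedy placement sketch.
    demands:  {pair: demand}
    paths:    {pair: [path, ...]}, each path a tuple of link names
    capacity: {link: capacity}
    Returns (assignment, load)."""
    load = {link: 0.0 for link in capacity}
    assignment = {}
    # Largest demands first, as in first-fit-decreasing bin packing.
    ordered = sorted(demands.items(), key=lambda kv: kv[1], reverse=True)
    for pair, demand in ordered:
        def worst_utilization(path):
            # Utilization of the bottleneck link if this demand is added.
            return max((load[l] + demand) / capacity[l] for l in path)
        best = min(paths[pair], key=worst_utilization)
        for link in best:
            load[link] += demand
        assignment[pair] = best
    return assignment, load

capacity = {"a": 10, "b": 10}
candidates = [("a",), ("b",)]
paths = {"p1": candidates, "p2": candidates, "p3": candidates}
demands = {"p1": 6, "p2": 5, "p3": 3}
assignment, load = place_flows(demands, paths, capacity)
print(assignment, load)
```

In this toy case the 6-unit demand lands on link "a" and the 5-unit and 3-unit demands share link "b", so neither link exceeds its capacity.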
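For reference, maximum link utilization is simply the largest load-to-capacity ratio over all links; a value above 1.0 (100%) means at least one link is offered more traffic than it can carry, which shows up as loss. A minimal sketch, with made-up link names and numbers:

```python
def max_link_utilization(load, capacity):
    """Return the highest load/capacity ratio across all links.
    Links missing from `load` are treated as idle."""
    return max(load.get(link, 0.0) / capacity[link] for link in capacity)

capacity = {"core1": 10.0, "core2": 10.0}
balanced = {"core1": 8.0, "core2": 9.0}   # spread across both links
skewed = {"core1": 12.0, "core2": 5.0}    # one link oversubscribed
print(max_link_utilization(balanced, capacity))
print(max_link_utilization(skewed, capacity))
```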
In general, I like this paper.
Paper Title
MicroTE: Fine Grained Traffic Engineering for Data Centers
Authors
Theophilus Benson, Ashok Anand, Aditya Akella and Ming Zhang
Date
ACM CoNEXT 2011, December 6–9 2011, Tokyo, Japan
Novel Idea
This paper presents a traffic engineering system called MicroTE. The idea is to separate predictable traffic from unpredictable traffic and make routing decisions for each separately. The predictable flows are routed first, optimally with respect to a global objective, and then the unpredictable flows are routed using weighted ECMP.
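A rough sketch of the weighted ECMP step: each flow is hashed onto one of its candidate paths, with the chance of landing on a path proportional to that path's weight. Here the weights are assumed to be integers, for example the capacity left on each path after the predictable traffic has been placed; that choice, the `pick_path` name, and the use of SHA-256 are my assumptions for illustration, not details from the paper.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate

def pick_path(flow_id, paths, weights):
    """Deterministically map a flow identifier to a path, choosing
    paths in proportion to their non-negative integer weights."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one path needs a positive weight")
    digest = hashlib.sha256(flow_id.encode()).digest()
    # Integer point in [0, total), derived from the flow hash.
    point = int.from_bytes(digest[:8], "big") % total
    cumulative = list(accumulate(weights))
    return paths[bisect_right(cumulative, point)]

paths = ["via-core1", "via-core2"]
residual = [3, 1]  # assumed leftover capacity per path
flow = "10.0.0.1:5000->10.0.1.1:80"
print(pick_path(flow, paths, residual))
```

Because the choice depends only on the flow identifier and the weights, packets of the same flow stay on one path, while a path with zero weight is never selected.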
Main Result
The results show that MicroTE's performance lies between optimal and ECMP: when the traffic is highly predictable, MicroTE achieves nearly optimal bandwidth, and when it is not, MicroTE degenerates to ECMP. The authors also show that MicroTE is able to compute and install paths fast enough in large data center networks.
Prior Work
This paper is related to several papers we have read, including Fat-Tree, VL2 and Hedera.
Comment
Besides the design of MicroTE, there are also many takeaways from the background and comparative study sections. The authors give a very good summary of the existing techniques and their implications for traffic engineering. An even more interesting part is the design requirements for traffic engineering mechanisms. These principles allow us to analyze existing traffic engineering techniques (e.g. ECMP) at a higher level and to think about what an ideal traffic engineering system looks like.
Question & Further Work
From my understanding, the predictability discussed in this paper seems to be another way of saying stability. Is that correct? If it is, I'm wondering whether it is possible to make more specific predictions based on the characteristics of the application's traffic. More specifically, if we know the characteristics of the traffic, it might be possible to statistically predict how the traffic changes and then dynamically change the TE mechanism.
MicroTE: Fine Grained Traffic Engineering for Data Centers
Authors: Theophilus Benson, Ashok Anand, Aditya Akella, Ming Zhang
Date: Open Flow CONEXT 2011, December
Novel Idea:
They developed a system called MicroTE that adapts to traffic variations by leveraging the short-term and partial predictability of the traffic matrix. It operates at a granularity of seconds, which makes it a fine-grained technique.
Main Results: They came up with three principles that a TE mechanism must adhere to: multipath routing, coordinated scheduling using a global view of traffic, and exploiting short-term predictability for adaptation. They then gave a detailed description of the architecture of the system. The main components include the monitoring component for gathering traffic demand and flow statistics, the network controller for determining forwarding state, and the forwarding component.
Evidence:
They implemented the MicroTE system using the OpenFlow framework. They implemented the network controller and routing component as C++ modules in the “NOX” framework and implemented the monitoring component in C as a kernel module. They then ran evaluations to test the performance under realistic workloads, under different levels of predictability, and with a large number of hosts in a large data center.
Reproducibility:
Fairly high. They did give an in-depth description of their algorithms and design choices and the reasoning.
Title: MicroTE: Fine Grained Traffic Engineering for Data Centers
Authors: Theophilus Benson, Ashok Anand, Aditya Akella and Ming Zhang
Novel Idea: In this paper, the authors designed and implemented MicroTE, a fine-grained traffic engineering scheme that can work on top of a variety of data center network topologies and can adapt to traffic variations, using a central controller to create a global view of the network conditions and traffic demands. They implemented MicroTE within the OpenFlow framework to coordinate traffic scheduling in the network.
Main Result: The main result of the paper is the design and implementation of a new fine-grained and adaptive traffic engineering scheme, MicroTE, that can reduce losses and congestion inside data centers. Through experimentation, the authors found that MicroTE performs close to optimal when traffic is predictable, but approaches the performance of ECMP when traffic is not predictable.
Prior Work: MicroTE is implemented within the OpenFlow framework, which it uses to coordinate the scheduling of traffic within the network. A logically centralized NOX controller [18] is used to gather a global network view and determine how flows traverse the network.
Competitive Work: The authors performed an analysis and evaluation of other existing data center network architectures and discussed their most pressing issues. Namely, they referred to the canonical tree topology (they analyze a canonical 2-tier tree topology with two cores), the Fat-Tree interconnect [16], VL2 [9] and Hedera [2]. They found that existing techniques fail to control losses under bursty traffic in the data center, either because they don't use multipath routing, because they don't take instantaneous load into account, or because they don't make decisions based on a global view of the system.
Evidence: The authors began with an evaluation and analysis of existing traffic engineering techniques using real data center traces. They conducted simulations using traces from two data centers, a large cloud computing data center and a university's private data center. The results indicate that the performance of the existing techniques is only 80% of that of the optimal routing mechanism, because they fail to adapt to changes in the traffic load, don't take a global view of the traffic into account when making routing decisions, or don't perform multipath routing. The authors analyzed the traffic patterns and used their findings to motivate MicroTE. Through experimentation on real data center traces, they found that MicroTE performs within 1-15% of the optimal for real traffic traces, and that for high traffic predictability it performs closer to optimal routing. For low traffic predictability its performance is closer to ECMP. They also found that the overhead imposed by MicroTE, due to control messages for traffic monitoring and modification of the switch routing entries, is low.
Criticism: Overall, this is a very interesting paper, mainly because it offers a broad view of other existing frameworks and a comparison between them, and then presents MicroTE as a solution to the main issues posed by those frameworks. The evaluation part of the paper is thorough. The authors check how their proposed solution performs under realistic workloads and under different levels of predictability, and how well it scales to large data centers. Overall, this paper offers a serious alternative to existing traffic engineering schemes.