Reviews for Benson11MicroTE

Rodrigo Fonseca

Apr 1, 2013, 8:40:40 PM

to csci2950u-...@googlegroups.com

Hi,

Please post the reviews for MicroTE as a group reply to this message!

Thanks,
Rodrigo

Christopher Picardo

Apr 1, 2013, 9:17:17 PM

to csci2950u-...@googlegroups.com

Paper Title:

MicroTE: Fine Grained Traffic Engineering for Data Centers.

Authors

Theophilus Benson, Ashok Anand, Aditya Akella, and Ming Zhang.

Date

In Proceedings of the 7th CoNEXT, 2011, pages 8:1--8:12

Novel Idea

Implementation of MicroTE, a fine-grained traffic engineering mechanism that uses a central controller to aggregate a global view of network conditions and traffic demands, and that predicts traffic and leverages the predictable portion when routing.

Main Results

- Under realistic workloads, finer granularity results in better performance, at the cost of a greater amount of control traffic.

- Under varying predictability, MicroTE has no problem shifting between performing like ECMP when predictability is low, and performing like Optimal when predictability is high.

- MicroTE supports networks of various sizes, imposes low overhead on the network, and is able to calculate and install network paths in less than 1 second.

Impact

MicroTE accommodates various data center traffic patterns and provides a substitute for ECMP when predictability is high.

Question

MicroTE is weak on long-term predictability, at which point we switch to ECMP. How common are unpredictable flows? Is it possible to build a history out of them and optimize MicroTE for when a known unpredictable flow happens again? Maybe we could add an extra buffer or component within the network or the routing controller?

Jeff Rasley

Apr 1, 2013, 10:11:17 PM

to csci2950u-...@googlegroups.com

Title: MicroTE: fine grained traffic engineering for data centers

Authors: Theo Benson, Ashok Anand, Aditya Akella (U. Wisc), and Ming Zhang (MSR)

Date/Conf: CoNext 2011

The primary contributions presented in this paper are the following:

1) The evaluation of current traffic engineering techniques with real traces, e.g., the authors look at VL2, Fat-Tree, and Hedera.

2) An empirical study of the predictability of data center network traffic.

3) The architecture and evaluation of MicroTE, which leverages OpenFlow to provide fine-grained adaptive traffic engineering.

Main results:

The authors analyzed traffic between pairs of Top-of-Rack (ToR) switches in a large corporate data center (CLD) and a university data center (UNV). They found that in CLD 60% of flows between ToRs remain predictable for 1.6 - 2.5 seconds on average, and in UNV 35% remain predictable for 1.5 - 5 seconds on average. Based on this discovery they build a system, MicroTE, that designates a server in each rack to periodically aggregate and report traffic matrix data to an OpenFlow controller. The controller is then able to install switch rules for the predictable flows in order to minimize congestion. For non-predictable flows it uses the standard ECMP technique for routing.

Performance gap between routing X and optimal (optimal=0% gap):

ECMP: 15-20%

Fat-Tree: 23%

VL2: 20%

Hedera: On par with ECMP

Spanning Tree: Much worse than ECMP, see Figure 1.

MicroTE: 1-5% in general, depending on predictability. "Very close" to optimal for highly predictable workloads and close to ECMP for unpredictable workloads.

Comments: I enjoyed this paper overall, especially all the statistics about the traffic in the CLD/UNV data centers.

Shu Zhang

Apr 2, 2013, 12:12:04 AM

to csci2950u-...@googlegroups.com

The paper introduces a novel concept to data center traffic engineering, namely the predictability of flows, and builds its optimization on this property. When almost no flows are predictable, the TE scheme degenerates to ECMP. That is the basic behavior of MicroTE.

Diving into the details, the three major contributions of the paper are (1) an evaluation and analysis of existing and recently proposed traffic engineering techniques under real data center traces; (2) an empirical study of the predictability of traffic in current data centers; and (3) a new, fine-grained, adaptive, load-sensitive traffic engineering approach that virtually eliminates losses and reduces congestion inside data centers. The first contribution has also been made by other papers, so the interesting parts are (2) and (3). Here predictability means: will a flow maintain its behavior without significant change over the next one or two seconds? Although this may not seem significant, the property can be used to optimize the behavior of TE schemes.

The paper claims that the problem with existing data center topologies is their inability to provide enough capacity between the servers they interconnect. So again, ECMP becomes the major target of criticism. I totally understand why ECMP is criticized by so many papers: when ECMP was proposed (I found it first appears in RFC 2991 in 2000), the concepts of SDN and a central network controller were nowhere to be found. But data centers today are still using a routing algorithm proposed 10 years ago, regardless of the appearance of SDN, which is really a tragedy. BUT as long as ECMP is being used, it serves as a good starting point for publishing, because it is really easy to start your argument by attacking the weaknesses of ECMP.

Back to this paper. Besides ECMP, other TE schemes such as Fat-Tree, VL2, and Hedera are evaluated, and the paper finds performance gaps in these techniques. Although the central controller and SDN concepts exist in some of them, such as Hedera, they cannot perform well because some flows are bursty. So to fix these problems, if we can see the future and assign flows that will not be bursty in the near future to paths that will be stable enough, efficiency will certainly improve.

So what makes MicroTE different is that it introduces the idea of flow predictability. The paper claims that 35% of flows in the networks it investigated are predictable. A predictable flow is defined as one that is relatively stable over some time period. Why are some flows predictable? The paper attributes the property to the applications the data center is running, for example the MapReduce application in a data center or a multi-layered web application running in a university data cluster. Another important part of the definition is that a predictable flow must be inter-ToR, not within a rack, because inter-ToR flows are the major source of congestion.
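To make the "relatively stable over some time period" definition concrete, here is a minimal Python sketch of how a controller might split a ToR-to-ToR traffic matrix into predictable and unpredictable entries. The function names, the data layout, and the 20% tolerance `delta` are assumptions made for illustration, not details taken from the reviews above.

```python
from statistics import mean

def is_predictable(samples, delta=0.2):
    """Return True if every sample lies within `delta` (a fraction) of the window mean.

    `samples` is a list of byte counts for one ToR pair, one per measurement
    interval. The tolerance `delta` is an illustrative assumption.
    """
    if not samples:
        return False
    avg = mean(samples)
    if avg == 0:
        return False  # an idle pair gives the scheduler nothing to plan around
    return all(abs(s - avg) <= delta * avg for s in samples)

def split_traffic_matrix(history, delta=0.2):
    """Split {(src_tor, dst_tor): [samples]} into predictable and unpredictable demands.

    Each returned dict maps a ToR pair to its mean demand over the window.
    """
    predictable, unpredictable = {}, {}
    for pair, samples in history.items():
        demand = mean(samples) if samples else 0.0
        if is_predictable(samples, delta):
            predictable[pair] = demand
        else:
            unpredictable[pair] = demand
    return predictable, unpredictable
```

With this split, only the entries in `predictable` would be handed to the optimizing router; everything else would fall through to ECMP-style hashing.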

I think two good points we could borrow from this paper are the design requirements and the architecture of the system. The three design requirements could also be applied to other systems aiming to solve TE-related problems. In order to outperform ECMP, a next-generation TE system should (1) use multipath routing, (2) use a global view of traffic, and (3) exploit short-term predictability for adaptation. When discussing the MicroTE system architecture, the paper mentions some considerations in designing the system. One example is the choice between a controller-active, polling-based monitoring method and a controller-passive, server-notification method. Actually, polling is not encouraged in SDN because it produces a lot of control traffic. But how else can the controller learn the condition of the network? The paper gives a solution: designate a server to help the controller collect data. So some code has to be written on the server to let it work with the controller. The paper, for example, added a kernel module in Linux.

Besides the server-side monitoring module and the traditional controller module, the routing component serves as a major computation and reasoning building block to select paths for flows. It uses LP formulation and Bin-packing heuristics.
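As a rough illustration of what a bin-packing style heuristic for path selection can look like, here is a small Python sketch. It is a generic greedy placement (largest demand first, each placed on the candidate path that keeps the bottleneck utilization lowest), not the paper's exact algorithm; the function names and the data layout are assumptions.

```python
def greedy_place(demands, paths, capacity):
    """Greedily assign each demand to one of its candidate paths.

    demands:  {(src, dst): volume}
    paths:    {(src, dst): [path, ...]}, each path a tuple of link names
    capacity: {link: capacity}, in the same units as the demand volumes
    Returns (assignment, load): the chosen path per pair and the load per link.
    """
    load = {link: 0.0 for link in capacity}
    assignment = {}
    # Place the largest demands first, as in first-fit-decreasing bin packing.
    for pair, volume in sorted(demands.items(), key=lambda kv: kv[1], reverse=True):
        def bottleneck(path):
            # Utilization of the most loaded link if this demand used `path`.
            return max((load[link] + volume) / capacity[link] for link in path)
        best = min(paths[pair], key=bottleneck)
        for link in best:
            load[link] += volume
        assignment[pair] = best
    return assignment, load

def max_link_utilization(load, capacity):
    """Maximum link utilization (MLU) across all links."""
    return max(load[link] / capacity[link] for link in capacity)
```

An LP formulation would search for the true minimum MLU; a greedy pass like this trades optimality for speed, which matters if paths have to be recomputed every couple of seconds.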

I only skimmed the implementation and evaluation parts. Judging by the figures, MicroTE generally beats ECMP. In Figure 5, when flows are randomized, the performance of MicroTE and ECMP is close, because without predictable flows MicroTE becomes ECMP. But when they adjust the amount of predictable flows, as shown in Figure 6, MicroTE clearly beats ECMP and moves close to the optimal solution. MicroTE also performs well when deployed in large data centers: Figure 7 shows that when predictability is higher, the MLU under MicroTE stays below 100%, whereas ECMP causes loss in large data centers.

In general, I like this paper.

Zhiyuan "Eric" Zhang

Apr 2, 2013, 1:12:01 AM

to csci2950u-...@googlegroups.com

Paper Title

MicroTE: Fine Grained Traffic Engineering for Data Centers

Authors

Theophilus Benson, Ashok Anand, Aditya Akella and Ming Zhang

Date

ACM CoNEXT 2011, December 6–9 2011, Tokyo, Japan

Novel Idea

This paper presents a traffic engineering system called MicroTE. The idea is to separate predictable traffic from unpredictable traffic and make routing decisions for each separately. The predictable flows are routed optimally first, according to a global objective, and then the unpredictable flows are routed using weighted ECMP.
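One way to picture the weighted ECMP step is the Python sketch below: after the predictable traffic has been placed, each candidate path gets a share of the unpredictable traffic proportional to its leftover bottleneck capacity. The weighting rule, the function name, and the data layout are assumptions for illustration; the paper's exact weighting is not reproduced here.

```python
def weighted_ecmp_split(paths, load, capacity):
    """Return one weight per path, proportional to its residual bottleneck capacity.

    paths:    list of paths, each a tuple of link names
    load:     {link: load already placed by the predictable-traffic pass}
    capacity: {link: capacity}
    The weights sum to 1. If no path has spare capacity, fall back to an
    equal split, which is plain ECMP.
    """
    residual = [
        max(0.0, min(capacity[link] - load[link] for link in path))
        for path in paths
    ]
    total = sum(residual)
    if total == 0:
        return [1.0 / len(paths)] * len(paths)
    return [r / total for r in residual]
```

For example, with two single-link paths of capacity 10 carrying loads of 6 and 2, the residuals are 4 and 8, so the second path would receive two thirds of the unpredictable traffic.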

Main Result

The results show that MicroTE's performance is between optimal and ECMP: when the traffic is highly predictable MicroTE achieves nearly optimal bandwidth, and when it is not MicroTE degenerates to ECMP. The authors also show that MicroTE is able to compute and install paths fast enough in large data center networks.

Prior Work

This paper is related to several papers we have read, including Fat-Tree, VL2 and Hedera.

Comment

Besides the design of MicroTE, there are also many takeaways from the background and comparative study sections. The authors give a very good summary of the existing techniques and their implications for traffic engineering. A more interesting part is the design requirements for traffic engineering mechanisms. These principles allow us to analyze existing traffic engineering techniques (e.g., ECMP) at a higher level and think about what an ideal traffic engineering system looks like.

Question & Further Work

From my understanding, the predictability discussed in this paper seems to be another way of saying stability. Is that correct? If it is, I'm wondering if it is possible to achieve more specific prediction based on the characteristics of the application's traffic. More specifically, if we know the characteristics of the traffic, it might be possible to statistically predict how the traffic changes and then dynamically change the TE mechanism.

Shao, Tuo

Apr 2, 2013, 3:05:26 AM

to csci2950u-...@googlegroups.com

Paper Title

MicroTE: Fine Grained Traffic Engineering for Data Centers

Authors

Theophilus Benson, Ashok Anand, Aditya Akella and Ming Zhang

Date

ACM CoNEXT 2011, December 6–9 2011

Novel Idea

This paper presents a traffic scheduling system that achieves better utilization of network resources by taking the predictability of traffic into account.

Main Results

The contributions of this paper are threefold: first, it evaluates existing traffic engineering in data centers; second, it studies the predictability of traffic in current data centers; third, it proposes a new scheduling system based on traffic predictability which aims at minimizing the maximum link utilization.

Impact

This paper also presents a scheduling system to avoid congestion at bottleneck switches, which helps to achieve better utilization of network bandwidth.

Evidence

The paper first studies the utilization of network resources under current traffic engineering and discovers there is plenty of room to improve it. It then studies the characteristics of data center traffic and finds it possible to predict some of the traffic. Based on that, it can separate the predictable traffic from the unpredictable traffic and install paths for the predictable traffic that minimize the maximum link utilization instead of using ECMP. In the evaluation experiments, it outperforms ECMP.

Reproducibility

The paper doesn't provide details about its algorithm, which makes it hard to reproduce.

Competitive Work

ECMP

Criticism and Question

It seems the runtime of this system is relatively long compared to the 2s interval and grows fast as the number of ToRs grows in Table 1. And it is evaluating a coarse-grained system by using pairs of ToRs instead of servers or flows. It does mention the runtime for server pairs, but it neglects the predictability of traffic, which is an important factor affecting the runtime. I think scalability is a big problem for this design.

It mentions weighted ECMP in the paper. Could it alone also help to eliminate the bottleneck in a network?

Zhou, Rui

Apr 2, 2013, 12:22:46 AM

to csci2950u-...@googlegroups.com

Paper:

MicroTE: Fine Grained Traffic Engineering for Data Centers

Authors:

Theophilus Benson, Ashok Anand, Aditya Akella, Ming Zhang

Novel Idea:

This paper provides an analysis of the traffic in data center networks and draws the distinctive conclusion that a small portion of the overall traffic is predictable in the short term. Based on this, the authors implemented MicroTE to optimize further than previous ECMP-based solutions.

Summary and Doubts:

The paper first enumerates several reasons why, given the existing multi-rooted tree structure, state-of-the-art flow allocation solutions fail to achieve close-to-optimal bandwidth between (external) hosts. The reasons include "failing to utilize multiple path diversity, failing to adapt to traffic load changes, and failing to use a global view of traffic and obtain routing decisions." The paper then proposes the following design goals for its approach: multipath routing, coordinated scheduling using a global view of traffic, and exploiting short-term predictability for adaptation. The paper presents MicroTE, a traffic engineering solution guided by these goals.

Everything more or less makes sense to me except for the parts that involve prediction, which is essentially the big novel idea that distinguishes this paper from the previous Hedera paper. However, I have the following doubts:

1. (doubtfulness level: *)

It seems that the prediction of traffic is very application specific, and different applications result in vastly different traffic patterns. Unless the prediction component really predicts the traffic very well, the scheduler reacts correctly, and everything is automatic, different applications could require special care from us. This feels like another trade-off between generality and performance. We spend more time outlining a more specific question so we can provide a more specific and thus better solution, but the solution is only good for this specific question, and we can have a huge number of such specific questions.

2. (doubtfulness level: **)

Assuming the designated server can really predict traffic for all the different applications, it seems that the predictable portion is really small and short-lived. And the gain compared to Hedera does not seem large. But in order to gain this little extra optimization compared to Hedera, we need a new service module in each rack. Not sure if this is worth it.

3. (doubtfulness level: ***)

When it comes to routing, the paper computes network paths by first routing the predictable ToR-to-ToR entries in the TM and then relegating the unpredictable ToR-to-ToR entries in the TM to weighted ECMP. The ToR-level granularity is aimed at reducing control message traffic. However, later, when the paper faces sub-optimal performance, it blames the spatial granularity and argues that MicroTE can achieve better performance with the finer granularity of application-level flows or even server-to-server traffic. This time they do not mention the potential burst of control messages at those finer granularities at all.

4. (doubtfulness level: ****)

In the evaluation figures, the paper provides performance statistics for the case where more than 75% of the traffic matrix is predictable. The performance seems good, but according to the earlier study of traffic in data center networks, 75% predictable traffic seems far higher than realistic, and not even attainable. Why don't they just give us performance statistics for 100% predictable traffic if they are talking about unattainable situations anyway?

With all respect to the authors' hard work, I really doubt the practical value of trying to predict network traffic.

Charles Zhang

Apr 2, 2013, 6:18:38 AM

to Rodrigo Fonseca, csci2950u-...@googlegroups.com

MicroTE: Fine Grained Traffic engineering for data centers

Authors: Theophilus Benson, Ashok Anand, Aditya Akella, Ming Zhang

Date: CoNEXT 2011, December

Novel Idea:

They developed a system called MicroTE that adapts to traffic variations by leveraging the short-term and partial predictability of the traffic matrix. It operates at a granularity of seconds, making it a fine-grained technique.

Main Results: They came up with three principles that a TE mechanism must adhere to: multipath routing, coordinated scheduling using a global view of traffic, and exploiting short-term predictability for adaptation. They then gave a detailed description of the architecture of the system; the main components include the monitoring component for gathering traffic demand and flow statistics, the network controller for determining forwarding state, and the forwarding component.

Evidence:

They implemented the MicroTE system using the OpenFlow framework. They implemented the network controller and routing component as C++ modules in the "NOX" framework and implemented the monitoring component in C as a kernel module. Then they ran evaluations to test the performance under realistic workloads, under different levels of predictability, and with a large number of hosts in a large data center.

Reproducibility:

Fairly high. They did give an in-depth description of their algorithms and design choices and the reasoning.

Place, Jordan

Apr 2, 2013, 1:39:59 AM

to csci2950u-...@googlegroups.com

MicroTE: Fine Grained Traffic Engineering for Data Centers

Theophilus Benson, Ashok Anand, Aditya Akella, Ming Zhang

ACM CoNEXT '11

MicroTE is another flow scheduler for data centers that looks to
approach optimal bandwidth usage in intra-datacenter communication.
MicroTE is based off the experimentally discovered property that
traffic patterns in data centers tend to be mildly predictable from
second to second. MicroTE has servers monitor the traffic and report
it to a centralized controller which aggregates the data into a TM.
The TM can then be used to discover predictable flows and install
routing rules which will ensure these flows travel optimally through
the network. A weighted ECMP is used for traffic deemed unpredictable.
On the granularity of ToR-ToR traffic monitoring, this approach
does not work well as very little traffic is found to be predictable.
However, on a finer server-to-server granularity, this approach comes
close to providing optimal routing. I am disappointed MicroTE was not
tested in anything close to a production environment as it seems too
data intensive to scale nicely (though the paper does provide good
evidence that it will!).

Papagiannopoulou, Dimitra

Apr 2, 2013, 4:57:11 AM

to Rodrigo Fonseca, csci2950u-...@googlegroups.com

Title: MicroTE: Fine Grained Traffic Engineering for Data Centers

Authors: Theophilus Benson, Ashok Anand, Aditya Akella and Ming Zhang

Novel Idea: In this paper, the authors designed and implemented MicroTE, a fine-grained traffic engineering scheme that can work on top of a variety of data center network topologies and can adapt to traffic variations, using a central controller to create a global view of the network conditions and traffic demands. They implemented MicroTE within the OpenFlow framework to coordinate traffic scheduling in the network.

Main Result: The main result of the paper is the design and implementation of a new fine-grained and adaptive traffic engineering scheme, MicroTE, that can reduce losses and congestion inside data centers. Through experimentation, the authors found that MicroTE performs close to optimal when traffic is predictable, but approaches the performance of ECMP when traffic is not predictable.

Prior Work: MicroTE uses OpenFlow to coordinate scheduling of traffic within the network. MicroTE was implemented within the OpenFlow framework. A logically-centralized NOX controller [18] is used to gather a global network view and determine how flows traverse the network.

Competitive Work: The authors performed an analysis and evaluation of other existing data center network architectures and discussed their most concerning issues. Namely, they referred to the canonical tree topology (they analyze a canonical 2-tier tree topology with two cores), the Fat-Tree interconnect [16], VL2 [9], and Hedera [2]. They found that existing techniques don't manage to control losses under bursty traffic in the data center, because they don't use multipath routing, don't take instantaneous load into account, or don't make decisions based on a global view of the system.

Evidence: The authors began with an evaluation and analysis of existing traffic engineering techniques using real data center traces. They conducted simulations using traces from two data centers, a large cloud computing data center and a university's private data center. The results indicate that the performance of existing techniques is only 80% of that of the optimal routing mechanism, because they fail to adapt to changes in the traffic load, don't take a global view of the traffic into account when making routing decisions, or don't perform multipath routing. The authors analyzed the traffic patterns and used their findings to motivate MicroTE. Through experimentation on real data center traces they found that MicroTE performs within 1-15% of the optimal for real traffic traces, and that for high traffic predictability it performs closer to optimal routing. For low traffic predictability its performance is closer to ECMP. They found that the overhead imposed by MicroTE, due to control messages for traffic monitoring and modification of the switch routing entries, is low.

Criticism: Overall, this is a very interesting paper, mainly because it offers a broad view of other existing frameworks and a comparison between them, and then presents MicroTE as a solution to the main issues posed by those frameworks. The evaluation part of the paper is thorough. The authors check how their proposed solution performs under realistic workloads and under different levels of predictability, and how well it could scale to large data centers. Overall this paper offers a serious alternative to existing traffic engineering schemes.
