Review for [Mice]:
Today the uplinks of a pod in a data center are mainly a mix of packet-switch ports and circuit-switch ports. The reason is that using some circuit-switch ports lowers cost (cost is a big issue in data center construction, as seen in recently reviewed papers). The traditional scheduling method is hotspot scheduling (HSS), which discovers hotspots and then offloads their traffic.
The paper observes that the capability of optical switches has increased greatly. In addition, traditional packet-switch-based topologies mainly rely on multiplexing outgoing ports, which can lower the utilization of physical links, whereas optical switches can use TDMA to multiplex flows. Building on this, the paper proposes the TMS algorithm.
The TMS algorithm has two phases. First, the TDM (traffic demand matrix) is scaled into a BAM (bandwidth allocation matrix). Then the BAM is decomposed into a circuit switch schedule, which is a convex combination of permutation matrices that sums to the original BAM. The matrix decomposition algorithm can be BvN (Birkhoff-von Neumann). All traffic is decomposed by the algorithm, whether large or small, so the paper does not only catch the elephants, it also hunts mice (a pretty interesting title).
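To make the second phase concrete, here is a minimal sketch of a BvN-style decomposition. It is not the authors' implementation: the function names, the use of a simple augmenting-path matching, and the 3x3 example matrix are my own assumptions, and the input is assumed to already be doubly stochastic (i.e., phase one has been done).

```python
from fractions import Fraction


def perfect_matching(support):
    """Augmenting-path bipartite matching on an N x N boolean matrix.

    Returns match_col, where match_col[j] is the row matched to column j,
    or None if no perfect matching exists.
    """
    n = len(support)
    match_col = [-1] * n

    def try_row(i, seen):
        for j in range(n):
            if support[i][j] and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    return match_col


def bvn_decompose(bam, eps=0):
    """Split a doubly stochastic matrix into (weight, permutation) pairs.

    permutation[i] is the output that input i is connected to during
    that slot; the weights are the fractions of the schedule period.
    """
    n = len(bam)
    rest = [list(row) for row in bam]
    schedule = []
    while True:
        # Only entries that still carry demand may be used in a matching.
        support = [[rest[i][j] > eps for j in range(n)] for i in range(n)]
        match_col = perfect_matching(support)
        if match_col is None:
            break
        perm = [0] * n
        for j, i in enumerate(match_col):
            perm[i] = j
        # The slot lasts as long as its smallest entry allows.
        weight = min(rest[i][perm[i]] for i in range(n))
        for i in range(n):
            rest[i][perm[i]] -= weight
        schedule.append((weight, perm))
    return schedule


# Hypothetical 3-pod BAM; exact fractions avoid rounding issues.
example_bam = [
    [Fraction(1, 2), Fraction(1, 2), Fraction(0)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)],
]
for weight, perm in bvn_decompose(example_bam):
    print(weight, perm)
```

Each iteration zeroes at least one entry of the residual matrix, so the loop stops after at most N^2 slots. Small and large entries are treated the same way, which is the "mice and elephants" point above.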
One optimization is longest-time-slot-first scheduling. BvN may generate some very short time slots, and these can be dropped because the time needed to set up the circuit may be larger than the slot itself. So the greatest benefit comes from scheduling the first n time slots (n is determined by the minimum duty cycle and the maximum allowed schedule length).
Table 1 shows the tradeoff between the number of decomposition matrices, the duty cycle, and the fraction of traffic carried by the CSN and the PSN. The major trend is that if n is small, the circuit schedule cannot satisfy all flows, so the remaining traffic must be sent over the traditional PSN. When n reaches the total number of decomposition matrices, all traffic is scheduled over circuits, but the duty cycle falls to its lowest value.
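The trend in Table 1 can be sketched with a toy model. Everything below is an assumption for illustration, not the paper's exact equations: each slot lasts weight * period_us, each slot costs one reconfiguration of reconfig_us, and the four-slot schedule and the 1000 us period are made-up numbers (11.5 us is the Mordia switching time quoted in the reviews).

```python
def longest_first(schedule, n):
    """Keep the n longest slots; the rest is left to the packet network."""
    ranked = sorted(schedule, key=lambda slot: slot[0], reverse=True)
    return ranked[:n], ranked[n:]


def duty_cycle(kept, period_us, reconfig_us):
    """Fraction of time circuits carry traffic under the assumed model."""
    if not kept:
        raise ValueError("need at least one slot")
    up = sum(weight for weight, _ in kept) * period_us
    down = len(kept) * reconfig_us
    return up / (up + down)


# Hypothetical schedule of (weight, permutation) pairs.
schedule = [
    (0.05, [3, 2, 1, 0]),
    (0.50, [1, 0, 3, 2]),
    (0.15, [2, 3, 0, 1]),
    (0.30, [1, 2, 3, 0]),
]

for n in range(1, len(schedule) + 1):
    kept, leftover = longest_first(schedule, n)
    on_circuits = sum(weight for weight, _ in kept)
    on_packets = sum(weight for weight, _ in leftover)
    d = duty_cycle(kept, period_us=1000.0, reconfig_us=11.5)
    print(n, round(on_circuits, 2), round(on_packets, 2), round(d, 4))
```

Under this model, adding shorter slots moves more traffic onto circuits but lowers the duty cycle, because every extra slot pays the same reconfiguration cost for less useful time.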
The paper has a prototype, Mordia, to test the TMS algorithm. The optical ring in Mordia is interesting, but viewed as a topology, it is equivalent to all pods connecting to a single switch that does TDMA.
Questions:
1. The matrix decomposition algorithm was proposed about 60 years ago. Is the paper saying, “Oh, there are much faster switches now, so we can pick up the ‘forgotten’ algorithm again and apply it directly”?
2. The matrix decomposition / time slice calculation needs a large amount of time. How does it cope with dynamic flow changes, in which case the BAM changes?
Paper Titles:
Hunting Mice with Microsecond Circuit Switches (HM)
Authors: Nathan Farrington, George Porter, Yeshaiahu Fainman, George Papen, Amin Vahdat
The Emerging Optical Data Center (EODC)
Authors: Amin Vahdat, Hong Liu, Xiaoxue Zhao, Chris Johnson
Dates:
October 29-30, 2012 (HM), and 2011 (EODC)
Novel Idea:
HM: A generalization of hotspot scheduling, called traffic matrix scheduling, where most or even all bulk traffic is routed over circuits. Traffic matrix scheduling rapidly time-shares circuits across many destinations at microsecond time scales, and the schedule is computed in polynomial time.
EODC: The paper presents an overview of current data center network deployments, the role played by optics in this environment, and opportunities for developing variants of existing technologies specifically targeting large-scale deployment in the data center. Specifically: wavelength division multiplexing (WDM) technology, and optical circuit switching (OCS) alongside electrical packet switches (EPS).
Main results:
HM:
Designed and built a microsecond-scale circuit switch called Mordia (Microsecond Optical Research Datacenter Interconnect Architecture). It is an optical ring of wavelength-selective switches called stations, approximately 2300x faster than the optical space switches used in Helios and c-Through (11.5 µs vs. 27 ms).
This prototype shows it is possible to support microsecond-scale circuit switching over commodity Ethernet technology.
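As a quick sanity check on the "approximately 2300x" figure, using only the two switching times quoted above (the variable names are mine):

```python
# Switching times quoted in the review, in seconds.
space_switch_s = 27e-3     # optical space switches (Helios, c-Through)
mordia_switch_s = 11.5e-6  # Mordia

speedup = space_switch_s / mordia_switch_s
print(round(speedup))  # 2348
```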
EODC:
Optical Circuit Switching – Ideally we would want native optical packet switching (OPS), but that is not possible at this time, so OCS is used because it is data-rate agnostic and energy efficient. It allows lower cost, scales well, and allows fast switching and low insertion loss.
WDM Optical transceivers – To reduce cable overhead, to scale with increasing link bandwidth, and to leverage optical circuit switching, WDM must perform well, overcoming the following constraints:
- Transceivers with large power consumption present thermal challenges and limit EPS chassis density.
- Data center transceivers must account for multi-building spans reaching 1 km and optical loss from OCS and patch panels.
- Photonics highway must align seamlessly with electrical switch fabric in bandwidth and speed.
- For intra-building network, a rich-mesh topology is desirable.
Impact:
HM:
Traffic matrix scheduling achieves the following advantages over packet switching:
- CAPEX and OPEX reduction
- Low latency
- No jitter
- Scales the nominal link rate to hundreds of Gb/s per link.
Hence, all these characteristics make circuit switching a viable contender for future data center network architectures.
EODC:
Paper points out:
- The per-port cost of an OCS is competitive with, if not inherently cheaper than, the comparable EPS. Moreover, it has more capacity through wavelength bundling and lower power consumption.
- WDM reduces cabling complexity, which is otherwise very challenging to manage in a data center.
- OCS eliminates some fraction of the optical transceivers and EPS ports by eliminating a subset of the required OEO conversions.
- The paper concludes that data center network architectures are about to change because of emerging optical technologies and components such as optical circuit switching and wavelength division multiplexing (WDM) transceivers.
Prior work:
There is some prior work, such as c-Through, Helios, Flyways, and HSS, but in general these are all experimental and prototyping efforts attempting to ease the transition from current to new technology with hybrid approaches. We are getting there, they say.
Question:
I wonder whether the switching speed of the optical circuit switch in the emerging architecture incorporating OCS and WDM should be a source of concern. I also see it as a single point of failure, but assume it is reliable enough to work just fine.
Using optical circuit switches results in less wiring, a cleaner interface, and better maintenance.
Also, why can't OCS perform per-packet switching? Thanks.
Paper Title: The Emerging Optical Data Center
Authors: Amin Vahdat, Hong Liu, Xiaoxue Zhao, Chris Johnson
OSA/OFC/NFOEC 2011
Novel Idea: In this paper the authors present an outline of existing data center network deployments and how optics could play an important role in overcoming the limitations imposed by existing technology in large-scale data centers. Specifically they examine how wavelength division multiplexing (WDM) technology could be optimized to be used in data centers and how a combination of Optical Circuit Switching (OCS) and EPS would be advantageous for data centers.
Main Results: The authors showed how emerging optical technology and its components could be incorporated in large-scale data centers. They especially focused on the role of Optical Circuit Switching and WDM transceivers in the data center.
Impact: The data center switching and interconnect technology currently available imposes several limitations and obstacles in scaling up to large data centers while maintaining good performance levels. For example, the number of Electrical Packet Switches could complicate management and increase OpEx, while EPS ports and optical transceivers could add a high cost to the network equipment, and large amounts of multimode fiber would be required. Optimizing the optical technology components and incorporating them into data center networks could play a critical role in overcoming those challenges.
Evidence: The authors first explore the communication and network requirements of large-scale data centers. They present an example of a current data center architecture and then show what emerging data center architectures that employ OCS would look like. Then, they analyze the main challenges of incorporating OCS into data centers and the main requirements for OCS hardware in that case. Finally, they discuss how WDM performance could be achieved without significantly increasing power and cost, in order to meet the emerging data center economies and scale.
Prior Work: The authors present an emerging data center architecture, based on the works of [7] and [9] that employ OCS. They further refer to existing data center technology [3,4,5,6] and [7] that incorporates OCS along with EPS in the data center.
Criticism: This paper is mainly an overview of existing technologies and parameters that should be taken into account when incorporating optics to build large and scalable data centers. It serves its purpose of motivating the networking community to consider how the evolving optical technology could lift the limitations that exist in current data centers and help build scalable and efficient data centers in the future.
The Emerging Optical Data Center
A. Vahdat, H. Liu, X. Zhao, C. Johnson
Novel Idea: Increasing the amount of optics in the data center, namely WDM and OCS.
Main Results: OCS has an increased data transfer rate as well as extremely low power usage. OCS can't currently do packet switching, though there is much research going on to fix this. OCS should replace some core switches and handle mainly longer flows. Copper cables are prone to high error rates and use a tremendous amount of power. VCSELs currently can't span a single data center, but they are low power and widely used in data centers. WDM transceivers need to be used to scale these data centers without incurring absurd costs.
Impact: Possibly large in the future.
Prior Work: OCS
Evidence: Results from other papers.
Hunting Mice with Microsecond Circuit Switches 2012
N. Farrington, G. Porter, Y. Fainman, G. Papen, A. Vahdat
Novel Idea: Hotspot scheduling is inflexible and static. Data centers with hybrid switches should instead use traffic matrix scheduling. TMS decouples scheduling and switching. Unlike HSS, once TMS has constructed a schedule, it will be implemented in hardware in a matter of microseconds. Separating scheduling and switching allows TMS to schedule a larger amount of traffic while using the same expensive algorithm to construct the schedule.
Main Result: The TMS algorithm is O(N^2). TMS routes all bulk data transfers through circuit switches instead of only hotspots. It allows for lower jitter, lower latency, and lower CAPEX and OPEX.
Evidence: They created Mordia to test algorithms such as TMS.
Paper Title: Hunting Mice with Microsecond Circuit Switches
Authors: Nathan Farrington, George Porter, Yeshaiahu Fainman, George Papen, Amin Vahdat
HotNets ’12, October 29–30, 2012
Novel Idea: In this paper, the authors present Traffic Matrix Scheduling (TMS), a generalization of hotspot scheduling (HSS), in which most or all bulk traffic is routed over circuits. The proposed scheduling algorithm runs in polynomial time and time-shares circuits across many destinations at microsecond time scales.
Main Results/Impact: The main result of the paper is TMS, which can be used to schedule circuit switches in data center networks to route all bulk traffic, not only hotspot traffic. This makes circuit switches much more useful for future data center designs.
Evidence: The authors compare TMS and HSS to show the main differences between the two scheduling policies. They use an example of TMS, in which they show how 8 pods run Hadoop with an all-to-all communication pattern, by providing the physical and logical topology, the inter-pod traffic demand matrix, and a Gantt chart. They report useful equations to compute the duty cycle D, which determines the effective link rate of the circuit switch, as well as an equation to compute the amount of buffering required by each host in bits, and discuss the ways in which the effective link rate could be increased. They present the TMS algorithm in detail and provide measurements of its execution obtained by testing TMS with dense uniform random input patterns. Moreover, they show examples of the tradeoff between the number of schedule time slots, the amount of traffic sent over the circuit-switched network compared to the packet-switched network, and the duty cycle D.
Prior Work: As mentioned before, TMS generalizes HSS, which has been used for circuit scheduling in hybrid data center networks in previous works such as [8, 18, 5, 7, 17] that combined electronic packet switching with either wireless or optical circuit switching.
Competitive Work: There are other hybrid data center network prototypes, such as c-Through [16,17] and Helios [7], that use HSS. Both, though, require some time-consuming operations that lie on the critical path of circuit reconfiguration. Also, there are other works such as [3,4] that use the Birkhoff-von Neumann decomposition algorithm to compute schedules for input-queued switches, but those works focus on packet switching, while this paper focuses on circuit switch scheduling for data center networks with packet buffers distributed among the hosts.
Reproducibility: Yes.
Criticism: Earlier works have proposed hybrid data center networks that combine electronic packet switching with wireless or optical circuit switching to support bulk traffic. Most of those, though, focus on hotspot scheduling, and while they offer some improvements, they rely on packet switching for the remaining traffic. This work uses circuit switching to route all bulk data center traffic, not just the traffic from the hotspots. Since circuit switching has some significant advantages over packet switching, this work is particularly valuable because it makes circuit switching a viable alternative for future data center architectures. Furthermore, it proposes a generalization of HSS that overcomes the limitations imposed by HSS (certain features that make HSS static and inflexible) while further exploring the capabilities of circuit switching in data centers.