Paper Title "FAWN: A Fast Array of Wimpy Nodes"
Author(s) David G. Andersen et al.
Date ACM SOSP'09 October 2009
Novel Idea
Motivated by the poor seek performance of disks on random-access
workloads and the large power draw of clusters and datacenters,
Andersen et al. propose a new approach to distributed computation and
fast storage access: replacing traditional machines with low-power,
embedded CPUs paired with flash storage for rapid random-data access.
Main Result(s)
The evaluation shows that FAWN-KV's performance improves dramatically
once the dataset fits in the DRAM cache. Moreover, the authors conclude
that, depending on the brand of flash device, read and write operations
exhibit markedly different performance for the same workload. Finally,
FAWN-DS's log-structured design suits flash writes better than Berkeley
DB does.
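The flash-friendliness of FAWN-DS comes from its log-structured layout: writes are sequential appends, and an in-memory index maps each key to the offset of its latest log entry. The following is a minimal sketch of that idea (an assumed simplification, not the paper's implementation; the class and method names are invented for illustration):

```python
# Sketch of a FAWN-DS-style log-structured datastore (simplified,
# illustrative only): an in-memory hash index maps each key to the
# offset of its latest entry in an append-only log, so every put is a
# sequential write -- the access pattern flash handles best.

class LogStructuredStore:
    def __init__(self):
        self.log = []      # append-only log of (key, value) entries
        self.index = {}    # key -> offset of latest entry in the log

    def put(self, key, value):
        # Sequential append; old entries are never updated in place
        # (they would be reclaimed by compaction in a real store).
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        return self.log[offset][1]

store = LogStructuredStore()
store.put("k1", "v1")
store.put("k1", "v2")   # stale entry stays in the log until compaction
print(store.get("k1"))  # -> v2
```

A B-tree store like Berkeley DB instead performs small in-place updates scattered across the device, which is the pattern flash handles worst.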
Regarding cluster evaluation, FAWN-KV copes well with ring membership
changes as long as the system is not running at its limits.
Finally, compared to traditional systems, FAWN+SSD offers the best value
per dollar for data access, TCO, and power draw.
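The paper's headline efficiency metric is queries per joule, i.e. sustained query rate divided by power draw. A quick illustration of the arithmetic (the numbers below are hypothetical placeholders, not figures from the paper):

```python
# Hedged illustration of the queries-per-joule metric.
# All numbers here are hypothetical, chosen only to show the arithmetic.

def queries_per_joule(queries_per_sec, watts):
    """Queries served per joule of energy consumed (1 W = 1 J/s)."""
    return queries_per_sec / watts

# A small, slow node can beat a fast, power-hungry server on this metric:
wimpy = queries_per_joule(queries_per_sec=1300, watts=4)      # -> 325.0
server = queries_per_joule(queries_per_sec=50000, watts=250)  # -> 200.0
print(wimpy, server)
```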
Impact
The FAWN approach has the potential to achieve high performance while
being more energy-efficient than conventional architectures, by
balancing CPU speed against storage I/O across the cluster.
Evidence
FAWN has been implemented using a cluster of embedded components and
different types of SSD and disk-based storage to provide fault-tolerant,
consistent key-value access. FAWN-KV is the software component
responsible for managing data access and distribution.
Prior Work
Several works have used commodity embedded low-power CPUs and flash
storage for cluster key-value applications, such as Gordon, CEMS,
AmdahlBlades, Microblades and Microsoft's Marlowe.
FAWN-KV organizes the back-end virtual IDs into a storage ring using
consistent hashing, a concept previously used by DHT systems such as
Chord and by other applications such as CoralCDN.
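A minimal consistent-hashing sketch, as an assumed simplification of FAWN-KV's ring of virtual node IDs (the class and function names are invented for illustration): nodes and keys hash onto the same ring, a key is owned by the first node clockwise from its hash, and adding or removing a node remaps only the keys in the adjacent arc.

```python
# Minimal consistent-hashing ring (illustrative simplification).
import bisect
import hashlib

def ring_hash(s):
    # Map a string onto the ring via SHA-1, as DHTs like Chord do.
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        # Sorted (hash, node) pairs form the ring.
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def owner(self, key):
        # The key's owner is the first node clockwise from its hash,
        # wrapping around at the top of the ring.
        points = [p for p, _ in self.ring]
        i = bisect.bisect_right(points, ring_hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["nodeA", "nodeB", "nodeC"])
print(ring.owner("some-key"))  # one of nodeA/nodeB/nodeC, deterministic
```

A real system like FAWN-KV places several virtual IDs per physical node on the ring to balance load and smooth the redistribution caused by membership changes.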
Competitive work
Traditional systems composed of desktop-class machines with SSD or
disk-based storage are the direct competitors of FAWN.
Reproducibility
Due to the many intricacies of FAWN and the lack of a detailed
description of its hardware and software components, the results cannot
be reproduced.
Question
1) Andersen et al. hypothesize that SSD performance may reach DRAM
territory. Considering that I/O bandwidth has not changed much in the
last few years, is this really feasible?
Criticism
1) The authors are concerned only with data-access performance and
power draw, but overlook that distributed systems are also motivated by
the need for parallel computation and scalable data crunching. At no
point do Andersen et al. consider applications that require large
amounts of processing power, which is common in many distributed
applications and cannot be supplied by the wimpy nodes.
2) In Section 5, the authors only consider scenarios where FAWN performs
better than traditional systems. What about "large datasets, high query
rates"? I believe that traditional systems can respond more promptly,
and as shown in Table 4, the power savings compared to FAWN are almost
the same.
3) The authors mention that in the future FAWN+SSD could become the
dominant architecture for a wide range of random-access workloads. For
large datasets, a FAWN+SSD architecture would require many nodes to
reach a large storage capacity. One parameter not reported in Table 4
is GB/$. Although they have existed for a while, SSDs are still
expensive, and a large number of nodes would require considerable
investment in infrastructure. In addition, as the authors mention, a
large number of nodes increases the number of switches, which are well
known for not being energy-efficient. As a result, the FAWN+SSD
architecture would not be as cheap or as energy-efficient as the
authors expect.
4) Figures 9, 10, 11, and 12 lack variance measurements; Tables 1 and 2
should also have provided such values.