11/3 Reviews - Misbehavior

Steve Gomez

Nov 2, 2009, 11:56:12 PM

to CSCI2950-u Fall 09 - Brown

Author: Jeffrey C. Mogul
Paper Title: "Emergent (Mis)behavior vs. Complex Software Systems"
Date: EuroSys '06

The paper's main idea is that emergent behavior should be studied in a
scientific way in order to develop tools that help programmers and
system architects better diagnose and prevent unexpected bugs. Mogul
presents a road map for research into this area, and tries to motivate
these goals with his observations about network protocol failures and
other system misbehaviors (including non-digital ones) that arise from
complex systems.

This paper may inspire other systems researchers to investigate these
issues in emergent system behavior, which is clearly a pet project for
the author. Does he present new ideas, or make new observations? Not
really, but this paper organizes everything into one place using
Mogul's 'research agenda' to outline major issues.

Mogul does motivate the problem, but only anecdotally. So there is no
real experiment or evidence, and nothing to reproduce.

One major criticism I have is that, while he addresses the distinction
several times, I am still unclear about the difference between a
design bug/flaw and an emergent misbehavior. It is safe to use the
example of ants or drivers to motivate emergent behaviors, because
each agent makes independent decisions, unaware of global system
behavior. But, software is written by someone who generally has a
global view of the system. So the inability of an architect to
predict behaviors is not conceptually different from just building a
buggy system.

That is a philosophical distinction, totally aside from the fact that
complex, unpredictable-for-the-architect behaviors (call them bugs or
emergent behaviors) are real problems, and research toward fixing/
preventing those behaviors could be very important. But the
distinction is important. Otherwise, if these are just system bugs,
what (research-wise) separates the proposed agenda from regular
error tracking and higher-level patterns in unit/feature/end-to-end
testing in systems?

Unless we nit-pick the details, it isn't clear what would be a
scientific contribution versus engineering guidelines for testing.

Another criticism/question: There is a subtle problem in predicting/
ameliorating emergent behavior that the author fails to mention.
Presumably (based on the comments the author makes), predicting
emergent behavior means matching against patterns (e.g. livelock,
thrashing, etc.) and reporting or blocking them. But, those
emergences are still just consequences of code that was written by
some programmer, who had some intention -- what does a programmer do
if those are the *intended* effects? It may be unlikely, but there
seems to be an inherent consequence that fixing (blocking, flagging,
etc.) emergent behavior will remove some of the richness available to
programmers. That may never be an issue, because programmers tend to
want to keep things as simple as possible, but what about code patterns
that require such emergent behaviors? Are there any? Will those be
hard to translate into code that doesn't demonstrate behaviors like
thrashing, locking, etc.?

Building on that last point, I think a probabilistic model of
developer intentions would be interesting, extremely difficult to
build, and necessary for any of this work to translate into real
engineering tools.

Rodrigo

Nov 3, 2009, 1:19:19 AM

to CSCI2950-u Fall 09 - Brown

Review from James:

James Tavares

November 3, 2009

Misbehavior

Paper Title: Emergent (mis)behavior vs. Complex Software Systems

Author(s): Jeffrey C. Mogul

Date: EuroSys ‘06. April 18-21, 2006.

Main Results/Novel Idea: The paper addresses the notion of “emergent
misbehavior”, a category of system-level failures considered to be
difficult to predict and analyze at “any level simpler than that of
the system as a whole.” The author seeks to set a research agenda for
emergent misbehavior by promoting the creation of relevant taxonomies
of both emergent behaviors and their typical causes, as well as
investigating techniques which could be used to detect, diagnose,
predict, or ameliorate incidents of emergent misbehavior in
large-scale and complex systems.

Impact: Anything that aims to help the stability of large complex
systems is a Good Thing. Otherwise, I’m not sure what impact this
particular work has had in terms of setting the research agenda, as it
hoped to do.

Evidence: The author justifies the existence of emergent misbehavior
by drawing on a number of examples from both non-computer science and
computer-science domains, including examples such as BGP route flap
damping, Ethernet capture, and pedestrian traffic on London’s
Millennium Footbridge. In each of these cases, the composition of two
or more components led to unexpected and undesirable system behavior
under certain circumstances, some of which the author claims were
initially unpredictable.

Prior/Competitive Work: The closest work appears to be by Gribble[20],
from which the author claims to differ in that Gribble focused
more on the ‘butterfly effect’ and continued system operation in
the ‘face of the unexpected’ while Mogul is more interested in
understanding the causes of emergent behavior. Other works have noted
the possibility of, and the need to study, emergent behaviors in embedded
networks and autonomous computing systems.

Reproducibility: N/A

Question: The author argues that declarative systems might be more
likely to suffer from emergent behavior. While that might be the case
for existing systems, might it be easier to add prediction and mitigation
features to these systems because the run-time may have more than one
choice as to how to execute the declarative statement?

Criticism: I’ll preface my comments with the fact that this paper is
my first exposure to the concept of emergent behavior in computer
systems. That being said, it would appear to me that the definition of
what types of behavior are considered to be emergent could easily
become a moving target over time. As we gain experience with
large-scale systems, presumably programmers will learn to avoid the pitfalls
of generations past: either because software engineering tools improve
to mitigate prior issues or because a new technique is learned and
passed forward through education.

The author warns against such an extreme interpretation (“that’s not
emergent behavior… [as]…I can explain it…”), yet I tend to lean in
this direction. Most of us know firsthand how difficult it is to
assemble complex systems out of smaller components and get programs to
work correctly in multi-threaded environments. I can therefore
appreciate how useful tracing, prediction and mitigation techniques
would be during the development process. Yet, as I read many of the
author’s examples, the first thought that typically came to mind was
not “that’s an unpredictable emergent behavior”, but rather “that’s a
bad design; someone should fix that bug”.

Future Work: None

Rodrigo

Nov 3, 2009, 1:21:42 AM

to CSCI2950-u Fall 09 - Brown

Review from Kevin

Title: Emergent (Mis)behavior vs. Complex Software Systems

Novel idea: The novelty in this paper comes from its call to action to
categorize and identify emergent misbehavior in software systems. It
creates a taxonomy of sorts and then proposes some high level ideas
for predicting and working around emergent misbehavior.

Main Result: Several examples of emergent misbehavior in real systems
are given to motivate the author's argument that this is a real
problem, a couple of which could be easily reproduced (such as the
Ethernet capture effect).

There is no experimentation/comparison and prior work is really
minimal.

Question:
Would a program like NetMedic work for detecting these sorts of
misbehaviors?

Criticism:
I'll give the author credit for starting a discussion that seems to
have been lacking from the CS field, but on the other hand he offers
no interesting methods of prediction, amelioration or testing outside
of pie-in-the-sky high-level ideas. The reader is essentially left to
either figure them out for themselves or wait for someone else to
publish the solutions.

Future Work:
In terms of prevention, perhaps adding a prototyping or simulation
step to the development paradigms of some large systems could avoid
some emergent misbehavior.

Rodrigo

Nov 3, 2009, 1:22:44 AM

to CSCI2950-u Fall 09 - Brown

Review from Sunil

Paper Title

Emergent (Mis)behavior vs. Complex Software Systems

Author(s)

Jeffrey C. Mogul

Date

EuroSys 2006

Novel Idea

I don’t think there is anything novel in this paper, but the author
tries to build on previous work and gives some extensions and
modifications to it. What seems novel is the emphasis on proposing a
research agenda to deal with emergent misbehavior in complex systems.

Main Result(s)

The author walks us through problems that occur in complex systems
and stresses that most of them arise when a bunch of components are
connected together and cause problems that no one normally expects,
so we need mechanisms to predict, prevent, and resolve this emergent
behavior (misbehavior). The author gives a nice blend of examples,
from generic engineering problems to computer science ones. By
stressing the need to classify emergent behavior, the author tries
to put forward a research agenda for it. He also acknowledges that
generalizing this is not a trivial task, so we need to look into
domain-specific areas, and he chooses the operating systems aspect of
the problem. There is also an attempt to draw a clear line between
what counts as emergent misbehavior and what does not.

The author draws parallels between his work and previous attempts at
studying emergent misbehavior in manufacturing systems, and proposes
an agenda that includes creating a taxonomy of emergent misbehavior,
creating a taxonomy of typical causes, and developing detection and
diagnosis, prediction, amelioration, and testing techniques. There is
also a mention of automated systems for managing large-scale systems,
which are still naïve, such as automated control, SOA, and
declarative approaches.

Finally, the conclusion we can draw from this paper is that
detecting misbehavior is inherently a difficult task, but with enough
insight into the system we should be able to detect patterns of
misbehavior and take measures to prevent them.

Criticism

This seems like a very high-level, abstract paper, and I wouldn’t like
to criticize it too much. I think it is a good introduction to the
papers that come next, and it’s good to know why we need to build tools
such as NetMedic and X-Trace, and to gain insight into how important
misbehavior is!

Rodrigo

Nov 3, 2009, 1:23:50 AM

to CSCI2950-u Fall 09 - Brown

Review from Juexin

Title
Emergent (Mis)behavior vs. Complex Software Systems

Author(s)
Jeffrey C. Mogul

Date

2006

Novel Idea
N/A

Main Result(s)
Defines and classifies emergent behaviors in large systems in detail,
and points out some research directions.

Impact
Provides a summary of previous experience with emergent behaviors, which
can help avoid or repair such problems.

Evidence
The author points out these issues in emergent behavior research:
- Creating a taxonomy of emergent misbehavior;
- Creating a taxonomy of typical causes;
- Developing detection and diagnosis techniques;
- Developing prediction techniques;
- Developing amelioration techniques;
- Developing testing techniques.
Finally, the paper introduces some new kinds of complex computing
systems being developed by some companies.

Criticism
He did not provide any statistics or quantitative analysis.

===============================================================
Title:
Detailed Diagnosis in Enterprise Networks

Authors:
Srikanth Kandula, Ratul Mahajan, Patrick Verkaik,
Sharad Agarwal, Jitendra Padhye, Paramvir Bahl

Date:
Aug. 2009

Novel Idea:
Analyze the joint behavior of two components in the past to estimate
the impact of current events.

Main Result(s):
The authors studied small enterprise networks and found that
they differ from large enterprises in that their
administration is less sophisticated. They developed a diagnosis
system, NetMedic, which is scalable to large networks.

Impact:
The paper presents an approach that enables detailed analysis at a
finer granularity with little application specific knowledge.

Evidence:
- Modeling the network as a dependency graph and then using history to
detect abnormalities and likely causes. The nodes of this graph are
network components such as application processes, machines,
configurations, and network paths. There is a directed edge from a
node A to a node B if A impacts B, and the weight of this edge
represents the magnitude of this impact.
- NetMedic's workflow: capturing component state, generating the
dependency graph, and diagnosing by computing the abnormality of
component states and ranking the edges.
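
As a rough illustration of the dependency-graph idea described above
(not NetMedic's actual algorithm, which this review does not spell
out), here is a minimal Python sketch. The names DependencyGraph and
rank_causes, the component names, and all the numbers are hypothetical.
It assumes, as a simplification, that only direct neighbors of the
affected component are considered and that a candidate's score is its
abnormality multiplied by the edge weight.

```python
from dataclasses import dataclass, field


@dataclass
class DependencyGraph:
    # edges[src][dst] = weight in [0, 1]: estimated impact of src on dst.
    # In this sketch the weights are given; a real system would have to
    # estimate them (the review says NetMedic uses history for this).
    edges: dict = field(default_factory=dict)

    def add_edge(self, src, dst, weight):
        self.edges.setdefault(src, {})[dst] = weight


def rank_causes(graph, abnormality, affected):
    """Rank direct neighbors of `affected` by abnormality * edge weight.

    `abnormality` maps a component name to a score in [0, 1] saying how
    unusual its current state is (hypothetical input for this sketch).
    """
    scores = {}
    for src, targets in graph.edges.items():
        if affected in targets:
            scores[src] = abnormality.get(src, 0.0) * targets[affected]
    # Highest score first: the most likely culprit under this assumption.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical example: three components that can impact "app".
graph = DependencyGraph()
graph.add_edge("config", "app", 0.9)
graph.add_edge("machine", "app", 0.4)
graph.add_edge("network_path", "app", 0.2)

abnormality = {"config": 0.8, "machine": 0.5, "network_path": 0.9}

ranking = rank_causes(graph, abnormality, "app")
print([name for name, _ in ranking])
```

In this toy example a highly abnormal component (network_path) still
ranks last because its edge into the affected component is weak, which
is the intuition behind weighting edges by impact.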

Prior Work
N/A

Competitive work
None for small enterprise networks.

Reproducibility
Gathering status (analyzing the logs) is a barrier.

Question
What led them to this idea of studying small enterprise networks?

Criticism:
Obtaining the dependency graph is complex when an operator wants to
debug a performance problem. So future work could be to reform the log
organization and keep the data in a form that is easy to diagnose from.

Dongbo Wang

Nov 3, 2009, 1:43:11 AM

to brown-cs...@googlegroups.com

Paper Title: Emergent (Mis)behavior vs. Complex Software Systems

Authors: Jeffrey C. Mogul

Date: 2006

Novel Idea: The paper discusses “emergent misbehavior”. Emergent behaviors are behaviors that cannot be predicted through analysis at any level simpler than that of the system as a whole. The individual components work as expected, but the behavior of the whole system may deviate from what it is supposed to be. The paper gives an analysis of this problem. It presents a taxonomy of emergent misbehavior, a taxonomy of typical causes, and ways to detect, diagnose, and predict such behaviors.

Main Result & Impact: The paper gives a clear description of what emergent misbehavior is and what it is not. It gives elaborate examples of emergent misbehavior, from traditional engineering problems (bridge construction) to hard drives, to networks, and to complex distributed systems and operating systems. The examples are very interesting and really show how a complex system as a whole may give rise to unexpected behaviors.

Evidence: The paper is not based on proof, and there is no experiment to support it. But I think the paper really makes its point. It’s hard to anticipate the overall behavior of a complex system when designing the components. Diagnosis, detection, prediction, and amelioration of such emergent misbehaviors are very important.

Prior Work: Steven Gribble’s observation, Parunak and VanderBok’s paper about such problems in distributed control systems for manufacturing systems.

Reproducibility: nothing to reproduce.

Question and Criticism: none.

Marcelo Martins

Nov 3, 2009, 1:32:15 AM

to brown-cs...@googlegroups.com

Paper Title "Emergent (Mis)behavior vs. Complex Software Systems"

Author(s) Jeffrey C. Mogul
Date ACM EuroSys'06 April 2006

Novel Idea

The author proposes a series of steps for creating an agenda for dealing
with emergent misbehavior in complex software systems. These include
taxonomies and development of techniques to identify and reduce the
damage caused by this type of misbehavior.

Main Result(s)

There are no results. This is not a systems paper per se, but a
collection of ideas that might turn into new systems or ameliorations of
the current ones.

Impact

The paper is basically food for thought. It will probably become a
highly cited paper in the future (if it isn't by now), as it opens
many paths of a broad area of research (some of them already mentioned
in other papers).

Evidence

The paper is corroborated by opinions and suggestions of well-known
researchers in the area of complex software systems and software
engineering, which shows that the idea comes from a real necessity of
improving emergent misbehavior detection and prediction techniques.

Prior Work

A few techniques and concepts from areas outside operating systems
are mentioned, such as work in the mechanical engineering field,
along with others that are directly related to computers, such as
Gribble's observations on the butterfly effect, IBM's autonomic
computing proposal, and the EmNets project.

Competitive work

None, as the area is not mature enough.

Reproducibility

There is nothing to be reproduced yet.

Questions / Criticisms

It is difficult to question or criticize something that has not been
designed, implemented, or tested.

Ideas for further work

Many ideas that can be explored are described in the paper. My personal
interest is in their application to ad-hoc wireless networks.

Xiyang Liu

Nov 3, 2009, 1:34:45 AM

to CSCI2950-u Fall 09 - Brown

Paper Title
Emergent (Mis)behavior vs. Complex Software Systems

Author(s)
Jeffrey C. Mogul

Date

EuroSys'06, April 2006

Novel Idea & Main Results
The paper presented examples and the nature of emergent misbehavior
which is generally an unexpected behavior arising from the composition
of components. The paper also proposed a research agenda including
creating taxonomies of emergent misbehavior categories, typical causes
and developing detection, diagnosis, prediction and amelioration
techniques.

Impact
The definitions and techniques for dealing with emergent misbehavior
provided by this paper are a general summary of previous research and
practical architectures. Future research and implementations in this
area can refer to it.

Evidence
Examples of emergent misbehavior from various domains were presented
to support the later sections of scope and research agenda of emergent
misbehavior.

Prior Work
The paper referred to much previous research and many industrial
implementations. For example, most emergent misbehavior categories and
causes are supported by research or real cases. Techniques for dealing
with emergent misbehavior are also drawn from prior work.

Competitive work
None

Reproducibility
NA

Criticism
Some detection and diagnosis techniques need not be aware of the
categories and causes of misbehavior if they can correctly detect
abnormality and compute the cause, as does NetMedic, introduced in
the other paper.

joeyp

Nov 3, 2009, 8:43:42 AM

to CSCI2950-u Fall 09 - Brown

Emergent (Mis)behavior vs. Complex Software Systems

Jeffrey C. Mogul

EuroSys 2006

This paper is a call for research in a new direction in large-scale
systems research. It advocates for identifying taxonomies and
abstractions to describe emergent behavior in complex systems. The
paper has no real results, just discussion, as it is really more of a
position paper than a scientific one.

The primary utility of this paper is in starting conversation about
the possibility of research in the area of emergent behavior of
systems. It highlights points that can start arguments and
distinctions between different views and approaches to the problems it
presents.

There is clearly no empirical analysis. Most of the reasoning is in
drawing parallels with or extensions from examples and instances of
emergent behaviors. These are classified, discussed, and analyzed to
come up with general conclusions about emergent misbehavior issues
that arise in real systems.

Some of the discussion focuses on other research that deals with
emergent behavior in other contexts or with different definitions. I
agree that the distinction between chaotic and emergent behavior is an
important one, for instance. Noting that chaotic behavior depends on
perturbations in the input, while emergent behavior can reach
unexpected *steady* states for broad ranges of inputs is pretty
important.

One point that was especially vague was the argument that things
cannot be "correct by construction." If the methods of construction
are actually correctly verified and evaluated, then it should be
possible to use construction methods that are provable.
Alternatively, there may be some kind of undecidability or
impossibility result of constructing systems with certain properties.
It might even be really difficult to do the verification or make the
probabilistic argument required in these cases. But really what this
says is that solutions that work by construction haven't had their
building blocks or their composition properly verified.

In the same vein, I thought the statement about aggregates not having
well-defined properties was an interesting one. What properties can
we think about that it might be possible to ascribe to a combination
of processes or machines or services? How might we design them from the
start to be amenable to this kind of composition?

The "causes of misbehavior" seemed to be mixing errors in the system
with design goals of the system. Massive scale is something that we
design systems specifically for, as is decentralized control. These
things might be common conditions under which emergent behavior
arises, but they seem like very different properties than design and
implementation *issues* like misconfiguration, unexpected loads, and
lack of composability.
