Object Configuration Languages


dmbarbour

Apr 22, 2009, 6:04:47 PM
to PiLuD
This is a position statement that we should provide first-class
language support for configuring objects and their relationships in
languages that possess anything similar to the 'object' concept
(including actors, first-class processes, etc.)

I have always been impressed by how well dataflow and workflow
programming languages support rapid construction of object graphs
where the 'objects' represent processes or processing tasks of some
sort. These graphs can be composed with multiple inputs and multiple
outputs to create ever larger object graphs.

It turns out that Java and C++ have a variety of "dependency-injection"
frameworks that achieve, essentially, the same thing... but without
the various conveniences achieved from first-class support (such as
process-combination and partial-evaluation, 'dead-object' elimination,
optimizing for locality, composable abstract configurations). Further,
these languages resist object configurations due to properties of
their constructors (in particular, constructors with side-effects can
be problematic) and due to their default synchronization properties
(stack-based message passing).

It has long been my opinion that when a feature tends to be badly
reinvented in languages that lack it, those languages were missing
that feature from the start.

I posit that object languages (including actors model languages and
languages with first-class processes) should be designed to support
object configuration. I further assert that the 'object constructor'
for the language should construct whole configurations at once,
allowing one to leverage economies of scale (i.e. grabbing whole slabs
of memory for allocation of the configuration) and allowing the
optimizer to operate on abstract configurations of objects. Creating a
single object will be a degenerate object configuration.

Some useful properties for an object configuration language:

(1) Simple support for cyclic object graphs.
(1.a) This generally requires that the reference or name of an object
be available to other object constructors prior to the definition of
an object.
(1.b) This is benefited if the object constructors are free of side-
effects, as it avoids any risk of objects attempting to communicate
with objects that aren't yet completely constructed.
(1.c) This is benefited by support for asynchronous message passing,
such that calls that don't require replies do not need to wait for the
callee to return. Cycles will otherwise either result in deadlock or
potentially blow the stack.

(2) Transparent support in the configuration for including references
to "live" objects that originated outside of the freshly constructed
object graph. This allows one to hook an abstract object graph into a
live system of existing objects prior to constructing it. This is
actually quite easy to achieve: the object identifier type must be the
same for both live objects and for abstract ones within the
configuration.

(3) First-class composable configurations: the ability to abstract and
construct configurations as first-class values, hook them together,
return multiple values from a configuration including both
configuration properties (e.g. a set of constructed procedures) and
object identifiers.

(4) Parametric configurations: a client is only able to see the outputs
a configuration directly exports, and a configuration is only able to
vary based on the inputs to its abstract configuration. The former
allows dead-object elimination. The latter allows configurations to be
precompiled and abstract configurations to be partially compiled in
advance of construction.
(4.a) Both of these goals also benefit from side-effect free
constructors.
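
As a rough illustration of properties (1), (2) and (4), here is a minimal
Python sketch. Everything in it (Ref, Configuration, reachable, construct,
and the dict-based "objects") is hypothetical and invented for this
example; it is not taken from any existing language or framework. It
assumes side-effect free constructors, so objects can be allocated first
and wired afterwards, and objects unreachable from the exports can simply
be skipped.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ref:
    """Object identifier; the same type names live and not-yet-built objects."""
    name: str


@dataclass
class Configuration:
    """A description of an object graph; creating one constructs no objects."""
    definitions: dict = field(default_factory=dict)  # Ref -> (kind, argument Refs)
    exports: tuple = ()

    def define(self, ref, kind, *args):
        self.definitions[ref] = (kind, args)


def reachable(config):
    """Refs defined in config that can be reached from its exports."""
    seen, stack = set(), list(config.exports)
    while stack:
        ref = stack.pop()
        if ref in seen or ref not in config.definitions:
            continue  # already visited, or a live object defined elsewhere
        seen.add(ref)
        stack.extend(config.definitions[ref][1])
    return seen


def construct(config, live):
    """Build only the reachable objects, in two phases so cycles are harmless."""
    keep = reachable(config)
    # Phase 1: allocate every object as an unwired shell; nothing runs yet.
    objects = {ref: {"kind": config.definitions[ref][0], "peers": []} for ref in keep}
    # Phase 2: wire references; live objects are found through the same Ref type.
    for ref in keep:
        for arg in config.definitions[ref][1]:
            objects[ref]["peers"].append(objects[arg] if arg in objects else live[arg])
    return {ref: objects[ref] if ref in objects else live[ref] for ref in config.exports}


# Usage: a two-object cycle, one live object, and one dead object.
ping, pong, orphan, logger = Ref("ping"), Ref("pong"), Ref("orphan"), Ref("logger")
cfg = Configuration(exports=(ping,))
cfg.define(ping, "Ping", pong, logger)  # pong is referenced before it is defined
cfg.define(pong, "Pong", ping)          # cycle back to ping
cfg.define(orphan, "Orphan", ping)      # never exported or referenced
live = {logger: {"kind": "Logger", "peers": []}}
built = construct(cfg, live)
```

In this sketch 'orphan' is never allocated, because nothing exported can
reach it. That is only sound under the assumption that constructing it
would have had no observable effect.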

For distributed programming languages, I would also add:

(5) Support for distributing objects in the object configuration (i.e.
saying that one object is to be located "nearby" another object), thus
creating 'cliques' or sub-graphs of objects that will automatically be
distributed to different machines based on locations of live
resources. This also makes transparent where code is executed.
(5.a) object capability model security, so the remote platform can run
your code without fear
(5.b) some sort of process accounting in the language or its
implementation, such that remote platforms can decide intelligently
whether they have the resources to host an object configuration.

(6) Secrecy contagion and data-flow management, such that your
secrets don't get distributed (at least without explicit efforts) to
untrusted remote platforms. This restricts how far objects in the
object configuration can travel to be 'nearby' other objects, and can
also target objects from outside the configuration (including 'live'
objects) to restrict data-flow. Should interact with 'live' objects in
such a manner as to cause construction to fail (unless explicitly
forced) if a live object cannot accommodate the secrecy requirement.

(7) Support for handling the inevitable partial-failures due to node
loss, power cycling, etc. that destroy part of a configuration.
Ideally includes both automatic recovery (automatic regeneration of
configuration when feasible) and cascading failures (allowing one to
fail-fast) so one can make decisions about whether to limp along, fail-
fast, or recover. Requires well-defined behavior for sending a message
to an invalid destination.

There may be other properties useful for object configuration
languages in different circumstances, but I'll admit some bias towards
support for distributed programming.
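
To make property (5) a little more concrete, here is a minimal Python
sketch of resolving "nearby" constraints against the locations of live
objects. The function name place, the pair encoding, and the machine names
are all assumptions made for this illustration. The sketch deliberately
ignores host capacity, secrecy limits, and mobility, which a real
implementation would have to handle.

```python
def place(nearby, live_location):
    """Assign each new object the machine of the object it must be near.

    nearby: list of (obj, anchor) pairs, read as "obj should be near anchor".
    live_location: dict mapping already-live objects to machine names.
    """
    location = dict(live_location)
    pending = list(nearby)
    while pending:
        progressed = False
        for obj, anchor in list(pending):
            if anchor in location:
                location[obj] = location[anchor]
                pending.remove((obj, anchor))
                progressed = True
        if not progressed:
            # Every remaining constraint points at something with no location.
            raise ValueError(f"no located anchor for: {pending}")
    # Report only the newly placed objects, not the live ones.
    return {obj: m for obj, m in location.items() if obj not in live_location}


# Usage: 'db' and 'user' are live; the rest form cliques around them.
placement = place(
    [("cache", "index"), ("index", "db"), ("ui", "user")],
    {"db": "m1", "user": "m2"},
)
```

Here 'index' and 'cache' end up in one clique on the machine hosting 'db',
while 'ui' follows 'user' to a different machine.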

-------------------------------

Example object configuration language (that meets above properties):

Two configuration-language primitives: 'configure' and 'construct'.

'configure' will create a new object configuration as a first-class
value. Each configuration introduces and defines zero or more actors,
and exports a value. This configuration can be abstracted by use of
functions (i.e. may contain variables from a functional scope). The
creation of a new configuration does not imply the construction of any
objects.

'construct' is a primitive procedure that will take as input a
configuration and produce as output the exported values from a
configuration. The export of an object name indicates that both said
object and the objects with which it can potentially communicate will
be constructed. May throw a (potentially resumable) exception if
construction cannot succeed (e.g. for reasons of secrecy).

'construct' is also available within a configuration, in which case it
indicates that a sub-configuration will also be constructed. This
allows composition of configurations.

fn: A B C =>
  {configure D E F:
     define (x:X y:Y) = {construct {C E F}}
     define D = {ObjectDef1 A B}
     define E = {ObjectDef2 D F}
     define F = {ObjectDef3 A F X}
     export (a:D b:F)
     nearby [(D B) (E F) (F A)]
     depend [(D A) (D B) (E F) (F X) (F D)]
  }

This example configuration, though contrived, demonstrates the above
properties:
(a) The configuration immediately makes available the object-names D,
E, F prior to the definition of any associated objects. These objects
were used ahead of their definitions.
(b) 'C' is implicitly an abstract configuration (a function returning
a configuration) and was composed by passing E and F into the
configuration and constructing the configuration as part of the
configuration.
(c) Each object declared in the configuration eventually receives a
definition.
(d) a value (in this case a record of values) is exported from each
configuration. This value doesn't necessarily contain references to
every object constructed within it. This, in combination with silent
constructors, allows for dead-objects to be statically identified and
dropped. In the above case, for example, 'E' might be dropped if it
isn't reached by F through X.

(e) The 'nearby' field potentially allows for automatic distribution
(e.g. if A and B are on separate machines, then 'D' would end up on
the same machine as B and both 'E' and 'F' would end up on the same
machine as A). Objects produced as part of C might also be split
across machines.

If the machine hosting A cannot host more actors, then it might be
that E and F are distributed on, say, a Google cloud server that is as
darn close to A as possible (supposing your secrecy limits trust
Google and Google is a willing host). If the machine hosting A is
mobile, then the 'nearby' field might be automatically maintained,
moving E and F to different servers with some aim to keep latency and
bandwidth to a minimum.

(f) The 'depend' field marks dependencies. This could reasonably serve
double-duty for both automatic regeneration (in case D, E, F, X, or
one of X's dependencies is destroyed) and for cascading failures when
regeneration isn't possible (i.e. if 'A' is destroyed then D will die
which will kill F which will kill E).
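
The cascading half of (f) can be sketched in a few lines of Python. The
function cascade and the pair encoding are assumptions made for this
illustration; only the dependency list itself is copied from the 'depend'
field of the example. The sketch assumes no regeneration is possible, so
every loss propagates.

```python
def cascade(depend, destroyed):
    """Return every object that dies when `destroyed` is lost.

    depend: list of (dependent, dependency) pairs, as in the 'depend' field.
    Assumes nothing can be regenerated, so failures always propagate.
    """
    dead = {destroyed}
    changed = True
    while changed:
        changed = False
        for dependent, dependency in depend:
            if dependency in dead and dependent not in dead:
                dead.add(dependent)
                changed = True
    return dead - {destroyed}


# The 'depend' field from the example configuration.
depend = [("D", "A"), ("D", "B"), ("E", "F"), ("F", "X"), ("F", "D")]
lost_with_a = cascade(depend, "A")
lost_with_x = cascade(depend, "X")
```

Losing A takes D, F and E with it, as described above, while losing X
takes only F and E.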

The above example does not demonstrate secrecy contagion, which is
really a topic unto itself. It also doesn't address explicit mobility
and explicit destruction, though those could be addressed at the
individual object level. That is, one may add some primitive mobility-
objects that can move to be 'nearby' other destinations upon command,
dragging their entire clique (all objects defined to be near the
mobile object) with them. Similarly, one might add some primitive
'suicide' objects upon which other objects depend and that,
explicitly, do not regenerate if they are killed with the 'die'
message, thus inducing a cascading failure.

Additionally, one may still need a constructor to hook up all the
appropriate actor components into live actors. A possibility is to
build a thunk and export it as one of the features, to be executed
when appropriate. Another possibility is whole-configuration side-
effects upon construction even if individual object constructors have
no side-effects. In favor of further control over composition of live
systems, I tend to favor exporting a thunk.

------------------

Graphical object configuration languages:

The 'syntax' for object configuration languages is sometimes
graphical.

This probably helps some people compose object configurations in a
manner that makes sense to them, but it often manages to stymie me
when I wish to abstract configurations in terms of inputs that are not
objects. E.g. if I wished to create a linear configuration of fixpoint
filtering of a length given by an integer, that is difficult to do
graphically.
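
For comparison, the same thing is short when written functionally. The
Python sketch below is only an illustration: filter_chain, the "Filter"
kind, and the tuple layout are hypothetical names, and the function merely
describes the chain without constructing anything.

```python
def filter_chain(length, source, sink):
    """Describe `length` filter stages wired source -> stage 0 -> ... -> sink.

    Returns (definitions, exports). Each definition maps a stage name to
    (kind, upstream, downstream). Nothing is constructed here.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    names = [f"filter{i}" for i in range(length)]
    definitions = {}
    for i, name in enumerate(names):
        upstream = source if i == 0 else names[i - 1]
        downstream = sink if i == length - 1 else names[i + 1]
        definitions[name] = ("Filter", upstream, downstream)
    return definitions, {"head": names[0], "tail": names[-1]}


# Usage: the integer parameter decides how many objects the configuration has.
definitions, exports = filter_chain(3, "mic", "speaker")
```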

That said, I would favor graphical object configuration languages so
long as the same information in the graph can be readily produced
functionally, or even better if vice versa: designed for functional
use, but readily supported graphically via some IDE features.

----------------------

Anyhow, questions and other opinions on this subject are welcome.

raould

Apr 22, 2009, 6:49:00 PM
to PiLuD
> I posit that object languages (including actors model languages and
> languages with first-class processes) should be designed to support
> object configuration.

while i can believe that languages shouldn't do bad things like the
problems you mention with constructors, i'm not sure that means the
language has to include operations to implement object config (i know
you only wrote "designed" not "implement", i'm just saying).

in other words, what do you think of things like Orc?
http://orc.csres.utexas.edu/

sincerely.

dmbarbour

Apr 22, 2009, 8:40:10 PM
to PiLuD
> in other words, what do you think of things like Orc? http://orc.csres.utexas.edu/

Orc configures 'site calls', each with an input (potentially a tuple)
and an output. It provides three site combinators: sequence, parallel,
and prune. I must admit to some confusion, raould. Your above question
is preceded by a statement ("I'm not sure that means the language has
to include operations to implement object config") from which I infer
that you believe the Orc language does not provide operations for
describing its site configurations. Are you making a strong
distinction

In any case, it is difficult for me to offer my thoughts on Orc's site-
combinator language independently of its context.

Orc is clearly designed for transient one-off programs and "shallow",
centralized composition of distributed systems. In its purpose, Orc is
certainly an adequate language. It assumes very little about site
calls, and can readily abstract independent external systems that
'reply' to messages.

But it is my impression that the apparent advantages of Orc are due
more to deficiencies in other languages than to the virtues of Orc.

Orc is a simple language bordering on simplistic. It's easy to
implement, and does a lot to aid in some difficult problems, and is a
wonderful proof-of-concept, but is ultimately incomplete in ways that
keep it from a broad range of applications as a general-purpose
programming language. I would not use Orc for interacting with
strongly *inter*dependent systems. I would not use Orc if I needed
long-running programs (i.e. object graphs that might continue sending
messages for days on end). I would not use Orc to construct large
programs (requiring 'deep' composition) or if I desired cross-
combinator optimizations, or if I wished any sort of protocol
(precondition+postcondition and expectations) analysis. I would not
use Orc for true distributed programming.

It might be more accurate to say: Orc and object languages that lack
configuration support complete one another excepting in various
critical ways concerning type-safety, tight integration, safety and
failure issues involving site interdependence, optimizations, and the
various other benefits of first-class language support.

The ability to limit side-effects of constructors is exactly about
achieving a few of those features missing from Orc. It supports
interdependent systems and larger composition (especially allowing for
cyclic interdependent systems and eliminating order-of-construction
concerns for large programs and potentially black-box abstract
configurations). It also supports some powerful optimizations,
including static compilation, and should make regeneration after
partial failure easier to perform.

The other points about 'useful properties' for object configuration
languages are along the same lines. They're useful properties of the
object configuration language that improve the gestalt programming
language. One can get by without those constraints but the result is
(to my analysis) rarely going to offer a justifiable benefit for ease-
of-use, reliability, safety, security, or optimization.

Raoul Duke

Apr 22, 2009, 8:58:07 PM
to pi...@googlegroups.com
many thanks for your thoughts.

> In any case, it is difficult for me to offer my thoughts on Orc's site-
> combinator language independently of its context.

i apologize for not being more clear / explaining what i had in mind!

what i was thinking was: some people have apparently been working on
languages which (i guess) are doing the object-conducting, apart from
languages which implement the objects. so there is some separation of
concerns, and perhaps each language can be revised over time to do its
job ever better, w/out impacting the other.

of course, if the "lower" level object implementation language does
bad things like the constructor issues you mentioned, then the
"higher" level conductor language will suffer.

so while a "lower" level language doesn't have to /implement/
conducting abilities, it should be careful not to /impede/ those of
another "higher" level one.

sincerely.

Raoul Duke

Apr 22, 2009, 9:00:14 PM
to pi...@googlegroups.com
> It might be more accurate to say: Orc and object languages that lack
> configuration support complete one another excepting in various
> critical ways concerning type-safety, tight integration, safety and
> failure issues involving site interdependence, optimizations, and the
> various other benefits of first-class language support.

ah, yes, that makes sense to me, thanks for the thought.

sincerely.
