I am busy writing a framework to do some processing that can vary
greatly based on setup.
I have a class representing processing stages that allows me to
assemble a hierarchy of stages, like so (best viewed in a monospaced font):
            A
            |
        --------
        |      |
        B1     B2
        |
 ---------------
 |      |      |
 C1     C2     C3
Only the leaf nodes (C1, C2, C3, B2) will do actual processing, with
their parents acting as controllers that handle the calling of the
correct factories, validation, and deciding what child node to move to
next.
Now suppose that C1 obtains/calculates a piece of data that is
relevant to C2 (think get customer for C1, and find orders for
customer for C2). I could simply let C2 know about C1, but that limits
my ability to substitute or completely remove C1. Worse, suppose some
static info is obtained right at the top - this may be needed at
several of the leaf nodes - how do I let them know about the data? You
can think of a hundred scenarios like this, and in all cases you could
simply introduce some more coupling - although you wouldn't want to.
The solution that I decided to use, is the point of this post. It
works quite well, but I can see that it will turn into a bit of a
monster in time.
Basically, every stage has a datapacket, and a method that allows it
to supply a datapacket to a child node. The datapacket implements a
number of interfaces, which dependent stages then query for the info
that is being passed along, or use to update that info where
applicable.
Every non-leaf node may decide what datapacket its children need to
have access to, which is fine, as the controller must certainly have
some notion of what it is controlling.
This works nicely, in that the data packet is a consistent way of
accessing the information that the parent permits, and leaves quite a
bit of control with the controller that has to oversee certain stages.
Additionally, it removes coupling up the tree, as any stage will only
ever need to access the data relevant to its role, and not be
concerned about where it came from.
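
A minimal Python sketch of the packet-with-interfaces idea described above. All names here (HasCustomer, HasOrders, DataPacket, get_customer, find_orders, the "cust-42" value) are hypothetical and only illustrate the shape of the design, not the actual framework:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Protocol


class HasCustomer(Protocol):
    """Narrow view of the packet: only the customer slot."""
    customer_id: Optional[str]


class HasOrders(Protocol):
    """Narrow view of the packet: only the orders slot."""
    orders: List[str]


@dataclass
class DataPacket:
    """One concrete packet that satisfies both narrow views."""
    customer_id: Optional[str] = None
    orders: List[str] = field(default_factory=list)


def get_customer(packet: HasCustomer) -> None:
    # Stand-in for C1: a real stage would look the customer up somewhere.
    packet.customer_id = "cust-42"


def find_orders(source: HasCustomer, sink: HasOrders) -> None:
    # Stand-in for C2: it knows the two views, not C1 and not the whole packet.
    if source.customer_id is not None:
        sink.orders.append(f"order-for-{source.customer_id}")


packet = DataPacket()
get_customer(packet)
find_orders(packet, packet)
```

Each stage is written against the smallest view it needs, so the stage that fills in the customer can be swapped or removed without touching the stage that reads it.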
What I don't like is that, without constant supervision, this packet
can grow to be quite a big structure, and it can easily become a
general-purpose scratch pad. Now I know what it is meant for and know
that I will not abuse it, but I have no control over what other
developers may add to this thing in time.
My question, then, is this: What other ways are there to pass along
information relevant to several nodes, without causing coupling, or
context-dependency? And also, exactly how bad a solution is this?
Regards,
Cobus Kruger.
If it is this general, why don't you just make the data packet a general
structure, like an associative store? You can then define a set of keys
(probably strings) with certain semantics. Processors store their results
under the given key, so others can access them later on.
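
A rough Python sketch of that suggestion, assuming a plain dictionary wrapped in a small class. The class name Blackboard, the key constants and both stage functions are made up for illustration:

```python
from typing import Any, Dict

# Hypothetical, agreed-upon keys; the semantics live in this one place.
CUSTOMER = "customer"
ORDERS = "orders"


class Blackboard:
    """Associative storage passed from stage to stage."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._items[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._items.get(key, default)


def customer_stage(board: Blackboard) -> None:
    # Stand-in for C1: stores its result under the agreed key.
    board.put(CUSTOMER, "cust-42")


def orders_stage(board: Blackboard) -> None:
    # Stand-in for C2: reads by key and copes with the key being absent.
    customer = board.get(CUSTOMER)
    board.put(ORDERS, [] if customer is None else [f"order-for-{customer}"])


board = Blackboard()
customer_stage(board)
orders_stage(board)
```

Nothing in a store like this prevents it from becoming a general-purpose scratch pad, so the list of keys would still need looking after.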
Regards
robert
> I have a class representing processing stages that allows me to
> assemble a hierarchy of stages, like so (best viewed in a monospaced font):
>
>             A
>             |
>         --------
>         |      |
>         B1     B2
>         |
>  ---------------
>  |      |      |
>  C1     C2     C3
>
> Only the leaf nodes (C1, C2, C3, B2) will do actual processing, with
> their parents acting as controllers that handle the calling of the
> correct factories, validation, and deciding what child node to move to
> next.
Short answer: Don't Do That!
What you have described here is a classical functional decomposition
hierarchy where the hierarchy structure is explicit in the
implementations of the parent controllers. That's fine if you are doing
functional programming but it is a serious no-no for OO development.
Note that the parents do not reflect a commonality shared by all their
children. Instead, they have completely different responsibilities like
deciding which sequence the children will be accessed in.
The OOA/D is-a relationship is about subclassing based on specialized
properties. It is a pure set partitioning mechanism. A given leaf
instance must incorporate all of the properties defined by the relevant
parents and only those resolved through the superclass hierarchy. Your
description implies that A and B1 are separately instantiated to control
the collaborations of multiple leaves.
Which segues to another problem: in OO development objects are logically
indivisible at a given level of abstraction. As a result they are all
peers and collaborate on a peer-to-peer basis. However, the notion of
'controllers' in this context violates that because they act as
middlemen for the collaborations.
>
> Now suppose that C1 obtains/calculates a piece of data that is
> relevant to C2 (think get customer for C1, and find orders for
> customer for C2). I could simply let C2 know about C1, but that limits
> my ability to substitute or completely remove C1. Worse, suppose some
> static info is obtained right at the top - this may be needed at
> several of the leaf nodes - how do I let them know about the data? You
> can think of a hundred scenarios like this, and in all cases you could
> simply introduce some more coupling - although you wouldn't want to.
Collaborations require relationships. If C1 needs to tell C2 something,
then there needs to be a relationship path between them that can be
navigated for message passing. If the collaboration is complicated or
dynamic, then there are patterns like Observer for organizing it.
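
A minimal sketch of what Observer could look like for the C1/C2 case, in Python. The class and method names are illustrative assumptions, and the hard-coded customer value stands in for a real lookup:

```python
from typing import Callable, List, Optional


class CustomerStage:
    """Stand-in for C1: announces its result without knowing who listens."""

    def __init__(self) -> None:
        self._observers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._observers.append(callback)

    def run(self) -> None:
        customer = "cust-42"  # hypothetical lookup result
        for notify in self._observers:
            notify(customer)


class OrderStage:
    """Stand-in for C2: reacts to a customer, wherever it came from."""

    def __init__(self) -> None:
        self.customer: Optional[str] = None

    def on_customer(self, customer: str) -> None:
        self.customer = customer


c1 = CustomerStage()
c2 = OrderStage()
c1.subscribe(c2.on_customer)  # the relationship is set up outside both stages
c1.run()
```

The subscription is made by whoever assembles the stages, so neither class refers to the other by name.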
>
> The solution that I decided to use, is the point of this post. It
> works quite well, but I can see that it will turn into a bit of a
> monster in time.
>
> Basically, every stage has a datapacket, and a method that allows it
> to supply a datapacket to a child node. The datapacket implements a
> number of interfaces, which dependent stages then query for the info
> that is being passed along, or use to update that info where
> applicable.
>
> Every non-leaf node may decide what datapacket its children need to
> have access to, which is fine, as the controller must certainly have
> some notion of what it is controlling.
>
> This works nicely, in that the data packet is a consistent way of
> accessing the information that the parent permits, and leaves quite a
> bit of control with the controller that has to oversee certain stages.
> Additionally, it removes coupling up the tree, as any stage will only
> ever need to access the data relevant to its role, and not be
> concerned about where it came from.
>
> What I don't like is that, without constant supervision, this packet
> can grow to be quite a big structure, and it can easily become a
> general-purpose scratch pad. Now I know what it is meant for and know
> that I will not abuse it, but I have no control over what other
> developers may add to this thing in time.
That doesn't surprise me and I suspect it will get a lot worse as
requirements change and maintenance is done. The hierarchy is static,
which makes it more difficult to maintain if the relationships are
volatile. Finally, you are instantiating relationships by passing
object references, which is the worst form of coupling. This is
mitigated by employing a complicated interface hierarchy for restricting
access but that is still fragile (i.e., in each context one must ensure
the correct interface is used).
>
> My question, then, is this: What other ways are there to pass along
> information relevant to several nodes, without causing coupling, or
> context-dependency? And also, exactly how bad a solution is this?
Start by getting rid of the hierarchy and controllers. Unfortunately,
without more information about the specific problem it is hard to say
how they should be replaced. For example, you refer to stages but the
accompanying text implies that the sequencing is fairly arbitrary.
[Caveat: you may still need subclassing. Just don't make it serve too
many masters. (Here you seem to be using it for both subclassing and
solution flow of control.) For example, in a typical web POS order
entry application the customer navigates a hierarchy of categories of
goods. Clearly that customer navigation is best modeled as a hierarchy.
But that does not mean that one should subclass the goods descriptions
within the same hierarchy. In fact, doing so would likely be a
long-term nightmare. Web site navigation and goods descriptions are two
entirely different concerns and should be modeled separately.]
*************
There is nothing wrong with me that could
not be cured by a capful of Drano.
H. S. Lahman
h...@pathfindersol.com
Pathfinder Solutions -- We Make UML Work
http://www.pathfindersol.com
(888)-OOA-PATH
> Your
> description implies that A and B1 are separately instantiated to control
> the collaborations of multiple leaves.
Given, but at the layer above, it is immaterial whether it directly
contains leaf stages, or the so-called controllers (which again, may
contain both).
>
> Which segues to another problem: in OO development objects are logically
> indivisible at a given level of abstraction. As a result they are all
> peers and collaborate on a peer-to-peer basis. However, the notion of
> 'controllers' in this context violates that because they act as
> middlemen for the collaborations.
This is something I did think of at the time; however, the controllers
do act as leaf nodes as well - except that they also manage their
children. I did think of a hierarchy where ControllerNode inherits
from Node, essentially leaving the leaf nodes without this unneeded
functionality. I'd be interested to see how you would model something
like this.
> Collaborations require relationships. If C1 needs to tell C2 something,
> then there needs to be a relationship path between them that can be
> navigated for message passing. If the collaboration is complicated or
> dynamic, then there are patterns like Observer for organizing it.
I do actually use Observer for certain UI interactions, but somehow
didn't think of Observer for this.
> The hierarchy is static,
> which makes it more difficult to maintain if the relationships are
> volatile.
Just a slight confusion of terms here - the hierarchy I outlined
above is the interaction of nodes, not the inheritance hierarchy. This
point I think you did get (given your earlier comments). However,
which hierarchy are you referring to as being "static"? The
inheritance hierarchy certainly is, but the reason I have this problem
is that the processing (which logically is very stage driven - to the
extent that a decision in stage 1 can influence whether or not stages
2 and 3 must show) is very dynamic, and the complete absence of a
given stage - or even of the need for any information to replace it -
is completely permissible.
> Finally, you are instantiating relationships by passing
> object references, which is the worst form of coupling. This is
> mitigated by employing a complicated interface hierarchy for restricting
> access but that is still fragile (i.e., in each context one must ensure
> the correct interface is used).
What would you suggest, interfaces?
> For example, you refer to stages but the
> accompanying text implies that the sequencing is fairly arbitrary.
I did try to simplify, to focus the post more on my problem of
data-passing. In reality, the factories need to consider not only
several layers of static configuration, but also the context, and the
data collected from prior stages. And in some cases, this data can
differ to the point of requiring either an amount to guide
processing, or a list of items and quantities. Modeling these things
as completely separate processes is not possible, as certain
configurations may in fact require mixing these. The new design is to
replace software that in about seven years could not be stabilised
because the permutations were treated too specifically at the highest
layer (not at all an OO design though - the original authors did
introduce a class or two with about 300 lines of public field
declarations, for example. They also wanted to use objects, see?).
> There is nothing wrong with me that could
> not be cured by a capful of Drano.
What is Drano? I normally use coffee as the solution to everything
:-)
>>The hierarchy is static,
>>which makes it more difficult to maintain if the relationships are
>>volatile.
>
>
> Just a slight confusion of terms here - the hierarchy I outlined
> above is the interaction of nodes, not the inheritance hierarchy. This
> point I think you did get (given your earlier comments). However,
> which hierarchy are you referring to as being "static"? The
> inheritance hierarchy certainly is, but the reason I have this problem
> is that the processing (which logically is very stage driven - to the
> extent that a decision in stage 1 can influence whether or not stages
> 2 and 3 must show) is very dynamic, and the complete absence of a
> given stage - or even of the need for any information to replace it -
> is completely permissible.
I suspected as much, but I wasn't sure. My main point in going through
the static is-a drill is that the notion of OO hierarchy is different
than things like control hierarchies. IOW, one usually represents a
problem space hierarchy with simple associations rather than an is-a.
However, I still don't like the idea of your "controller" classes
because of the indivisibility argument. Such classes "hardwire"
sequences of collaborations in their implementation. Then if the
sequence changes, so must the implementation of the "controller". As a
simplistic example consider:
    A::method1(x):
        temp = this.attr1 + x
        temp = Bref.doIt(temp)
        this.attr2 = temp / 5
Ignoring how A gets Bref, this hardwires a dependence on what B.doIt
does. To properly set A.attr2 the method depends on B.doIt to do
something specific correctly. So A.method1 is not testable unless B.doIt
is implemented or stubbed to do the Right Thing. Now consider:
    A::method1(x):
        temp = this.attr1 + x
        Bref.doIt(temp)

    A::method2(x):
        this.attr2 = x / 5
Now each method stands alone, has no implementation dependencies on
B.doIt, and is completely testable without any B.doIt implementation.
(To test A.method1 all one has to ensure is that the message was sent to
the right place with the right argument.)
The first case is what I would regard as a "controller" in this context
because its implementation depends upon external context (i.e., what
B.doIt actually does). It also depends upon a specific sub-sequence of
the overall solution (i.e., adding two values; then processing that
result; then dividing the result of that processing by 5). In effect it
places A at a higher level than B in a control hierarchy.
The second case has exactly the same functionality, but if B.doIt is
modified, A's implementation and testing are not affected. For example,
A.method2 could be invoked by somebody other than B.doIt if one modifies
the external collaborations. That is transparent to A's implementation.
Thus A is a peer relative to B or anyone else and there is no
hierarchy of control.
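
For illustration, the second case might look like this in Python, with a recording stub standing in for B. The stub class and the concrete numbers are assumptions added for the sketch; only A, method1, method2, attr1, attr2 and doIt (here do_it) come from the pseudocode above:

```python
class A:
    def __init__(self, attr1, b):
        self.attr1 = attr1
        self.attr2 = None
        self._b = b  # how A obtains this reference is outside A's concern

    def method1(self, x):
        # Just sends a message; A does not use whatever B does with it.
        self._b.do_it(self.attr1 + x)

    def method2(self, x):
        # Stands alone; anyone may invoke it at the right time.
        self.attr2 = x / 5


class RecordingB:
    """Test stub: remembers the messages it receives and does nothing else."""

    def __init__(self):
        self.received = []

    def do_it(self, value):
        self.received.append(value)


stub = RecordingB()
a = A(attr1=10, b=stub)
a.method1(5)   # checked by looking at what the stub received
a.method2(20)  # checked with no B involved at all
```

Testing method1 only means checking that the stub received 15; testing method2 needs no B of any kind.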
This also embodies the peer-to-peer communication I mentioned. In the
second case the Bref.doIt call is just a message sent to a B. A doesn't
care what B does in response. Someone else is responsible for invoking
A.method2 at the right time. But A's implementation does not care about
who or when that is either.
In peer-to-peer communications methods tend to be small and standalone,
capturing very simplistic, intrinsic behaviors. One obtains complex
solution behaviors by connecting them up with messages at a higher level
of abstraction than individual behaviors. (In UML modeling this would
be done at the Interaction Diagram level.) So this is not so much about
what functionality A has as it is about how A is organized and how A and
B collaborate.
[To get back to my skipping of how one gets Bref, this is actually an
example of peer-to-peer collaboration. Bref represents navigation of a
specific relationship. The instantiation of that relationship requires
that one apply rules about who can talk to whom. But that is orthogonal
to A's responsibilities. In fact, it can damage A's cohesiveness if A
even knows what those rules are. So the "Bref.doIt(temp)" pseudocode is
just a place holder for an addressing mechanism that is driven by the
solution context rather than the intrinsic responsibilities of the
object. This decoupling is one of the crucial advantages of OO
development.]
When you mentioned a "controller" managing the passing of information
from C1 to C2, this immediately conjured up the image of the first case
where the "controller" implemented the rules of communication between C1
and C2 via something like:
    B1::method1:
        temp = C1ref.doSomething()
        C2ref.doSomethingElse(temp)
With indivisible objects and peer-to-peer communications C1 would send
the doSomethingElse message to C2 directly at the right time (e.g., when
C1ref.doSomething() completes). Also, whoever invoked B1.method1 would
send the C1.doSomething message directly to C1. That would eliminate
the need for B1 entirely.
So...
>
>
>>Finally, you are instantiating relationships by passing
>>object references, which is the worst form of coupling. This is
>>mitigated by employing a complicated interface hierarchy for restricting
>>access but that is still fragile (i.e., in each context one must ensure
>>the correct interface is used).
>
>
> What would you suggest, interfaces?
>
>
>>For example, you refer to stages but the
>>accompanying text implies that the sequencing is fairly arbitrary.
>
>
> I did try to simplify, to focus the post more on my problem of
> data-passing. In reality, the factories need to consider not only
> several layers of static configuration, but also the context, and the
> data collected from prior stages. And in some cases, this data can
> differ to the point of requiring either an amount to guide
> processing, or a list of items and quantities. Modeling these things
> as completely separate processes is not possible, as certain
> configurations may in fact require mixing these. The new design is to
> replace software that in about seven years could not be stabilised
> because the permutations were treated too specifically at the highest
> layer (not at all an OO design though - the original authors did
> introduce a class or two with about 300 lines of public field
> declarations, for example. They also wanted to use objects, see?).
I still don't know enough about the specific problem. I gather that the
"stage" is some sort of process step, that stages can be constructed and
sequenced dynamically, and there may be some concurrency or parallelism
that allows sequences of stages to interact.
However one cannot allow ignorance to stand in the way of speculating on
a possible solution. So, if these assumptions are correct, then we
might have something like:
                       1   identifies stages for
        [StageBuilder] ----------------------------------+
               | 1                                       |
               |                                         |
               |                                         |
               | constructs                              |
               |*       1  defines  1                    |
        [StageSequence] ------------- [SequenceSpec] ----|
               | 1                                       |
               |                                         |
               |                                         |
               | organizes                               |
      0..1     | * {ordered}                           * |
+---------- [Stage] ------------------------------- [StageSpec]
|         0..1 | *            defines             1
|              |
|              | collaborates with
+--------------+
where the responsibilities are:
[StageBuilder] is just a Factory. It uses [SequenceSpec] to identify
the [Stages] in a particular sequence. It then uses [StageSpec] to
instantiate the [Stages]. While doing so it instantiates the
relationships according to the [StageSpec] specification.
[StageSequence] is (for now) just an ordered collection of [Stages] that
someone accesses to navigate a sequence of [Stages].
[Stage] is an instantiation of one of your leaves. Depending on how
different the stages are, this might be subclassed.
[SequenceSpec] essentially just manages the grouping of [StageSpecs].
If [StageSequence] has problem responsibilities beyond simply being a
collection, these would be specified here.
[StageSpec] just provides the instantiation data that [StageBuilder]
needs to instantiate everything.
The reflexive relationship on [Stage] is intended to capture the
inter-stage relationships needed for navigation. There are several ways
to do this, depending on what kind of collaborations are required. For
example, this might be where Observer would be useful. This is where
more information is most sorely needed.
Note that the Factory is a unique class that only understands the rules
for construction. It has no problem semantics beyond what subclasses of
[Stage] to instantiate, the initial state (attribute values), and
relationship participation. And hopefully most of that is defined by
the specification objects. If stages are complicated the Factory itself
may need different algorithms (e.g., the Strategy pattern, triggered by
[StageSpec]).
That separation of construction rules from execution rules provides a
fairly robust structure. As mentioned, we can address complexity with
subclassing of any of the main classes. But such subclassing is limited
to a narrow perspective. That is, while there may be some parallelism,
the subject matters are quite different.
One also has the option of expanding particular elements without
affecting the other elements. Using Strategy to handle different
factory algorithms is an example. Another is using Observer to manage
the reflexive relationship.
Finally, note the use of specification classes. This is a very useful
tool for providing parametric polymorphism in an application.
[SequenceSpec] and [StageSpec] are just dumb data holders so they could
be initialized from external configuration data. That would allow one
to modify the stages and sequences of stages without touching the code
at all.
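
As a rough Python sketch of how these pieces might fit together: the class names follow the diagram, while the field names, the two Stage subclasses, the "kind" lookup table and the config dictionary are assumptions made for illustration. A plain list plays the part of [StageSequence], and a single "next" link is only one possible reading of the reflexive relationship:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Type


@dataclass(frozen=True)
class StageSpec:
    """Dumb data holder: what to instantiate for one stage."""
    name: str
    kind: str


@dataclass(frozen=True)
class SequenceSpec:
    """Dumb data holder: which stages form a sequence, in order."""
    stage_names: Tuple[str, ...]


class Stage:
    def __init__(self, name: str) -> None:
        self.name = name
        self.next: Optional["Stage"] = None  # the reflexive relationship


class LookupStage(Stage):
    pass


class ReportStage(Stage):
    pass


class StageBuilder:
    """Factory: knows construction rules only, no processing logic."""

    def __init__(self, kinds: Dict[str, Type[Stage]], specs: List[StageSpec]) -> None:
        self._kinds = kinds
        self._specs = {spec.name: spec for spec in specs}

    def build(self, sequence: SequenceSpec) -> List[Stage]:
        stages = []
        for name in sequence.stage_names:
            spec = self._specs[name]
            stages.append(self._kinds[spec.kind](spec.name))
        # Instantiate the inter-stage relationships while constructing.
        for current, following in zip(stages, stages[1:]):
            current.next = following
        return stages


# Stand-in for external configuration data (e.g. parsed from a file).
config = {
    "stages": [
        {"name": "customer", "kind": "lookup"},
        {"name": "orders", "kind": "lookup"},
        {"name": "summary", "kind": "report"},
    ],
    "sequence": ["customer", "summary"],
}

builder = StageBuilder(
    kinds={"lookup": LookupStage, "report": ReportStage},
    specs=[StageSpec(**entry) for entry in config["stages"]],
)
sequence = builder.build(SequenceSpec(tuple(config["sequence"])))
```

Leaving "orders" out of the sequence is purely a change to the configuration data; none of the classes need editing.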
>
>
>>There is nothing wrong with me that could
>>not be cured by a capful of Drano.
>
>
> What is Drano? I normally use coffee as the solution to everything
> :-)
In the USA Drano is a brand name of a popular remedy for fixing clogged
toilets.
*************
There is nothing wrong with me that could
not be cured by a capful of Drano.
H. S. Lahman