Encapsulation in FP

jonT

Feb 6, 2006, 8:50:08 AM
Hi,

I'd hope that this thread won't turn into a flame fest. It's quite clear
to me that paradigm choice is very dependent on the domain of the
problem and that usually a mix of paradigms works best. I've searched
ever so many newsgroups and the web for an answer... perhaps you can
help?

The domain of my problem is that of developing a type-aware shell. My
assertion: "ls" should return a list of file items, which we can query:
what is its path? permissions? size? The path should be of type Path,
the permissions of type FilePermissions, and the size of type FileSize
(or similar). The shell should work on strongly typed data - so I can
sort the output of "ls" by "size", not "the third column in a piece of
text". This is my assertion and I write it solely to describe the
context.

As a kid growing up with OOP being the most popular paradigm, I find it
very easy to develop something along the lines of the code at the
bottom of this post (note it's not fully typed...). At the same time,
I'm very interested in the way functional programming can be used to
develop concise, readable code and I use the paradigm in my daily
programming (hence my preference for Ruby over Java, etc).

The question is: I've never seen a code example of FP providing the
abstractions that we take for granted under OOP (by OOP I mean a
language supporting the majority of
<http://www.paulgraham.com/reesoo.html>). This isn't to say that FP
doesn't provide abstraction - indeed one often works at a higher level
than with OOP. But how can I, as a library user, find out what I can do
with some typed data item? In an OOP language, it's as easy as

my_file = File.new("path")
my_file.methods

How does this work in FP? One of the things I dislike about the shell
is that one can't say "what can I do with a file?" Instead I have to
memorise what every app in /usr/bin does. FP seems perhaps similar to
me. Functions operate on some_data_item passed as an argument. I have
to know what that function is. Modules help us to group similar
functions, but I can't do what I just did above?

I understand that FP works at a greater level of genericity |
genericness, similar in a way to Ruby's ducktyping. Functions don't
necessarily work on data items of a particular type. But what about
when we want - as library writers often do - to constrain the user?

This isn't a troll. It's a genuine request by a confused computer
scientist to understand how FP works for code of more than a few lines.

Code examples or links would be appreciated.

Thanks,
Jon T


require 'etc'

# Placeholder for the FileSize type, which this post leaves undefined.
FileSize = Struct.new(:bytes) do
  include Comparable
  def <=>(other)
    bytes <=> other.bytes
  end
end

class DirectoryEntry
  attr_reader :mode, :owner, :group, :size, :mtime, :path

  def initialize(abs_path)
    st = File::Stat.new(abs_path)
    @mode  = format("%o", st.mode)[-3, 3]  # permission bits, e.g. "644"
    @owner = Etc.getpwuid(st.uid)
    @group = Etc.getgrgid(st.gid)
    @size  = FileSize.new(st.size)
    @mtime = st.mtime
    @path  = abs_path
  end
end

class JFile < DirectoryEntry
end

class JDir < DirectoryEntry
end

# List a directory (default: the current one) as typed entries, sorted by path.
def ls(dir = Dir.pwd)
  Dir.entries(dir).map { |name|
    path = File.join(dir, name)
    File.directory?(path) ? JDir.new(path) : JFile.new(path)
  }.sort_by(&:path)
end

# call; returns a file listing sorted by file size
ls.sort_by(&:size)

Philippa Cowderoy

Feb 6, 2006, 11:16:07 AM
On Mon, 6 Feb 2006, jonT wrote:

> I understand that FP works at a greater level of genericity |
> genericness, similar in a way to Ruby's ducktyping. Functions don't
> necessarily work on data items of a particular type. But what about
> when we want - as library writers often do - to constrain the user?

In Haskell, this is what type classes are for - a type class's instances
are those types which support the operations of that class. In general you
can't find out if something of unknown type belongs to a given typeclass -
however, several modern implementations have the derivable Typeable class,
which offers what you want.
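
A loose analogy in the thread's own Ruby may help here. It is only a sketch, not Haskell, and nothing in it is checked at compile time: a mixin module stands in for a class of operations, a type opts in by supplying one primitive, and membership can be queried at run time. The FileSize struct is a hypothetical stand-in for the type named in the original post; Comparable and Module#include? are standard Ruby.

```ruby
# Hypothetical FileSize type; the original post names it but never defines it.
FileSize = Struct.new(:bytes) do
  include Comparable # opt in to the "things that can be ordered" interface

  # The single primitive Comparable needs; <, >, between? etc. are derived.
  def <=>(other)
    bytes <=> other.bytes
  end
end

small = FileSize.new(512)
large = FileSize.new(4096)

puts small < large                  # comparison derived from <=>
puts [large, small].min.bytes       # Array#min relies on <=> as well
puts FileSize.include?(Comparable)  # membership queried at run time
puts String.include?(Comparable)    # built-in types opt in the same way
puts Hash.include?(Comparable)      # Hash does not
```

The important difference from Haskell is that Ruby discovers a missing operation only when the code runs, whereas a type class constraint is rejected by the compiler.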

--
fli...@flippac.org

Society does not owe people jobs.
Society owes it to itself to find people jobs.

jonT

Feb 6, 2006, 12:42:11 PM
>In Haskell, this is what type classes are for - a type class's instances
>are those types which support the operations of that class. In general you
>can't find out if something of unknown type belongs to a given typeclass -
>however, several modern implementations have the derivable Typeable class,
>which offers what you want.

Thanks for the info. The implication I get is that it isn't commonplace
to do so. Part of what I have trouble comprehending is how any
large library could be implemented in a functional style. Often when
writing Java or C#, one will use the autocompletion ("Intellisense") to
introspect on an object's properties and methods as one codes. Is this
not the same in FP? Would it be the case that fewer functions would be
available (at no loss of productivity) or is it that FP is rarely used
in these more "mainstream" applications?

Ulf Wiger

Feb 7, 2006, 3:02:48 AM
"jonT" <j...@Tippell.com> writes:

> Can I as a library user find out what I can do with some typed data
> item? In an OOP language, it's as easy as
>
> my_file = File.new("path")
> my_file.methods
>
> How does this work in FP? One of the things I dislike about the shell
> is that one can't say "what can I do with a file?" Instead I have to
> memorise what every app in

The way we work in Erlang is to create 'modules' with a
list of exported functions. These modules tend to offer
the kind of abstraction you're looking for, but from a
slightly different perspective. That is, you don't start
with the data and attach methods to the data, but rather,
by grouping functions into manageable entities. Each
module usually has its own view of what data types are
reasonable to work with. There is no inheritance.

Now, Erlang is a dynamically typed language; each data
object has a type that is known/detectable at runtime.
For one thing, this means that modules can be runtime
polymorphic (determine what needs to be done based on the
types of function arguments.)

In your example, the 'file' module may be a good example:

...> erl -boot start_clean
Erlang (BEAM) emulator version 5.4.12 [hipe] [threads:0] [kernel-poll]

Eshell V5.4.12 (abort with ^G)
1> file:module_info(exports).
[{pid2name,1},
{rename,2},
{make_dir,1},
{del_dir,1},
{altname,1},
{read_link_info,1},
{read_link,1},
{write_file_info,2},
{make_link,2},
{make_symlink,2},
{write_file,2},
{write_file,3},
{file_info,1},
{raw_read_file_info,1},
{raw_write_file_info,2},
{rawopen,2},
{pread,2},
{pread,3},
{write,2},
{pwrite,2},
{pwrite,3},
{sync,1},
{truncate,1},
{copy,2},
{copy,3},
{ipread_s32bu_p32bu,3},
{ipread_s32bu_p32bu_int,3},
{path_consult|...},
{...}|...]

Other examples of modules are:
- 'filename', for manipulating file names
- 'lists', for list handling
- 'string', for string handling
- 'gen_server', for designing client-server instances
- 'timer', for various timer operations
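
For readers coming from Ruby, the same idea can be sketched there: group plain functions in a module and ask the module, rather than the data, what it offers. FileOps and its three functions are hypothetical names chosen for illustration; singleton_methods and Method#arity are standard Ruby, used here as a rough counterpart of file:module_info(exports).

```ruby
# Hypothetical module grouping file-related functions, in the spirit of
# Erlang's 'file' module. The data (a path string) carries no methods itself.
module FileOps
  module_function

  def size(path)
    File.size(path)
  end

  def directory?(path)
    File.directory?(path)
  end

  def extension(path)
    File.extname(path)
  end
end

# Ask the module for its public functions and their arities.
exports = FileOps.singleton_methods.sort.map do |name|
  [name, FileOps.method(name).arity]
end

exports.each { |name, arity| puts "#{name}/#{arity}" }
```

This answers "what can I do with a file?" at the module level: the question is put to the namespace that owns the operations, not to an object.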

Our experience from building large systems is that it is
very difficult to manage globally typed objects with
attached methods, as this introduces a tight design-time
coupling between design teams. A very common result is that
each design team ends up designing its own class structure
and marshalling the objects in the interfaces. This can be
likened with human organisations, which all have their own
set of forms and procedures.

By decoupling methods and data, and relying on runtime
polymorphism, we are able to design very large systems
that are surprisingly easy to maintain and develop
further. In our line of business (telecoms), this is
one of the biggest challenges. Also, the requirement
to keep systems up 24x7 means that we need to be good
at replacing parts of the system while it's running.
The module system, together with dynamic typing, makes
this relatively straightforward.

Dynamic typing is not a requirement for using this
method of abstraction. With static typing, and by
loosening the requirement for in-service upgrade,
several useful optimizations and compile-time
checks become possible.

The abstraction techniques can remain the same.
It is reminiscent of how hardware is designed, using
discrete components with well-defined signal
interfaces. Basically, traditional Electrical
Engineering practices with black boxes, transfer
functions, etc.


Regards,
Ulf W
--
Ulf Wiger, Senior Specialist,
/ / / Architecture & Design of Carrier-Class Software
/ / / Team Leader, Software Characteristics
/ / / Ericsson AB, IMS Gateways

rao...@gmail.com

Mar 1, 2006, 6:53:06 PM
Ulf Wiger wrote:
> Our experience from building large systems is that it is
> very difficult to manage globally typed objects with
> attached methods... This can be likened with human
> organisations, which all have their own set of forms
> and procedures.
>
> By decoupling methods and data, and relying on runtime
> polymorphism, we are able to design very large systems
> that are surprisingly easy to maintain and develop
> further.

These are very intriguing statements! I am interested in learning more
details about the benefits gained, especially with respect to the
interaction of human organizations. Are there further writings about
these experiences you could refer me to, or perhaps some stories of
your own to relate? (I'm normally a static-typing bigot and want to
learn more about what I'm missing or impeding.)

many thanks!

Ulf Wiger

Mar 3, 2006, 12:14:08 PM
rao...@gmail.com writes:

Well, I don't know how much of _that_ is actually published, but a
description of how we went about using Erlang in a large organisation
can be found here:

"Four-Fold Increase in Productivity and Quality"
http://www.erlang.se/publications/Ulf_Wiger.pdf
(Presented at FemSYS 2001)

Chapter 6 describes how we worked and gives a
bird's-eye view of the system architecture.

While reading, you may try to envision how the AXD 301 started
out as an ATM switch, but gradually became a "media gateway" for
telephony over ATM, and later morphed into doing telephony
over IP (actually a pretty big step). The key in this market is
continuity - you don't necessarily get a chance to do massive
rewrites, so loosely coupled systems tend to fare better in the long run.
Also, the ability to introduce "patches" in running systems is
invaluable, esp. during the test and debug phase.

Now, we have a tendency to end up with fairly large organisations.
This is partly due to the fact that we have to cope with a large number
of customers with long lists of requirements, and, broken down into
manageable blocks, we still end up with a fairly large number of
programmers. To that, we have to add a few dozen analysts just to
interface with all the people putting together the network solutions.
We've had competitors who seemed to outrun us for a while, but who
ultimately failed to manage the large mass of requirements with
their slim organisations.

I can't say whether this all amounts to a requirement to have
dynamically typed systems. I don't think it does. There may
be some disconnect in that industry is sometimes indicted for
running absurdly large and incompetently managed projects, and
that the proper solution would be to use small teams with very
competent people (who would of course also pick the best tools
for the job).

While there may be some truth to this (and I've often been advocating
something along those lines myself), there's a limit to how much you can
break with tried and trusted ways of working (WOW, for those of you
who are not up to speed on industry jargon.)

BTW, the flagship (still...) of Ericsson is the AXE switch,
which uses its own proprietary programming language, originally
designed for proprietary CPUs. Based on an '80s design, it
nowadays uses ultra-modern hardware (AMD64, running Linux)
and a virtual machine with JIT compilation and everything, but
is still bug-compatible with the old systems. The design environment
for the AXE is very specification heavy. They can afford it due to
their huge turnaround. Some glimpses into this strange world can
be found on the net:

http://www.mrtc.mdh.se/SoftRT/

where you can also find a cute animated PowerPoint presentation
on Erlang (if you're into message passing being illustrated as
animated PowerPoint arrows, that is ;-)

The general feeling might be that the old way of doing things in
the telecom industry is a poor fit for the fast-paced market of
today. Still, everyone yearns for the kind of robustness that those
old systems had... When one factors in the cost and delays caused
by poor quality in C++ applications, the total cost and lead times
may not be that much better today (note: pure speculation on my
part - not backed up by any secret data from within the catacombs
of Ericsson)

Paul Graham may be right in observing:
"I mean business can learn about new conditions the same way a gene
pool does. I'm not claiming companies can get smarter, just that
dumb ones will die."
(http://www.paulgraham.com/opensource.html)

OTOH, gene pools are in no particular hurry. For those who want to
make an impact with the dinosaurs of today, it's best to spend time
on learning how they operate and learning how to speak their language.
Don't do what Joe Armstrong did once, when he was young and bashful:
Barge in on us honest and hard-working people at the AXD 301 project
and proclaim "you could do all of this with only 6 good
programmers!!!" (we were 200 at the time.) Joe later recanted, and
nowadays works in our unit. We're still about 200 strong. (:

If all this didn't help a bit, perhaps you can phrase some
specific questions instead? ;-)

Our current direction is to use Erlang, Dialyzer and (hopefully
soon) Erlang QuickCheck. For a dynamically typed environment, that's
a pretty potent mix. Dialyzer was a shoo-in from the start, because
it didn't impose a new paradigm. QuickCheck is trickier, because it
really does a number on your head. We're finding some backdoors
through which we can sneak it in, and we hope to start a revolution.

Regards,
/Ulf W

rao...@gmail.com

Mar 6, 2006, 5:44:36 PM
> If all this didn't help a bit, perhaps you can phrase some
> specific questions instead? ;-)

Many thanks for the extensive information and thoughts! I will read the
papers etc. you have mentioned and see what I can learn, to try to stay
out of that 'death' part of the corporate gene pool...

sincerely.

rao...@gmail.com

Mar 6, 2006, 7:25:15 PM
> If all this didn't help a bit, perhaps you can phrase some
> specific questions instead? ;-)

(I just read your "Four-Fold" paper, many thanks for it.)

I would be interested to know more about what you think of "decoupling
methods and data, and relying on runtime polymorphism" - I am assuming
this would be opposed to the OO style which attempts to keep methods &
data together? Was it simply that it allows disparate teams to have
their own methods on data without having to coordinate in some tortuous
fashion? Perhaps OO with some level of static typing falls apart in a
real world situation that must allow for easy change?

many thanks.

Ulf Wiger

Mar 7, 2006, 10:02:30 AM
rao...@gmail.com writes:

<standard_disclaimer>
First of all, these are my personal reflections and
not necessarily the views of my company.
</standard_disclaimer>

In our production code, I just did a search on calls
to the 'lists' module (a collection of polymorphic
iterator functions and other generic operations on lists)
and the pattern 'fun(', signifying the definition of a
higher-order function in Erlang. I focused on one
subsystem - one that has fairly recently written code.

In 154K lines of code, a simple 'grep' revealed 937 calls
(6%) to 'lists:...' and 452 (3%) declarations of higher-
order functions(*). Perhaps even more interesting, there
were 198 instances (1.3%) of list comprehensions (only
200 generators, though, so nearly all lcs are simple).

(*) Granted, many calls to the lists module will
include a fun() (e.g. foldl, map, foreach, ...)

I think it's safe to say that even "average" industrial
programmers rather quickly learn to exploit the virtues
of higher-order functions and iterators.

Decoupling data and functions is obviously fairly straightforward,
and is basically what programmers have been doing all along - you
spend time laying down a good data model, and then devise programs
to operate on the data. Only COBOL and OO programmers approach
things differently. To me, it feels a bit awkward to define a data
object and then be forced to think of all possible ways it might be
accessed. If the data changes over time, it's pretty easy with
pattern matching and polymorphism to handle both new and old
versions in the affected API functions.
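
A small Ruby sketch of that last point follows. Ruby has no function-head pattern matching of the Erlang kind, so a case expression on the runtime shape of the argument stands in for it. Both record layouts and the entry_size function are hypothetical, invented for illustration.

```ruby
# Two hypothetical versions of the same record:
#   old: [path, size_in_bytes]                  (a bare two-element array)
#   new: { path: ..., size: ..., owner: ... }   (a hash with an extra field)

# One API function accepts both shapes, so callers need not care which
# version of the data they happen to hold.
def entry_size(entry)
  case entry
  when Hash
    entry.fetch(:size)
  when Array
    _path, size = entry
    size
  else
    raise ArgumentError, "unrecognised entry: #{entry.inspect}"
  end
end

old_entry = ["/tmp/a.txt", 120]
new_entry = { path: "/tmp/b.txt", size: 4096, owner: "jon" }

puts entry_size(old_entry)
puts entry_size(new_entry)

# Mixed old and new records can be processed together.
sizes = [new_entry, old_entry].map { |e| entry_size(e) }.sort
puts sizes.inspect
```

Only the API function knows about the two layouts; code that calls entry_size is unaffected when the data model gains a field.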

We tend to let the "block" designers define their own
data model and have them specify function APIs to their
external users. This basically means that we don't export the
data model as part of the block interface, partly because doing
so in Erlang tends to introduce compile-time dependencies that
become difficult to handle in large systems such as ours.
With "black box" design and polymorphic interfaces, we can
rather easily do integration in stages. As we currently
develop ca 8 products in parallel, this is absolutely
necessary.

My own impression is that OO modeling gets into trouble
very easily in large projects. In order to gain control
of their own environment, designers write stub code to
transform data behind the scenes, so that they can keep
their own class structures relatively free from ripple
effects. Most people would probably agree that the way
to build large systems with OO is through loose coupling
and polymorphism, making sparing use of inheritance.
My problem with it is the feeling that the model doesn't
encourage designers to do so. You depend heavily on
best practices based on years of experience getting it
wrong. I think Erlang and similar languages rather
encourage a style of programming that naturally avoids
many of the pitfalls. This is what good design tools should do.

Regards,

rao...@gmail.com

Mar 7, 2006, 9:06:13 PM
> Decoupling data and functions is obviously fairly straightforward,
> and is basically what programmers have been doing all along - you
> spend time laying down a good data model, and then devise programs
> to operate on the data. Only COBOL and OO programmers approach
> things differently. To me, it feels a bit awkward to define a data
> object and then be forced to think of all possible ways it might be
> accessed. If the data changes over time, it's pretty easy with
> pattern matching and polymorphism to handle both new and old
> versions in the affected API functions.

Thanks for your comments, I really appreciate hearing from somebody
with such extensive experience.

I've never actually used multimethods (I'd love to carve out the time
to really learn CLOS and Dylan, in addition to all those other
languages I want to know) but I wonder to what degree they are an
admission that Java/C#/C++ style OO is intellectually bankrupt?
Retro-fitting multimethods to things like Java is also kinda gross,
from what little I've seen.

sincerely.

Ulf Wiger

Mar 8, 2006, 2:55:30 AM
rao...@gmail.com writes:

> but I wonder to what degree they are an
> admission that Java/C#/C++ style OO is intellectually bankrupt?
> Retro-fitting multimethods to things like Java is also kinda gross,
> from what little I've seen.

I don't have an answer to that. There is of course the
push towards aspect-oriented programming in order to
counter the worst side-effects of OO. I don't think
I've made up my mind yet regarding the effects of
AOP on system integrity and long-term maintenance.
It could be a way to ease the transitions between major
rewrites, but I don't have enough data.

Ulf Wiger

Mar 9, 2006, 7:35:38 AM
Ulf Wiger <ulf....@CUT-ericsson.com> writes:

> In 154K lines of code, a simple 'grep' revealed 937 calls
> (6%) to 'lists:...' and 452 (3%) declarations of higher-
> order functions(*). Perhaps even more interesting, there
> were 198 instances (1.3%) of list comprehensions (only
> 200 generators, though, so nearly all lcs are simple).
>
> (*) Granted, many calls to the lists module will
> include a fun() (e.g. foldl, map, foreach, ...)
>
> I think it's safe to say that even "average" industrial
> programmers rather quickly learn to exploit the virtues
> of higher-order functions and iterators.

Correcting the obviously erroneous math, all the
percentages get adjusted downwards by a factor of 10,
making my conclusion quite dubious. Sorry 'bout that.
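
The corrected figures can be recomputed from the counts quoted above. This is a quick sketch that reads "154K lines" as exactly 154,000, which is an assumption; the three counts are the ones given in the quoted text.

```ruby
# Figures quoted in the thread; "154K lines" is read as 154,000 (assumption).
total_lines = 154_000
counts = {
  "lists: calls"        => 937,
  "fun( declarations"   => 452,
  "list comprehensions" => 198
}

# Share of total lines, as a percentage rounded to two decimals.
percentages = counts.transform_values { |n| (100.0 * n / total_lines).round(2) }

percentages.each { |label, pct| puts format("%-20s %5.2f%%", label, pct) }
```

That gives roughly 0.61%, 0.29% and 0.13%, a tenth of the 6%, 3% and 1.3% first stated.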

/Ulf W

Ulf Wiger

Mar 9, 2006, 9:30:55 AM
Ulf Wiger <ulf....@CUT-ericsson.com> writes:

> Correcting the obviously erroneous math, all the
> percentages get adjusted downwards by a factor of 10,
> making my conclusion quite dubious. Sorry 'bout that.

To compensate, here's a link to Joe Armstrong's PhD
talk, where he expands on how to program, how to build
systems, etc. using Concurrency as a modeling paradigm.

http://www.sics.se/~joe/talks/systems.pdf

Also, in

http://www.erlang.se/workshop/2004/ex11.pdf

Joe argues (not very visibly here) that message-passing
interfaces are much more succinct than functional APIs.
His example is the X11 protocol, which consists of 154
protocol messages. XLib, the programming interface, has
about 800 interface functions. The reason, according to
Joe, that programmers don't use the protocol level directly,
is that they get the concurrency wrong. So, in order to
hide the natural concurrency of the problem, one devises
APIs that quickly grow enormous.

I think this is a related problem.

rao...@gmail.com

Mar 10, 2006, 7:31:54 PM
Hi, I'd read Dr. Armstrong's thesis, it is great material. I'll read
the workshop paper, thanks for the pointer!

sincerely.
