General Question: Object Persistence Mechanism


lrix

Jan 11, 2016, 9:32:16 AM
to Eiffel Users
NEED: An innate, compiler-known object-persistence mechanism, whereby objects are tracked for version (at design time), version-updated at run time (by comparing the in-memory object with its persisted counterpart), and have attribute changes persisted automatically.

What would be nice is something akin to the Design-by-Contract system, where the compiler is aware of the persistence-engine API and generates C code to hand attribute changes over to the persistence engine as those changes happen. From there, it is up to the persistence engine to manage caching and queuing object data and (ultimately) persisting it to a repository (local, network, cloud, etc.).
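To make the idea concrete, here is a rough sketch of the sort of thing I am imagining. Nothing below exists today: the `persistence' note entry is purely made up, and the comments describe behaviour I am wishing the compiler would generate, not anything it actually does.

note
    persistence: "durable"
        -- Hypothetical marker: a persistence-aware compiler would track this
        -- class's version and auto-persist its attribute changes.
class
    SALES_ORDER
feature -- Access
    total: REAL_64
            -- Order total.
feature -- Element change
    set_total (a_total: REAL_64)
            -- Set `total' to `a_total'.
            -- A persistence-aware compiler could generate, right here, a hand-off
            -- of the changed attribute to the persistence engine; caching, queuing
            -- and the eventual write to a repository would be the engine's job.
        do
            total := a_total
        ensure
            total_set: total = a_total
        end
end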

I know ABEL is headed in this general direction, but what I need and want to know is: are there any plans (in the near future) to bring such a mechanism to the Eiffel language, compiler, and method?

I am aware of at least one thesis paper on the subject and know there is some active research in this direction. I am curious about the state of that research (or other similar) and what we (as a community) ought to expect—beyond "it-will-get-done-but-you-will-have-to-wait-for-it-or-contribute". I would be happy to contribute, but I do not think I have the level of know-how (presently) to appropriately contribute. Perhaps I need to learn, eh?

I do have a group of folks in Savannah, GA, USA that are forming up to do various work, but not all of it will be in Eiffel (it is a mixed bag). I can put the notion forward to some of them and see what they think.


Thanks!

Ian Joyner

Jan 11, 2016, 8:06:01 PM
to Eiffel Users
I can tell you what I’d expect, which I think should be the fundamental requirement.

That is, the memory implementation hierarchy is hidden rather than exposed. Thus an object is written to durable store at the end of a state change, but the programmer does not do this explicitly. Marking objects as permanent or temporary should be done in metadata (ECF or whatever it's called these days), not in the code.

An example is Apple's Core Data, where objects are defined graphically and there is a radio button where the programmer chooses permanent or temporary. The programmer needs to write no code; the Core Data engine takes care of the rest. For the end user, Save (which exposes the memory hierarchy) is no longer needed.

I think that is more or less what you are saying, but I wanted to make it really clear and explicit. Do others agree this is a fundamental of object persistence?

Ian


Bernd Schoeller

Jan 12, 2016, 1:28:08 AM
to eiffel...@googlegroups.com
Hi -

just out of curiosity: is this an academic or a 'real-world' problem you are trying to solve?

If it is a 'real-world' problem you are trying to solve, the answer to object-persistence mechanisms is: just don't do it.

In reality, data is much longer lived than code. While a good piece of software might live for one or two decades before some serious rewriting is required, data is carried over from one system to the next and is an asset to the business.

The object-oriented blending of code and data is its greatest strength and its greatest weakness at the same time. But the problem also exists with non-OO languages.

The solution is to use programming language dependent data models for all transient information, but to select a code independent data model for your persistent data (examples: SQL tables, XML or JSON, ASN-1, plain ASCII files, CSV ...).

Also: you are trying to work in a multi-language environment. The object model is always very different between languages and creates long-term inconsistencies. Look at the major problems Eiffel had trying to map its object model to the .NET one (they did a fantastic job, but it still is a complex beast).

I am talking from years of experience in the industry with object-oriented systems, where after a few years the persisted object files always became a liability and people tried to move away from them (Eiffel storables, serialized Java, Python 'pickle' files). But that was always an extremely painful process, because the data was still needed while the code was not.

Bernd

Paul Cohen

Jan 12, 2016, 2:16:24 AM
to eiffel...@googlegroups.com
Hi,

I absolutely agree with Bernd.

Make a clear distinction between the runtime transient representation
of data and the representation of the persistent data. If you are
working with big data or a system that must scale to many users with
high requirements on performance and availability, it's even more
important. In the case of many real-world systems you can often apply
a shared-nothing/sharding architecture when it comes to data persistence.
Data sets that are not directly related to each other can and probably
should be stored separately. It will facilitate operations and also
enable better scaling and availability properties.

Object-persistence can be applicable when you actually do want to
store the state of a running system due to an explicitly stated
requirement on the system. But in my experience that is an unusual
real-world requirement.

/Paul
--
Paul Cohen
www.seibostudios.se
mobile: +46 730 787 035
e-mail: paul....@seibostudios.se

Rix, Larry

Jan 12, 2016, 8:05:48 AM
to eiffel...@googlegroups.com

Hi Ian,

 

Thanks for the response. You state that “marking objects” ought to be implemented as some form of meta-data rather than directly in the code. Why? What makes the metadata solution superior to a code notation and a persistence-aware compiler?

Rix, Larry

Jan 12, 2016, 8:20:17 AM
to eiffel...@googlegroups.com

Hello Bernd,

 

The need is real-world, and I am hoping the academic work is close enough for a real-world implementation. :-)

 

“Data lives longer than code”—indeed it does. We are in the throes of moving data from “legacy code” to a new Eiffel system. What you’re really saying is that the data has more business value than the code that produced it. True that!

 

Code independent data model: Of course. Data is data at the end of the day—regardless of who or what consumes it.

 

What I am not talking about is Eiffel storables, serialized Java, or Python ‘pickle’ files. This is why I mention ABEL, because how the objects are persisted is of secondary importance. That is a choice made for other reasons. What I am referring to are programmer selected objects that persist without the additional labor of having to write a persistence layer of code.

 

So—whether the ultimate persistence mechanism is SQL, XML, JSON, YAML, or others like them, the result is the same: a safe and sane system of object persistence that removes the burden of programmer labor, similar to how SCOOP removes the need to write threading code or how the Eiffel compiler removes my need to write complex C (cross-platform and more).

 

Nevertheless—your point is well taken! :-)

 


Ian Joyner

Jan 12, 2016, 11:43:06 PM
to Eiffel Users
Hi Larry,

I think it is because Core Data describes objects separately from the code (or it used to). That seemed a very good way to go. The compiler is still persistence-aware - or at least the runtime is - it has just got that information from one place. What I am most concerned with is that the definition of persistence lives in only one place; that is separation of concerns.

In particular, an application programmer should not be thinking in terms of a memory hierarchy and have code that says “ok, I’ll write that out to disk now”. Application programmers should think only in terms of a flat memory model (Turing machine) as far as implementation goes - on top of that, memory is organised into structures to suit the application, not to suit the computer’s memory organisation.

Even files should be treated as an abstraction, not as a memory-level mechanism. I think this is what Bob Barton and his group would have thought when they invented the file system (at least, articles I have read in the last few days indicate that file systems with a directory structure, like Unix's, were invented by Barton and co, but maybe that is something they also got from Manchester, like virtual memory).

Ian

Ian Joyner

Jan 12, 2016, 11:43:08 PM
to Eiffel Users
Actually, I think academic vs real world is somewhat of a false dichotomy. Academia tries to ignore parts of the real world so that a real-world solution can be designed without being confused by real-world noise. That is abstraction. In the real world, the noise must be addressed, and an elegant and efficient solution might not be reached.

Ian

Bernd Schoeller

Jan 13, 2016, 2:04:26 AM
to eiffel...@googlegroups.com
The 'academic' vs 'real-world' comparison was concerned with the life-time and maintainability of a solution. Academia rarely thinks beyond the next paper or PhD thesis. It is all very much "fire and forget".

Best motto when it comes to software design in the industry: always code as if the next person to pick up your project is a mass-murdering serial killer.

Bernd

Bernd Schoeller

Jan 13, 2016, 2:19:53 AM
to eiffel...@googlegroups.com
Hi Larry,

Ok - I get your point and understand what you want.

You are looking for a versioned (in data model and data content) storage mechanism. And it makes absolute sense to ask: if I am already modelling my data in a programming language, why can't I also use this data model for the generation of my data storage?

The problem that I see is that Eiffel code is too powerful, as it is not declarative but procedural. One would probably need to define a subset of Eiffel, but then we lose a lot of the power of Design by Contract. For example, if I have a complex invariant, I just code it down with the full support of the Eiffel language. Now, if I change that code, it is impossible to detect whether the new code describes the same invariant or a different one. So: should the version number change or not? And what to do with the existing data?
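A contrived, made-up illustration of what I mean (the class is invented):

class
    BANK_ACCOUNT
feature -- Access
    balance: REAL_64
    overdraft_limit: REAL_64
invariant
    -- First formulation:
    within_limit: balance >= -overdraft_limit
    -- A later edit might replace that clause with
    --     within_limit: balance + overdraft_limit >= 0
    -- The two happen to be equivalent, but for arbitrary invariant code no tool
    -- can decide equivalence in general, so should the data-model version change?
end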

To answer your original question: no, I am not aware of any development in the Eiffel language to support such a mechanism. I know about a number of projects that store versioned information _using_ Eiffel (into a Git repository or a specially prepared SQL database). But that does not have any specific support in the compiler or run-time.

Bernd

Ian Joyner

Jan 13, 2016, 2:55:59 AM
to Eiffel Users
Hi Bernd,

On 13 Jan 2016, at 18:04, Bernd Schoeller <bernd.s...@gmail.com> wrote:

The 'academic' vs 'real-world' comparison was concerned with the life-time and maintainability of a solution. Academia rarely thinks beyond the next paper or PhD thesis. It is all very much "fire and forget".

Best Motto when it comes to software design in the industry: Always code as if the next person to pick-up your project is a mass-murdering serial killer.

I certainly agree with you here, but my experience in industry is that most practitioners do exactly the opposite - get the job done as quickly as possible so it looks as if it is working, move on, and let others pick up the mess. Of course, Eiffel programmers are different because they have bothered to learn the academic principles of producing quality software. I do agree with your characterization of academia somewhat. But maybe there is a difference between academia and academic.

Ian

Ian Joyner

Jan 13, 2016, 7:59:55 AM
to Eiffel Users
Are we thinking of persistence the wrong way round, as something exceptional, when it should be the default?

It might actually come naturally out of DbC - once a unit of work is completed, its results are saved automatically after the ensure check. This is like transactions. Transactions can also be nested.

Any data inside a transaction is temporary, or as Turing puts it, a ‘rough note’:

“... will form the sequence of figures which is the decimal of the real number which is being computed. The others are just rough notes to ‘assist the memory’. It will only be these rough notes which will be liable to erasure.” (On Computable Numbers, with an Application to the Entscheidungsproblem)

Thus a system would automatically know what is important and what was just rough calculation. As in transaction processing, once a unit of work is complete, it is automatically made permanent (the D in ACID).
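As a purely hypothetical sketch (no Eiffel run-time does anything like this today), the idea would be that once the postcondition of a state-changing routine has been checked, the run-time makes the new state durable, exactly as a transaction commit would:

class
    ACCOUNT
feature -- Access
    balance: REAL_64
feature -- Element change
    deposit (an_amount: REAL_64)
            -- Add `an_amount' to `balance'.
            -- Hypothetically: after the `ensure' clause below is verified, the
            -- run-time would commit the new state to durable storage; inside a
            -- nested transaction the result would only be durable up to the
            -- boundary of the enclosing transaction.
        require
            positive: an_amount > 0
        do
            balance := balance + an_amount
        ensure
            increased: balance = old balance + an_amount
        end
end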

 Ian

Rix, Larry

Jan 13, 2016, 8:46:00 AM
to eiffel...@googlegroups.com

If the data is something the overarching system cares about, I would think that your assertion is true: once the routine clears its postconditions, the relevant data ought to be persisted.

 

In response to a point Bernd made about Data v. Code: it is not entirely true that data exists apart from the code that created it. This is especially true for data that has been dynamically derived, as opposed to static data (like user input). A derived datum is absolutely tied to the version of the algorithm that produced it. Saving the data apart from the producing algorithm is to partially or fully lose the capacity to understand the data. In a best-of-all-worlds scenario, it seems to me that one would store: 1) the data result(s), and 2) the producing algorithm's name and version. While this metadata would not be the algorithm itself, it would (at least) give data consumers clues about where the data came from and roughly how it was derived. So, I think it is too simplistic to claim, as a black-and-white matter, that data lives on without code.
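A rough sketch of the kind of record I am picturing (all names invented; only the shape of the stored information matters):

class
    DERIVED_VALUE
        -- A persisted datum together with minimal provenance: the name and
        -- version of the algorithm that produced it.
create
    make
feature {NONE} -- Initialization
    make (a_value: REAL_64; a_name, a_version: STRING)
            -- Record `a_value' as produced by algorithm `a_name' at `a_version'.
        do
            value := a_value
            algorithm_name := a_name
            algorithm_version := a_version
        end
feature -- Access
    value: REAL_64
            -- The stored result.
    algorithm_name: STRING
            -- e.g. "order_total"
    algorithm_version: STRING
            -- e.g. "2.1"
end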

Ian Joyner

Jan 14, 2016, 2:38:19 AM
to Eiffel Users
Yes, I think code is the interpretation of the data. If the code is lost, the data is also lost. Alan Kay realised this about what some unnamed engineer had done with the B220 file format.


I see there is a related discussion going on on SCOOP. I think this is related. We have talked about when data is made durable (persistent) and is not a rough note (Turing). But also what happens at the end of a unit of transaction work is that making it durable also makes the data (resource) available to other processes. Thus durability and availability are interlinked.

Most of these ideas are covered in chapter 4 of Gray and Reuter “Transaction Processing” for a formal discussion. I don’t think it is a very large step to make all these facilities work in Eiffel so they are seamless.

Ian

Thomas Beale

Jan 14, 2016, 5:20:55 AM
to Eiffel Users

Hi Ian,

that's only true if you have no specification of the data. In openEHR, we take that very seriously, precisely to allow for the reality that the 'code' will come and go, but the data - human health records - has to last 100 years+. See here for the specifications of the openEHR EHR - if you look inside you'll find UML models (yes, I know, don't start!) and extracted class tables, cunningly Eiffelised so that UML stupidities are somewhat corrected (see e.g. here). We also publish XSDs and other formal artefacts that describe the data and their formal semantics, independent of any extant implementation.

I can't think off-hand of any data field that is derived by an algorithm as such, but if there were, this algorithm would be documented in the above specifications, or else inline in the archetypes that define most of the content meaning.

Health authorities around the world are starting to realise that something like this is a necessary approach (6 countries now officially use these specifications in their MoH/DoH), because it is all but guaranteed that not even the back end of large EHR (electronic health record) deployments will last more than about 10 years without the code (probably the whole product) being changed. But they know they need the data for ~100 years in health.

Eiffel is probably one of about 5 languages where the 'code' could in some places be considered specification-level, or alternatively code = its own pseudo-code, to the extent that we even use somewhat disguised Eiffel as pseudo-code in the specifications, e.g. here. But realistically, the actual Eiffel code is no use as a specification of the semantics of data, for the simple reason that it is like any other code - any real implementation is thousands of classes arranged in someone's architecture, which even another aficionado of the same language will find hard to navigate. My conclusion is that algorithms, where they matter, have to be extracted out of the code and inserted into documentation.

- thomas

Rix, Larry

Jan 14, 2016, 8:39:29 AM
to eiffel...@googlegroups.com

Good morning Thomas/Ian! :-)

What an excellent discussion! Thanks for taking the time to write in such detail. I am learning and grateful for it.

 

Thomas—I can see the very good sense of having a specification of the data in such a way that a consumer of it can successfully reason about where it came from, what its purpose is, and (perhaps) how it was derived or came to exist. I also see that one would be overwhelmed to know the actual code used to derive the data and some abbreviated specification is preferable (e.g. “just enough specification”). What I am now curious about is how you all think about that in terms of algorithmic variants—that is—I have two numbers in a persisted (I like Ian’s term “durable”) database. Each number is the same field in two different records. The first is calculated by variant 1 of an algorithm and the second by variant or version 2. Given this, does the specification contain documentation about both variants? Moreover, are the data elements somehow marked with their version such that the reader of the data knows which algorithm variant was used to compute it?

 

Again—thank you gentlemen. This discussion is very helpful.

 


Thomas Beale

Jan 14, 2016, 9:51:24 AM
to Eiffel Users

Larry,

you are right of course, and delivering usable code that implements the specs at all times is also a mantra for us. If you visit openEHR on GitHub, you'll see the Java implementation of the reference model (that's what we call the published information model); on CodePlex, C# devs can find the C# classes for the same thing; people in Japan maintain a Ruby lib; and so on.

But the reality is that in 10 years' time the specifications will still have exactly the same meaning, while all of the code will have been rewritten - sometimes into greatly different versions of the same languages (imagine upgrading Java from 1.5 to 1.8, for example, or moving to Scala). Even more likely, when you consider how quickly developers change languages and technologies, is that the Java/C# (maybe even the Python impl) etc. code will be abandoned, and the contemporary favourites will be written in Rust, Elm, and who knows what else.

So the job of coding has now changed: we build models (in UML, Eiffel, and other decent tools), we generate specifications, and then those are translated into mainstream languages to build just the classes - open source, like the specs - that correspond to the spec, as a library or plugin. That code is now completely separate from the code of a full solution.

Re: the question of algorithms: the semantics of the data in real systems is mostly not expressed in its reference model (i.e. the main information model in the spec). That's what we use archetypes for - these are second-order models that encode the much vaster number of content / action / process types that occur in the domain, e.g. things like 'blood pressure measurement', 'adverse reaction' etc. in medicine. There are O(10k) of these things, and each one can have 20 data points in a specific structure. So none of those are classes in openEHR; they are archetypes. If you are curious, you can see them online here. Try double-clicking on things you see in the left-hand explorer to see what I mean.


In medicine, the question of algorithms to generate values is common - e.g. BMI is one such algorithm. One of the things we are working on now is language-independent ways of expressing such algorithms, so that they can be stored in the archetypes, not in the code. (Obviously there is a language, but I mean independent of programming languages like Java etc.)
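(To give the simplest possible example of the kind of expression I mean: BMI is just body mass in kilograms divided by the square of height in metres, i.e. bmi = weight_kg / height_m ^ 2, a one-line computation that belongs with the archetype rather than buried in somebody's Java or C# code.)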

So to summarise: what I called the specification above includes around 120 classes and is very stable. It models only generic things. The O(10k) - O(100k) domain 'models' are expressed in the archetype language, as archetypes, and form a whole other layer of software-independent models, which you could also think of as specifications; indeed, some governments use this method to formally model standard health data sets, like 'discharge summary' etc.

- thomas

Ian Joyner

Jan 14, 2016, 4:43:42 PM
to Eiffel Users
Hi Thomas and Larry,

I think we are talking at somewhat different levels. In Thomas’s case, I don’t see specification and ‘code’ as different. ‘Code’ is an unfortunate word, really going back to the 1950s, when an analyst would write an algorithm specification and a ‘coder’ would translate it to machine language. That’s what compilers do now. But from the processes (as in programming methodologies) developed in the 50s and early 60s, we still have this distinction and a whole lot of bogus methodologies which don’t understand that the fundamentals changed more than 50 years ago. I think Bertrand has written enough on that, and that everyone in the Eiffel community understands how much stuff needs to be rethought. So I’m only stating this so you know where I’m coming from.

I read the other day (I think it was a David Bowie comment) that the older one gets, the more we concentrate on fundamentals. I certainly try to do that, and particularly try to teach that to students now. Even when people hear the fundamentals, often the implication is not appreciated, and quite often I go ‘wow, I knew that for ages, but didn’t understand how all this junk in my thinking really is against that’, or I come across a simple fact or fundamental I’d never heard before (like it was Claude Shannon who first used the term bit in his paper on communication, but ‘bit’ was a word suggested by J.W. Tukey, but none of the communication or computer texts tell us that, just about Shannon’s law).

Anyway, some specific comments. I don’t think there is any difference between specification and code - at least at the level I am talking at. But I can see that Thomas is dealing with the reality that ‘code’ will have to at least be recompiled onto a new machine in 10 years’ time (unless you are still using Burroughs MCP (Unisys ClearPath), which provides the most stable machine code in the industry because it is high level; processor-specific data formats, or knowledge of data formats, can change since they are not stored in the object code as on other systems - that is Turing’s ‘rough notes’). Or that ‘code’ might have to be rewritten in a different language.

For interpretation of the data, it does not matter whether it is a non-executable specification or executable code. That interpretation is the program itself. It is rather like how we had lost the interpretation of hieroglyphic symbols until Pierre-François Bouchard discovered the Rosetta Stone and Champollion worked out how to decode it. Perhaps we should change the word ‘code’ to ‘decode’, since code works as an interpreter or decoder of data. The human-readable specifications also decode that data in the same way, but are the ultimate fallback - the universal machine. But if English is forgotten, then that interpretation mechanism becomes lost as well, like the Harappan script that we still can’t read.

On Larry’s point of computable data, this should not be stored in durable storage. In fact, most database design is involved with just discerning what fundamental data is that cannot be derived from other data. Even fundamental relationships of data are separate from derived relationships, so that such derived relationships can still be made in an ad-hoc manner (normalisation). The practical consideration here is that some computations are expensive and you might want to store the results of such a computation for later - but such data should clearly be distinguished from fundamental (non-derivable) data. This is cached data, and variables are just a form of cache and add no power to the computation process. Variables are just part of that junk that I talked about that don’t add to fundamental understanding (functional programming).

The concepts of persistence or durable should also not be viewed as synonymous with RAM/disk transfer. Persistence is a fundamental concept, separate from the memory hierarchy, and disk, or flash, or off-site backup, are all just implementation techniques of persistence and durability.

I know all that sounds obvious, but I find myself forgetting about it all the time, since my computing education was very much about opening, reading, and closing files, etc, and it is probably only since I have considered the fundamentals that I have seen that as all just implementation of higher-level more fundamental concepts. Memory hierarchies and technologies are changeable - persistence is not.

Ian

lrix

Jan 14, 2016, 6:33:05 PM
to Eiffel Users
Excellent information Ian. Thanks for writing all that. I am still in learn-mode! :-)


On Thursday, January 14, 2016 at 4:43:42 PM UTC-5, i.joyner wrote:
On Larry’s point of computable data, this should not be stored in durable storage. In fact, most database design is involved with just discerning what fundamental data is that cannot be derived from other data. Even fundamental relationships of data are separate from derived relationships, so that such derived relationships can still be made in an ad-hoc manner (normalisation). The practical consideration here is that some computations are expensive and you might want to store the results of such a computation for later - but such data should clearly be distinguished from fundamental (non-derivable) data. This is cached data, and variables are just a form of cache and add no power to the computation process. Variables are just part of that junk that I talked about that don’t add to fundamental understanding (functional programming).


I am happy for you to raise the issue of "fundamental data", where such data cannot be computed. I am even happier to see the distinction of computationally expensive data that is cached (in some way—durable for some amount of time) for later use. We do this not only with computationally expensive data, but with computations that may change over time and we do not want to recompute it later. A good example (in our world) is the total of a sales order. Given an order, we can compute the total price of the order at the point-of-sale based on what we know at that moment. Some time later (months, years), the basis of the same calculation may change, so we want to store (capture or permanently cache) the total at that time. This is especially true because the durable data is not only stored in a database, but it is also printed on paper and walks out the door with the customer. Moreover, if this is a "terms" customer, we don't want the total cost of their order going up or down as data fluctuates in our system. So, the computed total of the order must be done at point-of-sale, stored, and further computational changes disallowed.
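In Eiffel-ish terms, a tiny sketch of that capture-and-freeze behaviour might look like the following (names invented, and the actual persistence is still hand-waved):

class
    ORDER_TOTAL
        -- The total of a sales order, computed once at the point of sale and
        -- then frozen, so later price fluctuations cannot change it.
feature -- Access
    amount: REAL_64
    is_captured: BOOLEAN
feature -- Element change
    capture (a_computed_amount: REAL_64)
            -- Record the total as computed at the point of sale.
        require
            not_yet_captured: not is_captured
        do
            amount := a_computed_amount
            is_captured := True
        ensure
            captured: is_captured
            frozen: amount = a_computed_amount
        end
end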

Having this discussion is leading me to realize more and more that the matter of durability, when to store, how to store, what to compute, what to cache, how long to cache, and so on is a matter not just of permanent database storage, but also of objects in memory at any number of levels between the consumer/producer of the data and all the levels of computation in between. Each layer wants to have rules about what is kept, for how long, where, and how it is stored.

Suddenly, I want to envision layers of durability mechanisms, where I tell my objects where they are in the hierarchy and the persist-or-not-to-persist question and chore are handled by generated algorithms, much as a garbage collector knows when objects need to be destroyed. This thing is then not a garbage collector but a durable-data collector, and there is not just one, but one for each form of durability my system(s) require at all levels. This thing, like a GC, operates within a semantic notation that I barely think about (just as I don't pay much attention to my GC).

What do you all think? 

Neal Lester

Jan 14, 2016, 6:54:48 PM
to eiffel...@googlegroups.com
On Larry’s point of computable data, this should not be stored in durable storage. In fact, most database design is involved with just discerning what fundamental data is that cannot be derived from other data.

I think Larry's point is that the computation may change over time, so in a versioned data structure one needs to store the particular "formula" for the computation in use when the version is written, in addition to the fundamental data from which the derived value is computed.

For example

PRODUCT.[Standard Price] = PRODUCT.wholesale_price * CUSTOMER_TYPE.markup

may change to

PRODUCT.[Standard Price] = PRODUCT.retail_price * CUSTOMER_TYPE.discount

Ian Joyner

Jan 14, 2016, 9:22:34 PM
to Eiffel Users
Hi Larry,

I think we still only have two flavours of data - fundamental and computed. The fact that the actual price paid for an article cannot be recomputed later (it might have been negotiated with the customer) makes it fundamental and non-computable.

Yes, there is a hierarchy of transactions. Jim Gray et al. realised that flat transactions were not enough for many applications with long-running transactions, which meant that if there was a failure, a lot had to be backed out and redone. Thus they have nested transactions, where each sub-part can be done and released to others, or chained transactions, where results are progressively released. In the nested case, an outer transaction can still abort, and thus inner-transaction results are only durable to the boundary of the outer transaction. Processes participating in the outer transaction have the results of inner transactions that have completed (have waited for them, in SCOOP terms).

I’m a bit uneasy about separate GCs for temporary and durable objects insofar as that exposes a distinction between storage levels (which is related to the physical memory hierarchy). GCs are only to collect temporary data (Turing’s ‘rough notes’).

On the other hand, perhaps there should be a hierarchy of garbage collection. Inner transactions have temporary data that can be collected by a GC at that level, but anything committed to an outer transaction cannot be collected. However to an outer transaction, that inner-committed data could indeed be its ‘rough notes’ and collected at that level. Such a hierarchy of GCs would indeed reflect the logical needs of an application.

I’m just finding transaction processing a very fruitful field to go into to model many of these problems.

Ian

Peter Horan

Jan 16, 2016, 8:44:40 AM
to eiffel...@googlegroups.com

Larry Rix mentioned layering, which, I think, is something quite fundamental and perhaps a relevant part of this discussion.

                                                                                 

We met it in the ISO seven-layer model (physical, data link, network, and so on), and it crops up in many other places as well. So, perhaps one could speak of the “durable data” layer in this discussion. Then, routines that write/encode the durable data, together with routines that read/decode/interpret it, exist in a layer above the durable data layer. This layer implements a process of communication. Perhaps the protocol of this communication is worth considering and exploring in its own right. Are there higher layers, perhaps? Could this layer actually be more than one layer?

                                                                                                                                                                                

Many years ago, I wrote multiple process real-time software for microprocessors. In that case, there were four layers: hardware, device drivers, processes and process communication. Rather than designing a blob of software called an application, I split it into several processes, one for each source of activity that needed to be handled. Separating the processes from process communication made designs robust and easy.

--

Peter Horan              Faculty of Science, Engineering

pe...@deakin.edu.au          and Built Environment

+61-3-5221 1234 (Voice)  Deakin University

+61-4-0831 2116 (Mobile) Geelong, Victoria 3217, AUSTRALIA

 

-- The Eiffel guarantee: From specification to implementation

-- (http://www.objenv.com/cetus/oo_eiffel.html)

 

 



Peter Horan

Jan 16, 2016, 6:16:27 PM
to eiffel...@googlegroups.com

I wrote of the “durable data layer” that:

This layer implements a process of communication. Perhaps the protocol of this communication is worth considering and exploring in its own right.

 

On thinking further, in the context of databases etc., the “communication” achieves the concealment of the fact that the objects whose states are saved as persistent data, and which are reconstructed from that data when later required, may in fact not be in memory at all in that interval. But as far as some higher layer holding clients of these objects is concerned, they continued to exist.

 

Application layer:       --------------------------- client ---------------------------
                              |                                                  |
                              v                                                  v
Communication layer:       object   (garbage collected)   (object recreated)   object
                              |                                                  |
                              v                                                  v
Durable data layer:      ------------ durable data (continues to exist) ----------------

                         -------------------------------------------------------------> time

--

Peter Horan              Faculty of Science, Engineering

pe...@deakin.edu.au          and Built Environment

+61-3-5221 1234 (Voice)  Deakin University

+61-4-0831 2116 (Mobile) Geelong, Victoria 3217, AUSTRALIA

 

-- The Eiffel guarantee: From specification to implementation

-- (http://www.objenv.com/cetus/oo_eiffel.html)

 

 

lrix

Jan 22, 2016, 3:14:57 PM
to Eiffel Users
Precisely correct—that is the point exactly.

Peter Horan

Jan 23, 2016, 8:01:20 AM
to eiffel...@googlegroups.com

This relates to my view (17 Jan 16) that the durable data is a lower layer than the communication layer. The interpretation of data may change. That is, the meaning conveyed at the communication level changes. So, how should changes in meaning be managed?

 


lrix

Jan 25, 2016, 1:38:01 PM
to Eiffel Users, peter...@deakin.edu.au
The very nature of data is that it can exist without any surrounding context. It is the context that provides semantics. Otherwise, we do not know whether 3.14 is Pi as a math constant or how much we paid for a cup of coffee. In an RDBMS, the name of a column in a table can provide a limited amount of contextual semantics, but not all that is required. If the 3.14 is the price of my cup of coffee (i.e. price: 3.14 as a key:value pair), I still have no means by which to know precisely how that price was computed. I don't even know whether it was computed by a program or arbitrarily by a human brain. The field, sitting in a table with a field name, has been forever divorced from the algorithm used to compute it. Moreover, if I have 100,000 fields in that column, I have NO WAY to know whether each instance of a field value was derived using precisely the same computation, or whether the set of values called `price' has `n' variants. To know that requires each value to carry a reference that points directly at the algorithm from which it was derived.

It may be "good enough" for the reference to point at one of a number of abstract descriptions of the algorithm, each with some kind of "version" attached to it. However, even that may not be "good enough", where "good enough" is a case-by-case matter. Ultimately, we only need to know enough information to work with the data in a reasonable manner—regardless of the programming language or system used to create the data. It occurs to me that an algorithm transcends a particular programming language, so perhaps the best abstraction to attach, through a reference, to each value in our column is a specification notation that can be parsed and compiled to any number of target languages?

Cheers!
Larry Rix

Peter Horan

Jan 27, 2016, 9:24:53 PM
to eiffel...@googlegroups.com

On Tue 26/01/2016 05:38, Larry wrote:

“The very nature of data is that it can exist without any surrounding context. It is the context that provides semantics. … I still have no means by which to know precisely how that price was computed.”

 

Data may exist by itself, but without context (the communication layer I referred to), the data is meaningless.

 

Larry >> “Ultimately, we only need to know enough information to work with the data in a reasonable manner—regardless of the programming language or system used to create the data.”

 

Do we really need to know how some data was computed? I think the quote implies that we do not. No algorithm is needed, but “communication about the data”, that is, its context, is necessary.

 

I am introducing the concept of layers to the discussion, because it may be a useful point of view, and may also guide design. For example, what is the “Least Context” needed to interpret data and how should it be made available (communicated, encoded)?

--

Peter Horan              Faculty of Science, Engineering

pe...@deakin.edu.au          and Built Environment

+61-3-5221 1234 (Voice)  Deakin University

+61-4-0831 2116 (Mobile) Geelong, Victoria 3217, AUSTRALIA

 

-- The Eiffel guarantee: From specification to implementation

-- (http://www.objenv.com/cetus/oo_eiffel.html)

 

 



Ian Joyner

Jan 28, 2016, 3:56:22 AM
to Eiffel Users
You can do things both ways - record the state at a particular point (storage and variables), or compute that state from a previous point (perhaps the big bang).

Jim Gray proposed a database system with no state storage at all - everything could be recomputed from the logs. It works, but might not be very efficient.

Ian
