Updating a record in the spirit of clean architecture

Alex Yip

Jul 31, 2014, 1:13:18 AM

to clean-code...@googlegroups.com

Hello,

I'm trying to learn clean architecture and have watched a bunch of Uncle Bob's videos. I have a few questions about structuring my application. Let's say we're building a simple TODO list application backed by a database. Each todo has a description and a due date, so I make an entity object comprising a description and a date.

Let’s further say it’s using an ORM (I know, I know, implementation detail…, but read on!), something like Rails’ ActiveRecord, where there are strong dependencies on the framework. I know the appropriate thing to do would be to copy the data into an entity object via a gateway. So far so good. Retrieving and adding new records is trivial.

However, the question comes when I have to update a record. If the user makes changes to a TODO item, how would I reference the original record? My immediate thought would be to tack a primary key onto the TODO item, but that would be kind of a leaky abstraction, would it not? And if I had duplicate TODO items, would I update only one of them? And if I don’t use a primary key, the data source would have to check each record in the database for the same description and due date. (I have seen some examples of an "account id", which would make sense in the real world, but I don't know about a "todo id".)

So, what’s the proper way of implementing something like this?

Thanks,

Alex

Michael Krzenski

Jul 31, 2014, 11:45:31 PM

to clean-code...@googlegroups.com

You'll probably want to get some other answers before making a final decision on this, but I don't see it as being a leaky abstraction. It's the unique id of the entity... how else are you supposed to retrieve it sanely?

If I have a CompleteTodoUseCase interactor, what would I pass in the request? Well, I'd pass in the unique ID for one, and possibly that's all if the case is simple enough.
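A minimal sketch of that idea in Python, not a definitive implementation. Apart from `CompleteTodoUseCase`, every name here (`Todo`, `TodoGateway`, `InMemoryTodoGateway`, `CompleteTodoRequest`) is an assumption made for illustration, and the in-memory gateway merely stands in for an ORM-backed one.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass
class Todo:
    """Entity: plain data plus an identity, no ORM dependency."""
    todo_id: str
    description: str
    done: bool = False


class TodoGateway(Protocol):
    """Boundary the use case depends on; the ORM would live behind it."""
    def find_by_id(self, todo_id: str) -> Todo: ...
    def save(self, todo: Todo) -> None: ...


class InMemoryTodoGateway:
    """Hypothetical stand-in for a database-backed gateway."""
    def __init__(self) -> None:
        self._rows: Dict[str, Todo] = {}

    def find_by_id(self, todo_id: str) -> Todo:
        return self._rows[todo_id]

    def save(self, todo: Todo) -> None:
        self._rows[todo.todo_id] = todo


@dataclass(frozen=True)
class CompleteTodoRequest:
    """The request carries only the identity of the todo to complete."""
    todo_id: str


class CompleteTodoUseCase:
    def __init__(self, gateway: TodoGateway) -> None:
        self._gateway = gateway

    def execute(self, request: CompleteTodoRequest) -> None:
        # Look the entity up by its ID, change it, and hand it back.
        todo = self._gateway.find_by_id(request.todo_id)
        todo.done = True
        self._gateway.save(todo)


gateway = InMemoryTodoGateway()
gateway.save(Todo(todo_id="todo-1", description="Buy milk"))
CompleteTodoUseCase(gateway).execute(CompleteTodoRequest(todo_id="todo-1"))
print(gateway.find_by_id("todo-1").done)  # True
```

The interactor never learns how the record is stored; it only knows that a todo can be found again by its ID.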

Maybe I'm just doing it wrong though?

Let's see what anyone else has to say on it.

Sebastian Gozin

Aug 1, 2014, 4:30:35 AM

to clean-code...@googlegroups.com

In the Case Study videos this comes up a bit.
The trade-off made was something like: we know about persistence in terms of IDs and transactions, but not in terms of how it's partitioned and how it actually works.

Caio Fernando Bertoldi Paes de Andrade

Aug 1, 2014, 8:33:33 AM

to clean-code...@googlegroups.com

I have this feeling that the concept of an ID belongs to the domain itself, and not to persistence. It is the persistence layer that sets and defines the value that goes into an Entity’s ID, but the concept itself seems to belong in the domain. Eric Evans’ description of an Entity is an object that has a distinct identity that runs through time and different representations.

So I wouldn’t worry about using IDs as a way of identifying an entity. But I’m not sure whether the knowledge that this ID is a numerical value is a leaky abstraction; it might be.

Caio

--
The only way to go fast is to go well.
---
You received this message because you are subscribed to the Google Groups "Clean Code Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clean-code-discu...@googlegroups.com.
To post to this group, send email to clean-code...@googlegroups.com.
Visit this group at http://groups.google.com/group/clean-code-discussion.

Jakob Holderbaum

Aug 1, 2014, 11:28:36 AM

to clean-code...@googlegroups.com

In the awesome book "Implementing Domain-Driven Design" you can read
about the distinction between Data Objects and Entities.

Data Objects are mostly method-less objects that define equality by
comparing all their internal attributes, and therefore do not have the
concept of identity.

Entities do know this concept of identity, through some form of semantic
identifier (could be a numeric ID, a string identifier, or even a
username). They define equality by comparing just the identifier.

From this perspective, and as discussed in the case study, it is
totally appropriate to impose some form of identity upon the entities.
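The two kinds of equality can be sketched in Python roughly like this. `DueDate` and `Todo` are illustrative names, not taken from the book, and the sketch relies on the standard `dataclasses` behaviour of excluding `compare=False` fields from `==`.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class DueDate:
    """Data object: equality compares every attribute (dataclass default)."""
    value: date


@dataclass
class Todo:
    """Entity: equality compares only the identifier."""
    todo_id: str
    description: str = field(compare=False)
    due: DueDate = field(compare=False)


a = Todo("todo-1", "Buy milk", DueDate(date(2014, 8, 1)))
b = Todo("todo-1", "Buy oat milk", DueDate(date(2014, 8, 2)))
c = Todo("todo-2", "Buy milk", DueDate(date(2014, 8, 1)))

print(a == b)  # True: same identity, different attributes
print(a == c)  # False: same attributes, different identity
print(DueDate(date(2014, 8, 1)) == DueDate(date(2014, 8, 1)))  # True
```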

WDYT?

Cheers
Jakob

--
Jakob Holderbaum, B.Sc
Systems Engineer

0176 637 297 71
http://jakob.io
h...@jakob.io
#hldrbm

Frederik Krautwald

Aug 2, 2014, 11:46:55 AM

to clean-code...@googlegroups.com

An ID of an entity must come from the domain requirements, and not from how the entity is persisted. I am aware that the Web world in particular has for many years practiced creating surrogate keys for every unique thing it needs to store; from a database point of view, this is bad practice, and primary keys should be identified by the data stored. That is, we look for candidate keys, which could be a single field or a combination of fields that make the entity unique. If, and only if, no such key can be identified, we could create a surrogate key which uniquely identifies the entity. However, as already mentioned, I know the common practice is to just create surrogate keys, such as incrementing integers, for everything. The problem with this is that a number, such as 42, doesn’t provide any value to the business. Think about it: I have To-do entity no. 42 here. Well, yes, then what? What does that number tell you? If you know about the internal storage mechanism, say an incrementing integer, you might be able to deduce that you now have the forty-second created To-do entity, but then again, how’s that going to be useful to the business?

You need to look at the business domain, and especially the domain context of the To-do list. Find out how, in a particular business context, the To-do items should be distinguished from one another. If, in the domain context, such a distinction cannot be found, you may resort to creating a surrogate identifier for you to work with internally.

Caio Fernando Bertoldi Paes de Andrade

Aug 2, 2014, 3:10:08 PM

to clean-code...@googlegroups.com

So the surrogate identifier is also relevant to the business, because there wouldn’t be any more legitimate way of referring to an entity than the surrogate identifier itself.

This seems to make it OK for the Controllers and Views to have knowledge about the identifier, both its type and value, and answers Alex’s initial question.

Caio

Frederik Krautwald

Aug 2, 2014, 4:54:15 PM

to clean-code...@googlegroups.com

Yes. In the case where there are no natural identifiers, and given the fact that the object indeed should be identifiable to satisfy the business, surrogate identifiers are perfectly acceptable, and, in such a case, the only way to identify the object. However, if the identifier is sensitive to exposure, care should be taken to hide it from scrutiny by introducing translators or encryption methods.
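One way such a translator could look, as a rough sketch in Python. `PublicIdTranslator` and its methods are hypothetical names, and the in-memory dictionaries stand in for whatever storage a real system would use; the tokens come from the standard library's `secrets.token_urlsafe`.

```python
import secrets
from typing import Dict


class PublicIdTranslator:
    """Maps internal identifiers to opaque public tokens and back."""

    def __init__(self) -> None:
        self._to_public: Dict[int, str] = {}
        self._to_internal: Dict[str, int] = {}

    def expose(self, internal_id: int) -> str:
        """Return the public token for an internal ID, creating it once."""
        if internal_id not in self._to_public:
            token = secrets.token_urlsafe(16)  # random, reveals nothing
            self._to_public[internal_id] = token
            self._to_internal[token] = internal_id
        return self._to_public[internal_id]

    def resolve(self, token: str) -> int:
        """Translate a public token back to the internal ID."""
        return self._to_internal[token]


translator = PublicIdTranslator()
token = translator.expose(42)
print(translator.resolve(token))  # 42
```

The outside world only ever sees the token, so an incrementing integer no longer reveals how many records exist or in which order they were created.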

Jakob Holderbaum

Aug 8, 2014, 12:23:23 AM

to clean-code...@googlegroups.com

I agree with this point of view to a certain extent; at least that is
how I explained it to coworkers and other interested individuals.

But there is a tiny trip hazard one should be aware of. I will give a
quick example. Suppose you are building a system that contains
registered editors who can modify certain articles, subject to
specific permissions.

One could come to the conclusion that the always provided and naturally
necessary email address of every editor has the inherent feature of
uniqueness and by this definition serves as a potential entity key.
Well, it does. As mentioned before, it is unique and makes sense as an
identity in the context of the business.

And then the customer decides: use our Single Sign-On, no more
registrations from the actual platform. And guess what, the SSO does not
deliver mail addresses (company security policy).

(This is a somewhat true story from my project history.)

Well, if you had used this domain-specific information as the key and
therefore as the reference throughout your persistence, there
(potentially) be dragons.

And for this particular reason, I always think about the probability of
breaking changes when choosing keys (or surrogate keys).
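A small Python sketch of the safer variant, under the assumption that the
editor gets a stable surrogate ID and the email address is just an
attribute. `Editor`, `Article`, and the example address are made up for
illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Editor:
    editor_id: str         # stable surrogate identity
    email: Optional[str]   # ordinary attribute; may change or disappear


@dataclass
class Article:
    title: str
    editor_id: str  # references the editor by ID, never by email


editors: Dict[str, Editor] = {}
alice = Editor(editor_id=str(uuid.uuid4()), email="alice@example.com")
editors[alice.editor_id] = alice
article = Article(title="Release notes", editor_id=alice.editor_id)

# The SSO migration removes email addresses; references are unaffected.
alice.email = None
print(editors[article.editor_id] is alice)  # True
```

Had the article stored the email address as its reference, the last
lookup would have nothing left to match against.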

WDYT?

Cheers
Jakob

Frederik Krautwald

Aug 8, 2014, 1:39:36 PM

to clean-code...@googlegroups.com, mail...@jakob.io

I think it is very important to draw a clear distinction between the logical model and the physical implementation of that model. It doesn’t really matter which key we are talking about; the criteria should be stability, simplicity, and familiarity. Uniqueness isn’t really the issue, as it needs to be enforced whether the key changes or not.

Business requirements change, and they change for a reason, and as such, the entities will change along with the business requirements. If an editor entity was identified by an e-mail address and a new requirement to use some other id is specified, this, of course, will require some rework of code. However, how the entity is persisted is irrelevant to the change. On a side note, the e-mail address of the editor is not a particularly good identifier, as it is prone to change, just as the physical address of the editor is.

In the real, physical world, all objects are unique, and the most reliable criteria for uniqueness are time and location, because no two objects can occupy the same location at the same time. However, to distinguish objects over the course of time, they will need to be tracked by some kind of reliable measurement. When we describe objects in code, we simplify them to a set of attributes, but even with a complex set of attributes, there is a chance of collisions between two objects. Surrogate keys or identifiers do a good job of uniquely separating objects from each other. Thus, they are often used for persistence. However, whatever surrogate key the persistence layer identifies an object by should be invisible to the business, and whatever changes the business makes to its models should not affect the persistence keys chosen for identification. If the business model also requires a surrogate key to identify entities and their relations, it should not be coupled with the persistence layer.
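A minimal sketch of that separation in Python with the standard `sqlite3` module, assuming the domain assigns its own identifier (a UUID here) while the table keeps a private auto-increment key. The table layout, `SqliteTodoGateway`, and the column names are illustrative assumptions, not a prescribed design.

```python
import sqlite3
import uuid
from dataclasses import dataclass


@dataclass
class Todo:
    todo_id: str      # identity owned by the domain
    description: str
    due_date: str     # ISO date string, kept simple for the sketch


class SqliteTodoGateway:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS todos ("
            " row_pk INTEGER PRIMARY KEY AUTOINCREMENT,"  # persistence-only key
            " todo_id TEXT NOT NULL UNIQUE,"              # domain identity
            " description TEXT NOT NULL,"
            " due_date TEXT NOT NULL)"
        )

    def add(self, todo: Todo) -> None:
        self._conn.execute(
            "INSERT INTO todos (todo_id, description, due_date) VALUES (?, ?, ?)",
            (todo.todo_id, todo.description, todo.due_date),
        )

    def update(self, todo: Todo) -> None:
        # The row is located by the domain identity; row_pk never leaves here.
        self._conn.execute(
            "UPDATE todos SET description = ?, due_date = ? WHERE todo_id = ?",
            (todo.description, todo.due_date, todo.todo_id),
        )

    def find(self, todo_id: str) -> Todo:
        row = self._conn.execute(
            "SELECT todo_id, description, due_date FROM todos WHERE todo_id = ?",
            (todo_id,),
        ).fetchone()
        return Todo(*row)


gateway = SqliteTodoGateway(sqlite3.connect(":memory:"))
first = Todo(str(uuid.uuid4()), "Buy milk", "2014-08-01")
twin = Todo(str(uuid.uuid4()), "Buy milk", "2014-08-01")  # duplicate content
gateway.add(first)
gateway.add(twin)

first.description = "Buy oat milk"
gateway.update(first)
print(gateway.find(first.todo_id).description)  # Buy oat milk
print(gateway.find(twin.todo_id).description)   # Buy milk
```

Two to-do items with identical description and due date stay distinguishable, and the persistence key could be changed or dropped without touching the `Todo` entity.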

That’s what I think.

Rusi Filipov

Aug 15, 2014, 2:43:13 PM

to clean-code...@googlegroups.com

The book "Analysis Patterns: Reusable Object Models" presents some sophisticated schemes for referring to objects. A very good read for handling multiple or even changing IDs.

Frederik Krautwald

Aug 15, 2014, 4:26:08 PM

to clean-code...@googlegroups.com

Could you provide a short description of how such a scheme would be implemented?

Rusi Filipov

Aug 16, 2014, 6:09:49 AM

to clean-code...@googlegroups.com

It is best explained in the book; you can find some excerpts here:

http://books.google.de/books?id=4V8pZmpwmBYC&printsec=frontcover&dq=analysis+patterns&hl=de&sa=X&ei=zSvvU7OeFurD7AaxtYHQAg&redir_esc=y#v=onepage&q=analysis%20patterns&f=false

Scroll down to the table of contents, and then see the 5.x sections
like "Identification Scheme".

By the way, although not quite new, the content of the book is
timeless. It is a valuable one and good to have on your own
bookshelf.

Frederik Krautwald

Nov 21, 2014, 7:17:48 PM

to clean-code...@googlegroups.com, mail...@jakob.io

It is best to hide the surrogate identity from the outside world. Because the surrogate is not part of the domain model, visibility constitutes persistence leakage.