Modeling cross-references

Dénes Harmath

Sep 22, 2016, 6:24:38 PM
to Elm Discuss
Hi everybody,

In a typical model, there are cross-references: e.g. we have artists and albums, and the album refers to its artist. How do we model this in Elm? One possible approach is to give objects unique identifiers and store the identifiers of the referred objects, similar to the TodoMVC example. Has this mechanism been extracted into a library in a type-safe way?
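To make the question concrete, here is a minimal sketch of what "type-safe" identifiers could look like, assuming each kind of id is wrapped in its own type so the compiler rejects an album id where an artist id is expected. All names (ArtistId, AlbumId, artistOf) are hypothetical and not taken from any existing library.

```elm
-- Hypothetical names throughout; a sketch, not an existing library.


type ArtistId
    = ArtistId Int


type AlbumId
    = AlbumId Int


type alias Artist =
    { id : ArtistId, name : String }


type alias Album =
    { id : AlbumId, name : String, artist : ArtistId }


-- Look up the artist an album refers to; Nothing if the id is unknown.
artistOf : List Artist -> Album -> Maybe Artist
artistOf artists album =
    artists
        |> List.filter (\artist -> artist.id == album.artist)
        |> List.head
```

One trade-off of this sketch: wrapped ids are not `comparable`, so they cannot be used directly as Dict keys and would have to be unwrapped first.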

Thank you in advance,
thSoft

Rupert Smith

Sep 23, 2016, 4:44:47 AM
to Elm Discuss
On Thursday, September 22, 2016 at 11:24:38 PM UTC+1, Dénes Harmath wrote:
in a typical model, there are cross-references: e.g. we have artists and albums, and the album refers to its artist. How do we model this in Elm? A possible approach is giving unique identifiers to objects and store the referred object's identifiers similar to the TodoMVC example - is this mechanism extracted to some library in a type-safe way?

Something I have been thinking about...
 
That's sort of like building a database with foreign keys within your UI code - but I can see you might have to do that.

Why not just:

type alias Album = { name : String, artists : List Artist }
type alias Artist = { name : String }

Album is the root of what I call a 'document fragment'.

Things get more complicated if you also need the Artist to hold a list of Albums they performed on. You can have mutual recursion, but it requires defining union types, as a type alias cannot be recursive.

type Album = 
 Album { name : String, artists : List Artist }
type Artist =
 Artist { name : String, albums : List Album }

In this case this isn't going to work, because an artist will have a list of albums they performed on, which will have a list including that artist, which will have a list of albums they performed on, which ...

In the first case, the 'document fragment', I would consider the Album to be an entity, by which I mean a persisted object with an explicit identifier (maybe an int or a GUID). As the artist only appears within an album, it does not need to be an entity; they have a relationship by composition. Composition means that something is part of something else and its lifecycle is tied to the parent. In this case, that also happens not to be true - Artists might exist as artists before their first album (having only released singles), and artists appear on multiple albums. So artists are entities too. For the purposes of modelling them in a database, they are definitely entities, but for the purpose of displaying information about albums in a UI, the document fragment approach might be sufficient.

So modelling Artists and Albums as entities (with Int ids), with a relationship by aggregation gives:

type Album = 
 Album { name : String, artists : List Int }
type Artist =
 Artist { name : String, albums : List Int }

Which is a bit inconvenient to work with, as if I fetched an Album over REST, I would then have to iterate the List, and make 1 REST call per artist to get the list of artists.

What I have done (on the server side in Java) is to make all Entities implement a Reference interface, which means that every entity also has a degenerate form where it is simply represented by its id. When querying for a document fragment containing child entities, I can choose how to 'slice' the data. That is to say, if I know I will use the entities I will eagerly fetch them; if I don't expect to use them, I will just pull their ids, and they can be lazily fetched as required.

Understanding the nature of the relationships in a data model and choosing how to slice it is a major driver in designing an API to work with that data model.

Coming back to artists and albums, how about the below. I'll use a String id this time, in fact I think I will always use a String id, even if the id was an Int, as the id is an identifier that the server side needs to understand but for the UI it is just an opaque label that identifies something.

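-- The reference constructors are named AlbumRef / ArtistRef because two
-- constructors in the same module cannot both be called Reference.
type Album =
  Album { name : String, artists : List Artist }
 | AlbumRef String

type Artist =
  Artist { name : String, albums : List Album }
 | ArtistRef String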

If I have an album query that fetches the album and its artists - I would slice after that second level. So the artists would only hold references to other albums (to link to them on the page), but not pull that data.

When working with this data model, whenever you hit a reference, you need to trigger a request to fetch the entity on demand.
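A minimal sketch of how the "fetch on demand" step might start, assuming the reference constructors are given distinct names (AlbumRef, ArtistRef) so both types can live in one module. The helper names unresolvedArtistIds and artistRefId are hypothetical; the sketch only collects the ids that would need a request and leaves the actual HTTP call out.

```elm
type Album
    = Album { name : String, artists : List Artist }
    | AlbumRef String


type Artist
    = Artist { name : String, albums : List Album }
    | ArtistRef String


-- Ids of the artists on an album that are still bare references
-- and would need a request before they can be displayed.
unresolvedArtistIds : Album -> List String
unresolvedArtistIds album =
    case album of
        AlbumRef _ ->
            []

        Album record ->
            List.filterMap artistRefId record.artists


-- Just the id for a bare reference, Nothing for a loaded artist.
artistRefId : Artist -> Maybe String
artistRefId artist =
    case artist of
        ArtistRef id ->
            Just id

        Artist _ ->
            Nothing
```

An update function could turn each returned id into one fetch command, which is where the one-request-per-reference cost mentioned earlier would show up.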

One problem with the above is that determining all references up front can add to the cost of the query. In some cases I might not want to include the references at all, if I know I will never use them. So perhaps:

type Album =
  Album { name : String, artists : Maybe (List Artist) }
 | AlbumRef String

type Artist =
  Artist { name : String, albums : Maybe (List Album) }
 | ArtistRef String

What do you think?

Rupert Smith

Sep 23, 2016, 4:49:40 AM
to Elm Discuss
On Friday, September 23, 2016 at 9:44:47 AM UTC+1, Rupert Smith wrote:
type Album =
  Album { name : String, artists : Maybe (List Artist) }
 | AlbumRef String

type Artist =
  Artist { name : String, albums : Maybe (List Album) }
 | ArtistRef String

I would also make the ids explicit:

type Album =
  Album { id : String, name : String, artists : Maybe (List Artist) }
 | AlbumRef String

type Artist =
  Artist { id : String, name : String, albums : Maybe (List Album) }
 | ArtistRef String

You'd need the id if you wanted to make an update to an entity on the server or database.

Francesco Orsenigo

Sep 23, 2016, 7:24:51 PM
to Elm Discuss

I'm writing a real time strategy game and I'm using numeric ids on everything.

For example, each unit has an Id, and the command to attack another unit contains the target unit id.
I keep all units in a (Dict UnitId Unit), so I can access each quickly.
It's not as handy (or fast) as having pointers, but it is more robust: say the target unit is destroyed, the attacking unit will be left with Nothing to attack rather than a seg fault/reference error.
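A minimal sketch of that lookup, assuming a unit record with hypothetical fields (health, target); only the Dict UnitId Unit shape comes from the description above.

```elm
import Dict exposing (Dict)


type alias UnitId =
    Int


-- Field names are assumptions for illustration.
type alias Unit =
    { id : UnitId, health : Int, target : Maybe UnitId }


-- Resolve the unit that `attacker` is aiming at. If the target was
-- removed from the Dict in the meantime, the result is simply Nothing.
currentTarget : Dict UnitId Unit -> Unit -> Maybe Unit
currentTarget units attacker =
    case attacker.target of
        Just targetId ->
            Dict.get targetId units

        Nothing ->
            Nothing
```

The caller is forced to handle the Nothing case, which is what makes a dangling id harmless here.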

Wouter In t Velt

Sep 24, 2016, 5:14:11 AM
to Elm Discuss
This is a case I also run into a lot. I haven't yet found a library or good examples for how to deal with it.
The pattern I typically use is:

type alias UID = String -- so it is clear when I deal with ID or Reference

type alias Model =
  { artists : List Artist
  , albums : List Album
  , nextUID : UID -- for creating new items
  }

type alias Artist =
  { uid : UID
  , name : String
  , albums : List UID
  }

type alias Album =
  { uid : UID
  , name : String
  , artists : List UID
  }

albumsForArtist : UID -> Model -> Maybe (List Album)
albumsForArtist artistUID model =
  getArtist artistUID model -- Maybe Artist
    |> Maybe.map (.albums) -- Maybe (List UID)
    |> Maybe.map (List.map (\uid -> getAlbum uid model)) -- Maybe (List (Maybe Album))
    |> Maybe.map catMaybes -- Maybe (List Album) (catMaybes is from Exts.Maybe package)

getArtist : UID -> Model -> Maybe Artist
getArtist artistUID model =
  case List.filter (\artist -> artist.uid == artistUID) model.artists of
    artist :: _ ->
      Just artist

    _ ->
      Nothing


Whenever I need one or more albums/ artists from the model, (e.g. in a List view), I use the UID to get a Maybe of some kind.
Getting Nothing triggers a fetch from the server to supply more data.
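The getAlbum helper used in albumsForArtist is not shown in this message. A minimal sketch of what it might look like, assuming it mirrors getArtist; the albumsByIds helper is hypothetical and uses List.filterMap from core as one possible alternative to mapping and then calling catMaybes.

```elm
type alias UID =
    String


type alias Album =
    { uid : UID, name : String, artists : List UID }


-- Assumed counterpart to getArtist. The extensible record type lets it
-- accept any model that has an `albums` list.
getAlbum : UID -> { m | albums : List Album } -> Maybe Album
getAlbum albumUID model =
    model.albums
        |> List.filter (\album -> album.uid == albumUID)
        |> List.head


-- Resolve a list of album ids, dropping ids that are not in the model.
albumsByIds : List UID -> { m | albums : List Album } -> List Album
albumsByIds uids model =
    List.filterMap (\uid -> getAlbum uid model) uids
```

Note that albumsByIds silently drops missing ids; to trigger a fetch for them, the ids that resolve to Nothing would need to be collected separately.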

This way, I try to keep "One source of Truth": any album and any artist is only stored in 1 place in my model.
And with a flat model, I can easily build navigation/views where e.g. a user clicks an album, then one of the artists of that album, then one of the albums of that artist, etc.

Spencer Judd

Sep 24, 2016, 4:11:51 PM
to Elm Discuss
I tend to model things like this with Dicts, Sets, and a type alias for each identifier. So, something like

import Dict exposing (Dict)
import Set exposing (Set)

type alias Model =
  { artists : Dict ArtistId Artist
  , albums : Dict AlbumId Album
  }

type alias ArtistId =
  Int

type alias Artist =
  { id : ArtistId
  , name : String
  , albums : Set AlbumId
  }

type alias AlbumId =
  Int

type alias Album =
  { id : AlbumId
  , name : String
  , artists : Set ArtistId
  }

This is optimized for querying, adding and updating both Artists and Albums. Deletions require a bit more thought: you'll have to update all the cross-reference Sets as well.

Spencer Judd

Sep 24, 2016, 4:19:14 PM
to Elm Discuss
Actually, additions (and updates where you're changing cross-references, obviously) would require updating the cross references, as well.

Still, this is much more optimized for additions, updates, and deletions, and much less error prone than, say, doing a linear scan through any number of lists that might contain a reference to your record.
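A minimal sketch of the cross-reference bookkeeping discussed in the two messages above, using the same Dict/Set model. The function names deleteArtist and linkArtistToAlbum are hypothetical.

```elm
import Dict exposing (Dict)
import Set exposing (Set)


type alias ArtistId =
    Int


type alias AlbumId =
    Int


type alias Artist =
    { id : ArtistId, name : String, albums : Set AlbumId }


type alias Album =
    { id : AlbumId, name : String, artists : Set ArtistId }


type alias Model =
    { artists : Dict ArtistId Artist
    , albums : Dict AlbumId Album
    }


-- Remove an artist and drop its id from every album's `artists` set.
deleteArtist : ArtistId -> Model -> Model
deleteArtist artistId model =
    { model
        | artists = Dict.remove artistId model.artists
        , albums =
            Dict.map
                (\_ album -> { album | artists = Set.remove artistId album.artists })
                model.albums
    }


-- Record the relationship on both sides. A side whose id is not in
-- the model is left unchanged.
linkArtistToAlbum : ArtistId -> AlbumId -> Model -> Model
linkArtistToAlbum artistId albumId model =
    { model
        | artists =
            Dict.update artistId
                (Maybe.map (\artist -> { artist | albums = Set.insert albumId artist.albums }))
                model.artists
        , albums =
            Dict.update albumId
                (Maybe.map (\album -> { album | artists = Set.insert artistId album.artists }))
                model.albums
    }
```

For simplicity deleteArtist touches every album; it could instead visit only the ids in the deleted artist's own albums set.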

Eric G

Sep 24, 2016, 6:54:50 PM
to Elm Discuss
Thanks for this question and the suggestions - very useful to issues I am dealing with now. 

Some questions:

- Is it better to use Dicts as the basic 'table' structure, if frequently rendering lists of items filtered and sorted in various ways? In short, is it better to convert a `List (ID, Item)` to a Dict for finding items, or convert a `Dict ID Item` to a List for rendering them?  I am kind of leaning towards `List (ID, Item)` as the persistent data structure, especially for data that is frequently rendered in lists, but would really appreciate hearing what people's actual experiences have been.

- How are people modelling so-called 'value types' ?  For example in the Albums/Artists if you had a `genre` type assigned to Albums. The genre types change infrequently, but perhaps the application still needs some kind of user interface to change them, which suggests they should be stored as data, e.g. `List (ID, String)`, with no special behavioral significance to the app.  On the other hand, in some cases you have value types that do have behavioral significance, such as e.g. User Roles, and it is tempting to want to have these map to Elm types instead of strings when you `case` on them in view and update. But this means duplication of server- and/or datastore- side data, and you still have to map your Elm types back to IDs.

Anyway, some rambling thoughts but curious if people have dealt with these kinds of issues.  

Francesco Orsenigo

Sep 24, 2016, 7:16:32 PM
to Elm Discuss

If the random access happens only on user input, i.e. it's not something you need to do several thousand times per second, stick to Lists.
If you need a particular sorting order, stick to Lists.

You can use `Dict.foldl/r` to map a dictionary to a list in a single step, rather than first converting and then mapping.
This should give you a rendering performance similar to that of `List.map`.
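A minimal sketch of that single-step fold, assuming a hypothetical Album record and producing plain strings rather than Html so the example stays self-contained.

```elm
import Dict exposing (Dict)


type alias Album =
    { name : String }


-- Build the list of labels straight from the Dict. foldr visits keys
-- from highest to lowest, so consing onto the accumulator yields the
-- labels in ascending key order.
albumLabels : Dict Int Album -> List String
albumLabels albums =
    Dict.foldr (\_ album acc -> album.name :: acc) [] albums
```

In a view, the lambda would build an Html node instead of a String; the shape of the fold stays the same.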

All in all, if performance is the problem, I'd try both approaches and see which one works better.
(You could even write a thin layer between your app and the container, so that you can switch from List to Dict without touching much of your code).



Re "value types": if I understand correctly, I'd use a Union Type when there is complex behavior to model on them.
I'd expect the user to want to specify a "genre" which is not in a list of presets, which means you'd want to leave it a String.

Wouter In t Velt

Sep 25, 2016, 4:42:58 AM
to Elm Discuss
On Sunday, September 25, 2016 at 8:54:50 AM UTC+10, Eric G wrote:
- Is it better to use Dicts as the basic 'table' structure, if frequently rendering lists of items filtered and sorted in various ways? In short, is it better to convert a `List (ID, Item)` to a Dict for finding items, or convert a `Dict ID Item` to a List for rendering them?  I am kind of leaning towards `List (ID, Item)` as the persistent data structure, especially for data that is frequently rendered in lists, but would really appreciate hearing what people's actual experiences have been.

I find myself using Lists most of the time. Probably because:
- Lists are sort of entry level (I still consider myself a beginner in Elm) - all the tutorials use them
- I find code for List manipulation easier to understand/ read.
- Lists are a lot easier to manipulate (especially sorting and filtering), which is what happens a lot in my code
- Many of the lists I work with are not very long (so there is no performance need to switch to Dict)
 
- How are people modelling so-called 'value types' ?  For example in the Albums/Artists if you had a `genre` type assigned to Albums. The genre types change infrequently, but perhaps the application still needs some kind of user interface to change them, which suggests they should be stored as data, e.g. `List (ID, String)`, with no special behavioral significance to the app.  On the other hand, in some cases you have value types that do have behavioral significance, such as e.g. User Roles, and it is tempting to want to have these map to Elm types instead of strings when you `case` on them in view and update. But this means duplication of server- and/or datastore- side data, and you still have to map your Elm types back to IDs.

I think you already answered this yourself :) The genre-like data, I put in my model as a separate List (ID, String). They are a List because for the program it does not matter how many genres there are, and what their names are. If the app has logic to deal with genres (e.g. filters), it will also work if the list has more items.

User Roles is a specific case: because it DOES matter how many options there are + your program needs logic for each role, Elm types do make sense. I wouldn't worry about the duplication: ALL server side data sent to client will of course be stored = duplicated there. And it is not uncommon to transform server data into different types at client side.
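A minimal sketch of mapping roles between server-side identifiers and an Elm type. The role names and their string ids are assumptions for illustration, not taken from any real API.

```elm
-- Hypothetical roles; the real set would come from the application.
type Role
    = Admin
    | Editor
    | Viewer


-- Server-side identifiers are assumptions for illustration.
roleFromId : String -> Maybe Role
roleFromId id =
    case id of
        "admin" ->
            Just Admin

        "editor" ->
            Just Editor

        "viewer" ->
            Just Viewer

        _ ->
            Nothing


-- Map the Elm type back to the identifier the server understands.
roleToId : Role -> String
roleToId role =
    case role of
        Admin ->
            "admin"

        Editor ->
            "editor"

        Viewer ->
            "viewer"
```

Returning a Maybe from roleFromId makes an unknown role coming from the server an explicit case to handle, while `case` expressions over Role in view and update stay exhaustive.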

Eric G

Sep 25, 2016, 11:40:43 AM
to Elm Discuss
Thanks Francesco and Wouter, your suggestions confirm what I was thinking too. 

Still getting used to the idea of having normalized data in the client - it seems mildly irritating to have to do joins client-side instead of in the database. But on the plus side maybe I can ditch my ORM on the server :)
