idiomatic data transformation


Joe Graham

Jul 7, 2015, 5:28:21 PM
to clo...@googlegroups.com
Hi Clojure Users,
I'd like to learn an idiomatic approach to data transformation in Clojure. For example, I have a web service that consumes some REST services. Ultimately I need to consolidate their responses into a data model that resembles a DAG. I would like to transform the children of this data model with discrete transformation functions that may share some logic, and those functions should handle nulls and fail or exit gracefully. I have provided a simple example below. I would appreciate any links to existing code, and especially any links that include background or theory on the matter. Bonus points if some examples use functors, monads, or other interesting concepts.

Here's an example: using the two input sources, consolidate them into a new data model. In the output JSON I have highlighted some common transformation points such as dates and currencies. My understanding is that a purely functional take on this would be one function per mapping pair: for each item in the output, map a function over the relevant items in the input. But what other clever tricks can be applied, and what would this look like?

Thanks so much for your help on this!

Input data (json) for apples:
{
  "fruit-category": "apple",
  "fruit-category-id": "2001",
  "fruit-items": [
    {
      "fruit-item-id": "2001-1",
      "description": "Fuji",
      "current-price": "3.01",
      "currency": "USD",
      "fresh-until": "2015/07/16"
    },
    {
      "fruit-item-id": "2001-2",
      "description": "Honeycrisp",
      "current-price": "1.69",
      "currency": "USD",
      "fresh-until": "2015/07/17"
    }
  ]
}


Input data (json) for oranges:
{
  "fruit-category": "orange",
  "fruit-category-id": "2002",
  "fruit-items": [
    {
      "fruit-item-id": "2002-1",
      "description": "Mandarin",
      "current-price": "1.88",
      "currency": "USD",
      "fresh-until": "2015/07/19"
    },
    {
      "fruit-item-id": "2002-2",
      "description": "Clementine",
      "current-price": "1.98",
      "currency": "USD",
      "fresh-until": "2015/07/13"
    }
  ]
}

Consolidated output data (json):
{
  "fruit-basket": {
    "title": "Apples and Oranges",
    "price": "17.12",
    "currency": "USD",
    "converted-currency": "JPY",
    "converted-price": "2098.22",
    "fresh-until": "2015/07/13"
  },
  "items": [
    {
      "fruit-item-id": "2001-1",
      "qty": "2"
    },
    {
      "fruit-item-id": "2001-2",
      "qty": "2"
    },
    {
      "fruit-item-id": "2002-1",
      "qty": "2"
    },
    {
      "fruit-item-id": "2002-2",
      "qty": "2"
    }
  ],
  "entities": [
    {
      "fruit-item-id": "2001-1",
      "description": "Fuji",
      "current-price": "3.01",
      "currency": "USD",
      "fresh-until": "2015/07/16"
    },
    {
      "fruit-item-id": "2001-2",
      "description": "Honeycrisp",
      "current-price": "1.69",
      "currency": "USD",
      "fresh-until": "2015/07/17"
    },
    {
      "fruit-item-id": "2002-1",
      "description": "Mandarin",
      "current-price": "1.88",
      "currency": "USD",
      "fresh-until": "2015/07/19"
    },
    {
      "fruit-item-id": "2002-2",
      "description": "Clementine",
      "current-price": "1.98",
      "currency": "USD",
      "fresh-until": "2015/07/13"
    }
  ],
  "extra-stuff": {
    "cache-key": "foo",
    "other-data": "bar",
    "timestamp-utc": "..."
  }
}
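
One way to read the "function per mapping pair" idea above, as a minimal sketch rather than a definitive design: each output field gets its own small pure function over the merged item list. The names entities, total-price, earliest-fresh-until, and fruit-basket are made up for illustration, keys are assumed to be keywordized when the JSON is parsed, and qty-by-id is a hypothetical input because the example does not say where quantities come from. Currency conversion is left out since no exchange rate is given.

```clojure
;; Parsed inputs, abbreviated. Assumes keys were keywordized on parse.
(def apples
  {:fruit-category "apple"
   :fruit-items [{:fruit-item-id "2001-1" :current-price "3.01"
                  :currency "USD" :fresh-until "2015/07/16"}
                 {:fruit-item-id "2001-2" :current-price "1.69"
                  :currency "USD" :fresh-until "2015/07/17"}]})

(def oranges
  {:fruit-category "orange"
   :fruit-items [{:fruit-item-id "2002-1" :current-price "1.88"
                  :currency "USD" :fresh-until "2015/07/19"}
                 {:fruit-item-id "2002-2" :current-price "1.98"
                  :currency "USD" :fresh-until "2015/07/13"}]})

;; Hypothetical input: quantity per item id.
(def qty-by-id {"2001-1" 2, "2001-2" 2, "2002-1" 2, "2002-2" 2})

(defn entities
  "Concatenates the :fruit-items of every source; nil sources contribute nothing."
  [& sources]
  (vec (mapcat :fruit-items sources)))

(defn total-price
  "Sums qty * price over items, skipping items that have no price."
  [items qty-by-id]
  (reduce + 0M
          (for [{:keys [fruit-item-id current-price]} items
                :let [qty (get qty-by-id fruit-item-id 0)]
                :when current-price]
            (* qty (bigdec current-price)))))

(defn earliest-fresh-until
  "yyyy/MM/dd strings sort chronologically, so a plain sort is enough."
  [items]
  (first (sort (keep :fresh-until items))))

(defn fruit-basket
  "One function per output field, assembled into the summary map.
  Assumes every item is priced in the same currency."
  [items qty-by-id]
  {:price       (str (total-price items qty-by-id))
   :currency    (some :currency items)
   :fresh-until (earliest-fresh-until items)})

(def all-items (entities apples oranges))

(fruit-basket all-items qty-by-id)
;; => {:price "17.12", :currency "USD", :fresh-until "2015/07/13"}
```

Because each field is computed by its own function, shared logic such as date or price parsing can be pulled into helpers and tested in isolation.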

Jordan Schatz

Jul 7, 2015, 5:56:44 PM
to clo...@googlegroups.com
Possibly less computer science-y than you had in mind, but take a look at Prismatic's Schema (https://github.com/Prismatic/schema), Graph (https://github.com/Prismatic/plumbing#graph-the-functional-swiss-army-knife), and Fnhouse (https://github.com/Prismatic/fnhouse).

Your mention of handling nulls brings the maybe monad to mind as well.
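
In Clojure, a lightweight stand-in for maybe-style short-circuiting is the built-in some-> macro, which stops threading as soon as a step returns nil. A rough sketch, assuming the JSON was parsed with string keys; parse-price and item-price are hypothetical helpers, not part of any library:

```clojure
(require '[clojure.string :as str])

(defn parse-price
  "Returns a BigDecimal, or nil when s is nil or not a valid number."
  [s]
  (try
    (some-> s str/trim bigdec)
    (catch NumberFormatException _ nil)))

(defn item-price
  "Nil-safe lookup: nil if the item, its price, or a parseable value is missing."
  [item]
  (some-> item (get "current-price") parse-price))

(item-price {"current-price" "3.01"}) ;; => 3.01M
(item-price {"description" "Fuji"})   ;; => nil (no price key)
(item-price {"current-price" "n/a"})  ;; => nil (malformed price)
(item-price nil)                      ;; => nil

;; keep drops the nils, so bad records fall out of the pipeline quietly.
(keep item-price [{"current-price" "3.01"} nil {"current-price" "n/a"}])
;; => (3.01M)
```

This gives graceful handling of missing data without a monad library, at the cost of not recording why a value was dropped.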

And I also like this handy little utility function:

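(defn get->>
  "Threads data through each fn in pattern, left to right."
  [data pattern]
  (reduce #(%2 %1) data pattern))

(defn refract
  "Applies each path in pattern (a vector of fns) to data.
  A map pattern yields a map; any other collection yields a vector."
  [data pattern]
  (if (map? pattern)
    (into {} (map #(vector (first %) (get->> data (second %))) pattern))
    (into [] (map #(get->> data (if (vector? %) % (vector %))) pattern))))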

Given data like yours above and a pattern (either a vector of fns or paths, or a map whose values are paths, where a path is a vector of fns), refract returns the results of applying those fns to the data.
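
For instance, a small usage sketch. The two functions are repeated so the snippet runs on its own, and the apples map is an abbreviated version of the input above. It assumes keys were keywordized on parse, since keywords can be called as functions while JSON string keys cannot:

```clojure
(defn get->> [data pattern]
  (reduce #(%2 %1) data pattern))

(defn refract [data pattern]
  (if (map? pattern)
    (into {} (map #(vector (first %) (get->> data (second %))) pattern))
    (into [] (map #(get->> data (if (vector? %) % (vector %))) pattern))))

(def apples
  {:fruit-category "apple"
   :fruit-category-id "2001"
   :fruit-items [{:fruit-item-id "2001-1" :description "Fuji"}
                 {:fruit-item-id "2001-2" :description "Honeycrisp"}]})

;; Map pattern: each value is a path, the result is keyed the same way.
(refract apples {:category   [:fruit-category]
                 :first-item [:fruit-items first :description]})
;; => {:category "apple", :first-item "Fuji"}

;; Vector pattern: bare fns and paths can be mixed.
(refract apples [:fruit-category [:fruit-items count]])
;; => ["apple" 2]

;; Keywords and first return nil on nil, so missing data yields nil, not an exception.
(refract {} {:first-item [:fruit-items first :description]})
;; => {:first-item nil}
```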

- Jordan

