Yeah, Egon has some great ideas, and reading his code replies to my previous posts is where I noticed this particular opportunity.
I agree that there are a lot of approaches to dealing with unstructured data.
My intent was narrower: to show how easy it is to reimplement something akin to hasOwnProperty from the JS world in Go.
This is not to say that either approach mentioned by Egon is in any way deficient; it's just that the particular use case of needing an object or value at some arbitrary depth in unstructured data does not appear to be addressed anywhere that I have been able to find.
I showed an example using permissions since that was the thing that was buggered up when I started, but consider a different use case for a moment.
When you store a document in Mongo you tend to store the whole document; these things do not lend themselves to relational structuring very easily and can get heavy if you're not careful.
Again, I'm not addressing the stupidity of the idea of trying to store *.world+dog, but merely: what do you do when you are presented with a need to extract something particular from an unstructured document containing dog+world, at completely arbitrary depth? You can say things like "just make a struct that addresses it" or "you shouldn't do it that way in the first place!", but that's not helpful when what you have has been handed to you and your job is just to make it work.
Use case...
There are custom forms created using a custom tool that runs client-side, built on top of Angular.
Any form can contain any field, and those fields can and do contain subforms to any arbitrary depth.
This allows for the creation of business forms that meet regulatory compliance by stringing them together from other forms that also meet compliance. In other words, the person creating the forms does not want to have to reinvent the wheel; to their mind they are just adding new pages to existing forms.
These were persisted wholesale in Node, and now that the "legacy" Node app is being replaced, we need to ensure that anything Node may have returned previously is also returned by Go.
In this particular case, the user has a custom reporting system that needs to extract data from field z of subform y of form x, which itself is embedded in form n (and n could be embedded in l, etc.).
What you have, as far as information about the thing you need to return, is a document id generated by Mongo during the last upsert, and a command coming from the browser that looks like:
[query: {collection: 'forms', _id: 100, path:'n.x.y.z' },...]
The server neither knows nor cares about the type of data stored there; its job is to isolate the requested field(s), extract the data from those fields, and pass it back to the caller, assuming they had permission to read it :).
This was handled in Node with nothing more than (pseudocode follows):
document = db.fetchOne(query._id)
if (document.hasOwnProperty(query.path)) {
    res.write(JSON.stringify(document[query.path]))
} else {
    res.send(404)
}
I can't fathom what that would need to look like in Go using the recommended or idiomatic approach, but replacing that lovely little tidbit was a task way up high on my to-do list.
The replacement code now looks about the same (again, pseudocode here):
collection.Find(bson.M{"_id": query.ID}).One(&document)
section := json.HasOwnProperty(query.Path, &document)
if section != nil {
    data, _ := json.Marshal(section)
    fmt.Fprintf(w, "%s", data)
} else {
    http.Error(w, "not found", 404)
}
Obviously the "correct" answer here is to dedup and normalize the data; however, the goal of the project is to unify disparate information systems while having the smallest impact possible on existing systems and maintaining current levels of data integrity. The easiest path to maintaining data integrity is to leave legacy data lying where it sits while gradually phasing in replacement front ends and ensuring new data ends up stored in a more coherent fashion. Eventually the old data ends up in long-term storage anyway, but the retention period on some of these records is 10 years before they go to archives.
I'm in a position where I didn't make the data that I'm expected to deal with, but I have to ensure that every user of that data has at least the same quality of experience they do now.
On the other hand I have no desire to take 1,000 lines of node and convert it into 100k lines of Go just to cover every edge case.
My instincts, which are informed by over a decade of coming in and cleaning up after first-year college freshmen, H1Bs who couldn't spell SQL without "My" in front of it, and *shudder* PHP devs, tell me that things like I've described tend to be the norm after any "super pumped, agile, web 2.0, html11, kitchen sink" project reaches more than about 6 months of development.
I have a feeling that as Node and other "web server" tech becomes supplanted by Go, things like this are going to come to light pretty frequently, and my goal with this posting was to leave a trace for anyone in the future bumping up against similar things. Hope it's helpful to someone.