Abstraction for basic storage and search services


Nils Henrik Lorentzen

May 21, 2019, 3:22:20 PM
to Eclipse MicroProfile

Hi, I am not sure this is the right place to ask, but I have lately been looking for projects that standardise cloud environments for Java, and this might be it.

Background: as a hobby and learning project, I have been working on a simple sales portal with faceted search, like Best Buy, Craigslist, or any other online shopping or ad portal.
Indexing and faceted search run on Elasticsearch or Lucene (for local tests); ad or article info is stored in S3 or on the local file system.

I have reasonably simple abstractions for these, so that I could move to another cloud provider or search engine.

Full-text/attribute search with facets and cheap object storage are two common scenarios that many cloud apps would use. Storing an ad or item with photos in S3 or Blob Storage is very cheap compared to other options.

So, the question: are APIs like these something that could be standardised, or is that well out of scope? Are you mostly concerned with the runtime environment/programming model?
Perhaps storage APIs should also be standardised, but in a different umbrella project.

My thinking would be to cover more than 75% of the use cases, even though Elasticsearch and similar engines (Solr, Algolia?) have very powerful aggregation features beyond faceted search. Covering most cases through simple standardised APIs does not preclude calling these services directly for the difficult cases.

For object storage, there are at least Amazon S3 and Azure Blob Storage. They have slightly different characteristics: optimistic concurrency (HTTP If-Match) seems to be possible in the latter but not in the former. So there are slight differences that might make common abstractions difficult, unless one goes for the lowest common denominator.
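One way to picture the lowest-common-denominator approach is a minimal base interface plus an optional capability interface for stores that support conditional writes. The sketch below is only an illustration: `ObjectStore`, `ConditionalObjectStore`, `putIfMatch` and the in-memory implementation are hypothetical names, not part of any existing specification or vendor SDK, and the version tag merely stands in for an HTTP ETag.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical lowest-common-denominator object store: plain put/get/delete. */
interface ObjectStore {
    void put(String key, byte[] data);
    Optional<byte[]> get(String key);
    void delete(String key);
}

/** Hypothetical optional capability for stores that offer conditional writes. */
interface ConditionalObjectStore extends ObjectStore {
    /** Current version tag of the object (a stand-in for an ETag), if it exists. */
    Optional<String> versionOf(String key);

    /** Writes only if the stored version still equals expectedVersion. */
    boolean putIfMatch(String key, byte[] data, String expectedVersion);
}

/** In-memory stand-in for local tests; not a real cloud client. */
class InMemoryObjectStore implements ConditionalObjectStore {
    private static final class Entry {
        final byte[] data;
        final long version;
        Entry(byte[] data, long version) {
            this.data = data;
            this.version = version;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    @Override
    public void put(String key, byte[] data) {
        // Unconditional write: always succeeds and bumps the version.
        entries.compute(key, (k, old) ->
                new Entry(data.clone(), old == null ? 1 : old.version + 1));
    }

    @Override
    public Optional<byte[]> get(String key) {
        Entry e = entries.get(key);
        return e == null ? Optional.empty() : Optional.of(e.data.clone());
    }

    @Override
    public void delete(String key) {
        entries.remove(key);
    }

    @Override
    public Optional<String> versionOf(String key) {
        Entry e = entries.get(key);
        return e == null ? Optional.empty() : Optional.of(Long.toString(e.version));
    }

    @Override
    public boolean putIfMatch(String key, byte[] data, String expectedVersion) {
        boolean[] written = {false};
        // computeIfPresent runs atomically per key, so check-and-write cannot interleave.
        entries.computeIfPresent(key, (k, old) -> {
            if (!Long.toString(old.version).equals(expectedVersion)) {
                return old; // someone else wrote in between: keep the stored value
            }
            written[0] = true;
            return new Entry(data.clone(), old.version + 1);
        });
        return written[0];
    }
}
```

Application code that only needs cheap storage would depend on `ObjectStore` alone; code that needs optimistic concurrency could check `store instanceof ConditionalObjectStore` and fall back or fail otherwise.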

I am getting into details here, but my thinking is: if I am to implement a cloud service with storage, I will have high-level requirements like "It should be cheap, does not have to be relational and does not require atomicity". If a standardised Java cloud environment provided an API for that and pointed to the implementing services (Azure, AWS, Google, ...), that would make things easier.

The issue is that this could result in too many such APIs, as was brought up in a prior post here. However, given the varied and specialized nature of cloud features, that seems difficult to avoid.

A key to helping the situation could be a combined tutorial/navigator for narrowing down what one is looking for, with task, use case and example based info ("I want to do Xyz", e.g. search and categorisation, like a sales portal; or store a large amount of data cheaply; or process a lot of realtime data statelessly before storage). For each API, how would it be used in some familiar use case application implemented in the cloud? E.g. how can the metrics spec be utilised in a sales portal?

With an introduction through real-world examples, standardised Java cloud APIs could be a starting point for learning about the various cloud features from an application viewpoint. An "IKEA kitchen planner" for one's cloud app through the lens of standardised APIs, if you will.
Then one looks for providers, rather than first trying to navigate the jungle of providers and their opaquely named services, trying to make sense of them, and only thereafter looking for how to avoid vendor lock-in.

This email was a bit all over the map, but I hope the gist of it is understandable.

Regards,
Nils Henrik Lorentzen

Nils Henrik Lorentzen

May 22, 2019, 3:39:24 AM
to Eclipse MicroProfile

Hi again, answering myself here, but after thinking about it, I'd argue that basic storage APIs are something one could consider standardising over time; there is, after all, JPA for relational databases. JPA does not cover all of SQL, but it covers CRUD and a lowest common denominator of query functionality.

I might see if I can refine the API for free-text/attribute/faceted search and submit it as a proposal; it could of course also be useful outside of microservices (i.e. in a monolithic application in the cloud, or even deployed on-premise).
Generally speaking, it works somewhat like JPA: one annotates Java beans to mark whether a field contains free-text information or whether an attribute is facetable (like Price, Make and Model, or Odometer for cars on Craigslist).
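To make the JPA comparison concrete, here is a rough sketch of what such annotations might look like. The `@FreeText` and `@Facet` annotations, the `CarAd` bean and the `IndexMetadata` helper are hypothetical names made up for illustration; they are not the actual API of the project described here or of any existing library.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker: the field holds free text to be indexed for full-text search. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface FreeText {}

/** Hypothetical marker: the field is an attribute that can be offered as a facet. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Facet {
    String displayName() default "";
}

/** Example bean for a car ad, annotated in the way the post describes. */
class CarAd {
    @FreeText String description;
    @Facet(displayName = "Make") String make;
    @Facet(displayName = "Price") int price;
    @Facet(displayName = "Odometer") int odometer;
    String sellerEmail; // stored, but neither searchable nor facetable
}

/** Reads the annotations via reflection, as an indexing layer might do at startup. */
class IndexMetadata {
    static List<String> facetFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Facet.class)) {
                names.add(field.getName());
            }
        }
        return names;
    }

    static List<String> freeTextFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(FreeText.class)) {
                names.add(field.getName());
            }
        }
        return names;
    }
}
```

An implementation backed by Elasticsearch or Lucene could use this metadata to decide which fields are analysed as text and which are indexed as facet values, without the bean depending on either engine.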

Now for NoSQL stores and indexing, there are more scenarios to cover due to the various cost/speed/feature tradeoffs among stores. On the other hand, NoSQL stores are far less complex than relational databases, so each API would be simpler; JPA has quite the learning curve. Also, one could share some interfaces: CRUD for a document DB and for an indexing search engine could use the same method signatures for storing (possibly annotated) Java bean POJOs.
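The idea of sharing method signatures could look roughly like the following sketch, in which a search index simply extends the same CRUD interface that a document store implements. All names (`EntityStore`, `SearchIndex`, `Stores.storeInAll` and the in-memory classes) are assumptions for illustration, and the substring match is only a placeholder for a real full-text query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical shared CRUD contract for document stores and search indexes. */
interface EntityStore<T, ID> {
    void store(ID id, T entity);
    Optional<T> retrieve(ID id);
    void delete(ID id);
}

/** A search index reuses the CRUD signatures and adds querying on top. */
interface SearchIndex<T, ID> extends EntityStore<T, ID> {
    List<ID> search(String freeText);
}

/** In-memory document store, standing in for a real document DB. */
class InMemoryDocumentStore<T, ID> implements EntityStore<T, ID> {
    protected final Map<ID, T> entities = new LinkedHashMap<>();

    @Override
    public void store(ID id, T entity) { entities.put(id, entity); }

    @Override
    public Optional<T> retrieve(ID id) { return Optional.ofNullable(entities.get(id)); }

    @Override
    public void delete(ID id) { entities.remove(id); }
}

/** In-memory index using a naive case-insensitive substring match. */
class InMemorySearchIndex<T, ID> extends InMemoryDocumentStore<T, ID>
        implements SearchIndex<T, ID> {
    private final Function<T, String> textOf;

    InMemorySearchIndex(Function<T, String> textOf) { this.textOf = textOf; }

    @Override
    public List<ID> search(String freeText) {
        String needle = freeText.toLowerCase();
        List<ID> hits = new ArrayList<>();
        for (Map.Entry<ID, T> entry : entities.entrySet()) {
            if (textOf.apply(entry.getValue()).toLowerCase().contains(needle)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}

/** Because the signatures are shared, one call can write to every backend. */
class Stores {
    @SafeVarargs
    static <T, ID> void storeInAll(ID id, T entity, EntityStore<T, ID>... targets) {
        for (EntityStore<T, ID> target : targets) {
            target.store(id, entity);
        }
    }
}
```

With this shape, the code that saves an ad does not need to know whether a given target is the document store or the index, which is the kind of reuse the shared signatures are meant to allow.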

Regards,
Nils Henrik Lorentzen

Nils Henrik Lorentzen

May 23, 2019, 2:53:59 AM
to Eclipse MicroProfile

Never mind, I found jnosql.org, which is also under the Jakarta EE umbrella, and filed issues on their GitHub project for this.

This sales portal project of mine could perhaps be made into a nice demo for Jakarta EE microservices, as it is a well-known use case, but that is not a priority right now.

Regards,
Nils Henrik Lorentzen