Loading Demo Data through API instead of database inserts

Paweł Nawrocki

Feb 2, 2017, 10:55:54 AM
to OpenLMIS Dev
Hello everyone!

Recently we've noticed that some of the demo data is invalid from a business point of view: it doesn't breach any database constraints, but it contains invalid entity state (for example, submitted requisitions with a null or invalid price per pack in their items). This leads to errors when interacting with the data, and makes the data difficult to change (one change often needs to be made in several places).

To avoid such problems in the future, we've come up with the idea of loading demo data through the API, so that the resources have to pass the required validation and invalid data can't be inserted into the database. The obvious downside is that we would not be able to make it part of a service's build process. We're not doing that at the moment anyway (and I guess we ultimately don't want demo data to be loaded always and by default), so the ability to prevent invalid data from being inserted (which seems to happen a lot) seems much more valuable for development (and further usage).

My main idea is to make it a separate project that would run in its own container, with the whole demo data inside, and would make all the API calls required to insert it after the other services have booted. It would be attached to Blue's docker-compose file and inject the data once each service has started. This way we'd separate the demo data logic (which is currently copy-pasted in each service) from the services themselves, and we'd also be able to create some kind of Jenkins task that checks the data validity automatically.
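
A minimal sketch of what such a loader container might run, assuming each demo resource is a JSON payload paired with the URL it should be POSTed to. The function names, the retry settings, and the shape of the `resources` list are illustrative assumptions, not existing OpenLMIS code, and authentication is left out entirely.

```python
import json
import time
import urllib.request


def http_post(url, payload):
    """POST one JSON resource; urlopen raises HTTPError on a 4xx/5xx reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def wait_until_up(is_up, retries=30, delay=2.0, sleep=time.sleep):
    """Poll is_up() until it returns True or the retries run out."""
    for _ in range(retries):
        if is_up():
            return True
        sleep(delay)
    return False


def load_demo_data(resources, post=http_post):
    """Send each (url, payload) pair in order and collect the failures.

    A rejected payload is recorded rather than aborting the run, so a single
    pass reports every resource that did not pass API validation.
    """
    failures = []
    for url, payload in resources:
        try:
            post(url, payload)
        except Exception as error:  # e.g. HTTPError when validation rejects it
            failures.append((url, str(error)))
    return failures
```

A Jenkins task could run the same loader against a freshly started Blue and fail the build whenever `load_demo_data` returns a non-empty list.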

Best Regards,
Paweł

Darius Jazayeri

Feb 2, 2017, 11:25:18 AM
to Paweł Nawrocki, OpenLMIS Dev
In my experience on OpenMRS, loading a sufficient amount of demo data via the API is too slow. (I'm talking about the "as a user who downloaded the software and chose 'yes I want demo data' in the installer" story, so I don't know if it's 100% relevant here.)

One approach is to continue to define the demo data as a SQL script, but to have a CI plan that runs the application-level validation on all data after it has been inserted.

-Darius

--
You received this message because you are subscribed to the Google Groups "OpenLMIS Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openlmis-dev+unsubscribe@googlegroups.com.
To post to this group, send email to openlm...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openlmis-dev/b21793f7-93e3-47b0-931e-a103ccefd63d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Darius Jazayeri
Principal Architect - Global Health
ThoughtWorks

Josh Zamor

Feb 2, 2017, 12:32:47 PM
to Darius Jazayeri, Paweł Nawrocki, OpenLMIS Dev
Hi Darius, do you have a ballpark of how much demo data this was, and how slow "too slow" was for OpenMRS? And how does OpenMRS validate it in CI?

The best tool we currently have in CI for this is Contract Tests, though I can imagine a number of edge cases that they can’t currently catch, given the Services’ code. For example, suppose a Facility is no longer inserted in the Reference Data Service, but SQL still inserts a Requisition in an authorized status for that Facility in the Requisition Service. I doubt the Requisition Service re-checks the validity of that invariant (the facility the Requisition was created for) after the Requisition has been created.
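
The edge case above can be checked directly against the demo data files. This is a minimal sketch, assuming requisition rows carry an `id` and a `facilityId` field; both field names are hypothetical and may not match the actual demo data.

```python
def find_dangling_references(requisitions, facility_ids):
    """Return (requisition id, facility id) pairs whose facility is unknown."""
    known = set(facility_ids)
    return [
        (requisition["id"], requisition["facilityId"])
        for requisition in requisitions
        if requisition["facilityId"] not in known
    ]
```

An empty result means every requisition points at a facility that the reference data still defines.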


Darius Jazayeri

Feb 2, 2017, 2:25:10 PM
to Josh Zamor, Paweł Nawrocki, OpenLMIS Dev
Actually OpenMRS creates demo data via API calls, which we now regret. It adds ~10 minutes to the installation when someone chooses to add demo data.

The OpenMRS API has a Validator class for each domain object, so you can programmatically say "validate everything".

If we were to take this approach in OpenLMIS, we'd want each service to support an endpoint like "POST .../validate-data", which goes through and validates all the data owned by that service, and reports anything invalid.
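
A minimal sketch of the "validate everything" idea, assuming one validator function per entity type and entities held as plain dictionaries. The registry, the `requisitionLineItem` type name, and the rule that a price per pack must be a non-negative number are all assumptions made for illustration; this is neither the OpenMRS Validator API nor existing OpenLMIS code.

```python
VALIDATORS = {}


def validator(entity_type):
    """Register the decorated function as the validator for entity_type."""
    def register(func):
        VALIDATORS[entity_type] = func
        return func
    return register


@validator("requisitionLineItem")
def validate_line_item(item):
    """Return a list of error messages; an empty list means the item is valid."""
    errors = []
    price = item.get("pricePerPack")
    if not isinstance(price, (int, float)) or price < 0:
        errors.append("pricePerPack must be a non-negative number")
    return errors


def validate_all(entities_by_type):
    """Run every registered validator and report only the invalid entities."""
    report = []
    for entity_type, entities in entities_by_type.items():
        check = VALIDATORS.get(entity_type)
        if check is None:
            continue  # no validator registered for this type
        for entity in entities:
            errors = check(entity)
            if errors:
                report.append(
                    {"type": entity_type, "id": entity.get("id"), "errors": errors}
                )
    return report
```

A "POST .../validate-data" endpoint could return the report from `validate_all` as its response body.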

Josh Zamor

Feb 2, 2017, 2:46:42 PM
to Darius Jazayeri, Paweł Nawrocki, OpenLMIS Dev
Agreed, I think entities validating themselves has a number of uses beyond just demo data. I’ve used that pattern before and quite liked it. Paweł, maybe you could comment on how far you think the Requisition service is from being able to do that: validating, at will, that every Requisition, line item, template, etc. is valid (fields, business logic, cross-service references, etc.)? We can assume we’re talking post-3.0 here.

To throw out another idea: what about a container that loads the demo data via the API (against a running Blue), generates a SQL snapshot (pg_dump of the data), and publishes an image with that SQL, which could be loaded directly into the DB? I don’t believe we’re using sequences anywhere, so perhaps it’d be less brittle and more performant.
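
A minimal sketch of the snapshot step, assuming the demo data has already been loaded through the API and `pg_dump` is on the path. The database name, host, user, and output file are placeholders; the `pg_dump` options used (`--host`, `--username`, `--data-only`, `--column-inserts`, `--file`) are standard ones.

```python
import subprocess


def build_pg_dump_command(database, output_file, host="localhost", user="postgres"):
    """Build a pg_dump call that exports rows only, as INSERT statements."""
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--data-only",        # skip schema; the services create their own tables
        "--column-inserts",   # INSERTs with explicit column names
        "--file", output_file,
        database,
    ]


def dump_demo_data(database, output_file, **connection):
    """Run pg_dump and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_pg_dump_command(database, output_file, **connection), check=True)
```

Dumping data only keeps the snapshot independent of how each service creates its schema, at the cost of requiring the schema to exist before the SQL is replayed.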

Łukasz Lewczyński

Feb 3, 2017, 5:48:46 AM
to openlm...@googlegroups.com

Hi,

The current demo data is very hard to read, which makes it hard to find out why some things do not work in the system. The JSON files do not represent class objects but rows in a table (the property names differ). For example, to find out that a user did not have the correct rights I had to go through several files, copy object IDs, look for the matching objects in another demo data file, and so on. I think I lost 30 to 60 minutes on that. That is my two cents.
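
The manual ID chasing described above could be shortened with a small lookup helper. This is a minimal sketch, assuming each demo data file is a JSON array of objects that each carry an `id` field; that layout and the helper names are assumptions, not a description of the actual files.

```python
import json
from pathlib import Path


def load_json_files(demo_data_dir):
    """Read every *.json file in a directory into {file name: parsed rows}."""
    return {
        path.name: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(demo_data_dir).glob("*.json"))
    }


def build_id_index(rows_by_file):
    """Map every row id to the (file name, row) it was defined in."""
    index = {}
    for file_name, rows in rows_by_file.items():
        for row in rows:
            index[row["id"]] = (file_name, row)
    return index


def describe(index, row_id):
    """Return a one-line description of where an id is defined."""
    if row_id not in index:
        return f"{row_id}: not found in any demo data file"
    file_name, row = index[row_id]
    return f"{row_id}: defined in {file_name} as {row}"
```

With the index built once, following a reference from one file to another becomes a dictionary lookup instead of a search through every file.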

Regards,
Łukasz

Łukasz Lewczyński
Software Developer
llewc...@soldevelo.com

SolDevelo Sp. z o. o. [LLC]
Office: +48 58 782 45 40 / Fax: +48 58 782 45 41 Al. Zwycięstwa 96/98 81-451, Gdynia
http://www.soldevelo.com

Place of registration: Regional Court for the City of Gdansk KRS: 0000332728, TAX ID: PL5862240331, REGON: 220828585, Share capital: 60,000.00 PLN

Darius Jazayeri

Feb 3, 2017, 11:12:47 AM
to Łukasz Lewczyński, OpenLMIS Dev
To summarize at a high level:
  • Demo data should be valid/consistent, and we need to do something to ensure this is the case
  • The process for creating demo data should be easy for devs to understand and update
  • When we have official releases of OpenLMIS, loading demo data should be fast
  • Being able to validate any (or all) domain objects at any time is useful (independently of demo data)
---

There could be many ways to solve this. One approach is:
  • Generate demo data in code, calling the API (easy for devs, and guarantees consistency)
  • As part of the release process, export this generated demo data as a SQL dump.
    • Technically this doesn't have to match the demo data that devs work with daily, so you could even just dump the staging/UAT/demo server's database, allowing for BA/POs to create demo data manually
    • Could do this more often than every release, via CI
  • In between releases devs can load demo data via code (may be slow, but this doesn't matter)
  • Independently, add a story to the backlog about validating data on the fly
-Darius

Mary Jo Kochendorfer

Feb 3, 2017, 3:00:03 PM
to Darius Jazayeri, Łukasz Lewczyński, OpenLMIS Dev

I agree with the high-level summary points that Darius put together. I want to stress the importance of an easy and reliable/valid way of changing/updating demo data. We will have many demos and will need to keep changing/updating the data to showcase different aspects/features of OpenLMIS.

Right now it is painful and difficult to figure out where the issues in the data are. For example, OMIS-1790 is a great case of how the readme documentation sometimes does not reflect the data. Identifying where the issue is can be quite challenging given the UUIDs, and we don’t want to constantly spend so much time troubleshooting the data. In conjunction with the approach we take for loading the data, I do think we need to create a process or instructions for ensuring that all the associated data is created/implemented for each change.

In addition, we may eventually need different demo data sets. It would be great if our solution allowed us to easily store different sets of data so that we can swap them out with little effort.

HTH,

Mary Jo

Paweł Nawrocki

Feb 6, 2017, 5:31:09 AM
to OpenLMIS Dev, djaz...@thoughtworks.com, llewc...@soldevelo.com
We need to keep in mind, though, that one of our initial expectations was to make demo data easy to update for non-devs, and creating it in code would make that really hard to achieve. The other expectation was to make it easily usable for API calls in real time. That's why we chose JSON: it is both easy to learn and a ready-made API call. So we actually have (almost) ready data for making a set of API calls, at least to check its validity. That said, I think some work is needed before the services can test the validity of all this data.

Josh Zamor

Feb 6, 2017, 10:27:49 AM
to Paweł Nawrocki, OpenLMIS Dev, djaz...@thoughtworks.com, llewc...@soldevelo.com
Paweł, I wouldn't expect us to write a bunch of code to generate the JSON we have. Of course there'd be changes to align the JSON with the RESTful API, and eventually we also want to solve the problem of making demo data easier for non-devs to create, but I don't think anyone is recommending we build a library that devs would use to build largely the same JSON we have for the purpose of data validity.