--
You received this message because you are subscribed to the Google Groups "pygrametl-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pygrametl-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pygrametl-dev/a111c777-2e21-4152-ba4b-d47d3ea19c4a%40googlegroups.com.
Ray Goel | Software Engineer in Test
M: 647-906-5569
tbl = dtt.Table("browser",
"""
| bid:int (pk) | browser:text | os:text |
-----------------------------------------
| -1 | Unknown | Unknown |
| 1 | Firefox | Linux |
""")
tbl.ensure()
etl.runETL() # this could be pygrametl or any other ETL tool
expected = tbl + "| 2 | Chrome | Windows |"
expected.assertEqual()
Thanks for the quick response.
Since ETL pipeline testing is a comparatively new field, we are still figuring out the kinds of tests we want to write for the pipeline. However, some of the areas we have identified are as follows:
- Unit tests for each component (function) of the Transform stage. These are handled by the devs using pytest.
- Integration tests to make sure the different components interact with each other as expected. Each transform component we have writes data to either a BigQuery (BQ) table or a MySQL table, so these tests check that the schema of the table being written to is correct and run a quick sanity check of the data in it. (We are trying to implement dbt for this; however, a Python library that allows testers to assert expected data here would help. This is where I was thinking of putting pygrametl to use.)
- Feed all these tests into a continuous testing pipeline that mimics the ETL process as it runs in production (Airflow with the tests integrated at each node).
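The schema check and data sanity check from the second item could be sketched roughly as below. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the MySQL or BigQuery target, and the helper names `table_schema` and `check_table` are hypothetical, not part of pygrametl or dbt.

```python
import sqlite3


def table_schema(conn, table):
    """Return {column_name: declared_type} for a SQLite table."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2].upper() for row in rows}


def check_table(conn, table, expected_schema, min_rows=1):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    actual = table_schema(conn, table)
    if actual != expected_schema:
        problems.append(
            f"schema mismatch: expected {expected_schema}, got {actual}")
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if count < min_rows:
        problems.append(
            f"expected at least {min_rows} rows, found {count}")
    return problems


# Hypothetical target table, mirroring the "browser" example in this thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE browser (bid INTEGER PRIMARY KEY, browser TEXT, os TEXT)")
conn.executemany(
    "INSERT INTO browser VALUES (?, ?, ?)",
    [(-1, "Unknown", "Unknown"), (1, "Firefox", "Linux")])

expected = {"bid": "INTEGER", "browser": "TEXT", "os": "TEXT"}
print(check_table(conn, "browser", expected))
```

A function like `check_table` could be called from a pytest test or from a task placed after each transform node; against MySQL or BigQuery the schema lookup would have to use that system's own metadata queries instead of `PRAGMA table_info`.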
If pygrametl can be used to develop a framework that incorporates tests like these, it would be the first of its kind and would be really helpful!
Let me know if you need any more information. It would also be great to be a part of the development if such a framework is being developed.
Thanks,
Ray
On Mon, Dec 9, 2019 at 8:32 AM Christian Thomsen <c...@cs.aau.dk> wrote:
Hi,
Thank you for your interest in pygrametl.
The current version of pygrametl has no direct support for testing, but since it is code-based, it can be used with existing Python unit testing frameworks.
We are, however, working on a simple framework where you can easily define the pre- and post-conditions of the database tables and test that the post-condition holds after the ETL process finishes.
Is it something like this you are looking for? If you have ideas about how programmatic ETL testing should be done, we would be happy to hear about them.
Best regards,
Søren Kejser Jensen and Christian Thomsen
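As a rough sketch of the pre- and post-condition idea using only the standard `unittest` module: the example below assumes an in-memory SQLite database as the warehouse, and `run_etl` is a hypothetical stand-in for the real ETL flow. It is not the framework mentioned above, only an illustration of the same pattern.

```python
import sqlite3
import unittest


def run_etl(conn):
    """Hypothetical stand-in for the real ETL flow: loads one new row."""
    conn.execute("INSERT INTO browser VALUES (2, 'Chrome', 'Windows')")


class BrowserDimensionTest(unittest.TestCase):
    # Precondition: the rows that must be in the table before the ETL runs.
    PRECONDITION = [(-1, "Unknown", "Unknown"), (1, "Firefox", "Linux")]

    def setUp(self):
        # Establish the precondition in a fresh database for every test.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE browser "
            "(bid INTEGER PRIMARY KEY, browser TEXT, os TEXT)")
        self.conn.executemany(
            "INSERT INTO browser VALUES (?, ?, ?)", self.PRECONDITION)

    def tearDown(self):
        self.conn.close()

    def test_postcondition(self):
        run_etl(self.conn)
        # Postcondition: the precondition rows plus the newly loaded row.
        expected = self.PRECONDITION + [(2, "Chrome", "Windows")]
        actual = self.conn.execute(
            "SELECT bid, browser, os FROM browser ORDER BY bid").fetchall()
        self.assertEqual(actual, expected)
```

Saved as a module, this can be run with `python -m unittest`; swapping `run_etl` for a call into the actual pipeline keeps the rest of the test unchanged.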