Hi Thibaut,
Thanks for this great answer!
This is the case that I was looking for:
> - the row should be removed from the pipeline if I already processed a row
> with the same value for a given field
But the other one interests me as well, if you can provide an example
of that.
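(For anyone else reading along, the "already processed a row" case boils down to a transform that remembers the field values it has seen and drops repeats. A plain-Ruby sketch of the idea — the class, the `process` method shape, and the `:email` field are all illustrative, not activewarehouse-etl's actual API:)

```ruby
# Drop any row whose value for a given field was already seen
# earlier in the pipeline (in-memory dedup, illustration only).
class SeenFieldFilter
  def initialize(field)
    @field = field
    @seen  = {}
  end

  # Returns the row, or nil if a row with the same field value
  # was already processed.
  def process(row)
    key = row[@field]
    return nil if @seen[key]
    @seen[key] = true
    row
  end
end

filter = SeenFieldFilter.new(:email)
rows = [
  { :email => "a@example.com", :name => "Ann" },
  { :email => "b@example.com", :name => "Bob" },
  { :email => "a@example.com", :name => "Ann again" }
]
# Only the first row per unique :email survives.
kept = rows.map { |r| filter.process(r) }.compact
```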
To answer your other question...
Yes! That is much clearer, thank you. I was playing around with the
order of things, and it was confusing that it only changed the
outcome in some cases.
This is a great project, by the way. It is extremely useful and I
cannot believe it's not better known. It's useful for more than just
a DW; in some cases I am just using it to import large CSV files or
to migrate from a legacy database.
Cheers!
On Apr 3, 3:55 pm, Thibaut Barrère <thibaut.barr...@gmail.com> wrote:
> Hi Scott!
>
> > Is there any way to check the uniqueness after other transforms have been
> > completed?
>
> There are plenty of situations and plenty of ways to do that actually!
>
> Can you elaborate a bit more on which result you'd like to achieve?
>
> Ex:
> - the row should be removed from the pipeline if there is already a record
> in the database with the same value of a given field
> - the row should be removed from the pipeline if I already processed a row
> with the same value for a given field
> - etc
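
(Editor's note: the first case in that list — skipping a row when the database already holds a record with the same value for a given field — could be sketched in plain Ruby like this. The `db_has?` lookup stands in for a real query against the target table; all names here are illustrative, not the gem's actual API.)

```ruby
require 'set'

# Pretend database contents: emails already loaded into the target table.
EXISTING_EMAILS = Set.new(["a@example.com"])

# Stand-in for a real lookup, e.g. a SELECT on the unique column.
def db_has?(value)
  EXISTING_EMAILS.include?(value)
end

# Remove rows whose field value already exists in the "database".
def filter_against_db(rows, field)
  rows.reject { |row| db_has?(row[field]) }
end

rows = [
  { :email => "a@example.com" },  # already present -> dropped
  { :email => "c@example.com" }   # new -> kept
]
kept = filter_against_db(rows, :email)
```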
>
> Let me know and I'll give you an accurate answer.
>
> > How do you know the order of execution of things?
>
> This is something that really needs a "guide" and this will be addressed
> (i.e. control file lifecycle, what is executed when, etc.).
>
> Roughly:
> - there's a first pass (declaration time, when the control file is loaded)
> where the sources, transforms and destinations are declared (see
> https://github.com/activewarehouse/activewarehouse-etl/blob/master/li...)
> - then there's a second pass handled by the "engine" which will fetch rows
> from the sources, go through the transforms etc, then to the destinations
> (see
> https://github.com/activewarehouse/activewarehouse-etl/blob/master/li...)
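
(Editor's note: the two passes described above can be illustrated with a toy engine — pass 1 merely records the declared steps, pass 2 pumps rows through them. The class and method names below are made up for illustration and are not the gem's real engine code.)

```ruby
class MiniEngine
  def initialize
    @source      = nil
    @transforms  = []
    @destination = nil
  end

  # --- pass 1: declaration time (the control file being loaded) ---
  def source(rows);      @source = rows;      end
  def transform(&blk);   @transforms << blk;  end
  def destination(sink); @destination = sink; end

  # --- pass 2: the engine fetches rows, runs transforms, writes out ---
  def run
    @source.each do |row|
      row = @transforms.inject(row) { |r, t| t.call(r) }
      @destination << row
    end
  end
end

out = []
engine = MiniEngine.new
engine.source([{ :name => "scott" }, { :name => "thibaut" }])
engine.transform { |row| { :name => row[:name].capitalize } }
engine.destination(out)
engine.run
# out => [{:name=>"Scott"}, {:name=>"Thibaut"}]
```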