Greetings,
I manage an hourly data pipeline that fetches new orders and orders with an updated shipping date.
The filter looks something like:
CreatedDateUtc ge {now - 1hr}
and CreatedDateUtc lt {now}
or
ShippingDateUtc ge {now - 1hr}
and ShippingDateUtc lt {now}
where {now} is passed in by our pipeline orchestration (specifically, the Airflow execution time).
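For reference, here is a minimal sketch of how a filter like the one above could be built for a given {now}. The function name `build_filter` and the ISO 8601 `Z` timestamp literal are assumptions for illustration; the literal syntax the API actually expects may differ. The parentheses only make explicit the grouping that OData precedence (`and` binds tighter than `or`) already gives the filter.

```python
from datetime import datetime, timedelta, timezone


def build_filter(now: datetime, window: timedelta = timedelta(hours=1)) -> str:
    """Build a half-open [now - window, now) filter on both date fields.

    Assumes `now` is already in UTC and that the API accepts
    ISO 8601 timestamps with a trailing 'Z' (an assumption).
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    lo = (now - window).strftime(fmt)
    hi = now.strftime(fmt)
    return (
        f"(CreatedDateUtc ge {lo} and CreatedDateUtc lt {hi})"
        f" or (ShippingDateUtc ge {lo} and ShippingDateUtc lt {hi})"
    )


if __name__ == "__main__":
    print(build_filter(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)))
```

Because the window is half-open, consecutive hourly runs neither overlap nor leave a gap, as long as each run receives exactly the previous run's {now} plus one hour.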
Our pipeline currently follows the odata.nextLink property and uploads the raw responses to S3 for processing. Based on our logs, a small percentage of orders are not retrieved by the run in which the CreatedDateUtc clause should match them, but they are retrieved by a later run when the ShippingDateUtc clause matches. The logs show that for each query we followed every nextLink until none was returned, so every available page was parsed.
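The paging logic described above amounts to roughly the following. This is a simplified sketch, not our actual code: `iter_pages` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever performs the HTTP GET and returns the parsed JSON body.

```python
from typing import Callable, Iterator, Optional


def iter_pages(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield each response page, following odata.nextLink until it is absent."""
    url: Optional[str] = first_url
    while url:
        page = fetch(url)  # hypothetical: GET the URL and parse the JSON body
        yield page
        # Stop when the response carries no odata.nextLink property.
        url = page.get("odata.nextLink")


if __name__ == "__main__":
    # Fake two-page result set, used only to show the loop terminating.
    fake = {
        "page1": {"value": [{"ID": 1}], "odata.nextLink": "page2"},
        "page2": {"value": [{"ID": 2}]},
    }
    for page in iter_pages("page1", fake.__getitem__):
        print(page["value"])
```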
Currently I am wondering if:
- Something is wrong with the filter
- There is a delay in orders being applicable for the filter
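One way to test the second hypothesis would be to widen each run's lookback by a safety margin and de-duplicate the results, then check whether the previously missing orders show up. The sketch below is only an illustration under assumptions: the 15-minute lag is a guess, and the `ID` field name used as the order key is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def window_with_lag(
    now: datetime,
    window: timedelta = timedelta(hours=1),
    lag: timedelta = timedelta(minutes=15),  # assumed margin, not a known value
) -> Tuple[datetime, datetime]:
    """Return (start, end) where start reaches `lag` further back than usual."""
    return now - window - lag, now


def dedupe(orders: Iterable[dict], key: str = "ID") -> List[dict]:
    """Keep the last-seen record per order key ('ID' is a hypothetical field)."""
    latest = {}
    for order in orders:
        latest[order[key]] = order
    return list(latest.values())


if __name__ == "__main__":
    start, end = window_with_lag(datetime(2024, 1, 1, 12, 0))
    print(start, end)
    print(dedupe([{"ID": 1, "v": "a"}, {"ID": 2, "v": "b"}, {"ID": 1, "v": "c"}]))
```

If orders with a CreatedDateUtc inside the extra margin appear only in the later run, that would point to a delay before new orders become visible to the filter rather than a problem with the filter itself.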
We are working on converting this pipeline to use the export flag for new orders, but until then I am looking for insight into the observed behavior.
Best,
Erik