I just read this pros/cons list of NATS vs STAN semantics. [0]
I'm not sure which approach best fits my application, which is a
web scraper (I know, I know, 😢) structured like an ETL pipeline. I'm
not even 100% sold on NATS. I need observability of each stage in the
pipeline, and while replay would be nice, a lot of the data in question
is too large to fit in a 1MB message, so it will end up in bucket
storage anyway. Request/reply semantics at the edge might be useful.
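
To make the size problem concrete, here is a rough sketch of what I have
in mind: small payloads travel inline, large ones go to bucket storage
and the message carries only a reference. Everything here is made up for
illustration: InMemoryBucket stands in for real bucket storage,
build_message/read_body and the envelope fields are hypothetical, and
the 512KB inline cutoff is just an assumption that leaves room for
base64 overhead under the 1MB limit.

```python
import base64
import hashlib
import json

MAX_MESSAGE = 1024 * 1024   # the 1MB message limit mentioned above
INLINE_LIMIT = 512 * 1024   # assumed cutoff; base64 grows data by about a third


class InMemoryBucket:
    """Hypothetical stand-in for bucket storage (S3, MinIO, ...)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]


def build_message(stage, body, bucket):
    """Return the bytes to publish: the body inline if small, else a reference."""
    if len(body) <= INLINE_LIMIT:
        envelope = {"stage": stage,
                    "inline": base64.b64encode(body).decode("ascii")}
    else:
        # Content-addressed key, so storing the same page twice is harmless.
        key = f"{stage}/{hashlib.sha256(body).hexdigest()}"
        bucket.put(key, body)
        envelope = {"stage": stage, "ref": key, "size": len(body)}
    return json.dumps(envelope).encode("utf-8")


def read_body(message, bucket):
    """Recover the original body from a message built by build_message."""
    envelope = json.loads(message)
    if "ref" in envelope:
        return bucket.get(envelope["ref"])
    return base64.b64decode(envelope["inline"])
```

With this shape the broker only ever sees small envelopes, so replaying
a stage would mean replaying references rather than the raw pages.
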
I was planning to use AWS EventBridge bound to an SQS queue for each
stage in the pipeline, with a Kinesis Firehose binding as well to store
raw data in an S3 bucket for long-term analysis. I'm now pivoting away
from AWS-specific products toward something Kubernetes-native. I think I
can get a similar experience with either NATS or STAN, since both are
pub-sub systems that can also be used with queue semantics.
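
By "queue semantics" I mean something like the toy model below: plain
subscribers each get every message (my observability tap), while
subscribers sharing a queue group split the messages between them (my
stage workers). This is an in-memory sketch, not the NATS client API.
ToyBroker and the "pages.fetched" subject are hypothetical, and I use
round-robin to keep it deterministic, whereas as I understand it NATS
picks a group member at random.

```python
from collections import defaultdict
from itertools import count


class ToyBroker:
    """Toy in-memory model of pub-sub with queue groups; not a NATS client."""

    def __init__(self):
        self.plain = defaultdict(list)                        # subject -> handlers
        self.groups = defaultdict(lambda: defaultdict(list))  # subject -> group -> handlers
        self.counters = defaultdict(count)                    # (subject, group) -> turn counter

    def subscribe(self, subject, handler, queue=None):
        if queue is None:
            self.plain[subject].append(handler)
        else:
            self.groups[subject][queue].append(handler)

    def publish(self, subject, data):
        # Every plain subscriber sees every message (pub-sub fan-out).
        for handler in self.plain[subject]:
            handler(data)
        # Each queue group hands the message to exactly one of its members.
        for group, members in self.groups[subject].items():
            turn = next(self.counters[(subject, group)])
            members[turn % len(members)](data)


broker = ToyBroker()
seen = {"observer": [], "worker-a": [], "worker-b": []}
broker.subscribe("pages.fetched", seen["observer"].append)
broker.subscribe("pages.fetched", seen["worker-a"].append, queue="parsers")
broker.subscribe("pages.fetched", seen["worker-b"].append, queue="parsers")
for n in range(4):
    broker.publish("pages.fetched", n)
```

Here the observer ends up with all four messages and the two workers
with two each, which is roughly the EventBridge-plus-SQS fan-out I was
planning on AWS.
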
I'd still like to hear from anyone with experience or opinions on
whether plain NATS or NATS Streaming is better for initial development.
[0] https://docs.nats.io/developing-with-nats-streaming/streaming