I am considering developing a solution for auditing accounting information that involves millions of tax documents.
The design should allow each accounting document to pass through a rule engine maintained by the business area and generate alerts.
The architecture would be:
- Airflow detects when a new tax document arrives and triggers the pipeline.
- The tax document is pre-processed by querying the underlying data.
- The pre-processed data is published to Kafka.
- Kogito consumes these events from Kafka.
- Kogito runs a series of rules, some in parallel and some in sequence. The idea is that business users could modify some of these rules.
- After the rules run, Kogito publishes the output back to a Kafka topic (see the consume-evaluate-publish sketch after this list).
- A Kafka consumer (or a Kafka Connect sink) persists the output in a non-relational database (second sketch below).
- An alert application reads from the non-relational database.
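For concreteness, here is a minimal sketch of the consume-evaluate-publish loop I have in mind. It assumes the classic Drools API that Kogito builds on, string-serialized messages, and placeholder topic names (`tax-documents`, `audit-alerts`); the DRL rules would append their findings to an `alerts` global.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.kie.api.KieServices;
import org.kie.api.runtime.KieContainer;
import org.kie.api.runtime.KieSession;

public class AuditPipeline {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "tax-audit");
        consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        // Loads the rules packaged on the classpath (kmodule.xml + DRL files).
        KieContainer rules = KieServices.Factory.get().getKieClasspathContainer();

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(List.of("tax-documents"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // One short-lived session per document keeps evaluations independent.
                    KieSession session = rules.newKieSession();
                    List<String> alerts = new ArrayList<>();
                    session.setGlobal("alerts", alerts); // DRL declares: global java.util.List alerts
                    session.insert(record.value());      // in practice: a deserialized TaxDocument fact
                    session.fireAllRules();
                    session.dispose();
                    for (String alert : alerts) {
                        producer.send(new ProducerRecord<>("audit-alerts", record.key(), alert));
                    }
                }
            }
        }
    }
}
```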
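And the sink side; MongoDB is only an assumption here, standing in for whatever non-relational store ends up being used (an off-the-shelf Kafka Connect sink connector would avoid this custom code entirely):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;

public class AlertSink {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "alert-sink");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (MongoClient mongo = MongoClients.create("mongodb://localhost:27017");
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            MongoCollection<Document> alerts =
                    mongo.getDatabase("audit").getCollection("alerts");
            consumer.subscribe(List.of("audit-alerts"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Store one alert document per rule finding, keyed by the tax document id.
                    alerts.insertOne(new Document("documentId", record.key())
                            .append("alert", record.value()));
                }
            }
        }
    }
}
```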
I have some questions about Kogito and this architecture:
- Does using Kogito make sense in this big data architecture?
- Is the idea of allowing users to change the rules viable?
- How can the input data be parameterized for Kogito? (A sketch of what I currently have in mind follows this list.)
- Do you have any suggestions for the architecture?
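Regarding the parameterization question, this is roughly what I have in mind, assuming Kogito's rule unit API (`org.kie.kogito.rules`); `TaxDocument` and `AuditUnit` are placeholder names:

```java
// TaxDocument.java -- hypothetical fact model: one tax document to be audited.
public class TaxDocument {
    private final String id;
    private final double declaredTax;
    private final double computedTax;

    public TaxDocument(String id, double declaredTax, double computedTax) {
        this.id = id;
        this.declaredTax = declaredTax;
        this.computedTax = computedTax;
    }

    public String getId() { return id; }
    public double getDeclaredTax() { return declaredTax; }
    public double getComputedTax() { return computedTax; }
}
```

```java
// AuditUnit.java -- the rule unit acts as the typed "parameter container":
// incoming documents are appended to the DataStore, and the DRL rules
// declared for unit AuditUnit pattern-match on /documents.
import org.kie.kogito.rules.DataSource;
import org.kie.kogito.rules.DataStore;
import org.kie.kogito.rules.RuleUnitData;

public class AuditUnit implements RuleUnitData {
    private final DataStore<TaxDocument> documents = DataSource.createStore();

    public DataStore<TaxDocument> getDocuments() {
        return documents;
    }
}
```

Each Kafka event would then be deserialized into a `TaxDocument` and appended to the `documents` store before the unit is evaluated.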