Hello Shaun Forgie and the Kill Bill team,
Thanks for your reference. I have already created a Kafka consumer plugin as a Java-based OSGi module. We are currently using this plugin for our internal use cases, and we believe it would be valuable to contribute it to the wider community.
Let me address each query in turn:
1. I would like to hear more about how your design works for more than one billing tenant, what sorts of validation is going to be applied to usage messages, the message formats and how errors are to be handled.
The solution should also be aware of tenant specific usage billing settings that relate to past billing periods and have rules about accepting forward dated transactions.
Our design follows this structure: each service, particularly those whose subscriptions and accounts are associated with a specific tenant, pushes messages in JSON format similar to the push-usage API, with additional fields such as apiKey. Using the API key, we retrieve the tenant, which is required to set up the PluginCallContext.
With the OSGIKillBillAPI interface, we can push this usage data directly using osgiKillbillAPI.getUsageUserApi().recordRolledUpUsage().
Here's an example JSON payload, which looks similar to the "Record usage for a subscription" API call:
{
  "subscriptionId": "1b652c50-6ced-4335-bd6b-daf70a6c65e0",
  "trackingId": "1",
  "apiKey": "router",
  "unitUsageRecords": [
    {
      "unitType": "txnCount",
      "usageRecords": [
        {
          "recordDate": "2024-04-20T02:59:33.147",
          "amount": 25
        }
      ]
    },
    {
      "unitType": "txnVolume",
      "usageRecords": [
        {
          "recordDate": "2024-04-20T02:59:33.147",
          "amount": 4000
        }
      ]
    }
  ]
}
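On the validation side, here is a minimal sketch of the field-level checks the consumer could run on an incoming message before handing it to osgiKillbillAPI.getUsageUserApi().recordRolledUpUsage(). Only the field names come from the payload above; the class and method names are hypothetical:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;
import java.util.UUID;

// Hypothetical pre-validation helpers for incoming usage messages.
// Field names (subscriptionId, recordDate, amount) follow the JSON payload above.
public class UsageMessageChecks {

    // subscriptionId must be a well-formed UUID before we touch the Kill Bill APIs.
    public static boolean isValidSubscriptionId(String value) {
        try {
            UUID.fromString(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // recordDate uses an ISO local date-time, e.g. "2024-04-20T02:59:33.147".
    public static boolean isValidRecordDate(String value) {
        try {
            LocalDateTime.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Usage quantities must not be negative.
    public static boolean isValidAmount(long amount) {
        return amount >= 0;
    }
}
```

Messages failing these checks can be rejected (or routed to a dead-letter topic) before any Kill Bill API is involved.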
Here's another aspect to consider: the services responsible for pushing usage data only know the external subscription key that was used to create the subscription. These services should not rely on the getSubscriptionByExternalKey API call to retrieve the subscription ID; it is more efficient for each service to manage its own subscription information than to depend on Kill Bill API calls, as our goal is to avoid such calls entirely.
Furthermore, we've made an enhancement where each service is treated as its own tenant: instead of making an API call to retrieve the tenant ID, the service passes the tenant ID directly, which reduces getTenant calls on the consumer side. This approach can be adjusted to use the API key instead if necessary.
During Kafka consumption, we first retrieve the subscription using the external subscription key. This step itself validates both that the subscription belongs to the given tenant and that the tenant exists; failures surface as UsageApiException, SubscriptionApiException or TenantApiException. If the tenant and the subscription are valid, we proceed to push the usage data directly using osgiKillbillAPI.getUsageUserApi().recordRolledUpUsage().
Since we're using the osgiKillbillAPI interface, it handles the tenant-specific usage billing settings seamlessly.
Here's an example of the JSON payload:
{
  "externalSubKey": "Service1_merchantId_planName",
  "trackingId": "1",
  "tenantId": "9a652a50-6ced-4335-bd6b-daf70a6c65e0",
  "unitUsageRecords": [
    {
      "unitType": "txnCount",
      "usageRecords": [
        {
          "recordDate": "2024-04-20T02:59:33.147",
          "amount": 25
        }
      ]
    },
    {
      "unitType": "txnVolume",
      "usageRecords": [
        {
          "recordDate": "2024-04-20T02:59:33.147",
          "amount": 4000
        }
      ]
    }
  ]
}
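The externalSubKey in the payload above appears to encode the service, merchant and plan ("Service1_merchantId_planName"). Assuming that naming convention (an assumption on my part), each service can derive the key locally, which is what lets it avoid the getSubscriptionByExternalKey API call:

```java
// Hypothetical helper: builds the external subscription key under the assumed
// "<service>_<merchantId>_<planName>" convention seen in the payload above,
// so the pushing service never needs to ask Kill Bill for the subscription ID.
public class ExternalSubKeys {
    public static String build(String service, String merchantId, String planName) {
        return service + "_" + merchantId + "_" + planName;
    }
}
```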
Now on to the next query:
2. In terms of protecting the integrity of the billing data you will also need to address problems related to processing the same message more than once and dealing with the accidental resubmission of duplicate usage records into the kafka queue
If a duplicate message comes in, its tracking ID remains the same. So even if two copies of the message are processed in parallel, only one usage record will be added, and idempotency is maintained.
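To illustrate the idea, here is a toy in-memory sketch of tracking-ID deduplication. All names are hypothetical, and in the real flow the duplicate is rejected when the usage record with the same tracking ID is persisted, not by an in-memory set:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy illustration of tracking-ID idempotency: two deliveries of the same
// message carry the same tracking ID, so only the first one records usage.
public class TrackingIdDedup {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    // Returns true only the first time this (subscription, trackingId) pair is seen.
    public boolean shouldRecord(String subscriptionKey, String trackingId) {
        return seen.add(subscriptionKey + "|" + trackingId);
    }
}
```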
3. You will also need to support different run time topologies as more than one instance of Kill Bill can be running. This will mean potentially more than one consumer could be connected to kafka at any one time.
To address the query regarding multiple instances running in parallel and connecting to Kafka, let's delve into how consumers interact with Kafka. Each consumer is associated with a group ID, which automatically manages consumption across multiple instances. In the case of our KafkaConsumer Plugin, all instances will share the same group ID.
Now, consider there are three partitions for a topic, each containing its own set of messages. If there's only one instance available, all three partitions will be assigned to a single instance, effectively allowing one consumer to consume from all partitions.
However, if two consumers/instances are present, the consumer group ID remains the same, so one partition will be assigned to one instance while the other two partitions go to the second instance. Similarly, with three instances, each partition is assigned to exactly one instance. This distribution is what enables horizontal scaling.
When multiple instances are in operation there is no issue, because each message resides in only one partition. Even if the same message is inadvertently pushed twice into different partitions and consumed by two different consumers, only one usage record will be added, since both copies share the same tracking ID. This effectively prevents duplicate processing of the same message.
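The partition distribution described above can be sketched as a small simulation. This is purely illustrative (a round-robin style spread; Kafka's own assignors handle this automatically for consumers sharing a group.id):

```java
import java.util.ArrayList;
import java.util.List;

// Simulates how partitions of one topic are spread across the members of a
// single consumer group: each partition goes to exactly one consumer.
public class GroupAssignmentSketch {
    public static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int c = 0; c < numConsumers; c++) {
            assignment.add(new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(p % numConsumers).add(p); // round-robin style spread
        }
        return assignment;
    }
}
```

With 3 partitions this reproduces the scenarios above: one instance consumes all three partitions, two instances split them 2/1, and three instances get one each.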
![Screenshot 2024-05-13 at 6.22.44 PM.png](https://groups.google.com/group/killbilling-users/attach/b6c230016e724/Screenshot%202024-05-13%20at%206.22.44%E2%80%AFPM.png?part=0.2&view=1)
I hope this addresses all your queries. Please let me know if you have any further questions, and also whether we are ready to contribute.
Thanks and regards,
Prashant Kumar