Hi,
The Blueflood team is currently working to store as few items as possible in the metadata cache in order to improve performance.
This proposal is to stop storing the Units of metrics in the metadata cache and to read them from ElasticSearch (ES) instead.
Current System:
Units are stored in the metadata cache (an in-memory cache that is later written to Cassandra) as well as in ElasticSearch.
When a metric is queried, the RollupHandler class invokes methods that internally call the datapoint retrieval methods of AstyanaxReader. In those methods, the AstyanaxReader class also retrieves the Units from Cassandra.
Proposed System:
We retrieve Units from ElasticSearch for queries.
We include a reference to DiscoveryIO in the RollupHandler class. In the method getRollupByGranularity (which is used by all queries), we no longer rely on AstyanaxReader to retrieve Units. Instead, the RollupHandler class itself uses DiscoveryIO to search for the given metric and retrieve its Unit.
The code will look as follows. The getUnitFromES method gets Units for the given metric from ES.
protected MetricData getRollupByGranularity(
        String tenantId,
        String metricName,
        final long from,
        final long to,
        final Granularity g) {
    ........
    MetricData metricData = AstyanaxReader.getInstance().getDatapointsForRange(
            locator,
            new Range(g.snapMillis(from), to),
            g);
    // Units now come from ES instead of from AstyanaxReader.
    unit = getUnitFromES(tenantId, metricName);
    metricData.setUnit(unit);
    ..........
}

private String getUnitFromES(String tenantId, String metricName) {
    String unit = null;
    List<SearchResult> results = discoveryIO.search(tenantId, metricName);
    // Only the first search result is used; unit stays null if there are no results.
    for (SearchResult res : results) {
        unit = res.getUnit();
        break;
    }
    return unit;
}
We are also considering the following two enhancements to this plan:
1. We can make the Units retrieval asynchronous to overlap with the retrieval of data points from Cassandra, by encapsulating it in a Callable.
2. We can make the retrieval of Units from ES configurable, so that ES is not a hard dependency for users of Blueflood.
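
To make the two enhancements more concrete, here is a rough, self-contained sketch of how they could fit together. It is not Blueflood code: UnitSource, DatapointSource, Result, fetch and the useEsForUnits flag are hypothetical stand-ins for DiscoveryIO, AstyanaxReader, MetricData and a configuration setting, and it assumes Java 8 or later for the lambdas. The Unit lookup is wrapped in a Callable and submitted to an executor before the datapoint read starts, so the two calls overlap. When the flag is off, ES is never touched.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncUnitLookupSketch {

    /** Hypothetical stand-in for the ES lookup (getUnitFromES in the proposal). */
    interface UnitSource {
        String unitFor(String tenantId, String metricName);
    }

    /** Hypothetical stand-in for the Cassandra datapoint read. */
    interface DatapointSource {
        long[] datapointsFor(String tenantId, String metricName);
    }

    /** Minimal result holder; the real code would populate MetricData instead. */
    static final class Result {
        final long[] datapoints;
        final String unit; // null if the ES lookup is disabled, fails, or finds nothing

        Result(long[] datapoints, String unit) {
            this.datapoints = datapoints;
            this.unit = unit;
        }
    }

    static Result fetch(ExecutorService pool,
                        boolean useEsForUnits,
                        UnitSource es,
                        DatapointSource cassandra,
                        String tenantId,
                        String metricName) {
        // Enhancement 2: only contact ES when the (assumed) config flag is on.
        Future<String> unitFuture = null;
        if (useEsForUnits) {
            // Enhancement 1: start the Unit lookup before reading datapoints.
            Callable<String> unitLookup = () -> es.unitFor(tenantId, metricName);
            unitFuture = pool.submit(unitLookup);
        }

        // The datapoint read runs on the calling thread while the lookup is in flight.
        long[] datapoints = cassandra.datapointsFor(tenantId, metricName);

        String unit = null;
        if (unitFuture != null) {
            try {
                unit = unitFuture.get(); // blocks until the lookup finishes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
            } catch (ExecutionException e) {
                // The lookup failed: return the datapoints without a Unit.
            }
        }
        return new Result(datapoints, unit);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Result r = fetch(pool, true,
                    (tenant, metric) -> "ms",
                    (tenant, metric) -> new long[] {1L, 2L, 3L},
                    "tenant", "metric");
            System.out.println(r.unit + " " + r.datapoints.length);
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch a disabled or failed lookup simply leaves the Unit as null. A real implementation would need to decide where Units come from when ES is switched off, and whether a slow lookup should be bounded with a timeout on the Future.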