Hi Yiannis,
There are a few possible approaches to defining a custom index, depending on the kinds of queries you are trying to optimize.
The C++ index API is defined by the base class TableIndex in src/ee/indexes/tableindex.h. This allows arbitrary implementations to be used by the existing index scan and nested-loop index join executor code.
Selection of index implementations is determined in src/ee/indexes/tableindexfactory.cpp. This is where a "fork" would have to be inserted to detect when the custom index is the most suitable choice for a given index definition.
src/ee/indexes/indexkey.h defines classes for index key structures, which are all variations on a tuple of values.
The common API to these classes is also implicitly part of the index API. Depending on your needs, new key structures may need to be defined.
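To make the shape of that arrangement concrete, here is a minimal, self-contained sketch of the pattern: an abstract index interface, a default implementation, a custom one, and a factory with the "fork". Every name in it (SketchIndex, TreeSketchIndex, CustomSketchIndex, IndexScheme, makeIndex, the wantsCustom flag) is hypothetical and made up for illustration. It is not the actual TableIndex or factory API, which has many more methods and different signatures, so treat it as a picture of the idea rather than code to paste in.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names come from the VoltDB source.
using Key = std::vector<int64_t>;   // an index key as "a tuple of values"
using RowId = std::size_t;          // placeholder for a reference to a table row

// Abstract interface that executor code would program against.
class SketchIndex {
public:
    virtual ~SketchIndex() = default;
    virtual void insert(const Key& key, RowId row) = 0;
    virtual std::vector<RowId> lookup(const Key& key) const = 0;
    virtual std::string name() const = 0;
};

// Default ordered implementation backed by std::multimap.
class TreeSketchIndex : public SketchIndex {
public:
    void insert(const Key& key, RowId row) override {
        m_entries.emplace(key, row);
    }
    std::vector<RowId> lookup(const Key& key) const override {
        std::vector<RowId> rows;
        auto range = m_entries.equal_range(key);
        for (auto it = range.first; it != range.second; ++it) {
            rows.push_back(it->second);
        }
        return rows;
    }
    std::string name() const override { return "tree"; }
private:
    std::multimap<Key, RowId> m_entries;
};

// Placeholder for a custom index. It reuses the tree storage here only to
// keep the sketch short; a real one would bring its own data structure.
class CustomSketchIndex : public TreeSketchIndex {
public:
    std::string name() const override { return "custom"; }
};

// Hypothetical description of an index definition.
struct IndexScheme {
    std::string indexName;
    bool wantsCustom;   // assumed flag; a real check would inspect the definition
};

// The factory "fork": choose the custom implementation when it applies,
// otherwise fall back to the default.
std::unique_ptr<SketchIndex> makeIndex(const IndexScheme& scheme) {
    if (scheme.wantsCustom) {
        return std::make_unique<CustomSketchIndex>();
    }
    return std::make_unique<TreeSketchIndex>();
}
```

Callers only ever hold a SketchIndex pointer, which is why a new implementation can be slotted in at the factory without touching the code that scans the index.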
That covers one basic facet of custom index implementation. Here are a few others:
-- Custom indexes typically optimize custom SQL functions, which have to be supported both in the C++ execution engine and in the Java SQL parser.
There is a framework for defining new SQL functions, and a recipe is described at:
https://github.com/VoltDB/voltdb/wiki/Implementing-sql-functions
-- Some thought should be given to indexed column and expression types.
VoltDB's type system is not easily extensible, so there is a great advantage to storing your indexed values using existing types like INTEGER, VARCHAR, VARBINARY, or some combination (using multiple columns/arguments in parallel).
-- In some cases, it may be necessary to change the Java DDL parser and compiler; other cases may be covered by the existing code for defining indexes on functions/expressions of columns and constants.
-- In some cases, it may be necessary to change the Java query planner; other cases may be covered by the existing code for applying a given index to a given query.
Spatial indexing is a topic of interest to the VoltDB team, but we have not yet committed to a development/delivery schedule.
My initial take on it is that it would not strictly require support for new TYPES, but WILL likely (at least eventually) require some support in all of these other code areas.
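As one illustration of the point about reusing existing types: a common way to fit a 2D grid location into a single 64-bit integer column (BIGINT in VoltDB) is a Z-order (Morton) code, which interleaves the bits of the two coordinates. The sketch below is a generic technique, not something VoltDB provides, and the function names (spreadBits, compactBits, encodeCell, decodeX, decodeY) are made up for the example.

```cpp
#include <cstdint>

// Spread the 32 bits of v across the even bit positions of a 64-bit word.
inline uint64_t spreadBits(uint32_t v) {
    uint64_t x = v;
    x = (x | (x << 16)) & 0x0000FFFF0000FFFFULL;
    x = (x | (x << 8))  & 0x00FF00FF00FF00FFULL;
    x = (x | (x << 4))  & 0x0F0F0F0F0F0F0F0FULL;
    x = (x | (x << 2))  & 0x3333333333333333ULL;
    x = (x | (x << 1))  & 0x5555555555555555ULL;
    return x;
}

// Inverse of spreadBits: gather the even bit positions back into 32 bits.
inline uint32_t compactBits(uint64_t x) {
    x &= 0x5555555555555555ULL;
    x = (x | (x >> 1))  & 0x3333333333333333ULL;
    x = (x | (x >> 2))  & 0x0F0F0F0F0F0F0F0FULL;
    x = (x | (x >> 4))  & 0x00FF00FF00FF00FFULL;
    x = (x | (x >> 8))  & 0x0000FFFF0000FFFFULL;
    x = (x | (x >> 16)) & 0x00000000FFFFFFFFULL;
    return static_cast<uint32_t>(x);
}

// Interleave x (even bits) and y (odd bits) into one 64-bit cell code.
inline uint64_t encodeCell(uint32_t x, uint32_t y) {
    return spreadBits(x) | (spreadBits(y) << 1);
}

// Recover the original coordinates from a cell code.
inline uint32_t decodeX(uint64_t code) { return compactBits(code); }
inline uint32_t decodeY(uint64_t code) { return compactBits(code >> 1); }
```

Nearby cells tend to get nearby codes, so an ordinary ordered index on the integer column can serve as a rough spatial index. One caveat if the code is stored in a signed column: keeping y below 2^31 keeps the top bit clear, so the signed and unsigned orderings agree.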