Hi Abhishek,
Thanks. I think it would be helpful if we could meet or have a call; as we move forward, we are facing some issues/questions :)
Putting the latest ones here :)
1. For creating CustomCachingFS, I had to override some of the abstract CachingFileSystem class methods as well.
For example getFileStatus, because PrestoCachingS3FS uses PrestoS3FileSystem to handle the S3 scheme for Presto,
but for the custom scheme a few things were different, so I had to do it.
I hope that should not be a problem, or do you see any problem with this approach?
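To make the approach concrete, here is a minimal, self-contained sketch of the delegation pattern I mean; all class and method names here are hypothetical stand-ins, not the real Rubix/Presto classes:

```java
// Hypothetical sketch: a caching wrapper delegates to a remote FS, but
// overrides getFileStatus because the custom scheme behaves differently
// from the S3 implementation the base class was written against.
class CachingFsSketch {
    interface RemoteFs {
        String getFileStatus(String path);
    }

    static class CustomRemoteFs implements RemoteFs {
        @Override
        public String getFileStatus(String path) {
            // Behavior specific to the custom scheme
            return "custom-status:" + path;
        }
    }

    static class CustomCachingFs implements RemoteFs {
        private final RemoteFs remote;

        CustomCachingFs(RemoteFs remote) {
            this.remote = remote;
        }

        // Overridden rather than inherited, because the inherited version
        // assumed S3-style semantics
        @Override
        public String getFileStatus(String path) {
            return remote.getFileStatus(path);
        }
    }
}
```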
2. "Presto Rubix integration depends on what remote file system you are using. For S3, Presto doesn't allow to override fs.s3.impl configuration."
What does this exactly mean? We don't use the S3 property; ours is for the custom scheme (fs.<custom>.impl), so that should be OK?
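In other words, I mean a mapping like the following in core-site.xml, where the scheme name "myfs" and the class name are just hypothetical placeholders for our custom scheme:

```xml
<!-- Hypothetical example: register a caching FileSystem implementation
     for a custom scheme ("myfs" is a placeholder, not our real scheme),
     analogous to fs.s3.impl but not touching the S3 property at all. -->
<property>
  <name>fs.myfs.impl</name>
  <value>com.example.CustomCachingFS</value>
</property>
```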
3. Is there an easier way to test Rubix with Presto locally, apart from running all the daemons separately on the local machine?
4. For now I'm testing locally with separate daemons for Presto.
So, when trying to run the following daemons locally, I am facing some issues:
a. bookkeeper master mode
b. bookkeeper non-master mode
c. local data transfer
d. Presto ( local mode includes coordinator and worker in same JVM )
e. Hive Metastore Service ( Thrift server )
f. Derby DB in network mode for the Hive metastore DB.
Issue with the BookKeeper daemons:
Not sure if I'm missing something,
but this is the case: the BookKeeper master starts at, say, port 8899 based on this (hadoop.cache.data.bookkeeper.port):
public static int getServerPort(Configuration conf)
{
    return conf.getInt(KEY_SERVER_PORT, DEFAULT_SERVER_PORT);
}
When the BookKeeper worker starts, it first tries to connect to the HeartbeatServer on that same port, since it uses the same variable (KEY_SERVER_PORT),
and then tries to start its own server using that same variable (KEY_SERVER_PORT), so I find it difficult to start both master and worker on the local machine.
Please let me know if it is possible to change some property so that both can start locally.
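To illustrate the clash, and the kind of per-daemon override I am asking about, here is a self-contained sketch; the key name mirrors the one above, but the Properties-based config and the override port 8900 are just stand-ins for the real Configuration:

```java
import java.util.Properties;

// Sketch of the port resolution described above: both master and worker
// read the same key, so with a shared config the worker tries to bind
// the port the master already owns. Handing the worker its own config
// with an overridden value (hypothetical workaround) avoids the clash.
class PortSketch {
    static final String KEY_SERVER_PORT = "hadoop.cache.data.bookkeeper.port";
    static final int DEFAULT_SERVER_PORT = 8899;

    static int getServerPort(Properties conf) {
        return Integer.parseInt(
                conf.getProperty(KEY_SERVER_PORT, String.valueOf(DEFAULT_SERVER_PORT)));
    }
}
```

With an empty config both daemons resolve 8899; if the worker could be handed a config where the key is set to, say, 8900, the bind conflict would go away, and that is the property override I am asking about.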
5. If the BookKeeper master is down, I see that the BookKeeper worker node doesn't start, as it keeps trying to connect to the BookKeeper master's heartbeat server,
and until that connection succeeds it is not able to start its own server.
So, if the BookKeeper master heartbeat server is down, we can't start the BookKeeper worker.
I faced this locally: because of the port issue I can't start both services, and therefore my query requests were always getting data from the remote FS instead of from local/Rubix.
thanks,
Manish