Hi Supun and all,
Currently following Twister2 component use disks:
Checkpointing:
CheckpointManager in JobMaster saves checkpoint references to persistent storage.
CheckpointingClient in workers save checkpointed data to persistent storage.
Config parameters in checkpoint.yaml: twister2.checkpointing.store.fs.dir and twister2.checkpointing.store.hdfs.dir
TSet:
When TSets can not fit into the memory, they are spilled over to the disk.
This happens only in workers not in JobMaster.
Local/nfs/hdfs are supported.
Config Parameters in data.yaml: twister2.data.fs.root and twister2.data.hdfs.root
Networking:
When the data cannot fit into the memory during communication with other workers, networking component can save the data into the disk temporarily and continue execution.
Config Parameters in network.yaml: twister2.network.ops.persistent.dir
Logging:
logs of workers are saved into logs directory under the working directory.
Maybe we can define two types of storage:
* persistent
* volatile
Persistent storage can be resilient to worker failures.
Volatile storage can be available only during the lifetime of workers
Checkpointing and loggers can save the data into persistent storage.
TSet and Networking components can save into the volatile storage.
Persistent storage can be NFS, HDFS, S3, etc.
Volatile storage can be the local disk of the node. By default, it can be /tmp directory.
So, users can set two configuration parameters: one is for the persistent storage, and the other is for the volatile storage.
what do you think?
Ahmet