Hi Team, 
I have configured Alluxio with an HDFS cluster and am running the TPC-DS benchmark to compare HDFS vs. Alluxio on Hive and Spark. However, I don't see any performance gain when accessing data through Alluxio; in fact, some queries underperform with Alluxio on both Hive and Spark.
I am using Alluxio 2.9.5 and Hadoop 3.3.
I am testing this with a disaggregated setup of Hadoop workers vs. Alluxio, as follows:
HDFS test:
Hadoop DataNodes and NodeManagers run on separate nodes.
 vs.
Alluxio test:
Alluxio workers are co-located with the Hadoop NodeManagers (compute layer), and the Hadoop DataNodes are isolated from the NodeManagers.
As these are TPC-DS queries, they are mostly read-heavy jobs.
- Running with 1TB TPC-DS dataset
- Alluxio is configured with CACHE and ASYNC_THROUGH as the read and write types.
- Tried clearing the OS buffer cache, as mentioned in other docs, but no luck.
- Alluxio is configured with memory as the only cache tier, and about 30% of the cache is free at any given point.
- As per the metrics, cache hits are happening as expected for subsequent queries.
- Added the following properties to see if they make any difference, but no luck:
alluxio.user.ufs.block.read.location.policy=alluxio.client.block.policy.DeterministicHashPolicy
alluxio.user.ufs.block.read.location.policy.deterministic.hash.shards=3
alluxio.user.file.persistence.initial.wait.time=-1
alluxio.user.file.persist.on.rename=true
alluxio.master.persistence.blacklist=_temporary
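For reference, the read/write types and memory-only cache described above would typically be expressed with properties like the following. This is only a sketch that assumes the standard Alluxio 2.x property names; it is not a copy of the actual alluxio-site.properties used in these tests, and tier sizes and paths are deliberately left out.

```properties
# Assumed client-side defaults matching the read and write types listed above
alluxio.user.file.readtype.default=CACHE
alluxio.user.file.writetype.default=ASYNC_THROUGH
# Assumed worker-side layout: a single memory tier, no SSD/HDD tiers
alluxio.worker.tieredstore.levels=1
alluxio.worker.tieredstore.level0.alias=MEM
```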
Am I missing anything here?
Regards,
Vinod Gundala