Hi
I have 2 dataframes representing cassandra tables
Dataframe A
root
|-- d_date: string (nullable = false)
|-- m_d_id: string (nullable = false)
|-- m_b_key: integer (nullable = false)
|-- d_next_date: timestamp (nullable = false)
Dataframe B
root
|-- d_id: string (nullable = false)
|-- b_key: integer (nullable = false)
|-- timestamp: timestamp (nullable = true)
|-- version: string (nullable = true)
select * from A a join B b where a.m_d_id=b.d_id and m_b_key=b.b_key
Explain plan as expected
Project[ cols ....]
Cassandra Direct Join [m_d_id=d_id, m_b_key=b_key]... Pushed {}
..
..
If I want to filter with constant values
select * from A a join B b where a.m_d_id=b.d_id and m_b_key=b.b_key and b.timestamp>to_date('2020-10-22') and b.timestamp< o_date('2020-10-23')
Explain plan as expected
Project[ cols ....]
Cassandra Direct Join [m_d_id=d_id, m_b_key=b_key] ... Pushed {("timestamp" > ?:2020-10-22 00:00:00.0), ("timestamp" < ?:2020-10-23 00:00:00.0)}
..
..
Now if I want to make same filter conditions dynamic for each partition with direct join
select * from A a join B b where a.m_d_id=b.d_id and m_b_key=b.b_key and b.timestamp> a.d_date and b.timestamp< a.d_next_date
Here is where it goes wrong
Project[ cols ....]
Cassandra Direct Join [m_d_id=d_id, m_b_key=b_key]... Pushed {}
..
..
Query returns empty results and also i dont see in explain plan filters being pushed down
What would be write way to filter data across paritions by different filter conditions ? I dont want to read whole partition and then filter in memory.
Regards
Amol