Hi
I am using a Hive tap as both source and sink in a Cascading flow. There are no pipe-level operations; the flow just copies data from one table to another. In the sink table, I have noticed that a few records are missing. I have tested this against two different tables.
Table 1.
source count : 2032472
sink count : 2032472
The table has a unique column. 2 records are missing from the sink. The source and sink counts still match because 2 other records were duplicated.
Table 2.
source count: 20607550
sink count: 20607550
38 records are missing from the sink in this case.
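Since the row counts match in both cases, the problem only shows up when comparing the unique column itself. A check of that kind can be sketched roughly as below. This is only a minimal illustration with toy in-memory keys, not the actual job: the class `KeyDiff` and its methods are hypothetical names, and for tables of this size the same comparison would have to run in Hive or as a distributed job.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares unique-key values read from source and sink.
public class KeyDiff {

    // Keys that appear in the source but never in the sink.
    static Set<String> missing(List<String> source, List<String> sink) {
        Set<String> result = new TreeSet<>(source);
        result.removeAll(new HashSet<>(sink));
        return result;
    }

    // Keys that appear more than once in the sink.
    static Set<String> duplicated(List<String> sink) {
        Set<String> seen = new HashSet<>();
        Set<String> dups = new TreeSet<>();
        for (String key : sink) {
            if (!seen.add(key)) {   // add() returns false if the key was already seen
                dups.add(key);
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        // Toy data: same row count on both sides, but "b" is lost and "a" is doubled.
        List<String> source = Arrays.asList("a", "b", "c", "d");
        List<String> sink = Arrays.asList("a", "a", "c", "d");

        System.out.println("counts equal: " + (source.size() == sink.size()));
        System.out.println("missing=" + missing(source, sink));
        System.out.println("duplicated=" + duplicated(sink));
    }
}
```

With the toy data, the counts are equal even though one key is missing and another is duplicated, which is the same pattern as in the two tables above.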
Everything works fine when combined input is not used.
Is there anything I need to add when using combined input?