How are we supposed to handle data that looks wrong or inconsistent?
For example, the inpatient charges data (CSV) contains this row:
249 - PERC CARDIOVASC PROC W NON-DRUG-ELUTING STENT W/O MCC,500051,OVERLAKE HOSPITAL MEDICAL CENTER,1035-116TH AVE NE,BELLEVUE,WA,98004,WA - Seattle,23,44499,84499.26087
The last two columns are Average Covered Charges and Average Total Payments, respectively. So in this case the average covered charges are 44,499, but the average total payments are almost double that, at 84,499.26. How is that possible?
I haven't gone through the whole dataset to test this; I just noticed this anomaly while casually exploring my visualization.
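
To see whether this is a one-off or a wider pattern, a check along these lines might help. It is only a rough sketch: the column names are my guesses based on the row above, the second sample row is made up for contrast, and a real file would likely have a header row that needs skipping.

```python
import csv
import io

# Column names are assumptions inferred from the quoted row, not confirmed headers.
COLUMNS = [
    "DRG Definition", "Provider Id", "Provider Name", "Provider Street Address",
    "Provider City", "Provider State", "Provider Zip Code",
    "Hospital Referral Region Description", "Total Discharges",
    "Average Covered Charges", "Average Total Payments",
]

# First row is the one quoted above; the second is a hypothetical "normal" row.
SAMPLE = (
    "249 - PERC CARDIOVASC PROC W NON-DRUG-ELUTING STENT W/O MCC,500051,"
    "OVERLAKE HOSPITAL MEDICAL CENTER,1035-116TH AVE NE,BELLEVUE,WA,98004,"
    "WA - Seattle,23,44499,84499.26087\n"
    "000 - HYPOTHETICAL DRG,000000,EXAMPLE HOSPITAL,1 MAIN ST,ANYTOWN,WA,"
    "00000,WA - Seattle,10,50000,12000.5\n"
)


def find_suspicious_rows(lines):
    """Return (provider, charges, payments) for rows where payments exceed charges."""
    reader = csv.DictReader(lines, fieldnames=COLUMNS)
    suspicious = []
    for row in reader:
        charges = float(row["Average Covered Charges"])
        payments = float(row["Average Total Payments"])
        if payments > charges:
            suspicious.append((row["Provider Name"], charges, payments))
    return suspicious


suspicious = find_suspicious_rows(io.StringIO(SAMPLE))
for name, charges, payments in suspicious:
    print(f"{name}: charges={charges:.2f}, payments={payments:.2f}")
```

Swapping `io.StringIO(SAMPLE)` for an open file handle on the actual CSV would run the same check over the whole dataset.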