Very few points show an error as large as 100 W/m2: 50% of the errors are within +/- ~10 W/m2 and 90% are within +/- ~30 W/m2. I think that's pretty decent accuracy overall.
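For reference, figures like these can be read off the absolute error with pandas quantiles. Here is a minimal sketch, assuming `error` is a pandas Series of modeled minus measured POA irradiance in W/m2. The values below are synthetic placeholders, not your data, so the printed numbers will not match the ones quoted above.

```python
import numpy as np
import pandas as pd

# Synthetic placeholder standing in for (modeled - measured) POA error [W/m2]
rng = np.random.default_rng(0)
error = pd.Series(rng.normal(loc=0.0, scale=15.0, size=1000))

abs_error = error.abs()
# Half-width of the +/- band that contains 50% and 90% of the points
bands = abs_error.quantile([0.5, 0.9])
# Fraction of points whose absolute error exceeds 100 W/m2
frac_large = (abs_error > 100).mean()

print(bands)
print(f"fraction above 100 W/m2: {frac_large:.3%}")
```

With your own data you would replace the synthetic `error` with the difference between the modeled and measured POA series.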
You could try using a different model (instead of 'king') and see if the results improve, but I doubt it will help much with these large-error points. The problem is not the model but the fact that the sensors are momentarily experiencing different conditions: it might be cloudy according to the DNI sensor but sunny according to the POA sensor, or vice versa. The attached plot shows an example where the POA sensor reports a dip, but the DNI and DHI are stable. The transposition model cannot be expected to be accurate when the irradiance measurements are inconsistent with each other.
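import matplotlib.pyplot as plt

# Difference between modeled and measured POA irradiance
error = poa_irr['poa_global'] - df['POAirradiance']
# Top panel: measured irradiance; bottom panel: model error
fig, axes = plt.subplots(2, 1, sharex=True)
df[['DNI', 'DHI', 'POAirradiance']].plot(ax=axes[0])
error.plot(ax=axes[1])
axes[1].set_ylabel('Difference (modeled - measured) [W/m2]')
plt.show()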
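If you want to locate such periods without inspecting plots, one rough approach is to flag timestamps where the POA reading jumps while the DNI reading stays flat. This is only a sketch under stated assumptions: the data below is a synthetic placeholder using the same column names, and the two thresholds (`poa_jump`, `dni_stable`) are hypothetical values picked for illustration, not recommendations.

```python
import pandas as pd

# Synthetic placeholder data; column names follow the ones used above
times = pd.date_range("2020-06-01 12:00", periods=6, freq="1min")
df = pd.DataFrame(
    {
        "DNI": [800, 801, 800, 799, 800, 800],
        "DHI": [100, 100, 101, 100, 100, 100],
        "POAirradiance": [900, 901, 600, 610, 899, 900],
    },
    index=times,
)

# Hypothetical thresholds [W/m2], chosen only for illustration
poa_jump = 100
dni_stable = 20

# Step change from the previous sample for each sensor
poa_step = df["POAirradiance"].diff().abs()
dni_step = df["DNI"].diff().abs()

# POA moved sharply while DNI stayed flat: the sensors disagree here
inconsistent = (poa_step > poa_jump) & (dni_step < dni_stable)
print(df[inconsistent])
```

Note that a step-change test like this only catches the edges of a dip (the drop and the recovery), not the samples in the middle of it, so it is more useful for finding periods to look at than for filtering them out automatically.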
Best,
Kevin