I am having trouble improving downstream flow simulations with the nudging data assimilation function and was hoping someone could help. I am running an assimilation experiment with observed flow data from three stations in the watershed. Station 1 is the downstream site where forecasts are required, and the upstream Stations 2 and 3 are used for assimilation. The layout is shown in Figure 1.
Experiment 1: I assimilated the observed flow from Station 2 and compared the simulated flow at Station 1 after assimilation (DA) with the observed flow (obs) and with the simulated flow without assimilation (sim). The results are shown in Figure 2. Assimilating Station 2 substantially improved the simulation at Station 1.
Experiment 2: I assimilated the observed flow from Station 3 instead and made the same comparison at Station 1 (DA, obs, and sim). The results are shown in Figure 3. Here, assimilating Station 3 made the simulation at Station 1 worse.
Experiment 3: I assimilated the observed flow from both upstream stations, 2 and 3. The results are shown in Figure 4. The simulation at Station 1 improved relative to the run without assimilation, but less than in Experiment 1, where only Station 2 was assimilated.
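So far I have judged "improved" and "deteriorated" by eye from the figures. Below is a minimal sketch of how each experiment could be scored at Station 1 instead. The function names and the short series are placeholders I made up for illustration, not my actual data, and the sign convention for percent bias (positive means overestimation) is my own choice.

```python
# Scoring sketch for Station 1. All values are synthetic placeholders,
# not results from my experiments.

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, below 0 is worse than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den

def pbias(obs, sim):
    """Percent bias; positive means the simulation overestimates (convention assumed here)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)

def peak_error(obs, sim):
    """Simulated peak minus observed peak, in flow units."""
    return max(sim) - max(obs)

# Hypothetical hydrographs at Station 1 (same time steps for every run).
obs = [10.0, 20.0, 60.0, 40.0, 20.0]
runs = {
    "sim": [12.0, 26.0, 75.0, 50.0, 24.0],
    "DA_station2": [10.0, 21.0, 62.0, 42.0, 21.0],
}

scores = {
    name: {
        "nse": nse(obs, series),
        "pbias": pbias(obs, series),
        "peak_error": peak_error(obs, series),
    }
    for name, series in runs.items()
}

for name, s in scores.items():
    print(f"{name}: NSE={s['nse']:.3f}  PBIAS={s['pbias']:.1f}%  peak error={s['peak_error']:.1f}")
```

Adding one entry to `runs` per experiment would put Experiments 1 to 3 and the run without assimilation in one table, so the comparison does not depend on reading the plots.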
I also simulated the flow at the two upstream stations themselves: Figure 5 shows the results for Station 2 and Figure 6 the results for Station 3.


At Station 1, the simulated flow is higher than the observed flow. At Station 2, the simulated flow is also higher than observed, so assimilating Station 2 lowers the upstream flow and improves the simulation downstream. At Station 3, by contrast, the observed flood peak is higher than the simulated one, so assimilating Station 3 raises the upstream flow and makes the overestimate at Station 1 worse. This has left me quite puzzled.
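To check whether this pattern is at least arithmetically consistent, here is a toy error budget. It assumes (my simplification, not something I know the model does) that the flow at Station 1 is the sum of the Station 2 and Station 3 contributions plus a local contribution from the area between the stations, with no travel time, and that nudging removes the error at the assimilated station completely. All numbers are invented and only the signs follow my figures.

```python
# Toy error budget for Station 1. Errors are sim - obs in arbitrary flow units.
# Assumptions (mine, for illustration only): errors add linearly, there is
# no routing lag, and nudging fully removes the error at the nudged station.

def downstream_error(err2, err3, err_local, nudge2=False, nudge3=False):
    """Return sim - obs at Station 1 for a given nudging configuration."""
    e2 = 0.0 if nudge2 else err2   # Station 2 error vanishes when nudged
    e3 = 0.0 if nudge3 else err3   # Station 3 error vanishes when nudged
    return e2 + e3 + err_local     # local error is never corrected

err2 = 30.0       # Station 2 overestimates (sim > obs)
err3 = -20.0      # Station 3 underestimates (sim < obs)
err_local = 15.0  # hypothetical overestimate from the area below both stations

results = {
    "sim": downstream_error(err2, err3, err_local),
    "DA_station2": downstream_error(err2, err3, err_local, nudge2=True),
    "DA_station3": downstream_error(err2, err3, err_local, nudge3=True),
    "DA_both": downstream_error(err2, err3, err_local, nudge2=True, nudge3=True),
}

# Rank the configurations by absolute error at Station 1, best first.
ranking = sorted(results, key=lambda name: abs(results[name]))
for name in ranking:
    print(f"{name}: error at Station 1 = {results[name]:+.1f}")
```

With these invented numbers the ranking is Station 2 only, then both stations, then no assimilation, then Station 3 only, which is the same ordering I see in Experiments 1 to 3. If the additive assumption is roughly right, the underestimate at Station 3 may be partly cancelling overestimates elsewhere in the run without assimilation, and nudging Station 3 removes that cancellation.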
I have the following questions:
First, why does the same set of parameters lead to overestimated simulated flows at Stations 1 and 2, but an underestimated simulated flow at Station 3?
Second, my understanding is that nudging data assimilation essentially treats the watershed above the assimilated station as a reservoir: the simulated flow at that station is replaced with the observed flow (similar to adjusting the reservoir outflow) and then used in the downstream simulation. If so, assimilation only improves the downstream results when the upstream station is biased in the same direction (overestimation or underestimation) as the downstream station. Is this understanding correct?
Third, I am currently working on using nudging data assimilation to optimize simulation results at downstream stations. What types of experiments could I set up for this purpose?
Thank you for your time and consideration. I look forward to your thoughts and to further discussion.
Zed Li