import numpy as np
import pandas as pd

sequence = [3.6630, 2.1860, -1.6470, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
df = pd.DataFrame(sequence, columns=["x"])
df["x_sum"] = df["x"].rolling(10).sum()
df["x_npsum"] = df["x"].rolling(10).agg({"x": np.sum})
df["x_npvol"] = df["x"].rolling(10).agg({"x": np.std})
df["x_vol"] = df["x"].rolling(10).std()
df["something_divided_by_vol"] = 1/df["x_vol"]
print(df)
         x          x_sum        x_npsum       x_npvol         x_vol  \
0    3.663            NaN            NaN           NaN           NaN
1    2.186            NaN            NaN           NaN           NaN
2   -1.647            NaN            NaN           NaN           NaN
3    0.000            NaN            NaN           NaN           NaN
4    0.000            NaN            NaN           NaN           NaN
5    0.000            NaN            NaN           NaN           NaN
6    0.000            NaN            NaN           NaN           NaN
7    0.000            NaN            NaN           NaN           NaN
8    0.000            NaN            NaN           NaN           NaN
9    0.000   4.202000e+00   4.202000e+00  1.458427e+00  1.458427e+00
10   0.000   5.390000e-01   5.390000e-01  9.105647e-01  9.105647e-01
11   0.000  -1.647000e+00  -1.647000e+00  5.208271e-01  5.208271e-01
12   0.000   2.220446e-16   2.220446e-16  1.216675e-08  1.216675e-08
13   0.000   2.220446e-16   2.220446e-16  1.216675e-08  1.216675e-08
14   0.000   2.220446e-16   2.220446e-16  1.216675e-08  1.216675e-08

    something_divided_by_vol
0                        NaN
1                        NaN
2                        NaN
3                        NaN
4                        NaN
5                        NaN
6                        NaN
7                        NaN
8                        NaN
9               6.856701e-01
10              1.098220e+00
11              1.920023e+00
12              8.219124e+07
13              8.219124e+07
14              8.219124e+07
Now, rows 12, 13, and 14 should produce 0.0 for the sum and standard deviation computations, because every value in those windows is zero. Instead, the rolling functions return very small non-zero numbers. When I then scale something by its standard deviation, those same rows (12, 13, 14) blow up. By the way, if I simply print("{0:.20f}".format(np.sum(np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])))), I get 0.00000000000000000000.
Does anyone know what I am doing wrong here? I'm on a Windows 10 machine with essentially the latest Anaconda distribution (Python 3.5.1, pandas 0.18.1, numpy 1.10.4, numexpr 2.5.2, mkl 11.3.3).
best regards
Harm
--
You received this message because you are subscribed to the Google Groups "PyData" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pydata+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Thank you all, that makes things clear. I stumbled upon this when porting some old analysis code from another language to Python. I'll think about just rounding the rolling output, or maybe I will keep that part of the analysis out of pandas. I don't know yet. It just seems like a place where mistakes can easily creep into an analysis if one isn't careful. Of course, one should always be careful; I know.
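For anyone who finds this thread later, the rounding idea could look roughly like the sketch below. It is only one possible approach: the tolerance TOL is a hypothetical value I picked for illustration and should be chosen relative to the scale of your own data, and whether the degenerate rows should become NaN (as here) or something else depends on the analysis.

```python
import numpy as np
import pandas as pd

sequence = [3.6630, 2.1860, -1.6470] + [0.0] * 12
df = pd.DataFrame({"x": sequence})

# Hypothetical tolerance (an assumption, not a pandas default):
# anything smaller than this is treated as floating-point noise.
TOL = 1e-6

vol = df["x"].rolling(10).std()

# Snap near-zero volatilities to exactly 0.0; NaN rows are left as NaN
# because the comparison NaN < TOL is False.
df["x_vol"] = vol.mask(vol.abs() < TOL, 0.0)

# Divide only where the volatility is strictly positive; rows with zero
# (or missing) volatility become NaN instead of a huge number or inf.
df["something_divided_by_vol"] = 1.0 / df["x_vol"].where(df["x_vol"] > 0)

print(df)
```

With this, rows 12 to 14 end up with a volatility of exactly 0.0 and a NaN ratio, while rows 9 to 11 keep their ordinary finite values.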
sequence = [3.6630, 2.1860, -1.6470, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
df = pd.DataFrame(sequence, columns=["x"])
df["x_sum"] = df["x"].rolling(10).apply(np.sum)
df["x_vol"] = df["x"].rolling(10).apply(np.std)
df["something_divided_by_vol"] = 1/df["x_vol"]
print(df)
         x   x_sum     x_vol  something_divided_by_vol
0    3.663     NaN       NaN                       NaN
1    2.186     NaN       NaN                       NaN
2   -1.647     NaN       NaN                       NaN
3    0.000     NaN       NaN                       NaN
4    0.000     NaN       NaN                       NaN
5    0.000     NaN       NaN                       NaN
6    0.000     NaN       NaN                       NaN
7    0.000     NaN       NaN                       NaN
8    0.000     NaN       NaN                       NaN
9    0.000   4.202  1.383586                  0.722760
10   0.000   0.539  0.863838                  1.157625
11   0.000  -1.647  0.494100                  2.023882
12   0.000   0.000  0.000000                       inf
13   0.000   0.000  0.000000                       inf
14   0.000   0.000  0.000000                       inf
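A plausible reason the two approaches differ is sketched below in plain Python. This is an assumption about how a fast rolling sum is typically implemented (one running total, updated by adding the value that enters the window and subtracting the one that leaves), not something confirmed in this thread. The function names are made up for the illustration. Recomputing each window from scratch, which is effectively what .apply(np.sum) does, never carries rounding error from earlier windows into later ones.

```python
import math

def rolling_sum_incremental(values, window):
    """Keep one running total: add the entering value, subtract the leaving one."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]
        # Report NaN until the first full window is available.
        out.append(total if i >= window - 1 else math.nan)
    return out

def rolling_sum_recomputed(values, window):
    """Sum every window from scratch, so no error carries over between windows."""
    out = []
    for i in range(len(values)):
        if i < window - 1:
            out.append(math.nan)
        else:
            out.append(sum(values[i - window + 1 : i + 1]))
    return out

sequence = [3.6630, 2.1860, -1.6470] + [0.0] * 12
incremental = rolling_sum_incremental(sequence, 10)
recomputed = rolling_sum_recomputed(sequence, 10)

for i in range(9, 15):
    print(i, repr(incremental[i]), repr(recomputed[i]))
```

In the recomputed version the all-zero windows sum to exactly 0.0. In the incremental version the total for those windows is whatever is left after adding and then subtracting 3.663, 2.186 and -1.647, which in floating point is not guaranteed to cancel exactly and can leave a residue on the order of 1e-16.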