Reporting a possible bug

D.Y Feng

Mar 6, 2014, 4:00:03 AM
to dpark...@googlegroups.com
In DStream, self.rememberDuration should not be None; it should be 0. The `if` in WindowedDStream's parentRememberDuration should also be removed, otherwise this duration cannot be propagated to the parent properly.

I have changed quite a few other places, so the line numbers below are for reference only.

@@ -44,7 +52,7 @@
         self.outputStreams = []
         self.zeroTime = None
         self.batchDuration = batchDuration
-        self.rememberDuration = None
+        self.rememberDuration = 0


@@ -319,7 +345,7 @@
         self.dependencies = []
 
         self.generatedRDDs = {}
-        self.rememberDuration = None
+        self.rememberDuration = 0
         self.mustCheckpoint = False
         self.checkpointDuration = None
         self.checkpointData = []

@@ -665,7 +711,7 @@
 
     @property       
     def parentRememberDuration(self):
-        if self.rememberDuration:
+        # if self.rememberDuration:
             return self.rememberDuration + self.windowDuration


--
                                                       

DY.Feng(叶毅锋)
yyfeng88625@twitter
Department of Applied Mathematics
Guangzhou University,China
dyf...@stu.gzhu.edu.cn
                                                       

田忠博

Mar 6, 2014, 4:45:42 AM
to dpark-users
rememberDuration = None means that no remember duration has been set. If you only want to remember the most recent few, you can call ssc.remember(duration).
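
A minimal sketch of the semantics under discussion, using a hypothetical `ToyWindowedStream` class rather than DPark's actual WindowedDStream. It assumes the guarded property falls through to None when rememberDuration is unset, as the `if` in the diff suggests; the `guarded` flag is only an illustration device for comparing the two versions.

```python
class ToyWindowedStream:
    """Hypothetical stand-in for WindowedDStream; not DPark's real class."""

    def __init__(self, windowDuration, rememberDuration=None, guarded=True):
        self.windowDuration = windowDuration
        self.rememberDuration = rememberDuration  # None means "not set"
        self.guarded = guarded  # True mimics the original `if` guard

    @property
    def parentRememberDuration(self):
        if self.guarded and not self.rememberDuration:
            # Original behaviour: nothing is passed up to the parent.
            return None
        return (self.rememberDuration or 0) + self.windowDuration


# Never configured: the guard hides the window from the parent.
unset = ToyWindowedStream(windowDuration=30)
# Configured explicitly, as ssc.remember(duration) would do.
explicit = ToyWindowedStream(windowDuration=30, rememberDuration=10)
# Default changed to 0 but guard kept: 0 is falsy, so nothing changes.
zero_guarded = ToyWindowedStream(windowDuration=30, rememberDuration=0)
# Both changes from the proposed patch applied.
patched = ToyWindowedStream(windowDuration=30, rememberDuration=0, guarded=False)

for name, s in [("unset", unset), ("explicit", explicit),
                ("zero_guarded", zero_guarded), ("patched", patched)]:
    print(name, s.parentRememberDuration)
```

Because 0 is falsy in Python, changing the default to 0 has no effect on its own; the proposed patch only changes behaviour because it also removes the guard.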




D.Y Feng

Mar 6, 2014, 5:28:01 AM
to dpark...@googlegroups.com
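Hmm, so that means that if ssc.remember is not set, the upstream of a WindowedDStream will not remember anything within the windowDuration period. That doesn't seem very reasonable... because forgetOldRDDs is called after every RDD.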
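
The concern can be illustrated with a toy model of batch retention. This is a sketch under stated assumptions, not DPark's actual forgetOldRDDs logic: the helper names `retained` and `window_inputs` are hypothetical, and it assumes a parent stream that otherwise keeps only one batch interval of RDDs.

```python
def retained(times, now, remember):
    """Batch times the parent still holds after forgetting old ones.

    Assumption: anything at or before `now - remember` is dropped.
    """
    return [t for t in times if t > now - remember]


def window_inputs(times, now, window):
    """Batch times a window ending at `now` needs from its parent."""
    return [t for t in times if now - window < t <= now]


times = [10, 20, 30, 40]           # batches generated so far
now, batch, window = 40, 10, 30    # illustrative durations

needed = window_inputs(times, now, window)

# Window duration not propagated: parent keeps one batch interval (assumed).
kept_default = retained(times, now, remember=batch)
# Window duration propagated as 0 + windowDuration, as in the proposed patch.
kept_patched = retained(times, now, remember=window)

missing_default = [t for t in needed if t not in kept_default]
missing_patched = [t for t in needed if t not in kept_patched]

print("needed:", needed)
print("missing without propagation:", missing_default)
print("missing with propagation:", missing_patched)
```

Under these assumptions the window needs three batches, but the parent holds only the latest one unless the window duration reaches it.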