Bugs, bugs, bugs


David Isaac Wolinsky

Sep 11, 2009, 9:19:09 PM
to aci...@googlegroups.com
One thing that I was cautious about announcing is that there are some
bugs in Brunet. I have dealt with quite a few of them, but a few
remain.

To see the bugs I've squashed in the past few days please look here:
http://github.com/davidiw/brunet/commits/7d433f966ab243f3dce899b0267be1799971addc

Some bugs that remain:
- Nodes that use FunctionEdgeListener will not allow unclosed edges to
be garbage collected; see this commit:
http://github.com/davidiw/brunet/commit/7d433f966ab243f3dce899b0267be1799971addc
- Node.Abort will not allow the node to be garbage collected ... most
likely due to, though not limited to, this logic:
http://github.com/davidiw/brunet/blob/7d433f966ab243f3dce899b0267be1799971addc/src/protocol/Node.cs#L729

I don't envision any of the bugs being too bad for existing systems, but
I strongly recommend you upgrade regardless.

Also note that, due to a change in the maximum queue size:
http://github.com/davidiw/brunet/blob/7d433f966ab243f3dce899b0267be1799971addc/src/protocol/Node.cs#L341

memory may swell on PlanetLab before the node is restarted. I
hypothesize the node may reach 80 MB before restarting.

I am working on a "HeavyChurnSimulator" that I think would have
revealed most of these problems. I'll wrap it up in the near future,
and I'd like to recommend it be used as the pre-PlanetLab de facto
standard for ensuring there are no leaks.

Also, I'd like to recommend that individuals consider using Visual Studio
when debugging; it sped up this process significantly, especially since
the Mono tools have trouble keeping up with our system. For example, I
tried doing a "heap-shot" with Mono, and before I had even executed my
first "heap-shot" I was already storing 4 GB of application profile
data. Maybe I messed something up, but I reattempted this many, many
times. This also relates to a conversation I had with Pierre regarding
my "conversion" to, or rather usage of, Visual Studio. As a student who
has many tasks, I sadly don't have enough time to improve Mono. Forcing
myself to debug and profile with it in lieu of MS VS hinders my
productivity. In my experience, the MS debuggers and profilers tend to
be of higher quality than Mono's. Alas, if I had the time (and desire),
I would gladly work on improving Mono.

I'll be releasing updated GroupVPN binaries and GridAppliances.
Actually, the former is already done, though I will announce it in
another e-mail.

I need sleep :).

Regards,
David

P. Oscar Boykin

Sep 13, 2009, 2:11:36 PM
to aci...@googlegroups.com
This code looks good to me.

Should I pull it into the master branch?

more below:

On Fri, Sep 11, 2009 at 9:19 PM, David Isaac Wolinsky <dav...@ufl.edu> wrote:
>
> One thing that I was cautious in announcing was that there are some bugs
> in Brunet.  I have dealt with quite a few of them, but there are a few
> remaining bugs.

Why are you cautious? It's great to find bugs. We know they are
there, we just don't know where they all are or exactly how many.

What are the remaining bugs? What behaviors have you seen that cause them?

> To see the bugs I've squashed in the past few days please look here:
> http://github.com/davidiw/brunet/commits/7d433f966ab243f3dce899b0267be1799971addc
>
> Some bugs that remain:
> - Nodes that use FunctionEdgeListener will not allow unclosed edges to
> be garbage collected, see this
> http://github.com/davidiw/brunet/commit/7d433f966ab243f3dce899b0267be1799971addc

It's good to fix this bug for completeness' sake, but I'm not sure I
see how this is a real problem. As long as a node can be garbage
collected and the FEL thread is stopped, eventually all the edges will
be garbage collected (assuming everything works) because none of the
objects can be reached by any running threads.

Also, the edges will be closed eventually by the node if no traffic
comes over them.

Am I misunderstanding this? It's not bad to be precise, so the
patches seem smart to me, but I don't see how unclosed edges would not
be garbage collected.

> - Node.Abort will not allow the node to be garbage collected ... Most
> likely due, though not limited, to this logic:
> http://github.com/davidiw/brunet/blob/7d433f966ab243f3dce899b0267be1799971addc/src/protocol/Node.cs#L729
>
> I don't envision any of the bugs being too bad for existing systems, but
> I strongly recommend you upgrade regardless.
>
> Also note, that due to a change in queue size max:
> http://github.com/davidiw/brunet/blob/7d433f966ab243f3dce899b0267be1799971addc/src/protocol/Node.cs#L341
>
> Memory may swell on planetlab before the node is restarted, I
> hypothesize the node may get up to 80 MB before restarting.
>
> In the future, I am working on a "HeavyChurnSimulator" that I think
> would have revealed most of these problems.  I'll wrap it up in the near
> future and I'd like to recommend it be used as the pre-planetlab defacto
> standard for ensuring there are no leaks.

Sounds like a good plan.

> Also I'd like to recommend that individuals consider using Visual Studio
> when debugging, it helped speed up this process significantly.
> Especially since the mono tools have trouble keeping up with our
> system.  For example, I tried doing a "heap-shot" with Mono and before I
> even executed my first "heap-shot" I was already storing 4 GB of
> application profile data.  Maybe I messed something up, but I
> reattempted this many, many times.  Also to relate to a conversation, I
> had with Pierre regarding my "conversion" or rather usage of Visual
> Studio.  As a student who has many tasks, I sadly don't have enough time
> to improve Mono.  Forcing myself to debug and profile with it in lieu of
> MS VS hinders my productivity.  It is my experience that the MS
> Debuggers and Profilers tend to be of a higher quality than Mono.  Alas
> if I had time (and desire), I would gladly work on improving Mono.
>
> I'll be releasing updated GroupVPN binaries and GridAppliances.
> Actually the former is already done, I will announce it in another
> e-mail though.
>
> I need sleep :).
>
> Regards,
> David



--
P. Oscar Boykin http://boykin.acis.ufl.edu
Assistant Professor, Department of Electrical and Computer Engineering
University of Florida

P. Oscar Boykin

Sep 14, 2009, 4:06:28 PM
to aci...@googlegroups.com
David,

One question about this patch:

http://github.com/davidiw/brunet/commit/51eed5be4386f3cbc3014e0736ada7b90ebeb597

That line of code *was* in there before (I think). It was removed for
simplicity due to this:

http://msdn.microsoft.com/en-us/library/b1yfkh5e.aspx

which says that IDisposable objects should call Dispose in their
destructor. So, if no one ever calls the Close() method of the
LFBlockingQueue, when the object is garbage collected, _are's destructor
will be called, which will call Dispose().

I'm not sure I see the advantage of calling it in our destructor. Can
you explain that one a little more?

Best,

David Isaac Wolinsky

Sep 22, 2009, 12:09:44 PM
to aci...@googlegroups.com
P. Oscar Boykin wrote:
> David,
>
> One question about this patch:
>
> http://github.com/davidiw/brunet/commit/51eed5be4386f3cbc3014e0736ada7b90ebeb597
>
> That line of code *was* in there before (I think). It was removed for
> simplicity due to this:
>
> http://msdn.microsoft.com/en-us/library/b1yfkh5e.aspx
>
> which says that IDisposable objects should call Dispose in their
> destructor. So, if no one ever calls the Close() method of the
> LFBlockingQueue, when the object goes out of scope, _are's destructor
> will be called, which will call Dispose().
>
> I'm not sure I see the advantage of calling it in our destructor. Can
> you explain that one a little more?
>
> Best,
>
Hmm... I was doing some checking with a Visual Studio profiler and it
complained that we weren't properly disposing of our objects. I guess
it wants us to be explicit. This was prior to finding that the edges
weren't being garbage collected properly, so I'd have to re-verify
whether these objects still linger in the heap. For example, Linkers
weren't disappearing from the heap. I don't see how this would hurt,
though, as it would probably make the object eligible for garbage
collection sooner rather than later.
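
A minimal sketch of the trade-off discussed here, written as a Python
analogy rather than Brunet's C#. The class names (WaitHandle,
BlockingQueue) are hypothetical stand-ins for _are and LFBlockingQueue,
and the self-reference exists only to keep plain reference counting from
freeing the object. It suggests that cleanup left to a finalizer runs
whenever the collector gets to it, while an explicit close runs at a
known point.

```python
import gc
import weakref


class WaitHandle:
    """Hypothetical stand-in for the queue's internal event (_are)."""

    def __init__(self, log):
        self._log = log
        self.closed = False
        # Fallback cleanup: runs only when the collector reclaims this object.
        self._finalizer = weakref.finalize(self, log.append, "finalized")

    def close(self):
        # Explicit cleanup: runs immediately and cancels the fallback.
        if not self.closed:
            self.closed = True
            self._finalizer.detach()
            self._log.append("closed")


class BlockingQueue:
    """Hypothetical owner of the handle, loosely modelled on LFBlockingQueue."""

    def __init__(self, log):
        self.handle = WaitHandle(log)
        self.me = self  # reference cycle: only the cycle collector can free it

    def close(self):
        self.handle.close()


def demo():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection from interfering with the demo
    try:
        q = BlockingQueue(log)
        del q                       # dropped without close(): nothing happens yet
        after_drop = list(log)
        gc.collect()                # the fallback runs only now
        after_collect = list(log)

        q2 = BlockingQueue(log)
        q2.close()                  # explicit close takes effect right away
        after_close = list(log)
    finally:
        if was_enabled:
            gc.enable()
    return after_drop, after_collect, after_close
```

Calling demo() returns three snapshots of the log: after dropping the
queue, after a forced collection, and after an explicit close.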

Regards,
David

David Isaac Wolinsky

Sep 22, 2009, 12:15:03 PM
to aci...@googlegroups.com
P. Oscar Boykin wrote:
> This code looks good to me.
>
> Should I pull it into the master branch?
>
> more below:
>
> On Fri, Sep 11, 2009 at 9:19 PM, David Isaac Wolinsky <dav...@ufl.edu> wrote:
>
>> One thing that I was cautious in announcing was that there are some bugs
>> in Brunet. I have dealt with quite a few of them, but there are a few
>> remaining bugs.
>>
>
> Why are you cautious? It's great to find bugs. We know they are
> there, we just don't know where they all are or exactly how many.
>
Well, I had foolishly announced the release of GroupVPN and hated to
inconvenience people when I suspected that I hadn't conquered the root
of the problem.

> What are the remaining bugs? What behaviors have you seen that cause them?
>
The only ones were listed in this e-mail, plus one I found shortly
afterwards: throwing an exception at the queue cap had unpredictable
results. Namely, Edge Listeners didn't handle the exception, and even
when I made sure they did, the system would still get into a stale
state. I guess other enqueuers weren't capable of handling the exception
or didn't handle their item not being enqueued.
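
A small sketch of why throwing at the queue cap is awkward for callers,
again as a Python analogy and not Brunet's code. CappedQueue,
naive_listener and tolerant_listener are hypothetical names. The first
enqueuer assumes enqueue always succeeds and aborts on the first full
queue; the second catches the exception but still silently loses
items, which may be the kind of thing that leaves a system in a stale
state.

```python
import queue


class CappedQueue:
    """Hypothetical bounded queue that raises when the cap is reached."""

    def __init__(self, cap):
        self._q = queue.Queue(maxsize=cap)

    def enqueue(self, item):
        # put_nowait raises queue.Full instead of blocking when the queue is full.
        self._q.put_nowait(item)

    def __len__(self):
        return self._q.qsize()


def naive_listener(q, packets):
    """Enqueuer that assumes enqueue never fails; queue.Full propagates out."""
    handled = 0
    for packet in packets:
        q.enqueue(packet)
        handled += 1
    return handled


def tolerant_listener(q, packets):
    """Enqueuer that survives the exception but drops the rejected items."""
    accepted = dropped = 0
    for packet in packets:
        try:
            q.enqueue(packet)
            accepted += 1
        except queue.Full:
            dropped += 1  # the item is gone unless the caller retries or reports it
    return accepted, dropped
```

With a cap of 2 and five packets, naive_listener raises queue.Full part
way through, while tolerant_listener finishes but reports dropped
packets that nothing downstream will ever see.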

>
>> To see the bugs I've squashed in the past few days please look here:
>> http://github.com/davidiw/brunet/commits/7d433f966ab243f3dce899b0267be1799971addc
>>
>> Some bugs that remain:
>> - Nodes that use FunctionEdgeListener will not allow unclosed edges to
>> be garbage collected, see this
>> http://github.com/davidiw/brunet/commit/7d433f966ab243f3dce899b0267be1799971addc
>>
>
> It's good to fix this bug for completeness sake, but I'm not sure I
> see how this is a real problem. As long as a node can be garbage
> collected and the FEL thread is stopped, eventually all the edges will
> be garbage collected (assuming everything works) because none of the
> objects can be reached by any running threads.
>
> Also, the edges will be closed eventually by the node if no traffic
> comes over them.
>
> Am I misunderstanding this? It's not bad to be precise, so the
> patches seem smart to me, but I don't see how unclosed edges would not
> be garbage collected.
>
Even if a FEL is stopped and the node's edges are all closed and garbage
collected, there are edges that the node doesn't "own" but subscribes
to. Those are added to the ToURI static code and never get garbage
collected, because they never become unreferenced.
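
To illustrate the mechanism described above, here is a sketch in Python
rather than Brunet's C#. It assumes the static code keeps something like
a class-level table of edges; the thread does not show that code, so
Edge, StrongRegistry, WeakRegistry and the "edge-1" identifier are all
made up. An object held by a static table stays reachable after its
owner lets go, while a weak table or an explicit unregister lets it be
collected. Neither option is claimed to be what the actual patch does.

```python
import gc
import weakref


class Edge:
    """Hypothetical stand-in for a Brunet edge."""

    def __init__(self, uri):
        self.uri = uri


class StrongRegistry:
    """Class-level (static) table that holds strong references."""

    _edges = {}

    @classmethod
    def register(cls, edge):
        cls._edges[edge.uri] = edge

    @classmethod
    def unregister(cls, uri):
        # Explicit removal is the only way an entry ever leaves this table.
        cls._edges.pop(uri, None)


class WeakRegistry:
    """Same idea, but entries vanish once nothing else references the edge."""

    _edges = weakref.WeakValueDictionary()

    @classmethod
    def register(cls, edge):
        cls._edges[edge.uri] = edge


def survives(registry):
    """Return True if an edge stays alive after its owner drops it."""
    edge = Edge("edge-1")      # made-up identifier
    registry.register(edge)
    probe = weakref.ref(edge)  # lets us observe the edge without keeping it alive
    del edge                   # the owner lets go
    gc.collect()
    return probe() is not None
```

Here survives(StrongRegistry) reports that the edge is still alive,
while survives(WeakRegistry) reports that it was collected. With the
strong table, the entry only goes away after
StrongRegistry.unregister("edge-1").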