Brunet Regression Testing


David Wolinsky

Jun 15, 2011, 4:54:34 PM
to aci...@googlegroups.com
I've begun a large regression test of Brunet:
https://www.grid-appliance.org/index.php?option=com_p2ppool&view=status&Itemid=107

I split the search two ways: first by contributor, then by guesswork. I
noticed a handful of patches that exist in testing but not in Pierre's
branch and that may have contributed:

https://github.com/davidiw/brunet/commit/01e72a1f99cf58eb71f2e952c5f49eef1fe20a84
https://github.com/davidiw/brunet/commit/29e6f18c1e9ce1673076ae79ab89615e94b1caf5
https://github.com/davidiw/brunet/commit/9a366c54394cf454a508f4c2c57b41f6543328e2

Can anyone tell me whether these were ever tested on PlanetLab and, if
so, what the results were?

Cheers,
David

David Wolinsky

Jun 16, 2011, 12:44:05 AM
to aci...@googlegroups.com
Yeah, I am 98% confident that one of those patches caused the
regression. I'll try to reproduce the consistency problems in the
simulator and see if that helps.

Cheers,
David

David Wolinsky

Jun 16, 2011, 2:11:15 AM
to aci...@googlegroups.com
I think [1] is the problem. Let's peruse the code and see where this exists:

[2] -- I am receiving a connection request ... don't allow if I
already have a connection
[3] -- Successfully received a connection response ... don't allow if I
already have a connection
[4] -- Creating a new linker ... maybe we already have an existing
connection ... now we won't even try

All of these seem counter to the entire purpose of why we made this
change. Sure, we'll create more edges, but we should end up with the
best edge. One thing we could, and potentially should, do is limit
the number of new connection attempts to a specific address, as well
as the number of parallel link attempts. Setting this to 0 retries is
actually already done in ConnectionOverlord.
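
To make the retry-limiting idea concrete, here is a minimal sketch in Python. It is not Brunet code: the class name LinkAttemptLimiter, its parameters, and the string addresses are all hypothetical, and it only illustrates capping attempts per address and parallel attempts without consulting whether a connection already exists.

```python
from collections import defaultdict


class LinkAttemptLimiter:
    """Hypothetical helper (not part of Brunet): caps link attempts per address."""

    def __init__(self, max_attempts_per_address=3, max_parallel=1):
        self.max_attempts_per_address = max_attempts_per_address
        self.max_parallel = max_parallel
        self._attempts = defaultdict(int)   # attempts started since last success
        self._in_flight = defaultdict(int)  # attempts currently running

    def try_start(self, address):
        """Return True and record the attempt if another one is allowed."""
        if self._attempts[address] >= self.max_attempts_per_address:
            return False
        if self._in_flight[address] >= self.max_parallel:
            return False
        self._attempts[address] += 1
        self._in_flight[address] += 1
        return True

    def finish(self, address, succeeded):
        """Mark one in-flight attempt as done; reset the budget on success."""
        if self._in_flight[address] > 0:
            self._in_flight[address] -= 1
        if succeeded:
            self._attempts[address] = 0


limiter = LinkAttemptLimiter(max_attempts_per_address=2, max_parallel=1)
first = limiter.try_start("node-a")      # allowed
parallel = limiter.try_start("node-a")   # refused: one attempt already in flight
limiter.finish("node-a", succeeded=False)
second = limiter.try_start("node-a")     # allowed: second and last attempt
limiter.finish("node-a", succeeded=False)
third = limiter.try_start("node-a")      # refused: attempt budget exhausted
```

The limiter never checks for an existing connection, so a new attempt can still replace a poor edge with a better one; it only bounds how much work is spent per address.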

Given that, I am removing this change, along with all the Connection
locking code, which is unused and only makes the code harder to
read. I'm going to do another test with my fingers
crossed ... tightly.

[1] https://github.com/davidiw/brunet/commit/29e6f18c1e9ce1673076ae79ab89615e94b1caf5
[2] https://github.com/davidiw/brunet/blob/29e6f18c1e9ce1673076ae79ab89615e94b1caf5/src/Brunet/Connections/ConnectionPacketHandler.cs#L478
[3] https://github.com/davidiw/brunet/blob/29e6f18c1e9ce1673076ae79ab89615e94b1caf5/src/Brunet/Connections/LinkProtocolState.cs#L463
[4] https://github.com/davidiw/brunet/blob/29e6f18c1e9ce1673076ae79ab89615e94b1caf5/src/Brunet/Connections/Linker.cs#L852

David Wolinsky

Jun 16, 2011, 2:13:24 AM
to aci...@googlegroups.com
I'd like to add that, given what happens in [2], [3], and [4], I
think [1] can cause race conditions that lead to failed connection
attempts.

Regards,
David


David Wolinsky

Jun 16, 2011, 2:41:54 AM
to aci...@googlegroups.com
Haha, I'm not really having the conversation that I thought might come
from this, so apologies for all the e-mail spam.

I've made a changeset and am now doing a test:
https://github.com/davidiw/brunet/commit/886c4d5d2de5ee699e101db209ac87d030496c12
https://www.grid-appliance.org/index.php?option=com_p2ppool&view=systemstats&pool_id=27

Regards,
David


Pierre St Juste

Jun 16, 2011, 2:50:02 AM
to aci...@googlegroups.com
Unfortunately, I have not studied that section of the code well enough to provide any valuable feedback.

--
You received this message because you are subscribed to the Google Groups "acis.p2p" group.
To post to this group, send email to aci...@googlegroups.com.
To unsubscribe from this group, send email to acisp2p+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/acisp2p?hl=en.




--
Pierre St Juste

David Wolinsky

Jun 16, 2011, 9:17:10 AM
to aci...@googlegroups.com
I was mostly feeling guilty for having a conversation with myself
rather than sending a single e-mail describing everything prior to
going to sleep.

Cheers,
David

David Wolinsky

Jun 16, 2011, 9:21:19 AM
to aci...@googlegroups.com
If you click [1], you'll see that consistency is much higher on the
current deployment. If you go further back in the history, which I have
enabled, you'll see the previous overlay never reached 100% consistency,
even after deployment. I wouldn't say we've conquered the regression,
but we at least have results matching Pierre's experiences and
deployment. I'll let this keep running, and it may become the
next master branch pool.

With that said, beyond some exception hunting in the logs, this
nearly concludes my Brunet consolidation work. I have a patch from
Kyungyong that removes ISender from connection, since it could only
support sending, not return senders, so reqrep was measuring
time for connections and edges separately. Beyond that and merging in
Pierre's tweaks to SVPN, is there anything else?

Cheers,
David

[1] https://www.grid-appliance.org/index.php?option=com_p2ppool&view=systemstats&pool_id=27


Renato Figueiredo

Jun 16, 2011, 9:23:51 AM
to aci...@googlegroups.com
Thank you very much, David! Nice job.

--
Dr. Renato J. Figueiredo
Associate Professor
ACIS Lab - ECE - University of Florida
UF Site Director, Center for Autonomic Computing
http://byron.acis.ufl.edu
ph: 352-392-6430