What algorithm would you use to find the "best" file by speed?

VPN user

Mar 10, 2016, 11:06:12 PM

What algorithm would you use to find the "best" file by speed?

Which speed is most important?
1. Download speed?
2. Upload speed?
3. Ping times?

a. An average of all three?
b. The total of all three?
c. The mean?

Once I have a bunch of speed-tagged ovpn files, I need to *sort* them by
speed, which is easy enough:

$ ls
vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_tcp995_20160308_0.46dn_0.38up_738ms.ovpn
vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_tcp995_20160310_1.22dn_0.51up_582ms.ovpn
vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_udp1195_20160308_1.07dn_0.17up_325ms.ovpn
vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_udp1195_20160310_1.50dn_1.59up_320ms.ovpn
vpngate_DE_na_na_na_176.199.177.41-vpn281592706.opengw.net_tcp1921_20160310_1.21dn_0.69up_595ms.ovpn
vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160304_5.70dn_1.42up_287ms.ovpn
vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160307_5.60dn_3.63up_284ms.ovpn
vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_tcp995_20160304_2.07dn_0.00up_500ms.ovpn
vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_tcp995_20160307_0.99dn_0.00up_521ms.ovpn
vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_udp1195_20160304_5.55dn_5.54up_278ms.ovpn
vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_udp1195_20160307_5.48dn_3.81up_278ms.ovpn
vpngate_DE_na_na_na_217.240.31.81-217.240.31.81_udp1195_20160309_0.73dn_0.00up_534ms.ovpn
vpngate_DE_na_na_na_217.240.31.81-vpn187314653.opengw.net_udp1195_20160308_2.46dn_0.00up_482ms.ovpn
vpngate_DE_na_na_na_5.189.135.48-5.189.135.48_udp1195_20160305_3.37dn_1.48up_299ms.ovpn

$ alias vpnsort='vpnsort.sh'
$ cat $(which vpnsort.sh)
#!/bin/bash
# vpnsort lists speed-tagged files slowest to fastest
ls *dn_*up_*ms*.ovpn | awk -F"_" '{print $9,$10,$11,$0}' | sort -n
exit 0
## End ##

$ vpnsort
0.46dn 0.38up 738ms.ovpn vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_tcp995_20160308_0.46dn_0.38up_738ms.ovpn
0.73dn 0.00up 534ms.ovpn vpngate_DE_na_na_na_217.240.31.81-217.240.31.81_udp1195_20160309_0.73dn_0.00up_534ms.ovpn
0.99dn 0.00up 521ms.ovpn vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_tcp995_20160307_0.99dn_0.00up_521ms.ovpn
1.07dn 0.17up 325ms.ovpn vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_udp1195_20160308_1.07dn_0.17up_325ms.ovpn
1.21dn 0.69up 595ms.ovpn vpngate_DE_na_na_na_176.199.177.41-vpn281592706.opengw.net_tcp1921_20160310_1.21dn_0.69up_595ms.ovpn
1.22dn 0.51up 582ms.ovpn vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_tcp995_20160310_1.22dn_0.51up_582ms.ovpn
1.50dn 1.59up 320ms.ovpn vpngate_DE_09_Saarland_Neunkirchen_95.88.252.235-vpn149931469.opengw.net_udp1195_20160310_1.50dn_1.59up_320ms.ovpn
2.07dn 0.00up 500ms.ovpn vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_tcp995_20160304_2.07dn_0.00up_500ms.ovpn
2.46dn 0.00up 482ms.ovpn vpngate_DE_na_na_na_217.240.31.81-vpn187314653.opengw.net_udp1195_20160308_2.46dn_0.00up_482ms.ovpn
3.37dn 1.48up 299ms.ovpn vpngate_DE_na_na_na_5.189.135.48-5.189.135.48_udp1195_20160305_3.37dn_1.48up_299ms.ovpn
5.48dn 3.81up 278ms.ovpn vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_udp1195_20160307_5.48dn_3.81up_278ms.ovpn
5.55dn 5.54up 278ms.ovpn vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_udp1195_20160304_5.55dn_5.54up_278ms.ovpn
5.60dn 3.63up 284ms.ovpn vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160307_5.60dn_3.63up_284ms.ovpn
5.70dn 1.42up 287ms.ovpn vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160304_5.70dn_1.42up_287ms.ovpn

Notice the "average" numbers of the penultimate file in the list above
may indicate it's faster overall than the last file, which is the fastest
by download speed.
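
To compare files on more than the first field, it helps to pull the three figures out of each name as plain numbers. The following is only a sketch: the helper name vpn_fields is made up here, and it assumes the speed tags are always the last three underscore-separated fields, as in the listing above.

```bash
# vpn_fields: hypothetical helper, not part of vpnsort.sh.
# Reads file names on stdin and prints "dn up ping filename", assuming each
# name ends in _<dn>dn_<up>up_<ping>ms.ovpn as in the listing above.
vpn_fields() {
    awk -F'_' 'NF >= 3 {
        dn = $(NF-2); up = $(NF-1); ping = $NF
        sub(/dn$/, "", dn); sub(/up$/, "", up); sub(/ms\.ovpn$/, "", ping)
        print dn, up, ping, $0
    }'
}

# Demo on the last two files of the sorted list: total of dn + up per file.
printf '%s\n' \
    'vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160307_5.60dn_3.63up_284ms.ovpn' \
    'vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160304_5.70dn_1.42up_287ms.ovpn' |
    vpn_fields | awk '{ printf "%.2f total Mbps  %s\n", $1 + $2, $4 }'
```

Counting fields from the end of the name, rather than using fixed positions 9 to 11, is a design choice of this sketch so that an extra underscore in the location part would not shift the columns.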

So I'm not sure *what* algorithm to use for choosing the "best" speed.

What algorithm would you use to find the "best" file by speed?

Bit Twister

Mar 11, 2016, 6:30:27 AM

On Fri, 11 Mar 2016 04:06:10 -0000 (UTC), VPN user wrote:
> What algorithm would you use to find the "best" file by speed?
>
> Which speed is most important?
> 1. Download speed?
> 2. Upload speed?
> 3. Ping times?

Direction depends on the majority of data being transferred. :)

> a. An average of all three?
> b. The total of all three?
> c. The mean?

Look at results of your samples.
0.46dn 0.38up 738ms.ovpn
0.73dn 0.00up 534ms.ovpn
0.99dn 0.00up 521ms.ovpn
1.07dn 0.17up 325ms.ovpn
1.21dn 0.69up 595ms.ovpn
1.22dn 0.51up 582ms.ovpn
1.50dn 1.59up 320ms.ovpn
2.07dn 0.00up 500ms.ovpn
2.46dn 0.00up 482ms.ovpn
3.37dn 1.48up 299ms.ovpn
5.48dn 3.81up 278ms.ovpn
5.55dn 5.54up 278ms.ovpn
5.60dn 3.63up 284ms.ovpn
5.70dn 1.42up 287ms.ovpn

Above is sorted by download speed; would you pick the best ping time
for downloads? :(

> ls *dn_*up_*ms*.ovpn | awk -F"_" '{print $9,$10,$11,$0}' | sort -n

That does work, with the benefit of showing the selection values next
to the file name.
Your final logic may be able to do away with awk. Example:
ls *dn_*up_*ms*.ovpn | sort -V -t '_' --key=9 --key=10

I know when I am sorting package names for highest version I have
found it best to use -V instead of -n.

Have you considered the case of picking the highest upload speed if
the download speed is the same?
You can help that logic by using --key switches. See above example.

Since we are speaking about --key, you can have the user pick which is
the best key to pick from and use it in your sort command.

Example:
case $_arg1 in
-d) _sort_key='--key=9 --key=10' ;;
-u) _sort_key='--key=10 --key=9' ;;
-p) _sort_key='--key=11 --key=9 --key=10' ;;
*) usage "Error: invalid switch $_arg1" ;;
esac

_ovpn_fn=$(ls *dn_*up_*ms*.ovpn | sort -V -t '_' $_sort_key | head -1)
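
Two details of the snippet above are worth checking before relying on it. With GNU sort, --key=9 without an end field makes the key run from field 9 to the end of the line, and an ascending sort followed by head -1 returns the slowest entry for the speed keys. The sketch below is one way to tighten that up, not the original poster's code. The function name pick_best is made up, it assumes GNU sort and the 11-field names used in this thread, and it swaps -V for per-key -n, since the tags are decimal numbers.

```bash
# pick_best: hypothetical helper; reads file names on stdin and prints the
# single best name for the chosen criterion.
#   -d  download first, then upload (highest first)
#   -u  upload first, then download (highest first)
#   -p  ping first (lowest first), then download and upload
pick_best() {
    local keys
    case $1 in
        -d) keys='-k9,9nr -k10,10nr' ;;
        -u) keys='-k10,10nr -k9,9nr' ;;
        -p) keys='-k11,11n -k9,9nr -k10,10nr' ;;
        *)  echo "usage: pick_best -d|-u|-p" >&2; return 1 ;;
    esac
    # $keys is left unquoted on purpose so it splits into separate options.
    # LC_ALL=C keeps "." as the decimal point regardless of the user's locale.
    LC_ALL=C sort -t '_' $keys | head -n 1
}
```

A call could look like: printf '%s\n' *dn_*up_*ms*.ovpn | pick_best -d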

VPN user

Mar 11, 2016, 11:00:36 AM

Bit Twister wrote...

> Direction depends on the majority of data being transferred. :)

I guess I deserved that response! :)

Hear me out though, because I'm trying to figure out how the "brain"
works first, and then to implement *that* in code.

I think I'm a light-duty user in that I mostly just use the web for
searches (where I don't like Google knowing everything that I do)
or for reading or watching youtube (low bandwidth stuff) and for
text-only Usenet posting.

Once in a great while (like once every few months) I run a torrent,
usually to get a movie that is an oldie from the 40s or 50s or 60s
which are usually less than a GB in size.

So, the speeds I get on the relatively faster vpngate servers are fine,
such as this server (5.55dn 5.54up 278ms.ovpn):
vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_udp1195_20160304_5.55dn_5.54up_278ms.ovpn

I don't "game" at all, so, I don't know if the "ping" times are
relevant to me. Generally I've noticed slow ping times (~> 500ms)
are on slow servers (~<1Mbps) but in and of itself, I'm not sure
how the latency matters for stuff I do (I don't do VOIP on VPN for
example).

> Look at results of your samples.
> 0.46dn 0.38up 738ms.ovpn
> 0.73dn 0.00up 534ms.ovpn
> 0.99dn 0.00up 521ms.ovpn
> 1.07dn 0.17up 325ms.ovpn
> 1.21dn 0.69up 595ms.ovpn
> 1.22dn 0.51up 582ms.ovpn
> 1.50dn 1.59up 320ms.ovpn
> 2.07dn 0.00up 500ms.ovpn
> 2.46dn 0.00up 482ms.ovpn
> 3.37dn 1.48up 299ms.ovpn
> 5.48dn 3.81up 278ms.ovpn
> 5.55dn 5.54up 278ms.ovpn
> 5.60dn 3.63up 284ms.ovpn
> 5.70dn 1.42up 287ms.ovpn
>
> Above is sorted by download speed; would you pick the best ping time
> for downloads? :(

Here's what I was trying to get at, I guess.
Notice the last 5 times above.
The ping times are about the same, so let's just look at speeds:
3.37dn 1.48up
5.48dn 3.81up
5.55dn 5.54up
5.60dn 3.63up
5.70dn 1.42up

To a "human", it seems the middle one is the best.
5.55dn 5.54up

But *how* did I derive that?
I don't know. It just *looks* better to me.

My "human" brain processes this by thinking that the download speeds
are all within a "range" (give or take a bit) such that it *looks* like,
to my brain, something like this:
3dn 1up
5dn 4up
5dn 5up
5dn 4up
5dn 1up

When I look at it *that* way, the middle numbers seem best to me.
5dn 5up

They're consistent in both directions, and they're reasonably high
in both directions even though they weren't the highest in either
direction.
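
One possible way to put "reasonably high in both directions" into code is to rank each server by its weaker direction and take the server whose weaker direction is strongest. That rule is an assumption about what the eye is doing here, not something established in this thread, and the helper name best_balanced is made up.

```bash
# best_balanced: hypothetical helper; reads lines of "dn up ping name" and
# prints the line whose smaller speed (the weaker direction) is the largest.
best_balanced() {
    awk '{
        dn = $1 + 0; up = $2 + 0
        weak = (dn < up) ? dn : up
        if (NR == 1 || weak > best) { best = weak; line = $0 }
    } END { if (NR) print line }'
}

# The five fast servers from the list above.
best_balanced <<'EOF'
3.37 1.48 299 a
5.48 3.81 278 b
5.55 5.54 278 c
5.60 3.63 284 d
5.70 1.42 287 e
EOF
```

On these five entries the weaker values are 1.48, 3.81, 5.54, 3.63 and 1.42, so the rule selects the 5.55dn 5.54up entry.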

So, I guess, I'm trying to figure out what algorithm "my brain" used
first, before I can figure out what algorithm to implement in code.

> Your final logic may be able to do away with awk. Example:
> ls *dn_*up_*ms*.ovpn | sort -V -t '_' --key=9 --key=10

I like that, but I think the "algorithm" should pick that middle
file of the last five, which would have been (in the sort above)
file number 3 below:
1. vpngate_DE_na_na_na_5.189.135.48-5.189.135.48_udp1195_20160305_3.37dn_1.48up_299ms.ovpn
2. vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_udp1195_20160307_5.48dn_3.81up_278ms.ovpn
3. vpngate_DE_na_na_na_213.136.71.159-vpn755373027.opengw.net_udp1195_20160304_5.55dn_5.54up_278ms.ovpn
4. vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160307_5.60dn_3.63up_284ms.ovpn
5. vpngate_DE_na_na_na_213.136.71.159-213.136.71.159_udp1195_20160304_5.70dn_1.42up_287ms.ovpn

> I know when I am sorting package names for highest version I have
> found it best to use -V instead of -n.

I've modified the one-line vpnsort script to use the "-V" but I think
I first need to figure out what algorithm my "brain" used to choose
file #3 above instead of file #5.

> Have you considered the case of picking the highest upload speed if
> the download speed is the same?
> You can help that logic by using --key switches. See above example.

I thought about double sorts and figured the chances of an *exact* match
were slim and rare, but what I think I really want is to do some kind
of analysis of both the download and upload (taking ping into account
only momentarily, just in case pings suck).

So, mostly I'm trying to figure out what *algorithm* my brain used to
choose file #3 in that list above, rather than file #5 (which had the
fastest download), and not merely because file #3 had the fastest upload.

> Since we are speaking about --key, you can have the user pick which is
> the best key to pick from and use it in your sort command.
>
> Example:
> case $_arg1 in
> -d) _sort_key='--key=9 --key=10' ;;
> -u) _sort_key='--key=10 --key=9' ;;
> -p) _sort_key='--key=11 --key=9 --key=10' ;;
> *) usage "Error: invalid switch $_arg1" ;;
> esac
>
> _ovpn_fn=$(ls *dn_*up_*ms*.ovpn | sort -V -t '_' $_sort_key | head -1)

Thanks for that code snippet.

Your suggestion used things I wasn't familiar with (as I've never had a
script take arguments other than a file name), and it didn't work on
the first pass so I'll play with it a bit to implement the useful
choice mechanism of types of sorts to run to choose the best file.

William Unruh

Mar 11, 2016, 11:53:59 AM

On 2016-03-11, VPN user <vpn...@example.com> wrote:
> Bit Twister wrote...
>
>> Direction depends on the majority of data being transferred. :)
>
> I guess I deserved that response! :)
>
> Hear me out though, because I'm trying to figure out how the "brain"
> works first, and then to implement *that* in code.

Why? You state that you are a low-usage user. Do you really care about
the speed, as long as it is not absurdly slow? I.e., it does not matter.

>
> I think I'm a light-duty user in that I mostly just use the web for
> searches (where I don't like Google knowing everything that I do)
> or for reading or watching youtube (low bandwidth stuff) and for
> text-only Usenet posting.
>
> Once in a great while (like once every few months) I run a torrent,
> usually to get a movie that is an oldie from the 40s or 50s or 60s
> which are usually less than a GB in size.
>

...

>
> Here's what I was trying to get at, I guess.
> Notice the last 5 times above.
> The ping times are about the same, so let's just look at speeds:
> 3.37dn 1.48up
> 5.48dn 3.81up
> 5.55dn 5.54up
> 5.60dn 3.63up
> 5.70dn 1.42up

No. If I wanted only to download I would choose the last; if I were
only uploading, the middle. And if I did not care, probably the middle
one, but I would not really care. (And besides, these results are likely
HIGHLY time-dependent, i.e., if I ran it again 5 min later I would get very
different results.)

>
> To a "human", it seems the middle one is the best.
> 5.55dn 5.54up

No. Depends on what I want to do.

>
> But *how* did I derive that?
> I don't know. It just *looks* better to me.

It has high upload and download times.

>
> My "human" brain processes this by thinking that the download speeds
> are all within a "range" (give or take a bit) such that it *looks* like,
> to my brain, something like this:
> 3dn 1up
> 5dn 4up
> 5dn 5up
> 5dn 4up
> 5dn 1up
>
> When I look at it *that* way, the middle numbers seem best to me.
> 5dn 5up
>
> They're consistent in both directions, and they're reasonably high
> in both directions even though they weren't the highest in either
> direction.

And the results are probably only accurate to half a significant
figure (i.e., even a 3 vs. a 5 would not matter, because which was faster
would change from time to time).

If one HAS to make a choice the human brain will pick, based on all
kinds of totally irrelevant material. Even -- It was first in the list.
Or It was printed in blue and I like blue.

>
> So, I guess, I'm trying to figure out what algorithm "my brain" used
> first, before I can figure out what algorithm to implement in code.

So perhaps you should figure out what you want the algorithm to do
first.

>
>
> So, mostly I'm trying to figure out what *algorithm* my brain used to
> choose file #3 in that list above, rather than file #5 (which had the
> fastest download), and not merely because file #3 had the fastest upload.

Because the instructions you sent your brain at that moment were
hopelessly vague, so it grabbed something, anything, to make a decision.
Decisions are costly. It is hugely taxing on the brain to make decisions
so it will make one based on the most irrelevant criteria just so the
decision gets made.

Mike Easter

Mar 11, 2016, 12:01:00 PM

To a.o.l only

VPN user wrote:
> I think I'm a light-duty user in that I mostly just use the web for
> searches (where I don't like Google knowing everything that I do)
> or for reading or watching youtube (low bandwidth stuff) and for
> text-only Usenet posting.
>
> Once in a great while (like once every few months) I run a torrent,
> usually to get a movie that is an oldie from the 40s or 50s or 60s
> which are usually less than a GB in size.

I opine that download is 3x as important (ergo 'weighty' in this
context) as upload, maybe more, like 5x.
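
As a rough sketch of that weighting, the score can be taken as W times the download speed plus the upload speed, with W set to 3 or 5. The helper name rank_weighted and the "dn up ping name" input layout are assumptions made for this example.

```bash
# rank_weighted W: hypothetical helper; reads lines of "dn up ping name" and
# prints "score name", highest score first, where score = W*dn + up.
rank_weighted() {
    awk -v w="$1" '{ printf "%.2f %s\n", w * $1 + $2, $NF }' |
        LC_ALL=C sort -k1,1nr
}

# Download weighted 3x on the five fastest servers from the earlier posts.
rank_weighted 3 <<'EOF'
3.37 1.48 299 a
5.48 3.81 278 b
5.55 5.54 278 c
5.60 3.63 284 d
5.70 1.42 287 e
EOF
```

With either 3x or 5x, the 5.55dn 5.54up entry still ranks first among these five, because its upload lead outweighs the small download gap.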

--
Mike Easter

Bit Twister

Mar 11, 2016, 12:26:43 PM

Unless a person is running a file server of some sort I would suggest
the normal direction is 99.8% download, .1% ack/nack traffic overhead
and some .1% typing upload and dns work.

I find it funny that a lot of work went into getting lots of VPN
connection files in order to make tracking difficult, only to come along
and nullify that work by always picking a few fast connections. :)

VPN user

Mar 11, 2016, 12:46:50 PM

William Unruh wrote in message nbut1b$gcc$1...@dont-email.me

> Why? You state that you are a low-usage user. Do you really care about
> the speed, as long as it is not absurdly slow? I.e., it does not matter.

Nobody, given the choice of a slow server versus a faster server
would choose the slower one, even if they're light-duty users.

> No. If I wanted only to download I would choose the last; if I were
> only uploading, the middle. And if I did not care, probably the middle
> one, but I would not really care. (And besides, these results are likely
> HIGHLY time-dependent, i.e., if I ran it again 5 min later I would get very
> different results.)

One statement you made is very true, and the other very untrue:
1. It's true that the results are extremely time-dependent.
Run them twice in the same ten-minute period & results will differ!

2. It's not really true that only download matters if you only download,
since there is almost no such thing as just downloading. There is
always a bit of uploading even when downloading, if for no other
reason than communication purposes.

>> To a "human", it seems the middle one is the best.
>> 5.55dn 5.54up
>
> No. Depends on what I want to do.

Well. To "me" it's pretty clear that "my brain" processes that list
and figures out that the middle one was best. The difference between
the middle one and the fastest on download wasn't great - but the
difference between the middle one and the fastest on upload was great.

My point is that the algorithm that the brain uses is not just to
pick the biggest number. At the very least, we round off the numbers
in our head - so the algorithm I select should probably round off all
numbers to 1 decimal place at most, and probably to zero decimal
places in reality.

>> But *how* did I derive that?
>> I don't know. It just *looks* better to me.
>
> It has high upload and download times.

Yes. But my "brain" didn't select the one with the highest download
speed!

So, how does "my brain" seem to compensate to pick the one that had
both high download and upload times, but not the highest perhaps of
each?

That's the algorithm I'm trying to figure out.

Once I figure out the algorithm, it's a *different* task to implement
that algorithm in code.

> And the results are probably only accurate to half a significant
> figure (i.e., even a 3 vs. a 5 would not matter, because which was faster
> would change from time to time).

I do agree that the accuracy is NOT to the two decimal places that
we see in the upload/download numbers. You're probably correct that
the algorithm should limit the numbers to 1/2 such as 1, 1.5, 2, 2.5,
3, 3.5, etc. Maybe even whole numbers such as 1, 2, 3, 4 are best?

> If one HAS to make a choice the human brain will pick, based on all
> kinds of totally irrelevant material. Even -- It was first in the list.
> Or It was printed in blue and I like blue.

I disagree. Respectfully so.

I think if I figure out a good algorithm, it will pick, out of 1000
files, the best files overall, based on the four available inputs:
a. Ping times (preferably <200 ms but the lower the better)
b. Download speeds (preferably above 5Mbps but the higher the better)
c. Upload speeds (preferably above 5Mbps but the higher the better)
d. How much these 3 things above deviate from each other.

For example, if there was a great file based on speeds, but the ping
times were unusually high, "my brain" would probably not choose it.

Likewise, if there was a great download but just a mediocre upload,
then "my brain" would see that as flaky, and I would probably not
choose it.

However, if my "brain" saw nice ping times (but perhaps not the best)
with good download speeds (but perhaps not the best) and decent upload
speeds (but perhaps not the best), overall, my "brain" might see
that *consistently good* ovpn file as the most-solid server.

> Because the instructions you sent your brain at that moment were
> hopelessly vague, so it grabbed something, anything, to make a decision.
> Decisions are costly. It is hugely taxing on the brain to make decisions
> so it will make one based on the most irrelevant criteria just so the
> decision gets made.

I agree that looking at the output of a thousand files would be hard
for the brain, but I think I can come up with an algorithm that works.

It would not be the fastest of one item only.

For example, what do you think of this 2-step algorithm?
STEP 1: First find the fastest 5% for ping, download, & upload respectively.
STEP 2: Then compare the 3 lists to report any files in all 3 lists.

If we can't find files in the top 5% of all three lists, then we check
the top 10%, and if still nothing, then we move on to the top 20%,
and so on, until we find at least one match.
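
The two steps and the widening rule can be sketched as follows. This is a hedged illustration rather than a finished tool: the helper name top_overlap and the "dn up ping name" input layout are made up, ties share a rank, the cutoff count is rounded up, and "and so on" is read as doubling the percentage up to 100.

```bash
# top_overlap: hypothetical helper; reads lines of "dn up ping name" and
# prints "PCT name" for every file that is in the top PCT% for download,
# upload, and ping at once, widening PCT until at least one file qualifies.
top_overlap() {
    awk '{ dn[NR] = $1 + 0; up[NR] = $2 + 0; pg[NR] = $3 + 0; name[NR] = $4 }
    END {
        n = NR
        for (i = 1; i <= n; i++) {
            # rank = 1 + number of files that are strictly better
            rd[i] = 1; ru[i] = 1; rp[i] = 1
            for (j = 1; j <= n; j++) {
                if (dn[j] > dn[i]) rd[i]++
                if (up[j] > up[i]) ru[i]++
                if (pg[j] < pg[i]) rp[i]++      # lower ping is better
            }
        }
        for (p = 5; ; p *= 2) {
            if (p > 100) p = 100
            k = int(n * p / 100)
            if (k < n * p / 100) k++            # round the cutoff up
            found = 0
            for (i = 1; i <= n; i++)
                if (rd[i] <= k && ru[i] <= k && rp[i] <= k) {
                    print p, name[i]; found = 1
                }
            if (found || p == 100) break
        }
    }'
}
```

Worked by hand on the 14 sample files from the first post, this rule finds nothing at 5% or 10% and stops at 20% (a cutoff of 3 files) with two matches: the 5.55dn 5.54up 278ms file and the 5.60dn 3.63up 284ms file.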

Is that a good "brain" algorithm?

VPN user

Mar 11, 2016, 12:50:34 PM

Mike Easter wrote in message dkgbu8...@mid.individual.net

> I opine that download is 3x as important (ergo 'weighty' in this
> context) as upload, maybe more, like 5x.

That makes sense to weight the download higher than the upload.

Given that download is probably the *most* important figure, how
does this "brain" algorithm sound to you?

1. Find the top 10% files by download speed.
2. Of those, find the top 10% by upload speed.
3. Of those, find the top 10% by ping times.

Does that look like a decent algorithm for finding the best files?

VPN user

Mar 11, 2016, 1:15:14 PM

Bit Twister wrote in message slrnne5vuh.6...@wb.home.test

> Unless a person is running a file server of some sort I would suggest
> the normal direction is 99.8% download, .1% ack/nack traffic overhead
> and some .1% typing upload and dns work.

Is this a decent summary of priorities?
a. Download speeds are usually most important
b. Upload speeds are much less important
c. Ping times seem to be even less important

So, maybe the "brain" algorithm is:
A. Pick files in the top 10% for download speed
B. Pick out of that list, those that are in the top 20% for upload speed
C. Then pick out of that new list, those that are in the top 30% for ping
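
Steps A to C can be sketched as a chain of filters. The helper names top_pct and cascade and the "dn up ping name" input layout are assumptions for illustration, and each cutoff is rounded up so that a stage never returns an empty list.

```bash
# top_pct PCT SORT_OPTS...: hypothetical helper; keeps the first PCT% of the
# stdin lines (rounded up, at least one) after sorting with the given options.
top_pct() {
    local pct=$1 lines n keep
    shift
    lines=$(cat)
    n=$(printf '%s\n' "$lines" | grep -c . || true)
    keep=$(( (n * pct + 99) / 100 ))        # integer ceiling of n*pct/100
    if (( keep < 1 )); then keep=1; fi
    printf '%s\n' "$lines" | LC_ALL=C sort "$@" | head -n "$keep"
}

# cascade A B C: top A% by download, then top B% of those by upload,
# then top C% of those by ping (lowest first).
cascade() {
    top_pct "$1" -k1,1nr | top_pct "$2" -k2,2nr | top_pct "$3" -k3,3n
}
```

Worked by hand on the 14 sample files, "cascade 10 20 30" keeps the two fastest downloads (5.70 and 5.60), then the better upload of those two, and ends on the 5.60dn 3.63up 284ms file rather than the 5.55dn 5.54up one. A strict cascade lets the first criterion dominate, which may or may not be the intended behaviour.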

> I find it funny that a lot of work went into getting lots of VPN
> connection files in order to make tracking difficult, only to come along
> and nullify that work by always picking a few fast connections.

This is a fair observation.

Bear in mind that vpngate is designed to make censorship hard, while
I'm not using it for that purpose.

I could pick a different free VPN service, but, most that I know of
have one or more of the following "issues":
a. login/password changes (e.g., weekly changes listed on their web page)
b. Time limits (e.g., 30 minute sessions perhaps)
c. Bandwidth limits (e.g., 1Mbps caps perhaps)
d. Server blocks (e.g., Craigslist blocks a lot of the commercial free vpns)
e. Reliability (e.g., server drops suddenly on many of these free vpns)
f. Protocol (e.g., some are only PPTP & some require their own clients)
g. Some require registration by email for the login/password

This is based on my personal testing of the following free VPN services:
http://freevpn.me/accounts
https://www.vpnme.me/freevpn.html
http://www.vpnbook.com/freevpn
http://vpn.vpnreactor.com
https://www.vpnoneclick.com
http://freeusvpn.itshidden.eu
http://www.securitykiss.com
https://www.threatspike.com/portal/register
http://vpngate.net

Given those tests, none are perfect, and all work to some extent; but
I choose vpngate.net because mostly it has the fewest of those
annoyances above.

William Unruh

Mar 11, 2016, 3:28:02 PM

On 2016-03-11, VPN user <vpn...@example.com> wrote:

> William Unruh wrote in message nbut1b$gcc$1...@dont-email.me
>
>> Why? You state that you are a low-usage user. Do you really care about
>> the speed, as long as it is not absurdly slow? I.e., it does not matter.
>
> Nobody, given the choice of a slow server versus a faster server
> would choose the slower one, even if they're light-duty users.
>
>> No. If I wanted only to download I would choose the last; if I were
>> only uploading, the middle. And if I did not care, probably the middle
>> one, but I would not really care. (And besides, these results are likely
>> HIGHLY time-dependent, i.e., if I ran it again 5 min later I would get very
>> different results.)
>
> One statement you made is very true, and the other very untrue:
> 1. It's true that the results are extremely time-dependent.
> Run them twice in the same ten-minute period & results will differ!
>
> 2. It's not really true that only download matters if you only download,
> since there is almost no such thing as just downloading. There is
> always a bit of uploading even when downloading, if for no other
> reason than communication purposes.

Yes, so if download speeds are high but upload are zero you are in
trouble. But most home ISPs have vastly different upload and download
speeds (10MB/s down vs .5MB/s upload), which makes virtually no difference
to the usability.

>
>>> To a "human", it seems the middle one is the best.
>>> 5.55dn 5.54up
>>
>> No. Depends on what I want to do.
>
> Well. To "me" it's pretty clear that "my brain" processes that list
> and figures out that the middle one was best. The difference between
> the middle one and the fastest on download wasn't great - but the
> difference between the middle one and the fastest on upload was great.

Well, my brain (and I am a human) processes it differently depending on
what I want to do. So, it would seem what you want is not what a "human"
does, but what you do.

>
> My point is that the algorithm that the brain uses is not just to
> pick the biggest number. At the very least, we round off the numbers
> in our head - so the algorithm I select should probably round off all
> numbers to 1 decimal place at most, and probably to zero decimal
> places in reality.

Of course, since we really do not care if it is 5.4 or 5.5 especially
when we know that measurement is out by a factor of 2 depending on the
time. If it is a critical situation and we know that the numbers
actually mean something and we want to download, we will probably pick
the one with the fastest download speed, as long as the upload is not
absurdly small. Similarly with upload speed. And if we do not care, we
do not care.

>
>>> But *how* did I derive that?
>>> I don't know. It just *looks* better to me.
>>
>> It has high upload and download times.
>
> Yes. But my "brain" didn't select the one with the highest download
> speed!

IF I wanted to download I would pick the one with the highest download
speed. If I wanted to upload, the highest upload speed. That is how my
brain operates.

>
> So, how does "my brain" seem to compensate to pick the one that had
> both high download and upload times, but not the highest perhaps of
> each?

Since none of us have your brain to study, this is an impossible
question to answer, and as I have pointed out, my brain apparently
operates differently from yours.

>
> That's the algorithm I'm trying to figure out.
>
> Once I figure out the algorithm, it's a *different* task to implement
> that algorithm in code.
>
>> And the results are probably only accurate to half a significant
>> figure (i.e., even a 3 vs. a 5 would not matter, because which was faster
>> would change from time to time).
>
> I do agree that the accuracy is NOT to the two decimal places that
> we see in the upload/download numbers. You're probably correct that
> the algorithm should limit the numbers to 1/2 such as 1, 1.5, 2, 2.5,
> 3, 3.5, etc. Maybe even whole numbers such as 1, 2, 3, 4 are best?

As I said, about half a decimal place is all they are worth. Ie, take
the log base 10, multiply by 2, and round off.

>
>> If one HAS to make a choice the human brain will pick, based on all
>> kinds of totally irrelevant material. Even -- It was first in the list.
>> Or It was printed in blue and I like blue.
>
> I disagree. Respectfully so.
>
> I think if I figure out a good algorithm, it will pick, out of 1000

You asked how the "human brain" picks things. I was trying to tell you.
Now you say that is not a "good algorithm". So you do not want how
humans do it, but rather some definition of "good".

VPN user

Mar 11, 2016, 9:17:10 PM

William Unruh wrote in message nbv9in$4vr$1...@dont-email.me

> Yes, so if download speeds are high but upload are zero you are in
> trouble. But most home ISPs have vastly different upload and download
> speeds (10MB/s down vs .5MB/s upload), which makes virtually no difference
> to the usability.

This makes sense.
So would you agree the following is the "algorithm"?
1. Give download speed the most weight.
2. Give upload speed a LOT less weight.
3. Give ping times not much weight at all?

> Well, my brain (and I am a human) processes it differently depending on
> what I want to do. So, it would seem what you want is not what a "human"
> does, but what you do.

Heh heh heh ... :)

> Of course, since we really do not care if it is 5.4 or 5.5 especially
> when we know that measurement is out by a factor of 2 depending on the
> time. If it is a critical situation and we know that the numbers
> actually mean something and we want to download, we will probably pick
> the one with the fastest download speed, as long as the upload is not
> absurdly small. Similarly with upload speed. And if we do not care, we
> do not care.

I have to agree that the numbers can be off by a LOT!
How much?
I don't know.
Maybe 50%?

So, I think the "algorithm" will just round up or down to zero decimals.
Does that sound reasonable?

> IF I wanted to download I would pick the one with the highest download
> speed. If I wanted to upload, the highest upload speed. That is how my
> brain operates.

I don't do much uploading, as we discussed.
Of course, a ZERO on upload would kill this post even, but, as bandwidth
goes, I don't upload much (except when sending photos in mail, I guess).

I suspect torrenting needs uploads to earn download speeds though,
but I do torrenting maybe 3 or 5 times a year, so it's not all that
often done.

> Since none of us have your brain to study, this is an impossible
> question to answer, and as I have pointed out, my brain apparently
> operates differently from yours.

I guess your algorithm is to just pick the fastest download speed
even if the ping and upload times stink. That seems too simple for
me to be effective.

Seems to me a better algorithm is to weigh download as most important
but to also care about upload and perhaps care about ping.

> As I said, about half a decimal place is all they are worth. Ie, take
> the log base 10, multiply by 2, and round off.

Huh? Really? Or are you joking?

My algorithm would be to round to zero decimal places, such that:
3.37dn 1.48up becomes 3dn & 1up
5.48dn 3.81up becomes 5dn & 4up
5.55dn 5.54up becomes 6dn & 6up <==== this wins in my algorithm
5.60dn 3.63up becomes 6dn & 4up
5.70dn 1.42up becomes 6dn & 1up

If you are actually serious, your download-only log-based algorithm is:
3.37dn becomes (log 3.37)*2 = 1.06 = 1
5.48dn becomes (log 5.48)*2 = 1.48 = 1
5.55dn becomes (log 5.55)*2 = 1.49 = 1
5.60dn becomes (log 5.60)*2 = 1.50 = 2 <=== this wins in your algorithm
5.70dn becomes (log 5.70)*2 = 1.51 = 2 <=== along with this
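
For checking those figures, here is a minimal sketch of the "log base 10, multiply by 2, round off" rule. The function name half_decade is made up, and it rounds the unrounded product rather than the two-decimal intermediate.

```bash
# half_decade X: hypothetical helper; prints 2*log10(X) rounded to the
# nearest integer, or "undefined" when X is zero or negative.
half_decade() {
    awk -v x="$1" 'BEGIN {
        if (x + 0 <= 0) { print "undefined"; exit }   # e.g. a 0.00up reading
        v = 2 * log(x) / log(10)
        r = (v >= 0) ? int(v + 0.5) : -int(-v + 0.5)  # round half away from zero
        print r
    }'
}

for s in 3.37 5.48 5.55 5.60 5.70; do
    printf '%s -> %s\n' "$s" "$(half_decade "$s")"
done
```

Computed this way, 5.60 comes out near 1.496 and rounds to 1, so among these five only 5.70 reaches 2. The boundary between 1 and 2 sits at 10^0.75, roughly 5.62, which shows how coarse this binning is.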

> You asked how the "human brain" picks things. I was trying to tell you.
> Now you say that is not a "good algorithm". So you do not want how
> humans do it, but rather some definition of "good".

Your approach, as I try to understand it, is not to choose the fastest
download times but to choose the logarithmic derivation of those download
times that results in the largest number.

Do I understand the approach right yet?

Char Jackson

Mar 11, 2016, 11:27:01 PM

On Sat, 12 Mar 2016 02:17:08 -0000 (UTC), VPN user <vpn...@example.com>
wrote:

>William Unruh wrote in message nbv9in$4vr$1...@dont-email.me
>
>> Yes, so if download speeds are high but upload are zero you are in
>> trouble. But most home ISPs have vastly different upload and download
>> speeds (10MB/s down vs .5MB/s upload), which makes virtually no difference
>> to the usability.
>
>This makes sense.
>So would you agree the following is the "algorithm"?
>1. Give download speed the most weight.
>2. Give upload speed a LOT less weight.
>3. Give ping times not much weight at all?

I agree with this approach. See below.

>Seems to me a better algorithm is to weigh download as most important
>but to also care about upload and perhaps care about ping.
>

>My algorithm would be to round to zero decimal places, such that:
> 3.37dn 1.48up becomes 3dn & 1up
> 5.48dn 3.81up becomes 5dn & 4up
> 5.55dn 5.54up becomes 6dn & 6up <==== this wins in my algorithm
> 5.60dn 3.63up becomes 6dn & 4up
> 5.70dn 1.42up becomes 6dn & 1up

It looks like your brain is sorting first on the total of (dn+up), as
follows (note the last column showing the total of dn+up):
3.37dn 1.48up becomes 3dn & 1up (3+1=4)
5.48dn 3.81up becomes 5dn & 4up (5+4=9)
5.55dn 5.54up becomes 6dn & 6up (6+6=12=winner)
5.60dn 3.63up becomes 6dn & 4up (6+4=10)
5.70dn 1.42up becomes 6dn & 1up (6+1=7)

I suspect that a secondary qualification is happening, or even multiple
additional qualifications, such as:

- was the winner picked in spite of one of its values being abysmal? In
other words, did it float to the top by having an unusually large value
coupled with an unusually low value? In this case, no, so the winner
survives this round. To code this, perhaps the two values need to be within
a certain percentage of each other, and furthermore perhaps it's desirable to
ensure that the dn value is higher than the up value.

- is ping time really relevant? Unless it's way out of hand, (satellite
ISP?), then probably not really relevant.

You can handle these cases, and more, by weighting and/or scoring
everything, so that the 'best' result naturally floats to the top in a
predictable and repeatable fashion. Maybe dn counts for 60% of the final
answer, up counts for another 30%, and ping counts for the last 10%. Weight
and score everything, do the math, declare the winner.
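
A minimal sketch of that 60/30/10 weighting, using awk for the floating-point math. Everything beyond the three percentages is an assumption made for illustration: the function name weighted_score, the 10 Mbit/s cap that counts as a perfect dn or up, and the 1000 ms ping that counts as worthless.

```shell
#!/bin/bash
# weighted_score DN UP PING_MS: hypothetical scoring helper, 0..100, higher is better.
# Assumed caps (tune to taste): 10 Mbit/s is a "perfect" dn or up,
# and a ping of 1000 ms or more earns nothing.
weighted_score() {
    awk -v dn="$1" -v up="$2" -v ms="$3" 'BEGIN {
        d = dn / 10;        if (d > 1) d = 1    # download share, capped at 1
        u = up / 10;        if (u > 1) u = 1    # upload share, capped at 1
        p = 1 - ms / 1000;  if (p < 0) p = 0    # lower ping is better
        printf "%.1f\n", 60 * d + 30 * u + 10 * p
    }'
}

weighted_score 5.55 5.54 278
weighted_score 5.70 1.42 287
```

Because the weights live in one line, changing the balance between dn, up, and ping is a one-line tweak.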

VPN user

Mar 12, 2016, 7:49:53 AM

Char Jackson wrote in message un57ebt3ht8hp4rig...@4ax.com

> It looks like your brain is sorting first on the total of (dn+up), as
> follows (note the last column showing the total of dn+up):
> 3.37dn 1.48up becomes 3dn & 1up (3+1=4)
> 5.48dn 3.81up becomes 5dn & 4up (5+4=9)
> 5.55dn 5.54up becomes 6dn & 6up (6+6=12=winner)
> 5.60dn 3.63up becomes 6dn & 4up (6+4=10)
> 5.70dn 1.42up becomes 6dn & 1up (6+1=7)

That's an interesting observation, because I had not realized that the
total of dn + up was what I was using (as long as the up isn't zero).

But it must be a little more complex because of the weighting of down
greater than up. So, if it was 1dn and 6up, that "7" would be less
valuable than the 7 from 6dn and 1up.

Maybe a good algorithm is to double (or triple?) the down number (to
give it more weight) and then either add or multiply the up number.

If we multiply times the upload, that automatically handles the case
of deprecating a zero upload so we don't have to check if upload is 0.
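
As a rough sketch of that idea (double the down number, then multiply by the up number), something like the following would do. The name product_score is hypothetical, and the factor of 2 is just the "double" option mentioned above.

```shell
#!/bin/bash
# product_score DN UP: hypothetical score that doubles dn, then multiplies by up.
# A zero upload forces the score to zero without any explicit check.
product_score() {
    awk -v dn="$1" -v up="$2" 'BEGIN { printf "%.2f\n", 2 * dn * up }'
}

product_score 5.55 5.54   # balanced dn and up
product_score 5.70 1.42   # fast dn, weak up
product_score 2.07 0.00   # zero upload
```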

> - was the winner picked in spite of one of its values being abysmal? In
> other words, did it float to the top by having an unusually large value
> coupled with an unusually low value? In this case, no, so the winner
> survives this round.

I agree with you that the algorithm needs to deprecate an unusually
high number coupled with an unusually low number.

> To code this, perhaps the two values need to be within
> a certain percentage of each other, and furthermore perhaps it's desirable to
> ensure that the dn value is higher than the up value.

That makes a LOT of sense!

Consistency between down & up adds extra credit (I think) simply because,
in *practice*, the consistent servers seem to be better overall when they
are consistently high.

> - is ping time really relevant? Unless it's way out of hand, (satellite
> ISP?), then probably not really relevant.

I do not really understand how ping times affect the speeds, because
latency isn't the same as bandwidth.

The ping time from my ISP, without VPN, is on the order of 20ms to 50ms,
while the VPN ping times are far longer - usually on the order of 200ms
to over a thousand ms.

So I'm not sure if ping matters greatly; but I have noticed that those
with high ping (roughly over 500ms) usually have lousy results, so,
maybe pings would be an if-then-else cutoff speed around 400ms?
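
Such a cutoff could be sketched as a small test on the file name. The function name ping_ok is invented here, and it assumes the file name ends in the _NNNms.ovpn tag that the speed-tagged files carry; 400 is only the default.

```shell
#!/bin/bash
# ping_ok FILE [CUTOFF_MS]: hypothetical filter; succeeds if the _NNNms tag at
# the end of FILE is at or under the cutoff (default 400 ms).
# Returns 2 if the name carries no ping tag at all.
ping_ok() {
    local file=$1 cutoff=${2:-400} ms
    ms=$(printf '%s\n' "$file" | sed -n 's/.*_\([0-9][0-9]*\)ms\.ovpn$/\1/p')
    [ -n "$ms" ] || return 2
    [ "$ms" -le "$cutoff" ]
}

for f in a_5.55dn_5.54up_278ms.ovpn b_0.46dn_0.38up_738ms.ovpn; do
    if ping_ok "$f"; then echo "keep $f"; else echo "skip $f"; fi
done
```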

BTW, for VOIP, the ping jitter is critical but I'm not using VPN for
VOIP. I'm sure latency is critical for gamers too. But that's not me.

> You can handle these cases, and more, by weighting and/or scoring
> everything, so that the 'best' result naturally floats to the top in a
> predictable and repeatable fashion. Maybe dn counts for 60% of the final
> answer, up counts for another 30%, and ping counts for the last 10%. Weight
> and score everything, do the math, declare the winner.

That algorithm, where I just weight each one as a share of 100%, makes sense.

The good news is that once I have the initial algorithm working in
code, then it's relatively easy to tweak, so I think I'll start working
on the code now, where there is no doubt the weighting will be on:

1. The download speed gets most of the weight
2. The upload counts as well (and hurts if it's 0)
3. The ping also counts but not very much unless it's a high number
4. Consistency between down & up adds extra credit (I think)

William Unruh

Mar 12, 2016, 12:31:05 PM

On 2016-03-12, VPN user <vpn...@example.com> wrote:
> Char Jackson wrote in message un57ebt3ht8hp4rig...@4ax.com
>
>> It looks like your brain is sorting first on the total of (dn+up), as
>> follows (note the last column showing the total of dn+up):
>> 3.37dn 1.48up becomes 3dn & 1up (3+1=4)
>> 5.48dn 3.81up becomes 5dn & 4up (5+4=9)
>> 5.55dn 5.54up becomes 6dn & 6up (6+6=12=winner)
>> 5.60dn 3.63up becomes 6dn & 4up (6+4=10)
>> 5.70dn 1.42up becomes 6dn & 1up (6+1=7)
>
> That's an interesting observation, because I had not realized that the
> total of dn + up was what I was using (as long as the up isn't zero).
>
> But it must be a little more complex because of the weighting of down
> greater than up. So, if it was 1dn and 6up, that "7" would be less
> valuable than the 7 from 6dn and 1up.
>
> Maybe a good algorithm is to double (or triple?) the down number (to
> give it more weight) and then either add or multiply the up number.
>
> If we multiply times the upload, that automatically handles the case
> of deprecating a zero upload so we don't have to check if upload is 0.

The product (geometric mean, i.e. the average of logarithms) weights small
values much more than large ones. Thus if download is 15 and upload is 1,
the product (15) is only slightly larger than if download is 5 and upload
is 2 (10), despite three times the download speed. If you are mainly
downloading, that would seem a pretty silly weighting.

>
>> - was the winner picked in spite of one of its values being abysmal? In
>> other words, did it float to the top by having an unusually large value
>> coupled with an unusually low value? In this case, no, so the winner
>> survives this round.
>
> I agree with you that the algorithm needs to deprecate an unusually
> high number coupled with an unusually low number.

It depends on what you want to do.

>
>> To code this, perhaps the two values need to be within
>> a certain percentage of each other, and furthermore perhaps it's desirable to
>> ensure that the dn value is higher than the up value.
>
> That makes a LOT of sense!

It does? If you primarily do downloading, why does weighting uploads
make any sense at all?

>
> Consistency between down & up adds extra credit (I think) simply because,
> in *practice*, the consistent servers seem to be better overall when they
> are consistently high.

Is that really true? You need to study it. Otherwise this becomes an
exercise in "How do I make my prejudices the determining factor".

>
>> - is ping time really relevant? Unless it's way out of hand, (satellite
>> ISP?), then probably not really relevant.
>
> I do not really understand how ping times affect the speeds, because
> latency isn't the same as bandwidth.

But it is response. I once had an ISP (Virgin in the UK) where the
latency to N Am was about 5 seconds. The download and upload speeds were
fine for the time. It made typing on my home computer impossible.

>
> The ping time from my ISP, without VPN, is on the order of 20ms to 50ms,
> while the VPN ping times are far longer - usually on the order of 200ms
> to over a thousand ms.

Usually ping times are purely a measure of distance. So this says that
your ISP is closer to you than the VPN. Surprise, surprise.

>
>

VPN user

Mar 12, 2016, 3:36:11 PM

William Unruh wrote in message nc1jis$k7s$1...@dont-email.me

> The product (geometric mean, i.e. the average of logarithms) weights small
> values much more than large ones. Thus if download is 15 and upload is 1,
> the product (15) is only slightly larger than if download is 5 and upload
> is 2 (10), despite three times the download speed. If you are mainly
> downloading, that would seem a pretty silly weighting.

Based on everyone's input, I have come up with a rudimentary algorithm
that I think will work. I only need now to add it to the existing
vpnspeed.sh script.

Since I've never done a calculation in a script, I will have to do some
research so I will post my script only when it works at least rudimentarily.

If I do it right, the algorithm can be easily tweaked, so, I'm not
too worried about getting the algorithm right on the first pass.

> It depends on what you want to do.

Yeah. And no.
Most people are similar, which is why ISPs can get away with throttling
upload speeds in favor of download speeds for example.

I'm still unsure how or why ping times matter though...

> It does? If you primarily do downloading, why does weighting uploads
> make any sense at all.

This is a "human experience" artificial intelligence question.
In my human experience, when the upload was reported as 0.00, the entire
VPN experience sucked.

> Is that really true? You need to study it. Otherwise this becomes an
> exercise in "How do I make my prejudices the determining factor".

Fair enough. But, I have a rudimentary algorithm now, so, the real
work isn't in the algorithm anymore. It's in implementing that algorithm
(since, as you know, I'm just a script noob).

Once I implement the algorithm, if it's reasonably well written, I should
be able to tweak it to improve it - so - I'm switching gears from design
to writing now.

> But it is response. I once had an ISP (Virgin in the UK) where the
> latency to N Am was about 5 seconds. The download and upload speeds were
> fine for the time. It made typing on my home computer impossible.

This is good to know. 5,000 ms isn't so far fetched for these VPNs,
although I mostly see ping times in the 500ms to 1000ms range.

> Usually ping times are purely a measure of distance. So this says that
> your ISP is closer to you than the VPN. Surprise, surprise.

I would think a ping inside a VPN would be slower than a ping outside
that very same VPN. Here's a test - let's see what happens:

$ vpnwhich.sh
vpngate_US_NY_NewYork_Buffalo_198.23.197.184-198.23.197.184_tcp443_20160307.ovpn

$ curl http://myip.dnsomatic.com; echo
198.23.197.184

$ ping -c3 198.23.197.184
PING 198.23.197.184 (198.23.197.184) 56(84) bytes of data.
64 bytes from 198.23.197.184: icmp_seq=1 ttl=106 time=76.3 ms
64 bytes from 198.23.197.184: icmp_seq=2 ttl=106 time=72.9 ms
64 bytes from 198.23.197.184: icmp_seq=3 ttl=106 time=85.1 ms
--- 198.23.197.184 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 72.967/78.155/85.176/5.150 ms

$ vpnspeed.sh
Good: Directory where openvpn is currently running is /home/bar/doc/cert/vpn_winners/US
Good: Directory for renamed file is /home/bar/doc/cert/vpn_winners/US/renamed
Good: File to be renamed is vpngate_US_NY_NewYork_Buffalo_198.23.197.184-198.23.197.184_tcp443_20160307.ovpn
Good: All 3 speedtest-cli results are numerical; therefore we can rename the ovpn file...
Good: Moving /home/bar/doc/cert/vpn_winners/US/vpngate_US_NY_NewYork_Buffalo_198.23.197.184-198.23.197.184_tcp443_20160307.ovpn to /home/bar/doc/cert/vpn_winners/US/renamed/vpngate_US_NY_NewYork_Buffalo_198.23.197.184-198.23.197.184_tcp443_20160307_4.26dn_0.15up_426ms.ovpn
dn speed is 4.26
up speed is 0.15
platency is 426

$ ping -c3 198.23.197.184
PING 198.23.197.184 (198.23.197.184) 56(84) bytes of data.
64 bytes from 198.23.197.184: icmp_seq=1 ttl=106 time=100 ms
64 bytes from 198.23.197.184: icmp_seq=2 ttl=106 time=75.4 ms
64 bytes from 198.23.197.184: icmp_seq=3 ttl=106 time=72.8 ms
--- 198.23.197.184 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 72.861/82.992/100.694/12.562 ms

So, on my own ISP, the ping average was 83ms (100ms maximum) while on VPN
the average was 78ms (85ms maximum) by the ping method, and 426ms by
speedtest-cli.

If I discount the speedtest-cli (it's off by over 4x the pings),
and if I compare the ping to the VPN server both on and off VPN,
you are correct in that they're in the same order of magnitude.

In fact, I'm surprised that the ping times on VPN were actually
shorter than the ping times off of VPN!
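
To avoid misreading the summary line by eye, the average and maximum can be pulled out of the ping output with awk. This is only a sketch: rtt_stats is a made-up name, and it assumes the Linux (iputils) summary format "rtt min/avg/max/mdev = ..." shown above.

```shell
#!/bin/bash
# rtt_stats: hypothetical filter; reads ping output on stdin and prints
# "avg max" (in ms) from the "rtt min/avg/max/mdev = a/b/c/d ms" line.
rtt_stats() {
    # Splitting on "/" puts avg in field 5 and max in field 6.
    awk -F/ '/^rtt/ { print $5, $6 }'
}

# Intended use (commented out so the sketch runs without a network):
# ping -c3 198.23.197.184 | rtt_stats
echo 'rtt min/avg/max/mdev = 72.967/78.155/85.176/5.150 ms' | rtt_stats
```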

Bit Twister

Mar 12, 2016, 6:59:50 PM

On Sat, 12 Mar 2016 20:36:09 -0000 (UTC), VPN user wrote:

> Since I've never done a calculation in a script, I will have to do some
> research so I will post my script only when it works at least rudimentarily.

If you have any serious math to do, you might look into bc.

echo "scale=10; (215605-3303-212086)/215605" | bc

or just some usual math type stuff not easy with bash
_odd_wk=$(echo "$(date '+%_W') % 2" | bc )
echo "Is this an odd week of the year (1=yes 0=no) $_odd_wk"

Scott Hemphill

Mar 12, 2016, 9:21:00 PM

This last example is easy to do in bash:

_odd_wk=$(($(date '+%_W') % 2))

Scott
--
Scott Hemphill hemp...@alumni.caltech.edu
"This isn't flying. This is falling, with style." -- Buzz Lightyear

VPN user

Mar 13, 2016, 3:08:01 AM

Bit Twister wrote in message slrnne9bbk.9...@wb.home.test

> If you have any serious math to do, you might look into bc.
>
> echo "scale=10; (215605-3303-212086)/215605" | bc

Taking your advice to look at "bc" in more detail, I used "bc" just
now for the first time, in order to round the floating-point ping
time to the nearest integer.

The purpose of this vpnspeed script is merely to determine the speeds
and then rename the ovpn file with those speeds. The script that will
actually choose the fastest file will be the vpnrun script.

Nonetheless, since this is the first use of "bc", here's the vpnspeed
script with the new use of "bc" to round ping numbers to the nearest
integer.

#!/bin/bash
# File: vpnspeed.sh renames the currently running ovpn file with speed results.
# Version: 20160313
# Many thanks to Marek Novotny for patient algorithmic suggestions.
# Thanks also to BitTwister for program-flow streamlining guidance.

# Get error checking for free (courtesy of BitTwister).
set -u

# Abort if openvpn is not currently running.
# (pgrep -o returns only the oldest match, so we never get several PIDs.)
if ! openvpnPID=$(pgrep -o openvpn) ; then
    echo "Oops: There's nothing to do as openvpn isn't running; aborting"
    exit 1
fi

# Set up the display for the xmessage (courtesy of Marek Novotny).
# ${DISPLAY:-} avoids an "unbound variable" abort under set -u.
setupDisplay()
{
    if [[ -z "${DISPLAY:-}" ]]; then
        DISPLAY=':0'
    fi
}

setupDisplay

# Popup message when the speedtest-cli finishes (courtesy of Marek Novotny).
sendMessage()
{
    ans=$(xmessage -display "$DISPLAY" -fg black -bg white \
        -title "VPN Speed" -geom +60+30 \
        -buttons yes,no -default yes "$1")
}

# Obtain the directory that openvpn was last run from:
openvpnCWD=$(sudo readlink "/proc/$openvpnPID/cwd")
echo "Good: Directory where openvpn is currently running is $openvpnCWD"

# Define the directory to contain the renamed openvpn file:
renameSubDir=${openvpnCWD}/renamed
echo "Good: Directory for renamed file is $renameSubDir"

# Obtain the currently running openvpn file name.
# /proc/PID/cmdline is NUL-separated, so strip the NULs before splitting:
openvpnFileName=$(tr -d '\0' < "/proc/$openvpnPID/cmdline" | awk -F"--config" '{print $2}' | awk -F"--script" '{print $1}')
echo "Good: File to be renamed is $openvpnFileName"

# Create filename variables which will be used more than once below:
shortFileName=$(basename "$openvpnFileName" .ovpn)
fullFileSpec=${openvpnCWD}/${openvpnFileName}

# Run a ping as an early speed indicator (because speedtest-cli is slow!):
pingAvgFloat=$(ping -c3 8.8.8.8 | grep rtt| awk -F"/" '{print $5}')

# Round the floating point ping time to the nearest integer millisecond:
pingRoundInteger=$(echo "($pingAvgFloat+0.5)/1" | bc)
echo "Good: Average integer latency to 8.8.8.8 is ${pingRoundInteger}ms"

# Capture speedtest into variables (courtesy of Marek Novotny):
IFS=$'\n'
set -- $(speedtest-cli --simple --secure --timeout 5)
pingResult="${1/*: /}"
downResult="${2/*: /}"
upResult="${3/*: /}"
IFS=$'\t\n '
pingR=${pingResult/.*/}
downR=${downResult/' Mbit/s'/}
upR=${upResult/' Mbit/s'/}

# Determine if each of the speedtest results are numerical:
isnum() { awk -v a="$1" 'BEGIN {print (a == a + 0)}'; }
isNumPingR=$(isnum "$pingR")
isNumDownR=$(isnum "$downR")
isNumUpR=$(isnum "$upR")

# Rename the current ovpn file only if all 3 speedtest-cli results are numerical:
if [ "$isNumPingR" == "1" ] && [ "$isNumUpR" == "1" ] && [ "$isNumDownR" == "1" ]; then
    echo "Good: All 3 speedtest-cli results are numerical; therefore we can rename the ovpn file..."
    if [ -f "$fullFileSpec" ]; then
        echo "Good: Moving $fullFileSpec to ${renameSubDir}/${shortFileName}_${downR}dn_${upR}up_${pingR}ms.ovpn"
        if [ ! -d "$renameSubDir" ]; then
            echo "Making $renameSubDir directory"
            mkdir "$renameSubDir"
        fi
        mv "$fullFileSpec" "${renameSubDir}/${shortFileName}_${downR}dn_${upR}up_${pingR}ms.ovpn"
        echo "dn speed is ${downR}"
        echo "up speed is ${upR}"
        echo "platency is ${pingR}"
        sendMessage "Kill ${downR}dn ${upR}up ${pingR}ms ${openvpnFileName}?"
    else
        echo "Oops: $fullFileSpec does not exist; not renaming"
        exit 1
    fi
else
    echo "Oops: Speedtest results are not all numerical; can't rename ${openvpnFileName}"
    exit 1
fi
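
The rounding step can also be done with awk instead of bc, which avoids depending on bc being installed. This is only an alternative sketch; round_ms is a made-up name and it assumes a non-negative input.

```shell
#!/bin/bash
# round_ms VALUE: hypothetical helper; rounds a non-negative floating-point
# number of milliseconds to the nearest integer (halves round up).
round_ms() {
    awk -v x="$1" 'BEGIN { printf "%d\n", int(x + 0.5) }'
}

round_ms 78.155
round_ms 82.992
```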

VPN user

Mar 17, 2016, 7:25:45 PM

I got sidetracked on this speed thing partially because I lost all
my speed data (aauurrgghh!) which happened when 'something' malfunctioned
such that there were multiple openvpn processes running.

None of the scripts had assumed multiple openvpn sessions could occur
simultaneously, and even I don't understand how it can happen, but it
did.

So the scripts malfunctioned, and one of them (somehow) wiped out
everything in the vpn_winners directory. Over a thousand good files,
most of which I had speed tested. Sigh.
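
A guard against that situation could be sketched as below: refuse to continue unless exactly one openvpn PID was found. The function name require_single_pid is hypothetical, and this is not what the scripts did at the time.

```shell
#!/bin/bash
# require_single_pid PIDLIST: hypothetical guard; succeeds only if PIDLIST
# (whitespace-separated, as printed by pgrep) holds exactly one PID.
require_single_pid() {
    # Unquoted on purpose: word splitting turns the list into positional parameters.
    set -- $1
    [ "$#" -eq 1 ]
}

# Intended use near the top of a script (commented out so the sketch runs anywhere):
# pids=$(pgrep openvpn)
# require_single_pid "$pids" || { echo "Oops: expected exactly one openvpn process"; exit 1; }
```

With an unquoted PID list in a path such as /proc/$pid/cwd, several PIDs produce several words, which is one plausible way a script could go wrong.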

So, I started anew, and I now have about 800 winners based on new
downloads and some fortuitously saved directories of previously
downloaded raw ovpn files.

Since each server can have as many as 4 different types, I now
organize them with a vpnorgbytype script which puts them into
4 directories based on both their need for DNS (or not) and their
protocol (either UDP or TCP).

Here are some statistics on the roughly 800 files by the four types:
1. Needs dns, tcp protocol
2. Needs dns, udp protocol
3. No DNS needed, tcp protocol
4. No DNS needed, udp protocol

There is a slight bit of slop in the numbers because an "ls" doesn't
'count' files, per se; but it's close enough for our rough numbers
since the count is only off by a single-digit margin of error.

$ ls ./vpn_winners/dns_tcp/* | wc -l
212

$ ls ./vpn_winners/dns_udp/* | wc -l
229

$ ls ./vpn_winners/ipa_tcp/* | wc -l
168

$ ls ./vpn_winners/ipa_udp/* | wc -l
213
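
For an exact count, find can count only regular .ovpn files, so directory headings and blank lines from ls are not included. A minimal sketch, with count_ovpn as a made-up name:

```shell
#!/bin/bash
# count_ovpn DIR: hypothetical helper; counts regular *.ovpn files under DIR,
# including those inside subdirectories.
count_ovpn() {
    # tr strips the padding some wc implementations add.
    find "$1" -type f -name '*.ovpn' | wc -l | tr -d ' '
}

# Intended use:
# count_ovpn ./vpn_winners/dns_tcp
```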

I generally arbitrarily use the files since I don't really see the
difference between the UDP & TCP on speed and I don't worry about
someone hijacking my DNS server.

But, to report some statistics back to the team (Marek, for one, said he likes
getting statistics), roughly, out of 800 vpn_winners files, 1/4 were of
each of the four types (as might be expected).

So, basically, each server, on average, has 4 different ways to access it
so that's why I now organize by four directories.
1. dns & tcp
2. dns & udp
3. ipa & tcp
4. ipa & udp

Here's the script (yes, I know it's ugly) that organizes the vpn_winners
by type if you're interested in using it yourself.
#!/bin/bash
# vpnorgbytype organizes new ovpn files by the type of DNS lookup & by protocol.
# 20160314
# The file name tells us if the server is specified as an IP address or as a domain name.
# The file name also tells us the protocol (and port, but we don't care about the port).
# Four directories are created (if needed) and populated:
# 1. No DNS lookup is required & the protocol is TCP (private & slightly slower)
# 2. No DNS lookup is required & the protocol is UDP (private & slightly faster)
# 3. A DNS lookup is required & the protocol is TCP (slightly less private and slower)
# 4. A DNS lookup is required & the protocol is UDP (slightly less private but faster)
# An example of each type is:
# 1. dns_udp
# vpngate_GB_I6_MiltonKeynes_MiltonKeynes_86.160.4.74-vpn714471204.opengw.net_udp1723_20160316.ovpn
# 2. dns_tcp
# vpngate_US_NC_NorthCarolina_FortBragg_75.178.28.124-vpn773294077.opengw.net_tcp1668_20160316.ovpn
# 3. ipa_tcp
# vpngate_ES_56_Catalonia_Barcelona_83.55.225.31-83.55.225.31_tcp995_20160314.ovpn
# 4. ipa_udp
# vpngate_TR_34_Istanbul_Istanbul_78.188.128.102-78.188.128.102_udp1993_20160317.ovpn
#
# Thanks to BitTwister for variable syntax & Marek Novotny for the case statement syntax.

set -u

# Organize the files by type based on information inherent in the file names.
# Globs are used instead of parsing ls output, so an empty match is skipped
# quietly and odd characters in file names are safe.
_organizeByType () {
    local _fileName

    # 1. A DNS lookup is required and the (slightly faster) UDP protocol.
    for _fileName in *-[!0-9]*_udp*.ovpn ; do
        [ -f "$_fileName" ] || continue
        mkdir -p ./dns_udp/
        mv -- "$_fileName" ./dns_udp/
    done

    # 2. A DNS lookup is required and the (slightly slower) TCP protocol.
    for _fileName in *-[!0-9]*_tcp*.ovpn ; do
        [ -f "$_fileName" ] || continue
        mkdir -p ./dns_tcp/
        mv -- "$_fileName" ./dns_tcp/
    done

    # 3. No DNS lookup needed and the (slightly slower) TCP protocol.
    for _fileName in *-[0-9]*_tcp*.ovpn ; do
        [ -f "$_fileName" ] || continue
        mkdir -p ./ipa_tcp/
        mv -- "$_fileName" ./ipa_tcp/
    done

    # 4. No DNS lookup needed and the (slightly faster) UDP protocol.
    for _fileName in *-[0-9]*_udp*.ovpn ; do
        [ -f "$_fileName" ] || continue
        mkdir -p ./ipa_udp/
        mv -- "$_fileName" ./ipa_udp/
    done

}

# Ask the user to confirm that the current directory is the right one.
# (Note: Template is courtesy of Marek Novotny's kickorkeep script.)
_currentWorkingDir=$(pwd)
read -r -p "Is $_currentWorkingDir OK? " answer
case $answer in
    [Yy]* )
        echo "Good: Organizing ovpn files by need for DNS & protocol"
        _organizeByType
        ;;
    [Nn]* )
        echo "Oops: Wrong directory; aborting"
        exit 1
        ;;
    * )
        echo "Oops: Wrong key pressed; do nothing..."
        exit 1
        ;;
esac

echo " "
ls -d */
echo " "
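
The same classification can be sketched as a pure function that only looks at the name and prints the target directory, which makes it easy to try on a single file before moving anything. The name classify_ovpn is invented here; it assumes, as the script does, that a digit right after the hyphen means an IP address.

```shell
#!/bin/bash
# classify_ovpn FILE: hypothetical helper; prints dns_udp, dns_tcp, ipa_udp or
# ipa_tcp based on the file name alone. Returns 1 if the name does not fit.
classify_ovpn() {
    local name=${1##*/} host proto
    case $name in
        *-[0-9]*) host=ipa ;;   # hyphen followed by a digit: IP address
        *-*)      host=dns ;;   # hyphen followed by anything else: host name
        *)        return 1 ;;
    esac
    case $name in
        *_udp[0-9]*) proto=udp ;;
        *_tcp[0-9]*) proto=tcp ;;
        *)           return 1 ;;
    esac
    echo "${host}_${proto}"
}

classify_ovpn vpngate_ES_56_Catalonia_Barcelona_83.55.225.31-83.55.225.31_tcp995_20160314.ovpn
```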

VPN user

Mar 17, 2016, 7:40:01 PM

Since I had to start over with generating the speedtest results,
I organized the vpn_winners by type and also by country.

Since Marek said he likes statistics, here are the country
statistics of just a download from the past few days (since I had
to start over) including some leftover ovpn files I had lying around
from the vpngate server.

$ for i in ./vpn_winners/dns_udp/*; do echo "$(ls "$i" | wc -l) $i"; done

I sorted that list below, and found not surprisingly that Japan
and Korea (again) predominate (by far). Usually Korea has more than
Japan, but in this case, they were even.

Usually the US is next so I was surprised that France was higher
than the US, but, usually France and Germany have more than the
rest of the European countries.

Thailand and Russia were in the mid tier, which seems pretty typical,
and then there were all the rest with only a handful of servers
per country.

48 ./vpn_winners/dns_udp/JP
48 ./vpn_winners/dns_udp/KR
22 ./vpn_winners/dns_udp/FR
19 ./vpn_winners/dns_udp/US
12 ./vpn_winners/dns_udp/TH
10 ./vpn_winners/dns_udp/RU
7 ./vpn_winners/dns_udp/GB
6 ./vpn_winners/dns_udp/SA
5 ./vpn_winners/dns_udp/VE
4 ./vpn_winners/dns_udp/ID
4 ./vpn_winners/dns_udp/VN
3 ./vpn_winners/dns_udp/DE
3 ./vpn_winners/dns_udp/TR
2 ./vpn_winners/dns_udp/AR
2 ./vpn_winners/dns_udp/BR
2 ./vpn_winners/dns_udp/BY
2 ./vpn_winners/dns_udp/CA
2 ./vpn_winners/dns_udp/CL
2 ./vpn_winners/dns_udp/EG
2 ./vpn_winners/dns_udp/IR
2 ./vpn_winners/dns_udp/IT
2 ./vpn_winners/dns_udp/PT
2 ./vpn_winners/dns_udp/SE
2 ./vpn_winners/dns_udp/TW
1 ./vpn_winners/dns_udp/AU
1 ./vpn_winners/dns_udp/CN
1 ./vpn_winners/dns_udp/CR
1 ./vpn_winners/dns_udp/ES
1 ./vpn_winners/dns_udp/HK
1 ./vpn_winners/dns_udp/IS
1 ./vpn_winners/dns_udp/LV
1 ./vpn_winners/dns_udp/MX
1 ./vpn_winners/dns_udp/NL
1 ./vpn_winners/dns_udp/NO
1 ./vpn_winners/dns_udp/NZ
1 ./vpn_winners/dns_udp/PE
1 ./vpn_winners/dns_udp/PL
1 ./vpn_winners/dns_udp/QA
1 ./vpn_winners/dns_udp/SG
1 ./vpn_winners/dns_udp/UA

When I have a sufficient number of files to test (around 1000),
I will begin the speed tests anew, although I'm working on a
*faster* speedtest than the very slow speedtest-cli method.

VPN user

Mar 17, 2016, 7:44:13 PM

VPN user wrote in message ncff8f$9ck$2...@news.mixmin.net

> Since I had to start over with generating the speedtest results,
> I organized the vpn_winners by type and also by country.

Here is the script used to organize by country.

I basically just took sections from Marek's other scripts
and cobbled them together to create this script.

#!/bin/bash
# 20160314
# vpnorgbyco organizes vpn files by two-character ISO-3166 country folders
# Variable syntax courtesy of Bit Twister (_) & Marek Novotny (camel case).
# AE BR CA CN DE DO FR HK IP JP KR NL PL QA RO RU SA SE TH TR US VE VN
# It creates "IP" for the unknown countries
# Note the anomalies have to be handled (errors are written to standard out, not standard err!):
# $ geoiplookup -f /usr/share/GeoIP/GeoLiteCity.dat 8.8.8.8
# GeoIP City Edition, Rev 1: US, CA, California, Mountain View, 94040, 37.384499, -122.088097, 807, 650
# $ geoiplookup -f /usr/share/GeoIP/GeoLiteCity.dat 0.8.8.8
# GeoIP City Edition, Rev 1: IP Address not found
# $ geoiplookup -f /usr/share/GeoIP/GeoLiteCity.dat 0.0.0.0
# GeoIP City Edition, Rev 1: can't resolve hostname ( 0.0.0.0 )
# Here are some filename examples corresponding to countries GB, US, ES, & TR:
# vpngate_GB_I6_MiltonKeynes_MiltonKeynes_86.160.4.74-vpn714471204.opengw.net_udp1723_20160316.ovpn
# vpngate_US_NC_NorthCarolina_FortBragg_75.178.28.124-vpn773294077.opengw.net_tcp1668_20160316.ovpn
# vpngate_ES_56_Catalonia_Barcelona_83.55.225.31-83.55.225.31_tcp995_20160314.ovpn
# vpngate_TR_34_Istanbul_Istanbul_78.188.128.102-78.188.128.102_udp1993_20160317.ovpn

set -u

# Move renamed VPN files into ISO 3166 country folders
_organizeByCountry () {
    local _fileName _isoCountry
    for _fileName in *.ovpn ; do
        # Skip the literal pattern when no .ovpn files are present.
        [ -f "$_fileName" ] || continue
        # The country code is the second underscore-separated field.
        _isoCountry=$(echo "$_fileName" | cut -d'_' -f2)
        if [ ! -d "$_isoCountry" ]; then
            echo "Good: Making $_isoCountry folder"
            mkdir "$_isoCountry"
        fi
        if [ ! -f "${_isoCountry}/${_fileName}" ]; then
            echo "Good: Moving $_fileName into $_isoCountry subfolder"
            mv -- "$_fileName" "$_isoCountry"
        else
            echo "Oops: ${_isoCountry}/${_fileName} already exists; not moving"
        fi
    done
}

# Make sure the script runs in the correct directory

_currentWorkingDir=$(pwd)
read -r -p "Is $_currentWorkingDir OK? " answer
case $answer in
    [Yy]* )
        echo "Good: Organizing ovpn files by ISO-3166 country code"
        _organizeByCountry
        ;;
    [Nn]* )
        echo "Oops: Wrong directory; aborting"
        exit 1
        ;;
    * )
        echo "Oops: Wrong key pressed; do nothing..."
        exit 1
        ;;
esac

echo " "
# List only directories in the current folder.

ls -d */
echo " "
exit 0

## End ##

# Uncomment this for a count of files per country
# for _isoCountryCode in *;do
# _numberOfFiles=$(ls $_isoCountryCode|wc -l)
# echo $_isoCountryCode $_numberOfFiles
# done

# Use this for a country code lookup key
#!/bin/bash
# https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2
# _isoCountryFile=/usr/local/bin/iso3166.txt
# grep -E -i "^${1}" "${_isoCountryFile}"

VPN user

Mar 17, 2016, 7:49:59 PM

VPN user wrote in message ncff8f$9ck$2...@news.mixmin.net

> When I have a sufficient number of files to test (around 1000),
> I will begin the speed tests anew, although I'm working on a
> *faster* speedtest than the very slow speedtest-cli method.

Back on topic, here is a very difficult task on which I ask your advice.

If we can find a FASTER speedtest, that can help everyone, not
just those of us using a variety of flaky VPN servers.

The speedtest-cli method works fine; but it's really slow. :(

Googling for how to speed up the speedtest, I found this:
http://www.howtogeek.com/179016/how-to-test-your-internet-speed-from-the-command-line/

Which basically suggested using these "timing" methods:
5MB:
$ curl -o /dev/null http://download.thinkbroadband.com/5MB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/5MB.zip

10MB:
$ curl -o /dev/null http://download.thinkbroadband.com/10MB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/10MB.zip

20MB:
$ curl -o /dev/null http://download.thinkbroadband.com/20MB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/20MB.zip

50MB:
$ curl -o /dev/null http://download.thinkbroadband.com/50MB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/50MB.zip

100MB:
$ curl -o /dev/null http://download.thinkbroadband.com/100MB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/100MB.zip

100MB:
$ curl -o /dev/null http://speedtest.sea01.softlayer.com/downloads/test100.zip
$ wget -O /dev/null http://speedtest.sea01.softlayer.com/downloads/test100.zip

200MB:
$ curl -o /dev/null http://download.thinkbroadband.com/200MB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/200MB.zip

512MB:
$ curl -o /dev/null http://download.thinkbroadband.com/512MB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/512MB.zip

1GB:
$ curl -o /dev/null http://download.thinkbroadband.com/1GB.zip
$ wget -O /dev/null http://download.thinkbroadband.com/1GB.zip

But, in the end, even with the smallest file available,
I found them about as slow as speedtest-cli was.

Do you have a suggestion (which will help everyone) for
speeding up a command-line speedtest for Linux users?
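
One untested idea, sketched here under assumptions: cap the download time with curl's --max-time and read the average rate from its speed_download write-out variable, so the test never takes longer than the cap regardless of file size. The function names and the 10 second default are made up, and it only measures download speed, not upload or ping.

```shell
#!/bin/bash
# bytes_to_mbit BYTES_PER_SEC: convert a bytes-per-second figure to Mbit/s.
bytes_to_mbit() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b * 8 / 1000000 }'
}

# quick_dn_speed URL [SECONDS]: hypothetical helper; downloads URL to /dev/null
# for at most SECONDS (default 10) and prints the average rate in Mbit/s.
quick_dn_speed() {
    local url=$1 limit=${2:-10} bps
    # curl reports speed_download in bytes per second, even if the cap cut it short.
    bps=$(curl -s -o /dev/null --max-time "$limit" -w '%{speed_download}' "$url")
    bytes_to_mbit "$bps"
}

# Intended use (commented out so the sketch runs without a network):
# quick_dn_speed http://download.thinkbroadband.com/5MB.zip 5
```

A short cap trades accuracy for speed, since slow-starting connections will read lower than they would over a longer test.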