[Wikitech-l] Time to redirect to https by default?

David Gerard

Apr 1, 2012, 6:06:43 AM

to Wikimedia developers

Lots of monitoring going into place:

https://en.wikipedia.org/wiki/Wikipedia:List_of_articles_censored_in_Saudi_Arabia
http://www.bbc.co.uk/news/uk-politics-17576745

What are the current technical barriers to redirection to https by default?

- d.

_______________________________________________
Wikitech-l mailing list
Wikit...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Petr Bena

Apr 1, 2012, 6:55:11 AM

to Wikimedia developers

I see no point in doing that. Https doesn't support caching well and
is generally slower. There is no use for readers for that.

David Gerard

Apr 1, 2012, 7:01:02 AM

to Wikimedia developers

On 1 April 2012 11:55, Petr Bena <bena...@gmail.com> wrote:

> I see no point in doing that. Https doesn't support caching well and
> is generally slower. There is no use for readers for that.

The use is that the requests themselves are encrypted, so that the
only thing logged is that they went to Wikimedia. You did read the
linked articles, right?

Svip

Apr 1, 2012, 7:20:54 AM

to Wikimedia developers

On 1 April 2012 13:01, David Gerard <dge...@gmail.com> wrote:

> On 1 April 2012 11:55, Petr Bena <bena...@gmail.com> wrote:
>
>> I see no point in doing that. Https doesn't support caching well and
>> is generally slower. There is no use for readers for that.
>
> The use is that the requests themselves are encrypted, so that the
> only thing logged is that they went to Wikimedia. You did read the
> linked articles, right?

Obviously, I cannot confirm whether Mr Bena read the linked articles
or not, but he did provide an answer regarding the technical
restrictions.

Wikimedia already spends an incredible amount of time caching its
content, because *so many* users use Wikipedia and its sister projects
daily.

And since most of the content is fairly static, caching makes a lot of sense.

However, HTTPS does not support caching (at least not well), which
means each page would suddenly have to be generated for *each* request.
It's true that MediaWiki itself supports caching, but its own caching
is nowhere near as fast as a caching server like Varnish (although I
believe a less powerful caching server is used on Wikimedia's
servers).

The trade off is that the service would be slower for everyone or we
would need more servers. And I am not sure Wikimedia has that kind of
money.

Those are the *technical* limitations to defaulting to HTTPS.

Svip

Apr 1, 2012, 7:23:32 AM

to Wikimedia developers

On 1 April 2012 12:06, David Gerard <dge...@gmail.com> wrote:

> http://www.bbc.co.uk/news/uk-politics-17576745

Also, this article was written on 1 April and is far beyond any
monitoring scheme ever suggested in the Western World. And I am sure
we would have heard about it being mentioned up until this point, if
it was real.

So I would take that article with a grain of salt. Particularly the
statement about 'real time'. That's not even feasible.

David Gerard

Apr 1, 2012, 7:59:32 AM

to Wikimedia developers

On 1 April 2012 12:23, Svip <svi...@gmail.com> wrote:
> On 1 April 2012 12:06, David Gerard <dge...@gmail.com> wrote:

>> http://www.bbc.co.uk/news/uk-politics-17576745

> Also, this article was written on 1 April and is far beyond any
> monitoring scheme ever suggested in the Western World. And I am sure
> we would have heard about it being mentioned up until this point, if
> it was real.

It would be nice, but if it's a prank then (a) lots of other
newspapers are in on it (b) ORG flagged the programme described
several weeks in advance:

http://wiki.openrightsgroup.org/wiki/Communications_Capabilities_Development_Programme
http://www.openrightsgroup.org/issues/ccdp

So no, it's in no way a joke. This is absolutely real.

> So I would take that article with a grain of salt. Particularly the
> statement about 'real time'. That's not even feasible.

That a desired monitoring regime would require a violation of physics
has *never* stopped a legislative push for such.

- d.

Petr Bena

Apr 1, 2012, 8:52:52 AM

to Wikimedia developers

I said there is little benefit for most users. Of course there
would be some who could find it useful, but that's no reason to
redirect all users. I use Wikipedia a lot, and I don't care if someone
sees which pages I open. If someone does care, they should switch to
https themselves.

Svip

Apr 1, 2012, 8:53:43 AM

to Wikimedia developers

On 1 April 2012 13:59, David Gerard <dge...@gmail.com> wrote:

> On 1 April 2012 12:23, Svip <svi...@gmail.com> wrote:
>
>> On 1 April 2012 12:06, David Gerard <dge...@gmail.com> wrote:
>>
>>> http://www.bbc.co.uk/news/uk-politics-17576745
>>
>> Also, this article was written on 1 April and is far beyond any
>> monitoring scheme ever suggested in the Western World. And I am sure
>> we would have heard about it being mentioned up until this point, if
>> it was real.
>
> It would be nice, but if it's a prank then (a) lots of other
> newspapers are in on it (b) ORG flagged the programme described
> several weeks in advance:
>
> http://wiki.openrightsgroup.org/wiki/Communications_Capabilities_Development_Programme
> http://www.openrightsgroup.org/issues/ccdp
>
> So no, it's in no way a joke. This is absolutely real.

Still *kind of* a joke.

>> So I would take that article with a grain of salt. Particularly the
>> statement about 'real time'. That's not even feasible.
>
> That a desired monitoring regime would require a violation of physics
> has *never* stopped a legislative push for such.

But it has always stopped it from being implemented or executed in
practice. While the development is terrifying, it is also important
to note the lack of actual consequences it will have. Other than
being a huge embarrassment.

But I was always under the impression that the UK didn't really care
about free speech and privacy.

Piotr Jagielski

Apr 1, 2012, 10:04:22 AM

to wikit...@lists.wikimedia.org

Hello,

I'm trying to import the categorylinks.sql dump into my MySQL database. I'm
able to import it and query for articles in specific categories as long
as the category name contains only English-language characters. I don't get
any results if I try to query for a non-English category name. My
understanding is that the dump is in UTF-8 format, so I tried the following:

create the database using the following command:
CREATE DATABASE wiki CHARACTER SET utf8 COLLATE utf8_general_ci;

import the dump using the following command:
mysql --user root --password=root wiki <
C:\Path\plwiki-20111227-categorylinks.sql --default-character-set=utf8

set my data source URL to the following in my Java code:
jdbc:mysql://localhost/plwiki?useUnicode=true&characterEncoding=UTF-8

It still doesn't work. What am I missing? Are there any instructions on
how to correctly import the dump anywhere?

Thanks,
Piotr

Svip

Apr 1, 2012, 10:31:47 AM

to Wikimedia developers

On 1 April 2012 16:04, Piotr Jagielski <piotr.j...@op.pl> wrote:

> mysql --user root --password=root wiki <
> C:\Path\plwiki-20111227-categorylinks.sql --default-character-set=utf8

It's -p, not --password=root and it will prompt you for the password.

Piotr Jagielski

Apr 1, 2012, 11:05:05 AM

to wikit...@lists.wikimedia.org

These options should be equivalent. It does load the data using the
below command. It just incorrectly handles non-English characters.

Regards,
Piotr

Platonides

Apr 1, 2012, 11:28:06 AM

to wikit...@lists.wikimedia.org

On 1 April 2012 14:53, Svip wrote:
> On 1 April 2012 13:59, David Gerard <dge...@gmail.com> wrote:
>> On 1 April 2012 12:23, Svip <svi...@gmail.com> wrote:
>>> So I would take that article with a grain of salt. Particularly the
>>> statement about 'real time'. That's not even feasible.
>>
>> That a desired monitoring regime would require a violation of physics
>> has *never* stopped a legislative push for such.
>
> But it has always stopped it from being implemented or executed in
> practice. While the development is terrifying, it is also important
> to note the lack of actual consequences it will have. Other than
> being a huge embarrassment.

I don't see why it *couldn't* be implemented.
Note that the real-time statement is no different from how they can snoop
on your phone calls in real time.
Sure, the storage requirements would be crazy, but I don't see specific
details on what is to be stored, so it may well be implementable given
enough funding.

Platonides

Apr 1, 2012, 11:30:24 AM

to wikit...@lists.wikimedia.org

On 01/04/12 17:05, Piotr Jagielski wrote:
> These options should be equivalent. It does load the data using the
> below command. It just incorrectly handles non-English characters.
>
> Regards,
> Piotr

Do you have $wgDBmysql5 set in your LocalSettings.php?

Piotr Jagielski

Apr 1, 2012, 11:37:34 AM

to Wikimedia developers, Platonides

I don't have MediaWiki installed. I'm just trying to import the dump
into a standalone database so I can do some batch processing on the data.

Regards,
Piotr

Bináris

Apr 1, 2012, 12:00:10 PM

to Wikimedia developers

2012/4/1 David Gerard <dge...@gmail.com>

> http://www.bbc.co.uk/news/uk-politics-17576745
>
This one may be an April 1 joke, let's wait one day. :-)

--
Bináris

David Gerard

Apr 1, 2012, 12:39:39 PM

to Wikimedia developers

On 1 April 2012 17:00, Bináris <wiki...@gmail.com> wrote:
> 2012/4/1 David Gerard <dge...@gmail.com>

>> http://www.bbc.co.uk/news/uk-politics-17576745

> This one may be an April 1 joke, let's wait one day. :-)

No, it really isn't, sadly.

- d.

Antoine Musso

Apr 1, 2012, 12:43:50 PM

to wikit...@lists.wikimedia.org

Le 01/04/12 12:55, Petr Bena wrote:
> I see no point in doing that. Https doesn't support caching well and
> is generally slower. There is no use for readers for that.

HTTPS has nothing to do with caching; it just transports information
between the client and the server so that they can actually handle caching.

HTTPS supports caching as well as HTTP does, since they are exactly the
same protocol, the first just being encrypted.

You are right, though, in the sense that most web browsers will BY DEFAULT
not save a copy of content received through HTTPS. The reason is that
HTTPS pages are/were usually used to serve private content. A response
can be explicitly made cacheable by marking it as public: send
"Cache-Control: public" and that should work.
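
As a rough illustration of the header mentioned above, here is a minimal Python sketch that builds a Cache-Control value for a response. The helper name cache_headers and the max-age figure are assumptions made for the example; they are not taken from any Wikimedia configuration.

```python
def cache_headers(max_age_seconds, shared=True):
    """Build a Cache-Control header for a response.

    shared=True marks the response "public", i.e. any cache may store it;
    shared=False marks it "private", i.e. only the user's own browser may.
    """
    scope = "public" if shared else "private"
    return {"Cache-Control": "%s, max-age=%d" % (scope, max_age_seconds)}


# Hypothetical example: allow caches to keep an article page for an hour.
print(cache_headers(3600))
```

Whether a given browser honours this for HTTPS responses depends on the browser; the sketch only shows what the server would send.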

I do agree there is probably no use for readers to have HTTPS enabled.
If the purpose is to bypass country-level firewalls such as in China (or I
think Thailand), they will just intercept the HTTPS connection from the
server on their hardware, decipher it for analysis and re-sign the
content with their own certificate before sending it back to clients.

That is exactly what you do in a big company when you want to make sure
(as an example) that your employees do not use the chat function in Facebook.

The only thing HTTPS is going to prevent is having one's password stolen
when logging in, or getting the session cookie hijacked by someone sniffing
the local network. The WMF has already moved its private wikis to HTTPS
just for that :-]

cheers,

--
Antoine "hashar" Musso

Marcin Cieslak

Apr 1, 2012, 1:50:35 PM

to wikit...@lists.wikimedia.org

>> Piotr Jagielski <piotr.j...@op.pl> wrote:
> Hello,

>
> set my data source URL to the following in my Java code:
> jdbc:mysql://localhost/plwiki?useUnicode=true&characterEncoding=UTF-8

Please note you have "plwiki" here and you imported into "wiki".
Assuming your .my.cnf is not making things difficult I ran a small
Jython script to test:

$ jython
Jython 2.5.2 (Release_2_5_2:7206, Mar 2 2011, 23:12:06)
[OpenJDK 64-Bit Server VM (Sun Microsystems Inc.)] on java1.6.0
Type "help", "copyright", "credits" or "license" for more information.
>>> from com.ziclix.python.sql import zxJDBC
>>> d, u, p, v = "jdbc:mysql://localhost/wiki", "root", None, "org.gjt.mm.mysql.Driver"
>>> db = zxJDBC.connect(d, u, p, v, CHARSET="utf8")
>>> c=db.cursor()
>>> c.execute("select cl_from, cl_to from categorylinks where cl_from=61 limit 10")
>>> c.fetchone()
(61, array('b', [65, 110, 100, 111, 114, 97]))
>>> (a,b) = c.fetchone()
>>> print b
array('b', [67, 122, -59, -126, 111, 110, 107, 111, 119, 105, 101, 95, 79, 114, 103, 97, 110, 105, 122, 97, 99, 106, 105, 95, 78, 97, 114, 111, 100, -61, -77, 119, 95, 90, 106, 101, 100, 110, 111, 99, 122, 111, 110, 121, 99, 104])
>>> for x in b:
...     try:
...         print chr(x),
...     except ValueError:
...         print "%02x" % x,
...
C z -3b -7e o n k o w i e _ O r g a n i z a c j i _ N a r o d -3d -4d w _ Z j e d n o c z o n y c h

array('b', [ ... ]) in Jython means that the SQL driver returns an array of bytes.

It seems to me that the array of bytes contains raw UTF-8, so you need to decode it into
the proper Unicode that Java uses in strings.

I think this behaviour is described in

http://bugs.mysql.com/bug.php?id=25528

Probably you need to play with getBytes() on a result object
to get what you want.
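
To make the decoding step concrete, here is a small sketch in plain Python 3 (not Jython) that turns the signed byte values printed above into text. The function name decode_signed_utf8 is made up for this example. In Java, the analogous step would presumably be new String(bytes, "UTF-8") on the array returned by getBytes().

```python
def decode_signed_utf8(signed_bytes):
    """Decode a sequence of signed bytes (-128..127) holding raw UTF-8."""
    # Java bytes are signed; mask each value into the 0..255 range first.
    raw = bytes(b & 0xFF for b in signed_bytes)
    return raw.decode("utf-8")


# The cl_to value from the session above.
cl_to = [67, 122, -59, -126, 111, 110, 107, 111, 119, 105, 101, 95, 79, 114,
         103, 97, 110, 105, 122, 97, 99, 106, 105, 95, 78, 97, 114, 111, 100,
         -61, -77, 119, 95, 90, 106, 101, 100, 110, 111, 99, 122, 111, 110,
         121, 99, 104]

print(decode_signed_utf8(cl_to))  # Członkowie_Organizacji_Narodów_Zjednoczonych
```

The pairs (-59, -126) and (-61, -77) are the two-byte UTF-8 sequences for "ł" and "ó", which is why they showed up as negative numbers in the loop output.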

//Saper

Platonides

Apr 1, 2012, 2:33:22 PM

to wikit...@lists.wikimedia.org

On 01/04/12 18:43, Antoine Musso wrote:
> Le 01/04/12 12:55, Petr Bena wrote:
>> I see no point in doing that. Https doesn't support caching well and
>> is generally slower. There is no use for readers for that.
>
> HTTPS has nothing to do with caching, it just transports informations
> between the client and the server so they can actually handle caching.
>
> HTTPS supports caching as well as HTTP since they are exactly the same
> protocol, the first just being encrypted.

There would be a small difference if you're behind a caching proxy, but
that's unlikely to make a difference to pretty much everyone.

> I do agree there is probably no use for readers to have HTTPS enabled.
> If the purposes is to bypass countries firewall such as in China (or I
> think Thailand), they will just intercept the HTTPS connection form the
> server on their hardware, decypher it for analysis and resign the
> content with their own certificate before sending it back to clients.

Note that such an approach would yield a certificate which, if stored
during the attack and later published, is proof of their evil-doing.
Any CA willingly doing that (even if "forced by the government") would
(should) be immediately removed from the browsers' certificate bundles.

(I believe such interposition has been done in the past, though)

> That is exactly what you do in a big company when you want to make sure
> (as an example) that your employee do not use the chat function in Facebook.

A company can install its own CA certificate on its own computers, and
have a policy of "we will sniff everything" (note that if the employee
is not properly informed of that, the wiretapping could well be
illegal).
I wonder how they handle self-signed certificates.

Piotr Jagielski

Apr 1, 2012, 2:32:05 PM

to Wikimedia developers, Marcin Cieslak

Sorry, I made a mistake in the e-mail. I had the database set to the
same name in both places.

My problem is actually the opposite: I don't get any results when I
use a UTF-8 string as input in the query. But I verified that I don't
get correct results when using the query you provided either. The link
to the MySQL bug report might be helpful in resolving the problem, so
thanks for providing it.

Piotr

Ryan Lane

Apr 1, 2012, 4:14:49 PM

to Wikimedia developers

TL;DR: we have no plans for anonymous HTTPS by default, but will
eventually default to HTTPS for logged-in users.

1. It would require an ssl terminator on every frontend cache. The ssl
terminators eat memory, which is also what the frontend caches do.
2. HTTPS dramatically increases latency, which would be kind of
painful for mobile.
3. Some countries may completely block HTTPS, but allow HTTP to our
sites so that they can track users. Is it better for us to provide
them content, or protect their privacy?
4. It's still possible for governments to see that people are going to
wikimedia sites when using HTTPS, so it's still possible to oppress
people for trying to visit sites that are disallowed.

Leslie Carr

Apr 1, 2012, 4:24:34 PM

to Wikimedia developers

On Sun, Apr 1, 2012 at 1:14 PM, Ryan Lane <rla...@gmail.com> wrote:
> TL;DR: we have no plans for anonymous HTTPS by default, but will
> eventually default to HTTPS for logged-in users.
>
> 1. It would require an ssl terminator on every frontend cache. The ssl
> terminators eat memory, which is also what the frontend caches do.
> 2. HTTPS dramatically increases latency, which would be kind of
> painful for mobile.

Without getting into how other countries censor data (boo!), I agree
with the first two points. SSL terminators are much more memory- and
CPU-intensive, which would require many more machines. Also, there are
more RTTs required for https/ssl, and our ping latency is not very
good since we do not have a very geographically diverse
infrastructure.

The two solutions for this are #1 more and beefier machines and #2
caching centers in various locations physically closer to users (which
also requires a lot of #1). Sadly the biggest drawback of these two
points is that they both cost a lot of money and that would mean a lot
more pop up banners of Jimmy asking for cash :(
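
The round-trip point above can be put into rough numbers. The sketch below assumes a full TLS handshake costs two extra round trips on top of the TCP handshake, as is the case for a full (non-resumed) handshake in TLS 1.0 through 1.2. The function names and the 150 ms RTT are illustrative assumptions, not measurements of Wikimedia's network.

```python
def http_startup_ms(rtt_ms):
    """Time until the first response byte over plain HTTP.

    One round trip for the TCP handshake, one for the request/response.
    """
    return rtt_ms * (1 + 1)


def https_startup_ms(rtt_ms, tls_round_trips=2):
    """Same, with the TLS handshake round trips added in between."""
    return rtt_ms * (1 + tls_round_trips + 1)


# Hypothetical user 150 ms away from the nearest datacenter.
rtt = 150
print(http_startup_ms(rtt), https_startup_ms(rtt))
```

Under these assumptions the fixed cost per new connection doubles, which is why caching centers closer to users help: they shrink the RTT that gets multiplied.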

Leslie

P.S. I personally like the idea of a check box at the top of the page
(shown only once, perhaps?) that sets a cookie so that users who ask for
it are sent to https by default. However, I don't think we can do this
with our current infrastructure due to the above issues.

> 3. Some countries may completely block HTTPS, but allow HTTP to our
> sites so that they can track users. Is it better for us to provide
> them content, or protect their privacy?
> 4. It's still possible for governments to see that people are going to
> wikimedia sites when using HTTPS, so it's still possible to oppress
> people for trying to visit sites that are disallowed.
>
> On Sun, Apr 1, 2012 at 7:06 PM, David Gerard <dge...@gmail.com> wrote:
>> Lots of monitoring going into place:
>>
>> https://en.wikipedia.org/wiki/Wikipedia:List_of_articles_censored_in_Saudi_Arabia
>> http://www.bbc.co.uk/news/uk-politics-17576745
>>
>> What are the current technical barriers to redirection to https by default?
>>
>>
>> - d.

--
Leslie Carr
Wikimedia Foundation
AS 14907, 43821

Platonides

Apr 1, 2012, 4:30:52 PM

to wikit...@lists.wikimedia.org

On 01/04/12 17:37, Piotr Jagielski wrote:
> I don't have MediaWiki installed. I'm just trying to import the dump
> into a standalone database so I can do some batch processing on the data.
>
> Regards,
> Piotr

It inserts the data fine for me. I suspect your Java code is failing to
read it appropriately. Try reading the table with a different tool,
such as phpMyAdmin.

> mysql> select * from categorylinks limit 20;
> +---------+---------------------------------------+-------------------------------------+---------------------+-------------------+--------------+---------+
> | cl_from | cl_to | cl_sortkey | cl_timestamp | cl_sortkey_prefix | cl_collation | cl_type |
> +---------+---------------------------------------+-------------------------------------+---------------------+-------------------+--------------+---------+
> | 0 | Ekspresowe_kasowanko | Golembiovski Andzey | 2009-07-09 21:01:30 | | | page |
> | 2 | Języki_skryptowe | AWK
> AWK | 2011-01-18 01:11:23 | Awk | uppercase | page |
> | 4 | Specjalności_lekarskie | ALERGOLOGIA | 2008-04-25 10:31:22 | | uppercase | page |
> | 6 | Formaty_plików_komputerowych | ASCII | 2011-09-23 11:01:05 | | uppercase | page |
> | 6 | Kodowania_znaków | ASCII | 2011-09-23 11:01:05 | | uppercase | page |
> | 7 | Artykuły_na_medal | ATOM | 2010-12-01 16:40:37 | | uppercase | page |
> | 7 | Artykuły_wymagające_dopracowania | ATOM | 2011-08-16 15:53:43 | | uppercase | page |
> | 7 | Atomy |
> ATOM | 2011-08-09 00:56:39 | | uppercase | page |
> | 8 | Logika_matematyczna | AKSJOMAT | 2007-11-10 08:18:06 | | uppercase | page |
> | 10 | Arytmetyka |
> ARYTMETYKA | 2011-10-17 02:36:39 | | uppercase | page |
> | 11 | Artykuły_pod_opieką_Projektu_Chemia | AMINOKWASY | 2011-08-19 02:48:21 | | uppercase | page |
> | 12 | Alkeny | *
> ALKENY | 2006-08-07 17:23:22 | * | uppercase | page |
> | 13 | Multimedia | ACTIVEX | 2007-05-24 20:20:15 | | uppercase | page |
> | 13 | Windows | ACTIVEX | 2007-05-24 20:20:15 | | uppercase | page |
> | 14 | Interfejsy_programistyczne | !
> APPLICATION PROGRAMMING INTERFACE | 2011-04-27 11:33:17 | ! | uppercase | page |
> | 15 | Amiga | AMIGAOS | 2007-09-09 17:19:11 | | uppercase | page |
> | 15 | Systemy_operacyjne | AMIGAOS | 2007-09-09 17:19:11 | | uppercase | page |
> | 16 | Organizacje_międzynarodowe | ASSOCIATION FOR COMPUTING MACHINERY | 2011-10-19 15:52:28 | | uppercase | page |
> | 18 | Funkcje_boolowskie | ALTERNATYWA | 2007-03-23 17:43:05 | | uppercase | page |
> | 19 | Logika_matematyczna | AKSJOMAT INDUKCJI | 2007-08-31 22:54:55 | | uppercase | page |
> +---------+---------------------------------------+-------------------------------------+---------------------+-------------------+--------------+---------+
> 20 rows in set (0.00 sec)

Tim Starling

Apr 1, 2012, 11:33:14 PM

to wikit...@lists.wikimedia.org

On 02/04/12 06:14, Ryan Lane wrote:
> TL;DR: we have no plans for anonymous HTTPS by default, but will
> eventually default to HTTPS for logged-in users.
>
> 1. It would require an ssl terminator on every frontend cache. The ssl
> terminators eat memory, which is also what the frontend caches do.

Once we enable it by default for logged-in users, we will care a lot
more if someone tries to take it down with a DoS attack. Unless the
redirection can be disabled without actually logging in, a DoS attack
on the HTTPS frontend would prevent any authenticated activity.

It suggests a need for a robust, overprovisioned service, with tools
and procedures in place for identifying and blocking or throttling
malicious traffic.

[...]

> 3. Some countries may completely block HTTPS, but allow HTTP to our
> sites so that they can track users. Is it better for us to provide
> them content, or protect their privacy?
> 4. It's still possible for governments to see that people are going to
> wikimedia sites when using HTTPS, so it's still possible to oppress
> people for trying to visit sites that are disallowed.

It's also possible for governments to snoop on HTTPS communications,
by using a private key from a trusted CA to perform a
man-in-the-middle attack. Apparently the government of Iran has done this.

If we really want to protect the privacy of our users then we should
shut down the regular website and serve our content only via a Tor
hidden service ;)

-- Tim Starling

Petr Bena

Apr 2, 2012, 3:20:43 AM

to Wikimedia developers

That's not what I wanted to say. I wanted to say "https may cause
trouble with caching". In fact, some caching servers have problems
with https since the headers are encrypted as well, so they usually just
forward the encrypted traffic to the server. I'm not saying it's impossible
to cache this, but it's very complicated.

On Sun, Apr 1, 2012 at 6:43 PM, Antoine Musso <hasha...@free.fr> wrote:
> Le 01/04/12 12:55, Petr Bena wrote:
>> I see no point in doing that. Https doesn't support caching well and
>> is generally slower. There is no use for readers for that.
>
> HTTPS has nothing to do with caching, it just transports informations
> between the client and the server so they can actually handle caching.
>
> HTTPS supports caching as well as HTTP since they are exactly the same
> protocol, the first just being encrypted.
>


Antoine Musso

Apr 2, 2012, 5:00:39 AM

to wikit...@lists.wikimedia.org

On 2012-04-02 09:20, Petr Bena wrote:
> That's not what I wanted to say, I wanted to say "https may cause
> troubles with caching", In fact some caching servers have problems
> with https since the header is encrypted as well, so they usually just
> forward the encrypted traffic to server. I don't say it's impossible
> to cache this, but it's very complicated

That might indeed be an issue.

That is why you want to use an HTTPS offloader at the edge of your
cluster: it will handle decryption and then serve the content as
unencrypted traffic again :-]

I believe that is what the WMF is doing by using nginx as an HTTPS
proxy. Someone with better knowledge will confirm.

--
Antoine "hashar" Musso

Tei

Apr 2, 2012, 5:34:57 AM

to Wikimedia developers

Perhaps have a blacklist of countries that are known to break the
privacy of communications, then make https the default for logged-in
users in these countries.

This may help because:

- It only affects a subgroup of users (the ones from these countries)
- It only affects a subgroup of that subgroup, the logged-in users (not all)
- It creates a blacklist of "bad countries" where citizens are under
surveillance by the government

This perhaps is not feasible if there's no easy way to detect the
country based on the IP.

--
--
ℱin del ℳensaje.

Petr Bena

Apr 2, 2012, 11:31:32 AM

to Wikimedia developers

I believe it would be best if the login form were served over http with
a check box "Disable ssl", unchecked by default. The target page of the
form would be an ssl page unless the user checked the box. That way, in
countries where ssl is a problem, users could just check it and
proceed using an unencrypted connection.

Daniel Friesen

Apr 2, 2012, 1:33:53 PM

to wikit...@lists.wikimedia.org

Serving the login page over http opens login up to MITM attacks that
inject scripts to swipe passwords or modify the form to only use
http. So you've already eliminated half the reason we introduced https.
Additionally, you cannot control the action="" using a checkbox unless you
use JS to do it (and we strive to make sure our login form works for those
without JS). So in order to make a disable-SSL checkbox work, you have to
make the action="" an http page that does the redirection.
However, doing that means that the password is now posted over HTTP and a
MITM can snoop passwords. Worse, this eliminates most of the
rest of the advantage of https, because we're all the
way back to making it possible to snoop user passwords on open Wi-Fi.

On Mon, 02 Apr 2012 08:31:32 -0700, Petr Bena <bena...@gmail.com> wrote:

> I believe it would be best if login form was served using http with
> check box "Disable ssl" which would be not checked as default. The
> target page of form would be ssl page in case users wouldn't check it.
> So that in countries where ssl is problem they could just check it and
> proceed using unencrypted connection.
>
> On Mon, Apr 2, 2012 at 11:34 AM, Tei <oscar...@gmail.com> wrote:
>> Perhaps have a black list of countries that are know to break the
>> privacy of communications, then make https default for logued users in
>> these countries.
>>
>> This may help because:
>>
>> - It only affect a subgroup of users (the ones from these countries)
>> - It only affect a subgroup of that subgroup, the logued users (not
>> all)
>> - It create a blacklist of "bad countries" where citizens are under
>> surveillance by the governement
>>
>> This perhaps is not feasible, if theres not easy way to detect the
>> country based on the ip.
>>
>> --
>> --
>> ℱin del ℳensaje.

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Ryan Lane

Apr 2, 2012, 2:34:13 PM

to Wikimedia developers

On Mon, Apr 2, 2012 at 12:33 PM, Tim Starling <tsta...@wikimedia.org> wrote:
> On 02/04/12 06:14, Ryan Lane wrote:
>> TL;DR: we have no plans for anonymous HTTPS by default, but will
>> eventually default to HTTPS for logged-in users.
>>
>> 1. It would require an ssl terminator on every frontend cache. The ssl
>> terminators eat memory, which is also what the frontend caches do.
>
> Once we enable it by default for logged-in users, we will care a lot
> more if someone tries to take it down with a DoS attack. Unless the
> redirection can be disabled without actually logging in, a DoS attack
> on the HTTPS frontend would prevent any authenticated activity.
>
> It suggests a need for a robust, overprovisioned service, with tools
> and procedures in place for identifying and blocking or throttling
> malicious traffic.
>

Indeed. We're already pretty over provisioned. We have 4 servers per
datacenter, each of which is very bored. All they are doing is acting
as a transparent proxy, after ssl termination. We're using RC4 by
default (due to BEAST), and AES is also available (the processors we
are using have AES support).

Ideally we'll be using STS for logged in users. This will mean it's
impossible to turn off the redirection for users that have already
logged in for whatever period of time we have STS headers set. We need
to consider blocking a DoS from the SSL proxies, the LVS servers, or
the routers.
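
For readers unfamiliar with STS (HTTP Strict Transport Security), the mechanism is a single response header telling the browser to use HTTPS for the site for a given number of seconds. A minimal sketch of building that header follows. The helper name sts_header and the 30-day period are assumptions for illustration, not the values Wikimedia planned to use.

```python
def sts_header(max_age_seconds, include_subdomains=False):
    """Build a Strict-Transport-Security header as a (name, value) pair.

    Once a browser has seen this header over HTTPS, it refuses plain HTTP
    for the site until max_age_seconds have passed.
    """
    value = "max-age=%d" % max_age_seconds
    if include_subdomains:
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)


# Hypothetical policy: pin HTTPS for 30 days.
print(sts_header(30 * 24 * 3600))
```

This is why the redirection cannot be switched off from the server side for such users: the browser enforces it on its own until the period expires.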

>> 3. Some countries may completely block HTTPS, but allow HTTP to our
>> sites so that they can track users. Is it better for us to provide
>> them content, or protect their privacy?
>> 4. It's still possible for governments to see that people are going to
>> wikimedia sites when using HTTPS, so it's still possible to oppress
>> people for trying to visit sites that are disallowed.
>
> It's also possible for governments to snoop on HTTPS communications,
> by using a private key from a trusted CA to perform a
> man-in-the-middle attack. Apparently the government of Iran has done this.
>

We really should publish our certificate fingerprints. An attack like
this can be detected. An end-user being attacked can see if the
certificate they are being handed is different from the one we
advertise. We could also provide a convergence notary service (or one
of the other things like convergence).
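
The fingerprint check described above could look roughly like the following sketch: hash the DER-encoded certificate the client was handed and compare it with the published value. SHA-256 and the helper names are assumptions made for this example; the thread does not say which hash or format would be published.

```python
import hashlib


def fingerprint(der_bytes):
    """Return the SHA-256 fingerprint of a DER certificate as AA:BB:... hex."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def matches_published(der_bytes, published):
    """True if the certificate's fingerprint equals the published one."""
    return fingerprint(der_bytes) == published.strip().upper()


# Stand-in bytes; a real check would use the server's actual certificate.
cert = b"not a real certificate"
published = fingerprint(cert)
print(matches_published(cert, published))
print(matches_published(b"substituted certificate", published))
```

In practice the DER bytes could be obtained with Python's ssl.get_server_certificate() followed by ssl.PEM_cert_to_DER_cert(). The published fingerprint would of course have to reach the user over a channel the attacker does not control.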

> If we really want to protect the privacy of our users then we should
> shut down the regular website and serve our content only via a Tor
> hidden service ;)
>

I agree that it's impossible to provide total protection of a user's
privacy. We could provide a number of services that would help users,
though. That said, I don't feel this should be on the top of our
priority list.

- Ryan

Ryan Lane

Apr 2, 2012, 2:58:41 PM

to Wikimedia developers

On Mon, Apr 2, 2012 at 4:20 PM, Petr Bena <bena...@gmail.com> wrote:
> That's not what I wanted to say, I wanted to say "https may cause
> troubles with caching", In fact some caching servers have problems
> with https since the header is encrypted as well, so they usually just
> forward the encrypted traffic to server. I don't say it's impossible
> to cache this, but it's very complicated
>

Using SSL by default means all transparent proxies in between aren't
hit at all, since they'd be a MITM. I don't necessarily see this as a
bad thing, as transparent proxies often break things.

Browsers cache things differently from HTTPS sites, but otherwise
everything should work as normal. The SSL termination proxies
transparently proxy to our frontend caches after termination. Links
are sent as protocol-relative so that we don't split our cache, as
well.
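
To illustrate the protocol-relative links mentioned above: a link written as "//host/path" inherits the scheme of the page it appears on, so the same cached HTML works for both HTTP and HTTPS visitors. The sketch below uses Python's standard urljoin to show the resolution; the image path is a made-up example.

```python
from urllib.parse import urljoin

# A protocol-relative link as it might appear in cached page HTML.
link = "//upload.wikimedia.org/example.png"

# The same link resolved from an HTTP page and from an HTTPS page.
print(urljoin("http://en.wikipedia.org/wiki/Main_Page", link))
print(urljoin("https://en.wikipedia.org/wiki/Main_Page", link))
```

Because the scheme is not baked into the stored page, the cache does not need separate HTTP and HTTPS copies of each article.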

- Ryan

Ryan Lane

Apr 2, 2012, 3:00:32 PM

to Wikimedia developers

On Mon, Apr 2, 2012 at 6:34 PM, Tei <oscar...@gmail.com> wrote:
> Perhaps have a blacklist of countries that are known to break the
> privacy of communications, then make https the default for logged-in users in
> these countries.
>
> This may help because:
>
> - It only affects a subgroup of users (the ones from these countries)
> - It only affects a subgroup of that subgroup, the logged-in users (not all)
> - It creates a blacklist of "bad countries" where citizens are under
> surveillance by the government
>
> This is perhaps not feasible if there's no easy way to detect the
> country based on the IP.
>

I'd definitely not support doing something like this. This would
incredibly complicate things.

- Ryan

MZMcBride

Apr 2, 2012, 4:26:57 PM

to Wikimedia developers

Ryan Lane wrote:
> On Mon, Apr 2, 2012 at 6:34 PM, Tei <oscar...@gmail.com> wrote:
>> Perhaps have a blacklist of countries that are known to break the
>> privacy of communications, then make https the default for logged-in users in
>> these countries.
>>
>> This may help because:
>>
>> - It only affects a subgroup of users (the ones from these countries)
>> - It only affects a subgroup of that subgroup, the logged-in users (not all)
>> - It creates a blacklist of "bad countries" where citizens are under
>> surveillance by the government
>>
>> This is perhaps not feasible if there's no easy way to detect the
>> country based on the IP.
>
> I'd definitely not support doing something like this. This would
> incredibly complicate things.

Someone came into #wikimedia-tech a few days ago and asked about something
similar to this. The idea was to use site-wide JavaScript to auto-redirect
users to https on one of the Chinese Wikipedias. I believe this was in
combination with geolocation functionality, but I'm not sure.
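
For concreteness, the decision such a redirect would make can be sketched as below. This is Python rather than the site JavaScript that was actually discussed, and the function name, the country set and the "XX" code are hypothetical placeholders. It also assumes some geolocation lookup already supplied the country code.

```python
from typing import FrozenSet, Optional
from urllib.parse import urlsplit


def https_redirect_target(url: str, country: str,
                          redirect_countries: FrozenSet[str]) -> Optional[str]:
    """Return the https version of `url` if the visitor should be
    redirected, or None if no redirect applies."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None  # already https, or not a web URL
    if country.upper() not in redirect_countries:
        return None  # visitor is outside the configured set
    return parts._replace(scheme="https").geturl()


# Hypothetical configuration: "XX" stands in for a real country code.
REDIRECT_COUNTRIES = frozenset({"XX"})
target = https_redirect_target(
    "http://zh.wikipedia.org/wiki/Example?action=history", "xx", REDIRECT_COUNTRIES
)
```

Note that the first request still travels over plain http before any redirect can happen, so a redirect like this does not hide the initially requested page.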

Do you have any thoughts on individual wikis doing this, assuming there's
local community consensus?

MZMcBride

Platonides

Apr 2, 2012, 5:31:28 PM

to wikit...@lists.wikimedia.org

On 02/04/12 20:34, Ryan Lane wrote:
>> It's also possible for governments to snoop on HTTPS communications,
>> by using a private key from a trusted CA to perform a
>> man-in-the-middle attack. Apparently the government of Iran has done this.
>>
>
> We really should publish our certificate fingerprints. An attack like
> this can be detected. An end-user being attacked can see if the
> certificate they are being handed is different from the one we
> advertise. We could also provide a Convergence notary service (or one
> of the other things like Convergence).

Indeed. Detecting a potential MITM is useless if you can't determine if
it's real or not. For instance the switch from RapidSSL to DigiCert
certificate was quite suspicious.

I don't know how to best publicise it, though. I suppose we would list
them somewhere like https://secure.wikimedia.org/servers.html but if
nobody knows it's there...

Ryan Lane

Apr 2, 2012, 5:35:50 PM

to Wikimedia developers

> Indeed. Detecting a potential MITM is useless if you can't determine if
> it's real or not. For instance the switch from RapidSSL to DigiCert
> certificate was quite suspicious.
>
> I don't know how to best publicise it, though. I suppose we would list
> them somewhere like https://secure.wikimedia.org/servers.html but if
> nobody knows it's there...
>

What's https://secure.wikimedia.org?

- Ryan

Antoine Musso

Apr 2, 2012, 7:22:10 PM

to wikit...@lists.wikimedia.org

On April 2nd, 2012 at 23:35, Ryan Lane wrote:
> What's https://secure.wikimedia.org?

Some old experiment. Nothing to see here :-)

--
Antoine "hashar" Musso

Platonides

Apr 3, 2012, 11:26:55 AM

to wikit...@lists.wikimedia.org

Ryan Lane wrote:
> What's https://secure.wikimedia.org?
>
> - Ryan

The server which contains
https://secure.wikimedia.org/keys.html

Helder

Apr 3, 2012, 11:34:18 AM

to Wikimedia developers

On Tue, Apr 3, 2012 at 12:26, Platonides <Plato...@gmail.com> wrote:
> Ryan Lane wrote:
>> What's https://secure.wikimedia.org?
>>
>> - Ryan
>
> The server which contains
> https://secure.wikimedia.org/keys.html

When I access that page, Google Chrome gives this error message:
Failed to load resource: the server responded with a status of 404 (Not Found)
GET http://en.wikipedia.org/skins-1.5/monobook/headbg.jpg 404 (Not Found)

Best regards,
Helder

Petr Bena

Apr 3, 2012, 11:43:07 AM

to Wikimedia developers

Can we move back to the initial discussion about redirecting http to https,
please? :-) That page doesn't contain anything interesting anyway...
(Now that I've said this, I guess it's going to get way more visitors
than ever, hehe)
