
OT: porn attack against automated Turing tests


Paul Rubin

30 Jan 2004 15:43:45
Slashdot reports a cute attack porn spammers are using against
automated Turing tests (CAPTCHAs). The idea is they want to register
1000s of (e.g.) Hotmail accounts so they can send spam advertising
their porn site, but Hotmail requires users on enrollment to answer an
automated test that involves reading some distorted characters that
are hard for a computer program to recognize.

The spammers have found a low-tech way around the tests: their Hotmail
bot simply copies the distorted characters to their porn site in real
time, so the porn users have to supply the answers in order to be
shown more porn. The bot then gets the answers and supplies them to
Hotmail to enroll more Hotmail accounts. Even given the proxying
delays and the slowdown caused by the porn users typing with just one
hand, with enough users online at once they can apparently make this
fly with sufficiently low latency for the Captcha-using sites.

I don't know why but I just have to chuckle at this.

Johan Lindh

30 Jan 2004 15:58:11
Paul Rubin wrote:

I don't see how the spammers can lose the war. They have the strongest
forces in the human psyche on their side. Stupidity, greed and lust.

/J

John Savard

30 Jan 2004 22:25:12
On 30 Jan 2004 21:58:11 +0100, Johan Lindh
<jo...@linkdata.getridofthis.se> wrote, in part:

Well, someone reading distorted characters to see porn is aiding and
abetting a criminal fraud against Hotmail and other companies, so just
monitor the Internet, track a few thousand of these individuals down
and give them jail terms, and no one will unscramble distorted
characters to see porn.

John Savard
http://home.ecn.ab.ca/~jsavard/index.html

Phil Carmody

2 Feb 2004 7:09:01

Thanks for that, Paul.
Evolution in action, eh?

It's basically the "win against a grandmaster" algorithm.
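
For anyone who doesn't know the reference: the trick is usually told as playing two grandmasters at once, taking opposite colours, and copying each one's moves onto the other board, so you come out even overall without knowing any chess. A minimal Python sketch of that relay follows. The two "grandmasters" are hypothetical scripted stand-ins and moves are treated as opaque strings, so this illustrates only the forwarding, not chess.

```python
from typing import Callable, List, Optional

# A "player" takes the opponent's last move (None at the start of the game)
# and returns its reply. Both players below are hypothetical stand-ins.
Player = Callable[[Optional[str]], str]


def make_scripted_player(moves: List[str]) -> Player:
    """Return a toy player that plays a fixed list of moves in order."""
    remaining = iter(moves)

    def play(opponent_move: Optional[str]) -> str:
        return next(remaining)

    return play


def relay(white_gm: Player, black_gm: Player, plies: int) -> List[str]:
    """Relay moves between two players without ever choosing a move.

    The relay sits as Black against white_gm and as White against black_gm,
    copying each move it receives onto the other board.
    """
    transcript: List[str] = []
    last_move: Optional[str] = None
    for ply in range(plies):
        mover = white_gm if ply % 2 == 0 else black_gm
        last_move = mover(last_move)  # the relay only forwards, never thinks
        transcript.append(last_move)
    return transcript


white = make_scripted_player(["e4", "Nf3"])
black = make_scripted_player(["e5", "Nc6"])
game = relay(white, black, plies=4)
print(game)  # ['e4', 'e5', 'Nf3', 'Nc6']
```

Swap "grandmaster" for "Hotmail's test" on one side and "porn site visitor" on the other and the structure is the same: the party in the middle contributes nothing but forwarding.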


Being a w3m user, and my g/f being a lynx user, I've never
had much time for these schemes anyway.

Phil
--
Unpatched IE vulnerability: NavigateAndFind file proxy
Description: cross-domain scripting, cookie/data/identity
theft, command execution
Reference: http://safecenter.net/liudieyu/NAFfileJPU/NAFfileJPU-Content.HTM
Exploit: http://safecenter.net/liudieyu/NAFfileJPU/NAFfileJPU-MyPage.htm

Phil Carmody

2 Feb 2004 7:21:30
jsa...@ecn.aSBLOKb.caNADA.invalid (John Savard) writes:
> Well, someone reading distorted characters to see porn is aiding and
> abetting a criminal fraud against Hotmail and other companies,

And telling someone the time, or directions to the town square,
is aiding and abetting bank robberies? I don't think so.
I think you'll have some problem with the mens rea, for a start.
I can only suppose that you'll be lobbying for all termination of
public cryptographic research forthwith, after things like Ross
Anderson's utterly amoral aiding and abetting of PIN-cracking etc.

> so just
> monitor the Internet, track a few thousand of these individuals down
> and give them jail terms, and no one will unscramble distorted
> characters to see porn.

How am I, as a hypothetical one-handed surfer, supposed to tell
the difference between a legitimate online gallery of artistic
glamour images which is validly protected by a "captcha"-style
scheme in order to hinder robots from illegally acquiring the
copyrighted images, and a hotmail-busting porn-spammer?
What crypto algorithm would you suggest in order to help me
tell the difference? Not that you could answer me, of course,
just in case it aids and abets some future fraud.

Phil
--
Unpatched IE vulnerability: Basic Authentication URL spoofing
Description: Spoofing the URL displayed in the Address bar
Reference: http://msgs.securepoint.com/cgi-bin/get/bugtraq0306/15.html

Francois Grieu

2 Feb 2004 10:25:14
In article <87vfmpv...@nonospaz.fatphil.org>,
Phil Carmody <thefatphi...@yahoo.co.uk> wrote:

> jsa...@ecn.aSBLOKb.caNADA.invalid (John Savard) writes:
> > Well, someone reading distorted characters to see porn is aiding and
> > abetting a criminal fraud against Hotmail and other companies,
>
> And telling someone the time, or directions to the town square,
> is aiding and abetting bank robberies? I don't think so.

Or could it be that John Savard was joking like the OP?
That's how I read his comment.

Francois Grieu

John Savard

3 Feb 2004 2:19:55
On Mon, 02 Feb 2004 16:25:14 +0100, Francois Grieu
<fgr...@micronet.fr> wrote, in part:

Well, I realize the suggestion might be problematic, but it wasn't
made in jest. I think that the _mens rea_ wouldn't be too hard to
prove, given the publicity surrounding this, and the uses of
"captcha".

Of course, the porn site operators, by violating the copyright on the
images generated by the licensed "captcha" software, would be much
easier to put behind bars. (They're giving the software away free?
Evidently controlled distribution, and image watermarking, is going to
be needed here...)

John Savard
http://home.ecn.ab.ca/~jsavard/index.html

Phil Carmody

3 Feb 2004 6:00:16
jsa...@ecn.aSBLOKb.caNADA.invalid (John Savard) writes:
> Of course, the porn site operators, by violating the copyright on the
> images generated by the licensed "captcha" software, would be much
> easier to put behind bars. (They're giving the software away free?
> Evidently controlled distribution, and image watermarking, is going to
> be needed here...)

This could be the snag that catches them.

Note that the "watermarking" as such needn't be all that clever, it
needn't be real watermarking. We have a priori that:
a) the porn spammers don't have the time to do clever image processing
b) the image gets more useful the more they obscure it, up to a point.

What's to stop them serving non-watermarked captcha images looking like:
+------------------------------------------------+
| (c)(c) (c)(c) (c) (c) (c) |
| (c) (c) (c) (c) (c) (c) (c)) (c) |
| (c)(c) (c)(c) (c) (c) (c)c)(c) |
| (c) (c) (c) (c) (c) (c)(c(c) |
| (c) (c) (c) (c) (c) (c) hotmail 2004 |
+------------------------------------------------+
?

Phil

--
Unpatched IE vulnerability: window.open search injection
Description: cross-domain scripting, cookie/data/identity theft, command execution
Reference: http://safecenter.net/liudieyu/WsFakeSrc/WsFakeSrc-Content.HTM
Exploit: http://safecenter.net/liudieyu/WsFakeSrc/WsFakeSrc-MyPage.htm

Peter Gutmann

4 Feb 2004 2:26:47
jsa...@ecn.aSBLOKb.caNADA.invalid (John Savard) writes:

>On 30 Jan 2004 21:58:11 +0100, Johan Lindh
><jo...@linkdata.getridofthis.se> wrote, in part:
>>Paul Rubin wrote:
>>
>>> Slashdot reports a cute attack porn spammers are using against
>>> automated Turing tests (CAPTCHAs). The idea is they want to register
>>> 1000s of (e.g.) Hotmail accounts so they can send spam advertising
>>> their porn site, but Hotmail requires users on enrollment to answer an
>>> automated test that involves reading some distorted characters that
>>> are hard for a computer program to recognize.
>>>
>>> The spammers have found a low-tech way around the tests: their Hotmail
>>> bot simply copies the distorted characters to their porn site in real
>>> time, so the porn users have to supply the answers in order to be
>>> shown more porn. The bot then gets the answers and supplies them to
>>> Hotmail to enroll more Hotmail accounts. Even given the proxying
>>> delays and the slowdown caused by the porn users typing with just one
>>> hand, with enough users online at once they can apparently make this
>>> fly with sufficiently low latency for the Captcha-using sites.
>>>
>>> I don't know why but I just have to chuckle at this.
>>
>>I don't see how the spammers can lose the war. They have the strongest
>>forces in the human psyche on their side. Stupidity, greed and lust.

>Well, someone reading distorted characters to see porn is aiding and abetting
>a criminal fraud against Hotmail and other companies,

I'm sure they're quaking in their boots over this.

>so just monitor the Internet, track a few thousand of these individuals down
>and give them jail terms, and no one will unscramble distorted characters to
>see porn.

Ah of course, now why didn't I think of that? However, what about the folks
who are paid to sit in Internet cafes in third-world countries and do the same
thing, or the dozen other ways of defeating reverse Turing tests?

(For people interested in this area, google for Human Interactive Proof for a
pile of publications in this area. It's an interesting way to kill an
afternoon).

Peter.

thisisme

4 Feb 2004 16:37:56
Ah yes, the classic "man in the middle" attack.


Mok-Kong Shen

8 Feb 2004 12:11:35

Paul Rubin wrote:
>
> Slashdot reports a cute attack porn spammers are using against
> automated Turing tests (CAPTCHAs). ........
[snip]

The character P in CAPTCHA stands for Public and means
that the code and data used by CAPTCHA should be publicly
available. Why is this? If e.g. there is a huge public
data base involved, humans would have disadvantages in
making use of it (time required for exhaustive search)
as compared to computers, i.e. humans would tend to
perform poorly in doing the tests, which would be
against the very purpose of having the test, wouldn't it?
Thanks.

M. K. Shen

John A. Malley

8 Feb 2004 15:43:26
Mok-Kong Shen wrote:


The answer is in the paper "CAPTCHA: Using Hard AI Problems For
Security", by Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John
Langford, available at

http://www-2.cs.cmu.edu/~biglou/captcha_crypt.pdf


"A captcha is a program that can generate and grade tests that most humans can pass, but current computer programs can't pass."


Your example does not meet the informal definition of a CAPTCHA hard-AI
problem. (The formal definition is in the paper.)

Making CAPTCHA code and data public ensures its security doesn't depend
in any way on secret code or data. Adversaries who learn the secret code
or data would defeat it with impunity. CAPTCHA security depends on the
state of the art of AI algorithms that solve the problem incarnated by
the CAPTCHA and the randomness in every interaction with the verifier.
(See page 7 of the cited paper). I'd also add the Adversary cannot have
oracle access to a human, either, after reading Paul Rubin's post. :-)

(I think every CAPTCHA falls to any Adversary with oracle access to a
human when considered in the computational-complexity/computational
effort sense, but maybe a CAPTCHA augmented with physical information
can distinguish a human from an Adversary with access to a human oracle?
Something along the lines of applying physically observable cryptography
to the problem {http://eprint.iacr.org/2003/120/} ).


HTH,

John A. Malley
10266...@compuserve.com

Mok-Kong Shen

8 Feb 2004 16:52:59

"John A. Malley" wrote:


>
> Mok-Kong Shen wrote:
>
> > The character P in CAPTCHA stands for Public and means
> > that the code and data used by CAPTCHA should be publicly
> > available. Why is this? If e.g. there is a huge public
> > data base involved, humans would have disadvantages in
> > making use of it (time required for exhaustive search)
> > as compared to computers, i.e. humans would tend to
> > perform poorly in doing the tests, which would be
> > against the very purpose of having the test, wouldn't it?
>
> The answer is in the paper "CAPTCHA: Using Hard AI Problems For
> Security", by Luis von Ahn, Manuel Blum, Nicholas J. Hopper, and John
> Langford, available at
>
> http://www-2.cs.cmu.edu/~biglou/captcha_crypt.pdf
>
> "A captcha is a program that can generate and grade tests that most humans can pass, but current computer programs can't pass."
>
> Your example does not meet the informal definition of a CAPTCHA hard-AI
> problem. (The formal definition is in the paper.)
>
> Making CAPTCHA code and data public ensures its security doesn't depend
> in any way on secret code or data. Adversaries who learn the secret code
> or data would defeat it with impunity. CAPTCHA security depends on the
> state of the art of AI algorithms that solve the problem incarnated by
> the CAPTCHA and the randomness in every interaction with the verifier.
> (See page 7 of the cited paper). I'd also add the Adversary cannot have
> oracle access to a human, either, after reading Paul Rubin's post. :-)
>
> (I think every CAPTCHA falls to any Adversary with oracle access to a
> human when considered in the computational-complexity/computational
> effort sense, but maybe a CAPTCHA augmented with physical information
> can distinguish a human from an Adversary with access to a human oracle?
> Something along the lines of applying physically observable cryptography
> to the problem {http://eprint.iacr.org/2003/120/} ).

I don't think that 'adversary with oracle access to
a human' is an issue, since the matter concerns whether
there is an 'automatic' (and hence cheap/fast, highly
effective) means to defeat a restriction/barrier that is
intended to be such as not to be overcome by a machine.
With the help of a human, the test (to be passed by
humans) will of course be passed. Or have I misunderstood
the point above? (Please explain a bit in this case.)

As to the answer to my question, I suppose you mean
what is said in the 2nd paragraph under 'Who Knows What?'
there. I am yet not fully convinced of that. I think
that the goal of CAPTCHA is of a somewhat different nature in
comparison to crypto. In crypto, a revealed secret could
have very grave consequences and is not repairable. In
the context of CAPTCHA, the effects of a failure due
to some secret in the scheme becoming known to the
adversary might be tolerable at first in some
applications and, after detection of the failure, a new
secret be employed in its place.

M. K. Shen

John A. Malley

8 Feb 2004 17:53:42
Mok-Kong Shen wrote:

[...]


>
> I don't think that 'adversary with oracle access to
> a human' is an issue, since the matter concerns whether
> there is an 'automatic' (and hence cheap/fast, highly
> effective) means to defeat a restriction/barrier that is
> intended to be such as not to be overcome by a machine.
> With the help of a human, the test (to be passed by
> humans) will of course be passed. Or have I misunderstood
> the point above? (Please explain a bit in this case.)


The Adversary is a computer program (algorithm) that attempts to beat
the CAPTCHA. An Adversary with oracle access to a human is
(tongue-in-cheek) a computer program that queries a human and gets an
answer that the algorithm uses in its own calculations. There are many
complexity theory problems dealing with the computational complexity of
decision problems assessed by algorithms with access to a "black box" to
answer a specific type of question in O(1) time. The black box is an
oracle.

The situation Paul Rubin described is an example of a CAPTCHA adversary
with oracle access to a human. And my point is that the test will get
solved by such an Adversary, so yes, you understood my post.


>
> As to the answer to my question, I suppose you mean
> what is said in the 2nd paragraph under 'Who Knows What?'
> there. I am yet not fully convinced of that. I think
> that the goal of CAPTCHA is of a somewhat different nature in
> comparison to crypto. In crypto, a revealed secret could
> have very grave consequences and is not repairable. In
> the context of CAPTCHA, the effects of a failure due
> to some secret in the scheme becoming known to the
> adversary might be tolerable at first in some
> applications and, after detection of the failure, a new
> secret be employed in its place.


A CAPTCHA depends on the difficulty of the cognitive problem to be
solved, and not any secret information. By definition a CAPTCHA does not
rely on secret data or secret code. The paper presents important
requirements on the kinds of cognitive problems that can be used as a
CAPTCHA, including the ability to sample instances of the problem
uniformly at random from a set of specific instances of the problem. A
CAPTCHA depends on instances of the problem being "hard" to solve with
current AI algorithms.

John A. Malley
10266...@compuserve.com

Mok-Kong Shen

9 Feb 2004 3:06:22

"John A. Malley" wrote:
>
> Mok-Kong Shen wrote:
>
> [...]
> >
> > I don't think that 'adversary with oracle access to
> > a human' is an issue, since the matter concerns whether
> > there is an 'automatic' (and hence cheap/fast, highly
> > effective) means to defeat a restriction/barrier that is
> > intended to be such as not to be overcome by a machine.
> > With the help of a human, the test (to be passed by
> > humans) will of course be passed. Or have I misunderstood
> > the point above? (Please explain a bit in this case.)
>
> The Adversary is a computer program (algorithm) that attempts to beat
> the CAPTCHA. An Adversary with oracle access to a human is
> (tongue-in-cheek) a computer program that queries a human and gets an
> answer that the algorithm uses in its own calculations. There are many
> complexity theory problems dealing with the computational complexity of
> decision problems assessed by algorithms with access to a "black box" to
> answer a specific type of question in O(1) time. The black box is an
> oracle.
>
> The situation Paul Rubin described is an example of a CAPTCHA adversary
> with oracle access to a human. And my point is that the test will get
> solved by such an Adversary, so yes, you understood my post.

One could consider a man-machine combination either
as a machine with a human oracle or as a human with
machine assistance (e.g. fast searching in data base)
in my view, and there could also be an n-m combination
instead of 1-1. Probably in a given practical situation
there is an optimal one (from economical standpoint)
of such combination to do the attack. The question may
be on the other hand whether this could be considered
to be outside the framework of the original CAPTCHA,
or, in other words, one should properly always deal
with this generalized situation.

>
> >
> > As to the answer to my question, I suppose you mean
> > what is said in the 2nd paragraph under 'Who Knows What?'
> > there. I am yet not fully convinced of that. I think
> > that the goal of CAPTCHA is of a somewhat different nature in
> > comparison to crypto. In crypto, a revealed secret could
> > have very grave consequences and is not repairable. In
> > the context of CAPTCHA, the effects of a failure due
> > to some secret in the scheme becoming known to the
> > adversary might be tolerable at first in some
> > applications and, after detection of the failure, a new
> > secret be employed in its place.
>
> A CAPTCHA depends on the difficulty of the cognitive problem to be
> solved, and not any secret information. By definition a CAPTCHA does not
> rely on secret data or secret code. The paper presents important
> requirements on the kinds of cognitive problems that can be used as a
> CAPTCHA, including the ability to sample instances of the problem
> uniformly at random from a set of specific instances of the problem. A
> CAPTCHA depends on instances of the problem being "hard" to solve with
> current AI algorithms.

By 'definition', certainly yes. I was questioning
whether it could make sense in practice to relax the
'definition' (like security through obscurity, which
some people believe the agencies actually practice).

M. K. Shen

John A. Malley

9 Feb 2004 3:49:13
Mok-Kong Shen wrote:
[...]

>>The situation Paul Rubin described is an example of a CAPTCHA adversary
>>with oracle access to a human. And my point is that the test will get
>>solved by such an Adversary, so yes, you understood my post.
>>
>
> One could consider a man-machine combination either
> as a machine with a human oracle or as a human with
> machine assistance (e.g. fast searching in data base)
> in my view, and there could also be an n-m combination
> instead of 1-1. Probably in a given practical situation
> there is an optimal one (from economical standpoint)
> of such combination to do the attack.


No, the situation is either an Adversary program attacking the CAPTCHA
or an Adversary program with oracle access to a human who unwittingly
assists the Adversary attacking the CAPTCHA by solving the CAPTCHA's
problem instance for the Adversary.

A human doesn't need machine assistance to tackle a CAPTCHA. A CAPTCHA
presents an instance of a cognitive problem "easy" for a human to solve
just because he's human. No special tools, no sweat, just look, point,
and click. (Finally we get points for just existing. Oh wait - nope,
that's not right, I forgot about the Nielsen ratings...)


> The question may
> be on the other hand whether this could be considered
> to be outside the framework of the original CAPTCHA,
> or, in other words, one should properly always deal
> with this generalized situation.
>

From the viewpoint of algorithms/computational complexity, I think it's
a safe bet to assert any Adversary with oracle access to a human will
always defeat a CAPTCHA. A CAPTCHA defeats an Adversary with oracle
access to a human if it detects the human is being "used" by the
Adversary, but this is hard to differentiate from an actual cooperating
human responding as he should to the CAPTCHA's posed problem. (A kind of
neat problem in my opinion, especially if it can be proved that there is
no defense against an Adversary with oracle human access.)

John A. Malley
10266...@compuserve.com


Nicol So

9 Feb 2004 6:34:58
John A. Malley wrote:
>
> A human doesn't need machine assistance to tackle a CAPTCHA. A CAPTCHA
> presents an instance of a cognitive problem "easy" for a human to solve
> just because he's human. No special tools, no sweat, just look, point,
> and click. (Finally we get points for just existing. Oh wait - nope,
> that's not right, I forgot about the Nielsen ratings...)

So it is "specto, ergo sum" now? :)

--
Nicol So
Disclaimer: Views expressed here are casual comments and should
not be relied upon as the basis for decisions of consequence.

Mok-Kong Shen

9 Feb 2004 9:56:36

"John A. Malley" wrote:
>
> Mok-Kong Shen wrote:
> [...]
>
> >>The situation Paul Rubin described is an example of a CAPTCHA adversary
> >>with oracle access to a human. And my point is that the test will get
> >>solved by such an Adversary, so yes, you understood my post.
> >>
> >
> > One could consider a man-machine combination either
> > as a machine with a human oracle or as a human with
> > machine assistance (e.g. fast searching in data base)
> > in my view, and there could also be an n-m combination
> > instead of 1-1. Probably in a given practical situation
> > there is an optimal one (from economical standpoint)
> > of such combination to do the attack.
>
> No, the situation is either an Adversary program attacking the CAPTCHA
> or an Adversary program with oracle access to a human who unwittingly
> assists the Adversary attacking the CAPTCHA by solving the CAPTCHA's
> problem instance for the Adversary.
>
> A human doesn't need machine assistance to tackle a CAPTCHA. A CAPTCHA
> presents an instance of a cognitive problem "easy" for a human to solve
> just because he's human. No special tools, no sweat, just look, point,
> and click. (Finally we get points for just existing. Oh wait - nope,
> that's not right, I forgot about the Nielsen ratings...)

But this is just 2 sides of the same coin, isn't it?
For an 'outsider', one sees only a human-machine pair
that is at work, right? Anyway, it's a question of
how much work is done by the machine and how much
by the human. It's conceivable that in one extreme
almost all work is done by the human, isn't it? I
don't think this terminological issue is essential.
Essential is, though, whether one considers (allows
for) in a given practical situation the possibility
of human involvement at all, for there is a cost
factor (the salary of the person) to be considered,
I suppose.

>
> > The question may
> > be on the other hand whether this could be considered
> > to be outside the framework of the original CAPTCHA,
> > or, in other words, one should properly always deal
> > with this generalized situation.
> >
>
>
> From the viewpoint of algorithms/computational complexity, I think it's
> a safe bet to assert any Adversary with oracle access to a human will
> always defeat a CAPTCHA. A CAPTCHA defeats an Adversary with oracle
> access to a human if it detects the human is being "used" by the
> Adversary, but this is hard to differentiate from an actual cooperating
> human responding as he should to the CAPTCHA's posed problem. (A kind of
> neat problem in my opinion, especially if it can be proved that there is
> no defense against an Adversary with oracle human access.)

In case that a CAPTCHA (without human aid) could detect
whether a human is being involved, that would have meant
that a machine could have passed the Turing test, I would
think. Right?

M. K. Shen

John A. Malley

9 Feb 2004 16:05:47
Mok-Kong Shen wrote:

>
> "John A. Malley" wrote:
[...]


>>>
>>No, the situation is either an Adversary program attacking the CAPTCHA
>>or an Adversary program with oracle access to a human who unwittingly
>>assists the Adversary attacking the CAPTCHA by solving the CAPTCHA's
>>problem instance for the Adversary.
>>
>>A human doesn't need machine assistance to tackle a CAPTCHA. A CAPTCHA
>>presents an instance of a cognitive problem "easy" for a human to solve
>>just because he's human. No special tools, no sweat, just look, point,
>>and click. (Finally we get points for just existing. Oh wait - nope,
>>that's not right, I forgot about the Nielsen ratings...)
>>
>
> But this is just 2 sides of the same coin, isn't it?
> For an 'outsider', one sees only a human-machine pair
> that is at work, right? Anyway, it's a question of
> how much work is done by the machine and how much
> by the human. It's conceivable that in one extreme
> almost all work is done by the human, isn't it? I
> don't think this terminological issue is essential.


I do. A CAPTCHA poses problems that a human can solve quickly but not a
program. A CAPTCHA is a type of Turing Test.

There is no question of how much work is done by a machine and by a
human to solve a CAPTCHA-posed problem. It's like this:

A human by definition can solve the CAPTCHA's query without any (that is
zero) assistance.

An Adversary program tries to solve the CAPTCHA-posed problem with an
algorithm.

An Adversary program with oracle access to a human passes the query to
the human and gets the answer from the human, and then carries on
subverting whatever it was the CAPTCHA existed to protect. The Adversary
and the human do not need to work together to solve the problem posed by
the CAPTCHA. There is no amount of joint effort between human and
Adversary to solve the CAPTCHA-posed problem. Every CAPTCHA-posed
problem can be solved by a human by himself and never requires any
machine assistance. There is no "question of how much work is done by
the machine and how much by the human". There is no "flip side" to an
Adversary with a human oracle to crack the CAPTCHA.

>> From the viewpoint of algorithms/computational complexity, I think it's
>>a safe bet to assert any Adversary with oracle access to a human will
>>always defeat a CAPTCHA. A CAPTCHA defeats an Adversary with oracle
>>access to a human if it detects the human is being "used" by the
>>Adversary, but this is hard to differentiate from an actual cooperating
>>human responding as he should to the CAPTCHA's posed problem. (A kind of
>>neat problem in my opinion, especially if it can be proved that there is
>>no defense against an Adversary with oracle human access.)
>>
>
> In case that a CAPTCHA (without human aid) could detect
> whether a human is being involved, that would have meant
> that a machine could have passed the Turing test, I would
> think. Right?
>


The CAPTCHA _is_ a type of Turing Test. It needs no human aid; it
presents instances of a cognitive problem chosen uniformly at random
from a set of such problem instances. The problem is chosen such that it
is "hard" for an AI to solve it but "easy" for a human to solve it.
Based on that assumption, repeated correct answers to CAPTCHA-posed
questions are interpreted as human responses to the CAPTCHA, so the
CAPTCHA decides the responder is a human.

An Adversary using oracle access to a human succeeds as well as a human,
since in both cases a human answers the question - indirectly or directly.

The harder problem to solve is a CAPTCHA that can decide between a human
and an Adversary with oracle access to a human. In this scenario, the
CAPTCHA could (possibly) rely on the observed characteristics of the
response (like time to respond, etc.) to try to spot the Adversary using
oracle access to a human from a true human response, but I'm not sure
this can be done. It's (IMO) an interesting problem.
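
To make the "time to respond" idea concrete, here is a deliberately naive Python sketch of a latency check. Everything in it is an assumption for illustration: the `Response` record, the `looks_relayed` name, and especially the 30-second threshold, which is an arbitrary placeholder and not a measured figure.

```python
from dataclasses import dataclass


@dataclass
class Response:
    correct: bool           # did the answer match the posed instance?
    latency_seconds: float  # time between posing the instance and the answer


def looks_relayed(response: Response, max_direct_latency: float = 30.0) -> bool:
    """Flag a correct answer that took longer than the assumed direct-human limit."""
    return response.correct and response.latency_seconds > max_direct_latency


print(looks_relayed(Response(correct=True, latency_seconds=5.0)))   # False
print(looks_relayed(Response(correct=True, latency_seconds=90.0)))  # True
```

The weakness is visible straight away: a slow but legitimate human gets flagged, and a relay with enough humans online answers inside the limit, which is exactly what the original report says the spammers manage. So this shows what such a check would look like, not that it works.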


John A. Malley
10266...@compuserve.com

Mok-Kong Shen

9 Feb 2004 17:07:38

O.k. There are three cases: (1) CAPTCHA (machine alone),
(2) human, (3) CAPTCHA with human. (3) is at least
as capable in passing the tests as (2), isn't it?
Do you think that (3) is more capable of passing the
tests? Depending on your answer, we might 'eventually'
discuss on this, in case we don't have the same opinion.

On the other hand, a (in my view unessential) point
till now is whether (3) is called CAPTCHA with human
oracle or human with machine assistance. I said these
two views are two sides of the same coin, while you
apparently didn't agree. Sorry that I don't yet clearly
see how you have refuted my point above. (Note that,
referring to what you said, the 'assistance' could
consist in the CAPTCHA's passing the question to human
and routing his answers to the questioner.)

Here you said that the harder problem to solve is a
CAPTCHA that can decide between (2) and (3) of what
I listed above. I would think at the moment that
this harder problem is not solvable. But it would be
better that this issue be delayed until my question
mentioned above (i.e. whether (3) is more capable of
passing the test than (2)) is answered by you and
appropriately discussed, for that might eventually
show that my thought here is wrong from the outset.

M. K. Shen

Mok-Kong Shen

9 Feb 2004 17:39:25

Mok-Kong Shen wrote:
> [...]
>
> O.k. There are three cases: (1) CAPTCHA (machine alone),
> (2) human, (3) CAPTCHA with human. (3) is at least
> as capable in passing the tests as (2), isn't it?
> Do you think that (3) is more capable of passing the
> tests? Depending on your answer, we might 'eventually'
> discuss on this, in case we don't have the same opinion.

Sorry, I wrote 'entirely' wrongly. Read: The adversary
is: (1) machine alone, (2) human, (3) machine with human.

Neglect also the paragraph immediately following the
above in the previous post.

M. K. Shen

John A. Malley

9 Feb 2004 19:17:20
Mok-Kong Shen wrote:

>
> "John A. Malley" wrote:

>>
>>An Adversary program with oracle access to a human passes the query to
>>the human and gets the answer from the human, and then carries on
>>subverting whatever it was the CAPTCHA existed to protect. The Adversary
>>and the human do not need to work together to solve the problem posed by
>>the CAPTCHA. There is no amount of joint effort between human and
>>Adversary to solve the CAPTCHA-posed problem. Every CAPTCHA-posed
>>problem can be solved by a human by himself and never requires any
>>machine assistance. There is no "question of how much work is done by
>>the machine and how much by the human". There is no "flip side" to an
>>Adversary with a human oracle to crack the CAPTCHA.
>>
>
> O.k. There are three cases: (1) CAPTCHA (machine alone),
> (2) human, (3) CAPTCHA with human. (3) is at least
> as capable in passing the tests as (2), isn't it?


Ah, I see your confusion.

The CAPTCHA is a stand-alone program that quizzes someone to see if that
someone is a human. Your three cases are mixing up the CAPTCHA with the
Adversary.

Let me explain it this way:

The CAPTCHA is a program that chooses an instance of a "hard-AI" problem
at random from a set of such instances V and presents it as a question
to a prover. The prover gives an answer, and the CAPTCHA verifies the
answer. The probability that a human can answer an instance of V is very
close to 1 over all of V. On the other hand, it is very hard to write a
computer program (an _Adversary_) that can answer an instance of V with
probability close to 1 over all of V. Therefore, if the CAPTCHA gives
multiple instances of V to a prover, and the prover answers correctly on
all those trials, then it is almost certain that the prover is a human.
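The repeated-trial argument above can be sketched numerically. This is only an illustration: it assumes the trials are independent, and the per-trial success rates (0.99 for a human, 0.10 for a program) are made-up figures, not values from the CAPTCHA paper or this thread.

```python
def pass_all_probability(per_trial_success: float, trials: int) -> float:
    """Probability of answering every one of `trials` independent instances."""
    if not 0.0 <= per_trial_success <= 1.0:
        raise ValueError("per_trial_success must be in [0, 1]")
    if trials < 0:
        raise ValueError("trials must be non-negative")
    return per_trial_success ** trials


# Hypothetical per-trial success rates, chosen only for illustration.
HUMAN_RATE = 0.99
PROGRAM_RATE = 0.10

if __name__ == "__main__":
    for k in (1, 3, 5):
        print(k,
              pass_all_probability(HUMAN_RATE, k),
              pass_all_probability(PROGRAM_RATE, k))
```

Under these assumptions the human's chance of passing every trial stays high as the number of trials grows, while the program's chance shrinks geometrically.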

An Adversary is a computer program playing the role of prover.

I then added this twist to the Adversary - give it oracle access to a
human. Now the situation is this:

The CAPTCHA chooses an instance of V and gives it to the prover, in this
case a computer program (Adversary) that has oracle access to a human.
This means the prover (program = Adversary ) redirects the instance of V
to a human who unwittingly solves the problem and gives the answer to
the prover (program = Adversary), and the prover then gives the answer
to the CAPTCHA. The CAPTCHA verifies the answers are correct, as it
should, since the answers (indirectly) came from a human!
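A toy sketch of the relay described above, under heavy simplification: the "instance" is a plain string standing in for a distorted image, and the names `Captcha`, `human`, `program_alone`, `relay_adversary` and `run` are hypothetical, introduced here only for illustration.

```python
import random


class Captcha:
    """Toy verifier. The instance is a string and the correct answer is the
    same string; a real CAPTCHA would render it as a distorted image."""

    def __init__(self, instances, rng=None):
        self._instances = list(instances)
        self._rng = rng or random.Random()
        self._pending = None

    def challenge(self):
        """Pick an instance at random and remember it for verification."""
        self._pending = self._rng.choice(self._instances)
        return self._pending

    def verify(self, answer):
        """Check the answer against the pending instance (one attempt only)."""
        ok = self._pending is not None and answer == self._pending
        self._pending = None
        return ok


def human(instance):
    """Stand-in for a person reading the distorted image correctly."""
    return instance


def program_alone(instance):
    """Stand-in for a program that cannot solve the hard-AI instance."""
    return ""


def relay_adversary(oracle):
    """Adversary with oracle access: forwards the instance, relays the answer."""
    def prover(instance):
        return oracle(instance)
    return prover


def run(captcha, prover):
    """One verifier-prover round."""
    return captcha.verify(prover(captcha.challenge()))
```

In this sketch the verifier receives exactly the same answer from `human` and from `relay_adversary(human)`, so nothing in the answer itself separates the two provers.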


> Do you think that (3) is more capable of passing the
> tests? Depending on your answer, we might 'eventually'
> discuss on this, in case we don't have the same opinion.


A CAPTCHA doesn't try to pass its own tests. It gives a test to someone
(the prover) and then checks that someone's response to see if that
someone is human or not human, based on their response. The role of the
CAPTCHA is the verifier.

The CAPTCHA acts as a gatekeeper to something of value. If the CAPTCHA
verifies the prover is human, it gives the human access to the something
of value. Otherwise it denies the prover access to the something of value.


>
> On the other hand, a (in my view unessential) point
> till now is whether (3) is called CAPTCHA with human
> oracle or human with machine assistance. I said these
> two views are two sides of the same coin, while you
> apparently didn't agree. Sorry that I don't yet clearly
> see how you have refuted my point above. (Note that,
> referring to what you said, the 'assistance' could
> consist in the CAPTCHA's passing the question to human
> and routing his answers to the questioner.)


Here again you've confused the CAPTCHA with the Adversary. The CAPTCHA
_presents_ a question to a prover, the prover gives the answer to the
CAPTCHA which then verifies the answer. The CAPTCHA is an automated
Turing Test. The CAPTCHA is not using oracle access to a human.


[...]

>>
>>The harder problem to solve is a CAPTCHA that can decide between a human
>>and an Adversary with oracle access to a human. In this scenario, the
>>CAPTCHA could (possibly) rely on the observed characteristics of the
>>response (like time to respond, etc.) to try to spot the Adversary using
>>oracle access to a human from a true human response, but I'm not sure
>>this can be done. It's (IMO) an interesting problem.
>>
>
> Here you said that the harder problem to solve is a
> CAPTCHA that can decide between (2) and (3) of what
> I listed above. I would think at the moment that
> this harder problem is not solvable. But it would be
> better that this issue be delayed until my question
> mentioned above (i.e. whether (3) is more capable of
> passing the test than (2)) is answered by you and
> appropriately discussed, for that might eventually
> show that my thought here is wrong from the outset.

Yes it was.

John A. Malley
10266...@compuserve.com


John A. Malley

9 Feb 2004 19:42:51
Nicol So wrote:


>
> So it is "specto, ergo sum" now? :)
>

Aye.


John A. Malley
10266...@compuserve.com

John A. Malley

9 Feb 2004 20:46:30
Mok-Kong Shen wrote:

One more clarification. An Adversary is a prover that is either (1)
machine (or computer program) alone or (2) machine (or computer program)
with oracle access to a human.

A human prover is not an Adversary. The human prover is normally
accepted by the CAPTCHA.

John A. Malley
10266...@compuserve.com

Michael Amling

9 Feb 2004 23:38:14
John A. Malley wrote:
> The harder problem to solve is a CAPTCHA that can decide between a human
> and an Adversary with oracle access to a human. In this scenario, the
> CAPTCHA could (possibly) rely on the observed characteristics of the
> response (like time to respond, etc.) to try to spot the Adversary using
> oracle access to a human from a true human response, but I'm not sure
> this can be done. It's (IMO) an interesting problem.

It helps the Adversary to know what to expect from the CAPTCHA-using
server (the Protagonist?). E.g. it will be a .gif next to an input
field. Then the Adversary can send the .gif to the Oracle, embedded in
HTML that provides the input field, and relay back the content of the
input field.
The protagonist can make this part of the Adversary's job harder by
providing, say, a Java Applet that decides at run time what to display
and what input fields, buttons, drop down menus, etc., to have.

--Mike Amling

John A. Malley

10 Feb 2004 2:54:05
Michael Amling wrote:


This looks to be a different approach to the problem - taking measures
to prevent the Adversary from gaining oracle access to an (unwitting)
human.

There is a way (I think - someone correct me if I missed something
critical in this post) to make the oracle access to a human unpalatable
for the human. Assume the Adversary needs to dupe a human into helping
it by rewarding the human (with porno, for example). We need to take
away the reward!

The CAPTCHA selects a URL at random from a set of URLs under its control
and presents it in the form of a distorted image. A human prover must
manually enter the URL into his web browser and then go to that page.
The URL may resolve to a new, different problem from the same CAPTCHA
that must be solved to complete the transaction so the human can prove
he's a human and gain access to whatever it is the CAPTCHA guards. (We
can repeat this several times, too.) A human interested in what the
CAPTCHA guards will follow through and complete the transaction.
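A minimal sketch of the URL-trail idea, assuming a verifier that issues one-time random paths and counts how many the prover actually visits. The class name `UrlTrail`, the `/c/` path prefix, and the token length are hypothetical choices; expiry after a "brief duration" and the distorted-image rendering are not modeled.

```python
import secrets


class UrlTrail:
    """Toy verifier that hands out a chain of one-time URL paths."""

    def __init__(self, hops: int):
        if hops < 1:
            raise ValueError("hops must be at least 1")
        self._remaining = hops
        self._expected = None
        self.passed = False

    def next_challenge(self) -> str:
        """Return the path that would be shown to the prover as a distorted image."""
        self._expected = "/c/" + secrets.token_urlsafe(8)
        return self._expected

    def visit(self, path: str) -> bool:
        """Record a request for `path`; accept it only if it is the pending one."""
        expected, self._expected = self._expected, None  # each path is single-use
        if expected is None or path != expected:
            return False
        self._remaining -= 1
        if self._remaining == 0:
            self.passed = True
        return True
```

A prover passes only after visiting every path in the chain, which is the extra effort the post expects a duped human to abandon.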

An Adversary program that passes CAPTCHA questions to humans, accepts
their answers, and rewards them (with a page of porno) suffers with this
change. The human must leave the site and its reward (the porno) and go
to another web page! Perhaps even go through a trail of web pages,
dynamically constructed and of brief duration. The Adversary can't read
the URL to figure out where the human must go. (I won't rule out some
information leakage from browser history and cache if the Adversary can
get to it if/when the human returns to the Adversary's site.) The reward
is diminished if not extinguished.

This defense doesn't work if the Adversary is a gateway between the
human and the rest of the Internet, since every packet from the human's
machine is then known to the Adversary, so it always knows where the
human is going on the Internet.


John A. Malley
10266...@compuserve.com

Michael Amling

10 Feb 2004 8:48:07

Hmm... If the CAPTCHA image said "To create asdf...@hotmail.com,
type such-and-such", the Oracle would catch on to the fact he's being
manipulated (and might not care).

>
> The CAPTCHA selects a URL at random from a set of URLs under its control
> and presents it in the form of a distorted image. A human prover must
> manually enter the URL into his web browser and then go to that page.

Right there, you've restricted the Adversary. The Oracle will open
the page from his computer, not the Adversary's. The Protagonist may be
able to distinguish the two.

> The URL may resolve to a new, different problem from the same CAPTCHA
> that must be solved to complete the transaction so the human can prove
> he's a human and gain access to whatever it is the CAPTCHA guards. (We
> can repeat this several times, too.) A human interested in what the
> CAPTCHA guards will follow through and complete the transaction.
>
> An Adversary program that passes CAPTCHA questions to humans, accepts
> their answers, and rewards them (with a page of porno) suffers with this
> change. The human must leave the site and its reward (the porno) and go
> to another web page!

Go to another web page, yes. Leave the site, not necessarily. The
Oracle could just open up a new window or a new tab.

> Perhaps even go through a trail of web pages,
> dynamically constructed and of brief duration. The Adversary can't read
> the URL to figure out where the human must go. (I won't rule out some
> information leakage from browser history and cache if the Adversary can
> get to it if/when the human returns to the Adversary's site.) The reward
> is diminished if not extinguished.
>
> This defense doesn't work if the Adversary is a gateway between the
> human and the rest of the Internet, since every packet from the human's
> machine is then known to the Adversary, so it always knows where the
> human is going on the Internet.

That could happen, if the Adversary offers web anonymization. The
Adversary would depend on the Oracle opening the CAPTCHA's URL in the
anonymizer.

--Mike Amling

Mok-Kong Shen

10 Feb 2004 9:25:54

"John A. Malley" wrote:
>
> Mok-Kong Shen wrote:
>

> >
> > Sorry, I wrote 'entirely' wrongly. Read: The adversary
> > is: (1) machine alone, (2) human, (3) machine with human.
> >
> > Neglect also the paragraph immediately following the
> > above in the previous post.

>

> One more clarification. An Adversary is a prover that is either (1)
> machine (or computer program) alone or (2) machine (or computer program)
> with oracle access to a human.
>
> A human prover is not an Adversary. The human prover is normally
> accepted by the CAPTCHA.

In (2), the machine could choose simply to do nothing
of itself, i.e. pass the problem/question to the human
and route his answers back. So what would be the
difference between this and a human prover (alone)?
Thanks.

M. K. Shen

John A. Malley

10 Feb 2004 16:41:11
Mok-Kong Shen wrote:

>
> "John A. Malley" wrote:
[...]
>

>>One more clarification. An Adversary is a prover that is either (1)
>>machine (or computer program) alone or (2) machine (or computer program)
>>with oracle access to a human.
>>
>>A human prover is not an Adversary. The human prover is normally
>>accepted by the CAPTCHA.
>>
>
> In (2), the machine could choose simply to do nothing
> of itself, i.e. pass the problem/question to the human
> and route his answers back. So what would be the
> difference between this and a human prover (alone)?
> Thanks.

That's what I've been talking about in my posts in this thread. That is
an Adversary with oracle access to a human. The program passes the
problem instance from the CAPTCHA to the human and gets the human to
answer it for the program. See my other posts and read them again in
this new light.

John A. Malley
10266...@compuserve.com


Mok-Kong Shen

11 Feb 2004 6:58:35

But you seem to have ignored my question: What IS
the 'difference' between this kind of processing and
the case of a human alone (in what concerns the result
or decision of CAPTCHA)? I can yet see no difference
at all.

M. K. Shen

John A. Malley

11 Feb 2004 12:16:10
Mok-Kong Shen wrote:

>
> "John A. Malley" wrote:
[...]

>>>In (2), the machine could choose simply to do nothing
>>>of itself, i.e. pass the problem/question to the human
>>>and route his answers back. So what would be the
>>>difference between this and a human prover (alone)?
>>>Thanks.
>>>
>>
>>That's what I've been talking about in my posts in this thread. That is
>>an Adversary with oracle access to a human. The program passes the
>>problem instance from the CAPTCHA to the human and gets the human to
>>answer it for the program. See my other posts and read them again in
>>this new light.
>>
>
> But you seem to have ignored my question: What IS
> the 'difference' between this kind of processing and
> the case of a human alone (in what concerns the result
> or decision of CAPTCHA)? I can yet see no difference
> at all.
>


I said way back in this thread,

"(I think every CAPTCHA falls to any Adversary with oracle access to a
human when considered in the computational-complexity/computational
effort sense, but maybe a CAPTCHA augmented with physical information
can distinguish a human from an Adversary with access to a human oracle?
Something along the lines of applying physically observable cryptography
to the problem {http://eprint.iacr.org/2003/120/} )."

and later on,

"The harder problem to solve is a CAPTCHA that can decide between a
human and an Adversary with oracle access to a human. In this scenario,
the CAPTCHA could (possibly) rely on the observed characteristics of the
response (like time to respond, etc.) to try to spot the Adversary using
oracle access to a human from a true human response, but I'm not sure
this can be done. It's (IMO) an interesting problem."

My hypothesis is that a CAPTCHA must rely on the physically observable
characteristics of the prover in order to distinguish an Adversary with
oracle access to a human from a human. What those characteristics are,
and how indicative they are of one situation versus the other, is the
subject of further thought.

The time it takes a human versus an Adversary with oracle access to a
human to answer the verifier should be different. Where the answers come
from (if location can be measured) could be different. There may be
other physical characteristics to look at.

John A. Malley
10266...@compuserve.com

Mok-Kong Shen

11 Feb 2004 16:01:51

It all depends on the task/questions that CAPTCHA
poses. If these are designed to differentiate between
machine's and human's capability to solve, then
human alone couldn't differ in performance from machine
plus human where the machine simply does nothing but
relays the stuff to the human. On the other hand, there
is a way to differentiate between human alone and
machine plus human. One first poses a task that
could show whether machine alone is the case (the
CAPTCHA as described in the literature). If one
knows that it's not the case, one then poses another
task of the nature that human alone couldn't solve,
e.g. factoring the product of two sufficiently large
primes.

M. K. Shen

John A. Malley

11 Feb 2004 16:49:59
Mok-Kong Shen wrote:


No, that totally misses the definition of a CAPTCHA.

A CAPTCHA _never_ poses a question that only a machine or program could
solve and not a human! See the referenced paper on CAPTCHA back in the
beginning of this thread.

The CAPTCHA by definition poses questions that a human can easily answer
and that a program cannot easily answer. The CAPTCHA is intended to deal
with humans to grant them access to some protected item or behavior. A
CAPTCHA by definition is not going to pose a question that only a
program could answer.

The problem I am addressing is the Adversary to a CAPTCHA, as it is
defined, given oracle access to a human, versus the normal human
response to a CAPTCHA's question.


> On the other hand, there
> is a way to differentiate between human alone and
> machine plus human. One first poses a task that
> could show whether machine alone is the case (the
> CAPTCHA as described in the literature). If one
> knows that it's not the case, one then poses another
> task of the nature that human alone couldn't solve,
> e.g. factoring the product of two sufficiently large
> primes.


Again, this does not fit the definition of the CAPTCHA or the problem
that I posed.

We are at an end here.

John A. Malley
10266...@compuserve.com

Mok-Kong Shen

11 Feb 2004 17:04:11

That is the CAPTCHA as (currently) defined, and as such
is also (by definition) intended to differentiate
between machine and human. If you want to further
differentiate between human and machine plus human,
then you have to extend that definition. Otherwise
that further distinction is simply not possible.
(I also don't understand why you want to have such
further distinction. Could you please tell?)

>
> The CAPTCHA by definition poses questions that a human can easily answer
> and that a program cannot easily answer. The CAPTCHA is intended to deal
> with humans to grant them access to some protected item or behavior. A
> CAPTCHA by definition is not going to pose a question that only a
> program could answer.
>
> The problem I am addressing is the Adversary to a CAPTCHA, as it is
> defined, given oracle access to a human, versus the normal human
> response to a CAPTCHA's question.

See above. (What's the motivation of your problem?)

>
> > On the other hand, there
> > is a way to differentiate between human alone and
> > machine plus human. One first poses a task that
> > could show whether machine alone is the case (the
> > CAPTCHA as described in the literature). If one
> > knows that it's not the case, one then poses another
> > task of the nature that human alone couldn't solve,
> > e.g. factoring the product of two sufficiently large
> > primes.
>
> Again, this does not fit the definition of the CAPTCHA or the problem
> that I posed.

See again above.

M. K. Shen

John A. Malley

11 Feb 2004 22:54:22

No. That has not been proved. No one has shown that.
I post only to correct your assertion. Proving it can't be done, or that
it can be done using other information, is the problem I consider.

I explained that physical observations of time and location with respect
to the Adversary with oracle access to a human are possible. In a thread
some time ago a CAPTCHA in series with the echo protocol showed how to
(in theory) show a person is where he says he is to differentiate from a
program in the vicinity using a remote human to solve the problem from a
CAPTCHA (another example of a CAPTCHA Adversary with oracle access to a
human.)


> (I also don't understand why you want to have such
> further distinction. Could you please tell?)

Paul Rubin's original post is the motivation. Look again at the problem
he pointed out with CAPTCHAs and Adversaries with oracle access to
humans.

>
> >
> > The CAPTCHA by definition poses questions that a human can easily answer
> > and that a program cannot easily answer. The CAPTCHA is intended to deal
> > with humans to grant them access to some protected item or behavior. A
> > CAPTCHA by definition is not going to pose a question that only a
> > program could answer.
> >
> > The problem I am addressing is the Adversary to a CAPTCHA, as it is
> > defined, given oracle access to a human, versus the normal human
> > response to a CAPTCHA's question.
>
> See above. (What's the motivation of your problem?)

Paul Rubin's post, generalizing the problem, and looking at what is
achievable with respect to the definition of a CAPTCHA.

>
> >
> > > On the other hand, there
> > > is a way to differentiate between human alone and
> > > machine plus human. One first poses a task that
> > > could show whether machine alone is the case (the
> > > CAPTCHA as described in the literature). If one
> > > knows that it's not the case, one then poses another
> > > task of the nature that human alone couldn't solve,
> > > e.g. factoring the product of two sufficiently large
> > > primes.
> >
> > Again, this does not fit the definition of the CAPTCHA or the problem
> > that I posed.
>

I think this final post clarifies what I am interested in and what
possible avenues there are for further analysis. I'm working now on how
to apply a "cost function" to the oracle access (in an effort to capture
what Mike Amling said in another branch) where the human interested in
gaining access to what the CAPTCHA protects is willing to pay the cost
of doing what the CAPTCHA says (like going to a new URL) versus an
unwitting human duped into answering the CAPTCHA question in return for
a reward (like porno). I have a hunch that game theory, maybe even
Cake-Cutting Algorithms, have a role here in distinguishing or
protecting against an Adversary with oracle access to humans.

Our thread here is ended. There is no more to discuss.

Mok-Kong Shen

12 Feb 2004 4:45:34

So let me give a proof. It's trivial! CAPTCHA poses
one and the same question to (1) human (alone), (2)
machine plus human. In (2), according to you, the
machine simply passes the question to human and
routes his answer back. So CAPTCHA gets the 'same'
answer in both cases. With what could CAPTCHA 'ever'
differentiate between the two cases?? I might have
interpreted some sentences of yours above incorrectly,
but you seem to mean that there could be a reaction
time difference between (1) and (2) that CAPTCHA
could observe and exploit. If so, then you are
entirely wrong. For a human may answer in 1/4 minute
or 1/2 minute, depending on his ('non-deterministic'!)
current mode (i.e. his physical and emotional conditions),
but the routing of the machine is done in negligible
(additional) time. Further one human may react faster,
another may react slower. Do you see the point?
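The timing claim above can be put in rough numbers. This sketch assumes human response times are uniformly distributed between 15 and 30 seconds (taken from the "1/4 minute or 1/2 minute" figures) and that the relay adds a fixed delay; both the uniform model and the 0.05 s delay are illustrative assumptions, and the sketch does not cover a relay that has to wait for a human to become available.

```python
def best_guess_accuracy(low: float, high: float, relay_delay: float) -> float:
    """Best achievable accuracy, with equal priors, for telling a response
    time drawn from Uniform[low, high] apart from one drawn from the same
    distribution shifted later by `relay_delay`.

    The total variation distance between the two uniforms is
    relay_delay / (high - low), capped at 1, and the optimal guesser is
    correct with probability 0.5 * (1 + distance).
    """
    width = high - low
    if width <= 0:
        raise ValueError("high must be greater than low")
    if relay_delay < 0:
        raise ValueError("relay_delay must be non-negative")
    distance = min(relay_delay / width, 1.0)
    return 0.5 + 0.5 * distance


if __name__ == "__main__":
    # Hypothetical relay overhead of 50 ms against a 15 s spread in human times.
    print(best_guess_accuracy(15.0, 30.0, 0.05))
```

Under these assumptions a timing-only test does barely better than a coin flip; a relay delay comparable to the spread in human response times would change that.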

This very clearly refutes your arguments. Now, if your
goal is only to find out whether a machine alone is
the case (in other words there is no human, which
is what CAPTCHA is designed to detect, cf. the
issue of automatic registration), then the current
CAPTCHA is sufficient/adequate. If you want to go
further to differentiate between human and human plus
machine, then it is clear from the outset that you
have to pose questions that human (alone) can't answer
but human plus machine can. (This is in fact entirely
analogous to the original CAPTCHA situation for
differentiating between machine and human where it
has to pose questions that machine can't answer but
human can.) That's why I gave previously the example
of factoring to show that possibility in practice.

M. K. Shen
--------------------------------------
http://home.t-online.de/home/mok-kong.shen

John A. Malley

12 Feb 2004 13:32:42
Mok-Kong Shen wrote:

>
> "John A. Malley" wrote:


We need to correct factual errors here.

First, if you read my posts in this thread, you'll see that I already
presented an algorithm that distinguishes (with some probability) a
human prover from an "Adversary with oracle access to a human" (AWOATAH)
prover.

The CAPTCHA sends a distorted graphic of a URL to which the human must
go to continue the prover-verifier interaction with the CAPTCHA. A human
duped into acting as an oracle with the reward of porno for answering
now must manually enter the URL into his web browser to bring up that
new page. And he may need to do this repeatedly (per the Gap Theorem in
the CAPTCHA paper cited way back in this thread.)

A duped human will not necessarily put up with this kind of interaction.
There's a probability p per each interaction that he will give up and
abandon the transaction because his reward is dwindling in value. A
human who's genuinely interested in completing the transaction in order
to get whatever the CAPTCHA normally guards, like an email account, will
put up with these repeated tests at different URLs. (Other readers, if
there are other readers of this thread, may now recognize my interest in
game theory as part of a general model for the AWOATAH attack.)

This algorithm forces the human to go to different "locations", to get
away from working through the AWOATAH. If the human is not willing to go
to those "locations" then the algorithm assumes the human is being used
by an AWOATAH.

Note that a human can always answer the question, whether asked directly
or when used as an oracle. The difference is a physical observation of the
willingness of the human to carry out the number of verifier-prover
steps needed to ascertain there's no AWOATAH attacking the CAPTCHA. The
algorithm is probabilistic, too, in that there is some nonzero probability
that a duped human will do all these steps, but that probability should be
made small by the number of interactions required.
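A rough sketch of how the number of interactions drives down the chance that a duped human finishes the whole trail. It assumes the human quits independently at each step with the same probability; the function names are hypothetical, and the sketch ignores that genuine users may also give up.

```python
def completion_probability(quit_prob: float, steps: int) -> float:
    """Chance that a prover who quits each step with `quit_prob` finishes all steps."""
    if not 0.0 <= quit_prob <= 1.0:
        raise ValueError("quit_prob must be in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return (1.0 - quit_prob) ** steps


def steps_needed(quit_prob: float, target: float) -> int:
    """Smallest number of steps that pushes the completion chance to `target` or below."""
    if not 0.0 < quit_prob <= 1.0:
        raise ValueError("quit_prob must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    steps, chance = 0, 1.0
    while chance > target:
        chance *= 1.0 - quit_prob
        steps += 1
    return steps
```

For example, with an assumed per-step quit probability of 0.5, three steps leave a completion chance of 0.125.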

Second, the assertion that "there exists a case C where physical
observation of the AWOATAH versus a human cannot distinguish between the
two, therefore, there are no physical observations of an AWOATAH versus
a human that can distinguish the two" is false. One must show that all
possible ways to physically observe AWOATAH responses are identical to
the single case C to generalize. That has not been shown (and the above
algorithm demonstrates your assertion is not true.)

Third, I did not give any argument that physical observation of timing,
location, or other physically observable attributes of the AWOATAH
available to the CAPTCHA will always work. I said this is an avenue for
investigation and further work, an open problem.

Fourth, the statement of the problem that (IMO) I find interesting
stands on the definition of the CAPTCHA. I am exploring the
possibilities of protecting against AWOATAHs using CAPTCHAs as defined.
Feel free to go off in other directions. If you want to assert there is
no merit in my interest in modeling what is achievable when protecting
CAPTCHAs against AWOATAHs, that is your opinion. My opinion is otherwise.


Mok-Kong Shen

13 Feb 2004 4:20:17

"John A. Malley" wrote:
>
> Mok-Kong Shen wrote:
>

If I understand correctly, an essential point of yours
is that the human has a limited (small) patience
in doing work (getting bored/tired, giving up). But
couldn't a machine somehow simulate that (after all,
a sort of Turing test is concerned here)? Wouldn't
that make the distinction between human and human
plus machine infeasible? Thanks.

M. K. Shen
------------------------------------
http://home.t-online.de/home/mok-kong.shen

d...@florence.edu

13 Feb 2004 10:10:57

Fortunately for Microsoft, generally available optical character
recognition software is limited in its capability. Auto scanning of
books, for example, might allow stealing of intellectual property. As a
result scanning and OCR software has historically been somewhat
inaccurate and usually required a lot of human intervention.

There is usually a logistical bottleneck in any criminal scheme that
would allow a simple exception report to highlight abusers. Actually
it is surprising that given the state of the art in profiling
technology more criminal activity is not caught. Look at the success
of the pyramid schemes and timing attacks used on Wall Street for
example.

Paul Rubin

13 Feb 2004 11:46:25
d...@Florence.edu writes:
> Fortunately for Microsoft, generally available optical character
> recognition software is limited in its capability. Auto scanning of
> books, for example, might allow stealing of intellectual property. As a
> result scanning and OCR software has historically been somewhat
> inaccurate and usually required a lot of human intervention.

Hate to tell you, but OCR'ing printed books generally works pretty
well, and it's done all the time. And before OCR'ing was feasible,
printed books got photocopied all the time. OCR'ing handwriting is
more difficult.

d...@florence.edu

13 Feb 2004 17:29:20
On 13 Feb 2004 08:46:25 -0800, Paul Rubin
<http://phr...@NOSPAM.invalid> wrote:

Back in the Win 98 days I used to set up scanners and OCR to be used
as reading machines for blind people. The OCR was advertised as being
able to read 99 percent of printed material; however, in practice the
success rate was somewhat less.

With consumer grade software there was no way you could scan a book to
disk without proof reading for errors and rescanning some pages.

John A. Malley

13 Feb 2004 18:29:46
Mok-Kong Shen wrote:

[...]


>>
>
> If I understand correctly, an essential point of yours
> is that the human has a limited (small) patience
> in doing work (getting bored/tired, giving up). But
> couldn't a machine somehow simulate that (after all,
> a sort of Turing test is concerned here)? Wouldn't
> that make the distinction between human and human
> plus machine infeasible? Thanks.

No, you misunderstood. My previous post lists my essential points as the
second, third and fourth points in that post. I have not asserted that
the existence of defenses against AWOATAH depends on limited patience
with repetitive tasks. (Oh, the irony, look at this thread!...)

Now for the specific question about the example defense against an
AWOATAH, you asked, "But couldn't a machine somehow simulate that
(after all, a sort of Turing test is concerned here)? Wouldn't that make
the distinction between human and human plus machine infeasible?"

First, that CAPTCHA defense against an AWOATAH expects the duped human
will not want to continuously leave the (porn) reward. The duped human
will find this task more trouble than it's worth. This very response is
what the CAPTCHA is trying to detect and reject. Writing a program to
_simulate_ a human who doesn't cooperate and gets rejected by the
CAPTCHA does nothing to defeat the CAPTCHA! The CAPTCHA still wins. It
protects what it's supposed to protect. I think you did not understand
how the algorithm works, or the basic attack embodied by the AWOATAH?
Why would you want to write a program to simulate something that wants
to get caught and rejected by the CAPTCHA?

Second, the CAPTCHA in that algorithm presents a URL as a distorted
graphic as an instance of a hard-AI problem. To write a simulator of a
human duped into helping the Adversary requires one write a program that
_can_ understand the distorted URL graphics, go to the designated URLs,
and then stop doing so before the CAPTCHA finishes directing it to other
URLs. That requires an Adversary that cracks the hard-AI problem anyway
so why bother trying to simulate a duped human? If you have that
program, you can defeat the CAPTCHA. You have a conventional Adversary
(a program) for the CAPTCHA. But by definition the CAPTCHA's set of
hard-AI problem instances is difficult if not impossible to solve using
current algorithms. (See that paper I referenced way back in this thread.)



Paul Rubin

13 Feb 2004 22:18:52
d...@Florence.edu writes:
> With consumer grade software there was no way you could scan a book to
> disk without proof reading for errors and rescanning some pages.

So what? The latest Harry Potter book was circulating on the Internet
within about 2 days of its publication. Maybe it had some typos.
Nobody cared.

Bill Unruh

14 Feb 2004 2:50:30
"John A. Malley" <10266...@compuserve.com> writes:

]Mok-Kong Shen wrote:

][...]
]>>
]>
]> If I understand correctly, an essential point of yours
]> is that the human has a limited (small) patience
]> in doing work (getting bored/tired, giving up). But
]> couldn't a machine somehow simulate that (after all,
]> a sort of Turing test is concerned here)? Wouldn't
]> that make the distinction between human and human
]> plus machine infeasible? Thanks.

]No, you misunderstood. My previous post lists my essential points as the
]second, third and fourth points in that post. I have not asserted that
]the existence of defenses against AWOATAH depends on limited patience
]with repetitive tasks. (Oh, the irony, look at this thread!...)

]Now for the specific question about the example defense against an
]AWOATAH, you asked, "But couldn't a machine somehow simulate that
](after all, a sort of Turing test is concerned here)? Wouldn't that make
]the distinction between human and human plus machine infeasible?"

]First, that CAPTCHA defense against an AWOATAH expects the duped human
]will not want to continuously leave the (porn) reward. The duped human

Let's see, probably over 50% of humans would not want to spend any time with
the porn "reward". Does that make them non-human?

]will find this task more trouble than it's worth. This very response is

Mok-Kong Shen

14 Feb 2004 12:34:27

I suppose that there is misunderstanding between us.
Let me first say that I assume your 'machine with
human oracle' simply to be a 'combination' of a machine
(a software module) and a human. This combination has
the intention to jointly work optimally in the environment
given by CAPTCHA. Do you agree with me so far? If not,
we have to dispute in more detail right at this point
(i.e. it would not be necessary at the moment for you
to treat the lines that I am writing immediately below).
Now assuming that you agree with the above, the
CAPTCHA poses a problem (of whatever sort). The
man-machine combination jointly considers the 'best'
strategy to react. Since it is 'known' that (assuming
the kind of problems posed by the current types of
CAPTCHA, i.e. hard AI problems) a 'manifested'
behaviour of the 'presence' of a human alone is the right
strategy for dealing with CAPTCHA, the combination would
certainly (reasonably) decide that the machine does
'nothing' at all with the human doing (all) the work
alone (i.e. in practical situation the human could
just as well simply switch off that software module),
which is clearly 'equivalent' to the case that the
machine (the software module) is absent and the human
alone is there, isn't it? I must say I don't yet see
what's wrong with this 'general' description of the
situation. BTW, could we discuss in abstract terms
(independent of porn etc.), i.e. along the lines: The
CAPTCHA gives a reward R to every human that accesses
it but refuses to give R to (wants to exclude) machines
that are working 'alone' and accessing it 'automatically'?
In this situation (assuming that the questions of
CAPTCHA are formulated appropriately/effectively)
putting a machine on the side of the human (to
strengthen him) is evidently something redundant/futile
(for using the machine can't contribute to getting
more reward but could on the contrary be disadvantageous),
isn't it? From my view point, what I write above is
actually (almost) 'tautology'. So I can hardly understand
why you don't yet see my point.

Perhaps an analogy could also help: One could replace
'machine' with 'child' and 'human' with 'adult'.
Now one could rather easily design a questionnaire
offering a reward to distinguish a child's (poor) answer
from an adult's (good) answer, but it's impossible to
have a questionnaire that could distinguish an adult
alone from a pairing of child and adult (assuming that
the pair acts reasonably, i.e. desiring to obtain
the reward). Am I right?

John A. Malley

14 Feb 2004 13:48:11
Mok-Kong Shen wrote:

[...]


>>
>
> I suppose that there is misunderstanding between us.


No supposition here. I understand you want to change the nature of the
problem as it's been defined for the AWOATAH, consistent with the
definition of a CAPTCHA. And I explained why your repeated attempts to
change the problem are not consistent with the definition of a CAPTCHA.


> Let me first say that I assume your 'machine with
> human oracle' simply to be a 'combination' of a machine
> (a software module) and a human. This combination has
> the intention to jointly work optimally in the environment
> given by CAPTCHA. Do you agree with me so far?


No. There is no _joint_ effort between the Adversary and the human duped
into helping the Adversary to solve the hard AI problem instance from the
CAPTCHA under attack by the AWOATAH. The Adversary has oracle access to
a human - exactly as described in Paul Rubin's original post. The
Adversary does NOT work on solving the hard-AI problem with the oracle
accessed human, it merely passes the hard AI problem to the human and the
human solves the hard AI problem FOR THE Adversary.

By definition, there is no program that can solve the hard AI problem
used by the CAPTCHA, but the majority of humans can easily solve the
problem. The division of labor for solving the problem is human effort on
hard AI problem, 100%, program effort on hard AI problem, 0%.

To argue that there is a CAPTCHA-posed hard-AI problem that REQUIRES that
the human use a program to help solve the hard-AI problem is not
consistent with the definition of a CAPTCHA per the paper I pointed you
to far back in this thread.

That is the sticking point.

My explanation of the problem that I find interesting for
discussion/investigation remains in this thread, and my refutations of
your logical errors with respect to that model/problem as well.

This thread is going nowhere. We're done.

-----------------------------------------------------------------------

Paul Rubin

14 Feb 2004 21:13:49
"John A. Malley" <10266...@compuserve.com> writes:
> This thread is going nowhere. We're done.

That's the third or fourth time I remember you saying that. The only
way to stop responding to MKS is to stop responding to him. He will
never stop responding to you.


Mok-Kong Shen

15 Feb 2004 6:18:35

"John A. Malley" wrote

I am extremely surprised that you consist in refusing
to admit the truth of simple/evident fact. To put
in a nutshell, one has two blackboxes, in one there
is mechanism/algorithm A, in the other there is
mechanism/algorithm A and B, but B is entirely
inactive/dormant (or, as you said previously, B simply
accepts the input and passes it to A and relays the
result from A -- without any change! -- to the outside
world and does that relaying work at negligible time,
or, as you formulated above, B does 0% word). How could
one 'ever' find out the difference of content of the
two boxes by observing the input/output behaviour
'alone'? That's 'entirely' and trivially impossible,
isn't it???

M. K. Shen
--------------------------------------------

Was sich ueberhaupt sagen laesst, | What can be said at all can
laesst sich klar sagen; und wovon | be said clearly; and
man nicht reden kann, darueber | whereof one cannot speak
muss man schweigen. | thereof one must be silent.
|
Ludwig Wittgenstein | (Translation of C. K.
(1889 - 1951) | Ogden and F. Ramsey)

Mok-Kong Shen

15 Feb 2004 6:23:08

In scientific debates, the person who is wrong and
cannot argue further should be courageous enough to
admit that. (That's no shame, is it??)

M. K. Shen

Mok-Kong Shen

15 Feb 2004 7:00:46

Mok-Kong Shen wrote:
>

> I am extremely surprised that you consist in refusing
> to admit the truth of simple/evident fact. To put
> in a nutshell, one has two blackboxes, in one there
> is mechanism/algorithm A, in the other there is
> mechanism/algorithm A and B, but B is entirely
> inactive/dormant (or, as you said previously, B simply
> accepts the input and passes it to A and relays the
> result from A -- without any change! -- to the outside
> world and does that relaying work at negligible time,
> or, as you formulated above, B does 0% word). How could
> one 'ever' find out the difference of content of the
> two boxes by observing the input/output behaviour
> 'alone'? That's 'entirely' and trivially impossible,
> isn't it???

Sorry, some (hopefully evident) typos: 'you consist'
should read 'you persist' and '0% word' should read
'0% work'.

M. K. Shen

Mok-Kong Shen

15 Feb 2004 7:09:56

"Arthur J. O'Dwyer" wrote:
>
[snip]
> Here's a [literally] foolproof scheme, as far as I can see: Have
> a whole paragraph of text in the image, and make the human user
> have to *think* to get his reward (whether it be a new e-mail account
> or AWOATAH-supplied porn). Somehow. I'm not entirely clear on the
> details. Quote from Shakespeare, and force the user to identify
> the play in question -- too harsh. Brain teasers -- too hard.
> Maybe: Give a brief biography of a historical figure and ask for
> his name before proceeding. The legitimate user will probably (IMO)
> try to find the answer on Google, possibly learning something in the
> process. The illegitimate user will probably find the impromptu
> history lesson a buzzkill. ;-)
[snip]

Sorry, I don't understand your last sentence. (Could
you elaborate a little bit?) In my understanding, the
illegitimate user certainly has some motivations
(maybe even stronger, though different from those of
the legitimate user) to pass the barrier posed by
CAPTCHA or any similar scheme, right?

M. K. Shen


Richard Herring

16 Feb 2004 5:02:47
In message <7xptci8...@ruckus.brouhaha.com>, Paul Rubin
<http@?.cx.invalid> writes

See also, for example, http://www.1911encyclopedia.org: the entire 1911
Britannica OCRd with no trace of proof reading.

--
Richard Herring

Mok-Kong Shen

16 Feb 2004 16:57:34

"Arthur J. O'Dwyer" wrote:
>

> Before you can access Hot Sexy Girls Of Porn, please answer the
> following question:
>
> [Born in Holguin, Cuba, in 1943, this poet, novelist, and
> playwright studied at the University of Havana on scholarship
> from his work during the Revolution, but did not earn a degree.
> His first novel, published in 1965 while he was working at the
> Jose Marti National Library, was "Celestino antes del Alba."
> After a three-year imprisonment, he left Cuba in the exodus of
> Mariel, went to New York, was diagnosed with AIDS in 1987, and
> in 1990 committed suicide by drug overdose. Please enter the
> name of this Cuban writer at the prompt.]
>
> [_________]
>
> Once you have entered the correct answer and our AWOATAH software
> has verified your answer, you may click <here> to continue to
> Hot Sexy Girls Of Porn!

That is one form of question that I suppose is suitable
for CAPTCHA as is currently conceived, i.e. distinguishing
between a machine (alone) and a human. But one couldn't
with that distinguish between (1) a human and (2) machine
(software module) plus a human where the software module
does 'nothing' but simply relay the question to the
human and route his answer back, which is what I had
been arguing in a number of posts.

M. K. Shen

John A. Malley

16 Feb 2004 17:46:26
Arthur J. O'Dwyer wrote:

> On Thu, 12 Feb 2004, John A. Malley wrote:
>
>>[...] I already presented an algorithm that distinguishes (with some


>>probability) a human prover from an "Adversary with oracle access to
>>a human" (AWOATAH) prover.
>>
>>The CAPTCHA sends a distorted graphic of a URL to which the human must
>>go to continue the prover-verifier interaction with the CAPTCHA. A human
>>duped into acting as an oracle with the reward of porno for answering
>>now must manually enter the URL into his web browser to bring up that
>>new page. And he may need to do this repeatedly (per the Gap Theorem in
>>the CAPTCHA paper cited way back in this thread.)
>>
>>A duped human will not necessarily put up with this kind of interaction.
>>
>

> Corollary: an AWOATAH will not necessarily require its "pet human"
> to put up with this kind of interaction. For example, given an
> original page along the lines of
>
> Go to the following URL and click on the "Sign up" button. [IMG]
>
> the AWOATAH will show a page to its pet human something like this:
>
> Type the following URL in this box to continue: [________] [IMG]
>
> and all the following of links will happen "in the background,"
> from the point of view of the pet human involved. I don't think
> traditional schemes will get around the fact that computers are
> remarkably good at taking bits and pieces of data and re-arranging
> them. :)

Yes, the AWOATAH's designer knows the CAPTCHA's algorithm. The CAPTCHA's
set of hard-AI problem instances is public knowledge. It's the hard-AI
problem and randomness over the set of problem instances that makes for
a secure CAPTCHA with respect to software Adversaries. Mike Amling
suggested we generate random hard-AI problem presentations to make the
AWOATAH work harder to intercept and repackage the CAPTCHA's problem
instance for re-presentation to a duped human.

In the suggested algorithm, the CAPTCHA's defense against an AWOATAH
depends on the effort required to answer the CAPTCHA. A human genuinely
interested in what the CAPTCHA offers (protects) will answer all of the
CAPTCHA's questions. A human duped into answering questions for an
AWOATAH prover is not (by definition) interested in what the CAPTCHA
offers, and may in fact know nothing about it. The AWOATAH repackages
and pitches the CAPTCHA's problem instances to the duped human and offers
a different reward (like immediate access to free porn). If the "price"
of the CAPTCHA's guarded item is more than what the duped human is
willing to "pay" to gain access to the AWOATAH's reward (free porn) then
the duped human will give up before the AWOATAH can satisfy the CAPTCHA.

It's a game based on the premise that a human truly interested in the
service or product protected (and offered) by the CAPTCHA will pay the
price (in terms of effort) to get to it by answering the CAPTCHA's
questions, while a human duped into helping the AWOATAH will find the
price too high for what the AWOATAH is offering as the reward for
answering those same questions (like free porn.) Hence the buzzkill. And
hence a way to distinguish a human who wants what the CAPTCHA protects
from an Adversary who's using an unwitting human in its effort to get
what the CAPTCHA protects.

The kinds of hard-AI problems to pose, and how to pose them in a way
that buzzkills quick gratification, are areas for further thought. Your
suggestion of forcing a Google query to get an answer to a question
posed in a graphical manner that is hard for a program to understand but
easy for a human is interesting. Definitely a buzzkill for instant
gratification.

I'm working on a more formal definition of the problem using the
notation in the original CAPTCHA paper and some way to "quantify" the
different costs incurred by a human who wants what the CAPTCHA guards,
and a (duped, unsuspecting) human who wants what the AWOATAH offers (and
knows nothing about what the CAPTCHA is protecting) instead. The value
of what the AWOATAH offers the duped human must be less than the cost of
satisfying the CAPTCHA. The value of what the CAPTCHA offers must be
equal to or greater than the cost of satisfying the CAPTCHA.
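
As a rough illustration of the cost condition in the paragraph above (a sketch only, not the formal definition being worked on; the function names and the numeric "effort units" are hypothetical), the CAPTCHA separates the two kinds of human when its cost falls between the two rewards:

```python
def completes(reward_value: float, captcha_cost: float) -> bool:
    """Assumed behaviour: a prover finishes only if the reward is worth the effort."""
    return reward_value >= captcha_cost


def separates(captcha_value: float, awoatah_value: float, captcha_cost: float) -> bool:
    """True when awoatah_value < captcha_cost <= captcha_value, so the
    interested human finishes and the duped human gives up."""
    return completes(captcha_value, captcha_cost) and not completes(awoatah_value, captcha_cost)


# Hypothetical effort units, chosen only for illustration.
print(separates(captcha_value=10.0, awoatah_value=3.0, captcha_cost=5.0))  # True
print(separates(captcha_value=10.0, awoatah_value=8.0, captcha_cost=5.0))  # False
```

In the second call the AWOATAH's reward exceeds the cost, so the duped human would keep answering and the scheme would not separate the two cases.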

[...]

>
> Oh, and one more note: Aren't a lot of modern CAPTCHAs essentially
> exploiting the [combinatorially-explosive] one-way properties of
> printing text in funny fonts and ripple distortion? That is, you
> can easily write a program to un-distort the images used by, e.g.,
> Yahoo! Mail, and then use regular OCR techniques to recover the
> passwords. The only problem is figuring out which particular set
> of parameters Yahoo! used to distort the image -- was it
> "salt-and-pepper #4287" or "horizontal ripple #3"? At which point
> it's a combinatorial problem, and you solve it brute-force.

The kinds of distortions chosen are supposed to be beyond current AI
algorithm performance. I don't think there's a COTS program or immediate
algorithm to tackle those problems with sufficient probability of
success that the program passes as human with significant probability.

> The "pet human" AWOATAH approach is a heck of a lot more elegant,
> though. :-D

Yes, it is a smart idea, and kudos to those who thought up the attack. I
don't condone what they do with it, not at all, but I respect their
problem solving capability. :-)

John A. Malley
10266...@compuserve.com

Mok-Kong Shen

16 Feb 2004 18:28:11

"John A. Malley" wrote:
>
[snip]

> It's a game based on the premise that a human truly interested in the
> service or product protected (and offered) by the CAPTCHA will pay the
> price (in terms of effort) to get to it by answering the CAPTCHA's
> questions, while a human duped into helping the AWOATAH will find the
> price too high for what the AWOATAH is offering as the reward for
> answering those same questions (like free porn.) Hence the buzzkill. And
> hence a way to distinguish a human who wants what the CAPTCHA protects
> from an Adversary who's using an unwitting human in its effort to get
> what the CAPTCHA protects.

[snip]

One motivation of that duped human could be to cause
accesses to the service protected by CAPTCHA in such
a way that the server is overloaded (DOS attack), I
suppose. It's difficult to determine the cost threshold
for him to give up, I am afraid. (Compare: Apparently
most hackers or virus writers are not paid for their
work, yet they seem to persist to do damages to the
internet community.)

M. K. Shen

stefe...@hp.com

16 Feb 2004 21:04:06
In sci.crypt, John A. Malley <10266...@compuserve.com> wrote:
>
> > The "pet human" AWOATAH approach is a heck of a lot more elegant,
> > though. :-D
>
> Yes, it is a smart idea, and kudos to those who thought up the attack. I
> don't condone what they do with it, not at all, but I respect their
> problem solving capability. :-)
>
John -

I'm afraid I haven't followed the thread carefully, so do feel free to
apply a clue-by-four ;-) It seems your improvement hinges on the idea
that a motivated human working on a single, multi-stage problem will
keep up the effort needed, while someone being used as a 'pet human'
will give up/get bored/go elsewhere to buy porn. If so, it would seem
to hinge on the multi-stageness; but I don't see how you're going to
ensure that the stages are all linked in such a way as to be solved by
only one human, as opposed to a 'zoo' of pet humans each of whom is
motivated enough to do one stage of the problem...

As I say, kick me if I've misunderstood...

Stefek

John A. Malley

17 Feb 2004 3:01:22
stefe...@hp.com wrote:

>

> As I say, kick me if I've misunderstood...


No kick needed. That's the gist of it. :-)

The CAPTCHA needs to carry out a multi-stage transaction with the truly
interested human. The CAPTCHA needs "state". The human needs to prove
his history of interaction as part of the current "hard-AI" problem
instance. The CAPTCHA needs to present a question about the past
interactions as well as a new interaction in a way that a human who
participated in that history can answer easily, but an AI program cannot.

Off the top of my head here's a (simple) example:

The CAPTCHA presents a distorted image of a dog, cat or fish and
distorted text for a URL selected uniformly at random from a pool of
URLs to which the user must go next. (All of these graphics should be
displayed in a fashion that takes up most of the screen real estate so
an AWOATAH can't repackage rewards around it without forcing the human user
to scroll up and down, or left and right. It's part of making it
a nuisance for AWOATAH reuse. It should not be easy for an AI program to
figure out just where the images are embedded in a larger screen-size
graphic, for example. This is to give some randomness to the layout as
per Mike Amling's suggestion.) The CAPTCHA generates new pages at new
URLs as part of this transaction with a human prover.

The first page shows a distorted dog and a distorted URL (dog-URL1).

The human user is expected to type in the URL (URL1) on his web browser
and get that page.

That page (URL1) shows two rows of images. The top row consists of
another distorted dog, a distorted cat and a distorted URL. (dog-cat-URL2)

The bottom row consists of a distorted cat, another distorted cat and a
different, distorted URL. (cat-cat-URL3)

The CAPTCHA asks the human to go to the URL in the row whose first
element matches the content of his first interaction.

A human who remembers the previous interaction chooses the top row and
goes to URL2. Any human without memory of the first interaction picks a
row at random. The CAPTCHA starts over for the wrong choice.

The page at URL2 shows two rows of images. The top row consists of a
distorted fish, distorted dog, distorted cat and a distorted URL.
(fish-dog-cat-URL4)

The bottom row consists of a distorted dog, distorted cat, distorted
fish and another distorted URL. (dog-cat-fish-URL5).

The CAPTCHA asks the human to go to the URL in the row whose first and
second elements match the content of his first and second interaction.

A human who remembers the previous interactions chooses the bottom row
and goes to URL5. Any human without memory of the first and second
interaction picks a row at random. The CAPTCHA starts over for the wrong
choice.

The final page (say URL5 in this example) adds a neat twist. The page
shows a distorted alphanumeric string and a text entry box. The CAPTCHA
asks the human to enter in the sequence of animals (image values) that
got him here and the alphanumeric string's value to gain access to
whatever the CAPTCHA protects. The human is expected to enter in the
sequence as "dog-cat-fish" + "alphanumeric string".

A human who remembers the previous interactions types in the sequence
and the alphanumeric string value shown distorted on this page. Any
human who gets this page without the previous history must guess at the
sequence of images that got to this page - he doesn't know how many, or
which. The CAPTCHA starts over if he's wrong.

The CAPTCHA keeps track of the interactions using state. It picks the
URLs, images, distortions and alphanumeric strings at random from sets
of such items, and remembers the order of images and URLs it chose as a
session history that should be unique to a human prover.
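
The state bookkeeping described above can be sketched roughly as follows. This is a toy model under stated assumptions, not a secure implementation: the distortion (the hard-AI part), the URLs and the alphanumeric string are left out, and the names `StatefulCaptcha` and `honest_prover` are hypothetical.

```python
import random

ANIMALS = ["dog", "cat", "fish"]


class StatefulCaptcha:
    """Toy model: the server remembers the animal shown at each stage and
    asks the prover to pick the row that matches that history."""

    def __init__(self, stages: int = 2, rng=None):
        self.rng = rng or random.SystemRandom()
        self.stages = stages
        self.history = []
        self._pending = None
        self.restart()

    def restart(self):
        """Start a new session with a fresh first image."""
        self.history = [self.rng.choice(ANIMALS)]

    def challenge(self):
        """Return two rows; exactly one begins with the session history."""
        good = self.history + [self.rng.choice(ANIMALS)]
        decoy = list(good)
        # Alter one remembered position so the decoy cannot match the history.
        i = self.rng.randrange(len(self.history))
        decoy[i] = self.rng.choice([a for a in ANIMALS if a != good[i]])
        rows = [good, decoy]
        self.rng.shuffle(rows)
        self._pending = good
        return rows

    def answer(self, row) -> bool:
        """Advance on the right row; start over on the wrong one."""
        if row == self._pending:
            self.history = list(row)
            return True
        self.restart()
        return False

    def final_check(self, sequence) -> bool:
        """Last page: the prover must type the whole sequence of animals."""
        return list(sequence) == self.history


def honest_prover(captcha: StatefulCaptcha) -> bool:
    """A prover who saw every stage picks the row matching what it remembers."""
    memory = list(captcha.history)
    for _ in range(captcha.stages):
        rows = captcha.challenge()
        pick = next(r for r in rows if r[:len(memory)] == memory)
        if not captcha.answer(pick):
            return False
        memory = pick
    return captcha.final_check(memory)


print(honest_prover(StatefulCaptcha(stages=2)))  # True
```

A prover that joins partway through, without the session history, can only guess between the two rows at each stage, and a wrong guess restarts the session.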

Well, that's kind of what I had in mind. I'm convinced we need to
formalize interactions like this to make sure there are no subtle holes
in a protocol like this, and the depth of stages can't be so much that a
typical human has trouble completing it, or makes lots of errors, yet it
must be deep enough to keep AWOATAHs from exploiting it. This example
is very sketchy with respect to any concrete security measures or
provable security.

HTH,

John A. Malley
10266...@compuserve.com

>
> Stefek
>


Mok-Kong Shen

17 Feb 2004 3:31:45

A software module could in fact make the work even easier
for the human, e.g. letting him type '1' instead of the
character sequence for URL1.

>
> That page (URL1) shows two rows of images. The top row consists of
> another distorted dog, a distorted cat and a distorted URL. (dog-cat-URL2)
>
> The bottom row consists of a distorted cat, another distorted cat and a
> different, distorted URL. (cat-cat-URL3)

See above.

>
> The CAPTCHA asks the human to go to the URL in the row whose first
> element matches the content of his first interaction.
>
> A human who remembers the previous interaction chooses the top row and
> goes to URL2. Any human without memory of the first interaction picks a
> row at random. The CAPTCHA starts over for the wrong choice.

One could assume that the human behaves (has motivations
to do so) properly in both cases (i.e. whether a
software module is interposed that does practically
'nothing').

>
> The page at URL2 shows two rows of images. The top row consists of a
> distorted fish, distorted dog, distorted cat and a distorted URL.
> (fish-dog-cat-URL4)
>
> The bottom row consists of a distorted dog, distorted cat, distorted
> fish and another distorted URL. (dog-cat-fish-URL5).
>
> The CAPTCHA asks the human to go to the URL in the row whose first and
> second elements match the content of his first and second interaction.
>
> A human who remembers the previous interactions chooses the bottom row
> and goes to URL5. Any human without memory of the first and second
> interaction picks a row at random. The CAPTCHA starts over for the wrong
> choice.

See above. The human can be assumed to perform in
the same way in both cases. (BTW, why should a human
'discard' his memory?)

>
> The final page (say URL5 in this example) adds a neat twist. The page
> shows a distorted alphanumeric string and a text entry box. The CAPTCHA
> asks the human to enter in the sequence of animals (image values) that
> got him here and the alphanumeric string's value to gain access to
> whatever the CAPTCHA protects. The human is expected to enter in the
> sequence as "dog-cat-fish" + "alphanumeric string".

Even this could be made a little bit easier, in case
a software module is available.

>
> A human who remembers the previous interactions types in the sequence
> and the alphanumeric string value shown distorted on this page. Any
> human who gets this page without the previous history must guess at the
> sequence of images that got to this page - he doesn't know how many, or
> which. The CAPTCHA starts over if he's wrong.

In both cases, the human will go through the same steps.
Hence no difference.

>
> The CAPTCHA keeps track of the interactions using state. It picks the
> URLs, images, distortions and alphanumeric strings at random from sets
> of such items, and remembers the order of images and URLs it chose as a
> session history that should be unique to a human prover.
>
> Well, that's kind of what I had in mind. I'm convinced we need to
> formalize interactions like this to make sure there are no subtle holes
> in a protocol like this, and the depth of stages can't be so much that a
> typical human has trouble completing it, or makes lots of errors, yet it
> must be deep enough to keep AWOATAHs from exploiting it. This example
> is very sketchy with respect to any concrete security measures or
> provable security.

A formalization could have some value for the 'original'
goal of CAPTCHA of distinguishing between the case of a
machine (software module) alone and the case with
human involvement (either human alone or a software
module that routes the question to the human and
transmits his answer back) but can't distinguish
between the two sub-cases of the latter, as I have
repeatedly shown. (BTW, another analogy, if a function
A in a program does nothing but simply call another
function B and return the result of B, then every
occurrence of A in the program could be replaced by a
call of B. The behavior of the program remains exactly
the same, i.e. not distinguishable, excepting that
in the second case the efficiency could be expected
to be even higher.)
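
The function analogy above can be made concrete with a minimal sketch. The solver here is a stand-in (reversing a string is purely hypothetical; a real human would read a distorted image), and the sketch compares input/output values only:

```python
def human(challenge: str) -> str:
    """Stand-in for the human solver B (hypothetical: just reverses the text)."""
    return challenge[::-1]


def relay(challenge: str) -> str:
    """The software module A: forwards the challenge and returns B's answer unchanged."""
    return human(challenge)


challenges = ["x7Qp2", "dog-cat-fish", ""]
print(all(relay(c) == human(c) for c in challenges))  # True
```

Anything beyond the returned value, such as latency or whether the human is willing to finish a long session, is outside what this comparison captures.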

M. K. Shen

Paul Rubin

17 Feb 2004 12:42:40
"John A. Malley" <10266...@compuserve.com> writes:
> The human user is expected to type in the URL (URL1) on his web
> browser and get that page. ...

> The CAPTCHA asks the human to go to the URL in the row whose first
> element matches the content of his first interaction...

> The CAPTCHA asks the human to go to the URL in the row whose first and
> second elements match the content of his first and second interaction.
> ... The final page (say URL5 in this example) adds a neat twist. ...

> The human is expected to
> enter in the sequence as "dog-cat-fish" + "alphanumeric string".

How does the human know that s/he is supposed to do all those things?
Are the instructions part of the distorted text image? If yes, there
goes most of the screen space, and also, they have to be placed and
located differently every time, and probably phrased differently too,
or else some other program can remove or change them before the human
sees them. If they're outside the image, it's over. The attacker
just gives different instructions instead, like "type the displayed
url into this form" instead of "navigate your browser to the url",
proxying the different stages of the captcha to the human.

John A. Malley

17 Feb 2004 13:04:59
Paul Rubin wrote:

Yes, the instructions must always be part of the distorted text image to
prevent the AWOATAH from understanding and tailoring them for its own
use. And its layout should be varied randomly to prevent repackaging an
AWOATAH.

The CAPTCHA's best defensive characteristic is the hard-AI problem (I
feel silly saying that because that's core to its definition). All of
its instructions and tests for transaction history must be in a hard-AI
problem format to ensure only a human will understand it.

John A. Malley
10266...@compuserve.com

John A. Malley

17 Feb 2004 13:06:35
John A. Malley wrote:

[...]


oops, forgot a word, this will make more sense:

> And its layout should be varied randomly to prevent repackaging BY an
> AWOATAH.


Vernon Schryver

17 Feb 2004 14:18:08

All of this is intellectually interesting in the same way that
investigating Turing Tests is interesting. Thinking about what
constitutes a "hard AI test" yesterday, today, and 10 years from
now is quite interesting. It's also interesting to see how people
and computers fare in practice on such tests.

However, it is all nonsense as a spam defense or to prevent robots
from signing up for drop-boxes. Even without the elegant hack by
spammers that triggered this thread, such tests as defenses are silly.
They will always have too many false positives and false negatives for
serious work with the public. Any collection of more than several
dozen people includes some who seem dumber than the stupidest shell
script. Each of us at one time or another is too tired, distracted,
drunk, senile, illiterate in the right language, deaf, blind, not using
computers that can do pictures (e.g. lynx), not using computers that
can do sounds, uncaring, or otherwise unable to pass such a test.
Unless you want to deny your free mail service to those people or
reject their mail, you cannot use such tests.

On the other hand, robots don't need to pass such tests every time or
even most of the time. Consider the extreme tactic of random guessing.
As I recall, the Yahoo tests involve 5 digit numbers. A spammer doesn't
need more than a few new drop boxes per day. If the spammer
sends half a dozen random guesses through one open proxy to the Yahoo filter,
how many proxies must the spammer burn daily to get its day's quota
of 3 or 4 new drop-boxes? Will the spammer ever run out of open
proxies?--I don't think so. There are many millions of proxies
available, and Yahoo can blacklist each only for a short time smaller
than a likely DHCP IP address reassignment.
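
The question about proxies can be answered with back-of-the-envelope arithmetic. The sketch below assumes the test answer is a uniformly random 5-digit number (taken from the recollection above, not verified), that each proxy is burned after six independent guesses, and that the spammer wants 3 or 4 drop boxes a day:

```python
ANSWER_SPACE = 10 ** 5      # assumed: 5 decimal digits, uniformly distributed
GUESSES_PER_PROXY = 6       # "half a dozen" guesses before the proxy is burned
p_guess = 1 / ANSWER_SPACE

# Probability that at least one of the guesses sent through a proxy succeeds.
p_proxy = 1 - (1 - p_guess) ** GUESSES_PER_PROXY

for boxes in (3, 4):
    proxies = boxes / p_proxy   # expected number of proxies burned per day
    print(f"{boxes} drop boxes/day: about {proxies:,.0f} proxies")
```

Under these assumptions the expected cost is on the order of 50,000 to 67,000 proxies per day, which is small next to a pool of "many millions".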

I suspect the reason these tests are effective for free mail providers
is that there are many thousands of other free mail providers that
do not use any tests. The big mystery to me is why the spammers
bothered to implement their clever hack.


Vernon Schryver v...@rhyolite.com

Mok-Kong Shen

17 Feb 2004 15:19:00

"John A. Malley" wrote:
>
[snip]

> The CAPTCHA's best defensive characteristic is the hard-AI problem (I
> feel silly saying that because that's core to its definition). All of
> its instructions and tests for transaction history must be in a hard-AI
> problem format to ensure only a human will understand it.

That's clear from the literature on CAPTCHA. The
additional desire of distinguishing between a human
and the pairing of machine and human remains yet
a seemingly unattainable goal, as I repeatedly
explained.

M. K. Shen
