Math/CompSci Interview Question

Andrew Tomazos

unread,

Nov 21, 2009, 11:12:53 AM11/21/09

to

I was posed the following question in a technical interview for a
Software Engineering position by a major multinational NASDAQ company:

[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array. Specify an
algorithm to determine x."

The assumption being extra points for efficiency.

I initially came up with a linear space, linear time solution. With
some prompting and hints from the interviewer we worked our way to a
smaller constant space and linear time algorithm. That being
basically:

int findmode(int* p, int n)
{
int count[32];
for(int i = 0; i < 32; i++)
count[i] = 0;

for (int i = 0; i < n; i++)
for (int j = 0; j < 32; j++)
if (p[i] & (1 << j)) // if bit j is on
count[j]++;
else
count[j]--;

int x = 0;
for (int i = 0; i < 32; i++)
if (count[i] > 0)
x = x | (1 << i);

return x;
}

The idea here is to count the frequency of each of the 32 bits in the
array separately, knowing that these bit counts will be dominated by
the mode.

The interviewer already knew the answer, so I can only assume the test
was based on how much help he had to give me to arrive at it.

My question about this is as follows. I, perhaps boldly, claim to
employers to have a "masters-equivilant" experience/knowledge of
computer science and math. Should I have been able to come up with
this solution without prompting or help?

Would you expect someone with a CompSci Masters or PhD from some major
ivy league university to be able to come up with this solution without
help?

If so in what course or from which book would you expect to learn the
required knowledge to come up with the above solution?

Is the skill to be able to come up with such an algorithm something
that is learned by studying lots of algorithms and being able to mix
and match the "tricks"? If so, what is a similar commonly known
algorithm(s) on which the above solution could have been based?

Or, is the ability to invent such a solution simply a matter of IQ?
Some people have the talent to solve the puzzle, see the pattern and
come up with the solution - and some just don't?

Thanks,
Andrew.

Pascal J. Bourguignon

unread,

Nov 21, 2009, 12:24:10 PM11/21/09

to

Andrew Tomazos <and...@tomazos.com> writes:

> employers to have a "masters-equivalant" experience/knowledge of

> computer science and math. Should I have been able to come up with
> this solution without prompting or help?

If what you're asking is whether anybody having a master in CS and
maths would have been able to come up with this solution in the
interview time, I guess we can answer that definitely no, otherwise
there would be no point in trying to select candidates with this test.

Obviously, it's because some people (with or without a master diploma,
this really isn't relevant) get or don't get it, that this test is
useful for the recruiter.

Now if you want this kind of jobs, yes you should better be able to
come up with smart solutions to little puzzles like this in
interviews.

> Would you expect someone with a CompSci Masters or PhD from some major
> ivy league university to be able to come up with this solution without
> help?
>
> If so in what course or from which book would you expect to learn the
> required knowledge to come up with the above solution?

Not a single one. You have to develop your knowledge of algorithms,
mathematics, your symbolic thinking and your imagination in these
matters.

> Is the skill to be able to come up with such an algorithm something
> that is learned by studying lots of algorithms and being able to mix
> and match the "tricks"?

That could help yes. I'd tend to think that imagination is the most
important ingredient here, to be able to come with a possible solution
fast enough.

Also, don't limit yourself to CS and maths, there are ideas to be
taken from other domains too.

> If so, what is a similar commonly known
> algorithm(s) on which the above solution could have been based?
>
> Or, is the ability to invent such a solution simply a matter of IQ?

CS IQ, yes, if such a IQ test exists.

> Some people have the talent to solve the puzzle, see the pattern and
> come up with the solution - and some just don't?

It seems so. At least, in a given time.

--
__Pascal Bourguignon__

GJ Woeginger

unread,

Nov 21, 2009, 12:29:14 PM11/21/09

to

In comp.theory Andrew Tomazos <and...@tomazos.com> wrote:
# I was posed the following question in a technical interview for a
# Software Engineering position by a major multinational NASDAQ company:
#
# [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
# One int value x occurs 500,001 times or more in the array. Specify an
# algorithm to determine x."
#
# The assumption being extra points for efficiency.

There is an old analysis of this problem by Fischer and Salzberg.
M.J. Fisher and S.L. Salzberg (1982)
Finding a majority among n votes.
Journal of Algorithms 3, pp 143-152.

If 2k elements contain a majority element (= an element that occurs at
least k+1 times), then you can find it with 3k-2 element comparisons
(and some small overhead). The basic idea in their algorithm is that
whenever you find two distinct elements, then you can destroy both without
changing the majority element among the remaining elements. This yields:

Run once through the array, and maintain a majority-candidate and a counter.
The majority-candidate is initialized as the first element, and
the counter (counting how often you have seen the candidate) is
initialized at 1.
If you hit the current candidate again, increment the counter.
If you hit another element, decrement the counter.
If the counter becomes 0, drop the current candidate and start from
scratch with the remaining part of the array.

Fischer and Salzberg also show that this bound 3k-2 is best possible in
the worst case (and that's the main part of their paper).

--Gerhard

___________________________________________________________
Gerhard J. Woeginger http://www.win.tue.nl/~gwoegi/

Richard Harter

unread,

Nov 21, 2009, 1:33:38 PM11/21/09

to

On Sat, 21 Nov 2009 08:12:53 -0800 (PST), Andrew Tomazos
<and...@tomazos.com> wrote:

>I was posed the following question in a technical interview for a
>Software Engineering position by a major multinational NASDAQ company:
>
>[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
>One int value x occurs 500,001 times or more in the array. Specify an
>algorithm to determine x."
>
>The assumption being extra points for efficiency.

[snip code]

>
>The idea here is to count the frequency of each of the 32 bits in the
>array separately, knowing that these bit counts will be dominated by
>the mode.
>
>The interviewer already knew the answer, so I can only assume the test
>was based on how much help he had to give me to arrive at it.
>
>My question about this is as follows. I, perhaps boldly, claim to
>employers to have a "masters-equivilant" experience/knowledge of
>computer science and math. Should I have been able to come up with
>this solution without prompting or help?

I hope "masters-equivilant" isn't on your resume.

>
>Would you expect someone with a CompSci Masters or PhD from some major
>ivy league university to be able to come up with this solution without
>help?

Probably not, but that is not necessarily a good thing.

>
>If so in what course or from which book would you expect to learn the
>required knowledge to come up with the above solution?

That's an interesting question with an interesting
presupposition. The first thing to understand is that this is a
puzzle rather than a programming problem. The presupposition is
that you need required knowledge to come up with a solution. The
essence of a good puzzle is that you need to step outside the box
to see the solution; that is, there is a trick that involves an
unexpected way of looking at things.

Creating good puzzles is tricky. Once you've come up with a
good trick it is easy to come with many more similar tricks;
unfortunately the puzzle solver will have learned the trick too.

Perhaps the best way to learn how to solve puzzles like this is
try to solve puzzles like this.

I don't know of any courses, though I think the MIT AI group used
to have a course like that. As far as books and web sites are
concerned, google is your friend. Do a search on Programming
puzzles and tricks.

>
>Is the skill to be able to come up with such an algorithm something
>that is learned by studying lots of algorithms and being able to mix
>and match the "tricks"? If so, what is a similar commonly known
>algorithm(s) on which the above solution could have been based?
>
>Or, is the ability to invent such a solution simply a matter of IQ?
>Some people have the talent to solve the puzzle, see the pattern and
>come up with the solution - and some just don't?

Partly it is a matter of talent - some people are inately better
at solving puzzles than others. Partly it is a matter of
intellectual personality - some people don't like solving
puzzles. Partly it is a matter of experience; exposure to lots
of puzzle solving makes it easier to solve puzzles.

One point of using puzzles in interviews is that it tests the
preparedness of interviewees. These things are chestnuts. If
you are serious about getting the job, you go through the
chestnuts in advance. It's similar to finding what a company
does before you interview with it.

The hope of the interviewer is that you've never seen that kind
of puzzle before and they get to see how agile you are on your
mental feet. Speaking from the interviewers side of the fence
those sort of tests are mostly good for checking out bright young
puppies who are short on hard experience. When one is dealing
with more experienced people the important thing is to discover
if they know what they are supposed to know and how alive their
intellectual curiosity is. Oh yes, the most important thing to
discover whether they are BS artists giving you a snow job. BS
artists are deadly in the technical world.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Infinity is one of those things that keep philosophers busy when they
could be more profitably spending their time weeding their garden.

A.G.McDowell

unread,

Nov 21, 2009, 1:46:21 PM11/21/09

to

In article <6a8340b8-19b0-4fda...@m26g2000yqb.googlegroup
s.com>, Andrew Tomazos <and...@tomazos.com> writes

>I was posed the following question in a technical interview for a
>Software Engineering position by a major multinational NASDAQ company:
>
>[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
>One int value x occurs 500,001 times or more in the array. Specify an
>algorithm to determine x."
>
>The assumption being extra points for efficiency.
>
>I initially came up with a linear space, linear time solution. With
>some prompting and hints from the interviewer we worked our way to a
>smaller constant space and linear time algorithm. That being
>basically:
>

This sort of problem is covered by articles on Data Stream Processing in
CACM Oct 2009. (CACM is a lot more interesting these days than it was
some years ago). There are some very neat ideas in there, of which the
algorithm "MAJORITY" matches the question reasonably well. Proving that
it works under interview conditions would be extremely impressive,
though.

This is not the first time that I have heard of interview questions that
discuss issues recently covered in the computing literature. I am unable
to tell whether these come from a desire to know if the candidate keeps
themselves abreast of the subject, or from the interviewer grasping the
first thing that comes to hand when they are trying to think up a
question. The few times that I have posed interview questions, I have
tried to find evidence in the candidate of a knowledge of basic theory
or mathematics that I could show was relevant to the job for which we
were trying to recruit.
--
A.G.McDowell

Bill Dubuque

unread,

Nov 21, 2009, 4:33:03 PM11/21/09

to

c...@tiac.net (Richard Harter) wrote:
>Andrew Tomazos <and...@tomazos.com> wrote:
>>
>>I was posed the following question in a technical interview for a
>>Software Engineering position by a major multinational NASDAQ company:
>>
>>[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
>>One int value x occurs 500,001 times or more in the array. Specify an
>>algorithm to determine x."
>>
>>The assumption being extra points for efficiency. [snip code] The idea

>>is to count the frequency of each of the 32 bits in the array separately,

>>knowing that these bit counts will be dominated by the mode. [...]

>>
>>My question about this is as follows. I, perhaps boldly, claim to
>>employers to have a "masters-equivilant" experience/knowledge of
>>computer science and math. Should I have been able to come up with
>>this solution without prompting or help?

>>Would you expect someone with a CompSci Masters or PhD from some major
>>ivy league university to be able to come up with this solution without

>>help? If so in what course or from which book would you expect to learn

>>the required knowledge to come up with the above solution?
>
> That's an interesting question with an interesting presupposition.
> The first thing to understand is that this is a puzzle rather than
> a programming problem.

I disagree. Saying that it's a puzzle rather than a programming problem
seems to imply that the solution is ad-hoc, rather than a special case
of some more universal technique. But that is not true here. Namely,
the proposed solution follows immediately from the obvious fact that
majority elts on product sets are preserved by component projections.
So the solution is simply an instance of well-known divide-and-conquer
techniques for product objects. Such reductions are ubiquitous in both
mathematics and computer science. So I would expect a good student
to find this solution given enough time. I'd also expect a student
from a top-tier school to discover the more optimal solution, viz.

bi != a => Maj({a^k b1 b2... bk} \/ S) = Maj(S)

via m/n > 1/2 => (m-k)/(n-2k) > 1/2 via mediants/arithmetic

again, assuming enough time. But that assumption is highly problematic
when it comes to designing tests that quickly measure various skills.
It's impossible to test such in the short time span of an interview.
It remains difficult even in much longer timed tests. For example
many of the most successful mathematicians did not perform well on
the Putnam competition, and some of those who did well didn't turn
out to be exceptional mathematicians. Thus there isn't necessarily
much correlation between intelligence and random test scores.

Marilyn vos Savant is a prime example. The woman who supposedly
has the "world's highest IQ" published a Parade magazine column [1]
and book [2] on Wiles proof of FLT. Her nonsensical arguments proved
beyond a doubt that she has absolutely no clue about higher math
and, worse, doesn't have the intelligence to know that (even after
many experts pointed out gaping flaws in her very naive arguments).
So much for IQ tests.

--Bill Dubuque

[1] sci.math, 11/29/1993, Full Text of Savant FLT Parade Article
http://google.com/group/sci.math/msg/8cb44534c63372ad
http://google.com/groups?selm=WGD.93Nov29055015%40martigny.ai.mit.edu

[2] Boston, Nigel; Granville, Andrew.
Review of: The World's Most Famous Math Problem (The Proof
of Fermat's Last Theorem and Other Mathematical Mysteries)
by Marilyn vos Savant
The American Mathematical Monthly, 102, 5, 1995, 470-473
http://www.dms.umontreal.ca/~andrew/PDF/VS.pdf
http://www.jstor.org/stable/2975048

Alf P. Steinbach

unread,

Nov 21, 2009, 4:53:29 PM11/21/09

to

* Bill Dubuque:

>
> I disagree. Saying that it's a puzzle rather than a programming problem
> seems to imply that the solution is ad-hoc, rather than a special case
> of some more universal technique. But that is not true here. Namely,
> the proposed solution follows immediately from the obvious fact that
> majority elts on product sets are preserved by component projections.
> So the solution is simply an instance of well-known divide-and-conquer
> techniques for product objects. Such reductions are ubiquitous in both
> mathematics and computer science. So I would expect a good student
> to find this solution given enough time. I'd also expect a student
> from a top-tier school to discover the more optimal solution, viz.
>
> bi != a => Maj({a^k b1 b2... bk} \/ S) = Maj(S)
>
> via m/n > 1/2 => (m-k)/(n-2k) > 1/2 via mediants/arithmetic

Hiya. Could you please explain the above more optimal solution. I'm unfamiliar
with the notation and not from a top-tier school, but in my experience anything
that's not nonsense can be visualized or explained in simple terms (e.g., Albert
Einstein did that beautifully with his book on special relativity, which except
for one little proof in an appendix -- I didn't yet know anything about
solving sets of equations with multiple variables -- I found eminently
grokkable as a teenager, so should also be possible for the above, yes?).

Cheers & TIA.,

- Alf

Axel Vogt

unread,

Nov 21, 2009, 5:07:31 PM11/21/09

to

Andrew Tomazos wrote:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."
>
> The assumption being extra points for efficiency.

Being a bit stupid I first would ask "Why? What do you do with it?"
and then I would pick on random. I am almost certain, that even at
a low number of draws the chance to get the very integer is higher
than implementing an algo without coding errors.

Andrew Tomazos

unread,

Nov 21, 2009, 6:45:15 PM11/21/09

to

On Nov 21, 10:53 pm, "Alf P. Steinbach" <al...@start.no> wrote:
> * Bill Dubuque:
>
>
>
> > I disagree. Saying that it's a puzzle rather than a programming problem
> > seems to imply that the solution is ad-hoc, rather than a special case
> > of some more universal technique. But that is not true here. Namely,
> > the proposed solution follows immediately from the obvious fact that
> > majority elts on product sets are preserved by component projections.
> > So the solution is simply an instance of well-known divide-and-conquer
> > techniques for product objects. Such reductions are ubiquitous in both
> > mathematics and computer science. So I would expect a good student
> > to find this solution given enough time. I'd also expect a student
> > from a top-tier school to discover the more optimal solution, viz.
>
> > bi != a => Maj({a^k b1 b2... bk} \/ S) = Maj(S)
>
> > via m/n > 1/2 => (m-k)/(n-2k) > 1/2 via mediants/arithmetic
>
> Hiya. Could you please explain the above more optimal solution.

Dubuque is referring to the solution that Woeginger more lucidly
described above. Both it and the bit counting method are
asymptotically equivalent solutions to the original problem. I'm sure
either of these solutions provided on the spot would have received
"full marks".

I guess what I am curious about is exactly what percentage of, say...
CS PhD students at tier one universities, would be able to come up
with either of these solutions on the spot. 80%, 50% or 20% ? I
guess only someone that has interviewed many people with these sort of
problems has the necessary data to answer my question.
-Andrew.

Andrew Tomazos

unread,

Nov 21, 2009, 7:01:55 PM11/21/09

to

On Nov 21, 6:29 pm, gwo...@figipc78.tu-graz.ac.at (GJ Woeginger)
wrote:

If I understand your description than it would look like:

int findmode(int* p, int n)
{

int x = p[0];
int c = 1;

for (int i = 1; i < n; i++)
{
if (c == 0)
{
x = p[i];
c = 1;
}
else if (p[i] == x)
c++;
else
c--;
}

return x;
}

It seems that this could only produce at maximum (2k-1) comparisions
in the worst case, not (3k-2) as you claim?
-Andrew.

Andrew Tomazos

unread,

Nov 21, 2009, 7:04:54 PM11/21/09

to

Oh wait... the (c==0) counts as a comparison. I see it now. Ignore
me.
-Andrew.

Bill Dubuque

unread,

Nov 21, 2009, 9:23:53 PM11/21/09

to

Andrew Tomazos <and...@tomazos.com> wrote:
>"Alf P. Steinbach" <al...@start.no> wrote:

>>Bill Dubuque <w...@nestle.csail.mit.edu> wrote:
>>>c...@tiac.net (Richard Harter) wrote:
>>>>Andrew Tomazos <and...@tomazos.com> wrote:
>>>>>
>>>>>I was posed the following question in a technical interview for a
>>>>>Software Engineering position by a major multinational NASDAQ company:
>>>>>
>>>>>[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
>>>>>One int value x occurs 500,001 times or more in the array. Specify an
>>>>>algorithm to determine x."
>>>>>
>>>>>The assumption being extra points for efficiency. [snip code] The idea
>>>>>is to count the frequency of each of the 32 bits in the array separately,
>>>>>knowing that these bit counts will be dominated by the mode. [...]
>>>>>
>>>>>My question about this is as follows. I, perhaps boldly, claim to
>>>>>employers to have a "masters-equivilant" experience/knowledge of
>>>>>computer science and math. Should I have been able to come up with
>>>>>this solution without prompting or help?
>>>>>Would you expect someone with a CompSci Masters or PhD from some major
>>>>>ivy league university to be able to come up with this solution without
>>>>>help? If so in what course or from which book would you expect to learn
>>>>>the required knowledge to come up with the above solution?
>>>>
>>>> That's an interesting question with an interesting presupposition.

>>>> The first thing to understand is that this is a puzzle rather than
>>>> a programming problem.

>>>
>>> I disagree. Saying that it's a puzzle rather than a programming problem
>>> seems to imply that the solution is ad-hoc, rather than a special case
>>> of some more universal technique. But that is not true here. Namely,
>>> the proposed solution follows immediately from the obvious fact that
>>> majority elts on product sets are preserved by component projections.
>>> So the solution is simply an instance of well-known divide-and-conquer
>>> techniques for product objects. Such reductions are ubiquitous in both
>>> mathematics and computer science. So I would expect a good student
>>> to find this solution given enough time. I'd also expect a student
>>> from a top-tier school to discover the more optimal solution, viz.
>>>
>>> bi != a => Maj({a^k b1 b2... bk} \/ S) = Maj(S)
>>>
>>> via m/n > 1/2 => (m-k)/(n-2k) > 1/2 via mediants/arithmetic
>>>

>>> again, assuming enough time. But that assumption is highly problematic
>>> when it comes to designing tests that quickly measure various skills.
>>> It's impossible to test such in the short time span of an interview.
>>> It remains difficult even in much longer timed tests. For example
>>> many of the most successful mathematicians did not perform well on
>>> the Putnam competition, and some of those who did well didn't turn
>>> out to be exceptional mathematicians. Thus there isn't necessarily
>>> much correlation between intelligence and random test scores.
>>>
>>> Marilyn vos Savant is a prime example. The woman who supposedly
>>> has the "world's highest IQ" published a Parade magazine column [1]
>>> and book [2] on Wiles proof of FLT. Her nonsensical arguments proved
>>> beyond a doubt that she has absolutely no clue about higher math
>>> and, worse, doesn't have the intelligence to know that (even after
>>> many experts pointed out gaping flaws in her very naive arguments).
>>> So much for IQ tests.
>>>

>>> [1] sci.math, 11/29/1993, Full Text of Savant FLT Parade Article
>>> http://google.com/group/sci.math/msg/8cb44534c63372ad
>>> http://google.com/groups?selm=WGD.93Nov29055015%40martigny.ai.mit.edu
>>>
>>> [2] Boston, Nigel; Granville, Andrew.
>>> Review of: The World's Most Famous Math Problem (The Proof
>>> of Fermat's Last Theorem and Other Mathematical Mysteries)
>>> by Marilyn vos Savant
>>> The American Mathematical Monthly, 102, 5, 1995, 470-473
>>> http://www.dms.umontreal.ca/~andrew/PDF/VS.pdf
>>

>> Hiya. Could you please explain the above more optimal solution.

I'll be happy to elaborate if you say what isn't clear to you.

> Dubuque is referring to the solution that Woeginger more lucidly
> described above.

No, I hadn't yet seen Woeginger's post when I posted that. Note
that what one finds "more lucid" may depend on one's background.
A computer scientist might find W's post more lucid, whereas a
mathematician may find the above more to the essence of the matter,
esp. since the mediant viewpoint make the underlying math trivial
(W's post completely omits discussion of any math or proof).

--Bill Dubuque

Casey Hawthorne

unread,

Nov 21, 2009, 9:25:50 PM11/21/09

to

These questions/puzzles are to see how you approach the problem, but
if you've seen the puzzle before it is a useless practice.

--
Regards,
Casey

Richard Harter

unread,

Nov 22, 2009, 12:33:50 AM11/22/09

to

On 21 Nov 2009 16:33:03 -0500, Bill Dubuque
<w...@nestle.csail.mit.edu> wrote:

I disagree. The fact that the solution is "a special case of
some more universal technique" is not to the point - that is
common with good puzzles. The thing that makes it a puzzle is
that the circumstances are carefully arranged. Change the
problem slightly, e.g, the most common is not a majority, or it
is permitted to use O(n) space, or the items are strings rather
than integers, and the nature of the problem and its solution
changes.

If it were a programming problem rather than a puzzle, it might
run something like this: You are given an array of n entities,
not all distinct. Which one(s) occur most frequently. What
general strategy would you suggest? If space usage is critical
does this change your answer? If time usage is critical does
this change your answer? Do any special circumstances occur to
you that might justify changing your approach?

I see that sci.math is among the newsgroups in the original
posting. The question has mathematical interest; it remains,
however, that the context is programming and interviews for
programming positions.

Virgil

unread,

Nov 22, 2009, 1:38:28 AM11/22/09

to

I
> >>>I was posed the following question in a technical interview for a
> >>>Software Engineering position by a major multinational NASDAQ company:
> >>>
> >>>[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> >>>One int value x occurs 500,001 times or more in the array. Specify an
> >>>algorithm to determine x."

Find, to the nearest integer, the average value.

Willem

unread,

Nov 22, 2009, 3:33:09 AM11/22/09

to

Virgil wrote:
) I
)> >>>I was posed the following question in a technical interview for a
)> >>>Software Engineering position by a major multinational NASDAQ company:
)> >>>
)> >>>[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
)> >>>One int value x occurs 500,001 times or more in the array. Specify an
)> >>>algorithm to determine x."
)
) Find, to the nearest integer, the average value.

Doesn't work.
Consider the case of 500,001 times 0 and 499,999 times 2^31-1

Perhaps you are confused with 'find the median value' ?

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Richard Heathfield

unread,

Nov 22, 2009, 3:47:19 AM11/22/09

to

I thought of that too, but quickly saw the flaw - if for large N you
have (N/2)+1 occurrences of X, and (N/2)-1 occurrences of Y, you
can't settle the matter (is it X, or is it Y?) satisfactorily without
reading every element.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Willem

unread,

Nov 22, 2009, 3:50:48 AM11/22/09

to

Andrew Tomazos wrote:
) Dubuque is referring to the solution that Woeginger more lucidly
) described above. Both it and the bit counting method are
) asymptotically equivalent solutions to the original problem. I'm sure
) either of these solutions provided on the spot would have received
) "full marks".

I'm not so sure. I wouldn't be surprised if you only get "full marks"
for the solution that the interviewer has in mind, even if there are
other solutions that are equivalent or better.

NB: The counting method is actually better, because you don't need
O(k * log n) space (where k is the size of an individual key).
Or, practically speaking, it also works if the items are strings.

A very simple proof of the 'other' method would be that whenever
c reaches 0, the part of the array you processed can't have more
than half of the 'majority' item. (Exactly half of it is one single
item, and the other half is not that single item) So the remaining
part *must* have more than half, i.e. the problem is reduced to
the same problem for the remaining part.

Moi

unread,

Nov 22, 2009, 6:32:50 AM11/22/09

to

On Sun, 22 Nov 2009 08:47:19 +0000, Richard Heathfield wrote:

> In <7mr6n5F...@mid.individual.net>, Axel Vogt wrote:
>
>> Andrew Tomazos wrote:
>>> I was posed the following question in a technical interview for a
>>> Software Engineering position by a major multinational NASDAQ company:
>>>
>>> [Paraphrasing] "You are given an array of 1,000,000 32-bit [integers.
>>> One int value x occurs 500,001 times or more in the array. Specify an
>>> algorithm to determine x."
>>>
>>> The assumption being extra points for efficiency.
>> <snipped>
>>
>> Being a bit stupid I first would ask "Why? What do you do with it?" and
>> then I would pick on random. I am almost certain, that even at a low
>> number of draws the chance to get the very integer is higher than
>> implementing an algo without coding errors.
>
> I thought of that too, but quickly saw the flaw - if for large N you
> have (N/2)+1 occurrences of X, and (N/2)-1 occurrences of Y, you can't
> settle the matter (is it X, or is it Y?) satisfactorily without reading
> every element.

With a kind of weight-balanced tree,
( := rotate in such a way that the node with the higher reference
count is closer to the root) you can terminate reading and inserting
the values once there is a clear "winner".

I think, something similar can be done for the bitcounting method:
if all bins have a value with (abs(value) > number_of_remaining_items)
you can terminate the inspection / insertion.

AvK

Andrew Tomazos

unread,

Nov 22, 2009, 10:01:36 AM11/22/09

to

On Nov 22, 9:50 am, Willem <wil...@stack.nl> wrote:
> ) Dubuque is referring to the solution that Woeginger more lucidly
> ) described above. Both it and the bit counting method are
> ) asymptotically equivalent solutions to the original problem. I'm sure
> ) either of these solutions provided on the spot would have received
> ) "full marks".
>
> I'm not so sure. I wouldn't be surprised if you only get "full marks"
> for the solution that the interviewer has in mind, even if there are
> other solutions that are equivalent or better.

Ummm. No, I don't think so. It was my impression that performance is
assessed according to the asymptotic equivalence class of time and
space requirements. At least in the interview setting I am talking
about.

> NB: The counting method is actually better, because you don't need
> O(k * log n) space (where k is the size of an individual key).
> Or, practically speaking, it also works if the items are strings.

Normally, k (the number of bits in an int) would be considered
constant (I haven't encountered many 1,000,000 bit integers as yet),
therefore both solutions are asymptotically equivalent in space and
time. There is usually a very large number of ways to make
nonasymptotic improvements to any algorithm. None of this precludes
one solution being "better" than the other by some other measure
though, but it gets harder to measure performance at a finer grain.
You need to start considering the impact of the memory hierarchy,
locality of reference, what the optimizer is going to do, and so on.
To be honest I would expect both algorithms to be memory bound by the
single pass of the large array. But it would be interesting to
compare anyway. Maybe I'll do a performance comparison.
-Andrew

Malcolm McLean

unread,

Nov 22, 2009, 10:45:30 AM11/22/09

to

"Andrew Tomazos" <and...@tomazos.com> wrote in message news:

>I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."
>

Take 100 values at random. Then you need some AI. If you get a number with a
count of about 50 and no other numbers with counts of anything like that
figure, then it is vanishingly unlikely that then number is anything other
than the 50s. (Typically it is more likely that the computer will have its
memory corrupted during the execution of the program). If you've got two
counts close to 50, then it must be that there is one value represented
500,001 times and another represented 499,999 times (or thereabouts). In
this situation, you have to step through the array and count your
candidates.

To do it properly you can work out the stats.

Willem

unread,

Nov 22, 2009, 11:22:49 AM11/22/09

to

Andrew Tomazos wrote:
) On Nov 22, 9:50�am, Willem <wil...@stack.nl> wrote:
)> I'm not so sure. �I wouldn't be surprised if you only get "full marks"
)> for the solution that the interviewer has in mind, even if there are
)> other solutions that are equivalent or better.
)
) Ummm. No, I don't think so. It was my impression that performance is
) assessed according to the asymptotic equivalence class of time and
) space requirements. At least in the interview setting I am talking
) about.

My comment is based on my general impression of interviewers, not on the
way the OP worded his question. I find it more likely that an interviewer
posing such a question 'knows the trick' and wants the interviewee to find
the same trick. Also note that in this case, the interviewer goaded the
interviewee to the less general solution (see below).

)> NB: The counting method is actually better, because you don't need
)> O(k * log n) space (where k is the size of an individual key).
)> Or, practically speaking, it also works if the items are strings.

Did you read the last sentence ? You know, the one with 'strings' ?

) Normally, k (the number of bits in an int) would be considered
) constant (I haven't encountered many 1,000,000 bit integers as yet),
) therefore both solutions are asymptotically equivalent in space and
) time.

We're looking at the difference between two algorithms. With one view,
they're equivalent, with another view one of them is better. To me, that
means that that one is better overall.

Furthermore, the counting solution is more general, because it
can work on any type of item with an equivalence operator.

All of this has nothing to do with nonasymptotic behaviour.

) <snip>

Chip Eastham

unread,

Nov 22, 2009, 11:51:29 AM11/22/09

to

I don't think the (c==0) counts as a comparison.
I think the other k-1 comparisons are for the 2nd
part of the algorithm, when you make one more pass
through the array verifying the candidate found in
the 1st part. [You already know the index i for
one occurrence of the candidate, so only the other
k-1 indices have to be checked.]

You can see how badly I (mathtalk-ga) stumbled
through this same puzzle here:

[Google Answers: algorithms]
http://answers.google.com/answers/threadview/id/582711.html

regards, chip

Malcolm McLean

unread,

Nov 22, 2009, 1:00:24 PM11/22/09

to

"Richard Heathfield" <r...@see.sig.invalid> wrote in message

> I thought of that too, but quickly saw the flaw - if for large N you
> have (N/2)+1 occurrences of X, and (N/2)-1 occurrences of Y, you
> can't settle the matter (is it X, or is it Y?) satisfactorily without
> reading every element.
>

You need to calculate the error in your estimate. If two numers overlap 0.5
frequency, on some suitably stringent cutoff, then you do a full count.

However I'm sure how to do this, offhand.

Mike Terry

unread,

Nov 22, 2009, 1:50:44 PM11/22/09

to

"Chip Eastham" <hard...@gmail.com> wrote in message
news:3572ef9d-dab3-480b...@c34g2000yqn.googlegroups.com...

There is no second pass through the algorithm, if I've understood it
correctly!

Ben Bacarisse

unread,

Nov 22, 2009, 2:37:12 PM11/22/09

to

gwo...@figipc78.tu-graz.ac.at (GJ Woeginger) writes:

> In comp.theory Andrew Tomazos <and...@tomazos.com> wrote:
> # I was posed the following question in a technical interview for a
> # Software Engineering position by a major multinational NASDAQ company:
> #
> # [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> # One int value x occurs 500,001 times or more in the array. Specify an
> # algorithm to determine x."
> #
> # The assumption being extra points for efficiency.
>
> There is an old analysis of this problem by Fischer and Salzberg.
> M.J. Fisher and S.L. Salzberg (1982)
> Finding a majority among n votes.
> Journal of Algorithms 3, pp 143-152.

I can't track this down. Have I got the right Journal of Algorithms?

http://www.informatik.uni-trier.de/~ley/db/journals/jal/jal3.html

does not list that paper. Of course, this table of contents might be
wrong. The horrendous link:

http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%236839%231982%23999969995%23518142%23FLA%23&_cdi=6839&_pubType=J&_auth=y&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=2c8bc9013d6025d08b7d24d3207dfe18

appears to be for the journal itself. The paper does not appear
there, either.

> If 2k elements contain a majority element (= an element that occurs at
> least k+1 times), then you can find it with 3k-2 element comparisons
> (and some small overhead). The basic idea in their algorithm is that
> whenever you find two distinct elements, then you can destroy both without
> changing the majority element among the remaining elements. This yields:
>
> Run once through the array, and maintain a majority-candidate and a counter.
> The majority-candidate is initialized as the first element, and
> the counter (counting how often you have seen the candidate) is
> initialized at 1.
> If you hit the current candidate again, increment the counter.
> If you hit another element, decrement the counter.
> If the counter becomes 0, drop the current candidate and start from
> scratch with the remaining part of the array.
>
> Fischer and Salzberg also show that this bound 3k-2 is best possible in
> the worst case (and that's the main part of their paper).

I can find a 1982 technical report from Yale:

FINDING A MAJORITY AMONG N VOTES Solution to Problem 81-5 (Journal
of Algorithms, June 1981)

by the same two authors. It describes what seems to me to be a
different algorithm and for which the 3k - 2 bound is proved. I
suspect the algorithm has been modified since the technical report but
I can't find a good reference that matches the algorithm you describe.

<snip>
--
Ben.

mjc

unread,

Nov 22, 2009, 4:52:05 PM11/22/09

to

I was also given this problem as part of an interview quiz. My answer
was: Use one of the known O(n) median algorithms and use that value.
This was accepted by the interviewer (but I didn't get the job).

Stuart Golodetz

unread,

Nov 22, 2009, 7:45:33 PM11/22/09

to

Andrew Tomazos wrote:
> I was posed the following question in a technical interview for a

> Software Engineering position by a major multinational NASDAQ company:
>

> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.

> One int value x occurs 500,001 times or more in the array. Specify an

> algorithm to determine x."

>
> The assumption being extra points for efficiency.
>

> I initially came up with a linear space, linear time solution. With
> some prompting and hints from the interviewer we worked our way to a
> smaller constant space and linear time algorithm. That being
> basically:
>

> int findmode(int* p, int n)
> {

> int count[32];
> for(int i = 0; i < 32; i++)
> count[i] = 0;
>
> for (int i = 0; i < n; i++)
> for (int j = 0; j < 32; j++)
> if (p[i] & (1 << j)) // if bit j is on
> count[j]++;
> else
> count[j]--;
>
> int x = 0;
> for (int i = 0; i < 32; i++)
> if (count[i] > 0)
> x = x | (1 << i);
>
> return x;
> }
>
> The idea here is to count the frequency of each of the 32 bits in the

> array separately, knowing that these bit counts will be dominated by
> the mode.
>

> The interviewer already knew the answer, so I can only assume the test
> was based on how much help he had to give me to arrive at it.

Well he certainly wasn't trying to use you as cheap labour to solve his
latest coding problem, that's for sure :) It's a rare interviewer who
asks questions to which he doesn't already know the answer...

> My question about this is as follows. I, perhaps boldly, claim to
> employers to have a "masters-equivilant" experience/knowledge of
> computer science and math. Should I have been able to come up with
> this solution without prompting or help?

Not necessarily - the purpose of these things isn't always to see
whether you can immediately get the right answer. Often they really want
to see how you tackle the problem.

On a separate note, though, it's not entirely clear to me what a
"masters-equivalent experience/knowledge of computer science and math"
entails - not because I question in any way your experience/knowledge,
but because not all masters degrees are equal (and not all of them are
trying to teach the same thing). Furthermore, people with equal degrees
are not in general equally good at what they do - and there's not even
necessarily a direct correlation between what marks people get in their
degree exams and how competent they are in a work situation. So I
understand what you're getting at, but I'm not sure that strictly
speaking it's a meaningful claim (if it impresses a potential employer,
perhaps you should worry about them).

> Would you expect someone with a CompSci Masters or PhD from some major
> ivy league university to be able to come up with this solution without
> help?

I don't think having a graduate degree per se makes any difference to
your ability to solve coding puzzles like this. (I'm currently doing a
DPhil at Oxford, and whilst it's made a big difference to a number of
skills I have, I don't think it's particularly helped me with puzzles
like this - some I can see through, some I can't, much as it always used
to be.) On the other hand, if you've done such a degree then you
probably have more of an interest in coding/maths than many people, so
it's not entirely unreasonable to ask you puzzle questions like this to
see how your mind works. (Being able to solve them isn't always the
point - it's sometimes how you go about trying that they're interested in.)

> If so in what course or from which book would you expect to learn the
> required knowledge to come up with the above solution?

No one course/book could teach you how to solve all the possible puzzles
they might throw at you - they're trying to see how your mind
works/evaluate your all-round knowledge and lateral thinking. It's
actually good that you can't just get a book of puzzles and learn
everything in it - that would defeat the point of the test.

> Is the skill to be able to come up with such an algorithm something
> that is learned by studying lots of algorithms and being able to mix
> and match the "tricks"? If so, what is a similar commonly known
> algorithm(s) on which the above solution could have been based?

I think some people are naturally better at it than others (some of the
guys I met at the British Informatics Olympiad some years ago were
scarily good at algorithm design, for example), but it is something
that's helped by seeing lots of algorithms, certainly.

> Or, is the ability to invent such a solution simply a matter of IQ?
> Some people have the talent to solve the puzzle, see the pattern and
> come up with the solution - and some just don't?

It's not IQ (whatever that measures in practice) per se - it's partly
having a particular skill set, e.g. being a lateral thinker etc., and
partly having seen lots of similar things before. Some people are better
at it - and you want some of those people on your team (one of the guys
I work with is good like this, always coming up with algorithms for
things). On the other hand, not all of them will be as strong in other
areas, so you probably wouldn't want the ability to solve algorithmic
puzzles to be your only selection criterion; nor would you want to rule
out people who were excellent in other areas because solving this sort
of puzzle isn't really their thing. It's important to recruit a balanced
team.

Cheers,
Stuart

> Thanks,
> Andrew.

James Kanze

unread,

Nov 23, 2009, 5:27:34 AM11/23/09

to

On Nov 22, 4:51 pm, Chip Eastham <hardm...@gmail.com> wrote:
> On Nov 21, 7:04 pm, Andrew Tomazos <and...@tomazos.com> wrote:
> > On Nov 22, 1:01 am, Andrew Tomazos <and...@tomazos.com> wrote:

> > > On Nov 21, 6:29 pm, gwo...@figipc78.tu-graz.ac.at (GJ Woeginger)
> > > wrote:

> > > > In comp.theory Andrew Tomazos <and...@tomazos.com> wrote:
> > > > > I was posed the following question in a technical

> > > > > interview for a Software Engineering position by a
> > > > > major multinational NASDAQ company:

> > > > > [Paraphrasing] "You are given an array of 1,000,000
> > > > > 32-bit integers. One int value x occurs 500,001 times
> > > > > or more in the array. Specify an algorithm to
> > > > > determine x."

> > > }

> > Oh wait... the (c==0) counts as a comparison. I see it now.
> > Ignore me.

> I don't think the (c==0) counts as a comparison.

Of course it does. Why wouldn't it?

> I think the other k-1 comparisons are for the 2nd
> part of the algorithm, when you make one more pass
> through the array verifying the candidate found in
> the 1st part. [You already know the index i for
> one occurrence of the candidate, so only the other
> k-1 indices have to be checked.]

What second part? There is no second pass.

--
James Kanze

GJ Woeginger

unread,

Nov 23, 2009, 5:34:20 AM11/23/09

to

In comp.theory Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
# gwo...@figipc78.tu-graz.ac.at (GJ Woeginger) writes:
#
# > There is an old analysis of this problem by Fischer and Salzberg.
# > M.J. Fisher and S.L. Salzberg (1982)
# > Finding a majority among n votes.
# > Journal of Algorithms 3, pp 143-152.
#
# I can't track this down. Have I got the right Journal of Algorithms?
#
# http://www.informatik.uni-trier.de/~ley/db/journals/jal/jal3.html
#
# does not list that paper. Of course, this table of contents might be
# wrong. The horrendous link:
#
# http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%236839%231982%23999969995%23518142%23FLA%23&_cdi=6839&_pubType=J&_auth=y&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=2c8bc9013d6025d08b7d24d3207dfe18
#
# appears to be for the journal itself. The paper does not appear
# there, either.

I gave the wrong page numbers. It should be

M.J. Fisher and S.L. Salzberg (1982)
Finding a majority among n votes.

Journal of Algorithms 3, pp 375-379

And it is not an independent paper, but the discussion of
problem 81-5 in the problems column of Leo Guibas.

I also messed up the statement of the main result of Fisher and Salzberg:
The 3k-2 comparisons are for the problem where you have to decide
whether there exists some majority element.

--Gerhard

# > If 2k elements contain a majority element (= an element that occurs at
# > least k+1 times), then you can find it with 3k-2 element comparisons
# > (and some small overhead). The basic idea in their algorithm is that
# > whenever you find two distinct elements, then you can destroy both without
# > changing the majority element among the remaining elements. This yields:
# >
# > Run once through the array, and maintain a majority-candidate and a counter.
# > The majority-candidate is initialized as the first element, and
# > the counter (counting how often you have seen the candidate) is
# > initialized at 1.
# > If you hit the current candidate again, increment the counter.
# > If you hit another element, decrement the counter.
# > If the counter becomes 0, drop the current candidate and start from
# > scratch with the remaining part of the array.
# >
# > Fischer and Salzberg also show that this bound 3k-2 is best possible in
# > the worst case (and that's the main part of their paper).
#

___________________________________________________________
Gerhard J. Woeginger http://www.win.tue.nl/~gwoegi/

Ben Bacarisse

unread,

Nov 23, 2009, 8:37:49 AM11/23/09

to

gwo...@figipc78.tu-graz.ac.at (GJ Woeginger) writes:

Ah! I can't get access to the text (no library anymore) so I could
not find it. I thought it might be in the "Problems" column, but the
page numbers did stop me checking that.

[Rant: Is it reasonable to ask $20 for a PDF of a 27 year old paper?
It would cost me that to get to a library, so I suppose so. I'd be
happier if the authors, editors and reviewers got a cut of that, but I
know they won't.]

> I also messed up the statement of the main result of Fisher and Salzberg:
> The 3k-2 comparisons are for the problem where you have to decide
> whether there exists some majority element.

Ah (again). I now get it. Their algorithm has two phases. What you
describe is phase one. In the particular case where we know (from the
definition of the problem) that there *is* a majority we can stop
there. Because we don't need their phase two, there is no need for
the list they maintain in phase one.

I think their original problem is more interesting. Determining if
there is a majority at all (and what that majority element is) is more
subtle than the rather fake 500,001 of 1,000,000 element problem.

When you know a majority exists, you can find it in n-1 candidate
comparisons (that is, excluding the housekeeping tests like count ==
0). This is less than the 3n/2 - 2 (= 3k - 2 in the posted interview
problem) required if it is not known that there is a majority.

<snip>
--
Ben.

jbriggs444

unread,

Nov 23, 2009, 8:47:24 AM11/23/09

to

Because the statement in question was:

"[...] then you can find it with 3k-2 element comparisons"

Index comparisons and element comparisons are not at all the same
thing when one generalizes the problem away from arrays of 32 bit
values to arrays of arbitrary abstract values.

Bruce Wheeler

unread,

Nov 23, 2009, 11:20:15 AM11/23/09

to

Note: it is Fischer (sp) and Salzberg.

Doing a Google search on fischer salzberg majority produced the
following, a scanned version of the paper.

www.cs.yale.edu/publications/techreports/tr252.pdf

Regards,
Bruce Wheeler

Bruce Wheeler

unread,

Nov 23, 2009, 11:28:09 AM11/23/09

to

I see that this is a reference to the technical report you (Ben
Bacarisse) mentioned a few messages before, so you can ignore my
previous post (but note the spelling).

Regards,
Bruce Wheeler

Balog Pal

unread,

Nov 23, 2009, 12:55:32 PM11/23/09

to

"Richard Harter" <c...@tiac.net>
> One point of using puzzles in interviews is that it tests the
> preparedness of interviewees. These things are chestnuts. If
> you are serious about getting the job, you go through the
> chestnuts in advance. It's similar to finding what a company
> does before you interview with it.

Joel Spolsky had a bunch of great articles on interviews for sw developers.
They include stuff for the trivia-based form. His opinion matches mine:
that is as crappy as it is widespread. :-(

The point would be to find people to fit certain job/work. Going for trivia
will eliminate many good workers, and get you those with irrelevant
knowledge or luck.

Also it will make the company look dumb and incompetent in the eyes of an
ACE candidate if you happened to stumble on one...

Well, that's for general -- the particular story said here may not be in the
category, as the asker was about to grant hints. That may suggest he was not
after the trivia answer, but was seeking a topic the looks "new" to the
candidate, but he can wolk on the given hints. If the question appeared
in the "known" category he would pull something other without going deep.

What suggests go ahead to learn many trivia answers is not very productive.
Unless you are so eager to find a WTF job at a WTF company that is... ;-/

> The hope of the interviewer is that you've never seen that kind
> of puzzle before and they get to see how agile you are on your
> mental feet. Speaking from the interviewers side of the fence
> those sort of tests are mostly good for checking out bright young
> puppies who are short on hard experience. When one is dealing
> with more experienced people the important thing is to discover
> if they know what they are supposed to know and how alive their
> intellectual curiosity is. Oh yes, the most important thing to
> discover whether they are BS artists giving you a snow job. BS
> artists are deadly in the technical world.

Indeed, too bad so many of them are put in the interviewer role too. Keeping
the Parkinson schema working without much chance for a counter.

Paul E. Black

unread,

Nov 23, 2009, 1:39:40 PM11/23/09

to

On Saturday 21 November 2009 11:12, Andrew Tomazos wrote:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."
>

> ... [PEB]

>
> The interviewer already knew the answer, so I can only assume the test
> was based on how much help he had to give me to arrive at it.

I often ask questions to understand an applicant's approach, not for
their specific knowledge. (I asked one candidate to write a
quicksort, and he replied he would look it up. I was not impressed.)

> Would you expect someone with a CompSci Masters or PhD from some major
> ivy league university to be able to come up with this solution without
> help?
>

> If so in what course or from which book would you expect to learn the
> required knowledge to come up with the above solution?
>

> Is the skill to be able to come up with such an algorithm something
> that is learned by studying lots of algorithms and being able to mix
> and match the "tricks"?

I took an algorithms or analysis of algorithms class. We solved a ton
of problems like "design an O(n^2) algorithm for ..." (or n log n or
whatever). After a semester of this, we were pretty good at coming up
with stuff. Note: just "knowing" the complexity of a solution went a
long way as a hint to how it might be solved.

So yes, studying how to come up with algorithms should help people in
similar circumstances.

Did *I* solve the problem? hmm, uh, its been a coupla' decades, and
I'm, uh, kinda' busy ...

-paul-
--
Paul E. Black (p.b...@acm.org)

James Kanze

unread,

Nov 23, 2009, 5:19:33 PM11/23/09

to

On Nov 23, 6:55 pm, "Balog Pal" <p...@lib.hu> wrote:
> "Richard Harter" <c...@tiac.net>

> > One point of using puzzles in interviews is that it tests
> > the preparedness of interviewees. These things are
> > chestnuts. If you are serious about getting the job, you go
> > through the chestnuts in advance. It's similar to finding
> > what a company does before you interview with it.

> Joel Spolsky had a bunch of great articles on interviews for
> sw developers. They include stuff for the trivia-based form.
> His opinion matches mine: that is as crappy as it is
> widespread. :-(

> The point would be to find people to fit certain job/work.
> Going for trivia will eliminate many good workers, and get you
> those with irrelevant knowledge or luck.

It depends on what you consider a good answer. Of the answers
I've seen posted here, the only one I would consider acceptable
(but I wouldn't ask such a question) was the one which said to
use one of the known and established linear algorithms to find
the median; at least in the environements I've worked in
previously, any of the other answers are just mental
masturbation and make work.

> Also it will make the company look dumb and incompetent in the
> eyes of an ACE candidate if you happened to stumble on one...

> Well, that's for general -- the particular story said here may
> not be in the category, as the asker was about to grant hints.
> That may suggest he was not after the trivia answer, but was
> seeking a topic the looks "new" to the candidate, but he can
> wolk on the given hints. If the question appeared in the
> "known" category he would pull something other without going
> deep.

That may be the key. If the question was just meant to provoke
discussion, in order to find out how the candidate thinks, and
expresses his ideas, then fine.

--
James Kanze

tc...@lsa.umich.edu

unread,

Nov 23, 2009, 9:27:42 PM11/23/09

to

In article <heekp...@news1.newsguy.com>,

Paul E. Black <p.b...@acm.org> wrote:
>I often ask questions to understand an applicant's approach, not for
>their specific knowledge. (I asked one candidate to write a
>quicksort, and he replied he would look it up. I was not impressed.)

To be fair to the candidate, your question was a poor choice if your
intention was to "understand an applicant's approach, not for their
specific knowledge." Did you really ask for *quicksort* specifically?
If so, how is that *not* testing for specific knowledge? In real life,
if for some reason I really needed to write a *quicksort*, guess what?
I would look it up. Not in a book, of course; I'd find a library routine
that somebody had already written. Why reinvent the wheel?
--
Tim Chow tchow-at-alum-dot-mit-dot-edu
The range of our projectiles---even ... the artillery---however great, will
never exceed four of those miles of which as many thousand separate us from
the center of the earth. ---Galileo, Dialogues Concerning Two New Sciences

bill

unread,

Nov 24, 2009, 2:06:32 AM11/24/09

to

On Nov 21, 8:12 am, Andrew Tomazos <and...@tomazos.com> wrote:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."
>

> The assumption being extra points for efficiency.
>

> I initially came up with a linear space, linear time solution. With
> some prompting and hints from the interviewer we worked our way to a
> smaller constant space and linear time algorithm. That being
> basically:
>

> int findmode(int* p, int n)
> {

> int count[32];
> for(int i = 0; i < 32; i++)
> count[i] = 0;
>
> for (int i = 0; i < n; i++)
> for (int j = 0; j < 32; j++)
> if (p[i] & (1 << j)) // if bit j is on
> count[j]++;
> else
> count[j]--;
>
> int x = 0;
> for (int i = 0; i < 32; i++)
> if (count[i] > 0)
> x = x | (1 << i);
>
> return x;
>
> }
>
> The idea here is to count the frequency of each of the 32 bits in the
> array separately, knowing that these bit counts will be dominated by
> the mode.
>

> The interviewer already knew the answer, so I can only assume the test
> was based on how much help he had to give me to arrive at it.
>

> My question about this is as follows. I, perhaps boldly, claim to
> employers to have a "masters-equivilant" experience/knowledge of

> computer science and math. Should I have been able to come up with
> this solution without prompting or help?

>
> Would you expect someone with a CompSci Masters or PhD from some major
> ivy league university to be able to come up with this solution without
> help?
>
> If so in what course or from which book would you expect to learn the
> required knowledge to come up with the above solution?
>
> Is the skill to be able to come up with such an algorithm something
> that is learned by studying lots of algorithms and being able to mix

> and match the "tricks"? If so, what is a similar commonly known
> algorithm(s) on which the above solution could have been based?
>

> Or, is the ability to invent such a solution simply a matter of IQ?
> Some people have the talent to solve the puzzle, see the pattern and
> come up with the solution - and some just don't?
>

> Thanks,
> Andrew.

Compare the number of 1's and 0's in each bit position.
Which ever bit is in the majority is the bit in 'x' for
that bit position.

Creating an appropriate algorithm might be considerably
more difficult than I suspect! But the approach works if
x is something like 01111111111111111111111111111111.

regards, Bill J.

mohangupta13

unread,

Nov 24, 2009, 12:22:58 PM11/24/09

to

On Nov 23, 2:52 am, mjc <mjco...@acm.org> wrote:
> I was also given this problem as part of an interview quiz. My answer
> was: Use one of the known O(n) median algorithms and use that value.

How do you use median algorithm ( as i get it .those refer to the
selection algorithm ,like (n+1)/2 th order statistic) to solve the
above problem??
Mohan

mohangupta13

unread,

Nov 24, 2009, 12:26:35 PM11/24/09

to

Why should it count? Its just a simple instruction like c++ or c-- ,
it should count
As comparison should refer to the comparison of input entities (with
their satellite data's) .
>

> > I think the other k-1 comparisons are for the 2nd
> > part of the algorithm, when you make one more pass
> > through the array verifying the candidate found in
> > the 1st part. [You already know the index i for
> > one occurrence of the candidate, so only the other
> > k-1 indices have to be checked.]

But I still don't get how is it 3k -2 comparisons i just see a single
loop (of n) in which comparison is done just once.
Mohan

Paul E. Black

unread,

Nov 24, 2009, 12:54:13 PM11/24/09

to

On Monday 23 November 2009 21:27, tc...@lsa.umich.edu wrote:

> In article <heekp...@news1.newsguy.com>,
> Paul E. Black <p.b...@acm.org> wrote:
>>I often ask questions to understand an applicant's approach, not for
>>their specific knowledge. (I asked one candidate to write a
>>quicksort, and he replied he would look it up. I was not impressed.)
>
> To be fair to the candidate, your question was a poor choice if your
> intention was to "understand an applicant's approach, not for their
> specific knowledge." Did you really ask for *quicksort*
> specifically?

I think so.

> If so, how is that *not* testing for specific knowledge?

Because I didn't really care if they could write quicksort off the top
of their head. I wanted to see if they would work with me or try to
fake it. If they said, I don't remember, but I could look it up, I'd
help them (or switch to simpler algorithm). I'd typically guide them
through whatever they were writing. Did they think about correctness?
Did they talk about testing or noting (documenting) what they were
doing? What language should it be in? Did they think about
alternatives? The type of input parameters? Should the input
parameters be checked? Will they ask if they *don't* know something,
or will they just guess and write garbage?

It was the candidate's attitude of "I'm so far beyond that that I
don't even bother remembering such trivia." When I tried to push to
see what he *did* know, I could not get a feel of engagement.

> In real life,
> if for some reason I really needed to write a *quicksort*, guess what?
> I would look it up. Not in a book, of course; I'd find a library routine
> that somebody had already written. Why reinvent the wheel?

Excellent point. And I respect that answer.

Ben Bacarisse

unread,

Nov 24, 2009, 1:07:49 PM11/24/09

to

mohangupta13 <mohang...@gmail.com> writes:
<snip>

> But I still don't get how is it 3k -2 comparisons i just see a single
> loop (of n) in which comparison is done just once.

That bound was quoted in error. It is for a closely related but
sightly different algorithm. The algorithm given here uses n-1
element comparisons.

The paper from which the 3k - 2 comes describes a method to determine
*if* there is a majority and, if so, what the majority element is.
That needs 3n/2 - 2 element comparisons, but since we know there a
majority, we can stop after what the authors call phase one:
http://www.cs.yale.edu/publications/techreports/tr252.pdf

A good overview of similar problems is given here:
http://www2.research.att.com/~marioh/papers/cacm09.pdf

--
Ben.

Ben Bacarisse

unread,

Nov 24, 2009, 1:10:15 PM11/24/09

to

mohangupta13 <mohang...@gmail.com> writes:

> On Nov 23, 2:52 am, mjc <mjco...@acm.org> wrote:
>> I was also given this problem as part of an interview quiz. My answer
>> was: Use one of the known O(n) median algorithms and use that value.
> How do you use median algorithm ( as i get it .those refer to the
> selection algorithm ,like (n+1)/2 th order statistic) to solve the
> above problem??

Imagine the 1,000,000 items sorted. How can the middle item be
anything but one of 500,001 that make up the majority?

--
Ben.

mohangupta13

unread,

Nov 24, 2009, 3:39:02 PM11/24/09

to

On Nov 24, 11:10 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

sorry for giving much thought before writing.
Mohan
> --
> Ben.

mike3

unread,

Nov 24, 2009, 8:32:08 PM11/24/09

to

On Nov 21, 10:24 am, p...@informatimago.com (Pascal J. Bourguignon)
wrote:

> Andrew Tomazos <and...@tomazos.com> writes:
> > I was posed the following question in a technical interview for a
> > Software Engineering position by a major multinational NASDAQ company:
>
> > [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> > One int value x occurs 500,001 times or more in the array. Specify an
> > algorithm to determine x."
>
> > The assumption being extra points for efficiency.
>

> > I initially came up with a linear space, linear time solution. With
> > some prompting and hints from the interviewer we worked our way to a
> > smaller constant space and linear time algorithm. That being
> > basically:
>

> > int findmode(int* p, int n)
> > {

> > int count[32];
> > for(int i = 0; i < 32; i++)
> > count[i] = 0;
>
> > for (int i = 0; i < n; i++)
> > for (int j = 0; j < 32; j++)
> > if (p[i] & (1 << j)) // if bit j is on
> > count[j]++;
> > else
> > count[j]--;
>
> > int x = 0;
> > for (int i = 0; i < 32; i++)
> > if (count[i] > 0)
> > x = x | (1 << i);
>
> > return x;
> > }
>
> > The idea here is to count the frequency of each of the 32 bits in the
> > array separately, knowing that these bit counts will be dominated by
> > the mode.
>
> > The interviewer already knew the answer, so I can only assume the test
> > was based on how much help he had to give me to arrive at it.
>
> > My question about this is as follows. I, perhaps boldly, claim to

> > employers to have a "masters-equivalant" experience/knowledge of

> > computer science and math. Should I have been able to come up with
> > this solution without prompting or help?
>

> If what you're asking is whether anybody having a master in CS and
> maths would have been able to come up with this solution in the
> interview time, I guess we can answer that definitely no, otherwise
> there would be no point in trying to select candidates with this test.
>
> Obviously, it's because some people (with or without a master diploma,
> this really isn't relevant) get or don't get it, that this test is
> useful for the recruiter.
>
> Now if you want this kind of jobs, yes you should better be able to
> come up with smart solutions to little puzzles like this in
> interviews.
>

You mean he should be able to come up with them without prompting,
right?

> > Would you expect someone with a CompSci Masters or PhD from some major
> > ivy league university to be able to come up with this solution without
> > help?
>
> > If so in what course or from which book would you expect to learn the
> > required knowledge to come up with the above solution?
>

> Not a single one. You have to develop your knowledge of algorithms,
> mathematics, your symbolic thinking and your imagination in these
> matters.
>

You say not a "single" one, so does that mean no course would do it,
or
you'd need multiple ones? After all, courses and books (multiple) can
be used to increase the first 2 areas (knowledge of algorithms and
mathematics).

> > Is the skill to be able to come up with such an algorithm something
> > that is learned by studying lots of algorithms and being able to mix
> > and match the "tricks"?
>

> That could help yes. I'd tend to think that imagination is the most
> important ingredient here, to be able to come with a possible solution
> fast enough.
>

And how does one develop that? More specifically, the "speed" of
coming
up with the solution, not just being able to come up with one.

> Also, don't limit yourself to CS and maths, there are ideas to be
> taken from other domains too.
>

What would that be?

> > If so, what is a similar commonly known
> > algorithm(s) on which the above solution could have been based?
>
> > Or, is the ability to invent such a solution simply a matter of IQ?
>

> CS IQ, yes, if such a IQ test exists.
>

Heh.

> > Some people have the talent to solve the puzzle, see the pattern and
> > come up with the solution - and some just don't?
>

> It seems so. At least, in a given time.
>

Hmm.

mike3

unread,

Nov 24, 2009, 8:34:24 PM11/24/09

to

On Nov 24, 6:32 pm, mike3 <mike4...@yahoo.com> wrote:
> On Nov 21, 10:24 am, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:

<snip>

> > Not a single one. You have to develop your knowledge of algorithms,
> > mathematics, your symbolic thinking and your imagination in these
> > matters.
>
> You say not a "single" one, so does that mean no course would do it,
> or
> you'd need multiple ones? After all, courses and books (multiple) can
> be used to increase the first 2 areas (knowledge of algorithms and
> mathematics).
>

<snip>

Oh yes, and I forgot to add: how does one train the "symbolic"
thinking?

Richard Heathfield

unread,

Nov 25, 2009, 1:52:03 AM11/25/09

to

In
<805a7a21-f99b-4eb9...@k19g2000yqc.googlegroups.com>,
mike3 wrote:

<snip>

> Oh yes, and I forgot to add: how does one train the "symbolic"
> thinking?

Step 1: obtain a pointy stick.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

John W. Krahn

unread,

Nov 25, 2009, 2:06:52 AM11/25/09

to

Richard Heathfield wrote:
> In
> <805a7a21-f99b-4eb9...@k19g2000yqc.googlegroups.com>,
> mike3 wrote:
>
> <snip>
>
>> Oh yes, and I forgot to add: how does one train the "symbolic"
>> thinking?
>
> Step 1: obtain a pointy stick.

Or some fresh fruit.

John
--
The programmer is fighting against the two most
destructive forces in the universe: entropy and
human stupidity. -- Damian Conway

mike3

unread,

Nov 25, 2009, 2:11:14 AM11/25/09

to

On Nov 25, 12:06 am, "John W. Krahn" <some...@example.com> wrote:
> Richard Heathfield wrote:
> > In
> > <805a7a21-f99b-4eb9-abb0-fbde47cb9...@k19g2000yqc.googlegroups.com>,

> > mike3 wrote:
>
> > <snip>
>
> >> Oh yes, and I forgot to add: how does one train the "symbolic"
> >> thinking?
>
> > Step 1: obtain a pointy stick.
>
> Or some fresh fruit.
>

What's the point here?

Pascal J. Bourguignon

unread,

Nov 25, 2009, 4:32:15 AM11/25/09

to

mike3 <mike...@yahoo.com> writes:

Put yourself in the place of the employer! Between a guy who will
find the solution with two clues, and one who will find it alone, in
the same time, who will you hire?

What I'm saying here is tautologic: if you want the job that is
selected by this filter, then you better have to pass this filter!

(It's up to the employer to ensure that the filter matches the
requirements of the job).

>> > Would you expect someone with a CompSci Masters or PhD from some major
>> > ivy league university to be able to come up with this solution without
>> > help?
>>
>> > If so in what course or from which book would you expect to learn the
>> > required knowledge to come up with the above solution?
>>
>> Not a single one. �You have to develop your knowledge of algorithms,
>> mathematics, your symbolic thinking and your imagination in these
>> matters.
>>
>
> You say not a "single" one, so does that mean no course would do it,
> or
> you'd need multiple ones? After all, courses and books (multiple) can
> be used to increase the first 2 areas (knowledge of algorithms and
> mathematics).

The later of course.

>> > Is the skill to be able to come up with such an algorithm something
>> > that is learned by studying lots of algorithms and being able to mix
>> > and match the "tricks"? �
>>
>> That could help yes. �I'd tend to think that imagination is the most
>> important ingredient here, to be able to come with a possible solution
>> fast enough.
>>
>
> And how does one develop that? More specifically, the "speed" of
> coming up with the solution, not just being able to come up with
> one.

Learning, training, repeatition. Some say the brain is like a muscle,
you have to train it.

Basically, it is a kind of "intelligence", which you want to improve.

>> Also, don't limit yourself to CS and maths, there are ideas to be
>> taken from other domains too.
>>
>
> What would that be?

We cannot know in advance. The point here is that between somebody
who spends 100% of his time learning CS, and somebody who spends 90%
of his time learning CS, and 10% of his time 'learning' something
else, the later will probably be able to invent algorithms that the
first one won't.

For example, in the domain of artificial intelligence, it is
worthwhile to have some idea of neurobiology...

>> > If so, what is a similar commonly known
>> > algorithm(s) on which the above solution could have been based?
>>
>> > Or, is the ability to invent such a solution simply a matter of IQ?
>>
>> CS IQ, yes, if such a IQ test exists.
>>
>
> Heh.
>
>> > Some people have the talent to solve the puzzle, see the pattern and
>> > come up with the solution - and some just don't?
>>
>> It seems so. �At least, in a given time.
>>
>
> Hmm.

--
__Pascal Bourguignon__

Pascal J. Bourguignon

unread,

Nov 25, 2009, 4:33:35 AM11/25/09

to

mike3 <mike...@yahoo.com> writes:

http://www.leftinthedark.org.uk/

--
__Pascal Bourguignon__

Pascal J. Bourguignon

unread,

Nov 25, 2009, 4:37:08 AM11/25/09

to

mike3 <mike...@yahoo.com> writes:

I don't have a statistically valid study, but I observed a few people
who didn't play with lego (or at least mecano) as a child had more
difficulty in building programs.

--
__Pascal Bourguignon__

red floyd

unread,

Nov 25, 2009, 12:00:54 PM11/25/09

to

On Nov 24, 11:06 pm, "John W. Krahn" <some...@example.com> wrote:
> Richard Heathfield wrote:
> > In

> > <805a7a21-f99b-4eb9-abb0-fbde47cb9...@k19g2000yqc.googlegroups.com>,

> > mike3 wrote:
>
> > <snip>
>
> >> Oh yes, and I forgot to add: how does one train the "symbolic"
> >> thinking?
>
> > Step 1: obtain a pointy stick.
>
> Or some fresh fruit.
>

Or a Bengal Tiger.

Andy Champ

unread,

Nov 25, 2009, 3:19:08 PM11/25/09

to

Pascal J. Bourguignon wrote:
>
> I don't have a statistically valid study, but I observed a few people
> who didn't play with lego (or at least mecano) as a child had more
> difficulty in building programs.
>

Cause or effect? (do people who make food engineers like to play with
Lego and Meccano - or does playing with them make you an engineer?)

It occurs to me though that if an interviewer asked me to write a sort
I'd probably ask why. There are lots of standard sorts. The
interesting part of a high performance sort is getting the data in and
out, making best use of memory and CPU, things like that. If the data
is in memory it's a trivial problem - because it was solved years ago.

Although it does occur to me that if I have enough CPUs it might be
worth partitioning the work into small sorts then merges...

Andy

bartc

unread,

Nov 25, 2009, 3:29:32 PM11/25/09

to

"Andrew Tomazos" <and...@tomazos.com> wrote in message
news:6a8340b8-19b0-4fda...@m26g2000yqb.googlegroups.com...

>I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."

How was it determined that one value occurs 500001 or more times?

Perhaps more attention should have been paid to x at the time the 500001
figure was calculated...

I quite liked Malcolm's idea of taking a small but random sample, choosing a
handful of likely candidates, then counting occurrences of those. If no
counts reach 500001, then repeat a few more times or until you know you've
been given dud information.

--
Bartc

mike3

unread,

Nov 25, 2009, 10:28:30 PM11/25/09

to

On Nov 25, 2:33 am, p...@informatimago.com (Pascal J. Bourguignon)
wrote:

So what's the pointy stick for, then?

mike3

unread,

Nov 25, 2009, 10:36:41 PM11/25/09

to

To get fruit off the trees? What? Hmm.

Nick Keighley

unread,

Nov 27, 2009, 5:05:03 AM11/27/09

to

On 25 Nov, 07:11, mike3 <mike4...@yahoo.com> wrote:
> On Nov 25, 12:06 am, "John W. Krahn" <some...@example.com> wrote:
>
> > Richard Heathfield wrote:
> > > In
> > > <805a7a21-f99b-4eb9-abb0-fbde47cb9...@k19g2000yqc.googlegroups.com>,
> > > mike3 wrote:
>
> > > <snip>
>
> > >> Oh yes, and I forgot to add: how does one train the "symbolic"
> > >> thinking?
>
> > > Step 1: obtain a pointy stick.

I think that's supposed to mean "there's no easy way". Practice maybe.

>
> > Or some fresh fruit.
>
> What's the point here?

google Monty Python

Phil Carmody

unread,

Nov 27, 2009, 10:07:14 PM11/27/09

to

Andrew Tomazos <and...@tomazos.com> writes:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."
>

> employers to have a "masters-equivilant" experience/knowledge of

> computer science and math. Should I have been able to come up with
> this solution without prompting or help?

What solution? The above exhibits undefined behaviour.

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

Andrew Tomazos

unread,

Dec 20, 2009, 6:08:59 PM12/20/09

to

On Nov 28, 4:07 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
wrote:

> > int findmode(int* p, int n)
> > {
> > int count[32];
> > for(int i = 0; i < 32; i++)
> > count[i] = 0;
>
> > for (int i = 0; i < n; i++)
> > for (int j = 0; j < 32; j++)
> > if (p[i] & (1 << j)) // if bit j is on
> > count[j]++;
> > else
> > count[j]--;
>
> > int x = 0;
> > for (int i = 0; i < 32; i++)
> > if (count[i] > 0)
> > x = x | (1 << i);
>
> > return x;
> > }
>
> > The idea here is to count the frequency of each of the 32 bits in the
> > array separately, knowing that these bit counts will be dominated by
> > the mode.
>

> What solution? The above exhibits undefined behaviour.

If you spot a bug you might bother saying what it is. I wrote this
quickly and haven't tested it. Are you referring to the assumption
that int is 32 bit, or is there a typo or some other problem?
-Andrew.

Danio

unread,

Dec 22, 2009, 11:16:09 AM12/22/09

to

I tried this with:
* int p[9] = {4, 4, 4, 5, 6, 7, 5, 5, 4};
* int mode = findmode(p, 9);
and it returns 5.
The bit-counting method correctly returns 4.

It looks to me like you missed Phase 1 from the Fischer&Salzberg
algorithm, namely
arrange the elements so that no two adjacent elements are the same.
http://www.cs.yale.edu/publications/techreports/tr252.pdf

jbriggs444

unread,

Dec 22, 2009, 1:23:57 PM12/22/09

to

> arrange the elements so that no two adjacent elements are the same.http://www.cs.yale.edu/publications/techreports/tr252.pdf- Hide quoted text -
>
> - Show quoted text -

I think you missed the definition of the problem in this thread. It
is one of the givens of the problem that the array has a strict
"majority" value that appears in over half ot its entries.

You gave a 9 member array whose mode appears in less than half of the
elements. That violates the givens of the problem. The algorithm is
required to deliver correct results for input data that is within
spec. It is permitted to engage in undefined behavior for input data
that is out of spec.

However, unless you're going to snipe about implementation defined
precision for "int", I don't see any undefined behavior even for the
spec-violating input that you provided.

You go on to claim that the code quoted above gives an answer of 5 but
that the "bit-counting method" returns 4. What you fail to realize is
that the code above _is_ the bit-counting method
and that it returns 4, not 5.

I actually ran the code. Did you?

$ ./a.out
mode is 4

In summary, in spite of your accusation of ill-defined behavior, in
spite of your providing the algorithm with non-compliant input and in
spite of your claim of incorrect output, the fact of the matter is
that the algorithm got what you consider to be the right answer.

What were you complaining about again?

Danio

unread,

Dec 22, 2009, 4:07:08 PM12/22/09

to

On Dec 22, 6:23 pm, jbriggs444 <jbriggs...@gmail.com> wrote:
> On Dec 22, 11:16 am, Danio <daniothef...@hotmail.com> wrote:
> > On Dec 20, 11:08 pm, Andrew Tomazos <and...@tomazos.com> wrote:
> > > On Nov 28, 4:07 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
> > > wrote:
> > > > What solution? The above exhibits undefined behaviour.
>
> > > If you spot a bug you might bother saying what it is. I wrote this
> > > quickly and haven't tested it. Are you referring to the assumption
> > > that int is 32 bit, or is there a typo or some other problem?
> > > -Andrew.
>

> > It looks to me like you missed Phase 1 from the Fischer&Salzberg
> > algorithm, namely

> > arrange the elements so that no two adjacent elements are the same.http://www.cs.yale.edu/publications/techreports/tr252.pdf-Hide quoted text -

>
> > - Show quoted text -
>
> I think you missed the definition of the problem in this thread. It
> is one of the givens of the problem that the array has a strict
> "majority" value that appears in over half ot its entries.

Sorry my mistake - strict majority truned into modal value in my head
in between the original description and the latter discussions in this
thread.

> You go on to claim that the code quoted above gives an answer of 5 but
> that the "bit-counting method" returns 4. What you fail to realize is
> that the code above _is_ the bit-counting method
> and that it returns 4, not 5.

Again a mis-reading of Phil Carmody's quote - assumed that he would be
quoting
the most recent code posting of Andrew's not the initial one.

> I actually ran the code. Did you?

Yes

> In summary, in spite of your accusation of ill-defined behavior, in
> spite of your providing the algorithm with non-compliant input and in
> spite of your claim of incorrect output, the fact of the matter is
> that the algorithm got what you consider to be the right answer.

Well it wasn't my accusation of ill-defined behaviour: that was Phil
Carmody.
I was merely trying to find out what he didn't like about the
algorithm,
unfortunately with a horribly flawed test case and the wrong
algorithm...

Ostap S. B. M. Bender Jr.

unread,

Dec 22, 2009, 9:48:39 PM12/22/09

to

On Nov 21, 8:12 am, Andrew Tomazos <and...@tomazos.com> wrote:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."
>
> The assumption being extra points for efficiency.
>
> I initially came up with a linear space, linear time solution. With
> some prompting and hints from the interviewer we worked our way to a
> smaller constant space and linear time algorithm. That being
> basically:
>

> int findmode(int* p, int n)
> {
> int count[32];
> for(int i = 0; i < 32; i++)
> count[i] = 0;
>
> for (int i = 0; i < n; i++)
> for (int j = 0; j < 32; j++)
> if (p[i] & (1 << j)) // if bit j is on
> count[j]++;
> else
> count[j]--;
>
> int x = 0;
> for (int i = 0; i < 32; i++)
> if (count[i] > 0)
> x = x | (1 << i);
>
> return x;
>
> }
>
> The idea here is to count the frequency of each of the 32 bits in the
> array separately, knowing that these bit counts will be dominated by
> the mode.
>

> The interviewer already knew the answer, so I can only assume the test
> was based on how much help he had to give me to arrive at it.
>
> My question about this is as follows. I, perhaps boldly, claim to
> employers to have a "masters-equivilant" experience/knowledge of
> computer science and math. Should I have been able to come up with
> this solution without prompting or help?
>

> Would you expect someone with a CompSci Masters or PhD from some major
> ivy league university to be able to come up with this solution without
> help?
>
> If so in what course or from which book would you expect to learn the
> required knowledge to come up with the above solution?
>

> Is the skill to be able to come up with such an algorithm something
> that is learned by studying lots of algorithms and being able to mix

> and match the "tricks"? If so, what is a similar commonly known

> algorithm(s) on which the above solution could have been based?
>
> Or, is the ability to invent such a solution simply a matter of IQ?

> Some people have the talent to solve the puzzle, see the pattern and
> come up with the solution - and some just don't?
>

How badly would the following simple algorithm work:

Look at the first column. Determine which digit occurs more - 0 or 1 -
among the 1,000,000 integers, and mark those integers, that don't have
this digit, as "R" (for "rejects")

Then go to the second column. Determine which digit occurs more - 0 or
1 - among the integers not marked "R", and mark those integers, that
don't have this digit, as "R".

.
.
.

Then go to the i'th column. Determine which digit occurs more - 0 or 1
- among the integers not marked "R", and mark those integers, that
don't have this digit, as "R".

At the end, return any integer not marked by "R".

Andrew Tomazos

unread,

Dec 31, 2009, 9:38:46 PM12/31/09

to

On Dec 23 2009, 3:48 am, "Ostap S. B. M. Bender Jr."

This requires linear O(n) space to store these reject marks. The two
previously discussed algorithms only require constant O(1) space.
-Andrew.

Ostap S. B. M. Bender Jr.

unread,

Dec 31, 2009, 10:23:59 PM12/31/09

to

Do you mean this:

int findmode(int* p, int n)
{
int count[32];
for(int i = 0; i < 32; i++)
count[i] = 0;

for (int i = 0; i < n; i++)
for (int j = 0; j < 32; j++)
if (p[i] & (1 << j)) // if bit j is on
count[j]++;
else
count[j]--;

int x = 0;
for (int i = 0; i < 32; i++)
if (count[i] > 0)
x = x | (1 << i);

return x;

}

Doesn't count[j] require O(log n) space?

But of course O(log n) is still better than O(n)

Andrew Tomazos

unread,

Jan 12, 2010, 10:39:48 AM1/12/10

to

On Jan 1, 4:23 am, "Ostap S. B. M. Bender Jr."

Ummm, I'm not sure whether you are reffering to the number of ints
that make up the count array (32) or the size of the individual ints
themselves. The number of counts required is always 32 independant of
n, therefore that part is constant.

In the counting algorithm the size of each of the 32 ints has to be
large enough to hold a value of n, and so requires log2(n) bits. This
is also true of the Fisher algorithm which must maintain a similiar
count of maximum value n, and therefore also needs log2(n) bits.
These were generally referred to as constant space solutions, but I am
confused right now whether that is true anymore, and whether these two
solutions are actually technically O(log(n)) space. I remember
hearing something about there being an implied log operation on space
complexity in some context, but I'm not sure if I'm conflating this
with something else.
-Andrew.

websnarf

unread,

Jan 28, 2010, 12:18:38 AM1/28/10

to

Andrew Tomazos <and...@tomazos.com> wrote:
> I was posed the following question in a technical interview for a
> Software Engineering position by a major multinational NASDAQ company:
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."

I missed your original post, but dug up the following that might be of
interest:

http://www.cs.utexas.edu/users/moore/best-ideas/mjrty/index.html

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

FÖLDY Lajos

unread,

Jan 28, 2010, 6:29:21 AM1/28/10

to

On Wed, 27 Jan 2010, websnarf wrote:

> I missed your original post, but dug up the following that might be of
> interest:
>
> http://www.cs.utexas.edu/users/moore/best-ideas/mjrty/index.html

Probably I miss something, but this algorithm gives D as the majority
element in the sequence AAAABCBCD. ???

regards,
lajos

Rob Johnson

unread,

Jan 28, 2010, 7:15:37 AM1/28/10

to

In article <Pine.LNX.4.64.10...@lxserv1.kfki.hu>,

The majority element has to account for more than 50% of the sample.
In the example above, there is no majority element (A is 4/9 of the
sample), so the algorithm does not guarantee anything.

Rob Johnson <r...@trash.whim.org>
take out the trash before replying
to view any ASCII art, display article in a monospaced font

Ilmari Karonen

unread,

Jan 28, 2010, 7:45:59 AM1/28/10

to

["Followup-To:" header set to sci.math.]

Like the original puzzle that started this thread, the linked paper is
using "majority element" to mean an element that occurs more than n/2
times in a list of n elements. By that definition, your list does not
have a majority element.

--
Ilmari Karonen
To reply by e-mail, please replace ".invalid" with ".net" in address.

Mathew Francis

unread,

Mar 3, 2010, 4:08:28 AM3/3/10

to

If the question is "one value occurs 500,001 times or more and the rest of
the values occur exactly once in the array", then it can be done by
simply checking for a value that occurs in two consecutive positions.

On Wed, 27 Jan 2010, websnarf wrote:

pete

unread,

Mar 3, 2010, 6:32:54 AM3/3/10

to

Mathew Francis wrote:
>
> If the question is "one value occurs 500,001
> times or more and the rest of
> the values occur exactly once in the array", then it can be done by
> simply checking for a value that occurs in two consecutive positions.

That seems to be an O(N*N) solution,
while a O(N) solution is claimed in the below URL.

> On Wed, 27 Jan 2010, websnarf wrote:
>
> > Andrew Tomazos <and...@tomazos.com> wrote:
> >> I was posed the following question in a technical interview for a
> >> Software Engineering position by a major
> >> multinational NASDAQ company:
> >>
> >> [Paraphrasing] "You are given an array of 1,000,000
> >> 32-bit integers.
> >> One int value x occurs 500,001 times or more in the array.
> >> Specify an
> >> algorithm to determine x."
> >
> > I missed your original post,
> > but dug up the following that might be of
> > interest:
> >
> > http://www.cs.utexas.edu/users/moore/best-ideas/mjrty/index.html

--
pete

Richard Heathfield

unread,

Mar 3, 2010, 6:55:34 AM3/3/10

to

pete wrote:
> Mathew Francis wrote:
>> If the question is "one value occurs 500,001
>> times or more and the rest of
>> the values occur exactly once in the array", then it can be done by
>> simply checking for a value that occurs in two consecutive positions.
>
> That seems to be an O(N*N) solution,

I don't think so. With Mathew's change to the spec, we have:

#include <stdio.h>

/* handwaving the O(n) array population step */
extern int populate(unsigned long *n, size_t len);

int main(void)
{
int done = 0;
static unsigned long arr[1000000];
size_t max = sizeof arr / sizeof arr[0];
unsigned long lastval;
size_t i;
populate(arr, max);
lastval = arr[0];
for(i = 1; !done && i < max; i++)
{
if(arr[i] == lastval)
{
done = 1;
}
else
{
lastval = arr[i];
}
}
if(done)
{
printf("%lu\n", lastval);
}
else
{
printf("No consecutives found\n");
}
return 0;
}

which looks pretty O(n) to me.

<snip>

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

pete

unread,

Mar 3, 2010, 7:15:03 AM3/3/10

to

You are both correct.
I misread what Mathew Francis had written.
Some people say that reading is harder than writing.

--
pete

Pascal J. Bourguignon

unread,

Mar 3, 2010, 8:29:11 AM3/3/10

to

Richard Heathfield <r...@see.sig.invalid> writes:

It is not said that the other values are without repeatition!

[Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
One int value x occurs 500,001 times or more in the array. Specify an
algorithm to determine x."

So you could have: {1,1,2,2,2,2} with N=6.
and your algorithm wouldn't work correctly.

The minimum number of step possible (the best case), is N/2+1,
if the wanted numbers x are all at the beginning of the vector.

--
__Pascal Bourguignon__
http://www.informatimago.com

Richard Harter

unread,

Mar 3, 2010, 9:21:09 AM3/3/10

to

On Wed, 03 Mar 2010 14:29:11 +0100, p...@informatimago.com (Pascal
J. Bourguignon) wrote:

>Richard Heathfield <r...@see.sig.invalid> writes:
>
>> pete wrote:
>>> Mathew Francis wrote:
>>>> If the question is "one value occurs 500,001
>>>> times or more and the rest of
>>>> the values occur exactly once in the array", then it can be done by
>>>> simply checking for a value that occurs in two consecutive positions.
>>>
>>> That seems to be an O(N*N) solution,
>>
>> I don't think so. With Mathew's change to the spec, we have:

[snip]

>It is not said that the other values are without repeatition!
>
> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
> One int value x occurs 500,001 times or more in the array. Specify an
> algorithm to determine x."

You ignored the beginning of the post that you commented upon.
They aren't discussing the original post with the original
problem. They are discussing the problem stated by Matthew
Francis.

>
>So you could have: {1,1,2,2,2,2} with N=6.
>and your algorithm wouldn't work correctly.

Example does not fit the problem under discussion.

>
>
>The minimum number of step possible (the best case), is N/2+1,
>if the wanted numbers x are all at the beginning of the vector.

In the problem currently being discussed the best case is 1; the
worst case is N/2-1.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
It's not much to ask of the universe that it be fair;
it's not much to ask but it just doesn't happen.

Richard Harter

unread,

Mar 3, 2010, 10:55:50 AM3/3/10

to

On Wed, 03 Mar 2010 14:21:09 GMT, c...@tiac.net (Richard Harter)
wrote:

>
>In the problem currently being discussed the best case is 1; the
>worst case is N/2-1.

Oops, best case is 2.

Pascal J. Bourguignon

unread,

Mar 4, 2010, 4:02:46 AM3/4/10

to

c...@tiac.net (Richard Harter) writes:

> On Wed, 03 Mar 2010 14:29:11 +0100, p...@informatimago.com (Pascal
> J. Bourguignon) wrote:
>
>>Richard Heathfield <r...@see.sig.invalid> writes:
>>
>>> pete wrote:
>>>> Mathew Francis wrote:
>>>>> If the question is "one value occurs 500,001
>>>>> times or more and the rest of
>>>>> the values occur exactly once in the array", then it can be done by
>>>>> simply checking for a value that occurs in two consecutive positions.
>>>>
>>>> That seems to be an O(N*N) solution,
>>>
>>> I don't think so. With Mathew's change to the spec, we have:
> [snip]
>
>>It is not said that the other values are without repeatition!
>>
>> [Paraphrasing] "You are given an array of 1,000,000 32-bit integers.
>> One int value x occurs 500,001 times or more in the array. Specify an
>> algorithm to determine x."
>
> You ignored the beginning of the post that you commented upon.
> They aren't discussing the original post with the original
> problem. They are discussing the problem stated by Matthew
> Francis.

Oops! I must confess I didn't follow closely the thread after the
initial burst. Sorry.

Math/CompSci Interview Question - Thoughts?

Andrew Tomazos

Pascal J. Bourguignon

GJ Woeginger

Richard Harter

A.G.McDowell

Bill Dubuque

Alf P. Steinbach

Axel Vogt

Andrew Tomazos

Andrew Tomazos

Andrew Tomazos

Bill Dubuque

Casey Hawthorne

Richard Harter

Virgil

Willem

Richard Heathfield

Willem

Moi

Andrew Tomazos

Malcolm McLean

Willem

Chip Eastham

Malcolm McLean

Mike Terry

Ben Bacarisse

mjc

Stuart Golodetz

James Kanze

GJ Woeginger

Ben Bacarisse

jbriggs444

Bruce Wheeler

Bruce Wheeler

Balog Pal

Paul E. Black

James Kanze

tc...@lsa.umich.edu

bill

mohangupta13

mohangupta13

Paul E. Black

Ben Bacarisse

Ben Bacarisse

mohangupta13

mike3

mike3

Richard Heathfield

John W. Krahn

mike3

Pascal J. Bourguignon

Pascal J. Bourguignon

Pascal J. Bourguignon

red floyd

Andy Champ

bartc

mike3

mike3

Nick Keighley

Phil Carmody

Andrew Tomazos

Danio

jbriggs444

Danio

Ostap S. B. M. Bender Jr.

Andrew Tomazos

Ostap S. B. M. Bender Jr.

Andrew Tomazos

websnarf

FÖLDY Lajos

Rob Johnson

Ilmari Karonen

Mathew Francis

pete

Richard Heathfield

pete

Pascal J. Bourguignon

Richard Harter

Richard Harter