How I think Google Sets works

sudi...@gmail.com

Nov 15, 2006, 5:57:20 PM

Hi everybody, I think Google Sets works on one of the following
algorithms:

1) A binary search tree or a similar ordered structure, probably a
   skip list
2) Some kind of N-dimensional adaptive hashing

Does anybody agree with me?

Sudipta

derk...@gmx.de

Nov 20, 2006, 9:07:07 AM

Hello,
I think the name of the service says it all: it's done with sets. You
have different sets, or categories, that the search items belong to.
These don't need to be known by their real names; they just need to be
some kind of meta-entity.

When you enter search items, the corresponding sets are weighted, and
the set with the strongest weight gives the category, which is then
browsed for the results.

For example, you enter keyboard and guitar. A guitar is an instrument,
no question. A keyboard can be a peripheral OR an instrument. Because
guitar is an instrument and keyboard can also be an instrument, the
weight for the instrument set is higher than the weight for the
peripheral set. So instrument is the "evoked set". This is similar to
the way humans recognize information, where cognition also raises
"evoked sets", in that case sets of firing neurons in the brain.
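
A minimal sketch of the weighting idea described above, not a claim about how Google Sets is actually built. The category names and their members are invented for illustration, and "weight" is assumed to be a simple count of how many query items fall into each set.

```python
from collections import Counter

# Hypothetical category data, invented purely for this illustration.
CATEGORIES = {
    "instrument": {"guitar", "keyboard", "piano", "drums", "violin"},
    "peripheral": {"keyboard", "mouse", "monitor", "printer"},
}


def evoked_set(query_items, categories):
    """Return (best_category, suggestions) for the given query items.

    Each category gets one vote per query item it contains. The category
    with the most votes is the "evoked set", and its remaining members
    are offered as suggestions.
    """
    weights = Counter()
    for item in query_items:
        for name, members in categories.items():
            if item in members:
                weights[name] += 1
    if not weights:
        return None, []
    best, _ = weights.most_common(1)[0]
    suggestions = sorted(categories[best] - set(query_items))
    return best, suggestions


best, suggestions = evoked_set(["keyboard", "guitar"], CATEGORIES)
print(best, suggestions)
```

Here "instrument" collects two votes and "peripheral" only one, so the instrument set wins and its other members come back as the result.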

At the moment I see no argument against implementing this behavior
with simple sets, perhaps combined in graphs. That is the way I would
go if I had to reverse-engineer the service ;)

But more important than the mechanics, I think, are the possibilities
of the service: if this works one day, computers will be able to
understand the text they are indexing better than they do now. Today a
computer can only search for keywords. It doesn't understand the
context in which the different keywords appear.

If it is able to raise sets of corresponding words, it might be able
to take a closer look at the information in the text. It is somewhat
like a thesaurus, and even that is IMHO a technique which has not been
pushed to the limit of its possibilities in the past.

regards,
Küchi

deb...@gmail.com

Nov 30, 2006, 4:36:45 PM

I think Google Sets is, or could be, even more fundamental than a
search structure or assigned dimensions. The purpose could be to show
which information "belongs together" in context.

Context could be related terms, but what if context could also be
tracked by related places, time periods, overall subject matters,
information type, use, distribution patterns, or any number of
conceptual characteristics that could be N-dimensionally hashed to
generate and spit out sets?

What is the current interface with Google Sets? What data is being
used? Who are the intended users? Will Google Sets benefit the general
public or is it geared for specialized searchers?

DebMacP

skva...@gmail.com

Dec 23, 2006, 9:15:58 AM

Sudipta:

I think Google Sets may have been built on artificial neural networks.
Neural networks are particularly useful for solving problems that
cannot be expressed as a series of steps, such as recognizing patterns,
classifying into groups, series prediction and data mining.

Neural networks are often not suitable for problems where you must know
exactly how the solution was derived.

Correct me if I'm wrong!

-Vasantha

pranshu

Dec 26, 2006, 6:13:39 AM

Well, I do not think they would have made it that complex, for two
reasons:
1. The dataset to consider is huge.
2. Neural networks are not that accurate when it comes to such
abstract data mining.

I have two theories:

1. Google has a lot of queries being fired at its search engine. It
must also be logging the user (or maybe the IP). After proper
processing you get a set of queries in one session (maybe a span of
10-15 minutes).
2. Google has Wikipedia in its index. After proper anchor text mining,
you again get a set of phrases that co-occur.

Now all you need to do is apply "some" statistical methods to these
sets and check that each set is validated by more than a threshold
number of hypotheses (steps 1 and 2 above). Bang! You have a graph of
phrases linking to each other. All you need then is some data
structure that stores this efficiently and, given one, two or three
elements, outputs the nearest phrases (almost like a dictionary or a
thesaurus)...
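
The pipeline above could be sketched roughly as follows. This is only a toy illustration under stated assumptions: the session and anchor-text groups are invented data, "validated" is assumed to mean that a phrase pair shows up in at least `min_sources` of the sources, and "nearest" is assumed to mean linked to the most seed phrases. None of the function names come from Google.

```python
from collections import defaultdict
from itertools import combinations


def pairs(groups):
    """All unordered phrase pairs that co-occur in at least one group."""
    found = set()
    for group in groups:
        for a, b in combinations(sorted(set(group)), 2):
            found.add((a, b))
    return found


def build_graph(sources, min_sources=2):
    """Keep an edge only if at least min_sources sources support it."""
    support = defaultdict(int)
    for groups in sources:
        for pair in pairs(groups):
            support[pair] += 1
    graph = defaultdict(set)
    for (a, b), count in support.items():
        if count >= min_sources:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def expand(graph, seeds):
    """Rank non-seed phrases by how many of the seeds they are linked to."""
    seeds = set(seeds)
    scores = defaultdict(int)
    for seed in seeds:
        for neighbour in graph.get(seed, ()):
            if neighbour not in seeds:
                scores[neighbour] += 1
    return sorted(scores, key=lambda p: (-scores[p], p))


# Invented toy data standing in for query sessions and anchor-text groups.
sessions = [
    ["happy", "cheerful", "joyful"],
    ["happy", "bright", "joyful"],
    ["happy", "birthday"],
]
anchors = [
    ["cheerful", "joyful", "happy"],
    ["bright", "joyful"],
    ["bright", "happy"],
]

graph = build_graph([sessions, anchors])
print(expand(graph, ["bright", "cheerful", "happy"]))
```

In this toy data the pair (happy, birthday) appears in only one source, so it is dropped, while "joyful" is linked to all three seeds in both sources and is returned.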

regards,
pranshu sharma

skvasant

Dec 31, 2006, 9:27:38 PM

Pranshu:

No offence to your views, but you need to understand that Google
Sets has been in the Labs since 2002 and has not graduated yet.

Google Sets does not work the way Google Search works! Maybe in the
future, when Google Sets graduates from the Labs, Google Search may
follow the principles of Google Sets.

Also, Google Sets does not work like a thesaurus or dictionary!

Try this in Google Sets:
bright, cheerful, happy returns delighted, joyful and so on.
Notice that delighted is related to all three of our terms, and a
thesaurus probably won't give you this.

These were the few reasons that made me think that Google might have
invested in ANNs (artificial neural networks).

-Vasantha

pranshu

Jan 2, 2007, 2:22:05 AM

skvasant wrote:
> Pranshu:
>
> No offence to your views,

No offence taken...

> but you need to understand that Google
> Sets has been in the Labs since 2002 and has not graduated yet.
>

How is that relevant to my post?

> Google Sets does not work the way Google Search works! Maybe in the
> future, when Google Sets graduates from the Labs, Google Search may
> follow the principles of Google Sets.

Exactly. If you think that Google Search will someday follow
Google Sets, then it cannot be built on something as complex as neural
nets.

>
> Also, Google Sets does not work like a thesaurus or dictionary!
>

I said almost!!!

> Try this in Google Sets:
> bright, cheerful, happy returns delighted, joyful and so on.
> Notice that delighted is related to all three of our terms, and a
> thesaurus probably won't give you this.
>

And that is why there is Google Sets. What I wanted to say was: take a
section of the web that can be considered a "random, but
representative, sample" of the data on the web. If you then use any
basic statistical data mining tool, like co-occurrence frequencies,
you get relations between the entities. These relations just say that
the entities are related, not necessarily HOW they are related. Using
these relations you can easily generate a set.
This is a very simple theory, but lots of people are getting great
results using just simple things, which is why I suggested what I did
(the query logs and the Wikipedia dump can be a very GOOD sample of
the web).

deb...@gmail.com

Jan 2, 2007, 9:05:23 AM

I wish the random but representative samples behind Google Sets
covered a spectrum of information from structured to unstructured. For
example, freeform blogs on one end, a data sharing community like the
geographic network by ESRI on the other end, and Wikipedia somewhere in
between. It would be useful to get sets back across that range and to
see where your own set fits.