the web is a directed graph

Jeff Hammel

Jan 28, 2012, 1:07:04 PM

to mozilla-la...@googlegroups.com

I work on automation and testing at Mozilla, but have been known to don multiple hats, and, not surprisingly, I think a lot about how the web works. So I just read Stuart's blog post and couldn't help but feel that it plays somewhat into an idea of mine.

Background....

The web is a directed graph ( http://en.wikipedia.org/wiki/Directed_graph ). The nodes of the graph are web pages, and an edge is produced whenever you click on a link. This gives a unique opportunity to provide maps of the web. There are a few components in this visualization.

A site map. This is a map that sits on a server. When a visitor clicks on a link, a new edge is added by reading the HTTP Referer header. A toy example, no longer "live", is at http://k0s.org/map.svg . This lets both the webmaster and visitors see how the site is traveled: the well-worn routes as well as the dusty trails. Note that the map itself is an element on the map :) In addition, the map could be embedded as a navigation aid on any page that desires it. Obviously a lot more could be done here than is shown in the toy example: nodes could be collapsed into metanodes until expanded (that is, /pictures/mission/capp.jpg and /pictures/mission/cowboy.jpg might just be /pictures/mission for the sake of easy navigation unless a user clicked in for closer detail), etc. The algorithm is really simple:
- for each request (in middleware) read the path info and the HTTP referer
- if this edge doesn't exist create it
- if it does, add one to its count (and whatever other metadata you want; you have the whole request)
- additionally, you need a handler to display this data
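
A minimal sketch of the steps above, assuming a Python WSGI stack (the post does not name one). The class name `SiteMapMiddleware` and the `/map` display path are hypothetical, and the sketch keeps only the path part of the referer, so it does not distinguish internal from external referrers.

```python
from collections import Counter
from urllib.parse import urlsplit


class SiteMapMiddleware:
    """WSGI middleware that counts (referer path -> requested path) edges."""

    def __init__(self, app, map_path="/map"):
        self.app = app
        self.map_path = map_path  # hypothetical URL of the display handler
        self.edges = Counter()    # (source, target) -> number of traversals

    def __call__(self, environ, start_response):
        target = environ.get("PATH_INFO", "/")
        referer = environ.get("HTTP_REFERER")
        if referer:
            # Creating the edge and incrementing it are the same operation
            # on a Counter: a missing key starts at zero.
            source = urlsplit(referer).path or "/"
            self.edges[(source, target)] += 1
        if target == self.map_path:
            # The map is recorded like any other page, then displayed.
            return self.show_map(start_response)
        return self.app(environ, start_response)

    def show_map(self, start_response):
        """Plain-text handler listing every edge with its count."""
        lines = [f"{s} -> {t} ({n})" for (s, t), n in sorted(self.edges.items())]
        body = "\n".join(lines).encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
```

Wrapping an existing WSGI application as `SiteMapMiddleware(app)` would be enough to start collecting edges; the counts here live in memory only, so a real deployment would need some form of storage.
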
Some sites do this, though I rarely see it presented this way, and even more rarely to the user.
In addition, you can make the system federated. For instance, let's say foo.org and bar.org both have maps (presumably noted as <link rel="" type=""> in their <HEAD>s). If there is traffic from foo to bar, you can stitch these maps together at the points where they meet. You have a federated map of the web!
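
One way the stitching could work, sketched in Python under the assumption that each site publishes its map as edge counts keyed by absolute URLs. The function names and the sample data are made up for illustration; nothing here describes an existing format.

```python
from collections import Counter


def nodes(site_map):
    """Every URL that appears as either end of an edge."""
    return {url for edge in site_map for url in edge}


def meeting_points(map_a, map_b):
    """URLs known to both maps: the places where they can be joined."""
    return nodes(map_a) & nodes(map_b)


def stitch(*site_maps):
    """Merge maps by summing the counts of identical edges."""
    merged = Counter()
    for site_map in site_maps:
        merged.update(site_map)  # Counter.update adds counts
    return merged


# Hypothetical data: bar.org sees the referer of visitors arriving from foo.org.
foo_map = {
    ("http://foo.org/", "http://foo.org/links"): 5,
}
bar_map = {
    ("http://foo.org/links", "http://bar.org/"): 2,
    ("http://bar.org/", "http://bar.org/about"): 4,
}

federated = stitch(foo_map, bar_map)
shared = meeting_points(foo_map, bar_map)
```

Summing counts assumes that no edge is recorded by both sites; if both maps logged the same cross-site click, the merge would need to deduplicate instead.
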

The map of one's journey. So the above is from the site's point of view. What about the user's point of view? It would be easy to make an addon that recorded your own map of how you traversed the web. You click on a link in any tab and, again, an edge is created (and populated with whatever metadata you want). You can see how you use the web as well as share your travels with friends. This could also link up with site maps when they are available.

Just some things I've been thinking of in the last two years. I'm happy to go into more detail or flesh them out if there is interest.

Jeff Sonstein

Jan 30, 2012, 11:33:35 AM

to mozilla-la...@googlegroups.com

IMHO the user being able to viz their route through the Web might be really powerful

jeffs

Stuart Parmenter

Jan 30, 2012, 5:50:50 PM

to mozilla-la...@googlegroups.com

I believe there is huge value in exploring the graph-like nature of the web and how people use it. This is one of the areas we've been exploring. The idea of using site maps (which many sites already have, for Google and other search engines) to help navigation, though, is a pretty interesting one that I haven't thought much about, but I will do some exploration.

Katzi

Jan 31, 2012, 12:16:46 PM

to mozilla-la...@googlegroups.com

This might be a giant step ahead! Let's also add that Chrome/Chromium users use the "address" bar mostly for search.

What about UI then? Reducing everything to buttons and gestures?

Peter Houghton

Apr 13, 2012, 5:12:42 PM

to mozilla-la...@googlegroups.com

I am very keen to hear more, especially on the visualisation of one's own traces through the web. The reason for this is that I had this very same idea a long time ago (probably more than a decade) and am now quite excited to see that someone is at least talking about developing something!

What I have found very surprising is that the concept of creating visualisations of the web's directed graphs has not been followed up before, as it's so blindingly obvious. Sadly, lack of free time and aged software development skills made it impractical for me to construct something myself, and a prior attempt to get someone interested (a web startup company which had a similar concept) failed.

I did try to take a quick look to see if I could find the startup company again and see what they may have developed, but no joy, so I suspect they have disappeared. However, I did manage to find this, which is another take on the same idea.

http://www.acm.uiuc.edu/macwarriors/projects/trailblazer/

What is additionally interesting with this example is that they are integrating the Lucene search engine to trawl through your history as well. Having this combined capability outside of the Mac environment would be excellent.

Taking the idea further, why aren't parts of the web, especially those which attempt to encode knowledge and the relationships between concepts (perhaps a future enhanced Wikipedia), accessible via an N-Dimensional graph? Something along the lines of TheBrain perhaps?

http://www.thebrain.com/

Keen to converse more on this subject if this is possible. Is progress still being made?

Danny Ayers

Apr 13, 2012, 5:44:25 PM

to mozilla-la...@googlegroups.com

On 13 April 2012 23:12, Peter Houghton <peter4h...@gmail.com> wrote:

> Taking the idea further, why aren't parts of the web, especially those which
> attempt to encode knowledge and the relationships between concepts (perhaps
> a future enhanced Wikipedia), accessible via an N-Dimensional graph?

dbPedia can give you such data from Wikipedia:
http://dbpedia.org/

and if you like you can follow links out into the cloud:
http://richard.cyganiak.de/2007/10/lod/

This piece argues that the graph view is (usually) a bad idea - it's
the "Pathetic Fallacy" of the somewhat misleading title. Interesting
read though.

http://swui.semanticweb.org/swui06/papers/Karger/Pathetic_Fallacy.html

I spent a while working on a graph-oriented tool (IdeaGraph) and came
to the conclusion that the view was only really useful in conjunction
with other UI elements, and it was best to avoid having more than a
little graph on screen at a time. I think I also did hook up the bits
to allow it to navigate the HTML Web through a graph, but never got to
anything I could call genuinely useful. Incidentally, the
ever-expanding tree is quite a nice alternative, you can traverse
graphs but it's more compact.

Cheers,
Danny.

--
http://dannyayers.com

http://webbeep.it - text to tones and back again

g2010a

Apr 30, 2012, 5:38:43 AM

to mozilla-la...@googlegroups.com

Hi,

In my experience, the key to charting this type of data is abstraction: not everything has to be displayed at the same time. Brushing, highlighting, and slicing are necessary operations unless you want an indecipherable--if pretty--mess of dots and lines.

An example: One could abridge a network path into only three nodes: beginning, grouped-middle and end nodes if the question is where did I start and where did I end? Or the nodes could be grouped by category: instead of specifically displaying every website, you could show something like [search engine]->[wiki]->[social network]->[search engine]->[retailer]->[email provider].

Irrelevant nodes could be collapsed.
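
A small Python sketch of both abstractions, assuming a browsing path is simply a list of URLs in visit order. The `CATEGORIES` table is a hypothetical stand-in; a real one would need curating.

```python
from itertools import groupby
from urllib.parse import urlsplit

# Hypothetical host -> category table, for illustration only.
CATEGORIES = {
    "www.google.com": "search engine",
    "en.wikipedia.org": "wiki",
    "twitter.com": "social network",
    "www.amazon.com": "retailer",
}


def abridge(path):
    """Reduce a path to three nodes: beginning, grouped middle, end."""
    if len(path) <= 2:
        return list(path)
    return [path[0], f"[{len(path) - 2} more]", path[-1]]


def by_category(path):
    """Replace each URL with its category and collapse consecutive repeats."""
    labels = [CATEGORIES.get(urlsplit(url).hostname, "other") for url in path]
    return [label for label, _ in groupby(labels)]


path = [
    "http://www.google.com/search?q=graphs",
    "http://en.wikipedia.org/wiki/Directed_graph",
    "http://en.wikipedia.org/wiki/Graph_theory",
    "http://www.amazon.com/dp/example",
]
print(" -> ".join(abridge(path)))
print(" -> ".join(f"[{c}]" for c in by_category(path)))
```

Both functions discard detail rather than hide it, so a viewer built on them would need to keep the full path around for when the user expands a grouped node.
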

Olivier Yiptong

May 22, 2012, 12:14:26 AM

to mozilla-la...@googlegroups.com

Hi!

I'm a software engineer working on Pancake and I'd like to chime in on this thread.

As Stuart has pointed out, we're storing users' data as directed graphs. The nature of the web led to this decision.

With the data, we're able to provide better search ranking (we're not trying to be Google; we give users the ability to search within their history), recommendations (eventually) and more.

Perhaps just as importantly, this gives us the ability to think about the browsing experience a bit differently. To give you an idea, here are some questions we ask ourselves, directly from our product vision:

- How can we make it easier for people to quickly get to the content and information they care about?
- Is there value in modelling and structuring web navigation in new ways -- rather than bookmarks, history, windows and tabs, what if navigation was organized by topic of interest, or things recommended by people of interest?
- If navigation is modeled by topics and people, what would be possible in the area of intelligent recommendations?
- What opportunities are created by touch interfaces to make using the web more productive, engaging and fun?
- Can we use HTML5 to create a web navigation experience that works on all of the devices in our lives?

I know there's more I'm hinting at than I'm answering, but that's what the Pancake team is working on, and it's very exciting.

I'll leave you with the output of our graph visualization tool.
We store users' data as graphs sprinkled with a few conventions of our own.
The attached image shows 1 graph representing 3 kinds of subgraphs (called stacks). I can elaborate more on it if you want.

Olivier Yiptong

May 22, 2012, 12:17:12 AM

to mozilla-la...@googlegroups.com

Google Groups decided to make a JPEG out of that SVG, so here's the file I meant to attach:

http://dl.dropbox.com/u/9165/Pancake/sample_graph.svg

g2010a

May 23, 2012, 6:38:45 AM

to mozilla-la...@googlegroups.com

One thing about recommendation engines that has always bothered me is the risk of group-think and sterile recommendations. There's nothing worse than looking at, say, an HDMI cable on amazon.com and seeing only cables all around for the next few days. You might want to consider some entropy in the system or a way of randomizing a percentage of the recommendations to escape this effect.
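
As a rough Python illustration of randomizing a percentage of the recommendations: the function below fills most slots from a ranked list and the rest with random picks from a wider catalogue. The name `recommend`, its parameters, and the 20% default are assumptions for the sketch, not anything Pancake is known to do.

```python
import random


def recommend(ranked, catalogue, k=5, explore=0.2, rng=random):
    """Return up to k items: mostly top-ranked, with a share drawn at random.

    ranked    -- list of items, best first
    catalogue -- wider list of items to draw the random share from
    explore   -- fraction of the k slots given to random picks
    """
    n_random = min(round(k * explore), k)
    top = ranked[:k - n_random]
    # Draw the random share from items not already shown in the top slots.
    pool = [item for item in catalogue if item not in top]
    surprise = rng.sample(pool, min(n_random, len(pool)))
    return top + surprise


ranked = ["cable-a", "cable-b", "cable-c", "cable-d", "cable-e"]
catalogue = ranked + ["book", "lamp", "kettle"]
picks = recommend(ranked, catalogue, k=5, explore=0.2)
```

With these values four slots go to the top-ranked cables and one slot goes to a random catalogue item, which may or may not be another cable.
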

Mohan Arun

Dec 28, 2012, 9:42:50 AM

to mozilla-la...@googlegroups.com

I was browsing the Mozilla Labs website when I came across this project - I have had similar thoughts before, but never knew such a project existed at such a big company's labs.

This is exciting. I do feel the need for a content discovery engine that is more sophisticated than 'links shared on Twitter and Facebook'. I was one of the first users of the MIT eyebrowse beta project (which has since been sunsetted), which lets you share the web address of any page you are on with other users of eyebrowse; one could visit someone's profile page and view a history of all the pages they have shared, in chronological order.

I also want to chime in with this factoid relevant to the core idea behind the project:
There could be a connection between ants and browsing histories. "Rutgers professor Paul Kantor is developing a server for the department of defense that makes it possible to find information on the web in much the same way that ants leave pheromones (chemical) trails for other ants to follow in search of food - sort of a "digital information pheromone" path created by users on the web that other researchers seeking the same information could follow."

Thanks and please keep me posted regarding the status of the Pancake project.

Thks,
Marun

Dmitry Sokolov

Apr 28, 2014, 8:33:36 AM

to mozilla-la...@googlegroups.com

Hi Olivier,

are you still working on the project?

I am developing a concept and implementation of the Virtual Associative Network:
http://confocal-manawatu.pbworks.com/w/page/68435296/What%20is%20noaSphere
There are some concerns about the AI approach to finding emerging concepts, a field of interest for researchers in different fields of knowledge. However, I would appreciate joining our efforts.

Please let me know if interested.

Cheers,
Dimitri
