There was this little challenge that I threw out a week or two ago.
<http://groups.google.com/group/webby-forum/browse_thread/thread/b34699b71bb3e079>
Don't know if a sitemap is the same concept as your site index, but
they sound very similar. If you feel like coding this up, I'm sure
others would find it useful, too.
Blessings,
TwP
Incidentally, I had to do this just today for some documentation at work:
http://pastie.caboo.se/186595
Cheers,
Bruce
---
Bruce Williams
http://codefluency.com
twitter: wbruce
Sounds like a good idea :-)
I'd use an attribute on each page you'd like to ignore to flag it (vs
maintaining a separate ignore list), Hpricot to yank out content to
process, etc. Looks like a fun little project!
Tom,
I'm talking about the metadata at the top of each page (in
content/**); I wouldn't process the output files in output/**
directly.
For example you could do something like the following:
---
title: Foo Bar
created_at: 2008-04-18 22:40:00 -06:00
ignore: true
filter:
- textile
and simply check for the `ignore' attribute on page objects.
Also, rather than just writing a script that processed content/**
files directly, I'd try to do it programmatically (probably in a Rake
task; Tim might have some tips here) by loading webby and using
Webby::Resources::DB#find to grab all the pages (see
http://webby.rubyforge.org/rdoc/classes/Webby/Resources/DB.html#M000056),
and checking for page.ignore -- and you could get the HTML output of
each page for processing by calling page.render and the URL by calling
page.url (see http://webby.rubyforge.org/manual/#h2_1_1).
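
A minimal sketch of that filtering step, under some loud assumptions. To keep it runnable without Webby installed, the Page struct below is a hypothetical stand-in for Webby's page objects; only `ignore`, `url` and `render` come from the description above. In a real Rake task the list would presumably come from Webby::Resources::DB#find rather than a hand-built array, and the helper name indexable_pages is made up for illustration.

```ruby
# Hypothetical stand-in for a Webby page object. Only the three
# attributes mentioned above (ignore, url, render) are modelled.
Page = Struct.new(:url, :ignore, :body) do
  # Stand-in for page.render: returns the page's HTML output.
  def render
    body
  end
end

# Return [url, html] pairs for every page not flagged with `ignore: true`.
def indexable_pages(pages)
  pages.reject { |page| page.ignore }
       .map    { |page| [page.url, page.render] }
end

# Sample data only; real pages would be loaded through Webby.
pages = [
  Page.new("/index.html",    nil,  "<h1>Home</h1>"),
  Page.new("/drafts/a.html", true, "<h1>Draft</h1>"),
]

indexable_pages(pages).each do |url, html|
  puts "#{url}: #{html.length} characters of HTML to process"
end
```

Pages with no `ignore' attribute at all come back as nil here, so they are kept, which matches the idea of flagging only the pages you want skipped.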
>
> This might be a useful convention if someone wants to write a Sitemap
> (http://www.sitemaps.org/) generator before I get around to it. :-)
>
A site indexer would be fantastic! If you're willing to share the code
when you're done, I'll gladly include it with the next release of webby.
Blessings,
TwP
But beyond all that, when I do a site map, I want the page groupings
listed in MY order, not alphabetical order, and I often want some kind
of brief description accompanying each page listing. I can envision how
that might be all set up with metadata, but it seems easier to me to
just keep a running outline of the conceptual organization of your site,
and expand that into a site map.
I keep wondering if I'm missing something here. Must be.
FINALLY - when I started this thread, what I was referring to was the
production of something akin to a book index, but for a website. Static
search output, if you will, but browsable. Tim had some comments about
how best to do this, and I liked them (and need to review them). Here's
a description I recently wrote to one of my website design customers
(and I expect to start on it this week - ASAP) - it describes a
standalone program, but I can see this as a part of Webby, easily enough:
"One sets up, as an option and not a necessity, a set of tags (keywords,
we call them in other contexts) which are associated with a page, and
which are put IN the page, but styled to be invisible in a browser. But
my program can find them. The point of the tags is to call special
attention to principal content. The tag words will appear in the index
output in bold font, indicating a MAIN source of information - the first
place a user might want to browse to.
"Regardless of whether or not a given page is tagged, all other words on
the page are indexed. The results are reviewed, and meaningless words
are put on a "stop" list, which causes them NOT to appear in the index.
"The output then generated shows main entries (the tags aforementioned),
and all others, alphabetically, grouped by letter. Following each entry
is a link to the page where this entry appears.
"It's that simple. The webmaster can direct the output by use of the
tags, or not. Either way, this tool lets the site user see the range of
topics available better than any other way does, all on one page.
Browsable. Formatted as the webmaster desires."
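
The description above could be sketched roughly like this. Everything here is an assumption made for illustration: the input shape (a hash of URL to tags and plain text), the method names build_index and group_by_letter, and the sample pages. Pulling the hidden tags and the text out of real HTML is left out entirely.

```ruby
# Build a book-style index: word => list of { url:, main: } entries.
# `pages` maps a URL to { tags: [...], text: "..." }; this input shape
# is an assumption made for the sketch.
def build_index(pages, stop_words)
  index = Hash.new { |hash, word| hash[word] = [] }
  pages.each do |url, page|
    tags  = page[:tags].map(&:downcase)
    words = page[:text].downcase.scan(/[a-z]+/)
    (tags + words).uniq.each do |word|
      next if stop_words.include?(word)   # "stop" list: never indexed
      index[word] << { url: url, main: tags.include?(word) }
    end
  end
  index
end

# Group the indexed words alphabetically by first letter.
def group_by_letter(index)
  index.keys.sort.group_by { |word| word[0].upcase }
end

# Sample data only.
pages = {
  "/anxiety.html" => { tags: ["anxiety"], text: "Anxiety and the body" },
  "/sleep.html"   => { tags: [],          text: "Sleep and anxiety" },
}
index = build_index(pages, ["and", "the"])

group_by_letter(index).each do |letter, words|
  puts letter
  words.each do |word|
    # Main (tagged) entries are marked with asterisks in place of bold.
    links = index[word].map { |e| e[:main] ? "*#{e[:url]}*" : e[:url] }
    puts "  #{word}: #{links.join(', ')}"
  end
end
```

Reviewing the output and adding words to the stop list is still a manual step, as in the description; the sketch only shows that the stop list and the main/ordinary distinction fit in a few lines.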
It might be feasible to set this up as a rake task. That'd be cool,
but it's hardly my first priority, and besides I don't yet know how to
do that.
So...if someone beats me to this, cool. If not, I'll be happy to put my
code out for massaging by some more capable hands, if they so wish. I
just want the bloody functionality, yesterday.
t.
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Tom Cloyd, MS MA, LMHC
Private practice Psychotherapist
Bellingham, Washington, U.S.A: (360) 920-1226
<< t...@tomcloyd.com >> (email)
<< TomCloyd.com >> (website & psychotherapy weblog)
<< sleightmind.wordpress.com >> (mental health issues weblog)
<< DirectPathDesign.TomCloyd.com >> (web site design & consultation)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> I'm way, way below you folks in skills, but I just have to say that
> I do NOT grasp the idea of an autogenerated site map. I don't see
> how that information is contained in the sparse matrix of
> hyperlinks IN a set of pages, and it cannot reliably be obtained
> from directory structure, since many of us don't use that notion for
> site organization.
>
> But beyond all that, when I do a site map, I want the page groupings
> listed in MY order, not alphabetical order, and I often want some
> kind of brief description accompanying each page listing. I can
> envision how that might be all set up with metadata, but it seems
> easier to me to just keep a running outline of the conceptual
> organization of your site, and expand that into a site map.
Hi,
Such an XML-based sitemap is actually meant to be used by search
engines. In addition to providing a complete list of all pages on a web
site (which makes hard-to-discover pages easy to find), it also allows
you to set priorities for pages and can also give a hint about a
page's update frequency, so spiders can fine-tune their crawl rates
for a site with an XML sitemap.
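
To make that concrete, here is a rough sketch of generating such a file. The element names (urlset, url, loc, changefreq, priority) and the namespace are the ones from the sitemaps.org protocol; the method names xml_escape and xml_sitemap, the entry hashes and the example.com URLs are made up for illustration and are not how any particular site does it.

```ruby
# Escape the characters that must be entity-escaped in sitemap values.
def xml_escape(text)
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
      .gsub("'", "&apos;").gsub('"', "&quot;")
end

# Build a sitemaps.org-style XML document from a list of entries.
# Each entry is a hash with :loc, and optionally :changefreq and :priority.
def xml_sitemap(entries)
  urls = entries.map do |entry|
    lines = ["  <url>", "    <loc>#{xml_escape(entry[:loc])}</loc>"]
    lines << "    <changefreq>#{entry[:changefreq]}</changefreq>" if entry[:changefreq]
    lines << "    <priority>#{entry[:priority]}</priority>" if entry[:priority]
    lines << "  </url>"
    lines.join("\n")
  end
  <<~XML
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    #{urls.join("\n")}
    </urlset>
  XML
end

# Sample data only; example.com stands in for a real site.
puts xml_sitemap([
  { loc: "http://example.com/",           changefreq: "weekly", priority: 1.0 },
  { loc: "http://example.com/archive/old", changefreq: "yearly", priority: 0.2 },
])
```

Both changefreq and priority are optional hints, so the sketch simply leaves them out when an entry does not set them.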
My site has an auto-generated XML sitemap (meant for spiders) as well
as an (auto-generated) HTML sitemap (meant for humans), and they're
generated in quite different ways (they have different purposes after
all).
Hope this helps!
Denis
--
Denis Defreyne
denis.d...@stoneship.org
Tom