Especially with the newer extended archive, I've seen a lot of questions
about what all the bits of the URL do and how one can intelligently trim
it to (hopefully) fit on one line. I've been playing around with this for
a while, and by now I can do this on my own.
Here's my first stab at sharing this knowledge widely:
GOOGLE GROUP URLS:
The URL always begins with http://groups.google.com/groups?
and is then followed by numerous short segments (controls) joined by &'s.
There are two ways of explaining these controls:
a) by control, better to help you read & dissect URLs
b) by function, better to help you build your own URLs
as_drrb      a toggle for searches by date
as_epq       way of querying for an exact phrase
as_eq        way of excluding words (Boolean NOT) in a query
as_maxd      for searches between two dates, the end day-of-the-month
as_maxm      for searches between two dates, the end month number
as_maxy      for searches between two dates, the end year
as_mind      for searches between two dates, the start day-of-the-month
as_minm      for searches between two dates, the start month number
as_miny      for searches between two dates, the start year
as_oq        way of querying for optional words (Boolean OR)
as_q         way of querying for required words (Boolean AND)
as_qdr       way of searching by date relative to the current time
as_scoring   whether results are sorted by date or relevance
as_uauthors  way of querying in Author field
as_ugroup    way of querying for newsgroup name
as_umsgid    way of querying on Message-ID
as_usubject  way of querying for words in Subject
filter       whether Google will display or omit similar results
hl           language the Google UI will display
ic           shows full text for articles
lr           way of querying by language of article
num          number of results shown per screen
output       format the results are shown in (e.g. output=plain)
q            general query field
safe         whether content-filtering is turned on
scoring      whether results are sorted by date or relevance
selm         way of querying on Message-ID
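To read an existing URL against this table, something like the following Python sketch may help. It only splits the URL at the ? and the &'s and looks each control up; the CONTROLS dictionary is abbreviated, and the sample URL (including the value hl=en) is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl

# A few descriptions copied from the table above; extend as needed.
CONTROLS = {
    "q": "general query field",
    "as_ugroup": "way of querying for newsgroup name",
    "hl": "language the Google UI will display",
    "num": "number of results shown per screen",
    "scoring": "whether results are sorted by date or relevance",
    "selm": "way of querying on Message-ID",
}


def dissect(url):
    """Return a (control, value, description) tuple for each control in the URL."""
    query = urlsplit(url).query
    rows = []
    # parse_qsl splits on & and =, and decodes + back into a space.
    for name, value in parse_qsl(query, keep_blank_values=True):
        rows.append((name, value, CONTROLS.get(name, "not in the table")))
    return rows


if __name__ == "__main__":
    # Hypothetical example URL, not taken from a real search.
    example = "http://groups.google.com/groups?q=delurk+test&hl=en&num=20&scoring=d"
    for name, value, meaning in dissect(example):
        print(f"{name:10} {value:15} {meaning}")
```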
Building the Query:
The easiest way to do this is to use q= followed by your search terms
joined by + (plus signs). For example, q=a+b+c yields (a AND b AND c),
because Google's search treats all words as if they were joined by a Boolean AND.
To search for optional words (Boolean OR), use q= and put +OR+ ("OR" must
be in upper case) between those words.
For example, q=a+b+OR+c yields (a AND (b OR c))
You can also list all optional words using the as_oq control.
To exclude words (Boolean NOT), use q= and put a - (minus sign) before the
words you wish to exclude.
You still must include the + (plus sign) between terms.
For example, q=a+-b+c yields (a AND NOT(b) AND c)
You can also list all words to exclude using the as_eq control.
To search for an exact phrase, use q= and put quotation marks around the
words in the phrase.
Continue to put + (plus signs) between all words in the phrase, and
between the phrase and other terms as necessary.
For example, q=a+"b+c"+d yields (a AND phrase(b c) AND d)
You can also list all words in a phrase using the as_epq control.
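The AND / OR / NOT / phrase rules above can be put together mechanically. Here is a minimal Python sketch; build_q and its argument names are my own invention, and it leaves the quotation marks unescaped, exactly as they are written in the examples above.

```python
def build_q(required=(), optional=(), excluded=(), phrase=()):
    """Assemble a q= control from lists of plain words.

    required -> joined by + (Boolean AND)
    phrase   -> wrapped in quotation marks, words joined by +
    optional -> chained with +OR+ (Boolean OR)
    excluded -> each prefixed with - (Boolean NOT)
    """
    terms = list(required)
    if phrase:
        terms.append('"' + "+".join(phrase) + '"')
    if optional:
        terms.append("+OR+".join(optional))
    terms.extend("-" + word for word in excluded)
    return "q=" + "+".join(terms)


if __name__ == "__main__":
    print(build_q(["a", "b", "c"]))               # q=a+b+c
    print(build_q(["a"], optional=["b", "c"]))    # q=a+b+OR+c
    print(build_q(["a", "c"], excluded=["b"]))    # q=a+c+-b
    print(build_q(["a", "d"], phrase=["b", "c"])) # q=a+d+"b+c"
```

The excluded words come out at the end rather than in the middle (q=a+c+-b instead of q=a+-b+c); since every term is ANDed together, the order should not matter.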
Within the q= control, enter author: followed by one word in order to
search for that word in the author field. To search for multiple words,
repeat the author: before each word.
You can also list all author words at once using the as_uauthors control.
For example, q=author:John+author:Doe == as_uauthors=John+Doe
Within the q= control, enter group: followed by as much of the
newsgroup name as you know.
You can also use the as_ugroup control.
For example, q=group:alt.fan.dejanews == as_ugroup=alt.fan.dejanews
NOTE: This is the ONLY field at present which permits wildcards.
Within the q= control, enter insubject: followed by one word in order to
search for that word in the subject field. To search for multiple words,
repeat the insubject: before each word.
You can also list all subject words at once using the as_usubject control.
For example, q=insubject:delurk+insubject:test == as_usubject=delurk+test
NOTE: the control here is "INsubject" not plain "subject" (a common pitfall)
Within the q= control, enter msgid: followed by the Message-ID.
You can also use the controls as_umsgid or selm, each followed by the Message-ID.
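The author:, group:, insubject: and msgid: prefixes all follow the same pattern, so they can be generated with one small helper. This is only a sketch: the function names are hypothetical, and combining several prefixed fields in a single q= is my assumption rather than something tested here.

```python
def field_terms(prefix, words):
    """Repeat the prefix before each word, e.g. author:John+author:Doe."""
    return "+".join(f"{prefix}:{word}" for word in words)


def build_field_query(authors=(), group=None, subject=(), msgid=None):
    """Build a q= control from field-specific search words."""
    parts = []
    if authors:
        parts.append(field_terms("author", authors))
    if group:
        parts.append(f"group:{group}")
    if subject:
        # Note: the prefix is "insubject", not plain "subject".
        parts.append(field_terms("insubject", subject))
    if msgid:
        parts.append(f"msgid:{msgid}")
    return "q=" + "+".join(parts)


if __name__ == "__main__":
    print(build_field_query(authors=["John", "Doe"]))
    print(build_field_query(group="alt.fan.dejanews", subject=["delurk", "test"]))
```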
NOTE: Google accepts a maximum of TEN words in a query. That number doesn't
include prefixes to specify author, group or subject, but can be a limitation.
Also, apart from the group field noted above, Google does not recognize
wildcards and does not perform stemming.
Searching "printer" vs. "printers" will give different results.
For dates, no control is needed to get articles written at any time.
To only get posts within the last 24 hours, append as_drrb=q&as_qdr=d
To only get posts within the last week, append as_drrb=q&as_qdr=w
To only get posts within the last month, append as_drrb=q&as_qdr=m
To only get posts within the last year, append as_drrb=q&as_qdr=y
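The four relative-date cases differ only in the last letter, so a lookup table covers them. A small Python sketch (the function and dictionary names are made up):

```python
# Map a human-readable span to the as_qdr letter listed above.
RELATIVE_SPANS = {"day": "d", "week": "w", "month": "m", "year": "y"}


def relative_date_controls(span=None):
    """Return the controls to append for a 'within the last ...' search.

    span=None means 'any time', which needs no control at all.
    """
    if span is None:
        return ""
    if span not in RELATIVE_SPANS:
        raise ValueError(f"unknown span: {span!r}")
    return f"as_drrb=q&as_qdr={RELATIVE_SPANS[span]}"


if __name__ == "__main__":
    print(relative_date_controls("week"))  # as_drrb=q&as_qdr=w
```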
To only get posts in a specified date range, SEVEN appended controls are
needed: the as_drrb toggle, plus these six:
as_mind=(starting day-of-the-month - a number from 1 to 31)
as_minm=(starting month - a number from 1 to 12)
as_miny=(starting year - a number from 1981 to 2001)
as_maxd=(ending day-of-the-month - a number from 1 to 31)
as_maxm=(ending month - a number from 1 to 12)
as_maxy=(ending year - a number from 1981 to 2001)
For example, December 1, 1999 thru January 30, 2000 would look like:
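A Python sketch of assembling such a range from two dates follows. The six date fields come straight from the list above; the toggle value as_drrb=b is an ASSUMPTION on my part (only as_drrb=q is shown above, for relative dates), so compare the output against a URL generated by the Advanced page before relying on it.

```python
from datetime import date


def date_range_controls(start, end, toggle="b"):
    """Return the date-range controls for start..end (both datetime.date).

    toggle is the as_drrb value; "b" is an assumption, not confirmed above.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    for d in (start, end):
        # The list above gives 1981 to 2001 as the accepted years.
        if not 1981 <= d.year <= 2001:
            raise ValueError(f"year out of range: {d.year}")
    pairs = [
        ("as_drrb", toggle),
        ("as_mind", start.day),
        ("as_minm", start.month),
        ("as_miny", start.year),
        ("as_maxd", end.day),
        ("as_maxm", end.month),
        ("as_maxy", end.year),
    ]
    return "&".join(f"{name}={value}" for name, value in pairs)


if __name__ == "__main__":
    # December 1, 1999 thru January 30, 2000
    print(date_range_controls(date(1999, 12, 1), date(2000, 1, 30)))
```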
Google can return results that were written in one of 28 possible languages.
If you want articles written in any language, no control is needed.
To only get messages written in English, append lr=lang_en
For French, lr=lang_fr; for Spanish, lr=lang_es; for Russian, lr=lang_ru;
and so on. [I'm *not* going to list all the options here.]
Customizing Display of the Results:
To sort the results by date, append scoring=d OR as_scoring=d
To sort results by relevance, append scoring=r OR as_scoring=r
Number of results per screen:
To show 10 results per screen, append num=10
To show 20 results per screen, append num=20
To show 30 results per screen, append num=30
To show 50 results per screen, append num=50
To show 100 results per screen, append num=100
Keep in mind, Google Groups only shows 10 screens of results.
At num=10, that's only the first hundred; num=100 can show a thousand.
To show all results (don't omit "very similar entries"), append filter=0
To show the bodies of all messages, append ic=1
To turn on content filtering (avoid "adult" messages), append safe=on
To show all results without any content safeguards, append safe=off
To show single articles in plain-text with full headers, append output=plain
(this is only enabled for searches using selm=)
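The display controls above can be collected into one helper, along with the "10 screens" arithmetic. This is a sketch with made-up names; it accepts only the num values listed above, and whether Google takes other values is not something I've verified.

```python
ALLOWED_NUM = (10, 20, 30, 50, 100)  # the values listed above
MAX_SCREENS = 10                     # Google Groups only shows 10 screens


def display_controls(num=10, sort=None, show_similar=False,
                     full_text=False, safe=None):
    """Return the display controls to append, joined by &.

    sort: "d" (date) or "r" (relevance); safe: True, False, or None to omit.
    """
    if num not in ALLOWED_NUM:
        raise ValueError(f"num must be one of {ALLOWED_NUM}")
    parts = [f"num={num}"]
    if sort is not None:
        if sort not in ("d", "r"):
            raise ValueError("sort must be 'd' or 'r'")
        parts.append(f"scoring={sort}")
    if show_similar:
        parts.append("filter=0")
    if full_text:
        parts.append("ic=1")
    if safe is not None:
        parts.append("safe=on" if safe else "safe=off")
    return "&".join(parts)


def reachable_results(num):
    """How many results can be paged through at this num setting."""
    return num * MAX_SCREENS


if __name__ == "__main__":
    print(display_controls(num=100, sort="d", show_similar=True))
    print(reachable_results(100))
```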
TRIMMING GOOGLE GROUPS URLS:
If you want to provide a link to *one* specific post, the shortest format
is http://groups.google.com/groups?selm= and the Message-ID. The easiest
way to get there is to click on the article and copy the URL from the
"Original Format" link, removing the "&output=plain"
If you want to provide a link to a search, start weeding through the URL
using the guide above to remove the unnecessary bits. For example, all
searches begun from the Advanced page include the date fields; if you're
not searching by date, just delete them. Sometimes, clicking the "sort by
date/relevance" link in search results will also clean up the URL for you.
Just be sure to resubmit your edited URL into your browser to ensure you
still get the right results.
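The weeding can be partly automated. Below is a rough Python sketch under a few assumptions of mine: that controls left empty (like as_epq=) can be dropped safely, and that you are NOT searching by date, so every date control from the table can go. The function names, the sample URLs and the Message-ID are invented for illustration. Still paste the result into your browser to check it.

```python
# All date-related controls from the table; dropped by default.
DATE_CONTROLS = {
    "as_drrb", "as_qdr",
    "as_mind", "as_minm", "as_miny",
    "as_maxd", "as_maxm", "as_maxy",
}


def trim_url(url, drop=DATE_CONTROLS):
    """Remove unwanted and empty controls from a Google Groups search URL."""
    base, _, query = url.partition("?")
    kept = []
    for segment in query.split("&"):
        name, _, value = segment.partition("=")
        # Skip empty controls (assumption) and anything in the drop set.
        if not value or name in drop:
            continue
        kept.append(segment)
    return base + "?" + "&".join(kept)


def message_link(original_format_url):
    """Turn an 'Original Format' link into the short selm= link."""
    return original_format_url.replace("&output=plain", "")


if __name__ == "__main__":
    long_url = ("http://groups.google.com/groups?as_q=delurk&as_epq="
                "&as_ugroup=alt.fan.dejanews&as_drrb=b&as_mind=1&as_minm=1"
                "&as_miny=1981&num=20")
    print(trim_url(long_url))
    print(message_link(
        "http://groups.google.com/groups?selm=abc123%40example.com&output=plain"))
```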
That's as far as I've gotten so far. This does *NOT* include terms in the
URLs that relate to message threading, but I'll see about adding them
eventually. Please *post* any responses to alt.fan.dejanews rather than
e-mailing me (I'm having e-mail problems right now, and wouldn't want to
lose anybody's replies).
Hope this helps!
----------> Elisabeth Anne Riba * l...@osmond-riba.org <----------
"[She] is one of the secret masters of the world: a librarian.
They control information. Don't ever piss one off."
- Spider Robinson, "Callahan Touch"