[INFO] Parsing Google Group URLs (v1)


Elisabeth Riba

Dec 11, 2001, 6:22:16 PM
Especially with the newer extended archive, I've seen a lot of questions
about what all the bits of the URL do and how one can intelligently trim
it to (hopefully) fit on one line. I've been playing around with this for
a while, and can now do it on my own.
Here's my first stab at sharing this knowledge widely:

GOOGLE GROUP URLS:
==================

The URL always begins with
http://groups.google.com/groups?
and is then followed by numerous short segments (controls) joined by & (ampersands)
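
To dissect an existing URL along these lines, here is a minimal Python sketch. It only uses the standard urllib.parse module; the function name split_controls and the sample URL are made up for illustration and are not anything Google provides.

```python
from urllib.parse import urlsplit, parse_qsl

def split_controls(url):
    """Return the controls of a Google Groups URL as (name, value) pairs."""
    query = urlsplit(url).query
    # keep_blank_values=True so that empty controls such as "as_q=" still show up
    return parse_qsl(query, keep_blank_values=True)

# Illustrative URL only
url = "http://groups.google.com/groups?q=a+b+c&num=20&scoring=d"
for name, value in split_controls(url):
    print(name, "=", value)
```

Note that parse_qsl turns each + back into a space, so the q control above is reported as "a b c".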

There are two ways of explaining these controls:
a) by control, better to help you read & dissect URLs
b) by function, better to help you build your own URLs

CONTROLS, ALPHABETICAL:
=======================
as_drrb      a toggle for searches by date
as_epq       way of querying for an exact phrase
as_eq        way of excluding words (Boolean NOT) in a query
as_maxd      for searches between two dates, the end day-of-the-month
as_maxm      for searches between two dates, the end month number
as_maxy      for searches between two dates, the end year
as_mind      for searches between two dates, the start day-of-the-month
as_minm      for searches between two dates, the start month number
as_miny      for searches between two dates, the start year
as_oq        way of querying for optional words (Boolean OR)
as_q         way of querying for required words (Boolean AND)
as_qdr       way of searching by date relative to the current time
as_scoring   whether results are sorted by date or relevance
as_uauthors  way of querying in Author field
as_ugroup    way of querying for newsgroup name
as_umsgid    way of querying on Message-ID
as_usubject  way of querying for words in Subject
filter       whether Google will display or omit similar results
hl           language the Google UI will display
ic           shows full text for articles
lr           way of querying by language of article
num          number of results shown per screen
output       format the results are shown in (e.g. output=plain for plain text)
q            general query field
safe         whether content-filtering is turned on
scoring      whether results are sorted by date or relevance
selm         way of querying on Message-ID
===========================================================================
CONTROLS, FUNCTIONAL:
=====================
Building the Query:
-------------------
The easiest way to do this is to use q= followed by your search terms
joined by + (plus signs). For example, q=a+b+c yields (a AND b AND c)

Boolean AND
-----------
Google's search treats all words as if they were joined by a Boolean AND,
so no operator is needed.

Boolean OR
----------
Use q= and put +OR+ ("OR" must be in upper case) between those words.
For example, q=a+b+OR+c yields (a AND (b OR c))
You can also list all optional words using the as_oq control.

Boolean NOT
-----------
Use q= and put a - (minus sign) before words you wish to exclude.
You still must include the + (plus sign) between terms.
For example, q=a+-b+c yields (a AND NOT(b) AND c)
You can also list all words to exclude using the as_eq control.

Exact Phrases
-------------
Use q= and put quotation marks around the words in the phrase.
Continue to put + (plus signs) between all words in the phrase, and
between the phrase and other terms as necessary.
For example, q=a+"b+c"+d yields (a AND phrase(b c) AND d)
You can also list all words in a phrase using the as_epq control.
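
If you would rather generate the q= value than type it by hand, the rules above can be sketched in Python roughly as follows. The helper build_query and its argument names are my own invention. One caveat: the standard library's urlencode writes quotation marks as %22, which is just the percent-encoded spelling of the same character.

```python
from urllib.parse import urlencode

BASE = "http://groups.google.com/groups?"

def build_query(required=(), optional=(), excluded=(), phrase=None):
    """Assemble the text of the q control from its Boolean pieces."""
    parts = list(required)                          # plain words are ANDed together
    if optional:
        parts.append(" OR ".join(optional))         # "OR" must be upper case
    parts.extend("-" + word for word in excluded)   # leading minus sign = NOT
    if phrase:
        parts.append('"' + phrase + '"')            # quotation marks = exact phrase
    return " ".join(parts)

q = build_query(required=["a"], optional=["b", "c"], excluded=["d"], phrase="e f")
print(q)                            # a b OR c -d "e f"
print(BASE + urlencode({"q": q}))   # spaces become + signs, quotes become %22
```

As with any hand-built URL, paste the result into your browser to check that it returns what you expect.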

Field Searches:
---------------
Author:
-------
Within the q= control, enter author: followed by one word in order to
search for that word in the author field. To search for multiple words,
repeat the author: before each word.
You can also list all author words at once using the as_uauthors control.
For example, q=author:John+author:Doe == as_uauthors=John+Doe

Group:
------
Within the q= control, enter group: followed by as much of the
newsgroup name as you know.
You can also use the as_ugroup control.
For example, q=group:alt.fan.dejanews == as_ugroup=alt.fan.dejanews
NOTE: This is the ONLY field at present which permits wildcards.

Subject:
--------
Within the q= control, enter insubject: followed by one word in order to
search for that word in the subject field. To search for multiple words,
repeat the insubject: before each word.
You can also list all subject words at once using the as_usubject control.
For example, q=insubject:delurk+insubject:test == as_usubject=delurk+test
NOTE: the prefix here is "INsubject" not plain "subject" (a common pitfall)
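
The field prefixes described so far (author:, group: and insubject:) can be combined in one q= value. Here is a small Python sketch of that, with the prefix repeated before each word; the function field_terms is hypothetical and only strings the pieces together.

```python
from urllib.parse import urlencode

def field_terms(author=(), group=None, subject=()):
    """Build q text using the author:, group: and insubject: prefixes."""
    terms = ["author:" + word for word in author]       # repeat prefix per word
    if group:
        terms.append("group:" + group)
    terms += ["insubject:" + word for word in subject]  # INsubject, not subject
    return " ".join(terms)

q = field_terms(author=["John", "Doe"], group="alt.fan.dejanews", subject=["delurk"])
# safe=":*" leaves colons and group wildcards readable instead of percent-encoding them
print("http://groups.google.com/groups?" + urlencode({"q": q}, safe=":*"))
```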

Message-ID:
-----------
Within the q= control, enter msgid: followed by the Message-ID.
You can also use the as_umsgid or selm control followed by the Message-ID:
http://groups.google.com/groups?selm=anews.Asdcsvax.285
http://groups.google.com/groups?as_umsgid=anews.Asdcsvax.285
http://groups.google.com/groups?q=msgid:anews.Asdcsvax.285
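
A minimal Python sketch for turning a Message-ID into the selm= form shown above. The function message_link is made up, and stripping the angle brackets is my assumption, based only on the fact that the example URLs carry the ID without them.

```python
from urllib.parse import quote

def message_link(message_id, plain=False):
    """Build a selm= link for one article from its Message-ID."""
    # Assumption: the ID goes into the URL without its surrounding < >
    mid = message_id.strip().lstrip("<").rstrip(">")
    url = "http://groups.google.com/groups?selm=" + quote(mid, safe="@")
    if plain:
        url += "&output=plain"   # plain text with full headers
    return url

print(message_link("<anews.Asdcsvax.285>"))
# http://groups.google.com/groups?selm=anews.Asdcsvax.285
```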

NOTE: Google accepts a maximum of TEN words in a query. That number doesn't
include prefixes to specify author, group or subject, but can be a limitation.
Also, apart from newsgroup names (see Group above), Google does not recognize
wildcards, and it does not perform stemming.
Searching "printer" vs. "printers" will give different results.

Date Limitations: AS_DRRB, AS_QDR, AS_MAX*, AS_MIN*
---------------------------------------------------
No control is needed to get articles written at any time.
To only get posts within the last 24 hours, append as_drrb=q&as_qdr=d
To only get posts within the last week, append as_drrb=q&as_qdr=w
To only get posts within the last month, append as_drrb=q&as_qdr=m
To only get posts within the last year, append as_drrb=q&as_qdr=y
Getting posts within a specified date range takes SEVEN appended controls:
as_drrb=b
as_mind=(starting day - a number from 1 to 31)
as_minm=(starting month - a number from 1 to 12)
as_miny=(starting year - a number from 1981 to 2001)
as_maxd=(ending day - a number from 1 to 31)
as_maxm=(ending month - a number from 1 to 12)
as_maxy=(ending year - a number from 1981 to 2001)
For example, December 1, 1999 through January 30, 2000 would look like:
as_drrb=b&as_mind=1&as_minm=12&as_miny=1999&as_maxd=30&as_maxm=1&as_maxy=2000
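
Since seven controls are easy to mistype, here is a rough Python sketch that produces them from two dates, using as_drrb=b as in the example line above. The function names are my own, and the start-before-end check is just a sanity check I added, not something Google documents.

```python
from datetime import date

# Codes for as_qdr, taken from the "last 24 hours/week/month/year" lines above
RELATIVE = {"day": "d", "week": "w", "month": "m", "year": "y"}

def relative_controls(period):
    """Controls for posts within the last day, week, month or year."""
    return "as_drrb=q&as_qdr=" + RELATIVE[period]

def date_range_controls(start, end):
    """The seven controls for posts between two dates (inclusive)."""
    if start > end:
        raise ValueError("start date must not be after end date")
    pairs = [
        ("as_drrb", "b"),
        ("as_mind", start.day), ("as_minm", start.month), ("as_miny", start.year),
        ("as_maxd", end.day), ("as_maxm", end.month), ("as_maxy", end.year),
    ]
    return "&".join(f"{name}={value}" for name, value in pairs)

print(date_range_controls(date(1999, 12, 1), date(2000, 1, 30)))
# as_drrb=b&as_mind=1&as_minm=12&as_miny=1999&as_maxd=30&as_maxm=1&as_maxy=2000
print(relative_controls("week"))
# as_drrb=q&as_qdr=w
```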

Language Limitations: LR
-------------------------
Google can return results that were written in one of 28 possible languages.
If you want articles written in any language, no control is needed.
To only get messages written in English, append lr=lang_en
For French, lr=lang_fr; for Spanish, lr=lang_es; for Russian, lr=lang_ru;
and so on. [I'm *not* going to list all the options here.]

Customizing Display of the Results:
-----------------------------------

Sorting: SCORING, AS_SCORING
----------------------------
To sort the results by date, append scoring=d OR as_scoring=d
To sort results by relevance, append scoring=r OR as_scoring=r

Number of results per screen: NUM
---------------------------------
To show 10 results per screen, append num=10
To show 20 results per screen, append num=20
To show 30 results per screen, append num=30
To show 50 results per screen, append num=50
To show 100 results per screen, append num=100
Keep in mind, Google Groups only shows 10 screens of results.
At num=10, that's only the first hundred; num=100 can show a thousand.

Other options:
--------------
To show all results (don't omit "very similar entries"), append filter=0
To show the bodies of all messages, append ic=1
To turn on content filtering (avoid "adult" messages), append safe=on
To show all results without any content safeguards, append safe=off
To show single articles in plain-text with full headers, append output=plain
(this is only enabled for searches using selm=)
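
The display controls and the ten-screen limit can be sketched in Python like this. The function names are hypothetical, and restricting num to the five values listed above is my assumption for the sketch; I have not checked what Google does with other numbers.

```python
ALLOWED_NUM = (10, 20, 30, 50, 100)   # the values listed above (assumed exhaustive)
MAX_SCREENS = 10                      # Google Groups shows at most 10 screens

def display_controls(sort="r", num=10, show_similar=False, safe=None):
    """Build the sorting, page-size, filter and safe controls."""
    if sort not in ("d", "r"):
        raise ValueError("sort must be 'd' (date) or 'r' (relevance)")
    if num not in ALLOWED_NUM:
        raise ValueError("num must be one of %s" % (ALLOWED_NUM,))
    parts = ["scoring=" + sort, "num=%d" % num]
    if show_similar:
        parts.append("filter=0")      # don't omit very similar entries
    if safe is not None:
        parts.append("safe=" + ("on" if safe else "off"))
    return "&".join(parts)

def reachable_results(num):
    """How many results you can page through at a given num."""
    return MAX_SCREENS * num

print(display_controls(sort="d", num=100, show_similar=True))  # scoring=d&num=100&filter=0
print(reachable_results(100))                                  # 1000
```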

TRIMMING GOOGLE GROUPS URLS:
============================
If you want to provide a link to *one* specific post, the shortest format
is http://groups.google.com/groups?selm= and the Message-ID. The easiest
way to get there is to click on the article and copy the URL from the
"Original Format" link, removing the "&output=plain".

If you want to provide a link to a search, start weeding through the URL
using the guide above to remove the unnecessary bits. For example, all
searches begun from the Advanced page include the date fields; if you're
not searching by date, just delete them. Sometimes, clicking the "sort by
date/relevance" link in search results will also clean up the URL for you.
Just be sure to resubmit your edited URL into your browser to ensure you
still get the right results.
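
For the weeding-out step, here is a rough Python sketch that drops the date fields and any controls left empty. The function trim_url and the sample URL are made up for illustration, and treating empty controls as safe to delete is my assumption, so do resubmit the trimmed URL to make sure the results are the same.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

DATE_CONTROLS = {"as_drrb", "as_qdr", "as_mind", "as_minm", "as_miny",
                 "as_maxd", "as_maxm", "as_maxy"}

def trim_url(url, drop=DATE_CONTROLS):
    """Remove empty controls and the controls named in drop."""
    parts = urlsplit(url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if value and name not in drop]
    return urlunsplit(parts._replace(query=urlencode(kept, safe=":*")))

# Illustrative URL only, shaped like what an Advanced-page search might produce
long_url = ("http://groups.google.com/groups?as_q=delurk&as_epq="
            "&as_ugroup=alt.fan.dejanews&as_drrb=b&as_mind=1&as_minm=1"
            "&as_miny=1981&as_maxd=11&as_maxm=12&as_maxy=2001&num=10")
print(trim_url(long_url))
# http://groups.google.com/groups?as_q=delurk&as_ugroup=alt.fan.dejanews&num=10
```

Only pass the date controls in drop when you are not actually searching by date; use drop=set() to remove just the empty controls.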

IN CONCLUSION:
==============
That's as far as I've gotten so far. This does *NOT* include terms in the
URLs that relate to message threading, but I'll see about adding them
eventually. Please *post* any responses to alt.fan.dejanews rather than
e-mailing me (I'm having e-mail problems right now, and wouldn't want to
lose anybody's replies).

Hope this helps!
--
----------> Elisabeth Anne Riba * l...@osmond-riba.org <----------
"[She] is one of the secret masters of the world: a librarian.
They control information. Don't ever piss one off."
- Spider Robinson, "Callahan Touch"


Aahz Maruch

Dec 11, 2001, 8:44:09 PM
In article <9v64l7$bca$2...@news.panix.com>,

Elisabeth Riba <l...@osmond-riba.org> wrote:
>
>Especially with the newer extended archive, I've seen a lot of questions
>about what all the bits of the URL do and how one can intelligently trim
>it to (hopefully) fit on one line. I've been playing around with this for
>a while, and can do this on my own.

I suggest renaming the Subject: to "FAQ", and also posting it to some
URL. If you don't have a web site, I'm certainly willing to host it.
--
--- Aahz <*> (Copyright 2001 by aa...@pobox.com)

Hugs and backrubs -- I break Rule 6 http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het Pythonista

You reap what you sew. --attr. Lachesis

kimji

Dec 11, 2001, 10:28:41 PM
Elisabeth Riba <l...@osmond-riba.org> wrote in message news:<9v64l7$bca$2...@news.panix.com>...

> ==================
> GOOGLE GROUP URLS:
> ==================
> The URL always begins with
> http://groups.google.com/groups?
> and then is followed by numerous short segments (controls) concatenated
> by &'s

You can also browse all the groups Google archives at:
http://groups.google.com/groups?group=*

I mention this because, interestingly, "group:*"
doesn't work (but e.g. "group:alt.comics.*" does).

And yeah, I still can't believe they archive back to 1981. Frankly,
it exceeded my wildest expectations by about 10 years :)

___________________
md...@altavista.net
Gunnm: Broken Angel
http://reimeika.ca/

Hugh Watkins

Dec 12, 2001, 2:11:37 PM

"kimji" <md...@altavista.net> wrote in message news:bdeee4fe.01121...@posting.google.com...

> Elisabeth Riba <l...@osmond-riba.org> wrote in message news:<9v64l7$bca$2...@news.panix.com>...
>
> > ==================
> > GOOGLE GROUP URLS:
> > ==================

I usually start with http://groups.google.co.uk/groups?group=*&hl=en

Languages:

http://www.google.com/intl/da/

"da" is Danish, but German is here:

http://www.google.de/

LOL http://www.google.com/intl/xx-piglatin/

HUGH W


*****************************************************************************


help wanted http://services.google.com/tc/Welcome.html

Google translation status report of main site

http://services.google.com/tcbin/tc.py?cmd=status

Volapuk 0%

Esperanto 99%

Interlingua 100%

New! Search 3 billion documents using Google

http://www.google.com/3.html

http://www.google.com/language_tools

http://www.google.com/intl/da/

Avanceret søgning ("Advanced search")

"Mere information" ("More information") takes you into English again:

http://www.google.com/intl/da/help/refinesearch.html#domain

Tips til søgning | Alt om Google ("Search tips | All about Google")


zarp...@gmail.com

Apr 17, 2014, 3:49:30 PM
thanks 4 the Info

mbj...@y7mail.com

Jun 27, 2015, 9:32:17 PM
On Friday, April 18, 2014 at 3:49:30 AM UTC+8, zarp...@gmail.com wrote:
> thanks 4 the Info

But now obsolete as Gaggle Gripes is continually churning into a
pile of poop.
