
Google Groups spam filter


VK

Oct 12, 2009, 12:26:38 PM
As the sellers of bags, t-shirts and other crap from China are not
calming down, I wrote myself a small Google Groups spam filter. It took
about 10 min, of which 8 were spent studying the HTML structure of the
Google page, so it might be as crappy as it gets :-) I am more concerned
that some "shipping" offers are still slipping through, so I must have
overlooked something important in the string search.
I am also wondering what regexp pattern would sort out all posts with
pseudo-graphics in them, like:
♥Paypal Payment♥cheap SMET jeans wholesale(www.wholesale789.com)
∴★paypal payment★∴ wholesale cheap the hottest Belt at www.ecyaya.com

and mixtures of Asian and Latin letters (but not Asian and Latin
words) like:
இ◥இ◤ 2009 Prada handbags get low price wholesale at WEBSITE: www.fjrjtrade.com
<paypal payment>===<free shipping>
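
One possible starting point for the pattern asked about above, as a rough sketch rather than a tested filter: flag a title when it contains a character from a few Unicode blocks that decorated spam tends to use, together with a Latin word. The choice of blocks and the names `DECORATION`, `LATIN_WORD` and `looksDecorated` are assumptions made for this illustration; the ranges would need tuning against real posts to keep false positives down, and a legitimate Tamil post containing an English word would also match.

```javascript
// Hypothetical helper, not part of ggNoSpam.
// Character ranges (assumptions, picked from the sample titles above):
//   \u0B80-\u0BFF  Tamil                  (e.g. the letter in the Prada title)
//   \u2200-\u22FF  Mathematical Operators (e.g. the "therefore" sign)
//   \u25A0-\u25FF  Geometric Shapes       (e.g. the corner triangles)
//   \u2600-\u27BF  Miscellaneous Symbols and Dingbats (hearts, stars)
var DECORATION = /[\u0B80-\u0BFF\u2200-\u22FF\u25A0-\u25FF\u2600-\u27BF]/;
var LATIN_WORD = /[A-Za-z]{3,}/;

function looksDecorated(title) {
  // Both conditions must hold: a decorative character and a Latin word.
  return DECORATION.test(title) && LATIN_WORD.test(title);
}
```

Typographic quotes and guillemets fall outside these ranges, so a title that merely uses foreign quotation marks is not flagged by this particular sketch.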

Both Greasemonkey and IE7Pro use the script metadata, but their
installation mechanics differ slightly, so I uploaded the script to the
respective installation sites. The code is also pasted at the bottom of
this post.

for Greasemonkey (https://addons.mozilla.org/en-US/firefox/addon/748):
http://userscripts.org/scripts/show/59377

for IE7Pro (http://www.ie7pro.com):
http://www.iescripts.org/view-scripts-676p1.htm

script:

// ==UserScript==
// @name ggNoSpam
// @namespace http://www.geocities.com/schools_ring
// @description Google Groups spam filter
// @include http://groups.google.tld/*
// ==/UserScript==

window.addEventListener('load', function(e) {

  var msg = null;
  var div = document.getElementsByTagName('DIV');
  var i = 0;

  // Find the container that holds the topic table.
  for (i = 0; i < div.length; i++) {
    if (div[i].className == 'maincontoutboxatt') {
      msg = div[i].getElementsByTagName('TABLE')[0].tBodies[0].rows;
      break;
    }
  }

  // Nothing to filter if the container is not on this page.
  if (!msg) {
    return;
  }

  // Hide every row whose text mentions "wholesale".
  for (i = 0; i < msg.length; i++) {
    if (msg[i].textContent.toLowerCase().indexOf('wholesale') != -1) {
      msg[i].style.display = 'none';
    }
  }
}, false);

Francesco S. Carta

Oct 13, 2009, 12:41:32 PM

Well done, very good idea.

I've tried out the GM version linked above, but it doesn't work on
domains other than .com and gives false positives too.

I've modified it, changing the include, adding a dictionary (it's
going to grow) and replacing the "hiding" with different formatting.
Hiding posts completely is too dangerous: false positives must be
spotted and fixed:
-------


// ==UserScript==
// @name ggNoSpam
// @namespace http://www.geocities.com/schools_ring
// @version 0.2.3 + mod ;-)
// @description Google Groups spam filter
// @include http://groups.google.tld/group/*/topics*
// ==/UserScript==

/*@cc_on
/*@if (@_jscript)
var IE = true;
@else @*/
var IE = false;
/*@end
@*/

var DICTIONARY = 'leather|wholesale|whoelsale|fjrjtrade|';
DICTIONARY += 'peng Selina|dotradenow.com|toptradea.com';

var FILTER = new RegExp(DICTIONARY,'i');

var CONTENT_DIV_CLASS = 'maincontoutboxatt';

var msg = null;
var tmp = '';
var i = 0;
var div = document.getElementsByTagName('DIV');

for (i=0; i<div.length; i++) {
  if (div[i].className == CONTENT_DIV_CLASS) {
    msg = div[i].getElementsByTagName('TABLE')[0].tBodies[0].rows;
    break;
  }
}

if (msg) {
  for (i=0; i<msg.length; i++) {
    tmp = IE ? msg[i].innerText : msg[i].textContent;
    if (FILTER.test(tmp)) {
      msg[i].style.fontSize = '75%';
      msg[i].style.background = '#999';
    }
  }
}
-------
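
One detail about a dictionary built this way: the entries are joined straight into a regular expression, so the `.` in `dotradenow.com` matches any character, not only a dot. A minimal sketch of building the filter from a plain word list with the metacharacters escaped follows; `escapeRegExp` and `buildFilter` are hypothetical names for this illustration, not part of the posted script.

```javascript
// Hypothetical helpers for illustration only.
function escapeRegExp(text) {
  // Put a backslash in front of every regular expression metacharacter.
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function buildFilter(words) {
  var escaped = [];
  for (var i = 0; i < words.length; i++) {
    escaped.push(escapeRegExp(words[i]));
  }
  // Case-insensitive alternation of the literal entries.
  return new RegExp(escaped.join('|'), 'i');
}

var FILTER = buildFilter([
  'leather', 'wholesale', 'whoelsale', 'fjrjtrade',
  'peng Selina', 'dotradenow.com', 'toptradea.com'
]);
```

With this approach new entries can be added as plain text, at the cost of losing the ability to put real patterns in the list.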

I've had the same idea just today, but I'm too JS/DOM ignorant to do
it by myself, thanks a lot for posting it.

Cheers,
Francesco

--
Francesco S. Carta, http://fscode.altervista.org

VK

Oct 13, 2009, 1:41:15 PM

Thank you for your interest! Your post happened to appear just as I
was uploading the new 0.2.4 version of the script, so it was too late
to update anything. Please be assured that I'll look through your code
to include useful parts in the next update. I am also most interested
in the current way of removing spam, and any comments are welcome.

Anyone who has already installed the previous buggy version is
strongly advised to update to the current beta, 0.2.4

for Greasemonkey users (https://addons.mozilla.org/en-US/firefox/addon/748):
go to http://userscripts.org/scripts/show/59377

for IE7Pro users ( http://www.ie7pro.com ):
go to http://www.iescripts.org/view-scripts-676p1.htm

Anyone who wants to play with the script locally and install it
manually should remember that, to be recognized by the installer, the
script has to be named scriptname.user.js (Greasemonkey) or
scriptname.ieuser.js (IE7Pro) respectively. If it is named properly, it
is enough to drag and drop the updated version onto the browser display
area to start the installation. I have also placed the current version
on the Web under both names:
http://JavaScript.MyPlus.org/greasespot/ggNoSpam/0.2.4/ggNoSpam.user.js
http://JavaScript.MyPlus.org/greasespot/ggNoSpam/0.2.4/ggNoSpam.ieuser.js

VK

Oct 13, 2009, 2:25:24 PM
> go to http://www.iescripts.org/view-scripts-676p1.htm

>
> Who wants to play with the script locally and install it manually I am
> reminding that in order to be recognized by the installer the scripts
> have to be named as scriptname.user.js (Greasemonkey) and
> scriptname.ieuser.js (IE7Pro) respectively. If named properly it is
> enough to drag-drop updated version to the browser display area to
> start the installation. I placed the current version in both name
> versions to the Web as well:
>  http://JavaScript.MyPlus.org/greasespot/ggNoSpam/0.2.4/ggNoSpam.user.js
>  http://JavaScript.MyPlus.org/greasespot/ggNoSpam/0.2.4/ggNoSpam.ieuse...

For one thing, I can tell that our thoughts were running in parallel :-)
I was also concerned about possible false positives, so I changed the
logic from removing to hiding, with a quick way to check.

What I am dreaming about now, besides regular improvement and filter
adjustment, is to make it not only a personal convenience but some
"good for society action".
Google Groups has a spam report tool, but it takes a lot of clicks and
it requires opening the spam message first, even if it is obvious from
the header that it is spam. One needs to open the message > More
options > Report this message > This message is spam > Report Abuse, so
not everyone does it, although everyone hates spam. So I am thinking of
adding a "report as spam" button to each "spam (hover to check)" row to
make it a one-click operation. There could also be a "report all sorted
out as spam" button or the like.
Any code block suggestions and variants are welcome.

Francesco S. Carta

Oct 13, 2009, 5:30:27 PM

Yes, we both came to the same logical conclusions ;-)

> What I am dreaming about now - besides regular improvement and filter
> adjustment - is to make it not only a personal convenience but some
> "good for society action".
> Google Groups has spam report tool but it is click-consuming plus it
> requires to open first the spam message even if it is obvious from the
> header that it is a spam. One needs to open message > More options >
> Report this message > This message is spam > Report Abuse so not
> everyone does while anyone hates spam. So I am thinking to all to
> "spam (hover to check)" a "report as spam" button to make it a one
> click operation. It could be also "report all sorted out as spam" or
> like button.

I suppose you aren't accustomed to viewing the topic list in "topic
summary" mode. In that mode each post has a "report as spam" link.
You will also notice that your script doesn't work there; at least,
not the version I posted above. I'm going to check out your latest
version.

Anyway, it would be good to have those improvements you wrote about in
the "topic list" mode. But first of all you should check if it is
really worth it: I reported a lot of those messages as spam but I
didn't notice any of them disappearing :-(

> Any code block suggestions and variants are welcome.

Uh, on my part, I'd have to understand the code as it is, first!

I wish you a good continuation. I'll download and try out your new
versions as you release them, and I'll drop in again if I have
something interesting to say.

Francesco S. Carta

Oct 13, 2009, 5:48:57 PM
I've just checked out version 0.2.5, nice work, I didn't spot any
false positives.

Currently, when checking them out, the tooltip takes too long to pop
up.

It would be better to display the post title immediately when
hovering, directly in that cell. You could make the cell span all
columns and give it a fixed height with hidden overflow, so that other
content isn't displaced and doesn't flicker when hovering. These are
just a couple of things I'm bringing in from my CSS experience, which
must be even more feasible with JS.

The script still doesn't work in "topic summary" mode, and you should
remember to replace ".com" with ".tld" in the "include" path.

I look forward to the next version.

Francesco S. Carta

Oct 14, 2009, 2:21:28 PM
I've fiddled a bit with your script, here is a screenshot of the
result:

http://fscode.altervista.org/gg_spam2.png

(the visible spam link is the one I was hovering when I took the shot)

The code is here below.

Sorry for having changed it that much, I'm just taking the chance to
practice and to learn something more about JS.

Please point out any bad thing I might have done. So far my changes
don't raise any errors in the Firefox console (v. 3.5.3), but maybe
I've introduced some feature that isn't available in previous versions
of JS or in IE7Pro (I don't have it, so I cannot test it), hence
limiting the "portability" (pardon the term, I come from a C++
background, should it matter).

(the actual regexpr isn't that important for me, the one I use here
below is enough for my needs, and the character-matching pattern I
omitted happened to give me false positives, for instance with foreign
quotation marks and with "inclined" single quotes, all of which are
perfectly reasonable in a non-spam post)

-------
// ==UserScript==
// @name ggNoSpam
// @namespace http://www.geocities.com/schools_ring
// @version 0.2.5 (FSC mod)
// @description Google Groups spam filter
// @include http://groups.google.tld/group/*/topics*
// ==/UserScript==

void((function(){

  // Change the filter upon spammers' current tricks:
  var spambuzz = 'paypal|discount|(hot |whole|whoel)sale|peng selina';
  var spamfilter = new RegExp(spambuzz,'i');

  /*@cc_on
  /*@if (@_jscript)
  var IE = true;
  @else @*/
  var IE = false;
  /*@end
  @*/

  var cover = document.createElement('TR');
  cover.appendChild(document.createElement('TD'));
  cover.childNodes[0].setAttribute('colspan', '6');
  cover.childNodes[0].style.fontSize = '80%';

  var container = document.getElementsByClassName('maincontoutboxatt')[0];
  var table = container.getElementsByTagName('TABLE')[0].tBodies[0];
  var messages = table.rows;

  var spameven = false;
  for (var i = 0; i < messages.length; ++i) {
    var msgcontent = IE ? messages[i].innerText : messages[i].textContent;
    if (spamfilter.test(msgcontent)) {
      var spamclass = spameven ? 'spameven' : 'spamodd';
      spameven = !spameven;
      var newmsg = cover.cloneNode(true);
      newmsg.firstChild.setAttribute('class', spamclass);
      newmsg.firstChild.innerHTML = messages[i].childNodes[3].innerHTML;
      newmsg.firstChild.firstChild.innerHTML = msgcontent;
      table.replaceChild(newmsg, messages[i]);
    }
  }

  var style = document.createElement('STYLE');
  style.innerHTML = 'td.spameven { background: #DDD; } '
    + 'td.spameven * { color: #DDD; } '
    + 'td.spamodd { background: #EEE; } '
    + 'td.spamodd * { color: #EEE; } '
    + 'td.spameven:hover *, '
    + 'td.spamodd:hover * { color: #444; } '
    + 'td.spameven:before, td.spamodd:before '
    + '{ content: "(spam) "; color: #777; } ';
  document.getElementsByTagName('BODY')[0].appendChild(style);

})())
-------

I look forward to any corrections.

Cheers,
Francesco

VK

Oct 14, 2009, 4:12:51 PM

I am totally busy on a project right now, sorry. Tomorrow I hope to
wangle an hour or two for gg. One question: are you sure you want to
display hidden content on hover? IMHO, with the mouse regularly roaming
top-down|down-up over the topics, it creates a lot of "visual noise" as
row content expands|collapses horizontally. Maybe it is better to leave
it to an explicit user action such as click|Enter on "show me"? It is
nothing but IMHO and a question to anyone.

Francesco S. Carta

Oct 14, 2009, 4:44:26 PM
VK <schools_r...@yahoo.com> wrote:
> Francesco S. Carta wrote:
> > Please point out any bad thing I might have done - so far my changes
> > don't raise any error within the Firefox console (v. 3.5.3), but maybe
> > I've introduced some feature that isn't available in previous versions
> > of JS or in IE7Pro (I don't have it, I cannot test it), hence limiting
> > the "portability" - pass me the term, I come from a C++ background,
> > shall it matter.
>
> > (the actual regexpr isn't that important for me, the one I use here
> > below is enough for my needs, and the character-matching pattern I
> > omitted happened to give me false positives, for instance with foreign
> > quotation marks and with "inclined" single quotes, all of which are
> > perfectly reasonable in a non-spam post)
>
> > -------

[snip mod]

> > -------
>
> > I look forward for any correction.
>
> I am totally busy now on a project, sorry. Tomorrow I hope to mangle
> an hour or two for gg. One question: are you sure you want to display
> hidden content on hover? IMHO with a regular moused roaming top-down|
> down-up by topics it creates a lot of "visual noise" with rows content
> horizontally expanding|collapsing. Maybe it is better to leave it to
> an explicit user action such as click|Enter on "show me"? It is
> nothing but IMHO and a question to anyone.

On my part, I changed it in that way exactly because I wanted it to
display the hidden content immediately, but I understand that somebody
else could prefer otherwise.

(the most important thing I wanted to avoid, wrt my previous posts,
was content displacing on hover (assuming the hide/display was done
via JS by actually changing the TDs content); the code above does the
hide/display by changing the text color; I haven't been exactly clear
about what I wanted, sorry)

I suppose that such behaviors could be controlled by adding some kind
of sidebar to the page, with links like "show all spam", "hide all
spam", "show on click", "show on hover", "show as a tooltip" and so
on, up to you to add such features to the "official" script, if you
want to.

I'll gladly download the next versions - eventually, I'll mess with
them once more ;-)

Have fun with your projects,
cheers,

Jorge

Oct 14, 2009, 5:22:52 PM

Slightly re-touched...

(function (f,cdc,div,msg,i) {
  for (i=0; i<div.length; i++) {
    if (div[i].className === cdc) {
      msg = div[i].getElementsByTagName('TABLE')[0].tBodies[0].rows;
      break;
    }
  }
  if (msg) {
    for (i=0; i<msg.length; i++) {
      if (f.test(msg[i].textContent)) {
        msg[i].style.opacity = 0.2;
      }
    }
  }
})(/leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|toptradea.com/i,
   'maincontoutboxatt',
   document.getElementsByTagName('div'))

...and minified it makes a nice bookmarklet :-)

javascript:(function(f,cdc,div,msg,i){for(i=0;i<div.length;i++){if(div[i].className===cdc){msg=div[i].getElementsByTagName('TABLE')[0].tBodies[0].rows;break;}}if(msg){for(i=0;i<msg.length;i++){if(f.test(msg[i].textContent)){msg[i].style.opacity=0.2;}}}})(/leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|toptradea.com/i,'maincontoutboxatt',document.getElementsByTagName('div'))

(Tested in Safari & FF)

--
Jorge.

Francesco S. Carta

Oct 14, 2009, 5:46:21 PM

Really nice, well done - but, ugh! that's a bit cryptic for me... I'll
study it, being able to "compress" scripts in this way should come in
handy someday ;-)

Thanks a lot for posting it, it gives me something new to munch on.

Francesco S. Carta

Oct 14, 2009, 6:27:29 PM
"Francesco S. Carta" <entul...@gmail.com> wrote:

> document.getElementsByTagName('BODY')[0].appendChild(style);

From the series: "How To Complicate Your Very Life" ;-)

-------
document.body.appendChild(style);
-------

Must I teach myself all alone? :-P

My bad, I'm recovering on http://jibbering.com/faq

VK

Oct 15, 2009, 2:32:00 PM
VK wrote:
> As these bags, t-shirts and other crap sellers from China are not
> calming, I wrote for myself a small Google Groups spam filter. I wrote
> in about 10 min where 8 were spent for studying Google page HTML
> structure, so it might be as crappy as it get :-) I am more concerned
> that some "shipping" offers are still sliding through so I overlooked
> something important in the string search.

OK, I have spent another 2 hours on it and released version 0.4.0,
available from the same location at
http://userscripts.org/scripts/show/59377
The Greasemonkey add-on itself is at https://addons.mozilla.org/en-US/firefox/addon/748

1) Versions 0.2.6 - 0.3.9 died under terrible torture during these
hours :) so they remained internal releases.
2) I tried to accommodate the interface suggestions from Francesco.
3) Extensive comments added.
4) I added our standard localization block with BDO support: UN
languages (Arabic, Chinese, English, French, Russian, Spanish) + bonus
pack (German, Italian, Japanese, Dutch, Portuguese). Anyone fluent in
a language from the list is welcome to correct the localization.
Extra languages are welcome.
5) IE8 is not fully operational software (just like Vista itself), so
the version for IE is not released. The core problem is with table
manipulation, namely the lack of it in IE beyond the most primitive
operations. The IE version, with my comments up to the point where
I dropped it for lack of time, can be obtained at
http://javascript.myplus.org/greasespot/ggNoSpam/0.4.0/ggNoSpam.ieuser.js
5) The spam filter is less aggressive now, to decrease false positives.
6) INCLUDE changed from google.com to google.tld to accommodate
national domain versions.

Comments and suggestions are welcome

VK

Oct 15, 2009, 2:39:01 PM
> 1) versions 0.2.6 - 0.3.9 died under terrible tortures during these
> hours :) so remained as internal releases.
> 2) I tried to accommodate interface suggestions from Francesco
> 3) Extensive comments added
> 4) I added our standard localization block with BDO support: UN
> languages (Arabic, Chinese, English, French, Russian, Spanish) + bonus
> pack (German, Italian, Japanese, Dutch, Portuguese). Ones fluent in
> any language from the list are welcome to correct the localization.
> Extra languages are welcome.
> 5) IE8 is not a fully operational software (just like Vista itself) so
> the version for IE is not released. The core problem is with table
> manipulations, namely the luck of such in IE beyond the most primitive
> operations. The version for IE with my comments until the point where
> I dropped it for the luck of time can be obtained at
>  http://javascript.myplus.org/greasespot/ggNoSpam/0.4.0/ggNoSpam.ieuse...

> 5) Spam filter is lesser aggressive now to decrease false positives.
> 6) INCLUDE changed from google.com to google.tld to accommodate
> national domain versions
>
> Comments and suggestions are welcome

7) last but not least: both "Topic list" and "Topic summary" views are
now supported.


VK

Oct 15, 2009, 2:50:14 PM

This message: http://groups.google.com/group/comp.lang.javascript/msg/e3114e8aff327c5f
gives a weird preview, as if the text were superimposed on the
interface. Is it some BDO-connected trick?

VK

Oct 16, 2009, 12:46:28 PM
So I updated the script today to the version 0.5.1
http://userscripts.org/scripts/show/59377

1) Many minor interface improvements

2) In List topic view it now preserves zebra-style rows coloring

3) Hebrew interface version added

4) BDO tricks in spam headers are now treated properly, so the sample
header content is positioned correctly

5) Opera .tld bug workaround added, so the script can now be used in
Opera. Opera has Greasemonkey support by default, so no add-on is needed.
a) in Opera installation folder create new folder; the name is not
important as long as it doesn't conflict with existing folders, here
"greasespot" name is used
b) Tools menu -> Preferences -> Advanced -> Content -> JavaScript
options… In the "User JavaScript files" field set the path to the
"greasespot" folder and hit OK
c) copy ggNoSpam.user.js to the "greasespot" folder
d) Opera's Greasemonkey implementation doesn't support .tld alias for
top-level domains, so you may need to change the default fix from
'.com' to what you need.
e) restart Opera and enjoy

6) A debugging period Easter egg added: now on alt-clicking "spam"
note it displays script name, version and applied spam filter.

7) misc
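
For point 5d, one way to avoid hand-editing the domain would be to include the script broadly and check the host name at run time. This is only a sketch under assumptions: `isGoogleGroupsHost` is a hypothetical helper, not code from ggNoSpam, and the pattern assumes national versions live at `groups.google.<tld>` or `groups.google.<second-level>.<tld>`, such as `groups.google.co.uk`. It is a convenience check, not a security measure.

```javascript
// Hypothetical stand-in for the .tld alias; not taken from ggNoSpam.
function isGoogleGroupsHost(hostname) {
  // Accepts groups.google.com, groups.google.it, groups.google.co.uk, ...
  return /^groups\.google\.(?:[a-z]{2,3}\.)?[a-z]{2,}$/i.test(hostname);
}

// Inside the user script's wrapper function, the check would guard
// everything else, for example:
//   if (!isGoogleGroupsHost(location.hostname)) { return; }
```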

The IE version is on hold because IE's current table manipulation
model is not functional, so a completely different approach has to be
found and implemented as time allows. Sorry for that.

I do answer letters sent to school...@yahoo.com, yet this box has been
used for years for my Usenet account, so it is perpetually overloaded
with spam even with all filters on. So I would prefer to be addressed
on the subject in this thread instead.

Francesco S. Carta

Oct 16, 2009, 2:51:18 PM
VK <schools_r...@yahoo.com> wrote:
> So I updated the script today to the version 0.5.1
>  http://userscripts.org/scripts/show/59377

Nice work!

And thanks a lot for adding me as a contributor, that's very kind of
you.

I'll dig your new release, I shall eventually add localization to my
script too.

Have good coding!

Matt Kruse

Oct 16, 2009, 3:53:53 PM
On Oct 16, 11:46 am, VK <schools_r...@yahoo.com> wrote:
> So I updated the script today to the version 0.5.1
>  http://userscripts.org/scripts/show/59377

Does your script also filter out your multiple posts about it? If so,
I will consider installing it...

Matt Kruse

VK

Oct 16, 2009, 4:11:39 PM

Hah! Welcome back! Now the whole "beating corner" :) is set but Randy
Webb...

A single cross-posting with followup-to set to a single group is OK.
If I was ever caught in the deadly sin of multi-posting, then name
the case.
All subsequent ggNoSpam-related posts are made in the same thread.

Jorge

Oct 16, 2009, 4:18:30 PM

Edit the regExp:

javascript:(function(s,i,e){i=(e=document.querySelectorAll('.maincontoutboxatt')[0].firstElementChild.tBodies[0].rows).length;while(i--){s.test(e[i].textContent)&&(e[i].style.opacity='0.2');}})(/leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|toptradea.com/i);

(FF & Safari)
--
Jorge.

Asen Bozhilov

Oct 17, 2009, 8:21:04 AM
On Oct 16, 11:18 pm, Jorge <jo...@jorgechamorro.com> wrote:

> Edit the regExp:
>
> javascript:(function(s,i,e){i=(e=document.querySelectorAll
> ('.maincontoutboxatt')[0].firstElementChild.tBodies
> [0].rows).length;while(i--){s.test(e[i].textContent)&&(e
> [i].style.opacity='0.2');}})(/leather|wholesale|whoelsale|fjrjtrade|
> peng Selina|dotradenow.com|toptradea.com/i);

I use a similar approach, but instead of querySelectorAll I use XPath
and document.evaluate:

var dictionary = /leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|toptradea.com/i,
    xpath_rows = document.evaluate(
        '//div[@class="maincontoutboxatt"][1]//tr',
        document,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    ),
    curr_tr;

for (var i = 2, j = 0; curr_tr = xpath_rows.snapshotItem(i++);)
{
    if (dictionary.test(curr_tr.textContent))
    {
        curr_tr.parentNode.removeChild(curr_tr);
    }
    else {
        curr_tr.bgColor = j++ % 2 == 0 ? '#f7f7f7' : '#ffffff';
    }
}


VK

Oct 17, 2009, 11:36:00 AM
Jorge wrote:
> > Edit the regExp:
>
> > javascript:(function(s,i,e){i=(e=document.querySelectorAll
> > ('.maincontoutboxatt')[0].firstElementChild.tBodies
> > [0].rows).length;while(i--){s.test(e[i].textContent)&&(e
> > [i].style.opacity='0.2');}})(/leather|wholesale|whoelsale|fjrjtrade|
> > peng Selina|dotradenow.com|toptradea.com/i);

Asen Bozhilov wrote:
> I use similar approach, but instead querySelectorAll i use XPATH and
> document.evaluate:
>
> var dictionary = /leather|wholesale|whoelsale|fjrjtrade|peng Selina|
> dotradenow.com|toptradea.com/i,
>     xpath_rows = document.evaluate(
>         '//div[@class="maincontoutboxatt"][1]//tr',
>                 document,
>         null,
>         XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
>         null
>     ),
>     curr_tr;
>     for (var i = 2, j = 0; curr_tr = xpath_rows.snapshotItem(i++);)
>     {
>             if (dictionary.test(curr_tr.textContent))
>         {
>                 curr_tr.parentNode.removeChild(curr_tr);
>         }
>         else {
>                     curr_tr.bgColor = j++ % 2 == 0 ? '#f7f7f7' : '#ffffff';
>         }
>     }

Great compacting work! But what about the Summary view? ;)

OK, I got my "IE's janitor" guy for an hour, so the next release works
in IE too. Honestly, I didn't follow the applied IE "logic" throughout;
I just told him that this script has to do exactly the same things as
on other platforms, and for the price of one Mac lunch on Monday it
does. :)


ggNoSpam 0.6
Tested to work on: IE8, Firefox 3.5.3, Safari 4.0.3, Google Chrome
3.0.195.27, Opera 10.0
A note for purists: it doesn't mean that it doesn't work
on any other platform, it means what it means: that it was
physically seen working on the spelled platforms.

Greasemonkey script for Firefox, Safari, Google Chrome, Opera:
http://userscripts.org/scripts/show/59377

Greasemonkey add-on for Firefox:
https://addons.mozilla.org/en-US/firefox/addon/748

Greasemonkey add-on for Safari (Mac OS only):
http://8-p.info/greasekit/

Google Chrome used to have built-in support
via the extra argument -enable-greasemonkey, but
it seems to be blocked in the latest release.

Opera has built-in Greasemonkey support, see
http://groups.google.com/group/comp.lang.javascript/msg/65ba1be889c16a13
for details.


Greasemonkey script for IE:
http://www.iescripts.org/view-scripts-676p1.htm

Greasemonkey add-on for IE:
http://www.ie7pro.com

As usual any constructive comments and corrections are most welcome.

RobG

Oct 17, 2009, 7:15:08 PM
On Oct 13, 2:26 am, VK <schools_r...@yahoo.com> wrote:
> As these bags, t-shirts and other crap sellers from China are not
> calming, I wrote for myself a small Google Groups spam filter.

Why not use Google Groups KillFile?

<URL: http://www.penney.org/google-groups-killfile-34-released.html >

It's been around for a couple of years, seems to be well supported and
works great for me.


--
Rob

Asen Bozhilov

Oct 18, 2009, 7:27:13 AM
On Oct 17, 6:36 pm, VK <schools_r...@yahoo.com> wrote:
> Great compacting work! But what about the Summary view? ;)

I don't like that summary view. I prefer a list of normal topics first,
followed by a list of spam topics. Try my implementation of that:

<code>
var DICTIONARY = /leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|toptradea.com/i,
    SPAM_TOPICS = 'Spam Topic',
    SHOW_SPAM = 'Show spam',
    HIDE_SPAM = 'Hide spam';

var xpath_rows = document.evaluate(
        '//div[@class="maincontoutboxatt"][1]//tr',
        document,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    ),
    spam_rows = [],
    curr_tr;

for (var i = 2, j = 2, k = 0; curr_tr = xpath_rows.snapshotItem(i++);)
{
    if (DICTIONARY.test(curr_tr.textContent))
    {
        curr_tr.parentNode.appendChild(curr_tr);
        setEvenOddColor(curr_tr, k++ % 2);
        spam_rows.push(curr_tr);
    }
    else {
        setEvenOddColor(curr_tr, j++ % 2);
    }
}

if (k)
{
    var spam_butt = document.createElement('input'),
        topic_tbl = xpath_rows.snapshotItem(0).parentNode.parentNode,
        title_row = topic_tbl.insertRow(j),
        title_cell = title_row.insertCell(0);

    spam_rows.push(title_row);

    title_cell.colSpan = 6;
    title_cell.innerHTML = '<br /><b>' + SPAM_TOPICS + '</b><br /><br />';

    spam_butt.type = 'button';
    spam_butt.addEventListener('click', showHideSpam, false);
    insertSpamButton();

    showHideSpam();
}

function setEvenOddColor(html_el, even_odd)
{
    html_el.style.backgroundColor = even_odd ? '#FFFFFF' : '#F7F7F7';
}

function showHideSpam()
{
    var dspl = title_row.style.display == '' ? 'none' : '';
    for (var i = 0, len = spam_rows.length; i < len; i++)
    {
        spam_rows[i].style.display = dspl;
    }
    spam_butt.value = (dspl ? SHOW_SPAM : HIDE_SPAM) + ' (' + k + ')';
}

function insertSpamButton()
{
    var form = document.evaluate(
        '//div[@class="maincontbox"]//form',
        document,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    ).snapshotItem(0);
    form.insertBefore(spam_butt, form.firstChild);
}
</code>

> As usual any constructive comments and corrections are most welcome.

Bg locale of ggNoSpam:

bg: { // Bulgarian
spam: '\u0441\u043F\u0430\u043C',
good: '\u043D\u0435 \u0435 \u0441\u043F\u0430\u043C',
rtl: false,
italic: true
}
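
A sketch of how an entry like this might be selected at run time. Only the entry shape (`spam`, `good`, `rtl`, `italic`) and the Bulgarian strings come from the post; the `LOCALES` table name, the English strings and the `pickLocale` function are assumptions for illustration, since the actual ggNoSpam localization block is not shown in this thread.

```javascript
var LOCALES = {
  // English entry: assumed wording, not taken from ggNoSpam.
  en: { spam: 'spam', good: 'not spam', rtl: false, italic: true },
  // Bulgarian entry as posted above.
  bg: {
    spam: '\u0441\u043F\u0430\u043C',
    good: '\u043D\u0435 \u0435 \u0441\u043F\u0430\u043C',
    rtl: false,
    italic: true
  }
};

function pickLocale(locales, languageTag) {
  // 'bg-BG' -> 'bg'; anything unknown or missing falls back to English.
  var code = String(languageTag || '').toLowerCase().split('-')[0];
  return Object.prototype.hasOwnProperty.call(locales, code)
    ? locales[code]
    : locales.en;
}
```

In a browser the tag would typically come from `navigator.language`, for example `pickLocale(LOCALES, navigator.language)`.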

VK

Oct 18, 2009, 7:54:33 AM
On Oct 18, 3:27 pm, Asen Bozhilov <asen.bozhi...@gmail.com> wrote:
> On Oct 17, 6:36 pm, VK <schools_r...@yahoo.com> wrote:
>
> > Great compacting work! But what about the Summary view? ;)
>
> I don't like that summary view. I prefer first list of normal topic,
> and after that list of spam topic. Try my implementation of that:

I definitely will. Is this an implicit permission to use any part of
your code in ggNoSpam under GPLv3? If yes, and if it is used, can I add
you to the contributors list?

As for the Summary view: I don't like it either, so in the first
release I even forgot about it, as I never use it myself. Yet for
public software, as opposed to a personal script, all legitimate usage
modes have to be accounted for, and the Summary view is one of them.

Thank you!

Jorge

Oct 18, 2009, 8:39:36 AM
On Oct 17, 5:36 pm, VK <schools_r...@yahoo.com> wrote:

> Jorge wrote:
> > > javascript:(function(s,i,e){i=(e=document.querySelectorAll
> > > ('.maincontoutboxatt')[0].firstElementChild.tBodies
> > > [0].rows).length;while(i--){s.test(e[i].textContent)&&(e
> > > [i].style.opacity='0.2');}})(/leather|wholesale|whoelsale|fjrjtrade|
> > > peng Selina|dotradenow.com|toptradea.com/i);
>
> Great compacting work! But what about the Summary view? ;)

Works for both:

javascript:(function(s,i,e){i=(e=document.querySelectorAll('.maincontoutboxatt')[0].getElementsByTagName('tr')).length;while(i--){s.test(e[i].textContent)&&(e[i].style.opacity='0.2');}})(/leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|toptradea.com/i);

(Safari, Chrome & Firefox)
--
Jorge.

Asen Bozhilov

Oct 18, 2009, 9:07:11 AM
On Oct 18, 2:54 pm, VK <schools_r...@yahoo.com> wrote:

> I definitely will. Is this an implicit permission to use any part of
> your code in ggNoSpam under GPLv3? If yes and used can I add you to
> the contributirs list?

Of course.


VK

Oct 18, 2009, 9:27:50 AM
VK wrote:
> > Great compacting work! But what about the Summary view? ;)

Jorge wrote:
> Works for both:
>
> javascript:(function(s,i,e){i=(e=document.querySelectorAll
> ('.maincontoutboxatt')[0].getElementsByTagName('tr')).length;
> while(i--){s.test(e[i].textContent)&&(e[i].style.opacity='0.2');}})(
> /leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|
> toptradea.com/i);
>
> (Safari, Chrome & FiereFox)

Wow! :)
A side note: for bookmarklets (aka favelets) it is suggested to wrap
the anonymous function expression in void():
javascript:void((function(){/*your code here*/})())
This way you solve two problems:

1) preventing an accidental return value from your function in case of
premature termination, say
if (something_is_wrong) {
return; // or return null;
}

2) informing the engine in advance that this link will produce no
JavaScript-generated content, so there will be no navigation; that
prevents the "waiting mode" state for long-running scripts.
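
Point 1 can be checked outside a bookmarklet. The sketch below only shows that `void` discards whatever the wrapped function returns; it says nothing about point 2, and the names `work`, `bare` and `wrapped` are made up for the example.

```javascript
// A function with an early exit that returns a value, as in point 1.
function work(somethingIsWrong) {
  if (somethingIsWrong) {
    return null; // early exit with a non-undefined value
  }
  return 'done';
}

// Without void, the expression's value is whatever the function returned.
var bare = (function () { return work(true); })();

// With void, the expression always evaluates to undefined.
var wrapped = void (function () { return work(true); })();
```

A function that never returns a value already evaluates to `undefined`, so the wrapper matters only when some code path returns something else.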

Jorge

unread,
Oct 18, 2009, 11:54:43 AM10/18/09
to
On Oct 18, 3:27 pm, VK <schools_r...@yahoo.com> wrote:
>
> A side note: for bookmarklets (aka favelets) it is suggested to wrap
> the anonymous function expression into void():

The way you write it it looks like if void were a function... :-)

void f();

> javascript:void((function(){/*your code here*/})())
> This way you are solving two problems:
>
> 1) preventing occasional return value from your function in case of a
> premature termination, say
> if (something_is_wrong) {
>   return; // or return null;
> }

In the (not many) browsers that I've tested it in, unless something
!== undefined is returned, the page won't be navigated away.

My function returns undefined (because there's not an explicit return-
anything).

So, ISTM, whether void is needed or not depends upon what would get
returned -if anything- in the event of a parse or a runtime error ?

> 2) informing the engine in advance that there will be no JavaScript-
> generated content from this link so there will be no navigation - that
> prevents "waiting mode" state for long-running scripts.

I don't know, what's 'waiting mode' ?
Why does the engine need to be informed "in advance" ?

Thanks,
--
Jorge.

VK

unread,
Oct 18, 2009, 2:02:24 PM10/18/09
to
VK wrote:
> > A side note: for bookmarklets (aka favelets) it is suggested to wrap
> > the anonymous function expression into void():
Jorge wrote:
> The way you write it it looks like if void were a function... :-)
>
> void f();

:) Both forms are allowed by the specs; SHOULD !== MUST ;) Or, to
put it even more simply :) : it is an operator followed by a function
expression in parentheses:
void (function(){}())
with the space after the operator omitted for compactness, which is
allowed if the next token is a parenthesis. I assume no one has an
objection to that?

> > javascript:void((function(){/*your code here*/})())
> > This way you are solving two problems:
>
> > 1) preventing occasional return value from your function in case of a
> > premature termination, say
> > if (something_is_wrong) {
> >   return; // or return null;
> > }
>
> In the (not many) browsers that I've tested it in, unless something
> !== undefined is returned, the page won't be navigated away.

The problem I ran into was with good programmers who were used to
making premature terminations or failure exits with
return null;
rather than with just
return;
Rather than fight with every single one, it was easier to add one more
wrapper.

kangax

unread,
Oct 18, 2009, 5:54:14 PM10/18/09
to
Jorge wrote:
> On Oct 17, 5:36 pm, VK <schools_r...@yahoo.com> wrote:
>> Jorge wrote:
>>>> javascript:(function(s,i,e){i=(e=document.querySelectorAll
>>>> ('.maincontoutboxatt')[0].firstElementChild.tBodies
>>>> [0].rows).length;while(i--){s.test(e[i].textContent)&&(e
>>>> [i].style.opacity='0.2');}})(/leather|wholesale|whoelsale|fjrjtrade|
>>>> peng Selina|dotradenow.com|toptradea.com/i);
>> Great compacting work! But what about the Summary view? ;)
>
> Works for both:
>
> javascript:(function(s,i,e){i=(e=document.querySelectorAll
> ('.maincontoutboxatt')[0].getElementsByTagName('tr')).length;

Why call `querySelectorAll` if you only need first element?

Just use `document.querySelector` ;)

> while(i--){s.test(e[i].textContent)&&(e[i].style.opacity='0.2');}})(
> /leather|wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|
> toptradea.com/i);

[...]

--
kangax
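
For illustration, a small sketch of the querySelector variant with a guard added. The guard is my own addition, not something the bookmarklet does: querySelector returns null when nothing matches, so chaining straight onto it throws on a page without the message list. The name findRows and the doc parameter are made up; doc would normally be the page's document.

```javascript
// Sketch only: `doc` is anything with a querySelector method,
// normally the page's `document`.
function findRows(doc) {
  // querySelector returns the first match or null, so no [0] is needed.
  var box = doc.querySelector('.maincontoutboxatt');
  // Guard: on a page without the message list there is nothing to scan.
  return box ? box.getElementsByTagName('tr') : [];
}
```

In the browser it would be called as findRows(document).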

Jorge

unread,
Oct 19, 2009, 2:12:47 PM10/19/09
to
On Oct 18, 11:54 pm, kangax <kan...@gmail.com> wrote:
> Jorge wrote:
>
> > javascript:(function(s,i,e){i=(e=document.querySelectorAll
> > ('.maincontoutboxatt')[0].getElementsByTagName('tr')).length;
>
> Why call `querySelectorAll` if you only need first element?
>
> Just use `document.querySelector` ;)

You learn something new (almost) every day...

Thanks !
--
Jorge.

Jorge

unread,
Oct 20, 2009, 5:43:28 AM10/20/09
to

What's wrong with it in Opera 10 ?

javascript:(function(s,i,e){i=(e=document.querySelector
('.maincontoutboxatt').getElementsByTagName('tr')).length;while(i--)
{s.test(e[i].textContent)&&(e[i].style.opacity='0.2')&&alert(e
[i].textContent+"\r\nopacity:"+e[i].style.opacity);}})(/leather|
wholesale|whoelsale|fjrjtrade|peng Selina|dotradenow.com|toptradea.com|
price|products/i);

Any ideas ?
--
Jorge.

VK

unread,
Oct 20, 2009, 6:51:53 AM10/20/09
to

Quotes. ... +"\r\nopacity:" ...
Replace with ... +'\r\nopacity:' ... to make it work.

A rule of thumb for error-proof typing:

1) In HTML/XHTML, act as if single quotes did not exist, so
you are forced to use only double quotes for attribute values.
2) In JavaScript, act as if double quotes did not exist, so
you are forced to use only single quotes for string values.

Of course it can be vice versa, and overall it is just a game with
yourself, but in my strong opinion it helps a lot.

Jorge

unread,
Oct 20, 2009, 8:23:00 AM10/20/09
to
On Oct 20, 12:51 pm, VK <schools_r...@yahoo.com> wrote:
>
> Quotes. ... +"\r\nopacity:" ...
> Replace to ... +'\r\nopacity:' ... to make it work
>
> The rule of thumb of an error protective typing:
>
> 1) In HTML/XHTML there are not single quotes as a reality entity, so
> you are forced to use only double quotes for attribute values.
> 2) In JavaScript there are not double quotes as a reality entity, so
> you are forced to use only single quotes for string values.
>
> Of course it can be vice versa and overall just a game with yourself
> but in my strong opinion it helps hugely a lot.

Hmm, what ?
--
Jorge

VK

unread,
Oct 20, 2009, 8:46:20 AM10/20/09
to

Using only single quotes for JavaScript strings and only double quotes
for HTML attributes. It can be vice versa of course (only double
quotes for JavaScript strings and only single quotes for HTML
attributes) or even no quotes whatsoever for HTML attributes and any
quotes for JavaScript strings. But by taking into account the common
usage pattern with HTML attributes placed into double quotes, for the
scripts made for the distribution on uncontrollable environments it is
better to stick to the very first option. By my 12 years experience it
eliminates a huge amount of sometimes very spurious errors. Just a
suggestion, any way.
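
A sketch of why the mix matters once a script ends up inside a double-quoted href attribute. toAttributeValue is a name invented for this example and is not part of ggNoSpam; it only handles the two characters relevant to a double-quoted attribute.

```javascript
// Hypothetical helper: makes bookmarklet source safe to paste into a
// double-quoted HTML attribute such as href="...".
function toAttributeValue(source) {
  return source
    .replace(/&/g, '&amp;')   // ampersands first, so the entities added next stay intact
    .replace(/"/g, '&quot;'); // then the quotes that would otherwise end the attribute
}

var withDouble = 'javascript:alert("opacity")';
var withSingle = "javascript:alert('opacity')";

// The double-quoted variant needs escaping before it can sit in href="...".
var link = '<a href="' + toAttributeValue(withDouble) + '">ggNoSpam</a>';
```

The single-quoted variant passes through unchanged, which is the whole point of the convention: nothing to escape, nothing to forget.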

Jorge

unread,
Oct 20, 2009, 10:06:17 AM10/20/09
to
On Oct 20, 2:46 pm, VK <schools_r...@yahoo.com> wrote:
>
> Using only single quotes for JavaScript strings and only double quotes
> for HTML attributes. It can be vice versa of course (only double
> quotes for JavaScript strings and only single quotes for HTML
> attributes) or even no quotes whatsoever for HTML attributes and any
> quotes for JavaScript strings. But by taking into account the common
> usage pattern with HTML attributes placed into double quotes, for the
> scripts made for the distribution on uncontrollable environments it is
> better to stick to the very first option. By my 12 years experience it
> eliminates a huge amount of sometimes very spurious errors. Just a
> suggestion, any way.

Neither
e[i].style.opacity='0.2'
nor
e[i].style.opacity="0.2"
nor
e[i].style.opacity= 0.2

work in Opera 10 ("not work" === the change in opacity is not visible
on-screen), although the element's .style.opacity gets properly set in
all three cases (checking that is what the alert(e[i].style.opacity) is
there for)... but why ?
--
Jorge.

VK

unread,
Oct 20, 2009, 11:04:25 AM10/20/09
to

Oh, sorry. In your first post there was a quote mishmash in alert
block so it would fail to be inserted into HTML page for bookmarking,
this is what I was referring to. After the quote correction it started
to show the alert and I didn't follow the reason itself of this alert
display, sorry again, I was distracted this morning.

Currently Opera doesn't support opacity for table elements (tbody,
rows), only for the entire table. The relevant style interfaces are
provided for all elements but below TABLE these are just loopholes to
prevent possible environment detection - a stinky "programming
approach" Opera is famous for from the very beginning.

This is a part of the reasons I didn't go for opacity manipulations in
ggNoSpam - it is still too buggy if taken across all prominent
browsers.

kangax

unread,
Oct 20, 2009, 11:19:39 AM10/20/09
to
Jorge wrote:
> On Oct 19, 8:12 pm, Jorge <jo...@jorgechamorro.com> wrote:
>> On Oct 18, 11:54 pm, kangax <kan...@gmail.com> wrote:
>>
>>> Jorge wrote:
>>>> javascript:(function(s,i,e){i=(e=document.querySelectorAll
>>>> ('.maincontoutboxatt')[0].getElementsByTagName('tr')).length;
>>> Why call `querySelectorAll` if you only need first element?
>>> Just use `document.querySelector` ;)
>> You learn something new (almost) every day...
>>
>
> What's wrong with it in Opera 10 ?

Looks like Opera doesn't respect opacity on THEAD, TBODY and TR. TABLE,
TH and TD seem to work fine.

IIRC, Opera 10 supports RGBA format, so you can try simulating opacity
with that:

<tr style="color: rgba(0,0,0,0.2)" ...>...</tr>

>
> javascript:(function(s,i,e){i=(e=document.querySelector
> ('.maincontoutboxatt').getElementsByTagName('tr')).length;while(i--)

Why not just `document.querySelectorAll('.maincontoutboxatt tr')` ?

[...]

--
kangax

VK

unread,
Oct 20, 2009, 11:52:47 AM10/20/09
to
VK wrote:
> This is a part of the reasons I didn't go for opacity manipulations in
> ggNoSpam - it is still too buggy if taken across all prominent
> browsers.

Not saying that table manipulations are so much better on the same
wide scale :)

VK

unread,
Oct 20, 2009, 1:52:40 PM10/20/09
to

Just in case, I'd like to note that I did not knowingly watch other
project participants go the opacity way until they hit Opera's quirk.
As Opera has no commercial importance for the projects in my area,
I run support checks on it only occasionally and as my time
allows, so I simply forgot about this silly bug.

Jorge

unread,
Oct 20, 2009, 3:32:35 PM10/20/09
to
On Oct 20, 5:19 pm, kangax <kan...@gmail.com> wrote:
> Jorge wrote:
> > On Oct 19, 8:12 pm, Jorge <jo...@jorgechamorro.com> wrote:
> >> On Oct 18, 11:54 pm, kangax <kan...@gmail.com> wrote:
>
> >>> Jorge wrote:
> >>>> javascript:(function(s,i,e){i=(e=document.querySelectorAll
> >>>> ('.maincontoutboxatt')[0].getElementsByTagName('tr')).length;
> >>> Why call `querySelectorAll` if you only need first element?
> >>> Just use `document.querySelector` ;)
> >> You learn something new (almost) every day...
>
> > What's wrong with it in Opera 10 ?
>
> Looks like Opera doesn't respect opacity on THEAD, TBODY and TR. TABLE,
> TH and TD seem to work fine.

<td>s would do.

> IIRC, Opera 10 supports RGBA format, so you can try simulating opacity
> with that:
>
> <tr style="color: rgba(0,0,0,0.2)" ...>...</tr>

No, some elements won't inherit because they've got their own colors.

> > javascript:(function(s,i,e){i=(e=document.querySelector
> > ('.maincontoutboxatt').getElementsByTagName('tr')).length;while(i--)
>

> Why not just `document.querySelectorAll('.maincontoutboxatt tr')` ?

Yes, but with <td>s:

javascript:(function(s,i,e){i=(e=document.querySelectorAll
('.maincontoutboxatt td')).length;while(i--){s.test(e[i].textContent)&&
(e[i].style.opacity='0.2');}})(/leather|wholesale|whoelsale|fjrjtrade|
peng Selina|dotradenow.com|toptradea.com|price|products/i);

(Safari >= 3, Chrome, Firefox >= 3.5, Opera >= 10, IE >= 27)

Thanks... again.
--
Jorge.
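
For readers who prefer it unpacked, a sketch of the same loop as a plain function. The name dimMatching, its parameters and the returned count are made up for the illustration; the bookmarklet itself returns nothing.

```javascript
// Readable sketch of the bookmarklet above. `cells` is any array-like
// of elements, e.g. document.querySelectorAll('.maincontoutboxatt td').
// `pattern` should not carry the g flag, or test() would keep state
// between cells.
function dimMatching(cells, pattern, opacity) {
  var dimmed = 0;
  for (var i = cells.length - 1; i >= 0; i--) {
    if (pattern.test(cells[i].textContent)) {
      cells[i].style.opacity = opacity; // set on the cell, not on its row
      dimmed++;
    }
  }
  return dimmed; // number of cells that matched
}
```

In the browser: dimMatching(document.querySelectorAll('.maincontoutboxatt td'), /wholesale|price/i, '0.2').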

VK

unread,
Oct 22, 2009, 1:42:17 PM10/22/09
to
This weekend I want to finish ggNoSpam RC1 (release candidate 1,
internal version 0.7.0), taking full account of the code and interface
suggestions in this thread. To accommodate the different display mode
preferences expressed here as well as by private e-mail, ggNoSpam
now comes with a user-friendly settings interface. This is why I am
wondering if anyone would like to invest a little bit of time to make
a localized interface version. Any volunteer should simply
replace the double-quoted strings with her/his language version and
replace ?? with the relevant language code.

?? : {
copyright: "free software under the GNU general license",
spam: "a spam",
good: "not a spam",
check: "hover to check",
regexp: "Anti-spam regular expression",
mode: "Detected spam display mode",
mode0: "Hide spam and commercials",
mode1: "Blur spam and commercials"
}

Language codes:
ar : Arabic
bg : Bulgarian
de : German
en : English
es : Spanish
fr : French
he : Hebrew
it : Italian
ja : Japanese
nl : Dutch
pt : Portuguese
ru : Russian
zh : Chinese (simplified)
If the necessary code is not found then you may use
http://msdn.microsoft.com/en-us/library/ms533052%28VS.85%29.aspx
as a convenience reference.

"hover to check" means content revealing on mouse pointer hovering
(placed over) the suspected spam.

In "Blur spam and commercials", blur refers to adding some transparency
to the text so that it gets blurry, less visible.

It is a free software and no one gets paid for it, yet every
contributor will be added to the contributors list included into
script.
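
For translators who wonder how such a table might be consumed, a rough sketch follows. It is an assumption, not the actual ggNoSpam code: the names STRINGS and getString, the English fallback and the placeholder language code xx are all invented for the illustration.

```javascript
// Assumed layout: one entry per language code, English as the fallback.
var STRINGS = {
  en: {
    copyright: "free software under the GNU general license",
    spam: "a spam",
    good: "not a spam",
    check: "hover to check"
  },
  // Placeholder entry for an incomplete translation (hypothetical code "xx").
  xx: {
    spam: "<translated text>"
  }
};

// Look up a key for a language; fall back to English when the language
// or the individual key is missing.
function getString(lang, key) {
  var has = Object.prototype.hasOwnProperty;
  var table = has.call(STRINGS, lang) ? STRINGS[lang] : STRINGS.en;
  return has.call(table, key) ? table[key] : STRINGS.en[key];
}
```

With a fallback like this, a partially translated entry would still be usable.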

Jorge

unread,
Oct 22, 2009, 4:04:11 PM10/22/09
to
On Oct 22, 7:42 pm, VK <schools_r...@yahoo.com> wrote:
> (...)

> "hover to check" means content revealing on mouse pointer hovering
> (placed over) the suspected spam.
> (...)

...and the respective "hover-to-check" bookmarklet version :

javascript:(function(o,s,e,t,i){i=e.length;while(i--){t.test(e
[i].textContent)&&(e[i][s][o]="0.3",e[i].onmouseover=e
[i].onmouseout=function(){this[s][o]=this[s][o]?"":"0.3";});}})
("opacity","style",document.querySelectorAll('.maincontoutboxatt td'),/
leather|wholesale|whoelsale|fjrjtrade|peng%20Selina|dotradenow.com|
toptradea.com|price|products|pharmacy|t-shirt|gucci/i);
--
Jorge.
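
The same thing unpacked, as a sketch: hoverToCheck and its parameters are names made up for readability, and one shared handler is used instead of a new function per cell. The behaviour is meant to match the bookmarklet: matching cells start dimmed, mouseover reveals them, mouseout dims them again.

```javascript
// `cells` is any array-like of elements, e.g.
// document.querySelectorAll('.maincontoutboxatt td').
function hoverToCheck(cells, pattern) {
  // Shared handler: a dimmed cell is revealed, a revealed cell is dimmed.
  function toggle() {
    this.style.opacity = this.style.opacity ? '' : '0.3';
  }
  for (var i = cells.length - 1; i >= 0; i--) {
    if (pattern.test(cells[i].textContent)) {
      cells[i].style.opacity = '0.3';
      cells[i].onmouseover = cells[i].onmouseout = toggle;
    }
  }
}
```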

VK

unread,
Oct 22, 2009, 4:40:48 PM10/22/09
to

Your code is used for the "blur mode". The "hide mode" removes spam
completely: one has to hover over the "hover to check" option to see
the beginning of the post title and, if it is a false positive or an
interesting commercial, click "not a spam" to exclude it from the
filtering. That is to accommodate two different wishes expressed on an
approximately 50/50 basis: some want to remove the spam completely from
view yet be able to check each case for false positives; others are
more concerned about false positives, so they want to make the spam
less distracting yet be able to see all filtered posts at once.
