
/ads.txt

Eli the Bearded

Aug 24, 2018, 7:14:06 PM
Anyone here?

I'm curious about /ads.txt. I've read some background material on it
that outlines how it is supposed to be used for validating ad sales
inventory or something like that.

https://digiday.com/marketing/wtf-ads-txt/

I do not put any ads on my site, I do not run any ads for my site, and I
do not sell or host ads for anyone else.

So why am I seeing so many hits to ads.txt?

Some are from Google, some are from who knows where (35.224.0.0/12 is
"Google Cloud"; 165.227.0.0/16 is Digital Ocean):

35.229.103.78 - - [24/Aug/2018:12:29:22 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "bidswitchbot/1.0"
66.249.70.24 - - [24/Aug/2018:12:52:38 -0400] "GET /ads.txt HTTP/1.1" 404 398 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
165.227.100.219 - - [24/Aug/2018:13:22:09 -0400] "GET http://qaz.wtf/ads.txt HTTP/1.1" 404 398 "-" "lua-resty-http/0.11 (Lua) ngx_lua/10010"
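
For anyone who wants to run the same tally on their own logs, here is a rough Python sketch. It assumes the combined log format shown in the three lines above and reuses the two network labels from this post; the names `NETWORKS`, `classify` and `ads_txt_hits` are made up for illustration and are not part of any tool.

```python
import ipaddress
import re

# The two ranges named in the post; the labels are the poster's own.
NETWORKS = {
    "Google Cloud": ipaddress.ip_network("35.224.0.0/12"),
    "Digital Ocean": ipaddress.ip_network("165.227.0.0/16"),
}

# Combined log format:
#   ip ident user [time] "method target proto" status size "referer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def classify(ip):
    """Return the label of the first listed network containing ip, else 'other'."""
    addr = ipaddress.ip_address(ip)
    for label, net in NETWORKS.items():
        if addr in net:
            return label
    return "other"


def ads_txt_hits(lines):
    """Return (ip, network label, user agent) for each request for /ads.txt."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m is None:
            continue  # not a combined-format line; skip it
        # The target may be a bare path or an absolute URL (as in the
        # lua-resty-http request), so compare only the end of it.
        target = m["target"].split("?", 1)[0]
        if target.endswith("/ads.txt"):
            hits.append((m["ip"], classify(m["ip"]), m["agent"]))
    return hits
```

Feeding it the three lines above labels the bidswitchbot request as Google Cloud, the lua-resty-http one as Digital Ocean, and the Googlebot one as "other", since 66.249.70.24 falls in neither listed range.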

I've created a blank ads.txt now. Will that work to tell all of these
bots that no ads should exist for my site?

Elijah
------
organic traffic only

Doc O'Leary

Aug 25, 2018, 2:45:57 PM
For your reference, records indicate that
Eli the Bearded <*@eli.users.panix.com> wrote:

> So why am I seeing so many hits to ads.txt?

Malicious scans. Do you also see a lot of bogus WordPress URLs?
Same thing.

> I've created a blank ads.txt now. Will that work to tell all of these
> bots that no ads should exist for my site?

It’s better to block their IP address completely. Even better, block
entire ranges by those “cloud” providers. Stop the abuse rather than
the notification of the problem.

--
"Also . . . I can kill you with my brain."
River Tam, Trash, Firefly


Eli the Bearded

Aug 25, 2018, 10:47:15 PM
In comp.infosystems.www.misc,
Doc O'Leary <drol...@2017usenet1.subsume.com> wrote:
> For your reference, records indicate that
> Eli the Bearded <*@eli.users.panix.com> wrote:
>> So why am I seeing so many hits to ads.txt?
> Malicious scans. Do you also see a lot of bogus WordPress URLs?
> Same thing.

These days the biggest malicious scan offender is the D-Link one
(tries to use /login.cgi to wget and run a shell script). I don't
have any reason to think Googlebot doing a GET on a .txt file is
a malicious scan.

> It’s better to block their IP address completely. Even better, block
> entire ranges by those “cloud” providers. Stop the abuse rather than
> the notification of the problem.

Advice like this I can get from any hypochondriac webmaster forum.
I'm perfectly capable of deciding what to block or not block on my own.
My question was just about how ad agencies use ads.txt.

Elijah
------
doesn't think the dlink scan has yet repeated a source IP address

Doc O'Leary

Aug 26, 2018, 12:20:49 PM
For your reference, records indicate that
Eli the Bearded <*@eli.users.panix.com> wrote:

> I don't
> have any reason to think Googlebot doing a GET on a .txt file is
> a malicious scan.

Since you aren’t in a business relationship with them for AdSense or
any other advertising service, there’s really no legitimate reason for
them to be scanning unpublished URLs like that. Save, of course, for
the fact that they’re looking to hoover up any and all information
about everyone they can get their hands on. I see them probing under
/.well-known/ and random 404 URLs as well. Google stopped playing
nice a long time ago.

> > It’s better to block their IP address completely. Even better, block
> > entire ranges by those “cloud” providers. Stop the abuse rather than
> > the notification of the problem.
>
> Advice like this I can get from any hypochondriac webmaster forum.
> I'm perfectly capable of deciding what to block or not block on my own.
> My question was just about how ad agencies use ads.txt.

There’s plenty of information online about legitimate uses for that
file. What should concern you, since you apparently don’t buy or show
ads, is the improper uses. Same as for any other scans for invalid
URLs on your site. Serve up a blank file if you like. I personally
issue a 204 response for things like that, saving the bans for probes
that are directly going after exploit URLs.
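
As an illustration of the 204 approach, here is a minimal sketch using Python's standard `http.server`. It is not the configuration actually in use on any site in this thread; the `QUIET_PATHS` set, the `QuietHandler` class and the `make_server` helper are hypothetical names, and a real site would more likely do this in its web server's own configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Hypothetical list of probe paths to answer with an empty 204.
QUIET_PATHS = {"/ads.txt"}


class QuietHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # urlsplit also copes with absolute-form targets such as
        # "http://qaz.wtf/ads.txt", reducing them to "/ads.txt".
        path = urlsplit(self.path).path
        if path in QUIET_PATHS:
            self.send_response(204)  # No Content: success, no body
            self.end_headers()
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), QuietHandler)
```

Calling `make_server(8080).serve_forever()` would answer `GET /ads.txt` with an empty 204 and everything else with a 404.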