
iptables string match


buck

Apr 1, 2013, 3:49:28 PM
Is it possible to match a string in nat table PREROUTING? I suspect
PREROUTING only sees new connections, so the content of the packet cannot
be examined.

If not in nat, can mangle be used to mark packets matching a string? If
so, can such a marked packet be DNATted? How?
--
buck

Martijn Lievaart

Apr 1, 2013, 5:50:08 PM
On Mon, 01 Apr 2013 19:49:28 +0000, buck wrote:

> Is it possible to match a string in nat table PREROUTING? I suspect
> PREROUTING only sees new connections, so the content of the packet
> cannot be examined.

PREROUTING sees all packets before routing. However, the nat table sees
only the first packet of each connection.

>
> If not in nat, can mangle be used to mark packets matching a string? If

Should work.

> so, can such a marked packet be DNATted? How?

Yes and no. What do you want to achieve? If it is not the first packet of
a "connection", DNATting probably does not do what you want.

M4

buck

Apr 1, 2013, 6:36:12 PM
Martijn Lievaart <m...@rtij.nl.invlalid> wrote in
news:g1cp2a-...@news.rtij.nl:
Thanks!

This may sound foolish, but there is very good reason for it.

I run IBM OmniFind, which is the same crawler engine that Yahoo! uses to
index web pages. I need to serve a substitute robots.txt when OmniFind
asks a remote host for its robots.txt. The reason is that this is the
fastest and most reliable way to purge from OmniFind those links that are
404 or 410 (not found / gone).

I have no control over when OmniFind requests robots.txt from that
particular remote host, so debugging is difficult.

What I've done is to set up a small web server on my gateway box that
listens on port 1080 and serves only robots.txt (containing 559 bad links
right now). I hope that something like

iptables -t mangle -I PREROUTING -p tcp -m string --string robots.txt \
-j MARK --set-mark 0x2
iptables -t nat -I OUTPUT -p tcp -m mark --mark 0x2 -j REDIRECT \
--to-ports 1080

will cause the marked packets to be sent to my web server. And
accomplish the objective...

Note that the iptables "mangle" line above has been edited in hopes that
removing a bunch of "stuff" will improve clarity with respect to what I'm
trying to accomplish.
--
buck

Martijn Lievaart

Apr 2, 2013, 1:52:14 PM
On Mon, 01 Apr 2013 22:36:12 +0000, buck wrote:

>> Yes and no. What do you want to achieve? If it is not the first packet
>> of a "connection", DNATting probably does not do what you want.
>>
>
> This may sound foolish, but there is very good reason for it.
>
> I run IBM OmniFind, which is the same crawler engine that Yahoo! uses to
> index web pages. I need to serve a substitute robots.txt when OmniFind
> asks a remote host for its robots.txt. The reason is that this is the
> fastest and most reliable way to purge from OmniFind those links that
> are 404 or 410 (not found / gone).
>
> I have no control over when OmniFind requests robots.txt from that
> particular remote host, so debugging is difficult.
>
> What I've done is to set up a small web server on my gateway box that
> listens on port 1080 and serves only robots.txt (containing 559 bad
> links right now). I hope that something like
>
> iptables -t mangle -I PREROUTING -p tcp -m string --string robots.txt \
> -j MARK --set-mark 0x2
> iptables -t nat -I OUTPUT -p tcp -m mark --mark 0x2 -j REDIRECT \
> --to-ports 1080
>
> will cause the marked packets to be sent to my web server. And
> accomplish the objective...
>
> Note that the iptables "mangle" line above has been edited in hopes that
> removing a bunch of "stuff" will improve clarity with respect to what
> I'm trying to accomplish.

This will never work. You cannot DNAT a connection that is already set up.
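
To see this on a live box, a minimal sketch (assuming plain HTTP on port
80, and a current iptables where the string match requires the --algo
option that is missing from the edited rule above) is to add two
count-only rules and compare their counters. The request line containing
"robots.txt" travels in a packet sent after the TCP handshake, so only
the ESTABLISHED counter should grow, and by then the nat table has
already been consulted for that connection. The helper name count_rules
is made up for this example.

```shell
#!/bin/sh
# Assumption: plain HTTP on port 80. "bm" (Boyer-Moore) is one of the two
# algorithms the string match accepts; the other is "kmp".
MATCH='-p tcp --dport 80 -m string --algo bm --string robots.txt'

# Print two count-only rules (no -j target), one per connection state,
# so they can be reviewed before being applied.
count_rules() {
    for state in NEW ESTABLISHED; do
        printf '%s\n' "iptables -t mangle -A PREROUTING $MATCH -m conntrack --ctstate $state"
    done
}

count_rules
# Apply as root with:   count_rules | sh
# Inspect the counters: iptables -t mangle -L PREROUTING -v -n
```

The rules have no target, so they only increment packet counters and do
not change how traffic is handled.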

Alternatives:
0) The above will (try to) redirect all requests for robots.txt, why not
just replace it on the target webserver?
1) If OmniFind has a fixed IP, DNAT that (but that will redirect all
requests, not just robots.txt)
2) Put a transparent proxy on the gateway that redirects requests for
robots.txt to your webserver and all other requests to the target
webserver
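
As a rough sketch of alternative 1, assuming OmniFind runs on a separate
machine behind the gateway and the traffic is plain HTTP on port 80: the
rule below matches on source and destination address instead of packet
content, so it applies to the first packet of the connection, where nat
can still act. Both addresses are placeholders from the documentation
ranges, not values from this thread, and redirect_rule is a made-up
helper name.

```shell
#!/bin/sh
# Placeholder values (assumptions, replace with real ones):
CRAWLER_IP="192.0.2.10"     # host running OmniFind
TARGET_IP="198.51.100.20"   # remote host whose robots.txt is substituted
LOCAL_PORT=1080             # local web server on the gateway

# Print the rule so it can be reviewed before being applied.
redirect_rule() {
    printf '%s\n' "iptables -t nat -I PREROUTING -s $CRAWLER_IP -d $TARGET_IP -p tcp --dport 80 -j REDIRECT --to-ports $LOCAL_PORT"
}

redirect_rule
# Apply as root with: redirect_rule | sh
```

If the crawler runs on the gateway itself, its packets are locally
generated and pass through the nat OUTPUT chain rather than PREROUTING.
The same rule is also the usual way to feed traffic into the transparent
proxy from alternative 2.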

Other stuff that may give you building blocks for a solution:
- Does OmniFind have a unique agent identifier?
- Is the target webserver Apache? Many modules can customize what is
returned.

HTH,
M4