Martijn Lievaart <m...@rtij.nl.invlalid> wrote in
news:g1cp2a-...@news.rtij.nl:
Thanks!
This may sound foolish, but there is a very good reason for it.
I run IBM OmniFind, which uses the same crawler engine that Yahoo! uses to
index web pages. I need to serve a substitute robots.txt when OmniFind
asks a remote host for its robots.txt. The reason is that this is the
fastest and most reliable way to purge from OmniFind those links that are
404 or 410 (not found / gone).
I have no control over when OmniFind requests robots.txt from that
particular remote host, so debugging is difficult.
What I've done is to set up a small web server on my gateway box that
listens on port 1080 and serves only robots.txt (containing 559 bad links
right now). I hope that something like
iptables -t mangle -I PREROUTING -p tcp -m string --string robots.txt \
-j MARK --set-mark 0x2
iptables -t nat -I OUTPUT -p tcp -m mark --mark 0x2 -j REDIRECT \
--to-ports 1080
will cause the marked packets to be sent to my web server and
accomplish the objective.
Note that the iptables "mangle" line above has been edited in hopes that
removing a bunch of "stuff" will improve clarity with respect to what I'm
trying to accomplish.
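
For reference, here is a minimal sketch of how the substitute robots.txt
could be generated from a list of dead links. It assumes the bad URLs are
available one per line, and it uses "User-agent: *" rather than
OmniFind's actual agent string. The function name make_robots and the
input format are made up for illustration.

```shell
# Hypothetical helper: read dead URLs (one per line) on stdin and
# write robots.txt rules to stdout.
make_robots() {
    # Assumption: one catch-all record is enough for the crawler.
    printf 'User-agent: *\n'
    # Strip the scheme and host so only the path remains, then emit
    # one Disallow rule per non-empty path.
    sed -E 's#^[a-zA-Z]+://[^/]+##' | while IFS= read -r path; do
        if [ -n "$path" ]; then
            printf 'Disallow: %s\n' "$path"
        fi
    done
}
```

With the dead links in a file (bad_links.txt is an assumed name), this
could be run as: make_robots < bad_links.txt > robots.txt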
--
buck