How does urllib handle a page that redirects?

Xian Chen

Jul 2, 2009, 2:34:51 PM

to python-cn

After a page opens, it redirects to another address 1 second later. How can urllib fetch the new address in this case?

thanks

YoungKing

Jul 2, 2009, 9:38:31 PM

to pyth...@googlegroups.com

urllib can't handle redirects; use urllib2.

2009/7/3 Xian Chen <hoga...@gmail.com>

After a page opens, it redirects to another address 1 second later. How can urllib fetch the new address in this case?

thanks

--
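IRC: #python.cn@freenode.net | Mailing list: https://groups.google.com/group/python-cn | Forum: http://forum.python.cn/ | Last night the cuckoo suddenly fell silent; this morning I woke to find the whole city in spring | I disagree with what you say, but I will defend to the death your right to say it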

shell909090

Jul 2, 2009, 9:42:26 PM

to pyth...@googlegroups.com

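Xian Chen wrote:
> After a page opens, it redirects to another address 1 second later. How can urllib fetch the new address in this case?
> thanks
>
There are two ways to implement a redirect: an HTTP redirect and a JS redirect. An HTTP redirect is done with the HTTP response status code; a delayed redirect is most likely a JS redirect. In that case you need a JS engine to analyze the whole process and obtain the URL the page redirects to after 1 second, and then fetch that URL.
The simplest way is to do this through a browser's automation interface. Otherwise, look into a JS engine for Python, but it will wear you out.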
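For the easy cases, a full JS engine may not be needed. The following is only a rough sketch, assuming the target URL appears as a plain string literal in a meta refresh tag or in a direct assignment to location. The function name find_client_redirect and both regular expressions are made up for illustration. A URL that the script computes at run time (as in the "/go?id=" + token style) is deliberately not matched and still needs a browser or a JS engine.

```python
import re
from urllib.parse import urljoin

# Hypothetical patterns for illustration only. They cover a literal URL in a
# <meta http-equiv="refresh"> tag or in a direct assignment to location.
# Calls such as location.replace(...) and computed URLs are not handled.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=["\']?([^"\'>\s]+)',
    re.IGNORECASE,
)
JS_LOCATION = re.compile(
    r'(?:window\.|document\.)?location(?:\.href)?\s*=\s*'
    r'["\']([^"\']+)["\'](?=\s*(?:;|\}|<|$))',
    re.IGNORECASE,
)


def find_client_redirect(html, base_url):
    """Return the absolute redirect target found in html, or None."""
    for pattern in (META_REFRESH, JS_LOCATION):
        match = pattern.search(html)
        if match:
            # Resolve relative targets against the page's own URL.
            return urljoin(base_url, match.group(1))
    return None
```

If this returns None, the page either has no client-side redirect or builds the URL in script, which is the case that calls for a browser interface.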

@@

Jul 2, 2009, 9:42:45 PM

to pyth...@googlegroups.com

I don't think the redirect he means is a 30x.

2009/7/3 YoungKing <yan...@gmail.com>

五湖闲人

Jul 2, 2009, 9:50:31 PM

to python-cn`CPyUG` China Python User Group (Chinese-language Py user group)

response.geturl()
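A minimal sketch of what geturl() reports, written for Python 3, where urllib2's functionality lives in urllib.request. The throwaway local server, the /old and /new paths, and the helper names final_url and demo are assumptions for the demonstration, not anything from the thread. Note that this only reflects HTTP-level (30x) redirects that the opener followed; a JS redirect leaves geturl() at the original address.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class RedirectHandler(BaseHTTPRequestHandler):
    """Throwaway handler: /old answers 302 to /new, anything else 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"final page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_url(url, opener=None):
    """Open url, follow any HTTP redirects, and return the final URL."""
    opener = opener or build_opener()
    with opener.open(url, timeout=5) as response:
        return response.geturl()


def demo():
    """Start the local server, request /old, return (port, final URL)."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        # An empty ProxyHandler keeps the local request away from any proxy.
        opener = build_opener(ProxyHandler({}))
        return port, final_url("http://127.0.0.1:%d/old" % port, opener)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() should return the port together with the /new address on that port, since the opener follows the 302 before geturl() is read.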

On Jul 3, 2:34 AM, Xian Chen <hoganx...@gmail.com> wrote:
> After a page opens, it redirects to another address 1 second later. How can urllib fetch the new address in this case?
> thanks

Xian Chen

Jul 3, 2009, 8:14:18 AM

to pyth...@googlegroups.com

It is a redirect handled by JS: the JS generates the URL automatically and then redirects...

What a hassle. How do I deal with it?

2009/7/2 shell909090 <shell...@gmail.com>

shell909090

Jul 3, 2009, 10:21:09 AM

to pyth...@googlegroups.com

Xian Chen wrote:

If you don't mind the memory cost, use the browser interface; if you do, find out whether Mozilla's JS engine is open source.
