Issues spidering a website using ZAP API

Jay S

Jun 23, 2017, 2:52:17 PM

to OWASP ZAP User Group

Hello,

I have a basic test website that I am trying to spider using the ZAP API, but the URLs the spider finds look wrong.

I have installed the Python OWASP ZAP API, version 2.4-0.0.10, and Python 3.6. The website runs on IIS and has no authentication.

The site is up and I can access it fine in a browser.

However, when I spider it using the ZAP API, it returns only the following URLs once spidering is complete. I checked that the spider status was 100% before retrieving the URLs.

- http://localhost/robots.txt

- http://localhost/

- http://localhost/sitemap.xml

The problem is that two of these files, robots.txt and sitemap.xml, do not actually exist.

There are other files and subfolders (virtual directories), each containing some files, in the wwwroot folder. But these are not found or returned.

I tried adjusting the default depth and max children properties in the spider object using this:

zap.spider.set_option_max_depth(6)  # More than enough: the site is only 2 levels deep (root plus some subdirectories)

zap.spider.set_option_max_children(0)  # 0 is supposed to be interpreted as unlimited

It does not help. I have seen the spider return many more URLs, including actual folders and files, for a different website earlier, but that happened only once and I cannot reproduce it.
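For reference, the scan, poll, and retrieve sequence described above can be sketched as follows. This is a minimal sketch, not the poster's actual script: the helper name `spider_and_collect` and its timeout parameters are hypothetical, and `zap` is assumed to be a `zapv2.ZAPv2` client already connected to a running ZAP instance. Only `zap.spider.scan`, `zap.spider.status`, and `zap.spider.results` are used.

```python
import time


def spider_and_collect(zap, target, poll_seconds=1.0, timeout_seconds=300.0):
    """Run the ZAP spider against `target` and return the URLs it reports.

    `zap` is assumed to be a zapv2.ZAPv2 client (or any object exposing the
    same spider.scan/status/results methods).
    """
    scan_id = zap.spider.scan(target)
    deadline = time.monotonic() + timeout_seconds
    # status() reports progress as a percentage string, for example "100".
    while int(zap.spider.status(scan_id)) < 100:
        if time.monotonic() > deadline:
            raise TimeoutError(f"spider scan {scan_id} did not finish in time")
        time.sleep(poll_seconds)
    return sorted(zap.spider.results(scan_id))


# Usage (requires a running ZAP and the zapv2 package):
#   from zapv2 import ZAPv2
#   zap = ZAPv2(apikey="<your key>")
#   for url in spider_and_collect(zap, "http://localhost/"):
#       print(url)
```

Waiting for 100% before reading the results avoids mistaking a partial result list for the final one, though it does not change which pages the spider can reach.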

One other observation: for spidering (or anything else, for that matter) to work, I need to launch ZAP in daemon mode first, e.g. zap -daemon. I thought that creating a zap object like the one below would make that unnecessary, but correct me if I am wrong.

from zapv2 import ZAPv2

...

zap = ZAPv2(apikey=apikey)  # pass the key by keyword; the first positional parameter is proxies

What am I missing or doing wrong? Thanks,

Jay
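On the daemon question: as far as I know, the `ZAPv2` object is only an HTTP client for an already running ZAP instance and does not start ZAP itself, which would explain why launching ZAP first is required. A minimal sketch of a pre-flight check, assuming ZAP listens on localhost:8080 (the port mentioned later in this thread); the helper name `is_listening` is hypothetical:

```python
import socket


def is_listening(host="localhost", port=8080, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    if is_listening():
        print("Something is listening on localhost:8080; ZAP may be up.")
    else:
        print("Nothing is listening on localhost:8080; start ZAP first.")
```

This only confirms that a port is open, not that the listener is ZAP, so it is a quick sanity check rather than a real health probe.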

Jay S

Jun 25, 2017, 7:54:40 PM

to OWASP ZAP User Group

A few more observations, unless this is expected behavior that I am not aware of.

I checked the ZAP application/UI to look for any differences in behavior across UI and API. Here is what I did:

I launched the ZAP application and created a new session. I then set the browser's proxy to localhost:8080 and accessed http://localhost:8080/ to make sure ZAP was active and listening on port 8080. It was.

I then accessed a website, just its top-level page. It showed up under Sites in the ZAP UI. I right-clicked it and added it to the default context. There is no authentication whatsoever on this website, so I did not specify authentication, users, etc.

I then right-clicked the site and chose Attack->Spider. I left the Recurse option checked and hit "Start Scan".

After the spider completed, it had not found all the URLs on the website. That matches what I observe when using the API. It lists things like robots.txt and sitemap.xml, which do not actually exist.

The website has some folders and files that I would have expected the spider to find by itself.

Only if I explicitly visit every folder and page on the website in the browser and then run the spider in the ZAP UI does it list those URLs in the spidering output.
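The manual workaround above, visiting every page so that ZAP learns about it, could in principle be scripted. A minimal sketch, assuming the site's files map one-to-one to URLs under the web root (which may not hold for every IIS virtual directory); the helper name `urls_from_webroot` and the example path are hypothetical:

```python
import os
from urllib.parse import quote


def urls_from_webroot(root_dir, base_url):
    """Map every file under root_dir to the URL it would be served at."""
    base = base_url.rstrip("/")
    urls = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        rel_dir = os.path.relpath(dirpath, root_dir)
        for name in filenames:
            rel = name if rel_dir == "." else os.path.join(rel_dir, name)
            # Use forward slashes and percent-encode unsafe characters.
            urls.append(base + "/" + quote(rel.replace(os.sep, "/")))
    return sorted(urls)


# Usage (hypothetical path; requires a running ZAP and a zapv2.ZAPv2 client):
#   for url in urls_from_webroot(r"C:\inetpub\wwwroot", "http://localhost/"):
#       zap.spider.scan(url)  # seed the spider with each known URL
```

Each generated URL can then be handed to the spider as its own starting point, which has the same effect as browsing to the page by hand.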

I thought the purpose of spidering was to find any and all hidden URLs. What am I missing or doing wrong? I am using ZAP 2.6.0 on Windows 2012.

Any help appreciated. TIA

Jay

kingthorin+owaspzap

Jun 25, 2017, 10:04:48 PM

to OWASP ZAP User Group

Is navigation on the site JavaScript based?

Jay S

Jun 26, 2017, 12:16:01 PM

to OWASP ZAP User Group

No. Everything on the website/app is static content that I created for testing purposes. There is no JavaScript whatsoever.

thx,

kingthorin+owaspzap

Jun 26, 2017, 3:08:05 PM

to OWASP ZAP User Group

Is there a default index page or appropriate redirect in place for localhost/ or does it just result in a 404?

Jay S

Jun 26, 2017, 4:23:08 PM

to OWASP ZAP User Group

Yes, there are default pages configured in each of the virtual directories/folders, and they are served properly when no page is specified (when accessed from a browser).

From what I can tell, the ZAP UI does not get any 404s, nor does it request any URLs other than those I mentioned earlier and this one:

- http://localhost/browserconfig.xml

rgds, Jayant

thc...@gmail.com

Jun 27, 2017, 3:47:30 AM

to zaprox...@googlegroups.com

Hi.

Could you provide the main page? What links does it have?

Best regards.

Jay S

Jun 27, 2017, 7:35:44 PM

to OWASP ZAP User Group

The answer to your question explains it. The default page does not have any links to other pages, so it is no surprise that ZAP does not find them on its own: they would have to be referenced, directly or indirectly, from the page where ZAP starts.
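To illustrate the point, here is a toy link-following crawler over an in-memory site. It is not how ZAP's spider is implemented, just a minimal sketch of the same principle; the `site` dictionary and its pages are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(pages, start):
    """Return the set of URLs reachable from `start` by following links only.

    `pages` maps URL -> HTML body and stands in for the web server.
    """
    seen, queue = set(), [start]
    while queue:
        url = queue.pop()
        if url in seen or url not in pages:
            continue
        seen.add(url)
        parser = LinkExtractor()
        parser.feed(pages[url])
        queue.extend(urljoin(url, href) for href in parser.links)
    return seen


# Made-up site: the root page has no links, so /docs/ is never reached from it.
site = {
    "http://localhost/": "<html><body><h1>Test site</h1></body></html>",
    "http://localhost/docs/": '<a href="page.html">page</a>',
    "http://localhost/docs/page.html": "<p>hello</p>",
}

print(sorted(crawl(site, "http://localhost/")))
# ['http://localhost/']
print(sorted(crawl(site, "http://localhost/docs/")))
# ['http://localhost/docs/', 'http://localhost/docs/page.html']
```

Starting from the root finds only the root, while starting from /docs/ finds the page it links to, which mirrors what happened once the folders were visited in the browser first.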

Sincere apologies :-(

kingthorin+owaspzap

Jun 27, 2017, 9:04:05 PM

to OWASP ZAP User Group

Thanks for letting us know, glad you got it sorted out.
