Abot Web Crawler
This is the Google group for the Abot Web Crawler and AbotX Web Crawler. Please feel free to post questions or start discussions regarding the use of Abot.
rjone...@gmail.com
6/21/23
Crawling an HTML File
What would I override/extend if I wanted to crawl an html text file instead of a website?
rolf michielson, rolf michielson
2
5/30/23
I am aware of your abilities
Hello I hope all is well with you and that your day is going well. I am an attorney and financial
David Hansen, …, Suresh Dayma
8
7/14/22
Getting Binary data
In the configuration, set isHttpRequestAutomaticDecompressionEnabled="true", this will fix
Ethan
4/28/22
AbotX Licensing
Hello, I have a question about the licensing. We agree that we need only 1 license if we use it for 2
Ethan, Steven Jones
3
4/19/22
Use of docker
Hi, thank you for your answer. I have another question then. Is it possible to get details about the
S Wilkinson
3/21/22
Cannot scan a particular site - every url throws timeout / no content
Hi, I'm using abot to crawl various sites and the vast majority work, however for one site, every
Tom
11/10/21
Saving state on shutdown
I'm using the AbotX parallel crawler engine, within the confines of a dotnet worker app (ie uses
Tom
2
11/10/21
Seeding SiteToCrawl with multiple pages?
Also, if I take the cheat route and just add a new SiteToCrawl for every URL, is the parallel engine
Lloyd
11/8/21
NullReferenceException at CrawlDecisionMaker.ShouldDownloadPageContent
Hello, I'm writing a basic WinForm which starts the web crawler based on a URL (checked and
Simon Bonello
9/7/21
Source code
Hi, I bought AbotX for a sister company and we didn't receive access to the source code. Can you
ghislain borremans, sjdi...@gmail.com
3
7/1/21
Operation of Cookies
Hi, IsSendingCookiesEnabled = true tells Abot/AbotX to resend any cookies that are returned in the
rjone...@gmail.com, sjdi...@gmail.com
2
6/23/21
Getting Page Data when Parallel Crawling
Yes, there is a clear example in the docs here under ParallelCrawlerEngine. See the code snippet with
ghislain borremans, sjdi...@gmail.com
3
6/23/21
AbotX: screenshot from crawled page
Replied to the cookie issue thread instead of here. On Sat, Jun 19, 2021 at 8:42 AM ghislain
Saeid Babaei, sjdi...@gmail.com
2
5/24/21
Crawl only
Yes, Abot has no limitations that would prevent you from doing this. If you need custom consulting on
ghislain borremans, …, VLADIMIR KOZLOV
4
4/30/21
Using scheduler to preload with Urls, but then, the links on those preloaded Urls are not scheduled
Suffer from a similar problem. This leads to nothing: crawler.IsInternalUriDecisionMaker = (
VLADIMIR KOZLOV
4/17/21
Getting empty e.CrawledPage.Content.Text - don't understand why
Hello, I'm always getting null or empty e.CrawledPage.Content.Text in PageCrawlCompleted(object
rjone...@gmail.com, sjdirect
4
4/9/21
.NET 5.x Support
You can follow the status of this in the github issue. On Friday, April 2, 2021 at 3:30:12 PM UTC-7
Ridhima Shukla, sjdi...@gmail.com
2
3/18/21
Single Page Application
Hi, as with your last request, the same forum post answers this question as well. You can see this thread
Ridhima Shukla, sjdi...@gmail.com
2
3/18/21
Infinite Scrolling Pages
Hi, you can see this thread for an answer that also applies to your use case. Abot/AbotX cannot
Sajjad Mortazavi, sjdi...@gmail.com
2
3/16/21
Phantomjs
No plans at this time. There is a very clean abstraction in place that allows another provider to
Ridhima Shukla, sjdirect
2
3/15/21
Add Keywords to URL
Would the crawl bag be what you are looking for? On Monday, March 15, 2021 at 4:50:33 AM UTC-7 rsh...
Anton Kheistver, …, CS
7
3/7/21
Best way to do page-by-page crawling
Thank you! On Monday, January 25, 2021 at 10:45:56 PM UTC+2 sjdirect wrote: Hi, You can just create
John Ligtenberg, sjdi...@gmail.com
5
2/23/21
IIS crashes on cancellation
Anyone who cancels a task using a CancellationToken (nothing to do with Abot/AbotX) will have to
ghislain borremans, sjdi...@gmail.com
3
2/22/21
Using Abot2Demo:crawledPage.HttpRequestException. on https://aanhangwagenspattyn.be/ while ok in browser
Discussed in github issue On Sat, Feb 20, 2021 at 4:16 AM ghislain borremans <ghislainborremans@
sjdi...@gmail.com, John Ligtenberg
2
2/16/21
Re: CrawlResult not returned if MaxPagesToCrawl is set ?
This problem has been solved, it was caused by a locked session. I needed to add [SessionState(
ghislain borremans
3
2/15/21
Overriding method is not called SchedulePageLinks(CrawledPage crawledPage)
Solved: I was misled by the class and method name. I renamed the class to "MyPoliteWebCrawler
Ridhima Shukla, sjdi...@gmail.com
2
8/30/20
Crawl Anchor Tag Links with Href attribute as "#"
There is a config value that can switch that on and off (off by default)... /// <summary> ///
sir.a...@gmail.com
7/18/20
Q&A
Setting Proxy
Hi, is there any way of setting proxies other than app settings? The previous solution you mentioned
agarwal....@gmail.com, sjdi...@gmail.com
2
6/15/20
Q&A
Retry crawling for unsuccessful HTTP response by increasing MinCrawlDelayPerDomainMilliSeconds.
You can do the following, which should allow a configurable number of retries. CrawlConfiguration
agarwal....@gmail.com
6/12/20
Q&A
How to implement the decision maker function in Abot web crawler.
I am working on a project where, before crawling a page, I need to check whether I should crawl this page