FAROO Forum • View topic - Active vs. passive crawling

Active vs. passive crawling

Posted: Sat Jan 22, 2011 6:41 am
by TomHH
What is the difference between active and passive crawling?
Can I prevent FAROO from accepting URLs for crawling from other peers?
If both options are switched off: Is the web history still indexed?

Re: Active vs. passive crawling

Posted: Sat Jan 22, 2011 11:01 am
by Wolf
TomHH wrote: What is the difference between active and passive crawling?

In active crawling, every peer acts like a normal crawler: it downloads and indexes web pages autonomously, traversing the web by following links. All of this happens independently of the user's behaviour.
The algorithm used ensures that different peers have almost no overlap between crawled pages, even though each peer crawls autonomously and independently.
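
The post does not say which algorithm FAROO uses to keep the overlap low. Purely as an illustration of how independent peers could avoid crawling the same pages without exchanging URLs, here is a minimal sketch that assigns each URL to one peer by hashing it. The names `owner_of` and `should_crawl`, the fixed `num_peers`, and the hashing scheme itself are assumptions made for this example, not FAROO's actual method.

```python
import hashlib


def owner_of(url: str, num_peers: int) -> int:
    """Map a URL to exactly one peer index (hypothetical scheme)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then reduce it
    # to a peer index in the range [0, num_peers).
    return int.from_bytes(digest[:8], "big") % num_peers


def should_crawl(url: str, peer_id: int, num_peers: int) -> bool:
    """A peer follows a discovered link only if the URL hashes to it."""
    return owner_of(url, num_peers) == peer_id
```

With a rule like this, every peer can decide locally whether a link it finds belongs to it, so no URLs need to be passed between peers.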

In passive crawling, only those web pages are indexed that the user has previously visited in the browser. Within the crawler queue, passive crawling has priority.
This ensures that the pages currently receiving the users' attention are indexed first.
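
The priority rule can be sketched as a queue in which passively collected URLs (pages the user visited) always come out before actively discovered ones. This is only an illustration: the class `CrawlQueue` and the constants `PASSIVE` and `ACTIVE` are hypothetical names, and FAROO's real queue may work differently.

```python
import heapq
import itertools

# Lower value is served first, so passive entries outrank active ones.
PASSIVE, ACTIVE = 0, 1


class CrawlQueue:
    """Crawler queue where passive entries have priority (sketch)."""

    def __init__(self):
        self._heap = []
        # Running counter keeps first-in, first-out order within a priority.
        self._counter = itertools.count()

    def push(self, url, source):
        heapq.heappush(self._heap, (source, next(self._counter), url))

    def pop(self):
        _source, _order, url = heapq.heappop(self._heap)
        return url

    def __len__(self):
        return len(self._heap)


queue = CrawlQueue()
queue.push("http://example.com/linked-page", ACTIVE)
queue.push("http://example.com/visited-page", PASSIVE)
first = queue.pop()  # the visited page, although it was queued second
```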

TomHH wrote: Can I prevent FAROO from accepting URLs for crawling from other peers?

FAROO does not accept URLs for crawling from other peers at all. Every peer crawls autonomously.

TomHH wrote: If both options are switched off: Is the web history still indexed?

Yes.
There is another thing to keep in mind: if crawling is disabled, the crawler queue is no longer filled with new entries, but entries already in the queue are still processed.
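
The queue behaviour described above can be sketched roughly as follows. The class `Crawler`, its `enabled` flag, and the method names are assumptions made for this example, not FAROO's actual code; the point is only that switching crawling off blocks new entries while the existing ones still drain.

```python
from collections import deque


class Crawler:
    """Sketch: disabling crawling stops enqueueing, not processing."""

    def __init__(self):
        self.enabled = True
        self.queue = deque()
        self.indexed = []

    def enqueue(self, url):
        # New entries are only accepted while crawling is switched on.
        if not self.enabled:
            return False
        self.queue.append(url)
        return True

    def process_next(self):
        # Entries already in the queue are processed regardless of the flag.
        if not self.queue:
            return None
        url = self.queue.popleft()
        self.indexed.append(url)
        return url


crawler = Crawler()
crawler.enqueue("http://example.com/one")
crawler.enabled = False                     # user switches crawling off
crawler.enqueue("http://example.com/two")   # rejected, queue unchanged
crawler.process_next()                      # the earlier entry is still indexed
```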