FAROO Forum • View topic - Active vs. passive crawling

Active vs. passive crawling

Discussing Peer-to-peer, Distributed, Grid and Cloud technology for search

Postby TomHH » Sat Jan 22, 2011 6:41 am

What is the difference between active and passive crawling?
Can I prevent FAROO from accepting URLs for crawling from other peers?
If both options are switched off: Is the web history still indexed?
TomHH
 
Posts: 40
Joined: Sat Dec 18, 2010 5:37 am

Re: Active vs. passive crawling

Postby Wolf » Sat Jan 22, 2011 11:01 am

TomHH wrote:What is the difference between active and passive crawling?

In active crawling, every peer acts like a normal crawler: it downloads and indexes web pages autonomously and traverses the web by following links. All of this happens independently of user behaviour.
The algorithm used ensures that different peers have almost no overlap between crawled pages, even though each peer crawls autonomously and independently.

In passive crawling, only those web pages are indexed that the user has previously visited in the browser. Within the crawler queue, passive crawling has priority.
This ensures that the pages currently attracting users' attention are indexed first.
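
To illustrate the priority rule, here is a minimal sketch of a queue in which passively collected pages are always dequeued before actively discovered ones. It is not FAROO's actual implementation; the class name `CrawlQueue`, the constants `PASSIVE` and `ACTIVE`, and the example URLs are all assumptions made up for this illustration.

```python
import heapq
import itertools

# Hypothetical priority levels: the lower value is dequeued first.
PASSIVE, ACTIVE = 0, 1


class CrawlQueue:
    """Illustrative crawler queue in which passive entries take priority."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps first-in, first-out order within one priority.
        self._counter = itertools.count()

    def push(self, url, source):
        heapq.heappush(self._heap, (source, next(self._counter), url))

    def pop(self):
        _source, _seq, url = heapq.heappop(self._heap)
        return url

    def __len__(self):
        return len(self._heap)


queue = CrawlQueue()
queue.push("http://example.org/linked-page", ACTIVE)
queue.push("http://example.org/visited-by-user", PASSIVE)
queue.push("http://example.org/another-link", ACTIVE)

# The page visited by the user comes out first, then the active entries in order.
order = [queue.pop() for _ in range(len(queue))]
print(order)
```

The counter is only a tie-breaker, so entries of the same kind are processed in the order they were added.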

TomHH wrote:Can I prevent FAROO from accepting URLs for crawling from other peers?

FAROO does not accept URLs for crawling from other peers at all. Every peer crawls autonomously.

TomHH wrote:If both options are switched off: Is the web history still indexed?

Yes.
There is one more thing to keep in mind: if crawling is disabled, the crawler queue is no longer filled with new entries, but entries still left in the queue will be processed.
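
The behaviour of a disabled crawler can be sketched roughly as follows. This is only an illustration of the rule described above, not FAROO's code; the `Crawler` class, its `enabled` flag, and the method names are hypothetical.

```python
from collections import deque


class Crawler:
    """Illustrative crawler: disabling stops new entries but not processing."""

    def __init__(self):
        self.enabled = True
        self.queue = deque()

    def enqueue(self, url):
        """Add a URL only while crawling is enabled; report whether it was accepted."""
        if not self.enabled:
            return False
        self.queue.append(url)
        return True

    def process_all(self):
        """Process whatever is left in the queue, regardless of the enabled flag."""
        processed = []
        while self.queue:
            processed.append(self.queue.popleft())
        return processed


crawler = Crawler()
crawler.enqueue("http://example.org/a")
crawler.enqueue("http://example.org/b")

crawler.enabled = False                             # crawling switched off
accepted = crawler.enqueue("http://example.org/c")  # rejected, queue is not filled
processed = crawler.process_all()                   # remaining entries are still handled
print(accepted, processed)
```

So after switching crawling off, some crawling activity can still be expected until the queue has been emptied.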
Wolf
Site Admin
 
Posts: 130
Joined: Wed Dec 17, 2008 12:28 pm

