public interface KnownUriFilter extends UriFilter
A UriFilter that works like a blacklist: its blacklist holds the URIs that the crawler has already seen.

| Modifier and Type | Method and Description |
|---|---|
| void | add(CrawleableUri uri, long nextCrawlTimestamp): Adds the given URI to the list of already known URIs. |
| void | add(CrawleableUri uri, long lastCrawlTimestamp, long nextCrawlTimestamp): Adds the given URI to the list of already known URIs together with the time at which it was crawled. |
| long | count(): Counts the number of known URIs. |
| List<CrawleableUri> | getOutdatedUris(): Returns all CrawleableUris which have to be recrawled. |

void add(CrawleableUri uri, long nextCrawlTimestamp)
Adds the given URI to the list of already known URIs. Works like calling add(CrawleableUri, long, long) with the current system time as the last crawl timestamp.
Parameters:
uri - the URI that should be added to the list.
nextCrawlTimestamp - the time at which the given URI should be crawled next.

void add(CrawleableUri uri, long lastCrawlTimestamp, long nextCrawlTimestamp)
Adds the given URI to the list of already known URIs together with the time at which it was crawled.
Parameters:
uri - the URI that should be added to the list.
lastCrawlTimestamp - the time at which the given URI has been crawled.
nextCrawlTimestamp - the time at which the given URI should be crawled next.

List<CrawleableUri> getOutdatedUris()
Returns all CrawleableUris which have to be recrawled. This means their time to next crawl has passed.
Returns:
the outdated CrawleableUris.

long count()
Counts the number of known URIs.
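
The contract described above can be illustrated with a small in-memory sketch. It is not the project's implementation. Several things here are assumptions made only for the example: the `CrawleableUri` class is a bare stand-in holding a string, the `UriFilter` parent interface is left out, `InMemoryKnownUriFilter` is a hypothetical name, the clock is injected as a `LongSupplier` so the behaviour can be checked deterministically, and a URI is treated as outdated once its next crawl timestamp is less than or equal to the current time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

class KnownUriFilterSketch {

    /** Hypothetical stand-in for the crawler's CrawleableUri type. */
    static final class CrawleableUri {
        final String uri;
        CrawleableUri(String uri) { this.uri = uri; }
    }

    /** The four documented methods; the UriFilter parent is omitted in this sketch. */
    interface KnownUriFilter {
        void add(CrawleableUri uri, long nextCrawlTimestamp);
        void add(CrawleableUri uri, long lastCrawlTimestamp, long nextCrawlTimestamp);
        List<CrawleableUri> getOutdatedUris();
        long count();
    }

    /** In-memory sketch: neither thread-safe nor persistent. */
    static final class InMemoryKnownUriFilter implements KnownUriFilter {
        private static final class Entry {
            final CrawleableUri uri;
            final long lastCrawl;
            final long nextCrawl;
            Entry(CrawleableUri uri, long lastCrawl, long nextCrawl) {
                this.uri = uri;
                this.lastCrawl = lastCrawl;
                this.nextCrawl = nextCrawl;
            }
        }

        // Keyed by the URI string so that re-adding a URI updates its timestamps.
        private final Map<String, Entry> known = new LinkedHashMap<>();
        private final LongSupplier clock;

        InMemoryKnownUriFilter(LongSupplier clock) { this.clock = clock; }

        @Override
        public void add(CrawleableUri uri, long nextCrawlTimestamp) {
            // Two-argument overload: the current time serves as the last crawl timestamp.
            add(uri, clock.getAsLong(), nextCrawlTimestamp);
        }

        @Override
        public void add(CrawleableUri uri, long lastCrawlTimestamp, long nextCrawlTimestamp) {
            known.put(uri.uri, new Entry(uri, lastCrawlTimestamp, nextCrawlTimestamp));
        }

        @Override
        public List<CrawleableUri> getOutdatedUris() {
            long now = clock.getAsLong();
            List<CrawleableUri> outdated = new ArrayList<>();
            for (Entry e : known.values()) {
                // Assumption: "passed" includes the exact instant of the next crawl.
                if (e.nextCrawl <= now) {
                    outdated.add(e.uri);
                }
            }
            return outdated;
        }

        @Override
        public long count() { return known.size(); }
    }

    public static void main(String[] args) {
        long[] now = {1_000L}; // fake clock, advanced by hand
        InMemoryKnownUriFilter filter = new InMemoryKnownUriFilter(() -> now[0]);
        filter.add(new CrawleableUri("http://example.org/a"), 1_500L);
        filter.add(new CrawleableUri("http://example.org/b"), 900L, 5_000L);
        System.out.println(filter.count());                  // 2
        System.out.println(filter.getOutdatedUris().size()); // 0
        now[0] = 2_000L;
        System.out.println(filter.getOutdatedUris().size()); // 1
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`; passing it in keeps the recrawl logic testable without waiting for real time to elapse.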
Copyright © 2017–2019. All rights reserved.