public class InMemoryKnownUriFilter extends Object implements KnownUriFilter

An in-memory implementation of the KnownUriFilter interface.

| Modifier and Type | Class and Description |
|---|---|
| private class | InMemoryKnownUriFilter.UriInfo |

| Modifier and Type | Field and Description |
|---|---|
| protected long | defaultRecrawlTime |
| private boolean | frontierDoesRecrawling - Indicates whether the Frontier using this filter does recrawling. |
| protected Hashtable<CrawleableUri,InMemoryKnownUriFilter.UriInfo> | uris - key: the crawled (known) URI; value: the info about the URI (see InMemoryKnownUriFilter.UriInfo), including the reference list |

| Constructor and Description |
|---|
| InMemoryKnownUriFilter() - Constructor. |
| InMemoryKnownUriFilter(boolean frontierDoesRecrawling, long defaultRecrawlTime) - Constructor. |
| InMemoryKnownUriFilter(Hashtable<CrawleableUri,InMemoryKnownUriFilter.UriInfo> uris, boolean frontierDoesRecrawling) - Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| void | add(CrawleableUri uri, long nextCrawlTimestamp) - Adds the given URI to the list of already known URIs. |
| void | add(CrawleableUri uri, long lastCrawlTimestamp, long nextCrawlTimestamp) - Adds the given URI to the list of already known URIs together with the time at which it has been crawled. |
| long | count() - Counts the known URIs. |
| List<CrawleableUri> | getOutdatedUris() - Returns all CrawleableUris which have to be recrawled. |
| boolean | isUriGood(CrawleableUri uri) - Returns true if the given CrawleableUri object fulfills the requirements imposed by this filter. |
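
The methods summarized above can be illustrated with a minimal, self-contained sketch. It is not the actual InMemoryKnownUriFilter source: the class name KnownUriFilterSketch, the use of plain String URIs instead of CrawleableUri, the fields of the UriInfo record, and the rule that a URI is "good" only while it is unknown are all assumptions made for illustration. Unlike the documented getOutdatedUris(), the sketch takes the current time as a parameter so that its behavior is deterministic.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a known-URI filter.
// URIs are plain Strings here instead of CrawleableUri objects.
class KnownUriFilterSketch {

    // Assumed shape of the per-URI record; the real UriInfo is a private class
    // whose fields are not documented on this page.
    private static final class UriInfo {
        final long lastCrawlTimestamp;
        final long nextCrawlTimestamp;

        UriInfo(long lastCrawlTimestamp, long nextCrawlTimestamp) {
            this.lastCrawlTimestamp = lastCrawlTimestamp;
            this.nextCrawlTimestamp = nextCrawlTimestamp;
        }
    }

    // key: the known URI, value: its crawl timestamps
    private final Hashtable<String, UriInfo> uris = new Hashtable<>();

    // Two-argument add: uses the current system time as the last crawl time.
    void add(String uri, long nextCrawlTimestamp) {
        add(uri, System.currentTimeMillis(), nextCrawlTimestamp);
    }

    // Three-argument add: stores both timestamps, replacing any earlier entry.
    void add(String uri, long lastCrawlTimestamp, long nextCrawlTimestamp) {
        uris.put(uri, new UriInfo(lastCrawlTimestamp, nextCrawlTimestamp));
    }

    // Assumption: a URI passes the filter only if it is not yet known.
    boolean isUriGood(String uri) {
        return !uris.containsKey(uri);
    }

    // A URI is outdated when its next crawl time lies before 'now'.
    List<String> getOutdatedUris(long now) {
        List<String> outdated = new ArrayList<>();
        for (Map.Entry<String, UriInfo> entry : uris.entrySet()) {
            if (entry.getValue().nextCrawlTimestamp < now) {
                outdated.add(entry.getKey());
            }
        }
        return outdated;
    }

    // Number of known URIs.
    long count() {
        return uris.size();
    }
}
```

In this sketch, a frontier would call isUriGood before scheduling a URI, call add once the URI has been crawled, and poll getOutdatedUris to find URIs that are due for recrawling.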

protected Hashtable<CrawleableUri,InMemoryKnownUriFilter.UriInfo> uris
- key: the crawled (known) URI
- value: the info about the URI (see InMemoryKnownUriFilter.UriInfo), including the reference list

private boolean frontierDoesRecrawling
Indicates whether the Frontier using this filter does recrawling.

protected long defaultRecrawlTime

public InMemoryKnownUriFilter(boolean frontierDoesRecrawling, long defaultRecrawlTime)
Parameters:
frontierDoesRecrawling - Value for frontierDoesRecrawling.

public InMemoryKnownUriFilter()

public InMemoryKnownUriFilter(Hashtable<CrawleableUri,InMemoryKnownUriFilter.UriInfo> uris, boolean frontierDoesRecrawling)
Parameters:
uris - Value for uris.
frontierDoesRecrawling - Value for frontierDoesRecrawling.

public void add(CrawleableUri uri, long nextCrawlTimestamp)
Description copied from interface: KnownUriFilter
Adds the given URI to the list of already known URIs. Works like calling KnownUriFilter.add(CrawleableUri, long) with the current system time.
Specified by:
add in interface KnownUriFilter
Parameters:
uri - the URI that should be added to the list.
nextCrawlTimestamp - the time at which the given URI should be crawled next.

public void add(CrawleableUri uri, long lastCrawlTimestamp, long nextCrawlTimestamp)
Description copied from interface: KnownUriFilter
Adds the given URI to the list of already known URIs together with the time at which it has been crawled.
Specified by:
add in interface KnownUriFilter
Parameters:
uri - the URI that should be added to the list.
lastCrawlTimestamp - the time at which the given URI has been crawled.
nextCrawlTimestamp - the time at which the given URI should be crawled next.

public boolean isUriGood(CrawleableUri uri)
Description copied from interface: UriFilter
Returns true if the given CrawleableUri object fulfills the requirements imposed by this filter.
Specified by:
isUriGood in interface UriFilter
Parameters:
uri - the CrawleableUri object that is checked
Returns:
true if the given CrawleableUri object fulfills the requirements imposed by this filter. Otherwise false is returned.

public List<CrawleableUri> getOutdatedUris()
Description copied from interface: KnownUriFilter
Returns all CrawleableUris which have to be recrawled. This means their time to next crawl has passed.
Specified by:
getOutdatedUris in interface KnownUriFilter
Returns:
the CrawleableUris which have to be recrawled.

public long count()
Description copied from interface: KnownUriFilter
Counts the known URIs.
Specified by:
count in interface KnownUriFilter

Copyright © 2017–2019. All rights reserved.