DataProvider
public class NetworkCrawler extends Object implements DataProvider
This class handles a list of URLs pointing to data files or zip/jar
archives on the net. Since the net is not a tree structure, the list
elements cannot be top-level elements browsed recursively as in
DirectoryCrawler; they must be data files or zip/jar archives.
The files fetched from the network can be cached locally on disk. This avoids overly frequent network access when the URLs are remote (for example, original internet URLs).
If the URL points to a remote server (typically on the web) on the other side of a proxy server, you need to configure the networking layer of your application to use the proxy. For a typical authenticating proxy, as used in many corporate environments, this can be done as follows, using for example the AuthenticatorDialog graphical authenticator class found in the tests directories:
    System.setProperty("http.proxyHost", "proxy.your.domain.com");
    System.setProperty("http.proxyPort", "8080");
    System.setProperty("http.nonProxyHosts", "localhost|*.your.domain.com");
    Authenticator.setDefault(new AuthenticatorDialog());
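Where the AuthenticatorDialog test class is not on the classpath, any subclass of java.net.Authenticator can be installed in its place. The following is a minimal sketch, assuming the credentials are supplied programmatically rather than through a dialog. The class name FixedCredentialsAuthenticator and the helper configureProxy are hypothetical and not part of the library; the host and port are the placeholder values from the snippet above.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

/** Hypothetical non-graphical authenticator (illustration only, not a library class). */
class FixedCredentialsAuthenticator extends Authenticator {

    private final String user;
    private final char[] password;

    FixedCredentialsAuthenticator(String user, char[] password) {
        this.user = user;
        this.password = password.clone();
    }

    /** Called by the networking layer whenever a proxy or server asks for credentials. */
    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
        // This sketch answers every request with the same credentials.
        return new PasswordAuthentication(user, password);
    }

    /** Apply the placeholder proxy settings, then install this authenticator. */
    static void configureProxy(String user, char[] password) {
        System.setProperty("http.proxyHost", "proxy.your.domain.com");
        System.setProperty("http.proxyPort", "8080");
        System.setProperty("http.nonProxyHosts", "localhost|*.your.domain.com");
        Authenticator.setDefault(new FixedCredentialsAuthenticator(user, password));
    }
}
```

A stricter variant could inspect getRequestorType() and answer only proxy requests, so that the credentials are never offered to an origin server.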
Gzip-compressed files are supported.
Zip archive entries are supported recursively.
This is a simple application of the visitor design pattern for list browsing.
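The gzip support described above can be pictured as two steps: strip the compression suffix before matching the name against the loader's pattern, then wrap the raw stream in a decompressor. The sketch below uses only the JDK and is not the library's implementation; GzipAwareOpener, its methods, and the regular expression are assumptions made for illustration (the actual GZIP_FILE_PATTERN may be defined differently).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (illustration only, not a library class). */
class GzipAwareOpener {

    /** Assumed pattern for gzip-compressed names. */
    static final Pattern GZIP_NAME = Pattern.compile("(.*)\\.gz$");

    /** Name used for pattern matching: "data.txt.gz" becomes "data.txt". */
    static String baseName(String fileName) {
        Matcher m = GZIP_NAME.matcher(fileName);
        return m.matches() ? m.group(1) : fileName;
    }

    /** Load the content if the uncompressed name is supported; null means skipped. */
    static String loadIfSupported(Pattern supported, String fileName, InputStream raw) {
        if (!supported.matcher(baseName(fileName)).matches()) {
            return null;
        }
        boolean compressed = GZIP_NAME.matcher(fileName).matches();
        try (InputStream in = compressed ? new GZIPInputStream(raw) : raw) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compress text in memory, to build sample input without touching the network. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Zip and jar archives would need one more layer, iterating over entries and applying the same name check to each entry, which is omitted here.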
DataProvidersManager
GZIP_FILE_PATTERN, ZIP_ARCHIVE_PATTERN
Constructor | Description |
---|---|
NetworkCrawler(URL... urls) | Build a network crawler from a list of data file URLs. |
Modifier and Type | Method | Description |
---|---|---|
boolean | feed(Pattern supported, DataLoader visitor) | Feed a data file loader by browsing the data collection. |
void | setTimeout(int timeout) | Set the timeout for connection. |
public NetworkCrawler(URL... urls)
The default timeout is set to 10 seconds.
urls - list of data file URLs
public void setTimeout(int timeout)
timeout - connection timeout in milliseconds
public boolean feed(Pattern supported, DataLoader visitor)
The method crawls all files referenced in the instance (for example, all files in a directory tree) and, for each file supported by the file loader, asks the file loader to load it.
If the method completes without exception, then the data loader is considered to have been fed successfully and the top level data providers manager will return immediately without attempting to use the next configured providers.
If the method completes abruptly with an exception, then the top level data providers manager will try to use the next configured providers, in case another one can feed the data loader.
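The fallback behaviour described above can be sketched with a small loop. This is an illustration under simplifying assumptions, not the code of the data providers manager: SimpleProvider is a hypothetical stand-in whose feed method returns nothing, so only normal versus abrupt completion matters, and the file name is just sample data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of a provider chain (illustration only). */
class ProviderChainSketch {

    /** Simplified provider: feed either completes normally or throws. */
    interface SimpleProvider {
        void feed(Pattern supported, List<String> sink);
    }

    /**
     * Try providers in order. Return the index of the first one whose feed
     * completes without exception, or -1 if every provider failed.
     */
    static int feedFromFirstWorking(List<SimpleProvider> providers,
                                    Pattern supported, List<String> sink) {
        for (int i = 0; i < providers.size(); i++) {
            try {
                providers.get(i).feed(supported, sink);
                return i; // normal completion: later providers are not consulted
            } catch (RuntimeException e) {
                // abrupt completion: fall through to the next provider
            }
        }
        return -1;
    }

    /** Small demonstration with one failing and two working providers. */
    static List<String> demo() {
        List<String> sink = new ArrayList<>();
        SimpleProvider unreachable = (p, s) -> {
            throw new IllegalStateException("cannot connect");
        };
        SimpleProvider local = (p, s) -> {
            if (p.matcher("tai-utc.dat").matches()) {
                s.add("tai-utc.dat");
            }
        };
        SimpleProvider neverUsed = (p, s) -> s.add("should-not-appear");
        int used = feedFromFirstWorking(List.of(unreachable, local, neverUsed),
                                        Pattern.compile("^tai-utc\\.dat$"), sink);
        sink.add("provider#" + used);
        return sink;
    }
}
```

In the demonstration the first provider throws, the second completes normally, and the third is never called, which mirrors the two cases in the text.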
feed in interface DataProvider
supported - pattern for file names supported by the visitor
visitor - data file visitor to use
Copyright © 2002-2019 CS Systèmes d'information. All rights reserved.