Class DirectoryCrawler

  • All Implemented Interfaces:
    DataProvider

    public class DirectoryCrawler
    extends Object
    implements DataProvider
    Provider for data files stored in a directory tree on the filesystem.

    This class handles data files recursively, starting from a root directory. The organization of files in the directories is free, and sub-directories may be nested to any depth. All sub-directories are browsed and all terminal files are checked for loading.

    All registered filters are applied.

    Zip archive entries are supported recursively.

    This is a simple application of the visitor design pattern for directory hierarchy crawling.
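
    The crawling idea described above can be illustrated with the JDK's own file visitor. The following is only a minimal sketch, not this class's implementation: TreeWalkSketch and findSupported are hypothetical names, and the sketch ignores registered filters and zip archive entries. It walks every sub-directory under a root and keeps the terminal files whose names match a pattern.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the documented API. */
final class TreeWalkSketch {

    private TreeWalkSketch() {
        // static helper only
    }

    /** Collects every regular file under root whose name matches the pattern. */
    static List<Path> findSupported(final Path root, final Pattern supported)
        throws IOException {
        if (!Files.isDirectory(root)) {
            throw new IllegalArgumentException(root + " is not a directory");
        }
        final List<Path> matches = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(final Path file,
                                             final BasicFileAttributes attrs) {
                // Sub-directories are descended into automatically;
                // this callback is invoked for the files they contain.
                final String name = file.getFileName().toString();
                if (attrs.isRegularFile() && supported.matcher(name).matches()) {
                    matches.add(file);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return matches;
    }
}
```

    In this sketch the visitor only records matching paths. A real loader would open each matching file and parse its content instead.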

    Author:
    Luc Maisonobe
    See Also:
    DataProvidersManager
    • Constructor Detail

      • DirectoryCrawler

        public DirectoryCrawler(File root)
        Build a data files crawler.
        Parameters:
        root - root of the directory tree (must be a directory)
    • Method Detail

      • feed

        public boolean feed(Pattern supported,
                            DataLoader visitor,
                            DataProvidersManager manager)
        Feed a data file loader by browsing the data collection.

        The method crawls all files referenced in the instance (for example all files in a directory tree) and, for each file supported by the file loader, asks the file loader to load it.

        If the method completes without exception, then the data loader is considered to have been fed successfully and the top level data providers manager will return immediately without attempting to use the next configured providers.

        If the method completes abruptly with an exception, then the top level data providers manager will try to use the next configured providers, in case another one can feed the data loader.

        Specified by:
        feed in interface DataProvider
        Parameters:
        supported - pattern for file names supported by the visitor
        visitor - data file visitor to use
        manager - manager providing the filters to apply to the resources
        Returns:
        true if some data has been loaded
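
        The fallback behavior described for this method can be sketched as a loop over providers. This is a minimal illustration of the contract as worded above, not the actual DataProvidersManager code, which may also take the boolean return value into account. ProviderSketch, FallbackSketch and feedFirstSuccessful are hypothetical names introduced only for this example.

```java
import java.util.List;

/** Hypothetical stand-in for a data provider; not the real interface. */
interface ProviderSketch {
    boolean feed(String loaderName) throws Exception;
}

/** Hypothetical stand-in for the top level manager's fallback loop. */
final class FallbackSketch {

    private FallbackSketch() {
        // static helper only
    }

    /**
     * Tries providers in order and stops at the first one whose feed call
     * completes without exception.
     * @return index of that provider, or -1 if every provider threw
     */
    static int feedFirstSuccessful(final List<ProviderSketch> providers,
                                   final String loaderName) {
        for (int i = 0; i < providers.size(); i++) {
            try {
                providers.get(i).feed(loaderName);
                // Normal completion: the loader is considered fed,
                // so the remaining providers are not consulted.
                return i;
            } catch (Exception e) {
                // Abrupt completion: fall through to the next provider.
            }
        }
        return -1;
    }
}
```

        With this arrangement, a provider that cannot serve a loader signals it by throwing, which lets a later provider in the list take over.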