As explained on the filtering page, the DataProvidersManager.applyAllFilters
method works by looping over all registered DataFilter
instances and calling their filter
method with the current NamedData
. If the filter
method returns the exact same instance that was passed to it, it means the filter does not do anything. In this case, the DataProvidersManager.applyAllFilters
method just continues its loop and check the next filter. If the filter
method returns a different NamedData
instance that was passed to it, it means the filter does indeed act on the bytes stream. In this case, the DataProvidersManager.applyAllFilters
method sets the current NamedData
to the returned value and restart its loop from the beginning.
This algorithm allows the same filter to be applied several time if needed, and it also allows the filters to be applied in any order, regardless of the order in which they have been registered to the DataProvidersManager
.
Users may benefit from this general feature to add their own filters. One example could be a deciphering algorithm if sensitive data should be stored enciphered and should be deciphered on the fly when data is loaded.
As per the way the applyAllFilters
method works, the filter
method must be implemented in such a way that it should check the NamedData
passed to it and return its parameter if it considers it should not filter it, or return a new NamedData
if it considers is should filter it.
Checking is typically done using only the name and looking for files extensions, but it could as well be made by opening temporarily the stream to read just the first few bytes to look for some magic number and closing it afterwards, as the NamedData
passed as a parameter has an openStream
method that can be called as many times as one wants.
As applyAllFilters
restarts its loop from the beginning each time a filter is added to the stack, some care must be taken to avoid stacking an infinite number of instances of the same filter on top of each other. This means that the filtered NamedData
returned after filtering should be recognized as already filtered and not matched again by the same filter. If the check is based on file names extensions (like .gz
for gzip-compressed files), then if the original NamedData
has a name of the form base.ext.gz
than the filtered file should have a name of the form base.ext
. Another point is that if a filters does not act on a NamedData
, then it must return the same instance that was passed to it, it must not simply create a transparent filter that just passes names and bytes unchanged, otherwise it would be considered as a valid filter and added again and again until either a stack overflow or memory exhaustion exception occurs.
The filtering part itself is implemented by opening the bytes stream from the underlying original NamedData
, reading raw bytes from it, performing the processing on these bytes (uncompressing, deciphering, …) and returning them as another stream.
The following example shows how to do that for a dummy deciphering algorithm based on a simple XOR (this is a toy example only, not intended to be secure at all).
public class XorFilter implements DataFilter { /** Suffix for XOR ciphered files. */ private static final String SUFFIX = ".xor"; /** Highly secret key. */ private static final int key = 0x3b; /** {@inheritDoc} */ @Override public NamedData filter(final NamedData original) { final String oName = original.getName(); final NamedData.StreamOpener oOpener = original.getStreamOpener(); if (oName.endsWith(SUFFIX)) { final String fName = oName.substring(0, oName.length() - SUFFIX.length()); final NamedData.StreamOpener fOpener = () -> new XORInputStream(oName, oOpener.openStream()); return new NamedData(fName, fOpener); } else { return original; } } /** Filtering of XOR ciphered stream. */ private static class XORInputStream extends InputStream { /** File name. */ private final String name; /** Underlying compressed stream. */ private final InputStream input; /** Indicator for end of input. */ private boolean endOfInput; /** Simple constructor. * @param name file name * @param input underlying compressed stream * @exception IOException if first bytes cannot be read */ XORInputStream(final String name, final InputStream input) throws IOException { this.name = name; this.input = input; this.endOfInput = false; } /** {@inheritDoc} */ @Override public int read() throws IOException { if (endOfInput) { // we have reached end of data return -1; } final int raw = input.read(); if (raw < 0) { endOfInput = true; return -1; } else { return raw ^ key; } } } }