As explained on the filtering page, the FiltersManager.applyRelevantFilters method works by looping over all registered DataFilter instances and calling their filter method with the current DataSource. If the filter method returns the exact same instance that was passed to it, the filter does not act on this data source. In this case, applyRelevantFilters simply continues its loop and checks the next filter. If the filter method returns a DataSource instance different from the one that was passed to it, the filter does act on the data stream. In this case, applyRelevantFilters sets the current DataSource to the returned value and restarts its loop from the beginning. This algorithm allows the same filter to be applied several times if needed, and it also allows the filters to be applied in any order, regardless of the order in which they have been registered to the FiltersManager.
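The restart-on-change loop described above can be sketched without any library dependency. The snippet below is only an illustration under simplifying assumptions, not the actual FiltersManager code: NamedSource, FilterLoopSketch and stripSuffix are hypothetical names, and a source is reduced to its name.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a data source: only the name matters here. */
final class NamedSource {
    final String name;
    NamedSource(final String name) { this.name = name; }
}

/** Sketch of a restart-on-change filtering loop (assumption: not the library's code). */
final class FilterLoopSketch {

    /** Apply filters until a full pass leaves the source untouched. */
    static NamedSource applyAll(final List<UnaryOperator<NamedSource>> filters,
                                final NamedSource original) {
        NamedSource current = original;
        boolean changed = true;
        while (changed) {
            changed = false;
            for (final UnaryOperator<NamedSource> filter : filters) {
                final NamedSource filtered = filter.apply(current);
                if (filtered != current) {
                    // the filter acted: keep its result and restart from the first filter
                    current = filtered;
                    changed = true;
                    break;
                }
            }
        }
        return current;
    }

    /** Hypothetical filter that removes a given suffix from the name. */
    static UnaryOperator<NamedSource> stripSuffix(final String suffix) {
        return s -> s.name.endsWith(suffix)
                    ? new NamedSource(s.name.substring(0, s.name.length() - suffix.length()))
                    : s;
    }
}
```

With the filters registered in the order ".gz" then ".xor", a source named data.txt.gz.xor.gz is reduced to data.txt: the ".gz" filter is applied twice and the ".xor" filter is applied between the two, whatever the registration order.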
Users may benefit from this general feature to add their own filters. One example could be a deciphering algorithm if sensitive data should be stored enciphered and should be deciphered on the fly when data is loaded.
Because of the way the applyRelevantFilters method works, the filter method must check the DataSource passed to it and either return its parameter unchanged if it considers it should not filter it, or return a new DataSource if it considers it should filter it.
There is one important caveat to understand when implementing a custom filter: filters must never open the DataSource by themselves, regardless of whether they return the original instance or a filtered instance when their filter method is called. The rationale is that it is the upper layer that decides whether to open the returned value, and that a DataSource can be opened only once; this is the core principle of the lazy opening provided by DataSource.
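The lazy-opening rule can be illustrated with a small self-contained sketch. It assumes a simplified model in which a source is a name plus a deferred opener; LazyOpener, LazySource, LazyFilterSketch and the ".raw" suffix are hypothetical and are not part of the library. The filter only captures the original opener inside a lambda, so nothing is opened while filters are being stacked.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical deferred opener (assumption: simplified model of a stream opener). */
interface LazyOpener {
    InputStream open() throws IOException;
}

/** Hypothetical lazily opened source: a name plus a deferred opener. */
final class LazySource {
    final String name;
    final LazyOpener opener;
    LazySource(final String name, final LazyOpener opener) {
        this.name = name;
        this.opener = opener;
    }
}

final class LazyFilterSketch {

    /** Hypothetical suffix used to select the sources to filter. */
    private static final String SUFFIX = ".raw";

    /** Wrap the source without opening it: the original opener is only captured. */
    static LazySource filter(final LazySource original) {
        if (!original.name.endsWith(SUFFIX)) {
            return original;
        }
        final String filteredName =
            original.name.substring(0, original.name.length() - SUFFIX.length());
        // original.opener.open() runs only if the caller later opens the filtered source
        return new LazySource(filteredName, () -> original.opener.open());
    }

    /** Build a source that counts how many times it is opened. */
    static LazySource counting(final String name, final AtomicInteger opens) {
        return new LazySource(name, () -> {
            opens.incrementAndGet();
            return new ByteArrayInputStream(new byte[] { 1, 2, 3 });
        });
    }
}
```

Calling filter on a counting source leaves the counter at zero; the counter reaches one only when the caller opens the filtered source.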
A consequence of this caveat is that a filter cannot peek at the first few bytes of the data stream referenced by a DataSource, for example in an attempt to look for a magic number in a header. This is why, for example, the GzipFilter looks for a .gz suffix in the name and does not look for the 0x1f8b magic number at the start of the file.
As applyRelevantFilters restarts its loop from the beginning each time a filter is added to the stack, some care must be taken to avoid stacking an infinite number of instances of the same filter on top of each other. This means that the filtered DataSource returned after filtering should be recognized as already filtered and not matched again by the same filter. If the check is based on file name extensions (like .gz for gzip-compressed files), then if the original DataSource has a name of the form base.ext.gz, the filtered DataSource should have a name of the form base.ext. Another point is that if a filter does not act on a DataSource, then it must return the same instance that was passed to it; it must not simply create a transparent filter that just passes names and data stream unchanged, otherwise it would be considered a valid filter and added again and again until either a stack overflow or a memory exhaustion error occurs.
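One way to catch such mistakes early is a small stability check: apply the filter to its own output and verify that the second call returns the very same instance. The sketch below is a possible unit-test helper, not a library facility; SourceStub, FilterStabilityCheck, STRIPPING and TRANSPARENT are hypothetical names, and the source is again reduced to its name. The check is meaningful for a name that carries the suffix exactly once, such as base.ext.gz.

```java
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for a data source, reduced to its name. */
final class SourceStub {
    final String name;
    SourceStub(final String name) { this.name = name; }
}

final class FilterStabilityCheck {

    /** A filter is safe to stack if it leaves its own output untouched. */
    static boolean isStable(final UnaryOperator<SourceStub> filter, final SourceStub input) {
        final SourceStub once  = filter.apply(input);
        final SourceStub twice = filter.apply(once);
        return twice == once;
    }

    /** Well-behaved filter: strips ".gz", so base.ext.gz becomes base.ext. */
    static final UnaryOperator<SourceStub> STRIPPING = s ->
        s.name.endsWith(".gz") ? new SourceStub(s.name.substring(0, s.name.length() - 3)) : s;

    /** Faulty filter: always returns a new wrapper, even when nothing needs filtering. */
    static final UnaryOperator<SourceStub> TRANSPARENT = s -> new SourceStub(s.name);
}
```

Here isStable accepts STRIPPING for base.ext.gz and rejects TRANSPARENT, which would be stacked again and again by a restart-on-change loop.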
The filtering part itself is implemented by opening the data stream from the underlying original DataSource, reading raw data from it, performing the processing on these data (uncompressing, deciphering, …) and returning them as another stream.
The following example shows how to do that for a dummy deciphering algorithm based on a simple XOR (this is a toy example only, not intended to be secure at all).
import java.io.IOException;
import java.io.InputStream;

import org.orekit.data.DataFilter;
import org.orekit.data.DataSource;

public class XorFilter implements DataFilter {

    /** Suffix for XOR ciphered files. */
    private static final String SUFFIX = ".xor";

    /** Highly secret key. */
    private static final int KEY = 0x3b;

    /** {@inheritDoc} */
    @Override
    public DataSource filter(final DataSource original) {
        final String            oName   = original.getName();
        final DataSource.Opener oOpener = original.getOpener();
        if (oName.endsWith(SUFFIX)) {
            // strip the suffix so the filtered source is not matched again by this filter
            final String fName = oName.substring(0, oName.length() - SUFFIX.length());
            // the lambda defers opening: the original stream is opened only when the filtered one is
            final DataSource.StreamOpener fOpener = () -> new XORInputStream(oName, oOpener.openStreamOnce());
            return new DataSource(fName, fOpener);
        } else {
            // not a ciphered file: return the very same instance
            return original;
        }
    }

    /** Filtering of XOR ciphered stream. */
    private static class XORInputStream extends InputStream {

        /** File name. */
        private final String name;

        /** Underlying ciphered stream. */
        private final InputStream input;

        /** Indicator for end of input. */
        private boolean endOfInput;

        /** Simple constructor.
         * @param name file name
         * @param input underlying ciphered stream
         */
        XORInputStream(final String name, final InputStream input) {
            this.name       = name;
            this.input      = input;
            this.endOfInput = false;
        }

        /** {@inheritDoc} */
        @Override
        public int read() throws IOException {
            if (endOfInput) {
                // we have reached end of data
                return -1;
            }
            final int raw = input.read();
            if (raw < 0) {
                endOfInput = true;
                return -1;
            } else {
                // XOR each byte with the key to decipher it
                return raw ^ KEY;
            }
        }

        /** {@inheritDoc} */
        @Override
        public void close() throws IOException {
            // release the underlying stream together with this one
            input.close();
        }

    }

}
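The deciphering logic can be tried in isolation, without any DataSource. The following sketch uses only java.io; XorStream and its apply helper are hypothetical names introduced for this experiment. It relies on the fact that XOR with a fixed key is its own inverse, so the same stream both ciphers and deciphers. The bulk read override is an optional addition that avoids reading byte by byte.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical standalone variant of the XOR stream, for experimentation only. */
final class XorStream extends FilterInputStream {

    /** Key, reduced to a single byte. */
    private final int key;

    XorStream(final InputStream input, final int key) {
        super(input);
        this.key = key & 0xff;
    }

    @Override
    public int read() throws IOException {
        final int raw = in.read();
        return raw < 0 ? -1 : raw ^ key;
    }

    @Override
    public int read(final byte[] buffer, final int offset, final int length) throws IOException {
        final int n = in.read(buffer, offset, length);
        // n is -1 at end of stream, in which case the loop does not run
        for (int i = 0; i < n; ++i) {
            buffer[offset + i] ^= key;
        }
        return n;
    }

    /** Cipher or decipher a byte array (requires Java 9 or later for readAllBytes). */
    static byte[] apply(final byte[] data, final int key) throws IOException {
        try (InputStream stream = new XorStream(new ByteArrayInputStream(data), key)) {
            return stream.readAllBytes();
        }
    }
}
```

Applying XorStream.apply twice with the same key returns the original bytes, which is a convenient way to produce ".xor" test files for the filter.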