[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Orekit Developers] Work In Progress for thread safety



Le 12/04/2012 23:37, Thomas Neidhart a écrit :
> On 04/10/2012 07:22 PM, MAISONOBE Luc wrote:
>> Hi all,
>>
>> Some progress has been made on thread-safety for Orekit (see issue
>> <https://www.orekit.org/forge/issues/3>).
>>
>> As discussed on this list and on the issue, we have set up a thread-safe
>> TimeStampedCache that do not use synchronization at all but rely on more
>> efficient mechanisms (read-write locks). The rationale is to have thread
>> safety that works even when threads are not tightly bound to some dates
>> interval. This corresponds to two major use cases:
>>  - one where threads are used in a pool (see ExecutorService and the whole
>>    java standard concurrent package)
>>  - one where threads are created and shut down at high frequency, a new
>> thread
>>    being used for each request
>>
>> Of course, it also supports applications where threads are completely
>> under control and each thread is tightly bound to a date range.
>>
>> This work started on a dedicated branch, and has now been merged back
>> into master. As of writing, the general TimeStampedCache framework is in
>> place for the simplest case of leap seconds handling (the UTCScale
>> class). It will be extended to all other Orekit caches and can also be
>> used at application level (people can for example need cached orbits).
>>
>> I would be happy to get some feedback from users. If you have
>> multi-threaded applications, could you check the behaviour of the
>> current implementation on dates ? Be aware that for now only UTC-TAI
>> relies on the new thread-safe mechanism and for example frames are not
>> thread-safe yet (but will probably be soon).
> 
> Hi Luc,

Hi Thomas,

> 
> I have looked into the new UTCScale version using a TimeStampedCache and
> I have found some issues:
> 
> In the testMultithreading unit test, you first create 10000 random dates
> with their corresponding UTC-TAI offset and compare this later inside an
> ExecutorService.
> 
> My expectation was that the internal cache in the UTCScale will contain
> 1 slot with 40 UTCTAIOffset entries.
> 
> In fact there were 100 slots (max slot size) with 40 entries each.
> 
> After debugging into it, I found two problems:
> 
>  - in the TimeStampedCache, the quantum is stored as an int
>    but it can lead to over/under-flows when doing the following
>    comparison in selectSlot (and maybe somewhere else):
> 
>         if (slots.isEmpty() ||
>             slots.get(index).getEarliestQuantum() > dateQuantum +
> NEW_RANGE_FACTOR * neighborsSize ||
>             slots.get(index).getLatestQuantum()   < dateQuantum -
> NEW_RANGE_FACTOR * neighborsSize) {
> 
>    the result is, that more slots as necessary are created.
>    For consistency, all uses of a quantum should be long instead of int

Good catch!
We should be cautious here. Simply changing the types to long may simply
move the problem without solving it. The quantum step in the cache
constructor is computed as halfSpan / Integer.MAX_VALUE. Blindly
changing int to long would lead to change this statement to halfSpan /
Long.MAX_VALUE. After this change, the earliest and latest dates would
still be of the order of magnitude of min and max values for a long, so
the overflow would still occur!

So I would suggest to both use long and at the same time use a fixed
quantum set. Using 1 microsecond as the quantum step seems reasonable
for separating entries even for very fast pace dynamics (when dealing
with attitude and transient modes), and allows a signed long to support
a range from t0 - 292271 years to t0 + 292271 years.

If we use a fixed quantum step, we don't need anymore to limit the
earliest/latest dates to finite values as explained in the generator
interface javadoc. This would also be a nice improvement as this
limitation is quite awkward.

> 
>  - UTCScale$Generator: the generator defines its earliest and latest
>    date somewhere in the range of 1960 - 2012. When providing dates
>    that are outside this range, a new slot is created, which is not
>    necessary, as there is no more offset data available.
> 
>    I would propose to add a method like this to the generator, and call
>    it instead of cache.getNeighbors directly:
> 
>     private UTCTAIOffset[] getNeighbors(final AbsoluteDate date) {
>         AbsoluteDate corrected = date;
>         final AbsoluteDate latest = cache.getLatest().getDate();
>         final AbsoluteDate earliest = cache.getEarliest().getDate();
>         if (date.compareTo(latest) > 0) {
>             corrected = latest;
>         }
>         if (date.compareTo(earliest) < 0) {
>             corrected = earliest;
>         }
>         return cache.getNeighbors(corrected);
>     }
> 
>    The idea is to limit the date to the available range of the cache.

I understand your rationale. Shouldn't this however be handled at cache
level as a general feature ? There is such a protection in the Slot
constructor for example, it could perhaps be moved a cache level.

> 
>  - There should be a possibility to limit the calls to
>    Generator#generate. In the case of the UTCScale, there is no use in
>    calling it multiple times, as it will not provide additional data,
>    but there is an unnecessary overhead involved (i.e. checking the
>    result).

I'm not sure about that. The case for UTC where we know one call
generates everything is quite specific. In ephemerides cases the number
of calls is also limited by the files contents, but we avoid reading all
files at first, so we don't know beforehand when we will have exhausted
the files (and we often stop computation before exhausting them as they
cover very large ranges). In analytical model cases (like
precession-nutation), there is no limit to the number of calls.

One way could be to say that a limit count equal to zero or less than
zero means no limits.

> 
>    I have done a first test to reduce the overhead by returning an empty
>    list after the first call to the generator, which works, but there
>    are surely better solutions:
> 
>    private static final List<UTCTAIOffset> EMPTY =
>         new ArrayList<UTCTAIOffset>();
>    private boolean generated = false;
> 
>    public List<UTCTAIOffset> generate(final UTCTAIOffset existing,
>                                       final AbsoluteDate date) {
>        // everything has been generated at construction
>        List<UTCTAIOffset> reply = generated ? EMPTY : offsets;
>        generated = true;
>        return reply;
>    }

If we decide to set a limit on the number of calls, it would be trivial
to implement at cache level, as there is already a counter for calls.
Also in the case of UTC, we limit generation because we know nothing
more can be added, but in other cases this could be used as a safety
net. So in the first case, no error should be triggered if the limit is
reached, but in the other case an exception should be thrown.

Another possibility would be to improve the checks the cache does. It
already knows the earliest/latest date and should not generate more than
that. I tried to do that, but obviously failed. There is a dirty trick
in the UTC case because there are regular entries between 1960 and 2012
(for now), and also two entries at infinity, in the past and in the
future. Perhaps these two entries mess up the cache.

> 
> 
> For the first two issues, I can provide a patch, the third one needs to
> be discussed more in detail I guess.
> 
> btw. I know its early work in progress :-), but having more fine-grained
> control over the cache settings would be good, i.e. having specific
> default values for each cache, e.g. UTCScale cache instead of having
> default values for all caches. If you agree, I can also provide patches
> here.

Yes, please do it. Also if you think we can add more control methods in
the same spirit as the various getters for generation calls, evictions,
number of slots and so on, do not hesitate to add them.

Thanks for your thorough analysis
Luc

> 
> Thomas
>