Re: [Orekit Users] Test failure
Hi Luc, Walter,
On Mon, 2018-06-04 at 18:10 +0200, MAISONOBE Luc wrote:
Walter Grossman <w.grossman@ieee.org> a écrit :
Thanks for prompt response. I will do my best. Let me also add that there
was a warning that 2 tests were skipped.
The skipped tests are expected; they correspond to one of the classes
considered experimental as of 9.2.
I cloned the repository using git. The jar is orekit-9.2.jar
Ubuntu 16.04 LTS
Intel® Core™ i5-3320M CPU @ 2.60GHz × 4
Intel® Ivybridge Mobile
64-bit
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
I'm also seeing this test error, as well as one with NetworkCrawlerTest, when building the 9.2 tag from git. The NetworkCrawlerTest issue may be unrelated. Here is my system information:
$ mvn clean test
...
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] NetworkCrawlerTest.compressed:82 expected:<2> but was:<0>
[ERROR] OrbitDeterminationTest.testW3B:384 expected:<0.687998> but was:<0.6880143632396981>
[INFO]
[ERROR] Tests run: 2790, Failures: 2, Errors: 0, Skipped: 2
commit: 5da7febcc2769477c4522d7ec4ed42e8169c6e39
javac 1.8.0_171
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-8u171-b11-0ubuntu0.16.04.1-b11)
OpenJDK 64-Bit Server VM (build 25.171-b11, mixed mode)
Linux B259-LINUX4 4.4.0-127-generic #153-Ubuntu SMP Sat May 19 10:58:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Description: Ubuntu 16.04.4 LTS
model name : Intel(R) Xeon(R) CPU X5650 @ 2.67GHz
The NetworkCrawlerTest issue seems to be related to the order in which the tests are run, because when I run `mvn clean test -Dtest=NetworkCrawlerTest#compressed` the test passes. So it is probably a data loading order/caching issue. I think the order in which JUnit tests run is non-deterministic.
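The kind of ordering-dependent failure suspected here can be illustrated with a toy example (plain Java, not Orekit code): a shared, statically cached data store that one test clears and another test counts, reproducing the `expected:<2> but was:<0>` symptom when the tests run in the wrong order.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of order-dependent tests sharing a static cache.
// All names here are hypothetical, not taken from NetworkCrawlerTest.
public class OrderDependence {
    // Shared, lazily populated data store, standing in for a data-loading cache.
    static List<String> cache = new ArrayList<>();

    static void loadData() {
        cache.add("a.gz");
        cache.add("b.gz");
    }

    static void clearWithoutReload() {
        // Simulates another test tearing down shared state without reloading it.
        cache.clear();
    }

    static int compressedCount() {
        // Correct only if the cache was (re)loaded beforehand.
        return cache.size();
    }

    public static void main(String[] args) {
        loadData();
        System.out.println(compressedCount()); // prints 2 when run in isolation
        clearWithoutReload();
        System.out.println(compressedCount()); // prints 0: expected:<2> but was:<0>
    }
}
```

Running the failing test alone (as with `-Dtest=NetworkCrawlerTest#compressed`) corresponds to the first call here, which is why it passes in isolation.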
Luc, if you run `mvn clean test -Dtest=OrbitDeterminationTest#testW3B`, does it still pass on your machine? Perhaps it is a test ordering issue for that test as well.
Let me know if you would like me to try something to debug the issue.
Best Regards,
Evan
Maybe I should switch to Oracle Java?
No, most of the Orekit developers use Linux and OpenJDK.
I'll have a quick look at this, but this may be a numerical glitch. Increasing
the tolerance seems fine to me.
best regards,
Luc
On Mon, Jun 4, 2018 at 9:54 AM, MAISONOBE Luc <luc.maisonobe@c-s.fr> wrote:
Hi Walter,
Walter Grossman <w.grossman@ieee.org> a écrit :
I am a newbie to Orekit. I ran the tests and got a "near-miss" failure. I
resolved it by relaxing the precision. How do I know if I am OK?
OrbitDeterminationTest.testW3B:384 expected:<0.687998> but was:<0.6880143632396981>
I found this line:
Assert.assertEquals(0.687998, covariances.getEntry(6, 6), 1.0e-5);
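A quick arithmetic check (plain Java, not Orekit code) shows why this assertion fails: the difference between the reference and observed values is about 1.6e-5, just outside the 1.0e-5 tolerance, which is why relaxing the tolerance makes the test pass.

```java
// Checks the failing assertion's arithmetic: JUnit's
// assertEquals(expected, actual, delta) passes iff |expected - actual| <= delta.
public class ToleranceCheck {
    public static void main(String[] args) {
        double expected = 0.687998;
        double actual   = 0.6880143632396981;
        double diff = Math.abs(expected - actual);
        System.out.println(diff);           // about 1.64e-5
        System.out.println(diff <= 1.0e-5); // false -> the test fails
        System.out.println(diff <= 1.0e-4); // true  -> passes once relaxed
    }
}
```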
Is the problem that the acceptance criterion is too tight? Why?
The test tolerance is intentionally extremely small; see below
for the rationale for this stringent choice. The test should however
succeed with the current settings. Could you tell us which version
of Orekit you use (development version from the git repository, or a released
version) and with which Java environment (OS, JVM version, processor)?
Some tests in Orekit are built in several stages. First the test is
created without any thresholds and only outputs its results, which the
developer compares with whatever is available to gain confidence in
them. This may be runs of other reference programs if available,
another independent implementation using different algorithms, or a
sensitivity analysis with the program under test itself. This
validation phase may be quite long. Once developers are convinced the
implementation is good, they run the test one last time and register its
output as the reference values with a stringent threshold, in order to
transform the test into a non-regression test. The threshold is therefore
not an indication that the results are very good; it is only a way for us
to ensure that any change in the code that affects this part will break
the test and will force developers to look again at this code and decide
what to do. They can decide that the changes that broke the test are valid
and only changed the results in an acceptable way (sometimes even improving
them), in which case they change either the reference value or the
threshold. Or they can decide that the changes in fact triggered something
unexpected and that they should improve their new code so the test passes
again without changing it. So, in summary, thresholds for non-regression
tests are small so that they act as a fuse: people notice when it blows
and can take decisions.
best regards,
Luc