Hi,
I have some unit tests that chain together several BetweenFactor instances from mocked odometry, add slightly contradictory supplemental evidence through other factors (for example a unary factor like the "GPS-like" factor in LocalizationExample.cpp), then call `optimize()` on a LevenbergMarquardtOptimizer and assert on the result.
For example, a test might start with rather noisy mocked odometry moving forward at a constant speed with a heading of pi/4, add low-noise GPS-like factors suggesting the movement is actually in the pi/2 direction, then optimize the graph and check that the optimized poses lie along the trajectory suggested by the more reliable GPS-like factors.
Such a test and several others work fine on my desktop (x86_64, Ubuntu 18.04) but fail on my Jetson Xavier NX (aarch64, Ubuntu 18.04).
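To show why I expect the optimized poses to follow the GPS-like factors, here is a deliberately simplified sketch that does not use GTSAM at all. It assumes a single position with two independent isotropic Gaussian estimates (one from odometry, one GPS-like) and fuses them by inverse-variance weighting. The `Estimate2D` type, the function names, and the sigmas are my own illustrative choices, not anything from the library or from my real tests.

```cpp
#include <cassert>
#include <cmath>

// A 2D position estimate with an isotropic standard deviation.
// Hypothetical type for illustration only; not a GTSAM class.
struct Estimate2D {
  double x;
  double y;
  double sigma;
};

// Fuse two independent Gaussian estimates of the same position by
// information (inverse-variance) weighting.
Estimate2D fuse(const Estimate2D& a, const Estimate2D& b) {
  assert(a.sigma > 0.0 && b.sigma > 0.0);
  const double wa = 1.0 / (a.sigma * a.sigma);
  const double wb = 1.0 / (b.sigma * b.sigma);
  const double w = wa + wb;
  return {(wa * a.x + wb * b.x) / w, (wa * a.y + wb * b.y) / w,
          std::sqrt(1.0 / w)};
}

// Heading of the displacement from the origin to the estimate, in radians.
double headingOf(const Estimate2D& e) { return std::atan2(e.y, e.x); }
```

With a noisy odometry estimate one unit along pi/4 (sigma 0.5) and a GPS-like estimate one unit along pi/2 (sigma 0.01), the fused heading ends up very close to pi/2. That is the behaviour my real tests assert on after optimizing the full graph.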
Checking out the borglab GTSAM source from GitHub and running `make check` on the aarch64 device also produces many unit test failures, which I expect are related to my own. The following one, I think, resembles one of the more basic examples included with GTSAM:
```
gtsam/tests/testNonlinearOptimizer.cpp:484: Failure: "assert_equal(expected, dl_result, tol)"
Not equal:
expected:
Values with 3 values:
Value x1: (gtsam::Pose2)
(0, 0, 0)
Value x2: (gtsam::Pose2)
(1.5, 0, 0)
Value x3: (gtsam::Pose2)
(3, 0, 0)
actual:
Values with 3 values:
Value x1: (gtsam::Pose2)
(-3.74995e-06, 0, 0)
Value x2: (gtsam::Pose2)
(1.49999, 0, 0)
Value x3: (gtsam::Pose2)
(3, 0, 0)
```
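To put a number on the mismatch, here is a small standalone sketch that computes the largest component-wise difference between the expected and actual values printed above. The values are copied from the output, which is rounded to about six significant digits, so the true difference may be slightly different. The helper name and the flattened layout are my own.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Largest absolute component-wise difference between two equally sized arrays.
template <std::size_t N>
double maxAbsDiff(const std::array<double, N>& expected,
                  const std::array<double, N>& actual) {
  double worst = 0.0;
  for (std::size_t i = 0; i < N; ++i) {
    worst = std::max(worst, std::fabs(expected[i] - actual[i]));
  }
  return worst;
}

// Flattened (x, y, theta) for x1, x2, x3, copied from the printed output.
const std::array<double, 9> kExpected = {
    {0.0, 0.0, 0.0, 1.5, 0.0, 0.0, 3.0, 0.0, 0.0}};
const std::array<double, 9> kActual = {
    {-3.74995e-06, 0.0, 0.0, 1.49999, 0.0, 0.0, 3.0, 0.0, 0.0}};
```

Calling `maxAbsDiff(kExpected, kActual)` gives a difference on the order of 1e-5, so the result is close but not within a tight tolerance. That looks more like the optimizer stopping early than like a completely wrong solution.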
I am debugging the GTSAM source with verbose optimizer output and will post updates as I make progress. My current suspicion is an Eigen aliasing issue, or the `auto` keyword deducing an unexpected type for Eigen expressions on aarch64.
One thing I have observed so far, only on aarch64, is:
```
Partial Cholesky on HessianFactor failed.
Frontal keys Position 0: 0
HessianFactor:
keys: 0(3) 1(3)
Augmented information matrix: [
1.00453889587e+81, -8.863578493e+80, 1.35333966459e+82, -1.22992576855e+42, -7.68703605343e+40, 0, -1.78961791309e+41;
-8.863578493e+80, 1.00453889587e+81, -1.19233688982e+82, 1.22992576855e+42, -7.68703605343e+40, 0, 3.06711287414e+41;
1.35333966459e+82, -1.19233688982e+82, 1.82326703797e+83, -1.65582451852e+43, -1.04723572027e+42, -400, -2.39906858576e+42;
-1.22992576855e+42, 1.22992576855e+42, -1.65582451852e+43, 1600, 0, 0, 315.903994301;
-7.68703605343e+40, -7.68703605343e+40, -1.04723572027e+42, 0, 100, 0, -83.0941179516;
0, 0, -400, 0, 0, 400, 628.318530718;
-1.78961791309e+41, 3.06711287414e+41, -2.39906858576e+42, 315.903994301, -83.0941179516, 628.318530718, 304968.261602
]
```
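What stands out to me in this dump is the scale of the block belonging to key 0. As a rough check, the sketch below compares the largest and smallest diagonal entries of the 6x6 information block. I exclude the last row and column, which hold the linear term and constant in the augmented matrix. The helper is my own, and the ratio is only a crude proxy for conditioning, not a true condition number.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Ratio between the largest and smallest absolute diagonal entries.
// A crude scale check, not a true condition number.
double diagonalSpread(const std::vector<double>& diagonal) {
  assert(!diagonal.empty());
  double lo = std::fabs(diagonal[0]);
  double hi = lo;
  for (double d : diagonal) {
    lo = std::min(lo, std::fabs(d));
    hi = std::max(hi, std::fabs(d));
  }
  assert(lo > 0.0);  // a zero diagonal entry would make the ratio undefined
  return hi / lo;
}

// Diagonal of the 6x6 information block, copied from the dump above.
const std::vector<double> kDiagonal = {
    1.00453889587e+81, 1.00453889587e+81, 1.82326703797e+83,
    1600.0,            100.0,             400.0};
```

`diagonalSpread(kDiagonal)` comes out around 1.8e81, while a double carries only about 16 significant digits. Entries of order 1e2 to 1e3 cannot survive elimination against entries of order 1e81, so I suspect the key 0 block contains garbage or overflowed values rather than a legitimately large information matrix.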
Another thing I have noticed on aarch64 is that the initial total error of the same system seems much lower than on x86_64, possibly resulting in fewer or no optimization iterations.
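One way a lower initial error could cut the iterations short is an absolute-decrease stopping threshold. The sketch below is a generic stopping rule of the kind iterative optimizers commonly use. The struct, the function name, and the threshold values are hypothetical, and I have not confirmed that this is what happens inside GTSAM.

```cpp
#include <cassert>

// Hypothetical thresholds for illustration; not taken from GTSAM.
struct StopCriteria {
  double relativeTol;  // stop if the relative error decrease is at most this
  double absoluteTol;  // stop if the absolute error decrease is at most this
};

// Returns true when the error decrease from one iteration to the next
// is small enough, in relative or absolute terms, to stop iterating.
bool shouldStop(double previousError, double newError, const StopCriteria& c) {
  assert(previousError > 0.0);
  const double absoluteDecrease = previousError - newError;
  const double relativeDecrease = absoluteDecrease / previousError;
  return absoluteDecrease <= c.absoluteTol || relativeDecrease <= c.relativeTol;
}
```

With both thresholds at 1e-5, a step from 100 to 50 keeps iterating, while a step from 1e-5 to 5e-6 stops even though the relative improvement is the same 50%. If the aarch64 build really starts from a much smaller error, a rule like this would end the optimization earlier.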
Any help is greatly appreciated, and thanks in advance!
Jay