In the user preferences there is a SIMD on/off switch. Keep in mind that these vector-capable workloads account for only a fraction of the overall workload of running the sdrtrunk application.
A direct comparison between SIMD enabled and disabled would be difficult because you might not see the same workloads across each run of the application. The calibration tests each SIMD-capable software component in isolation to determine which SIMD or non-SIMD variant to use on a given machine. You can see the scores for each component when you run the calibrations. Each score is the average number of operations the component completed in a fixed timeframe. Each calibration runs a warm-up pass before the actual timed test.
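To give a rough idea of the approach, here is a minimal sketch of a timed calibration loop. The Candidate interface, the method names, and the timing values are hypothetical stand-ins for illustration, not sdrtrunk's actual classes:

    import java.util.concurrent.TimeUnit;

    /** Hypothetical stand-in for one SIMD or non-SIMD variant under test. */
    interface Candidate {
        void processChunk();   // one unit of work, e.g. one buffer of samples
    }

    final class Calibration {
        /** Average number of operations completed per fixed timed window. */
        static double score(Candidate c, long warmupMs, long testMs, int runs) {
            runFor(c, warmupMs);               // warm-up pass lets the JIT compile the hot paths
            long total = 0;
            for (int i = 0; i < runs; i++) {
                total += runFor(c, testMs);    // the actual timed test
            }
            return (double) total / runs;
        }

        private static long runFor(Candidate c, long millis) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
            long ops = 0;
            while (System.nanoTime() < deadline) {
                c.processChunk();
                ops++;
            }
            return ops;
        }
    }

The variant with the highest average score is the one worth selecting for that component on that machine.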
During development I ran these calibration tests on several different machines and the results were surprising. Many of the newest machines support 8 lanes of 32-bit floating point operations, meaning the CPU can perform eight floating point calculations in the same number of cycles as a single one. A CPU that supports 8-lane operations also supports 4-lane and 2-lane operations, and some Intel CPUs support 16-lane (AVX-512) operations. For some tasks the non-SIMD variant performs best; on other machines with 8-lane capable CPUs, the 4-lane SIMD variant performs better.
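For anyone curious how lane counts map to vector widths, the JDK's incubating Vector API (jdk.incubator.vector) exposes them directly. A quick sketch, run on a recent JDK with --add-modules jdk.incubator.vector:

    import jdk.incubator.vector.FloatVector;

    public class LaneDemo {
        public static void main(String[] args) {
            // For 32-bit floats: a 128-bit vector = 4 lanes, 256-bit = 8, 512-bit (AVX-512) = 16.
            System.out.println("128-bit: " + FloatVector.SPECIES_128.length() + " lanes");
            System.out.println("256-bit: " + FloatVector.SPECIES_256.length() + " lanes");
            System.out.println("512-bit: " + FloatVector.SPECIES_512.length() + " lanes");
            // The widest shape the current CPU supports natively:
            System.out.println("preferred: " + FloatVector.SPECIES_PREFERRED);
        }
    }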
The SIMD optimizations are competing against the Java HotSpot compiler. At runtime, Java can 'auto-vectorize' chunks of code when it detects that the code is structured in a way that is compatible with vectorization. So in the cases I mentioned where the non-SIMD variants were faster, Java may already be vectorizing that code and doing it more efficiently than I can express explicitly. In fact, sdrtrunk has been using some SIMD vector operations all along, thanks to this Java auto-vectorization capability.
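To illustrate the competition (my own sketch, not code from sdrtrunk): HotSpot can often auto-vectorize a simple counted loop like the first method below, while the second method spells out the same gain operation explicitly with the Vector API:

    import jdk.incubator.vector.FloatVector;
    import jdk.incubator.vector.VectorSpecies;

    public class GainDemo {
        static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

        // Simple counted loop with no cross-iteration dependency: a good
        // candidate for HotSpot's auto-vectorizer.
        static void scaleScalar(float[] in, float[] out, float gain) {
            for (int i = 0; i < in.length; i++) {
                out[i] = in[i] * gain;
            }
        }

        // Explicit SIMD: processes SPECIES.length() floats per step, then
        // falls back to a scalar tail for any leftover elements.
        static void scaleVector(float[] in, float[] out, float gain) {
            int i = 0;
            for (int bound = SPECIES.loopBound(in.length); i < bound; i += SPECIES.length()) {
                FloatVector.fromArray(SPECIES, in, i).mul(gain).intoArray(out, i);
            }
            for (; i < in.length; i++) {
                out[i] = in[i] * gain;
            }
        }
    }

When the JIT already vectorizes scaleScalar well, the explicit version buys little and can even lose, which matches what the calibrations showed.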
The best speed-up I saw during my tests was 7.5x, for 8-lane SIMD operations.
cheers,
Denny