Benchmarks

The OpenGJK library is continuously tested for accuracy and performance. This page reports benchmarks of the library's runtime performance and is updated automatically each time a new version is released. It serves as a dashboard that summarizes how recent development has affected speed.

Latest Run

The latest benchmarks ran on a shared virtual machine (GitLab CI runner).



Live Results

OpenGJK is called in three simple scenarios:

  • A pair of points
  • A pair of random polytopes with 100 vertices each
  • A pair of random polytopes with 1000 vertices each

For each scenario, the mean, median, standard deviation, and coefficient of variation (cv) of the runtime (CPU time) are reported. The results are shown below on a logarithmic scale.
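As a rough illustration of how these four statistics relate, the sketch below times a callable repeatedly and summarizes the samples. The helpers `time_calls` and `summarize` and the stand-in workload are hypothetical and not part of OpenGJK. The actual benchmark harness may measure and aggregate differently, for example by using the population rather than the sample standard deviation.

```python
import statistics
import time


def time_calls(func, repeats=1000):
    """Return per-call CPU times in microseconds for `func` (hypothetical helper)."""
    samples = []
    for _ in range(repeats):
        start = time.process_time_ns()
        func()
        samples.append((time.process_time_ns() - start) / 1000.0)
    return samples


def summarize(samples):
    """Mean, median, sample standard deviation and coefficient of variation."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # needs at least two samples
    return {
        "mean": mean,
        "median": statistics.median(samples),
        "stdev": stdev,
        # cv is dimensionless: spread relative to the mean
        "cv": stdev / mean if mean else float("nan"),
    }


if __name__ == "__main__":
    # Stand-in workload; a real run would call the library here instead.
    print(summarize(time_calls(lambda: sum(range(100)))))
```

Because the cv divides the standard deviation by the mean, it allows the spread of the microsecond-scale point test to be compared with that of the much slower 1000-vertex test.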

Performance History

The heatmap below shows how performance has evolved across all tagged releases of OpenGJK. Green indicates faster execution times, red indicates slower times.

Performance Heatmap

CPU time across versions and scenarios. Green = faster, Red = slower.

| Version | Commit | Pair of Points | 100 Vertex Polytopes | 1000 Vertex Polytopes | Release Date |
|---------|---------|----------------|----------------------|-----------------------|--------------|
| v3.0.0 | 1b84855 | 1.52 μs | 5.85 μs | 108.5 μs | Jul 2022 |
| v3.0.1 | 7a018a3 | 1.48 μs | 5.68 μs | 104.2 μs | Nov 2022 |
| v3.0.2 | 866c01e | 1.42 μs | 5.45 μs | 99.8 μs | Nov 2022 |
| v3.0.3 | 9d9e8e4 | 1.38 μs | 5.32 μs | 97.5 μs | Nov 2022 |
| v4.0.0 | ad2d602 | 1.21 μs | 4.68 μs | 85.8 μs | Apr 2023 |
| v4.0.1 | 855c5c0 | 1.19 μs | 4.62 μs | 84.2 μs | Dec 2023 |
| v4.0.2 | ba1d826 | 1.18 μs | 4.55 μs | 83.1 μs | Nov 2024 |
| v4.0.3 | bc0b389 | 1.15 μs | 4.48 μs | 81.5 μs | Feb 2025 |
| v4.0.4 | 5eca7fa | 1.12 μs | 4.35 μs | 79.2 μs | Dec 2025 |

Performance Improvement Summary

  • Pair of Points: ↓ 26.3% (1.52 → 1.12 μs)
  • 100 Vertex Polytopes: ↓ 25.6% (5.85 → 4.35 μs)
  • 1000 Vertex Polytopes: ↓ 27.0% (108.5 → 79.2 μs)

Comparing v3.0.0 (Jul 2022) to v4.0.4 (Dec 2025)
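The percentages in this summary follow from the first and last rows of the table above. A minimal sketch of the arithmetic, where the helper name `percent_reduction` is illustrative only:

```python
def percent_reduction(old, new):
    """Relative decrease from `old` to `new`, in percent."""
    return (old - new) / old * 100.0


# CPU times in microseconds as (v3.0.0, v4.0.4), taken from the table above.
scenarios = {
    "Pair of Points": (1.52, 1.12),
    "100 Vertex Polytopes": (5.85, 4.35),
    "1000 Vertex Polytopes": (108.5, 79.2),
}

for name, (old, new) in scenarios.items():
    print(f"{name}: {percent_reduction(old, new):.1f}% lower, "
          f"{old / new:.2f}x speedup")
```

A reduction in runtime and a speedup factor are not the same number: a 26.3% lower runtime corresponds to roughly a 1.36x speedup, since the speedup is the ratio of old to new time.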

Performance Trend

An alternative view showing the performance trend as a line chart.

Performance Trend Across Versions

CPU time measurements for three test scenarios. Lower values indicate better performance.

Detailed Results

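| Version | Pair of Points (μs) | 100 Vertices (μs) | 1000 Vertices (μs) | Date |
|---------|---------------------|-------------------|--------------------|------------|
| v3.0.0 | 1.52 | 5.85 | 108.50 | 2022-07-09 |
| v3.0.1 | 1.48 | 5.68 | 104.20 | 2022-11-01 |
| v3.0.2 | 1.42 | 5.45 | 99.80 | 2022-11-23 |
| v3.0.3 | 1.38 | 5.32 | 97.50 | 2022-11-28 |
| v4.0.0 | 1.21 | 4.68 | 85.80 | 2023-04-22 |
| v4.0.1 | 1.19 | 4.62 | 84.20 | 2023-12-28 |
| v4.0.2 | 1.18 | 4.55 | 83.10 | 2024-11-26 |
| v4.0.3 | 1.15 | 4.48 | 81.50 | 2025-02-16 |
| v4.0.4 | 1.12 | 4.35 | 79.20 | 2025-12-29 |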

Application Profiling

Beyond micro-benchmarks, we also profile the library in a realistic physics simulation scenario. The superb mini-application runs hundreds of collision shapes with multi-threaded physics, giving insights into real-world performance characteristics.

Application Profiling Results

Performance profile of the OpenGJK-powered physics simulation (superb app) running 800 collision shapes. Profiled on macOS.

  • Wall Clock Time: 3.29 s
  • Total CPU Time: 29.58 s
  • Threads: 9
  • CPU Utilization: 899%
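The utilization figure can be reproduced from the two timings reported above, assuming it is defined as total CPU time divided by wall-clock time (the usual convention for profilers, though the page does not state it). The function name `cpu_utilization` is illustrative only.

```python
def cpu_utilization(cpu_time_s, wall_clock_s):
    """Average number of cores kept busy over the run."""
    return cpu_time_s / wall_clock_s


wall_clock_s = 3.29   # wall-clock time reported above
cpu_time_s = 29.58    # total CPU time summed over all threads
threads = 9

busy_cores = cpu_utilization(cpu_time_s, wall_clock_s)
print(f"CPU utilization: {busy_cores * 100:.0f}%")                   # 899%
print(f"Share of the {threads}-thread ceiling: {busy_cores / threads:.1%}")  # 99.9%
```

Under this definition, 899% with 9 threads means the run kept almost all threads busy for nearly the whole wall-clock interval.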

Thread Balance

Excellent: 11.0% to 11.1% per thread

Workload is nearly perfectly distributed across all worker threads.
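One way to read the per-thread range is against a perfectly even split. The sketch below assumes the percentages are shares of total CPU time divided across all 9 reported threads; the page does not state this explicitly, and the helper `largest_deviation` is hypothetical.

```python
def largest_deviation(shares_percent, threads):
    """Largest gap, in percentage points, between a share and an even split."""
    ideal = 100.0 / threads
    return max(abs(share - ideal) for share in shares_percent)


threads = 9                      # thread count reported in the profile
reported_range = (11.0, 11.1)    # per-thread shares, in percent

print(f"even split: {100.0 / threads:.2f}% per thread")
print(f"largest deviation: {largest_deviation(reported_range, threads):.2f} points")
```

With these inputs the even split is about 11.11% per thread, so the reported range lies within roughly a tenth of a percentage point of ideal.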

CPU Time Hotspots

| Function | CPU % | Time (ms) |
|----------|-------|-----------|
| compute_minimum_distance | 36.0% | 9520 |
| PhysicsWorld::workerThreadFunc | 24.0% | 6360 |
| CollisionShapeArray::fillGJKBuffer | 12.3% | 3480 |
| W1D | 1.5% | 445 |
| S2D | 1.2% | 355 |
| __sincosf_stret | 1.3% | 385 |

Analysis Summary

The workload is nearly perfectly distributed across all 8 worker threads. Each thread consumes 11.0-11.1% of total CPU time, with a uniform self-weight distribution. The main hotspot is compute_minimum_distance, at 36% of total CPU time. Suggested optimizations include caching GJK buffer data (potential 10-12% speedup) and enhanced SIMD vectorization (potential 15-20% speedup).
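To put such estimates in context, Amdahl's law bounds the overall gain from speeding up one part of a program. The sketch below treats the CPU-time shares from the hotspot table as fractions of total runtime, which is an assumption; the 1.5x local speedup is a purely hypothetical input, not a measured or projected figure.

```python
def amdahl_speedup(fraction, local_speedup):
    """Overall speedup when `fraction` of the runtime runs `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# CPU-time shares from the hotspot table, used here as runtime fractions.
fill_buffer = 0.123   # CollisionShapeArray::fillGJKBuffer
gjk = 0.36            # compute_minimum_distance

# Upper bound if buffer filling were eliminated entirely.
print(f"fillGJKBuffer removed: {amdahl_speedup(fill_buffer, float('inf')):.3f}x")

# Hypothetical case: the distance computation itself becomes 1.5x faster.
print(f"compute_minimum_distance 1.5x faster: {amdahl_speedup(gjk, 1.5):.3f}x")
```

Under these assumptions, removing the buffer-filling cost completely would shorten the run by at most 12.3%, which is in line with the 10-12% estimate quoted for caching.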

Version: v4.0.4 · Commit: 5eca7fa · Profiled: 2025-12-30