Benchmarks

The OpenGJK library is continuously tested for accuracy and performance. This page reports benchmarks of the library's runtime performance and is updated automatically each time a new version is released. It serves as a dashboard that summarizes the speedup achieved by recent development.

Latest Run

The latest benchmarks ran on a shared virtual machine (GitLab CI runner).

For the latest run, the dashboard lists the run date, the number of vCPUs, and the CPU frequency in MHz.

Live Results

OpenGJK is called for three simple scenarios:

A pair of points
A pair of random polytopes with 100 vertices each
A pair of random polytopes with 1000 vertices each
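As an illustration of what such inputs could look like, here is a minimal sketch that builds the vertex sets for the three scenarios. It assumes each polytope is the convex hull of points sampled uniformly in a unit cube, and that the second body is shifted along x so the two bodies are separated. The sampling scheme, the scenario keys, and the helper names `random_vertices` and `make_scenario` are hypothetical; the page does not say how the benchmark generates its polytopes.

```python
import random

# Hypothetical scenario table: name -> number of vertices per body.
SCENARIOS = {
    "pair_of_points": 1,
    "polytopes_100": 100,
    "polytopes_1000": 1000,
}


def random_vertices(n, offset=(0.0, 0.0, 0.0), seed=None):
    """Return n random 3D points; their convex hull defines the polytope."""
    rng = random.Random(seed)
    ox, oy, oz = offset
    return [
        (ox + rng.random(), oy + rng.random(), oz + rng.random())
        for _ in range(n)
    ]


def make_scenario(name, seed=0):
    """Build the two vertex sets for one scenario (assumed layout)."""
    n = SCENARIOS[name]
    body_a = random_vertices(n, seed=seed)
    # Shift the second body by 2 units along x so the two unit cubes
    # that contain the samples do not overlap.
    body_b = random_vertices(n, offset=(2.0, 0.0, 0.0), seed=seed + 1)
    return body_a, body_b


if __name__ == "__main__":
    for name in SCENARIOS:
        a, b = make_scenario(name)
        print(name, len(a), len(b))
```

Fixing the seed keeps the inputs identical between runs, so differences in timing can be attributed to the library version and not to the geometry.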

For each scenario, the mean, median, standard deviation, and coefficient of variation (cv) of the runtime (CPU time) are reported. The results are shown below on a logarithmic scale.
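The four reported statistics can be sketched as follows. This is not the benchmark's actual harness: the helper names `cpu_time_samples` and `summarize` are made up for illustration, and the sketch assumes the sample standard deviation (n - 1 denominator), since the page does not say which variant is used. The coefficient of variation is the standard deviation divided by the mean.

```python
import statistics
import time


def cpu_time_samples(func, repeats=1000):
    """Call func() `repeats` times; return per-call CPU times in microseconds."""
    samples = []
    for _ in range(repeats):
        start = time.process_time_ns()
        func()
        samples.append((time.process_time_ns() - start) / 1000.0)
    return samples


def summarize(samples):
    """Return mean, median, standard deviation, and cv of the samples."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (assumption)
    return {
        "mean": mean,
        "median": statistics.median(samples),
        "stdev": stdev,
        # cv is dimensionless; undefined when the mean is zero.
        "cv": stdev / mean if mean else float("nan"),
    }


if __name__ == "__main__":
    stats = summarize(cpu_time_samples(lambda: sum(range(1000)), repeats=200))
    for key, value in stats.items():
        print(f"{key}: {value:.3f}")
```

The process CPU clock can be coarse on some platforms, so for calls that take about a microsecond it is usually more reliable to time a batch of calls and divide by the batch size.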

Performance History

The heatmap below shows how performance has evolved across all tagged releases of OpenGJK. Green indicates faster execution times, red indicates slower times.

Performance Heatmap

CPU time across versions and scenarios. Green = faster, Red = slower.

Version (commit)	Pair of Points	100 Vertex Polytopes	1000 Vertex Polytopes	Release Date
v3.0.0 1b84855	1.52 μs	5.85 μs	108.5 μs	Jul 2022
v3.0.1 7a018a3	1.48 μs	5.68 μs	104.2 μs	Nov 2022
v3.0.2 866c01e	1.42 μs	5.45 μs	99.8 μs	Nov 2022
v3.0.3 9d9e8e4	1.38 μs	5.32 μs	97.5 μs	Nov 2022
v4.0.0 ad2d602	1.21 μs	4.68 μs	85.8 μs	Apr 2023
v4.0.1 855c5c0	1.19 μs	4.62 μs	84.2 μs	Dec 2023
v4.0.2 ba1d826	1.18 μs	4.55 μs	83.1 μs	Nov 2024
v4.0.3 bc0b389	1.15 μs	4.48 μs	81.5 μs	Feb 2025
v4.0.4 5eca7fa	1.12 μs	4.35 μs	79.2 μs	Dec 2025

Performance Improvement Summary

Scenario	v3.0.0 (μs)	v4.0.4 (μs)	Change in CPU time
Pair of Points	1.52	1.12	↓ 26.3%
100 Vertex Polytopes	5.85	4.35	↓ 25.6%
1000 Vertex Polytopes	108.5	79.2	↓ 27.0%

Comparing v3.0.0 (Jul 2022) to v4.0.4 (Dec 2025)
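The percentages in this summary follow from the relative reduction in CPU time, (old - new) / old. The short sketch below reproduces them from the v3.0.0 and v4.0.4 timings and also prints the equivalent speedup factor, old / new. The function names are illustrative only.

```python
def reduction_percent(before, after):
    """Relative reduction in CPU time, in percent."""
    return (before - after) / before * 100.0


def speedup(before, after):
    """Speedup factor: how many times faster the newer version is."""
    return before / after


# CPU times in microseconds: (v3.0.0, v4.0.4), taken from the summary.
results = {
    "Pair of Points": (1.52, 1.12),
    "100 Vertex Polytopes": (5.85, 4.35),
    "1000 Vertex Polytopes": (108.5, 79.2),
}

if __name__ == "__main__":
    for name, (old, new) in results.items():
        print(
            f"{name}: {reduction_percent(old, new):.1f}% less CPU time, "
            f"{speedup(old, new):.2f}x speedup"
        )
```

Note that a 26.3% reduction in time corresponds to roughly a 1.36x speedup, not 1.26x, because the two quantities use different denominators.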

Performance Trend

An alternative view showing the performance trend as a line chart.

Performance Trend Across Versions

CPU time measurements for three test scenarios. Lower values indicate better performance.

Detailed Results

Version	Pair of Points (μs)	100 Vertices (μs)	1000 Vertices (μs)	Date
v3.0.0	1.52	5.85	108.50	2022-07-09
v3.0.1	1.48	5.68	104.20	2022-11-01
v3.0.2	1.42	5.45	99.80	2022-11-23
v3.0.3	1.38	5.32	97.50	2022-11-28
v4.0.0	1.21	4.68	85.80	2023-04-22
v4.0.1	1.19	4.62	84.20	2023-12-28
v4.0.2	1.18	4.55	83.10	2024-11-26
v4.0.3	1.15	4.48	81.50	2025-02-16
v4.0.4	1.12	4.35	79.20	2025-12-29

Application Profiling

Beyond micro-benchmarks, we also profile the library in a realistic physics simulation scenario. The superb mini-application runs hundreds of collision shapes with multi-threaded physics, giving insights into real-world performance characteristics.

Application Profiling Results

Performance profile of the OpenGJK-powered physics simulation (superb app) running 800 collision shapes. Profiled on macOS.

Wall Clock Time: 3.29s
Total CPU Time: 29.58s
Threads:
CPU Utilization: 899%
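The CPU utilization figure is consistent with the usual definition: total CPU time divided by wall clock time. A value above 100% means that several cores were busy at once. A minimal check, with an illustrative function name:

```python
def cpu_utilization_percent(total_cpu_s, wall_clock_s):
    """CPU utilization as total CPU time over wall clock time, in percent."""
    return total_cpu_s / wall_clock_s * 100.0


if __name__ == "__main__":
    util = cpu_utilization_percent(29.58, 3.29)
    print(f"{util:.0f}%")  # prints "899%"
    print(f"about {util / 100.0:.2f} cores busy on average")
```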

Thread Balance

EXCELLENT: 11.0% to 11.1% of CPU time per thread.

Workload is nearly perfectly distributed across all worker threads.

CPU Time Hotspots

Function	CPU %	Time (ms)
`compute_minimum_distance`	36.0%	9520
`PhysicsWorld::workerThreadFunc`	24.0%	6360
`CollisionShapeArray::fillGJKBuffer`	12.3%	3480
`W1D`	1.5%	445
`S2D`	1.2%	355
`__sincosf_stret`	1.3%	385

Analysis Summary

The workload is nearly perfectly distributed across all 8 worker threads: each thread consumes 11.0-11.1% of total CPU time, with a uniform self-weight distribution. The main hotspot is `compute_minimum_distance`, at 36% of total CPU time. Suggested optimizations are caching the GJK buffer data (potential 10-12% speedup) and enhanced SIMD vectorization (potential 15-20% speedup).
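One way to sanity-check such estimates is Amdahl's law: if a fraction p of the runtime is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The sketch below applies it to the hotspot percentages, under the assumption that each CPU % value is the share of total CPU time spent exclusively in that function; the page does not say whether these are self or inclusive times. The 2x local speedup for `compute_minimum_distance` is a hypothetical input, not a measured result.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: overall speedup when `fraction` of the runtime
    is made `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def time_saved_percent(fraction, local_speedup):
    """Share of the total runtime removed by the optimization, in percent."""
    return (1.0 - 1.0 / overall_speedup(fraction, local_speedup)) * 100.0


if __name__ == "__main__":
    # Upper bound for caching: fillGJKBuffer (12.3%) removed entirely.
    print(f"{time_saved_percent(0.123, float('inf')):.1f}% time saved at most")
    # Hypothetical: SIMD makes compute_minimum_distance (36.0%) twice as fast.
    print(f"{time_saved_percent(0.36, 2.0):.1f}% time saved")
    print(f"{overall_speedup(0.36, 2.0):.2f}x overall speedup")
```

Under these assumptions, eliminating the buffer fill entirely would save at most 12.3% of the CPU time, and a 2x faster distance kernel would save 18%, which is in the same range as the estimates quoted in the summary.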

Version: v4.0.4 · Commit: 5eca7fa · Profiled: 2025-12-30
