Consider setting -O3 in Makefile/CMakeLists.txt for BASOP
Basic info
In theory, since BASOP does not use floating-point operations, enabling a compiler optimization level should not change the output; it should only speed up the code.
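One way to check that assumption in practice is to compare the output of an optimized build against the `-O0` output byte for byte. The snippet below is only a minimal sketch: the helper name `outputs_bit_exact` is made up for illustration and is not part of the IVAS scripts, and a whole-file comparison is stricter than comparing audio samples alone because it also covers the WAV header.

```python
import filecmp


def outputs_bit_exact(path_a, path_b):
    """Return True if both files contain exactly the same bytes.

    Hypothetical helper for illustration; not part of the IVAS repository.
    """
    # shallow=False forces a real content comparison instead of
    # trusting file size and modification time from os.stat().
    return filecmp.cmp(path_a, path_b, shallow=False)
```

It could be called on two renderer outputs, for example `outputs_bit_exact("tmp_O0.wav", "tmp_O3.wav")`, once both builds have been run with the same arguments.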
Sample run with one of the worst-case renderer tests (outputs were bit-exact, BE):
Default Makefile setting -O0:
❯ hyperfine "./IVAS_rend_O0 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O0.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv"
Benchmark 1: ./IVAS_rend_O0 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O0.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv
Time (mean ± σ): 160.463 s ± 0.282 s [User: 160.440 s, System: 0.015 s]
Range (min … max): 160.189 s … 161.101 s 10 runs
Modified with -O3:
❯ hyperfine "./IVAS_rend_O3 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O3.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv"
Benchmark 1: ./IVAS_rend_O3 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O3.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv
Time (mean ± σ): 33.634 s ± 0.182 s [User: 33.578 s, System: 0.010 s]
Range (min … max): 33.350 s … 34.087 s 10 runs
Modified with -O2:
❯ hyperfine "./IVAS_rend_O2 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O2.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv"
Benchmark 1: ./IVAS_rend_O2 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O2.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv
Time (mean ± σ): 40.028 s ± 0.299 s [User: 39.969 s, System: 0.013 s]
Range (min … max): 39.579 s … 40.529 s 10 runs
Modified with -O1:
❯ hyperfine "./IVAS_rend_O1 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O1.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv"
Benchmark 1: ./IVAS_rend_O1 -i scripts/testv/stv4ISM48n.wav -im scripts/testv/stvISM1.csv scripts/testv/stvISM2.csv scripts/testv/stvISM3.csv scripts/testv/stvISM4.csv -o tmp_O1.wav -if ISM4 -of BINAURAL_ROOM_IR -fr 5 -T scripts/trajectories/full_circle_in_15s.csv
Time (mean ± σ): 53.935 s ± 0.310 s [User: 54.079 s, System: 0.014 s]
Range (min … max): 53.154 s … 54.414 s 10 runs
System specs: Debian 12, gcc 12.2.0, Intel(R) Core(TM) i7-10700 CPU @ 2.90GHz
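To make the comparison easier to read, the mean times reported by hyperfine above can be turned into speedup factors relative to `-O0`. This is just arithmetic on the numbers already listed; the variable names are chosen for illustration.

```python
# Mean wall-clock times in seconds, copied from the hyperfine runs above.
mean_seconds = {
    "-O0": 160.463,
    "-O1": 53.935,
    "-O2": 40.028,
    "-O3": 33.634,
}

baseline = mean_seconds["-O0"]

# Speedup factor of each optimization level relative to the -O0 build.
speedup = {flag: baseline / t for flag, t in mean_seconds.items()}

for flag, factor in speedup.items():
    print(f"{flag}: {mean_seconds[flag]:8.3f} s  {factor:.2f}x")
# -O0:  160.463 s  1.00x
# -O1:   53.935 s  2.98x
# -O2:   40.028 s  4.01x
# -O3:   33.634 s  4.77x
```

On this machine and test case, `-O1` already gives roughly a 3x speedup, and `-O3` is about 4.8x faster than the default `-O0` build.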
Edited by Archit Tamarapu