Recently I checked Profile-Guided Optimization (PGO) improvements on multiple projects. The results are here. Since PGO improves performance in many projects, I think it is worth trying to optimize difftastic with PGO.
I already did some benchmarks and want to share my results.
Test environment
Fedora 38
Linux kernel 6.5.6
AMD Ryzen 9 5900x
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler - Rustc 1.59
Difftastic version: the latest master branch at the time of testing, commit 21ed3ec48b383511b08ffe20cc91697af8f64d78
Disabled Turbo boost
Benchmark
For benchmarking, I use difft difftastic/sample_files/dir_before/ difftastic/sample_files/dir_after/, which reflects typical difftastic usage. For the PGO training phase, I use exactly the same command. The release version is built with cargo build --release, and the PGO build (instrumentation and optimization phases) is done with cargo-pgo.
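For anyone who wants to reproduce this, the build steps can be sketched roughly as below. This is a sketch under assumptions, not the exact script used for the numbers in this issue: it assumes it is run from the root of a difftastic checkout, that the binary is named difft, that the host target triple is x86_64-unknown-linux-gnu, and that cargo-pgo and the llvm-tools-preview component are already installed. The run helper, pgo_workflow and the DRY_RUN variable are hypothetical names introduced only for this example.

```shell
#!/usr/bin/env bash
# Sketch of a release build plus a PGO build with cargo-pgo.
# Assumed prerequisites (not run here):
#   cargo install cargo-pgo
#   rustup component add llvm-tools-preview
set -eu

# Assumption: cargo-pgo places its artifacts under the host target triple.
TARGET_DIR="target/x86_64-unknown-linux-gnu/release"
# Assumption: sample inputs relative to the difftastic checkout root.
BEFORE="sample_files/dir_before/"
AFTER="sample_files/dir_after/"

# Hypothetical helper: print each command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

pgo_workflow() {
    # 1. Baseline: plain release build.
    run cargo build --release
    run cp target/release/difft ./difft_release

    # 2. Instrumentation phase: build a binary that records profiles.
    run cargo pgo build

    # 3. Training phase: run the instrumented binary on the benchmark workload.
    run "$TARGET_DIR/difft" "$BEFORE" "$AFTER"

    # 4. Optimization phase: rebuild using the collected profiles.
    run cargo pgo optimize
    run cp "$TARGET_DIR/difft" ./difft_optimized
}
```

Calling pgo_workflow runs the whole sequence; DRY_RUN=1 pgo_workflow only prints the commands, which is handy for checking paths before a long build.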
Results
I got the following results, where difft_release is the Release binary and difft_optimized is the Release + PGO binary:
hyperfine --warmup 10 --min-runs 50 './difft_release ../difftastic/sample_files/dir_before/ ../difftastic/sample_files/dir_after/ > /dev/null' './difft_optimized ../difftastic/sample_files/dir_before/ ../difftastic/sample_files/dir_after/ > /dev/null'
Benchmark 1: ./difft_release ../difftastic/sample_files/dir_before/ ../difftastic/sample_files/dir_after/ > /dev/null
Time (mean ± σ): 384.2 ms ± 5.2 ms [User: 288.5 ms, System: 126.8 ms]
Range (min … max): 373.6 ms … 396.9 ms 50 runs
Benchmark 2: ./difft_optimized ../difftastic/sample_files/dir_before/ ../difftastic/sample_files/dir_after/ > /dev/null
Time (mean ± σ): 354.7 ms ± 4.1 ms [User: 257.7 ms, System: 127.3 ms]
Range (min … max): 347.0 ms … 362.7 ms 50 runs
Summary
./difft_optimized ../difftastic/sample_files/dir_before/ ../difftastic/sample_files/dir_after/ > /dev/null ran
1.08 ± 0.02 times faster than ./difft_release ../difftastic/sample_files/dir_before/ ../difftastic/sample_files/dir_after/ > /dev/null
At least in the scenario above, PGO improves performance: the PGO-optimized binary is about 1.08 times faster than the plain release binary.
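As a sanity check, the speedup figure can be recomputed from the two reported means and standard deviations. The snippet below is a small sketch that assumes the uncertainty of the ratio follows standard first-order error propagation for a quotient of independent measurements; the speedup function name is hypothetical and introduced only for this example.

```shell
# speedup MEAN_A SD_A MEAN_B SD_B
# Prints how many times faster B is than A (ratio A/B) with a propagated
# standard deviation, assuming the two measurements are independent.
speedup() {
    awk -v a="$1" -v sa="$2" -v b="$3" -v sb="$4" 'BEGIN {
        r  = a / b
        sd = r * sqrt((sa / a)^2 + (sb / b)^2)
        printf "%.2f +/- %.2f\n", r, sd
    }'
}

# Means and standard deviations (ms) taken from the hyperfine output above.
speedup 384.2 5.2 354.7 4.1   # prints "1.08 +/- 0.02"
```

The result agrees with the summary line printed by hyperfine.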
Further steps
I can suggest the following action points:
Perform more PGO benchmarks on difftastic. If they show improvements, add a note about the performance gains that PGO can bring to difftastic.
Provide an easier way (e.g. a build option) to build difftastic with PGO. This would help end users and maintainers, since they would be able to optimize difftastic for their own workloads.
Optimize pre-built binaries
Testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT in addition to PGO), but I recommend starting with regular PGO.
Here are some examples of how PGO optimization is integrated in other projects: