Performance regression between 1.78 and 1.79 #133082
Comments
One side effect of the PR in question is that it changes how aggressive the MIR inliner is. Since you've also commented that you have long call chains with lots of small functions, I suspect that the performance of your code is highly sensitive to inlining. If I'm right, you can emulate this effect on 1.78 by increasing the MIR inlining thresholds above their defaults. Similarly, you could try turning the MIR inliner off in 1.79. But if the relevant optimization difference is in the precompiled standard library, you'd need to do all of this while rebuilding the standard library as well.

There is really not a lot we can do without a reproducer, and even with one, if this boils down to "this code requires a specific MIR inliner behavior", we are likely to take action only if that behavior can be extracted into an inliner heuristic that can stand on its own merits.
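The exact flag values from the comment above were lost in this copy of the thread. As a rough sketch only (assuming the unstable `-Zinline-mir*` rustc flags and cargo's `-Zbuild-std`; the threshold numbers below are placeholders, not the quoted defaults), the experiments could look like this:

```sh
# Sketch only: -Z flags are unstable, so they need a nightly toolchain (or
# RUSTC_BOOTSTRAP=1 on stable), and the threshold values here are placeholders.

# On 1.78: make the MIR inliner more aggressive to approximate the 1.79 behaviour.
RUSTFLAGS="-Z inline-mir-threshold=100 -Z inline-mir-hint-threshold=200" cargo bench

# On 1.79: turn the MIR inliner off entirely.
RUSTFLAGS="-Z inline-mir=no" cargo bench

# If the difference comes from the precompiled standard library, rebuild std with
# the same flags (nightly cargo; -Z build-std needs an explicit --target).
RUSTFLAGS="-Z inline-mir=no" cargo +nightly bench -Z build-std --target x86_64-unknown-linux-gnu
```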
Huh, when I heard "long call chains with lots of small functions" I was expecting this to be caused by #127113, not by #123949. I agree with Ben that it's really hard to do anything without a reproducer, though. All the complicated interactions between the MIR inliner and LLVM's inliner make it really hard to predict what any change will actually do, and all the thresholds mean that tiny changes sometimes have no effect and sometimes have knife-edge effects.
(Sorry, I didn't notice this was opened 12 hours ago.)
We were upgrading from Rust 1.78 to 1.79 (and later 1.81) and discovered a performance regression in our applications of up to 10% in certain cases.

In our benchmarks we use iai-callgrind to measure instruction counts, and we have seen a significant increase there. Unfortunately, I can't provide the exact code because it is proprietary, but by its nature it has long call chains with lots of small functions.
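For illustration only (the reporter's code is proprietary, so the names and bodies below are made up): code with this shape is the kind whose measured instruction counts can swing noticeably depending on how aggressively the MIR inliner flattens the chain before handing it to LLVM.

```rust
use std::hint::black_box;

// Hypothetical stand-in for a long call chain of tiny functions.
fn parse(x: u64) -> u64 { validate(x) + 1 }
fn validate(x: u64) -> u64 { normalize(x) ^ 0xA5 }
fn normalize(x: u64) -> u64 { scale(x).rotate_left(3) }
fn scale(x: u64) -> u64 { x.wrapping_mul(31) }

fn main() {
    // Whether these four layers collapse into straight-line code depends on the
    // combined inlining decisions of the MIR inliner and LLVM's inliner.
    let result: u64 = (0..1_000u64).map(|i| parse(black_box(i))).sum();
    println!("{result}");
}
```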
The same issue was reported here.
I have bisected between 1.78 and 1.79 and found that the following change causes the performance regression: 3412f01#diff-deee82aaf9baf43ab05d939355f6249fdacf8959bc0e06c9574283453f838ee9R702
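For reference, this kind of bisection can be done with the cargo-bisect-rustc tool. A minimal sketch, assuming a hypothetical regress.sh script that builds the benchmark and exits non-zero when the instruction count exceeds the 1.78 baseline (check the tool's documentation for the exact options your version supports):

```sh
# Sketch only: regress.sh is a hypothetical helper script.
cargo install cargo-bisect-rustc
cargo bisect-rustc --start 1.78.0 --end 1.79.0 --script ./regress.sh
```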
Release cargo profile
Workaround
Adding `debug = 1` to the release profile essentially disables this change and, as expected, the performance degradation is no longer there.
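For anyone who wants to try the same workaround, this is what it looks like in Cargo.toml (a minimal sketch; everything else in the release profile is left at its defaults):

```toml
[profile.release]
# Reporter's workaround: emitting limited (line-level) debug info in release
# builds avoids the regression described above.
debug = 1
```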