Modifies converse scheduler to prioritize NodeGroup messages #3676
base: main
Conversation
Modifies converse scheduler's getNextMessage so nodeGroup messages can run with higher priority over local messages. As it is, nodeGroup messages are not checked until all local and regular Charm queue (prio Q) messages have been checked, which causes issues when the application is using nodeGroup messages in the hope that *some* PE will attend to them quickly. The change makes getNextMessage check the nodeGroup queue every 2^nodeGrpFreq iterations at high priority, in addition to its usual check after exhausting the local queues (except the task Q). This commit has not been tested at all, but I am pushing it to allow others to help me test/fix it.
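A minimal sketch of the intended cadence follows (nodeGrpFreq is the new knob; the power-of-two mask is one possible implementation, not necessarily what the commit does):

```c
#if CMK_NODE_QUEUE_AVAILABLE
  /* Sketch only: s->iter counts scheduler iterations (see the diff
   * below). When the low nodeGrpFreq bits of the counter are all zero,
   * i.e. once every 2^nodeGrpFreq iterations, poll the node queue
   * ahead of the local queues. nodeGrpFreq == 0 is assumed to disable
   * the early check. */
  if (nodeGrpFreq != 0 &&
      (s->iter & ((1u << nodeGrpFreq) - 1)) == 0) {
    void *nodeMsg = NULL;
    /* ...dequeue from the node-level queue here... */
    if (nodeMsg != NULL) return nodeMsg;
  }
#endif
```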
One thing to worry about is whether this change causes performance degradation by making the scheduler check the node queue too often (depending on whether the check is expensive, even for an empty queue, because of locking). It would be nice if someone could run a performance regression test.
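If the check does turn out to be expensive, one mitigation is to guard the lock with an unlocked emptiness test and a try-lock (a sketch; CsvAccess(CsdNodeQueue) and CsdNodeQueueLock are assumed to be the relevant node-queue globals):

```c
#if CMK_NODE_QUEUE_AVAILABLE
  /* Sketch: the unlocked CqsEmpty() test keeps the common empty case
   * to a single read, and CmiTryLock() avoids blocking when another
   * PE holds the node-queue lock; the scheduler just falls through. */
  void *nodeMsg = NULL;
  if (!CqsEmpty(CsvAccess(CsdNodeQueue)) &&
      CmiTryLock(CsvAccess(CsdNodeQueueLock)) == 0) {
    CqsDequeue(CsvAccess(CsdNodeQueue), (void **)&nodeMsg);
    CmiUnlock(CsvAccess(CsdNodeQueueLock));
  }
  if (nodeMsg != NULL) return nodeMsg;
#endif
```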
@ericjbohm @ericmikida @ZwFink review please.
I thought this was already merged. @ZwFink @ericjbohm, will you please take a look?
@lvkale this is the code we discussed during the meeting today.
Sorry for the spam, but where is the meeting usually announced?
Should I consider all the older unresolved comments here as acceptably resolved?
For this to be mergeable, it should no longer be a draft, and the reviewer comments should be addressed and resolved. @lvkale
Was that performance analysis done? If so, what were the results?
Another possibly interesting point: we run with 6 virtual nodes per physical node on our 192-cores-per-node machine because of better communication. This seems counterintuitive given the model I think Sanjay described, in which intranode communication shouldn't be affected by the comm thread. Maybe there's a lot of locking going on?
src/conv-core/convcore.C
Outdated
s->iter++;

#if CMK_NODE_QUEUE_AVAILABLE
// we use nodeGrpFreq == 0 to mean |
Indentation
@@ -1720,7 +1722,19 @@ void CsdSchedulerState_new(CsdSchedulerState_t *s)
*/
void *CsdNextMessage(CsdSchedulerState_t *s) {
void *msg;
if((*(s->localCounter))-- >0) |
The new code should be below the localCounter branch, right above the CmiGetNonLocal() call. The default CsdLocalMax is 0 so normally it won't matter, but if the user says to prioritize local messages then they should come first.
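Roughly this shape (a sketch; the surrounding lines follow the current function structure from the hunk above, and nodeGrpFreq is the patch's knob):

```c
void *CsdNextMessage(CsdSchedulerState_t *s) {
  void *msg;
  s->iter++;
  /* localCounter branch: while CsdLocalMax credit remains, serve the
   * PE-local FIFO and the prioritized scheduler queue first. */
  if ((*(s->localCounter))-- > 0) {
    msg = CdsFifo_Dequeue(s->localQ);
    if (msg != NULL) return msg;
    CqsDequeue(s->schedQ, (void **)&msg);
    if (msg != NULL) return msg;
  }
  *(s->localCounter) = CsdLocalMax;
#if CMK_NODE_QUEUE_AVAILABLE
  /* Suggested placement: the periodic node-queue poll goes here,
   * after the localCounter branch and before CmiGetNonLocal(). */
  if (nodeGrpFreq != 0 && (s->iter & ((1u << nodeGrpFreq) - 1)) == 0) {
    /* ...node-queue dequeue as sketched elsewhere in this thread... */
  }
#endif
  if (NULL != (msg = CmiGetNonLocal())) return msg;
  /* ...remaining queues unchanged... */
  return NULL;
}
```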
@@ -1720,7 +1722,19 @@ void CsdSchedulerState_new(CsdSchedulerState_t *s)
*/
The long expository comment preceding the CsdNextMessage definition should be updated to match the new behavior, as well as any other behavior it doesn't currently describe, such as CsdLocalMax causing the PE's on-node FIFO (localQ) and scheduler queues to be queried first.
Even better, break up the exposition so that the explanations are adjacent to the code they are intended to describe.
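For instance, something like this comment layout (wording illustrative only):

```c
void *CsdNextMessage(CsdSchedulerState_t *s) {
  /* Stage 1: for up to CsdLocalMax consecutive calls, serve the PE's
   * on-node FIFO (localQ) and the prioritized scheduler queue. */
  /* ... */
  /* Stage 2 (new): every 2^nodeGrpFreq iterations, poll the
   * node-level queue at high priority. */
  /* ... */
  /* Stage 3: non-local messages via CmiGetNonLocal(), then the
   * remaining queues as before. */
  /* ... */
}
```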
Modifies converse scheduler's getNextMessage so nodeGroup messages can run with higher priority over local
closes #3674