Modifies converse scheduler to prioritize NodeGroup messages #3676
base: main
Conversation
Modifies converse scheduler's getNextMessage so nodeGroup messages can run with higher priority over local messages. As it is, nodeGroup messages are not checked until all local and regular Charm queue (prio Q) messages have been checked, which causes issues when the application is using nodeGroup messages in the hope that *some* PE will attend to them quickly. The change makes getNextMessage check the nodeGroup queue every 2^nodeGrpFreq iterations at high priority, in addition to its usual check after exhausting the local queues (except the task Q). This commit has not been tested at all, but I am pushing it to allow others to help me test/fix it.
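A minimal sketch of the intended cadence follows (nodeGrpFreq is the new knob; the power-of-two mask is one possible implementation, not necessarily what the commit does):

```c
#if CMK_NODE_QUEUE_AVAILABLE
  /* Sketch only: s->iter counts scheduler iterations (see the diff
   * below). When the low nodeGrpFreq bits of the counter are all zero,
   * i.e. once every 2^nodeGrpFreq iterations, poll the node queue
   * ahead of the local queues. nodeGrpFreq == 0 is assumed to disable
   * the early check. */
  if (nodeGrpFreq != 0 &&
      (s->iter & ((1u << nodeGrpFreq) - 1)) == 0) {
    void *nodeMsg = NULL;
    /* ...dequeue from the node-level queue here... */
    if (nodeMsg != NULL) return nodeMsg;
  }
#endif
```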
One thing to worry about is whether this change causes performance degradation by making the scheduler check the node queue too often (depending on whether the check is expensive, even for an empty queue, because of locking). It would be nice if someone could run a performance regression test.
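If the check does turn out to be expensive, one mitigation is to guard the lock with an unlocked emptiness test and a try-lock (a sketch; CsvAccess(CsdNodeQueue) and CsdNodeQueueLock are assumed to be the relevant node-queue globals):

```c
#if CMK_NODE_QUEUE_AVAILABLE
  /* Sketch: the unlocked CqsEmpty() test keeps the common empty case
   * to a single read, and CmiTryLock() avoids blocking when another
   * PE holds the node-queue lock; the scheduler just falls through. */
  void *nodeMsg = NULL;
  if (!CqsEmpty(CsvAccess(CsdNodeQueue)) &&
      CmiTryLock(CsvAccess(CsdNodeQueueLock)) == 0) {
    CqsDequeue(CsvAccess(CsdNodeQueue), (void **)&nodeMsg);
    CmiUnlock(CsvAccess(CsdNodeQueueLock));
  }
  if (nodeMsg != NULL) return nodeMsg;
#endif
```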
@ericjbohm @ericmikida @ZwFink review please.
I thought this was already merged. @ZwFink @ericjbohm, will you please take a look?
@lvkale this is the code we discussed during the meeting today.
Sorry for the spam, but where is the meeting usually announced?
Should I consider all the older unresolved comments here as acceptably resolved?
For this to be mergeable, it should no longer be a draft, and the reviewer comments should be addressed and resolved. @lvkale
Was that performance analysis done? If so, what were the results?
Another possibly interesting point: we run with 6 virtual nodes per physical node on our 192-cores-per-node machine because of better communication. This seems counterintuitive given the model I think Sanjay described, in which intranode communication shouldn't be affected by the comm thread. Maybe there's a lot of locking going on?
src/conv-core/convcore.C
Outdated
s->iter++;

#if CMK_NODE_QUEUE_AVAILABLE
// we use nodeGrpFreq == 0 to mean |
Indentation
@@ -1720,7 +1722,19 @@ void CsdSchedulerState_new(CsdSchedulerState_t *s)
*/
void *CsdNextMessage(CsdSchedulerState_t *s) {
void *msg;
if((*(s->localCounter))-- >0) |
The new code should be below the localCounter branch, right above the CmiGetNonLocal() call. The default CsdLocalMax is 0 so normally it won't matter, but if the user says to prioritize local messages then they should come first.
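Roughly this shape (a sketch; the surrounding lines follow the current function structure from the hunk above, and nodeGrpFreq is the patch's knob):

```c
void *CsdNextMessage(CsdSchedulerState_t *s) {
  void *msg;
  s->iter++;
  /* localCounter branch: while CsdLocalMax credit remains, serve the
   * PE-local FIFO and the prioritized scheduler queue first. */
  if ((*(s->localCounter))-- > 0) {
    msg = CdsFifo_Dequeue(s->localQ);
    if (msg != NULL) return msg;
    CqsDequeue(s->schedQ, (void **)&msg);
    if (msg != NULL) return msg;
  }
  *(s->localCounter) = CsdLocalMax;
#if CMK_NODE_QUEUE_AVAILABLE
  /* Suggested placement: the periodic node-queue poll goes here,
   * after the localCounter branch and before CmiGetNonLocal(). */
  if (nodeGrpFreq != 0 && (s->iter & ((1u << nodeGrpFreq) - 1)) == 0) {
    /* ...node-queue dequeue as sketched elsewhere in this thread... */
  }
#endif
  if (NULL != (msg = CmiGetNonLocal())) return msg;
  /* ...remaining queues unchanged... */
  return NULL;
}
```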
@@ -1720,7 +1722,19 @@ void CsdSchedulerState_new(CsdSchedulerState_t *s)
*/
The long expository comment preceding the CsdNextMessage definition should be updated to match the new behavior, as well as any other behavior it doesn't currently describe, such as CsdLocalMax causing the PE's on-node FIFO (localQ) and scheduler queues to be queried first.
Even better, break up the exposition so that the explanations are adjacent to the code they are intended to describe.
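For instance, something like this comment layout (wording illustrative only):

```c
void *CsdNextMessage(CsdSchedulerState_t *s) {
  /* Stage 1: for up to CsdLocalMax consecutive calls, serve the PE's
   * on-node FIFO (localQ) and the prioritized scheduler queue. */
  /* ... */
  /* Stage 2 (new): every 2^nodeGrpFreq iterations, poll the
   * node-level queue at high priority. */
  /* ... */
  /* Stage 3: non-local messages via CmiGetNonLocal(), then the
   * remaining queues as before. */
  /* ... */
}
```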
Modifies converse scheduler's getNextMessage so nodeGroup messages can run with higher priority over local
closes #3674