Nothing Special   »   [go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix data race on stats due to an access w/o acquiring the lock #573

Closed
wants to merge 1 commit into from
Closed

Conversation

danielschemmel
Copy link
Contributor

In drive_machine, the stats_state is accessed without holding the STATS_LOCK, which causes data races against other threads that mutate the stats even while they are holding the lock.

This PR proposes to keep the time that this lock is held to a minimum by acquiring the lock before the if and releasing it after potentially updating stats. It would also be possible to acquire the lock twice: once to compute the condition, and once to increment the count of rejected connections.

This behavior was detected using Symbolic Execution techniques developed in collaboration by the SYMBIOSYS research project at COMSYS, RWTH Aachen University and Cesar Rodriguez (Diffblue, Ltd.). This research is supported by the European Research Council (ERC) under the EU's Horizon 2020 Research and Innovation Programme grant agreement n. 647295 (SYMBIOSYS).

@danielschemmel
Copy link
Contributor Author

Note: This PR tackles a different issue than #575.

@dormando
Copy link
Member

On 64-bit x86 systems the dirty read there should be fine, but that is wrong for 32bit.

Throwing that global lock around every connect is a non-starter. At very least the maxconns_fast check should happen unlocked, then either atomics or a separate lock for curr_conns should be used.

@danielschemmel
Copy link
Contributor Author

I have changed the PR to only lock if maxconns_fast is false.

If you give me a few pointers how you would like the additional lock for stats_state.curr_conns and stats_state.reserved_fds to look like, I am willing to try that variant that as well. To attack this problem with atomics, we would need to combine those two variables, as I am not aware of a way to perform the addition with both variables being loaded in a single atomic step.

memcached.c Outdated
res = write(sfd, str, strlen(str));
close(sfd);
bool reject;
if(settings.maxconns_fast) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spacing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is now a space between the if and the condition.

@dormando
Copy link
Member

reserved_fds is static before any listener sockets are opened, which should make that easier. I'm not sure off the top of my head if there's a better way to fix it. The way you've patched it is probably fine since the STATS_LOCK() was there before, and removing the branch from within the lock should help keep it short.

@dormando
Copy link
Member

out in 1.5.21, thanks!

@dormando dormando closed this Jan 21, 2020
@dormando
Copy link
Member
dormando commented Mar 5, 2020

For fuck's sake, how did I miss this..

This logic completely breaks the original feature; now every connection is counted as a rejection. The maxconns test doesn't cover rejections being counted properly and I was too distracted to check the tests here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants