A system call for random numbers: getrandom()
Posted Jul 25, 2014 22:13 UTC (Fri) by giraffedata (guest, #1954)
In reply to: A system call for random numbers: getrandom() by jimparis
Parent article: A system call for random numbers: getrandom()
> An attacker might exhaust file descriptors maliciously, just to get some software to pick a bad random number,
How would exhausting file descriptors get some software to pick a bad random number? The natural result of that would be for software that uses random numbers to refuse to continue.
But regardless of whether it's a valid expectation of the attacker, it doesn't explain why LibreSSL needs to have a fallback other than "return -1" for exhausted file descriptors. No other software does.
Posted Jul 25, 2014 23:41 UTC (Fri)
by dlang (guest, #313)
if the program zeroes a buffer, then tries to read random data into that buffer and doesn't check the error codes properly, the result is that it continues on with zeros instead of its random seed.
This is an advantage for the bad guy.
Yes, in theory this is handled by properly checking all error conditions, but in practice we all know that such checks are not always done.
Also, note that shutting down the service is a DoS, which is also to the bad guy's advantage.
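The failure mode described above can be sketched in C. This is an illustrative example, not LibreSSL code; the function name `get_seed_careless` and the path parameter are invented for the sketch. The buffer is zeroed first, the result of read() is ignored, and the caller ends up with an all-zero "seed" while believing everything succeeded.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Illustrative sketch, not LibreSSL code: a careless seeding routine
 * that zeroes its buffer and never checks whether the read worked. */
static int get_seed_careless(const char *dev, unsigned char *seed, size_t len)
{
    memset(seed, 0, len);              /* buffer starts out all zeros */
    int fd = open(dev, O_RDONLY);
    if (fd >= 0) {
        read(fd, seed, len);           /* return value silently ignored */
        close(fd);
    }
    /* If open() failed -- say, with EMFILE because file descriptors are
     * exhausted -- the caller proceeds with an all-zero "seed" and no
     * error in sight. */
    return 0;                          /* reports success regardless */
}
```

When open() fails (EMFILE under descriptor exhaustion, or ENOENT in a chroot with no /dev/urandom), the routine still reports success, which is exactly the advantage the bad guy is after.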
Posted Jul 26, 2014 1:42 UTC (Sat)
by giraffedata (guest, #1954)
So that still doesn't shed any light on how the fact that file descriptors could be exhausted means LibreSSL needs a fallback method of generating random numbers. LibreSSL does check the error condition -- that's how it knows to fall back.
And yet, no other program under the sun avoids DoS attacks by working around inability to open files. In fact, the program using LibreSSL most probably uses files other than /dev/urandom, so the bad guy can kill it by exhausting file descriptors regardless of what LibreSSL does.
It looks to me like the article is simply mistaken about the relevance of file descriptor exhaustion attacks. I think the reason LibreSSL has alternatives to /dev/urandom is that /dev/urandom might just be broken or not implemented on that system.
Posted Jul 26, 2014 4:03 UTC (Sat)
by jake (editor, #205)
so, this comment that was quoted in the article:
> or consider providing a new failsafe API which
> works in a chroot or when file descriptors are exhausted.
(which comes from the LibreSSL source) was not enough to convince you that the LibreSSL folks (at least) are worried about file descriptor exhaustion?
> I think the reason LibreSSL has alternatives to /dev/urandom is
> that /dev/urandom might just be broken or not implemented on that
> system.
interesting, but it certainly isn't what they *say* ...
jake
Posted Jul 26, 2014 15:55 UTC (Sat)
by giraffedata (guest, #1954)
OK, I missed that. So the article is not mistaken. It's more like the developers were really confused, thinking it's worth adding a whole new system call to the kernel just to make a program progress a little further before succumbing to file descriptor exhaustion. Or there's some totally nonobvious attack vector I'm missing.
(I do understand that there are other, sensible, reasons to have getrandom()).
Posted Jul 26, 2014 21:18 UTC (Sat)
by dlang (guest, #313)
well, that sort of thinking is par for the course for people who get tightly absorbed into security thinking. They start to see the small things that can fail and forget that the overall system is probably going to be down first.
Posted Jul 27, 2014 11:57 UTC (Sun)
by gioele (subscriber, #61675)
Is it that hard to create a side program that uses some technique to force the exhaustion of fds during the entropy gathering (to create some weakness in a cryptographic step) and then stops, leaving the attacked programs with plenty of fds, as if nothing ever happened?
Posted Jul 27, 2014 16:11 UTC (Sun)
by giraffedata (guest, #1954)
It doesn't matter because even if it's possible to create such a program, it's impossible for it to achieve its goal of creating weakness in a cryptographic step if LibreSSL refuses to proceed when the open of /dev/urandom fails.
That's what we've been talking about: the design choice of LibreSSL refusing to proceed in that case (the easy, natural, conventional thing to do) versus getting random numbers in some way that doesn't require file descriptors (which involves wishing for a new kind of system call) and proceeding.
Posted Jul 27, 2014 17:49 UTC (Sun)
by jimparis (guest, #38647)
But what does "refuse to proceed" mean? Return an easily-ignored error code? Terminate the process? Sit in a busy loop? You'll get different answers based on who you ask. I generally agree with your point, but it's not as simple as you make it out to be. Making it so that the problem can never occur is just another way of fixing it.
Posted Jul 28, 2014 22:50 UTC (Mon)
by giraffedata (guest, #1954)
It really doesn't matter that there are options, because at least one of them is an entirely reasonable response to a catastrophic failure such as file descriptor exhaustion - a more reasonable response than designing a new kernel interface or computing entropy some other way. As a practical matter, I think it's obvious in this case that "refuse to proceed" should just mean "return -1" when the open fails, which would ultimately cause LibreSSL to return failure to the user instead of creating a connection. The user can ignore that failure, but there's no way he can leak private information to an eavesdropper over a connection that does not exist.
I'm really just asking why a developer would single out this one particular catastrophic failure for heroic action to avoid it. I'll bet the same code allocates memory in various places and just "refuses to proceed" if the allocation fails. And at some point it creates a socket and likely just "refuses to proceed" if that fails because of file descriptor exhaustion. Maybe it even uses a temporary file somewhere, and just "refuses to proceed" if the filesystem is full.
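For concreteness, "refuse to proceed" in this sense is nothing more than the following C sketch (a hypothetical illustration, not actual LibreSSL code; the name `get_seed_strict` is invented): every failure, including an open() that dies of file descriptor exhaustion, simply propagates -1 to the caller.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical sketch, not actual LibreSSL code: "refuse to proceed"
 * just means every failure is reported to the caller as -1. */
static int get_seed_strict(const char *dev, unsigned char *seed, size_t len)
{
    int fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -1;                     /* fd exhaustion lands here: fail */

    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, seed + got, len - got);
        if (n <= 0) {
            close(fd);
            return -1;                 /* short or failed read: also fail */
        }
        got += (size_t)n;
    }
    close(fd);
    return 0;
}
```

The caller sees -1, gives up on the operation, and no connection ever exists for an eavesdropper to exploit - exactly the "easy, natural, conventional thing to do."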
Posted Jul 28, 2014 23:13 UTC (Mon)
by jimparis (guest, #38647)
> As a practical matter, I think it's obvious in this case that "refuse to proceed" should just mean "return -1" when the open fails, which would ultimately cause LibreSSL to return failure to the user instead of creating a connection.
This has nothing to do with "creating a connection"; existing code calls RAND_bytes() all the time for all sorts of things and doesn't always check the return code.
> I'm really just asking why a developer would single out this one particular catastrophic failure for heroic action to avoid it.
Because this is only a problem on Linux. Because the discussion was triggered by an article entitled "LibreSSL's PRNG is Unsafe on Linux". Because, as a developer points out in the comments there, "we really want to see linux provide the getentropy() syscall, which fixes all the mentioned issues."
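The interface the thread is asking for did eventually ship. Here is a minimal sketch using getrandom() as it landed in Linux 3.17 (exposed by glibc 2.25+ via sys/random.h); the `fill_random` wrapper is an invented name for illustration. No file descriptor is involved at any point, so descriptor exhaustion cannot interfere.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/random.h>   /* getrandom(): Linux 3.17+, glibc 2.25+ */

/* Minimal sketch: fill a buffer with kernel randomness without ever
 * opening a file, so fd exhaustion cannot affect it. The wrapper name
 * fill_random is hypothetical, chosen for this example. */
static int fill_random(unsigned char *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = getrandom(buf + got, len - got, 0);
        if (n < 0)
            return -1;    /* e.g. ENOSYS on pre-3.17 kernels */
        got += (size_t)n;
    }
    return 0;
}
```

With flags set to 0, getrandom() draws from the urandom pool and blocks only until that pool has been initialized once at boot, which also closes the early-boot weakness the cited article describes.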