Kubernetes and talos services crashing under memory pressure #8123
Comments
Node conditions on crash.
Talos services have some cgroup reservations, so it would be nice to see the logs from around the crash, as the cause might be something else. By the way, the node conditions look good.
Andrey recently did additional testing and added protections for this in Talos 1.8.0 (or they may be coming in 1.9.0). If you are still running into this problem with newer versions of Talos, please let us know.
Bug Report
Description
On rare occasions, services on one of our worker nodes crash: both kubelet and apid go down. Since apid also crashes, we have not yet been able to collect any logs.
The problem is resolved by restarting the node.
Logs
We have not been able to retrieve any logs yet, but the node reports DiskPressure and MemoryPressure conditions.
We are in the process of implementing some form of log collection and will provide logs as soon as possible.
Environment