
[prometheus-kube-stack] Grafana is not persistent #436

Closed
ofiryy opened this issue Dec 1, 2020 · 22 comments
Labels: bug (Something isn't working), lifecycle/stale

Comments

@ofiryy
ofiryy commented Dec 1, 2020

Describe the bug
I installed the prometheus-community/kube-prometheus-stack chart and then defined panels and alerts in Grafana.
When I delete the Grafana pod, all the data in Grafana is lost - there is no persistence.
I wanted to use this solution: prometheus-operator/prometheus-operator#2558 (comment)
but to my surprise, no PV or PVC was created by the kube-prometheus-stack chart.

How can I make my Grafana persistent?

Version of Helm and Kubernetes:

Helm Version:

$ helm version
version.BuildInfo{Version:"v3.0.3", GitCommit:"ac925eb7279f4a6955df663a0128044a8a6b7593", GitTreeState:"clean", GoVersion:"go1.13.6"}

Kubernetes Version:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-eks-2ba888", GitCommit:"2ba888155c7f8093a1bc06e3336333fbdb27b3da", GitTreeState:"clean", BuildDate:"2020-07-17T18:48:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

Which chart: kube-prometheus-stack

Which version of the chart: 12.3.0

How to reproduce it (as minimally and precisely as possible): Install kube-prometheus-stack, define a panel in Grafana, then delete the Grafana pod.

ofiryy added the bug label on Dec 1, 2020
@survivant
Contributor

There is a setting for Prometheus and Alertmanager:

storage:
        volumeClaimTemplate:
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: sc-mirror
            resources:
              requests:
                storage: 300Mi

I think we should have the same for Grafana?
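
For context, a minimal sketch of where that block sits in the kube-prometheus-stack values.yaml (key names per the chart; the storage class and sizes are placeholders):

prometheus:
  prometheusSpec:
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: sc-mirror          # placeholder storage class
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 300Mi
alertmanager:
  alertmanagerSpec:
    storage:
      volumeClaimTemplate:
        spec:
          storageClassName: sc-mirror
          accessModes: [ "ReadWriteOnce" ]
          resources:
            requests:
              storage: 300Mi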

@totoroot
totoroot commented Dec 2, 2020

This is not a bug, since data persistence is not enabled by default. You can either claim a PersistentVolume in your custom values.yaml, as @survivant suggested, or export your dashboards as JSON and create a ConfigMap with the JSON-formatted data for each custom dashboard. With the ConfigMap approach, ad-hoc modifications inside Grafana still do not survive a restart, but your exported dashboards are redeployed with every new release of the stack via Helm.
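
A minimal sketch of the ConfigMap approach, assuming the chart's default dashboard sidecar and its grafana_dashboard label (the ConfigMap name and the JSON body are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-dashboard              # illustrative name
  labels:
    grafana_dashboard: "1"               # label the dashboard sidecar watches
data:
  my-custom-dashboard.json: |
    {"title": "My Custom Dashboard", "panels": [], "schemaVersion": 30}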

@blademainer

@ofiryy I updated my values.yaml, adding:

grafana:
  persistence:
    enabled: true

to fix the Grafana persistence problem.

@survivant
Contributor

@blademainer but we still can't choose our storage class.

@BertelBB
BertelBB commented Dec 8, 2020

@survivant prometheus-community/kube-prometheus-stack chart uses the grafana/grafana chart as a dependency. So any values you can pass to grafana/grafana you can pass to the grafana object in this chart. Or am I misunderstanding the issue being raised?

This works for me

grafana:
  enabled: true

  persistence:
    enabled: true
    type: pvc
    storageClassName: default
    accessModes:
    - ReadWriteOnce
    size: 4Gi
    finalizers:
    - kubernetes.io/pvc-protection
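
To apply values like these, a plain Helm upgrade is enough; the release name, namespace and values file name below are assumptions:

helm upgrade --install prometheus prometheus-community/kube-prometheus-stack -n prometheus -f values.yaml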

@survivant
Contributor

@BertelBB thank you. I don't know what I did wrong before, but it works fine now. Next, I need to find a workaround for #437.

@mkoziel2000
mkoziel2000 commented Dec 9, 2020

I'm not sure a workflow where all the Grafana settings get wiped the next time a pod stops has the best interests of the enterprise in mind. I understand the argument for exporting dashboards as JSON and storing them in ConfigMaps to make them deployment-agnostic, but there are other settings, unrelated to dashboards, that we also don't want to disappear when a pod crashes (user login information, alerting configuration, and so forth). Unless there is a best practice for storing all of that in ConfigMaps as well, with a decent UI that doesn't require kubectl and a Kubernetes admin, it seems shortsighted to treat Grafana as an application that can live in an enterprise environment without persistence. The opposite seems true.

I too am wringing the kinks out of my Prometheus install and ran into this exact problem of Grafana not supporting persistence out of the box. It was rather alarming to discover that the dashboards I had built were lost when I tested the failover scenario of the pod going down. I did not see a persistence section in the grafana part of values.yaml and didn't realize this would leave Grafana with only a temporary persistence layer.

In hindsight, I should have run my pod-failover test before starting to "persist" data in Grafana, to learn about this annoying default. I do wish the Helm chart could be upgraded to include a section under grafana that defines the persistence layer, even if it's commented out.

@stale
stale bot commented Jan 8, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale bot added the lifecycle/stale label on Jan 8, 2021
@stale
stale bot commented Jan 22, 2021

This issue is being automatically closed due to inactivity.

stale bot closed this as completed on Jan 22, 2021
@AndrewGrachov

For anyone who's looking: kube-prometheus-stack takes its Grafana values from the grafana chart.

This should probably be included in the docs.
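
One way to see which keys are available under grafana: is to dump the upstream chart's defaults; a sketch, assuming the grafana Helm repo is not added yet:

helm repo add grafana https://grafana.github.io/helm-charts
helm show values grafana/grafana | less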

@darox
darox commented Jun 24, 2021

(quoting @BertelBB's comment and persistence values from above)

I have used your code snippet, but I'm facing an issue:
Warning FailedScheduling 6s (x5 over 83s) default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

k get pods -n prometheus                                        
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0          19h
prometheus-grafana-5d9946dff9-4ffgc                      0/2     Pending   0          2m10s
prometheus-grafana-669fbc79f9-dmmhk                      2/2     Running   0          3m58s
prometheus-kube-prometheus-operator-85ccf48856-q8n68     1/1     Running   0          12m
prometheus-kube-state-metrics-6dc7f98565-twkxk           1/1     Running   0          12m
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   1          19h
prometheus-prometheus-node-exporter-bgqtb                1/1     Running   0          19h

I wonder how I can fix it.

@BertelBB
BertelBB commented Jun 24, 2021

@darox The issue is that your PVC is already bound to the pod prometheus-grafana-669fbc79f9-dmmhk, so the new Grafana pod cannot claim the PV and therefore fails to start.

A quick fix would be to delete the ReplicaSet for the older grafana pod, i.e. kubectl delete rs prometheus-grafana-669fbc79f9 -n prometheus.

A permanent fix would be to make sure that two Grafana pods cannot be running at the same time, so your rolling update strategy should ensure that when a Grafana upgrade is in progress, the old pod is terminated before the new one starts. I'm no expert in update strategies, but I think this should work:

EDIT: Previous strategy was wrong, this one works.

grafana:
  deploymentStrategy:
    type: Recreate

This strategy will ensure the old grafana pod is terminated before starting a new one, which will result in a short downtime for Grafana during upgrades.
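
To confirm that only one Grafana pod is left after the switch, something along these lines should work (deployment name, namespace and label are assumptions based on the default release name):

kubectl rollout status deployment/prometheus-grafana -n prometheus
kubectl get pods -n prometheus -l app.kubernetes.io/name=grafana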

@darox
darox commented Jun 24, 2021

I have applied your recommendations:

k get pods -n prometheus                                                                       
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0          18s
prometheus-grafana-6fb7f46b9c-5ph99                      0/2     Pending   0          22s
prometheus-kube-prometheus-operator-548f79bb9-hskjx      1/1     Running   0          22s
prometheus-kube-state-metrics-5b8f9bdbbd-tr8vq           1/1     Running   0          22s
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   1          18s
prometheus-prometheus-node-exporter-k9nzm                1/1     Running   0          22s
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  24s (x6 over 111s)  default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.

@BertelBB

@darox is the prometheus-grafana (default name) PVC marked as Bound, and if so, which pod is it being used by?

kubectl get pvc -n prometheus
kubectl describe pvc -n prometheus prometheus-grafana (replace name if needed)

Do you in fact have a default StorageClass?

kubectl get sc
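
If no StorageClass is marked as default, one can be annotated as such; a sketch with a hypothetical class name:

kubectl patch storageclass my-storage-class -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'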

@darox
darox commented Jun 28, 2021

It worked with:

grafana:
  deploymentStrategy:
    type: Recreate
  persistence:
    enabled: true
    type: pvc
    storageClassName: hostpath
    accessModes:
    - ReadWriteOnce
    size: 4Gi
    finalizers:
    - kubernetes.io/pvc-protection

k get pvc -n prometheus
NAME                                                                                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus-grafana                                                                                       Bound    pvc-d7ec8849-db92-4ec7-a465-f7ff67e414cb   4Gi        RWO            hostpath       41s
prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0   Bound    pvc-4009c793-9d44-4a7b-ab4b-13af00c513ad   5Gi        RWO            hostpath       4d2h

Thanks a lot for your support :)

@kamilgregorczyk

For some reason it doesn't work for me; these are my values:

## helm upgrade --install prometheus prometheus-community/kube-prometheus-stack --values values.yml

kube-state-metrics:
  image:
    repository: k8s.gcr.io/kube-state-metrics-arm64
    tag: v1.9.5
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false

grafana:
  adminPassword: xxx
  deploymentStrategy:
    type: Recreate
  enabled: true
  persistance:
    enabled: true
    type: pvc
    storageClassName: default
    accessModes:
      - ReadWriteOnce
    size: 4Gi
    finalizers:
      - kubernetes.io/pvc-protection
  grafana.ini:
    server:
      domain: xxx
      root_url: xxx
    auth.google:
      enabled: true
      client_id: xxx
      client_secret: xxx
      scopes: https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
      auth_url: https://accounts.google.com/o/oauth2/auth
      token_url: https://accounts.google.com/o/oauth2/token
      allowed_domains: gmail.com
      allow_sign_up: false
    paths:
      data: /var/lib/grafana/data
      logs: /var/log/grafana
      plugins: /var/lib/grafana/plugins
      provisioning: /etc/grafana/provisioning
    analytics:
      check_for_updates: true
    log:
      mode: console
    grafana_net:
      url: https://grafana.net

After running the upgrade no PVCs are created. I also tried just this for Grafana and still no luck:

grafana:
  adminPassword: xxx
  enabled: true
  persistance:
    enabled: true
➜  prometheus git:(master) ✗ kubectl get pvc
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-postgres-postgresql-0   Bound    pvc-e099d418-73d4-49ed-8232-e829e418c6b4   8Gi        RWO            nfs-client     453d
docker-registry              Bound    pvc-0e726761-fdfd-454d-86a6-36002c37ac3b   30Gi       RWO            nfs-client     149d
streaming-pvc-streaming-0    Bound    pvc-cb97df2f-7e99-47c8-80e0-c215381ee672   20Gi       RWO            nfs-client     135d
streaming-pvc-streaming-1    Bound    pvc-2240cdb5-7ab4-4aee-99c5-45696e4100bb   20Gi       RWO            nfs-client     135d
streaming-pvc-streaming-2    Bound    pvc-7f65bb1f-97ec-4890-9b29-6ba36f470cfe   20Gi       RWO            nfs-client     135d

@AwateAkshay

Can anyone help me with the dashboard location? I have added the values.yaml above for persistence and the volume is bound, but when I restart the pod the dashboards don't come back. @kamilgregorczyk

@UrosCvijan
UrosCvijan commented Dec 30, 2021

Hi @AwateAkshay ,

did you solve your problem? I am having the same issue. I can see my dashboards when I exec into the Grafana container, but they are not present in Grafana itself.

@AwateAkshay

@UrosCvijan exec into the Grafana pod and you will see a grafana.db file, which is a SQLite database. Inside it you can see your dashboards.
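
For example, to check that the file exists and peek at the stored dashboards (the pod name is a placeholder, the data path is the Grafana default, the image usually does not ship sqlite3 so the file is copied out first, and the table layout may vary across Grafana versions):

kubectl exec -n prometheus deploy/prometheus-grafana -c grafana -- ls -lh /var/lib/grafana/grafana.db
kubectl cp prometheus/<grafana-pod>:/var/lib/grafana/grafana.db ./grafana.db -c grafana
sqlite3 ./grafana.db 'SELECT title FROM dashboard;'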

@sakulh
sakulh commented Mar 31, 2022

@kamilgregorczyk not "persistance" but "persistence":

grafana:
  adminPassword: xxx
  enabled: true
  persistence:
    enabled: true
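
A quick way to catch this kind of typo is to render the chart locally and check whether a Grafana PVC actually comes out; a sketch with assumed release and values file names:

helm template prometheus prometheus-community/kube-prometheus-stack -f values.yml | grep -B2 -A8 'kind: PersistentVolumeClaim'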

@asher-lab

(quoting @kamilgregorczyk's comment and values from above)

Thanks for posting your code, it helped me figure out how to add environment variables in kube-prometheus-stack. Now I know the syntax.

@aksh-sood
aksh-sood commented Sep 1, 2023

I tried the methods above and the PV got created, but the pod failed to start because the chown-data initContainer kept failing even after multiple retries. Following issue 752, I set initChownData to false.
Now the Grafana pod runs and I am able to access the dashboard, but the Grafana pod logs show error="database is locked".
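
For reference, the grafana subchart exposes that toggle as shown below, and a common cause of "database is locked" is two Grafana pods writing to the same SQLite file at once, which the Recreate strategy mentioned earlier avoids; a sketch (values only, not verified against this exact chart version):

grafana:
  deploymentStrategy:
    type: Recreate          # avoid two pods sharing the SQLite file during upgrades
  initChownData:
    enabled: false          # skip the chown init container (see issue 752)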
