Cache invalidation based on usage tracking of Data: pages
Closed, Resolved | Public | 3 Estimated Story Points

Description

This work will live mostly in JsonConfig with a portion in Chart, and is being done as part of the Charts taskforce.

Implementation for the cache invalidation based on global Data: page usage tracking as envisioned by T370378:

On all wikis:

  • provide an API action to fire off the local cache invalidation job for a given Data: page

On the shared wiki (Commons):

  • job queue job to get a list of all wikis with usages for a given Data: page
    • when a local JsonConfig page changes and we are the shared repository, fire off an API request to each wiki that uses the page, asking it to do its own local cache invalidation

This allows for safely propagating data from the Commons context to the individual wikis for performing their own invalidations in their local databases, with only a limited number of HTTP hits made even in the case of thousands of pages using a resource.
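
As a rough illustration of why the number of HTTP hits stays small, here is a minimal Python sketch (not the actual PHP implementation in JsonConfig). It assumes usage rows are available as (wiki, page) pairs; that row shape, the function names, and the request dictionary are hypothetical. The point is only that requests scale with the number of wikis, not with the number of pages using the data.

```python
def wikis_to_notify(usages):
    """Return the distinct wikis that use a Data: page, in first-seen order.

    `usages` is an assumed list of (wiki_id, page_title) pairs; the real
    tracking table layout is not described in this task.
    """
    seen = {}
    for wiki_id, _page_title in usages:
        seen.setdefault(wiki_id, None)
    return list(seen)


def build_purge_requests(data_title, usages):
    """Build one request description per wiki, however many pages use the data.

    Each wiki is only told which Data: page changed; finding and invalidating
    the affected local pages is left to that wiki.
    """
    return [
        {"wiki": wiki_id, "data_title": data_title}
        for wiki_id in wikis_to_notify(usages)
    ]


if __name__ == "__main__":
    usages = [("enwiki", f"Page {i}") for i in range(3000)]
    usages += [("dewiki", "Seite 1"), ("frwiki", "Page 1")]
    requests = build_purge_requests("Data:Example.tab", usages)
    print(len(usages), "usages ->", len(requests), "requests")
```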

Concerns:

  • we don't expect to need rate limiting or permission controls in this system, but if they were applied we'd want to use a secret key or something similar to validate requests from the automated system.
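
If request validation did turn out to be needed, one common approach is an HMAC over the request fields with a shared secret. The Python sketch below is only an assumption about how that could look; the task does not specify a scheme, and the function names and message layout are hypothetical.

```python
import hashlib
import hmac


def sign_purge_request(secret: bytes, wiki_id: str, data_title: str) -> str:
    """Return a hex HMAC-SHA256 signature over the target wiki and Data: title."""
    # The newline separator keeps ("ab", "c") and ("a", "bc") from colliding.
    message = f"{wiki_id}\n{data_title}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_purge_request(secret: bytes, wiki_id: str, data_title: str,
                         signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign_purge_request(secret, wiki_id, data_title)
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature.encode("utf-8"))
```

The sender would attach the signature to the purge request, and the receiving wiki would recompute it from the same shared secret before acting on the request.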

Event Timeline

CCiufo-WMF set the point value for this task to 3.
CCiufo-WMF moved this task from Backlog to Sprint 7 on the Charts board.
CCiufo-WMF edited projects, added Charts (Sprint 7); removed Charts.
CCiufo-WMF edited projects, added Charts (Sprint 8); removed Charts (Sprint 7).
CCiufo-WMF moved this task from Incoming to Blocked on the Charts (Sprint 8) board.

Rough plan for adding this on JsonConfig:

  • base on the in-review patch for T374747
  • add a hook on page update or deletion
    • if it's a locally-stored data page and global usage tracking is active, then:
      • put the page name in a job queue item below
  • add the job queue handler for purging global json links
    • we have the data page's title
    • JCSingleton::getGlobalJsonLinks()...
    • ...ask it for remote pages linking to the current page
    • for each wiki/namespace/title tuple, look up the API URL for the remote wiki and hit its API with a purge request
  • if existing API isn't suitable for remote purging, add one
    • optional: coalesce multiple hits for each wiki or (wiki,namespace) tuple
      • this may have moderate performance benefits in some scenarios, but is probably not on the critical path, since this runs in job queues doing cache invalidations :D
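
The optional coalescing step could look roughly like the Python sketch below. This is an illustration only, not the JsonConfig code: the tuple shape follows the wiki/namespace/title description above, while the function names and the batch size of 50 titles per request are assumptions.

```python
from collections import defaultdict


def coalesce_by_wiki_namespace(links):
    """Group (wiki, namespace, title) tuples into {(wiki, namespace): [titles]}.

    Duplicate titles within a group are dropped, keeping first-seen order.
    """
    grouped = defaultdict(dict)
    for wiki, namespace, title in links:
        grouped[(wiki, namespace)][title] = None
    return {key: list(titles) for key, titles in grouped.items()}


def batch_titles(titles, batch_size=50):
    """Split titles into chunks; 50 is an assumed per-request limit."""
    return [titles[i:i + batch_size] for i in range(0, len(titles), batch_size)]


def plan_purges(links, batch_size=50):
    """Return one purge description per (wiki, namespace) batch of titles."""
    plan = []
    for (wiki, namespace), titles in coalesce_by_wiki_namespace(links).items():
        for chunk in batch_titles(titles, batch_size):
            plan.append({"wiki": wiki, "namespace": namespace, "titles": chunk})
    return plan
```

Each entry in the resulting plan would map to a single API hit on the remote wiki, instead of one hit per linking page.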

Unit test cases may not be doable for this scenario, which is why I'm doubling down on fixing tests in the first patch for the internals...

Note that the current JsonConfig patch should store links for tabular data uses via the Lua module, but Chart needs another fix to store links for our usages. Feel free to poke that if I don't get to it first!

Change #1081443 had a related patch set uploaded (by Aude; author: Aude):

[mediawiki/extensions/Chart@master] Store json page used by chart parser function - WIP

https://gerrit.wikimedia.org/r/1081443

CCiufo-WMF edited projects, added Charts (Sprint 9); removed Charts (Sprint 8).
CCiufo-WMF moved this task from Incoming to Doing on the Charts (Sprint 9) board.
CCiufo-WMF edited projects, added Charts; removed Charts (Sprint 9).
CCiufo-WMF edited projects, added Charts (Sprint 10); removed Charts.

Change #1081443 merged by jenkins-bot:

[mediawiki/extensions/Chart@master] Store json page used by chart parser function

https://gerrit.wikimedia.org/r/1081443

Some fixes are still underway in response to testing.

Change #1087958 had a related patch set uploaded (by Bvibber; author: Bvibber):

[mediawiki/extensions/JsonConfig@master] GlobalJsonLinksCachePurgeJob to actually invalidate caches

https://gerrit.wikimedia.org/r/1087958

Change #1087958 merged by jenkins-bot:

[mediawiki/extensions/JsonConfig@master] GlobalJsonLinksCachePurgeJob to actually invalidate caches

https://gerrit.wikimedia.org/r/1087958

Change #1090928 had a related patch set uploaded (by Bvibber; author: Bvibber):

[mediawiki/extensions/JsonConfig@wmf/1.44.0-wmf.3] GlobalJsonLinksCachePurgeJob to actually invalidate caches

https://gerrit.wikimedia.org/r/1090928

Change #1090928 merged by jenkins-bot:

[mediawiki/extensions/JsonConfig@wmf/1.44.0-wmf.3] GlobalJsonLinksCachePurgeJob to actually invalidate caches

https://gerrit.wikimedia.org/r/1090928

Mentioned in SAL (#wikimedia-operations) [2024-11-13T21:20:55Z] <cjming@deploy2002> Started scap sync-world: Backport for [[gerrit:1090928|GlobalJsonLinksCachePurgeJob to actually invalidate caches (T374746)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-13T21:27:23Z] <cjming@deploy2002> cjming, bvibber: Backport for [[gerrit:1090928|GlobalJsonLinksCachePurgeJob to actually invalidate caches (T374746)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-11-13T21:34:22Z] <cjming@deploy2002> Finished scap sync-world: Backport for [[gerrit:1090928|GlobalJsonLinksCachePurgeJob to actually invalidate caches (T374746)]] (duration: 13m 27s)

Change #1090988 had a related patch set uploaded (by Bvibber; author: Bvibber):

[operations/mediawiki-config@master] Correction to virtual-globaljsonlinks mapping

https://gerrit.wikimedia.org/r/1090988

Change #1090993 had a related patch set uploaded (by Bvibber; author: Bvibber):

[operations/mediawiki-config@master] Avoid use of globaljsonlinks* tables on beta cluster

https://gerrit.wikimedia.org/r/1090993

Change #1090988 merged by jenkins-bot:

[operations/mediawiki-config@master] Correction to virtual-globaljsonlinks mapping

https://gerrit.wikimedia.org/r/1090988

Mentioned in SAL (#wikimedia-operations) [2024-11-14T09:27:44Z] <kartik@deploy2002> Started scap sync-world: Backport for [[gerrit:1090988|Correction to virtual-globaljsonlinks mapping (T374746)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-14T09:31:27Z] <kartik@deploy2002> bvibber, kartik: Backport for [[gerrit:1090988|Correction to virtual-globaljsonlinks mapping (T374746)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-11-14T09:37:48Z] <kartik@deploy2002> Finished scap sync-world: Backport for [[gerrit:1090988|Correction to virtual-globaljsonlinks mapping (T374746)]] (duration: 10m 03s)

Change #1090993 merged by jenkins-bot:

[operations/mediawiki-config@master] Avoid use of globaljsonlinks* tables on beta cluster

https://gerrit.wikimedia.org/r/1090993

Config change is deployed and we have working cache invalidation on test+test-commons. Ready for sign-off and closing?

To test this, do I just make a change to a Data ns page and wait for it to propagate?

  1. ensure the page containing the chart has been edited since this change went live last night
  2. edit one of the data: pages backing it
  3. reload the page and it should show updates

In my testing so far this has happened well within the time it took me to switch tabs, though backed-up job queues can delay both the link updates and the purge propagation.

Hmm doesn't seem to be working consistently for me right now, but maybe something's off with testwiki.

If I update a tabular data page, the corresponding charts sometimes show the updated version but also sometimes show the outdated version?

I can't seem to get the article page to update automatically either, but it shows the correct (new) chart on edit preview.

This appears to be working. Great work all! @CCiufo-WMF these are the test steps I followed

QA steps:

Scenario 1 - edit the chart definition updates test wiki

Scenario 2 - edit the data updates test wiki

Scenario 3 - edit the data updates local wiki

  • Refresh https://test-commons.wikimedia.org/wiki/Data:Example_Line_Chart.chart

Change #1091824 had a related patch set uploaded (by Bvibber; author: Bvibber):

[mediawiki/extensions/JsonConfig@master] Use WAN cache for JsonConfig remote fetch cache

https://gerrit.wikimedia.org/r/1091824

Change #1091824 merged by jenkins-bot:

[mediawiki/extensions/JsonConfig@master] Use WAN cache for JsonConfig remote fetch cache

https://gerrit.wikimedia.org/r/1091824

Change #1092304 had a related patch set uploaded (by Bvibber; author: Bvibber):

[mediawiki/extensions/JsonConfig@wmf/1.44.0-wmf.3] Use WAN cache for JsonConfig remote fetch cache

https://gerrit.wikimedia.org/r/1092304

Change #1092304 merged by jenkins-bot:

[mediawiki/extensions/JsonConfig@wmf/1.44.0-wmf.3] Use WAN cache for JsonConfig remote fetch cache

https://gerrit.wikimedia.org/r/1092304

Mentioned in SAL (#wikimedia-operations) [2024-11-18T21:46:41Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1092304|Use WAN cache for JsonConfig remote fetch cache (T374746)]], [[gerrit:1092300|Create no-link-recommendation variant (T377787 T380204)]], [[gerrit:1092295|[GrowthExperiments] testwiki: Enable no-link-recommendation experiment (T380204)]]

Mentioned in SAL (#wikimedia-operations) [2024-11-18T21:52:47Z] <urbanecm@deploy2002> urbanecm, bvibber: Backport for [[gerrit:1092304|Use WAN cache for JsonConfig remote fetch cache (T374746)]], [[gerrit:1092300|Create no-link-recommendation variant (T377787 T380204)]], [[gerrit:1092295|[GrowthExperiments] testwiki: Enable no-link-recommendation experiment (T380204)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-11-18T21:58:51Z] <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1092304|Use WAN cache for JsonConfig remote fetch cache (T374746)]], [[gerrit:1092300|Create no-link-recommendation variant (T377787 T380204)]], [[gerrit:1092295|[GrowthExperiments] testwiki: Enable no-link-recommendation experiment (T380204)]] (duration: 12m 10s)