Related incident: https://wikitech.wikimedia.org/wiki/Incident_documentation/2017-03-20_Special_AllPages
Background
T160914: Databases overflown with connections due to slow query on Special:AllPages
The query in question looks something like SELECT page_title FROM page WHERE page_namespace = '0' AND page_is_redirect = '0' AND (page_title < 'Entomacrodus_stellifer') ORDER BY page_title DESC LIMIT 344, 1
There are at least two faults with the plan the database is currently using for this query:
- It's fetching all (or almost all) pages in the namespace, then filtering by title and is_redirect, rather than fetching pages in order using the index on namespace+title and filtering by is_redirect.
- It's not even using ICP to do the filtering by title, which a similar query using ascending order does do.
But even with the best plan, this query would still be problematic when run on a namespace with millions of redirects and few non-redirects.
To discuss
Adding a filter on is_redirect may be possible on the largest wikis (it is currently disabled for them, but maybe it can be enabled).
Options:
- Add an index including the flag (is is ok to add an index just for a single, infrequent query?)
- Rewrite the query (how?)
- Force the optimizer to do a better plan (probably not enough)
- Anything else?
It may require more than one option.