Nothing Special   »   [go: up one dir, main page]

Page MenuHomePhabricator

Transaction timeout for LinksUpdate::updateLinksTimestamp (SET page_links_updated)
Closed, ResolvedPublicPRODUCTION ERROR

Description

When saving an edit on test2.wikipedia.org via mwdebug1002 (using X-Wikimedia-Debug) the save procedure almost always takes upto a minute before going through.

If I understand the below Logstash entries correctly, either this single SET operation is really slow, or (more likely) there is some unrelated code happening in PHP (or in separate query) between the start and end of the transaction, thus causing the lock to timeout before it can commit the transaction.

Tentatively marking as regression since this is the first time I've experienced this when saving edits through any means.

  • [INFO] [TitleBlacklist-cache] Updated test2wiki:title_blacklist_entries with 646 entries
  • [WARNING] [DBPerformance] Sub-optimal transaction
http_method: POST
reqId: W2XVZApAAC4AAEKMbQAAAAAV
url: /w/index.php?... action=submit

Sub-optimal transaction on DB(s) [10.64.0.205 (test2wiki) (TRX#b5f896)]: 
0	0.003593	query-m: INSERT INTO `revision` (rev_page,rev_parent_id,rev_minor_edit,rev_timestamp,rev_deleted,rev_len,rev_sha1,rev_content_model,rev_content_format,rev_comment,rev_user,rev_user_text) VALUES ('X') [TRX#b5f896]
1	0.000327	query-m: INSERT INTO `revision_comment_temp` (revcomment_rev,revcomment_comment_id) VALUES ('X') [TRX#b5f896]
2	0.001551	query-m: INSERT INTO `blobs_cluster2N` (blob_text) VALUES ('X')
3	0.000377	query-m: INSERT INTO `text` (old_id,old_text,old_flags) VALUES (NULL,'X') [TRX#b5f896]
4	0.000357	query-m: UPDATE `revision` SET rev_text_id = 'X' [TRX#b5f896]
5	0.000392	query-m: INSERT INTO `content` (content_size,content_sha1,content_model,content_address) VALUES ('X') [TRX#b5f896]
6	0.000329	query-m: INSERT INTO `slots` (slot_revision_id,slot_role_id,slot_content_id,slot_origin) VALUES ('X') [TRX#b5f896]
7	29.652620	...delay...
8	0.005878	query-m: UPDATE `page` SET page_latest = 'X' [TRX#b5f896]
9	3.380293	...delay...
10	0.000718	query-m: UPDATE `user` SET user_editcount=user_editcount+N WHERE user_id = 'X' AND (user_editcount IS NOT NULL) [TRX#b5f896]
11	3.866257	...delay...
  • [WARNING] [DBPerformance] Expectation (readQueryTime <= 5) by MediaWiki::main not met (actual: 14.156344890594):
Expectation (readQueryTime <= 5) by MediaWiki::main not met (actual: 14.156344890594):
query-m: SELECT page_latest FROM `page` WHERE page_id = 'X' LIMIT N FOR UPDATE [TRX#e72714]
#0 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/TransactionProfiler.php(223): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated()
#1 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1243): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion()
#2 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1143): Wikimedia\Rdbms\Database->doProfiledQuery()
#3 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1640): Wikimedia\Rdbms\Database->query()
#4 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1465): Wikimedia\Rdbms\Database->select()
#5 /srv/mediawiki/php-1.32.0-wmf.15/includes/page/WikiPage.php(2708): Wikimedia\Rdbms\Database->selectField()
#6 /srv/mediawiki/php-1.32.0-wmf.15/includes/Storage/PageUpdater.php(938): WikiPage->lockAndGetLatest()
#7 /srv/mediawiki/php-1.32.0-wmf.15/includes/Storage/PageUpdater.php(728): MediaWiki\Storage\PageUpdater->doModify()
#8 /srv/mediawiki/php-1.32.0-wmf.15/includes/page/WikiPage.php(1855): MediaWiki\Storage\PageUpdater->saveRevision()
#9 /srv/mediawiki/php-1.32.0-wmf.15/includes/EditPage.php(2236): WikiPage->doEditContent()
#10 /srv/mediawiki/php-1.32.0-wmf.15/includes/EditPage.php(1524): EditPage->internalAttemptSave()
#11 /srv/mediawiki/php-1.32.0-wmf.15/includes/EditPage.php(662): EditPage->attemptSave()
#12 /srv/mediawiki/php-1.32.0-wmf.15/includes/actions/EditAction.php(60): EditPage->edit()
#13 /srv/mediawiki/php-1.32.0-wmf.15/includes/actions/SubmitAction.php(38): EditAction->show()
#14 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(500): SubmitAction->show()
#15 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(294): MediaWiki->performAction()
#16 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(867): MediaWiki->performRequest()
#17 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(524): MediaWiki->main()
#18 /srv/mediawiki/php-1.32.0-wmf.15/index.php(42): MediaWiki->run()
#19 /srv/mediawiki/w/index.php(3): include()
#20 {main}
  • [ERROR] [DBQuery] LinksUpdate::updateLinksTimestamp .. Lock wait timeout exceeded
LinksUpdate::updateLinksTimestamp	10.64.0.205	1205	Lock wait timeout exceeded; try restarting transaction (10.64.0.205)	UPDATE  `page` SET page_links_updated = '20180804163431' WHERE page_id = '2369'
  • [WARNING] [DBPerformance] Expectation (writeQueryTime <= 1) by MediaWiki::restInPeace not met (actual: 15.480643987656)
Expectation (writeQueryTime <= 1) by MediaWiki::restInPeace not met (actual: 15.480643987656):
query-m: UPDATE `page` SET page_links_updated = 'X' [TRX#c022f1]
#0 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/TransactionProfiler.php(226): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated()
#1 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1243): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion()
#2 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1143): Wikimedia\Rdbms\Database->doProfiledQuery()
#3 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(2059): Wikimedia\Rdbms\Database->query()
#4 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/LinksUpdate.php(1124): Wikimedia\Rdbms\Database->update()
#5 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/LinksUpdate.php(295): LinksUpdate->updateLinksTimestamp()
#6 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/LinksUpdate.php(171): LinksUpdate->doIncrementalUpdate()
#7 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/DeferredUpdates.php(268): LinksUpdate->doUpdate()
#8 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/DeferredUpdates.php(214): DeferredUpdates::runUpdate()
#9 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/DeferredUpdates.php(134): DeferredUpdates::execute()
#10 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(913): DeferredUpdates::doUpdates()
#11 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(733): MediaWiki->restInPeace()
#12 (): Closure$MediaWiki::doPostOutputShutdown()
#13 {main}
  • [WARNING] [DBPerformance] Sub-optimal transaction
http_method: POST

Sub-optimal transaction on DB(s) [10.64.0.205 (test2wiki) (TRX#c022f1)]: 
0	15.480644	query-m: UPDATE `page` SET page_links_updated = 'X' [TRX#c022f1]
1	1.523502	...delay...
  • [INFO] [deprecated] Caller from Wikimedia\Rdbms\Database::ping ignored an error originally raised from LinksUpdate::updateLinksTimestamp: [1205] Lock wait timeout exceeded; try restarting transaction (10.64.0.205)
  • [ERROR] [exception] Wikimedia\Rdbms\DBQueryError
Query: UPDATE  `page` SET page_links_updated = '20180804163431' WHERE page_id = '2369'
Function: LinksUpdate::updateLinksTimestamp
Error: 1205 Lock wait timeout exceeded; try restarting transaction (10.64.0.205)
t  exception.trace	       	#0 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1413): Wikimedia\Rdbms\Database->makeQueryException(string, integer, string, string)
#1 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(1186): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#2 /srv/mediawiki/php-1.32.0-wmf.15/includes/libs/rdbms/database/Database.php(2059): Wikimedia\Rdbms\Database->query(string, string)
#3 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/LinksUpdate.php(1124): Wikimedia\Rdbms\Database->update(string, array, array, string)
#4 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/LinksUpdate.php(295): LinksUpdate->updateLinksTimestamp()
#5 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/LinksUpdate.php(171): LinksUpdate->doIncrementalUpdate()
#6 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/DeferredUpdates.php(268): LinksUpdate->doUpdate()
#7 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/DeferredUpdates.php(214): DeferredUpdates::runUpdate(LinksUpdate, Wikimedia\Rdbms\LBFactoryMulti, string, integer)
#8 /srv/mediawiki/php-1.32.0-wmf.15/includes/deferred/DeferredUpdates.php(134): DeferredUpdates::execute(array, string, integer)
#9 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(913): DeferredUpdates::doUpdates(string)
#10 /srv/mediawiki/php-1.32.0-wmf.15/includes/MediaWiki.php(733): MediaWiki->restInPeace(string, boolean)
#11 [internal function]: Closure$MediaWiki::doPostOutputShutdown()
#12 {main}

Event Timeline

We have debug logs for this request. On mwlog1001 do zgrep W2XVZApAAC4AAEKMbQAAAAAV /srv/mw-log/archive/test2wiki.log-20180805.gz

The problems with this request started before your paste. There's a gap of 12 seconds here:

2018-08-04 16:33:59 [W2XVZApAAC4AAEKMbQAAAAAV] mwdebug1002 test2wiki 1.32.0-wmf.15 SpamBlacklist INFO: Spam blacklist validator: [[MediaWiki:Gadget-modrollback.js]] not a local blacklist  
2018-08-04 16:34:11 [W2XVZApAAC4AAEKMbQAAAAAV] mwdebug1002 test2wiki 1.32.0-wmf.15 DBQuery DEBUG: SELECT  page_id,page_namespace,page_title,page_restrictions,page_is_redirect,page_is_new,page_random,page_touched,page_links_updated,page_latest,page_len,page_content_model  FROM `page`    WHERE page_namespace = '8' AND page_title = 'Gadget-modrollback.js'  LIMIT 1   {"method":"WikiPage::pageData","master":true,"runtime":0.010992050170898}

The SpamBlacklist message is the one it gives when it's not doing anything, it's not at fault. The DB connection is already open and has already served queries. The wait apparently happens between line 1886 and 1987 of EditPage.php.

Shortly afterwards, there's two more delays, one of 3 seconds and one of 11 seconds:

2018-08-04 16:34:11 [W2XVZApAAC4AAEKMbQAAAAAV] mwdebug1002 test2wiki 1.32.0-wmf.15 wfDebug DEBUG: EditPage::internalAttemptSave: getting section ''  
2018-08-04 16:34:14 [W2XVZApAAC4AAEKMbQAAAAAV] mwdebug1002 test2wiki 1.32.0-wmf.15 caches INFO: parser: MultiWriteBagOStuff  
2018-08-04 16:34:25 [W2XVZApAAC4AAEKMbQAAAAAV] mwdebug1002 test2wiki 1.32.0-wmf.15 DBQuery DEBUG: SELECT  page_id,page_namespace,page_title,page_restrictions,page_is_redirect,page_is_new,page_random,page_touched,page_links_updated,page_latest,page_len,page_content_model  FROM `page`    WHERE page_namespace = '8' AND page_title = 'Gadget-modrollback.js'  LIMIT 1   {"method":"WikiPage::pageData","master":true,"runtime":0.00074911117553711}

Maybe there was an issue with hhvm on mwdebug1002, like recompilation. There was a CPU spike at this time, not enough to cause a big delay by itself, but too much to have been caused by a single PHP thread.

mwdebug1002 load.png (288×906 px, 43 KB)

I can't seem to reproduce this slowness (using mwdebug1002).

Me neither. Let's close this for now :)

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:09 PM