Any idea why that happens? I get it after long running queries. Sample: quarry:query/86096. If I happen to have the window open, I sometimes see the actual results before, meaning the query was successful. Usually I forget to export it before it disappears.
Talk:Quarry
Hello! It seems to be an internal bug in Quarry. Could you open a bug report on Phabricator for it? Thanks!
Which is more efficient?
- AND NOT ( lt_title = "ABC" ) AND NOT ( lt_title = "XYZ")
- AND lt_title NOT IN ( "ABC", "XYZ")
- AND lt_title <> "ABC" AND lt_title <> "XYZ"
Agree that none is ideal.
The query engine first builds a query plan, then executes the query according to that plan. You can check whether the query plan is the same for all variants (e.g., using the Toolforge SQL Optimizer). If it is, there is no difference in efficiency.
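As a sketch of how to compare the variants yourself (assuming the linktarget table and lt_title column from the snippets above), you can prefix each form with EXPLAIN and compare what MariaDB reports:

```sql
-- Run each variant with EXPLAIN and compare the "type", "key" and
-- "rows" columns of the output. Identical plans mean the optimizer
-- treats the predicates as equivalent.
EXPLAIN
SELECT lt_title
FROM linktarget
WHERE lt_title NOT IN ('ABC', 'XYZ');

EXPLAIN
SELECT lt_title
FROM linktarget
WHERE lt_title <> 'ABC' AND lt_title <> 'XYZ';
```

If both plans show the same key and row estimate, there is nothing to choose between them except readability.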
I prefer NOT IN.
How to specify tables from two different databases (wikidatawiki_p and commonswiki_p)?
I tried
- SELECT * FROM `commonswiki_p`.`pages` LIMIT 1
- USE DATABASE commonswiki_p
- USE commonswiki_p;
to override what's specified in the GUI.
It's not possible since around 2021. See Topic:W6tzj276xib56phf.
They are completely separate DB servers, you cannot make queries across multiple servers in the same query.
Apparently it was possible (see the sample in the topic referenced by Matej), but the feature was later removed.
I found an easier solution, as the gap between Wikidata and Commons is only partial: one table at Commons is updated (wbc_entity_usage), but not the other (page_props): quarry:query/86040
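For readers following along, a sketch of how such a gap can be measured (this is an assumption about what the linked query does, using column names from the commonswiki_p schema): pages that have an entity-usage row but no wikibase_item page prop.

```sql
-- Sketch: Commons pages recorded in wbc_entity_usage that are
-- missing the 'wikibase_item' entry in page_props.
SELECT p.page_id, p.page_title
FROM page p
JOIN wbc_entity_usage eu
  ON eu.eu_page_id = p.page_id
LEFT JOIN page_props pp
  ON pp.pp_page = p.page_id
 AND pp.pp_propname = 'wikibase_item'
WHERE pp.pp_page IS NULL
LIMIT 100;
```

Dropping the LIMIT and wrapping the outer SELECT in COUNT(*) would give the absolute size of the gap.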
"Apparently it was possible " Yes, until the infrastructure ran into scaling problems.
Are there any measures in place to keep the databases in sync? The gap mentioned above is minor in percentage terms (maybe 0.1%), but in absolute numbers 4,600 is a lot.
Is there a way to find Commons files that
- use P625 SDC property
- do not transclude c:Module:Coordinates.
Seems we might need the "Pages with maps" category again.
See also: c:Commons:Bots/Work_requests#Add_missing_Template:Location
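SDC statements such as P625 are not stored in the wiki replicas, so that half of the question would have to come from WDQS or the MediaInfo API. The "does not transclude c:Module:Coordinates" half, though, can be sketched in SQL (assuming the current templatelinks/linktarget schema; namespace 6 is File:, 828 is Module:):

```sql
-- Sketch: File: pages on Commons that do NOT transclude
-- Module:Coordinates. The P625 filter would have to be applied
-- afterwards from an external source (e.g. a WDQS result set).
SELECT p.page_title
FROM page p
WHERE p.page_namespace = 6
  AND NOT EXISTS (
    SELECT 1
    FROM templatelinks tl
    JOIN linktarget lt ON lt.lt_id = tl.tl_target_id
    WHERE tl.tl_from = p.page_id
      AND lt.lt_namespace = 828
      AND lt.lt_title = 'Coordinates'
  )
LIMIT 100;
```

Intersecting this with a list of files holding P625 (from WDQS) would give the candidates for the bot request.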
The enwiki database has been on replag for an entire week now. It should hopefully be fixed in the next week or so.
I was able to complete the SQL query above, as seen in https://quarry.wmcloud.org/history/84807/911494/884555.
However, when I ran the same query again at https://quarry.wmcloud.org/query/84807, I got the error "OperationalError('table resultsets already exists')". How should I fix this error?
I guess that resultsets may be a temporary table produced by my previous SQL query, so I tried to execute the following:
DROP TABLE IF EXISTS resultsets;
However, I got the following error message: "Access denied; you need (at least one of) the SUPER, READ_ONLY ADMIN privilege(s) for this operation".
May anyone help me on this issue?
Thanks.
I just found that the above query has completed now. The issue seems to have resolved itself, at least for now, although I still don't know the reason for the "OperationalError('table resultsets already exists')".
However, my query https://quarry.wmcloud.org/query/84817 has just failed again for the same reason, "OperationalError('table resultsets already exists')".
Does anyone know how to handle "OperationalError('table resultsets already exists')"?
Thanks.
The issue looks to have resolved itself again. Did someone help in the background?
Thanks a lot.
Is there a delay between a tool database being created and it being available in Quarry? It looks like s55926__wishlist_p can not be queried (it was only created today): https://quarry.wmcloud.org/query/11263
As far as I know, tool databases are not public, and so not available in Quarry.
@TheDJ: That used to be the case, but I'd thought that recently (phab:T151158) it had become possible. It's only databases with names ending in _p.
Quarry reads from the mirror of the primary ToolsDB rather than the primary. The mirror is lagged by 15 hours at the moment: https://grafana.wmcloud.org/d/PTtEnEyVk/toolsdb-mariadb?orgId=1&var-server=tools-db-3
Ah that's good to know, thanks! I'll be patient. :-)
Hi, I'm trying to optimize a simple SQL query for pages on commonswiki (table page) which start with a certain string, e.g. SELECT * FROM page WHERE page_title LIKE "Building%" ORDER BY page_title; (see also https://quarry.wmcloud.org/query/83277 for an example with a smaller result set). The SQL Optimizer on Toolforge tells me that this query would use filesort instead of indexes, which causes performance issues ("Query plan 1.1 is using filesort or a temporary table. This is usually an indication of an inefficient query. If you find your query is slow, try taking advantage of available indexes to avoid filesort."). The DB schema documentation tells me that there should be an index (key: page_name_title) on the page_namespace and page_title columns. The MySQL docs tell me that I could use USE INDEX (page_name_title) to force that index. If I add that clause to my query (SELECT * FROM page USE INDEX (page_name_title) WHERE page_title LIKE "Building%" ORDER BY page_title;), the SQL Optimizer complains: "Query error: Key 'page_name_title' doesn't exist in table 'page'". At the Commons village pump, I've learned that the replicas may lack the indices. So I'm not sure if there's a way to optimize such a query on Quarry. I would prefer using Quarry instead of the Toolforge CLI because Quarry allows for linking queries and results.
This query cannot use the index, since you don't have a condition on page_namespace.
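For example, a version that can use the index might look like this (a sketch; it assumes the File: namespace, 6, is the one you want, and that page_name_title is defined on (page_namespace, page_title)):

```sql
-- With page_namespace fixed to a constant, the (page_namespace, page_title)
-- index can serve both the LIKE range scan and the ORDER BY, so no filesort
-- is needed and no USE INDEX hint is required.
SELECT page_id, page_title
FROM page
WHERE page_namespace = 6          -- assumption: File: pages are the target
  AND page_title LIKE 'Building%'
ORDER BY page_title
LIMIT 100;
```

To cover several namespaces, run the query once per namespace (or use an IN list, at the cost of a looser plan).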
Aww - ok :-) - yes, with such a condition, it works like a charm - thanks a lot!
Hi, I am doing some research on sockpuppets. I'm trying to use the SQL database to assist in building my dataset, but queries seem to take a long time.
For instance, this previous query ran in 196 seconds:
https://quarry.wmcloud.org/query/61732
Whereas my identical query has been going for over 9 hours:
https://quarry.wmcloud.org/query/83588
A previous instance was running over a week before I restarted it.
Am I doing something wrong, or are these running times typical?
On another note, I have been looking for complete SQL dumps so I can run my own instance and not cause issues with excessive queries; however, I can only find complete XML dumps (https://dumps.wikimedia.org/enwiki/20240601/).
Is there any way to get a full SQL dump (complete with article revision history)?
Thank you for your help.
Using Toolforge CLI, the query returns 187814 rows in 31.187 sec. Maybe that's too much for Quarry.
Recently, some of my queries get immediately queued and won't run. If I fork the query, then it runs immediately, but now I have a bunch that are marked as queued but not running. Is there a way to kill these old ones, and any ideas why this happens?
Here is an example of a few that are still queued:
https://quarry.wmcloud.org/query/71639
https://quarry.wmcloud.org/query/83591
https://quarry.wmcloud.org/query/83590
Thanks!