Wikipedia:Request a query

From Wikipedia, the free encyclopedia

This is a page for requesting one-off database queries for certain criteria. Users who are interested and able to perform SQL queries on the projects can provide results from the Quarry website.

You may also be interested in the following:

  • If you are interested in writing SQL queries or helping out here, visit our tips page.
  • If you need to obtain a list of article titles that meet certain criteria, consider using PetScan (user manual) or the default search. PetScan can generate lists of articles in subcategories, articles which transclude some template, etc.
  • If you need to make changes to a number of articles based on a particular query, you can post to the bot requests page, depending on how many changes are needed.
  • For long-term review and checking, database reports are available.

Quarry does not have access to page content, so queries which require checking wikitext cannot be answered with Quarry. However, someone may be able to assist by using Quarry in another way (e.g. checking the table of category links rather than the "Category:" text) or suggest an alternative tool.

section headers starting with a lowercase "w" followed by an uppercase letter or a number

I discovered something weird, see Wikipedia:Help_desk#Many_W's. Is it possible to run a query or write a regex search or whatever that meets these criteria:

Finds section headers (no matter how many levels deep) that start with a lowercase "w" immediately followed by an uppercase letter or a number. Polygnotus (talk) 15:05, 28 August 2024 (UTC)

The database replicas don't have article text, so the only place these would show up is if there are redirects to those sections.
Regex searches aren't indexed, so don't work well on their own; you have to pair them with something that is, like a category or template or outgoing link or normal search. But if you had that, I'd think insource:/== *w[A-Z0-9]/ would find just about all of these; it'd miss weird space characters, which should be uncommon anyway, and level-1 headers, where you'd get too many false positives with template parameters. Example search, paired with "meters". —Cryptic 15:23, 28 August 2024 (UTC)
Thank you! I think I have them all. I am not worried about level 1 headers because I think that every header that follows this pattern is a subheader of something like "Women's events" or "Women's results" or similar. Polygnotus (talk) 15:34, 28 August 2024 (UTC)
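
For anyone who already has the wikitext of a page locally, the same pattern can be tried offline. This is only a rough sketch: the function name and the sample text are made up for illustration, and the pattern is a slightly stricter variant of the insource search above (anchored to whole lines, level 2 and deeper only), not the search engine's own behaviour.

```python
import re

# Variant of the pattern from the discussion, anchored to whole lines.
# Level-1 headers ("= Title =") are skipped on purpose, as in the search above.
HEADER_RE = re.compile(r"^={2,} *(w[A-Z0-9][^=\n]*?) *={2,} *$", re.MULTILINE)


def find_suspect_headers(wikitext: str) -> list[str]:
    """Return header titles starting with 'w' + an uppercase letter or digit."""
    return HEADER_RE.findall(wikitext)


# Hypothetical sample wikitext, invented for this example.
sample = """== Women's events ==
=== wMen's 100 metres ===
=== w4 x 100 metres relay ===
=== Women's 200 metres ===
==== world record ====
"""

suspects = find_suspect_headers(sample)
print(suspects)
```

The match is case-sensitive, so "Women's" and "world record" are left alone and only the two malformed headers are reported.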

All categories with "LGBT" in their title

Would it be possible to get a list of all categories with LGBT (without a Q) in their title? They all need to be WP:C2D-renamed to change from LGBT to LGBTQ following Talk:LGBTQ#Requested move 14 August 2024, and a query which can be redone at will to get the currently-outstanding list would be helpful. Thanks, HouseBlaster (talk • he/they) 03:08, 31 August 2024 (UTC)

Try this. Let me know if it needs adjusting :) –Novem Linguae (talk) 03:22, 31 August 2024 (UTC)
As always, you are amazing, Novem! Exactly what I had in mind. Thank you so much :) HouseBlaster (talk • he/they) 03:30, 31 August 2024 (UTC)
@Novem Linguae: would it be possible to add a check to make sure the category is not a member of Category:All categories for discussion (and thus is not already nominated)? Thanks, HouseBlaster (talk • he/they) 03:14, 1 September 2024 (UTC)
Here you go. I've also made the check against 'LGBT' (but not 'LGBTQ') case-insensitive, which finds a handful more around Category:British lGBT entertainers and Category:Irish lGBT entertainers. —Cryptic 03:58, 1 September 2024 (UTC)
Thank you, Cryptic!! HouseBlaster (talk • he/they) 04:14, 1 September 2024 (UTC)
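
The linked Quarry queries are not reproduced on this page, so the following is only a guess at the general shape of such a query, run against a tiny in-memory SQLite database rather than the real replicas. The two tables are a simplified stand-in for MediaWiki's page and categorylinks tables (the column set is an assumption, and the live schema may differ), most of the sample rows are invented, and both title checks are uppercased here for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed stand-in for the replica tables; not the real schema.
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY,
                   page_namespace INTEGER,
                   page_title TEXT);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
""")
# Sample rows; apart from the lGBT title quoted above, these are invented.
conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
    (1, 14, "LGBT_people_from_Ohio"),
    (2, 14, "LGBTQ_culture"),              # already renamed
    (3, 14, "British_lGBT_entertainers"),  # odd capitalisation
    (4, 14, "LGBT_history_in_Spain"),      # already nominated (see below)
    (5, 0, "LGBT_rights_by_country"),      # an article, not a category
])
conn.execute(
    "INSERT INTO categorylinks VALUES (4, 'All_categories_for_discussion')")

QUERY = """
SELECT page_title
FROM page
WHERE page_namespace = 14                    -- Category namespace
  AND UPPER(page_title) LIKE '%LGBT%'        -- case-insensitive match
  AND UPPER(page_title) NOT LIKE '%LGBTQ%'   -- skip titles already using LGBTQ
  AND NOT EXISTS (SELECT 1 FROM categorylinks
                  WHERE cl_from = page_id
                    AND cl_to = 'All_categories_for_discussion')
ORDER BY page_title
"""
outstanding = [row[0] for row in conn.execute(QUERY)]
print(outstanding)
```

One known gap in this sketch: a title containing both "LGBT" and "LGBTQ" would be skipped entirely, so it may under-report in rare cases.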

Typical page views

I would like a sentence in Wikipedia:Statistics#Page views that says something like "As of 2024, the median page views for an article is 6 per day, with n% of articles falling into the range of 3 to 50 page views per day" (where 50% ≤ n ≤ 90%, to show the middle range). I think a sample from 10,000 articles (e.g., User:BilledMammal/Average articles) or even just 1,000 would be sufficient.

I have found Wikipedia:Does Wikipedia traffic obey Zipf's law? but it's based on very old data. WhatamIdoing (talk) 18:41, 10 September 2024 (UTC)

Wikipedia:Request a query/Archive 3#Shortdesc query looks similar. WhatamIdoing (talk) 19:04, 10 September 2024 (UTC)
View counts aren't exposed in the public database replicas AFAIK, so this isn't possible other than by hitting the API once for each page. —Cryptic 19:24, 10 September 2024 (UTC)
You can try plugging quarry query 80241 or pagepile 60602 - they're the same sample of 10000 pages - into https://pageviews.wmcloud.org/massviews, but it only gave me data on about 3/4 of them and a ton of errors on the rest. —Cryptic 19:45, 10 September 2024 (UTC)
Thanks. I was able to get the pageviews by running it in two batches. I'm looking at the numbers now. WhatamIdoing (talk) 20:23, 10 September 2024 (UTC)
After excluding ~200 pages with 0 reported page views from 2023 (because they might have been created during 2024), I find: mean of 5,725 page views per year, median of 55 page views per year, mode of 1. 25% of articles get 3 page views per year(!) or less. 75% of articles get 570 page views per year or less. Almost exactly 10% of articles average 10+ page views per day. WhatamIdoing (talk) 21:05, 10 September 2024 (UTC)
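
For anyone wanting to repeat this kind of summary on a fresh export, a minimal sketch with Python's standard statistics module might look like the following. The function name, the dictionary keys and the ten sample numbers are all invented for illustration; they are not the data behind the figures above, and the quartile method used there is unknown.

```python
import statistics


def summarize(yearly_views: list[int], daily_threshold: float = 10.0) -> dict:
    """Summarise yearly page views, ignoring pages with zero reported views."""
    nonzero = [v for v in yearly_views if v > 0]
    # Quartiles with the "inclusive" method; other methods give other cut points.
    q1, _median, q3 = statistics.quantiles(nonzero, n=4, method="inclusive")
    # Pages averaging at least `daily_threshold` views per day over 365 days.
    busy = sum(1 for v in nonzero if v / 365 >= daily_threshold)
    return {
        "pages": len(nonzero),
        "mean": statistics.mean(nonzero),
        "median": statistics.median(nonzero),
        "mode": statistics.mode(nonzero),
        "q1": q1,
        "q3": q3,
        "share_busy": busy / len(nonzero),
    }


# Invented yearly view counts, for illustration only.
sample = [0, 1, 1, 3, 20, 55, 90, 570, 4000, 52000]
stats = summarize(sample)
print(stats)
```

As in the real data, a few very popular pages pull the mean far above the median, which is why the median and quartiles are probably the better numbers to quote.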

Cross Reference Request

Would someone be able to cross-reference the links in the list Green Bay Packers All-Time Roster with the relevant category, to verify that every target page in that list is also in that category? The goal for me here is to make sure the links in the list are targeted to the right article. As an example, the list had Rex Smith, when it should have had Rex Smith (American football) as the correct target. Thus, this cross-reference will identify any disambiguation that needs to occur on that page for links that aren't disambiguation pages (which I have already fixed). Thank you! « Gonzo fan2007 (talk) @ 22:14, 18 September 2024 (UTC)

quarry:query/86383. —Cryptic 01:11, 19 September 2024 (UTC)
Cryptic, thank you! The query appears to be pulling the links from {{Green Bay Packers}}, which is on the page. Any way to exclude the template in the query? « Gonzo fan2007 (talk) @ 17:53, 19 September 2024 (UTC)
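
One possible way to handle this outside the query, sketched here under assumptions: export three title lists (links on the roster page, links made by the navbox template, and members of the category) and compare them as sets. The function name and the sample titles are illustrative only and are not the contents of quarry:query/86383.

```python
def links_needing_review(list_links, template_links, category_members):
    """Titles linked from the list itself that are not in the category."""
    # Drop links that come from the navbox template.
    own_links = set(list_links) - set(template_links)
    # Whatever is left and not categorised is a candidate wrong target.
    return sorted(own_links - set(category_members))


# Illustrative inputs; real ones would come from exported query results.
list_links = ["Rex Smith", "Bart Starr", "Brett Favre", "Lambeau Field"]
template_links = ["Lambeau Field"]
category_members = ["Bart Starr", "Brett Favre", "Rex Smith (American football)"]

to_review = links_needing_review(list_links, template_links, category_members)
print(to_review)
```

A caveat with this approach: a title linked both from the template and from the list body is dropped along with the template links, so it would not be checked.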