TIL: Wikidata SPARQL trick - getting item and subclasses

If you are using the Wikidata Query Service to see how data is structured in Wikidata, one query you will often want to run is the following.

Count the number of items which are an instance of a subclass of X, or an instance of X itself.

This is useful because it shows you roughly how objects are classified.

The following query answers half of the question above: it counts the items which are an instance of a subclass of X, grouped by subclass. (Replace THING with the Q-ID of the item you’re interested in.)

SELECT DISTINCT ?category ?categoryLabel (COUNT (DISTINCT ?item) AS ?count) WHERE {
  ?category wdt:P279 wd:THING .

  ?item wdt:P31 ?category .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
GROUP BY ?category ?categoryLabel
ORDER BY DESC(?count)
LIMIT 100

But we want to bind ?category to the thing itself as well as to its subclasses. Barber paradox? Who gives a damn?

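An easy but hacky way of binding ?category to the thing itself? UNION it together with a sitelink. In the query service's data, each Wikipedia article is linked to its Wikidata item by a schema:about statement, so the article URL picks out the item.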

SELECT DISTINCT ?category ?categoryLabel (COUNT (DISTINCT ?item) AS ?count) WHERE {
  { ?category wdt:P279 wd:THING . }
  UNION
  { <https://en.wikipedia.org/wiki/ARTICLE_ABOUT_THING> schema:about ?category . }

  ?item wdt:P31 ?category .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
GROUP BY ?category ?categoryLabel
ORDER BY DESC(?count)
LIMIT 100

Some helpful person might go and change the name of the Wikipedia article about the thing, so these kinds of queries might break. (But they might go edit Wikidata too. C’est la vie.) You could always find another statement where the object of the statement uniquely picks the thing out.
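
As a rough sketch of that idea, the sitelink branch of the UNION could be swapped for a statement whose value belongs only to the thing. PROPERTY and UNIQUE_VALUE here are placeholders in the same spirit as THING, not real identifiers; an external identifier property with a string value is assumed.

```sparql
  { ?category wdt:P279 wd:THING . }
  UNION
  # PROPERTY and "UNIQUE_VALUE" are placeholders: pick a property
  # (for example an external identifier) whose value matches only the thing.
  { ?category wdt:PROPERTY "UNIQUE_VALUE" . }
```

The rest of the query stays the same. This only works as intended if no other item carries the same value for that property.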

Alternatively, you could use BIND and VALUES and subqueries and CONSTRUCT, but I would suggest this method has significantly lower cognitive overhead.
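
For comparison, here is roughly what a VALUES variant might look like. It is only a sketch: it assumes the same THING placeholder as above and replaces the sitelink branch with an inline VALUES block that binds ?category directly to the item.

```sparql
SELECT ?category ?categoryLabel (COUNT(DISTINCT ?item) AS ?count) WHERE {
  # Branch 1: direct subclasses of the thing.
  { ?category wdt:P279 wd:THING . }
  UNION
  # Branch 2: the thing itself, bound inline instead of via a sitelink.
  { VALUES ?category { wd:THING } }

  ?item wdt:P31 ?category .

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
GROUP BY ?category ?categoryLabel
ORDER BY DESC(?count)
LIMIT 100
```

This version does not depend on the Wikipedia article title, at the cost of one more SPARQL keyword to remember.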