DS Catalog:SPARQL Query Service/example queries
Revision as of 15:55, 29 January 2024
Using the SPARQL Query Service for the DS Catalog
This page provides basic example queries for exploring the DS Wikibase using SPARQL, a query language designed for RDF-encoded linked datasets. Familiarity with the properties used in the DS Data Model is helpful for understanding how the queries operate, but the queries also contain comments (marked with the hash character "#") that walk through each step, so users can see how a query is constructed to derive a solution.
Manuscripts and DS Records
In redeveloping the DS data model, the project team made an explicit choice to differentiate between the metadata description (the DS Record) and the manuscript object (Manuscript). Although separate, the data model links the DS Record to the Manuscript, such that a DS Record contains data about the manuscript object from institutional records that provide metadata about the object itself.
The decision to separate but link metadata descriptions from their manuscript objects was purposeful so as not to make any direct claims or assertions about the manuscript object other than its existence (which happens through assignment of a unique persistent identifier: the DS ID). In this way, the DS Record is conceptualized as a document which makes statements about a manuscript object which are not inherent to the manuscript object itself and can be revised at any time. Although the DS data model is designed to have only one DS Record linked to a Manuscript, this conceptualization of descriptive documents as separate from described objects potentially allows many different (and potentially competing) descriptions to be linked to the same object simultaneously.
Because of this data structure, users may find that SPARQL queries seem circuitous at first compared with traditional library catalogs or search interfaces (like the one for the DS Catalog). This is because graph databases like the DS Wikibase are queried by pattern matching on particular entities (items) and the relationships between them: a machine rapidly traverses the graph, finding patterns that match the path indicated by the query. For purposes of querying DS data, that means that seemingly disparate elements of DS Records, Manuscripts, and even Holding Information (i.e., information about and assigned by the institution that owns and/or contributes data about a manuscript object) may all need to be invoked in a single constructed query in order to get solutions to seemingly simple questions (such as which institutions own items with texts authored by Avicenna). Taking some time to understand the items, properties, and linking structures in the DS data model and its substantiation in the DS Wikibase will help to elucidate how queries of this nature can be constructed.
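As a minimal sketch of this kind of pattern matching, the query below asks for every pair of entities connected by one property. The property wdt:P16 is borrowed from the prefix example further down this page; what it represents in the DS data model is not stated here, so treat it as a placeholder for whichever property you need. Words beginning with ? are variables, and the query service returns each combination of values that makes the pattern match.

```sparql
PREFIX wdt: <https://catalog.digital-scriptorium.org/prop/direct/>

# Return pairs of entities joined by the placeholder property wdt:P16.
SELECT ?subject ?object
WHERE {
  # One triple pattern: subject, property, object.
  # ?subject and ?object are variables filled in by the query service.
  ?subject wdt:P16 ?object .
}
# Keep the result list short while exploring.
LIMIT 10
```

Longer questions are answered the same way: each additional triple pattern inside WHERE adds another step along the path through the graph.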
To help users better understand how queries are constructed, the example queries found below provide comments (which are preceded by the # character) to explain how clauses and asserted triple patterns function in the context of a query. We hope that working through some of these examples will allow users to construct their own more complex queries as they learn more about how the DS data model operates in concert with their research questions.
Prefix Declarations
Why are they used?
Prefix declarations made at the beginning of a SPARQL query tell you which namespaces (ontologies, data models, or other specifications) will be used by the query to construct its triples. Rather than writing out a long URI every time an entity is referenced, you declare a prefix once and then use it as an abbreviation for that URI throughout the rest of the query.
For instance, by declaring the following prefixes at the beginning of the query,
PREFIX wd: <https://catalog.digital-scriptorium.org/entity/>
PREFIX wdt: <https://catalog.digital-scriptorium.org/prop/direct/>
instead of having to type out
<https://catalog.digital-scriptorium.org/entity/Q88> <https://catalog.digital-scriptorium.org/prop/direct/P16> <https://catalog.digital-scriptorium.org/entity/Q13> .
you can simply write
wd:Q88 wdt:P16 wd:Q13 .
As you can see, the Q and P values are appended to the end of the base URIs, so that you only need to know the prefix (e.g., wd, wdt) and the appropriate Q or P number to construct the triple pattern you want to use. This makes SPARQL queries much more readable and editable by human beings.
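To see the abbreviated triple in a complete query, one option is an ASK query, which reports only whether a pattern exists in the data. The sketch below wraps the same triple used above; it makes no assumption about what Q88, P16, or Q13 stand for, and the answer depends on the current contents of the DS Wikibase.

```sparql
PREFIX wd: <https://catalog.digital-scriptorium.org/entity/>
PREFIX wdt: <https://catalog.digital-scriptorium.org/prop/direct/>

# ASK returns true if the pattern matches at least once, otherwise false.
ASK {
  # Same triple as above, written with prefixes instead of full URIs.
  wd:Q88 wdt:P16 wd:Q13 .
}
```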
Which prefix declarations will I need to use to query the DS Wikibase?
The following prefix declarations should be at the beginning of any SPARQL query made at the DS Wikibase Query Service endpoint.
PREFIX wd: <https://catalog.digital-scriptorium.org/entity/>
PREFIX wds: <https://catalog.digital-scriptorium.org/entity/statement/>
PREFIX wdv: <https://catalog.digital-scriptorium.org/value/>
PREFIX wdt: <https://catalog.digital-scriptorium.org/prop/direct/>
PREFIX p: <https://catalog.digital-scriptorium.org/prop/>
PREFIX ps: <https://catalog.digital-scriptorium.org/prop/statement/>
PREFIX pq: <https://catalog.digital-scriptorium.org/prop/qualifier/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
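The sketch below illustrates how several of these prefixes work together. It assumes the DS Wikibase follows the standard Wikibase RDF layout, in which p: leads from an item to a statement node and ps: leads from that node to the statement's main value. It also assumes that English labels are available. The identifiers Q88 and P16 are reused from the prefix example above purely as placeholders.

```sparql
PREFIX wd: <https://catalog.digital-scriptorium.org/entity/>
PREFIX p: <https://catalog.digital-scriptorium.org/prop/>
PREFIX ps: <https://catalog.digital-scriptorium.org/prop/statement/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?statement ?value ?valueLabel
WHERE {
  # p: links the item to a statement node.
  wd:Q88 p:P16 ?statement .
  # ps: links the statement node to its main value.
  ?statement ps:P16 ?value .
  # The label is optional, so values without an English label still appear.
  OPTIONAL {
    ?value rdfs:label ?valueLabel .
    FILTER(LANG(?valueLabel) = "en")
  }
}
LIMIT 10
```

When only the main value is needed, the shorter wdt: form reaches it in a single step. The p: and ps: path is useful when a query also has to inspect the statement itself, for example its qualifiers through pq:.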
Basic Example Queries
Below is a taxonomy of two types of basic queries based on whether the records in the DS Catalog are described by a particular data element in general (e.g., have any author, any assigned genre, or any place of production) or whether the records meet certain criteria (e.g., were produced by a specific author, were assigned a specific genre, or identified as produced in a particular place). The following queries were originally developed by L.P. Coladangelo (DS Catalog and Data Manager) for prototype testing, and adapted by LEADING Fellows Mace Jones and Jade Snelling as part of their fellowship research.
All manuscripts and their DS records
These queries will return lists of manuscript records and the associated data values, including both the string value as recorded in the original catalog record (the as_recorded value) and the authority value from a Linked Open Vocabulary to which the as_recorded value has been linked (the authority value). You should expect to see a list of all records and manuscripts in the DS Catalog which have values for the below data types.
Find all DS records describing manuscripts by their...
Artists
Authors
Centuries of Production
Dates of Production
Dated status
Former Owners
Genres
Holding Institutions
Languages
Materials
Other associated names/agents
Places of Production
Scribes
Subjects
Titles
Specific manuscripts and their DS records
These queries will return lists of manuscript records based on or limited by a specific value from an associated DS authority record, including both the string value as recorded in the original catalog record (the as_recorded value) and the authority value from a Linked Open Vocabulary to which the as_recorded value has been linked (the authority value). You should expect to see a list of all records and manuscripts in the DS Catalog which meet the conditions of having a specific value for the below data types.
Find all DS records describing manuscripts by a specific...
Artist
Author
Century of Production
Date of Production
- Start date
- End date
- Date range
- Inside date range
- Outside date range
Dated status
- Dated
- Non-dated
Former Owner
Genre
Holding Institution
Language
Material
Other associated name/agent
Place of Production
Scribe
Subject
Title
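The difference between the two kinds of query comes down to whether the value position in a triple pattern is left as a variable (any value) or pinned to a particular item (a specific value). The following is a hedged sketch of the second case, not one of the queries listed above; wdt:P16 and wd:Q13 are placeholders taken from the prefix example on this page and should be replaced with the property and authority item you are interested in.

```sparql
PREFIX wd: <https://catalog.digital-scriptorium.org/entity/>
PREFIX wdt: <https://catalog.digital-scriptorium.org/prop/direct/>

SELECT ?subject ?target
WHERE {
  # VALUES pins ?target to the listed item(s); add more items to widen the search.
  VALUES ?target { wd:Q13 }
  # Only subjects linked to one of the listed items will match.
  ?subject wdt:P16 ?target .
}
LIMIT 10
```

Removing the VALUES line turns this back into the general form, in which ?target matches any value.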
User generated examples
TBD
Technical Queries
Authority Record Generator
This query generates a list of authority records in the Wikibase by authority value type (i.e., all items which are an instance of a particular Authority Type).
Dated Classification Generator
This query generates a list of manuscript items which have and have not been classified as dated.