Querying the CSDMS model repository
Semantic MediaWiki (SMW)
is the knowledge management system used on the CSDMS website.
SMW has an API with several actions,
allowing users to add, edit, and query information.
Here, we'll focus on the ask action,
and the Ask API,
to query metadata from the CSDMS model metadata repository.
The base URL for any call to the SMW API on the CSDMS website is https://csdms.colorado.edu/csdms_wiki/api.php.
Query syntax
The ask action supports one parameter, query,
which takes a urlencoded string.
The query is written in the SMW query language.
A query consists of a series of conditions,
which describe the search.
Conditions are built from properties and values.
For example, the condition
[[Programming language::C]]
would query for all models with the Programming language property
that have a value of C.
Note that the colons :: in the condition
are literal in the query language,
and cannot be urlencoded.
Spaces, however, should be encoded with %20 or +,
while brackets [] may optionally be encoded.
Try this condition in a query:
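As a minimal sketch, the call can be assembled in Python with the standard library. The helper names build_ask_url and ask are hypothetical, chosen for illustration; the URL layout follows the base URL given above plus the action, query, and format parameters. Colons and brackets are left literal, and spaces are encoded as +.

```python
import json
from urllib.parse import quote_plus, urlencode
from urllib.request import urlopen

API_URL = "https://csdms.colorado.edu/csdms_wiki/api.php"


def build_ask_url(query, fmt="json"):
    """Hypothetical helper: return an Ask API URL for an SMW query string."""
    params = {"action": "ask", "query": query, "format": fmt}
    # Keep brackets and colons literal; quote_plus turns spaces into "+".
    return API_URL + "?" + urlencode(params, safe="[]:", quote_via=quote_plus)


def ask(query):
    """Hypothetical helper: send the query and decode the JSON reply.

    Requires network access; not called in this sketch.
    """
    with urlopen(build_ask_url(query)) as response:
        return json.load(response)


url = build_ask_url("[[Programming language::C]]")
print(url)
```

Pasting the printed URL into a browser, or calling ask() with the same condition, should run the query.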
The results of a query are returned as JSON
with a specified format.
A query result can also be viewed in pretty print form
by changing the value of the format parameter to jsonfm.
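If a query URL is already in hand, one way to switch it to the pretty print form is to rewrite only its format parameter. The function name with_format below is hypothetical, and the starting URL is an assumed example built from the condition above.

```python
from urllib.parse import parse_qsl, quote_plus, urlencode, urlsplit, urlunsplit


def with_format(url, fmt):
    """Hypothetical helper: return the same API URL with a different format value."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))  # decodes "+" back to spaces
    params["format"] = fmt
    # Re-encode the same way: brackets and colons literal, spaces as "+".
    query = urlencode(params, safe="[]:", quote_via=quote_plus)
    return urlunsplit(parts._replace(query=query))


json_url = (
    "https://csdms.colorado.edu/csdms_wiki/api.php"
    "?action=ask&query=[[Programming+language::C]]&format=json"
)
pretty_url = with_format(json_url, "jsonfm")
print(pretty_url)
```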
Properties
Properties are the basic data type of SMW. They consist of a name and a value, both of which are case-sensitive.
A defined set of properties is added to each model
by the CSDMS WikiSysop.
For example,
Programming language
is a property of models
in the CSDMS model metadata repository.
Note: I'd like a query that returns all the properties of a model, but I haven't figured out how to write one. It's on my list of unanswered questions below. In lieu of a programmatic query, I've been looking at the model's wiki source; for example, the Wikitext for HydroTrend.
Categories
Categories are tags added to a page by the CSDMS WikiSysop to aid in classification. Like properties, categories can be queried. For example, the condition
[[Category:Terrestrial]]
will list all terrestrial models in the CSDMS model metadata repository.
Unlike with properties,
only one colon : separates the category name and value.
Model is itself a category in the CSDMS wiki.
Search for a particular model by name:
[[Model:HydroTrend]]
The category value is case-sensitive;
e.g., hydrotrend wouldn't match a model.
Here's this condition in a query:
https://csdms.colorado.edu/csdms_wiki/api.php?action=ask&query=[[Model:HydroTrend]]&format=json
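The one-colon versus two-colon distinction is easy to get wrong when building conditions by hand, so it can help to wrap each form in a small function. This is only a sketch; the names property_condition, category_condition, and build_ask_url are hypothetical.

```python
from urllib.parse import quote_plus, urlencode

API_URL = "https://csdms.colorado.edu/csdms_wiki/api.php"


def property_condition(name, value):
    """Hypothetical helper: properties use two colons between name and value."""
    return f"[[{name}::{value}]]"


def category_condition(name, value):
    """Hypothetical helper: categories use a single colon."""
    return f"[[{name}:{value}]]"


def build_ask_url(query):
    """Hypothetical helper: return an Ask API URL with JSON output."""
    params = {"action": "ask", "query": query, "format": "json"}
    return API_URL + "?" + urlencode(params, safe="[]:", quote_via=quote_plus)


terrestrial = category_condition("Category", "Terrestrial")
hydrotrend = category_condition("Model", "HydroTrend")  # value is case-sensitive
written_in_c = property_condition("Programming language", "C")

print(build_ask_url(terrestrial))
print(build_ask_url(hydrotrend))
```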
Model keywords
Model keywords are defined not by SMW or the CSDMS WikiSysop, but by the developer of a model, so they may vary from model to model. For example, the condition
[[Model keywords::basin]]
can be used to find all models that contain the keyword basin.
Use this condition in a query:
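A sketch of the corresponding call, this time encoding the space in the property name as %20 rather than + (both are accepted, as noted under Query syntax). The name build_ask_url is a hypothetical helper, not part of any library.

```python
from urllib.parse import quote, urlencode

API_URL = "https://csdms.colorado.edu/csdms_wiki/api.php"


def build_ask_url(query):
    """Hypothetical helper: Ask API URL with spaces percent-encoded as %20."""
    params = {"action": "ask", "query": query, "format": "json"}
    # quote (unlike quote_plus) writes spaces as %20; brackets and colons stay literal.
    return API_URL + "?" + urlencode(params, safe="[]:", quote_via=quote)


keyword_url = build_ask_url("[[Model keywords::basin]]")
print(keyword_url)
```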
Advanced queries
The Ask API supports a number of advanced query options.
Limiting the displayed results
By default,
only the first 10 matches to a query are returned.
To raise this limit,
set the limit display property to a larger number.
For example, applying this to the earlier example,
[[Programming language::C]]|limit=10000
we see that there are (at the time of writing this article)
actually 100 models written in C.
Note the use of the pipe character |
to set off the display property from the condition.
Here's the query:
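In Python, the display property can be appended to the condition before encoding. This is a minimal sketch: with_limit and build_ask_url are hypothetical names, and the pipe and equals sign are deliberately left unencoded so the URL reads like the query string above.

```python
from urllib.parse import quote_plus, urlencode

API_URL = "https://csdms.colorado.edu/csdms_wiki/api.php"


def with_limit(condition, limit):
    """Hypothetical helper: append the limit display property to a condition."""
    return f"{condition}|limit={limit}"


def build_ask_url(query):
    """Hypothetical helper: Ask API URL, keeping []:|= literal in the query."""
    params = {"action": "ask", "query": query, "format": "json"}
    return API_URL + "?" + urlencode(params, safe="[]:|=", quote_via=quote_plus)


query = with_limit("[[Programming language::C]]", 10000)
limit_url = build_ask_url(query)
print(limit_url)
```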
Combining conditions
Conditions listed in sequence are combined with a logical AND.
For example,
the two conditions
[[Programming language::C++]] [[Last name::Tucker]]
can be combined into a single query with:
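One way to sketch this in Python is to build each condition from a (property, value) pair and join them before encoding. The helper names all_of and build_ask_url are hypothetical. The standard library's quote_plus handles both encodings at once: spaces become + and the plus signs in C++ become %2B.

```python
from urllib.parse import quote_plus, urlencode

API_URL = "https://csdms.colorado.edu/csdms_wiki/api.php"


def all_of(pairs):
    """Hypothetical helper: list conditions one after another (logical AND)."""
    return " ".join(f"[[{name}::{value}]]" for name, value in pairs)


def build_ask_url(query):
    """Hypothetical helper: Ask API URL with brackets and colons left literal."""
    params = {"action": "ask", "query": query, "format": "json"}
    return API_URL + "?" + urlencode(params, safe="[]:", quote_via=quote_plus)


query = all_of([("Programming language", "C++"), ("Last name", "Tucker")])
and_url = build_ask_url(query)
print(and_url)
```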
Note that spaces in the properties need to be urlencoded
(here, with +),
as well as the plus signs in C++
(here, with %2B)!
Conditions can support multiple values
combined with a logical OR operation
using the double pipe || operator.
For example,
to list models written in either Fortran 77 or Fortran 90,
use the condition
[[Programming language::Fortran77||Fortran90]]
In a query, this is:
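A sketch of how such a disjunction might be generated from a list of values in Python. The names any_of and build_ask_url are hypothetical helpers; the pipes are left unencoded so the URL mirrors the condition.

```python
from urllib.parse import quote_plus, urlencode

API_URL = "https://csdms.colorado.edu/csdms_wiki/api.php"


def any_of(name, values):
    """Hypothetical helper: one condition matching any of the values (logical OR)."""
    return f"[[{name}::{'||'.join(values)}]]"


def build_ask_url(query):
    """Hypothetical helper: Ask API URL, keeping []:| literal in the query."""
    params = {"action": "ask", "query": query, "format": "json"}
    return API_URL + "?" + urlencode(params, safe="[]:|", quote_via=quote_plus)


query = any_of("Programming language", ["Fortran77", "Fortran90"])
or_url = build_ask_url(query)
print(or_url)
```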
See Help:Selecting_pages for more examples of disjunctions and comparisons of conditionals.
Specifying additional data
Additional data can be returned with a query result
by specifying additional properties in the query string.
Separate additional properties with the pipe and question mark characters |?.
For example,
to find all models written by the user with the last name "Hutton",
and also include, if available,
the DOI and the source code repository for each model found,
use the query string:
[[Last+name::Hutton]]|?DOI+model|?Source+web+address
The API call is:
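As a rough sketch, the query string and the call can be generated in Python from a condition and a list of extra properties. The helper names with_printouts and build_ask_url are hypothetical; property names are written with plain spaces and encoded to + when the URL is built.

```python
from urllib.parse import quote_plus, urlencode

API_URL = "https://csdms.colorado.edu/csdms_wiki/api.php"


def with_printouts(condition, properties):
    """Hypothetical helper: append each extra property with a leading |?."""
    return condition + "".join(f"|?{name}" for name in properties)


def build_ask_url(query):
    """Hypothetical helper: Ask API URL, keeping []:|? literal in the query."""
    params = {"action": "ask", "query": query, "format": "json"}
    return API_URL + "?" + urlencode(params, safe="[]:|?", quote_via=quote_plus)


query = with_printouts("[[Last name::Hutton]]", ["DOI model", "Source web address"])
hutton_url = build_ask_url(query)
print(hutton_url)
```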
See Help:Inline_queries for more information on building query strings with several properties.
Testing queries
Test queries with the Special:Ask page on the CSDMS portal:
In addition to interactively running queries,
the Special:Ask page shows the raw query string,
which can be helpful for building new queries programmatically.
Examples of queries
Here are some examples of queries into the CSDMS model repository.
Python examples of these queries (and others) can be found in the GitHub repository https://github.com/csdms/ask-api-examples.
Unanswered questions
- How does one get a list of all model properties used in the CSDMS wiki?
- How can one show the data for all the properties of a particular model?
Additional references
- Properties: https://www.semantic-mediawiki.org/wiki/Help:Properties_and_types
- Results formats: https://www.semantic-mediawiki.org/wiki/Help:Result_formats
- Categories: https://meta.wikimedia.org/wiki/Help:Category