Custom search
All of the search examples so far in this tutorial have used MarkLogic's default query options (interchangeably called "search options"). This may suffice for some basic applications, but most of the time you will end up wanting to provide custom options. Custom options let you do things like:
- define named constraints, which can be used in string queries, such as "tag" in "flower tag:shakespeare"
- enable analytics and faceting by identifying lexicons and range indexes from which to retrieve values
- extend or alter the default search grammar
- customize the structure of the search results, including snippeting and default pagination
- control search options such as case sensitivity and ordering
Options are grouped into named option sets on your REST API server. You can customize these either by updating the default option set, or by creating a new named option set.
Get a list of the server's option sets
To see a list of all your server's option sets, make a GET request to the /config/query endpoint:
- http://localhost:8011/v1/config/query?format=xml (XML)
- http://localhost:8011/v1/config/query?format=json (JSON)
If you haven't added any custom options yet, then you'll see just one option set—the "default" option set. Here's what the response looks like:
Whenever you run a search without explicitly specifying an option set (using the options request parameter), this is the option set that will be in effect.
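As a minimal sketch, the listing URLs above can be built programmatically. The helper name `option_sets_url` is hypothetical; the host, port, and `format` parameter are the ones used in this tutorial.

```python
from urllib.parse import urlencode

# Host and port used throughout this tutorial; adjust for your own server.
BASE_URL = "http://localhost:8011/v1"


def option_sets_url(fmt=None):
    """Build the URL that lists the server's option sets (hypothetical helper)."""
    url = f"{BASE_URL}/config/query"
    if fmt:
        # Only add the query string when a format was requested.
        url += "?" + urlencode({"format": fmt})
    return url


print(option_sets_url("xml"))
print(option_sets_url("json"))
```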
Upload custom search options
Only users with the "rest-admin" role can update option sets. Until now, all the examples in this tutorial have used the "rest-writer" user to connect to MarkLogic. Now, whenever you need to update options, you'll connect with the "rest-admin" user instead.
Let's start by building a constraint option. "Constraint" means something very specific in MarkLogic. Whenever a user types a phrase of the form name:text in their search string, they're using a constraint (assuming one has been defined for them). For example, they might type "author:melville" to constrain their search to documents with an author element or property with the value "melville", as opposed to a search for the term anywhere. But for this to have the intended behavior, a constraint named "author" must first be defined in the server's query options. For this tutorial, you're going to define a constraint that enables users to type things like "tag:shakespeare" and "tag:mlw12".
To create or replace an entire option set, use the PUT method against /config/query/yourOptionsName:
The above command connects as the rest-admin user and uploads a JSON-based options configuration named "tutorial". The option set defines one constraint named "tag":
There are a number of different kinds of constraints. In this case, you're using a "collection constraint". The "prefix" field is an optional collection tag prefix, which would be handy if you wanted to power multiple constraints via collection tags such as "author/shakespeare" and "state/california" using the prefixes "author/" and "state/", respectively. In this case, you're not doing that; you just want to constrain by the whole collection tag, so you pass an empty prefix ("").
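The following sketch builds, without sending, a PUT request of the kind described above. The JSON layout (an `options` object holding a `constraint` list, with `collection` and `prefix` keys) is an assumption based on the description in this section, so check it against the query options reference for your MarkLogic version. Authentication as the rest-admin user is deliberately left out, and the helper names are hypothetical.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8011/v1"


def collection_constraint(name, prefix=""):
    # Assumed JSON shape for a collection constraint (not confirmed by the text).
    return {"name": name, "collection": {"prefix": prefix}}


def put_options_request(options_name, constraints):
    """Build (but do not send) a PUT that replaces a whole option set."""
    body = json.dumps({"options": {"constraint": constraints}}).encode("utf-8")
    return Request(
        f"{BASE_URL}/config/query/{options_name}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


req = put_options_request("tutorial", [collection_constraint("tag")])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

With a non-empty prefix such as "author/", the same helper would describe a constraint where "author:shakespeare" maps to the collection tag "author/shakespeare".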
You can see what the stored options look like by retrieving the newly-created option set: http://localhost:8011/v1/config/query/tutorial. Add the "format=json" parameter to see the options in that format.
For complete details on what structures are allowed in both the XML and JSON representations of query options, see:
Confirm that two option sets are now available by getting the list again:
- http://localhost:8011/v1/config/query?format=xml (XML)
- http://localhost:8011/v1/config/query?format=json (JSON)
You should now see two option sets: default and tutorial.
Search using a collection constraint
Now you'll make use of the new configuration and run a search using the "tag" constraint. To do that, call /search with the options parameter:
- http://localhost:8011/v1/search?q=flower+tag:shakespeare&options=tutorial&format=xml (XML)
- http://localhost:8011/v1/search?q=flower+tag:shakespeare&options=tutorial&format=json (JSON)
The above query searches for occurrences of "flower" in the "shakespeare" collection.
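A small sketch of how such a search URL could be assembled in code, so that the query string is always encoded correctly. The helper name `search_url` is hypothetical; the `q`, `options`, and `format` parameters are the ones shown above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://localhost:8011/v1"


def search_url(q, options=None, fmt=None):
    """Build a /search URL for a string query (hypothetical helper)."""
    params = {"q": q}
    if options:
        params["options"] = options  # name of the option set to apply
    if fmt:
        params["format"] = fmt  # "xml" or "json"
    return f"{BASE_URL}/search?" + urlencode(params)


url = search_url("flower tag:shakespeare", options="tutorial", fmt="json")
print(url)

# Decoding the query string recovers the original search text.
print(parse_qs(urlsplit(url).query)["q"][0])
```

`urlencode` percent-encodes the colon as `%3A`; the server decodes it back to the same string query as the hand-written URLs above.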
Search using a JSON key value constraint
The rest of the examples in this section include two steps:
- Update the server configuration
- Run a query making use of the updated configuration
You're going to keep using the "tutorial" option set, but rather than replacing it each time with PUT, you're going to add to it incrementally with POST, which appends to the server's options. Run the following command:
View the options to confirm that you've appended to (rather than replaced) them:
- http://localhost:8011/v1/config/query/tutorial (XML)
- http://localhost:8011/v1/config/query/tutorial?format=json (JSON)
The command you ran above defined a JSON key value constraint called "company", backed by the "affiliation" JSON key. You can define these options using either JSON or XML:
Since this is a value constraint, the searched-for value must match the affiliation exactly.
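To make the replace-versus-append distinction concrete, here is a small in-memory model of the two operations as this section describes them. It does not talk to MarkLogic at all; the `store` dictionary, the function names, and the constraint shapes (including the `json-key` key) are assumptions made only for illustration.

```python
import copy


def put_options(store, name, options):
    """PUT semantics: replace the named option set wholesale."""
    store[name] = copy.deepcopy(options)


def post_options(store, name, options):
    """POST semantics as described here: append to the named option set."""
    existing = store.setdefault(name, {})
    existing.setdefault("constraint", []).extend(
        copy.deepcopy(options.get("constraint", []))
    )


store = {}
# Assumed constraint shapes, for illustration only.
tag = {"name": "tag", "collection": {"prefix": ""}}
company = {"name": "company", "value": {"json-key": "affiliation"}}

put_options(store, "tutorial", {"constraint": [tag]})
post_options(store, "tutorial", {"constraint": [company]})
names_after_post = [c["name"] for c in store["tutorial"]["constraint"]]
print(names_after_post)

# A second PUT discards everything that was appended.
put_options(store, "tutorial", {"constraint": [company]})
names_after_put = [c["name"] for c in store["tutorial"]["constraint"]]
print(names_after_put)
```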
Run the following search to find all conference talks given by MarkLogic employees and mentioning the word "engineer", making use of our newly-defined "company" constraint:
- http://localhost:8011/v1/search?q=engineer+company:marklogic&options=tutorial (XML)
- http://localhost:8011/v1/search?q=engineer+company:marklogic&options=tutorial&format=json (JSON)
Search using an element value constraint
Run the following command:
Here we're defining another value constraint but against an element this time (<PERSONA>) instead of a JSON key. Here's the representation of the constraint:
Now you can search for the King of France directly in your query text, using the new "person" constraint:
- http://localhost:8011/v1/search?q=person:%22king+of+france%22&options=tutorial (XML)
- http://localhost:8011/v1/search?q=person:%22king+of+france%22&options=tutorial&format=json (JSON)
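The `%22` sequences in the URLs above are URL-encoded double quotes, which keep the multi-word value together as one phrase. A minimal sketch of producing such a term in code follows; the helper name `constraint_term` is hypothetical.

```python
from urllib.parse import quote_plus, unquote_plus


def constraint_term(name, value):
    """Format name:value, quoting the value when it contains whitespace."""
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"{name}:{value}"


q = constraint_term("person", "king of france")
encoded = quote_plus(q)
print(q)
print(encoded)
```

`quote_plus` also encodes the colon (as `%3A`), which decodes to the same query text as the hand-written URLs.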
Search using a JSON key word constraint
Run the following command:
Here, instead of a value constraint, we're using a word constraint scoped within all JSON "bio" keys:
Unlike a value constraint (which tests for the value of the key or element), a word constraint uses normal search-engine semantics. The search will succeed if the word is found anywhere in the given context. Also, it uses stemming, which means that matching words will include equivalent forms: "strategies" and "strategy", "run" and "ran", etc.
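The difference between value and word semantics can be illustrated with a toy matcher. This is only a conceptual sketch, not how MarkLogic evaluates constraints: it ignores case (an assumption) and does not model stemming, so "strategies" would not match "strategy" here. The sample strings are made up.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def value_match(field_value, query):
    """Value semantics: the query must equal the whole field value."""
    return tokenize(field_value) == tokenize(query)


def word_match(field_value, query):
    """Word semantics: the query words must appear, in order, somewhere in the field."""
    words, q = tokenize(field_value), tokenize(query)
    return any(words[i:i + len(q)] == q for i in range(len(words) - len(q) + 1))


bio = "Leads product strategy for the search team"  # made-up sample text
print(word_match(bio, "strategy"))    # word found inside the field
print(value_match(bio, "strategy"))   # field is not exactly "strategy"
print(value_match("MarkLogic", "marklogic"))
```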
Now let's use the "bio" constraint to find all bios mentioning "strategy":
- http://localhost:8011/v1/search?q=bio:strategy&options=tutorial&format=xml (XML)
- http://localhost:8011/v1/search?q=bio:strategy&options=tutorial&format=json (JSON)
Search using an element word constraint
Run the following command:
This time you're using a word constraint against the <STAGEDIR> element:
Now we can find all the Shakespeare plays where, for example, swords are involved on stage:
- http://localhost:8011/v1/search?q=stagedir:sword&options=tutorial&format=xml (XML)
- http://localhost:8011/v1/search?q=stagedir:sword&options=tutorial&format=json (JSON)
Search using an element constraint
Run the following command:
Here we're defining an element constraint:
An element constraint is similar to a word constraint, except that it will match words in the element and any of its descendants. For example, it will match text in <LINE> element children of <SPEECH>. This is useful for searching documents that contain "mixed content" (i.e. text mixed with markup, such as <em> and <strong>).
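The sketch below illustrates the "element and any of its descendants" distinction using Python's standard XML library. It only mimics the scoping described above and is not how MarkLogic indexes content; the sample markup is invented for the example, with `<em>` added to show mixed content.

```python
import xml.etree.ElementTree as ET

# Invented sample markup; the element names follow the Shakespeare examples.
scene = ET.fromstring(
    "<SCENE>"
    "<STAGEDIR>Enter GRUMIO with a sword</STAGEDIR>"
    "<SPEECH><SPEAKER>GRUMIO</SPEAKER>"
    "<LINE>Fie, fie on all <em>tired</em> jades</LINE></SPEECH>"
    "</SCENE>"
)


def direct_text(elem):
    """Text that is an immediate child of elem, skipping nested elements."""
    return "".join([elem.text or ""] + [child.tail or "" for child in elem])


def descendant_text(elem):
    """Text of elem and all of its descendants."""
    return "".join(elem.itertext())


speech = scene.find("SPEECH")
line = speech.find("LINE")

print("tired" in direct_text(line))        # word sits inside <em>, not directly in <LINE>
print("tired" in descendant_text(speech))  # found via the <LINE> and <em> descendants
print("sword" in descendant_text(speech))  # the stage direction is outside <SPEECH>
```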
Using this constraint will restrict the search to the spoken lines of text (excluding, for example, stage directions). This will search for mentions of "sword" in the script itself:
- http://localhost:8011/v1/search?q=spoken:sword&options=tutorial&format=xml (XML)
- http://localhost:8011/v1/search?q=spoken:sword&options=tutorial&format=json (JSON)
Search using a properties constraint
We can also create a constraint for searching properties. Run the following command:
The properties constraint enables us to search an image's metadata:
Now it's easy for a user to search for photos of fish (or anything else):
- http://localhost:8011/v1/search?q=image:fish&options=tutorial&format=xml (XML)
- http://localhost:8011/v1/search?q=image:fish&options=tutorial&format=json (JSON)
Search using a structured query
We've seen how the REST API supports three kinds of queries:
- key/value queries (using the /keyvalue endpoint)
- string queries (using the /search endpoint)
- structured queries (also using the /search endpoint)
We briefly touched on a structured query for searching properties. Now we'll take a look at a richer use of it, utilizing the constraints we've defined in the "tutorial" options set.
Here we're going to build up a complex set of criteria. It will find documents:
- Matching any of (OR query):
  - Matching all of (AND query):
    - bio:product
    - company:MarkLogic
  - Matching all of (AND query):
    - spoken:fie
    - stagedir:fall
    - person:GRUMIO
  - Matching all of (AND query):
    - documents whose properties contain "fish"
    - documents in the "/images/2012/02/27/" directory
  - Matching all of (AND query):
    - documents in the "mlw2012" collection
    - documents containing the term "fun"
Complex queries like this can be expressed as search strings, but to construct and manipulate them programmatically, it can be more convenient to use a structured query. Here's the above structured query expressed in JSON format:
An or-query will find documents matching any of its child queries (union). In contrast, an and-query restricts its results to those documents matching all of its child queries (intersection). To run the query, you can include it as POST data:
Or as the value of the (URL-encoded) structuredQuery parameter in a GET request: http://localhost:8011/v1/search?options=tutorial&structuredQuery=...
Note that the search will only give you the expected results if you've previously defined the "bio", "company", "spoken", "stagedir", and "person" constraints (see previous examples in this section).
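The union and intersection behavior of or-query and and-query described above can be modeled with plain Python sets. This sketch evaluates the boolean structure locally and says nothing about the wire format of a structured query; the document URIs and the per-criterion match sets are made up for illustration.

```python
from functools import reduce


def and_query(*result_sets):
    """Intersection: documents matching all child queries."""
    return reduce(set.intersection, result_sets) if result_sets else set()


def or_query(*result_sets):
    """Union: documents matching any child query."""
    return set().union(*result_sets)


# Made-up match sets: which documents satisfy each individual criterion.
matches = {
    "bio:product": {"/talks/1.json", "/talks/2.json"},
    "company:MarkLogic": {"/talks/2.json", "/talks/3.json"},
    "collection:mlw2012": {"/talks/1.json", "/talks/3.json"},
    "term:fun": {"/talks/3.json"},
}

results = or_query(
    and_query(matches["bio:product"], matches["company:MarkLogic"]),
    and_query(matches["collection:mlw2012"], matches["term:fun"]),
)
print(sorted(results))
```

Each and-query narrows its group to the documents that satisfy every criterion, and the enclosing or-query then merges the groups.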
For more details on what structures are allowed in both the XML and JSON representations of structured queries, see Structured Search XML Node and JSON Keys in the Search Developer's Guide.
For more details on the kinds of constraints you can define, see "Constraint Options" in the Search Developer's Guide.
Understanding search results
Analytics