Custom search
All of the search examples thus far in this tutorial have used MarkLogic's default query options (interchangeably called "search options"). This may suffice for some basic applications, but most of the time you will end up wanting to provide custom options. Custom options let you do things like:
- define named constraints, which can be used in string queries, such as "tag" in "flower tag:shakespeare"
- enable analytics and faceting by identifying lexicons and range indexes from which to retrieve values
- extend or alter the default search grammar
- customize the structure of the search results, including snippeting and default pagination
- control search options such as case sensitivity and ordering
Options are grouped into named option sets on your REST API server. You can customize these either by updating the default option set, or by creating a new named option set.
Get a list of the server's option sets
Before we start manipulating option sets, let's query the list of current option sets. Open Example_23_ListOptionSets.java. We can read the list as a POJO by using a QueryOptionsListHandle:
We then call our query manager's optionsList() method to retrieve the list, storing it in our handle:
And then iterate over the Map returned by the now-populated handle's getValuesMap() method:
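As a minimal sketch of that loop, assuming the map's keys are option set names and its values are their URIs: the LinkedHashMap below is a hypothetical stand-in for the map the handle returns, so the snippet runs without a server, and the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OptionSetListSketch {

    // Formats each option set as "name: uri", one list entry per option set.
    static List<String> describe(Map<String, String> optionSets) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : optionSets.entrySet()) {
            lines.add(entry.getKey() + ": " + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the map returned by the handle's getValuesMap();
        // the real contents come from the server.
        Map<String, String> optionSets = new LinkedHashMap<>();
        optionSets.put("default", "/v1/config/query/default");
        describe(optionSets).forEach(System.out::println);
    }
}
```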
What this does is give you, the developer, a list of the available option set names you can pass to the search() method. If you don't pass a name explicitly (as in our examples so far), then the option set named "default" is used.
Since we haven't added any custom options yet, when you run this program, you should just see the "default" option set and its URI, /v1/config/query/default, which means you can view the raw options in your browser if you want:
- http://localhost:8011/v1/config/query/default (XML)
- http://localhost:8011/v1/config/query/default?format=json (JSON)
Now let's create a new set of options.
Upload custom search options
Only users with the "rest-admin" role can update option sets. Until now, all the examples in this tutorial have used the "rest-writer" user to connect to MarkLogic. From now on, whenever we need to update options, we'll connect with our "rest-admin" user instead. See how this is done in Example_24_LoadOptions.java:
To manipulate query options, we need a manager object (QueryOptionsManager), just as we need managers for other kinds of server interactions (DocumentManager for CRUD, and QueryManager for search). However, getting this manager (and other admin-related managers) takes an extra, intermediate call to newServerConfigManager():
Next, we get a QueryOptionsBuilder, which we'll use to construct individual query options:
As with other kinds of payloads we transmit, we need a handle to contain the query options (QueryOptionsHandle). We then use our options builder to initialize the options, passing them in using our handle's withConstraints() method, which is part of a fluent interface for immediately populating the handle with query options:
In this case, we're building a constraint option. Constraint means something very specific in MarkLogic. Whenever a user types a phrase of the form name:text in their search string, they're using a constraint (assuming one has been defined for them). For example, they might type "author:melville" to constrain their search to documents authored by Herman Melville. But for this to have the intended behavior, a constraint named "author" must first be defined in the server's query options. In our case, we want to enable users to type things like "tag:shakespeare" and "tag:mlw12".
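To make the name:text idea concrete, here is a small stand-alone sketch of how a search string could be split into free-text terms and constraints. It is a conceptual model only, not MarkLogic's parser: the class, its fields, and the rule that an undefined name is kept as ordinary text are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StringQuerySketch {

    final List<String> terms = new ArrayList<>();                  // free-text words
    final Map<String, String> constraints = new LinkedHashMap<>(); // name -> text

    // Splits on whitespace; a token "name:text" counts as a constraint only
    // when "name" is one of the defined constraint names.
    static StringQuerySketch parse(String query, List<String> definedNames) {
        StringQuerySketch parsed = new StringQuerySketch();
        for (String token : query.trim().split("\\s+")) {
            int colon = token.indexOf(':');
            if (colon > 0 && definedNames.contains(token.substring(0, colon))) {
                parsed.constraints.put(token.substring(0, colon), token.substring(colon + 1));
            } else if (!token.isEmpty()) {
                // Assumption of this sketch: anything else is plain search text.
                parsed.terms.add(token);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        StringQuerySketch q = parse("flower tag:shakespeare", List.of("tag"));
        System.out.println("terms=" + q.terms + " constraints=" + q.constraints);
    }
}
```

In this model, "author:melville" only becomes a constraint once "author" appears in the list of defined names, which mirrors the requirement that the constraint be defined in the query options first.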
So we must name our constraint "tag", which is the first argument passed to the constraint() option constructor:
The second argument is the constraint source. In this case, we want the constraint to be backed by our collection tags, so we call our builder's collection() method. Its argument is an optional collection tag prefix, which would be handy if we wanted to power multiple constraints via collection tags such as "author/shakespeare" and "state/california" using the prefixes "author/" and "state/", respectively. We're not doing this, so we pass an empty prefix ("").
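The effect of the prefix can be sketched in a few lines, assuming the collection that gets searched is simply the prefix followed by the text the user typed. The helper name collectionFor is hypothetical and exists only for this illustration.

```java
final class CollectionPrefixSketch {

    // A collection-backed constraint resolves the user's text to prefix + text.
    static String collectionFor(String prefix, String text) {
        return prefix + text;
    }

    public static void main(String[] args) {
        // Empty prefix, as in this tutorial: the text is the collection name itself.
        System.out.println(collectionFor("", "shakespeare"));
        // Prefixes let several constraints share one pool of collection tags.
        System.out.println(collectionFor("author/", "shakespeare"));
        System.out.println(collectionFor("state/", "california"));
    }
}
```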
Now that the options are configured, all we need to do is write them to the server, using a name of our choosing ("tutorial"):
Run the program. Then go back and re-run the previous example (Example_23_ListOptionSets.java). You should now see that two option sets are available: "default" and "tutorial".
Search using a collection constraint
Let's make use of our new configuration and run a search using our "tag" constraint. Open Example_25_ConstraintOnCollection.java. To make the new option available, we need to associate our string query with the tutorial options on the server:
This time (and from now on), we'll slim down our code by creating the handle and populating it on the same line, taking advantage of the fact that search() returns the handle it was given:
Run the program. It should yield the same results as Example_20_SearchCollection.java. The only difference is that now, the "shakespeare" collection criterion is user-supplied as part of their search string in the form of the "tag" constraint.
Search using a JSON key value constraint
Normally, the code that you use to define and upload query options should reside in a different place than the code you use to run searches. For one thing, the two tasks require different levels of access. For another, your application code is designed to be run over and over again, whereas server configuration is more of a one-time thing. However, for purposes of this tutorial, we're going to mix the two in the remaining examples. The rest of the examples will thus include two steps:
- Update the server configuration
- Run a query making use of the updated configuration
We're going to keep using the "tutorial" option set, but rather than replacing it each time, we're going to add to it. That means we'll need to fetch it, modify it, and send the updated configuration back to the server. That's exactly what Example_26_ConstraintOnJSONValue.java starts off by doing. First, we fetch the existing "tutorial" options by calling the readOptions() method:
Next, we use one of the QueryOptionsHandle's add methods to augment the options. In this case, we'll create another constraint using addConstraint():
If the "company" constraint isn't already configured, we add it, backed by the JSON key named "affiliation". In this case, we're using a value constraint, which means a searched-for value must match the affiliation exactly.
Next, we write the updated options back to the server:
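The fetch, modify, and write-back cycle can be modelled without a server. The sketch below uses an in-memory map as a hypothetical stand-in for the stored option sets; the class and its read(), write(), and addConstraintIfAbsent() methods are illustrative names, not part of the MarkLogic API.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class OptionsUpdateSketch {

    // Stand-in for the server: option set name -> (constraint name -> source).
    private final Map<String, Map<String, String>> server = new HashMap<>();

    // Returns a local copy, like reading stored options into a handle.
    Map<String, String> read(String name) {
        return new LinkedHashMap<>(server.getOrDefault(name, Map.of()));
    }

    // Replaces the whole stored option set with the given one.
    void write(String name, Map<String, String> options) {
        server.put(name, new LinkedHashMap<>(options));
    }

    // Fetch, add the constraint only if it is missing, then send everything back.
    void addConstraintIfAbsent(String name, String constraint, String source) {
        Map<String, String> options = read(name);
        options.putIfAbsent(constraint, source);
        write(name, options);
    }

    public static void main(String[] args) {
        OptionsUpdateSketch sketch = new OptionsUpdateSketch();
        sketch.write("tutorial", Map.of("tag", "collection"));
        sketch.addConstraintIfAbsent("tutorial", "company", "value of JSON key affiliation");
        System.out.println(sketch.read("tutorial").keySet());
    }
}
```

Because the write replaces the stored set as a whole, skipping the initial read would drop the existing "tag" constraint, which is why the example fetches before it modifies.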
Now we're ready to test it out.
Instead of calling the query definition's setOptionsName() method, we can also set the options when constructing the query (which we'll do from now on):
Now let's find all the MarkLogic engineers who spoke at the conference:
Run the program to see the search results. You can also see the updated query options at http://localhost:8011/v1/config/query/tutorial.
Search using an element value constraint
Open Example_27_ConstraintOnElementValue.java. Here we're defining another value constraint but against an element this time instead of a JSON key:
Now we can search for the King of France directly in our query text:
Run the program to see the results.
Search using a JSON key word constraint
Open Example_28_ConstraintOnJSONWords.java. Here, instead of a value constraint, we're using a word constraint scoped within all JSON "bio" keys:
Unlike a value constraint (which tests for the value of the key or element), a word constraint uses normal search-engine semantics. The search will succeed if the word is found anywhere in the given context. Also, it uses stemming, which means that matching words will include equivalent forms: "strategies" and "strategy", "run" and "ran", etc.
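The difference between the two kinds of matching can be sketched in plain Java. This is a simplified model under stated assumptions: the value test is an exact string comparison, the word test is case-insensitive, and stemming is deliberately left out, so "strategies" will not match "strategy" here even though the real word constraint would. The sample bio text is made up.

```java
import java.util.Arrays;
import java.util.Locale;

final class ValueVsWordSketch {

    // Value semantics: the whole stored value must equal the searched-for text.
    static boolean matchesValue(String stored, String searched) {
        return stored.equals(searched);
    }

    // Word semantics: the word may appear anywhere in the stored text.
    // Case-insensitive in this sketch; no stemming is applied.
    static boolean matchesWord(String stored, String word) {
        String wanted = word.toLowerCase(Locale.ROOT);
        return Arrays.stream(stored.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .anyMatch(wanted::equals);
    }

    public static void main(String[] args) {
        String bio = "Leads product strategy for search."; // hypothetical "bio" value
        System.out.println(matchesValue(bio, "strategy")); // false: not the whole value
        System.out.println(matchesWord(bio, "strategy"));  // true: the word occurs in the text
    }
}
```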
Now let's use the "bio" constraint in some search text:
Run the program to see the results.
Search using an element word constraint
Open Example_29_ConstraintOnElementWords.java. This time our word constraint is against the <STAGEDIR> element:
Now we can find all the Shakespeare plays where, for example, swords are involved on stage:
Run the program to see the results.
Search using an element constraint
Open Example_30_ConstraintOnElement.java. Here we're defining an element constraint:
An element constraint is similar to a word constraint, except that it will match words in the element and any of its descendants. In the above case, it will match text in <LINE> element children of <SPEECH>. This is useful for searching documents that contain "mixed content" (i.e. text mixed with markup, such as <em> and <strong>).
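The scoping difference can be illustrated with the JDK's DOM parser, assuming a word constraint looks only at text held directly by the named element while an element constraint also looks at text in its descendants. The XML fragment, including the <em> inside <LINE>, is a made-up example of mixed content, and the class and method names are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ElementScopeSketch {

    // Parses an XML string and returns its root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Text held directly by the element (its immediate text-node children only).
    static String directText(Element element) {
        StringBuilder text = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    // Text of the element and all of its descendants.
    static String descendantText(Element element) {
        return element.getTextContent();
    }

    public static void main(String[] args) {
        Element speech = parse(
            "<SPEECH><SPEAKER>HAMLET</SPEAKER><LINE>To be, or <em>not</em> to be</LINE></SPEECH>");
        System.out.println(directText(speech).contains("not"));     // false: SPEECH has no direct text
        System.out.println(descendantText(speech).contains("not")); // true: found inside LINE/em
    }
}
```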
Using this constraint will restrict the search to the spoken lines of text (excluding, for example, stage directions):
Run the program to see the result.
Search using a properties constraint
We can also create a constraint for searching properties. See how Example_31_ConstraintOnProperties.java does this to enable searching an image's metadata:
Now it's easy for a user to search for photos of fish (or anything else):
Run the program to see the list of matching image docs.
Search using a structured query
Recall that the Java API supports three kinds of queries that can be passed to search():
- key/value queries
- string queries
- structured queries
We briefly touched on a structured query in Example_18_SearchProperties.java. Now we'll take a look at a richer use of it, utilizing the constraints we've defined so far. Open up Example_32_StructuredQuery.java. We'll start by creating a StructuredQueryBuilder, associating it with our "tutorial" options:
The query builder is analogous to the options builder in that it gives us a way of building up complex object structures using nested method calls. Only this time, rather than building up options to store on the server, we're building up an actual query:
The builder's or() method constructs a query that will find documents matching any of its argument queries (union). In contrast, an and() query restricts its results to those documents matching all of its child queries (intersection). Take a look at the StructuredQueryBuilder javadocs to see what methods you can use to construct queries. Many of these (particularly the ones with "Constraint" in their names) require you to have defined options for them to be of any use.
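The union and intersection semantics can be shown with plain sets of document URIs. This sketch does not use StructuredQueryBuilder; the or() and and() helpers below are stand-ins that combine already-computed result sets, and the URIs are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

final class BooleanQuerySketch {

    // or(): documents matching any child query (union).
    @SafeVarargs
    static Set<String> or(Set<String>... childResults) {
        Set<String> union = new TreeSet<>();
        for (Set<String> result : childResults) {
            union.addAll(result);
        }
        return union;
    }

    // and(): documents matching every child query (intersection).
    @SafeVarargs
    static Set<String> and(Set<String> first, Set<String>... rest) {
        Set<String> intersection = new TreeSet<>(first);
        for (Set<String> result : rest) {
            intersection.retainAll(result);
        }
        return intersection;
    }

    public static void main(String[] args) {
        // Hypothetical URIs matched by two child queries.
        Set<String> bioMatches = Set.of("/a.json", "/b.json");
        Set<String> companyMatches = Set.of("/b.json", "/c.json");
        System.out.println(or(bioMatches, companyMatches));  // [/a.json, /b.json, /c.json]
        System.out.println(and(bioMatches, companyMatches)); // [/b.json]
    }
}
```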
To run the query, we pass it to our query manager's search() method, just as we do with string and key/value queries:
Run the program to see the results. Note that the search will only give you the expected results if you've previously defined the "bio", "company", "spoken", "stagedir", and "person" constraints (see previous examples in this section).
For more details on the kinds of constraints you can define, see "Constraint Options" in the Search Developer's Guide.