Processing search results
In the search examples so far, we haven't looked closely at how the search results are extracted (and printed to the console). In each case, we've been using the tailor-made SearchHandle, which encapsulates search results as a POJO. Before we look more closely at that object structure, let's take a peek at the raw data it encapsulates. We already saw that using DocumentMetadataHandle is optional; the same is true of SearchHandle.
Get search results as raw XML
Open Example_21_SearchResultsAsXML.java. This example performs the same search as the previous example, except that instead of using a SearchHandle, here we're using a StringHandle to receive the raw XML search results (from the server) as a string:
Run the program and examine the console to see how MarkLogic represents its search results in XML. This should give you an idea of the complexity of information we're dealing with here. Also, depending on your search options, the structure of these results can vary widely.
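If you'd rather pull a few values out of the raw XML yourself, the following is a minimal sketch that uses only the JDK's DOM parser. It assumes the response elements live in the http://marklogic.com/appservices/search namespace, that the root element carries a total attribute, and that each match is a result element with a uri attribute. The SAMPLE string is a hand-written, heavily abbreviated stand-in rather than real server output, and the class and method names (RawXmlResults, totalResults, resultUris) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class RawXmlResults {
    // Namespace assumed for the elements of the search response.
    static final String SEARCH_NS = "http://marklogic.com/appservices/search";

    // Hand-written, abbreviated stand-in for a response; real output
    // carries more attributes and elements and varies with search options.
    static final String SAMPLE =
        "<search:response total=\"2\" start=\"1\" page-length=\"10\""
        + " xmlns:search=\"http://marklogic.com/appservices/search\">"
        + "<search:result index=\"1\" uri=\"/example/a.xml\">"
        + "<search:snippet><search:match>An "
        + "<search:highlight>index</search:highlight> entry"
        + "</search:match></search:snippet>"
        + "</search:result>"
        + "<search:result index=\"2\" uri=\"/example/b.xml\"/>"
        + "</search:response>";

    // Parses the XML string into a namespace-aware DOM document.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the namespace lookup below
            return factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed search response", e);
        }
    }

    // Reads the total attribute from the root element.
    static long totalResults(String xml) {
        return Long.parseLong(parse(xml).getDocumentElement().getAttribute("total"));
    }

    // Collects the uri attribute of every result element, in document order.
    static List<String> resultUris(String xml) {
        NodeList results = parse(xml).getElementsByTagNameNS(SEARCH_NS, "result");
        List<String> uris = new ArrayList<>();
        for (int i = 0; i < results.getLength(); i++) {
            uris.add(((Element) results.item(i)).getAttribute("uri"));
        }
        return uris;
    }
}
```

In the example program, you would pass the string held by the StringHandle to these methods instead of SAMPLE. Even this small sketch hints at how much plumbing raw XML processing requires.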
Get search results as raw JSON
Open Example_22_SearchResultsAsJSON.java. This example is identical to the previous one, except now we configure our StringHandle to receive JSON (instead of XML, the default):
Run the program to see the raw JSON search results that were fetched from the server.
Get search results as a POJO
While you are certainly free to process search results as raw JSON or XML, the preferred way in Java is to use a SearchHandle instance, which models the results using a containment hierarchy that mirrors that of the raw data we saw:
- SearchHandle
  - MatchDocumentSummary[]
    - MatchLocation[]
      - MatchSnippet[]
Open TutorialUtil.java in the tutorial project. This module contains a few different approaches to printing search results that have been used by the previous search examples. Let's focus on the last one, displayResults(). The first step to extracting search results from a SearchHandle is to call its getMatchResults() method:
This yields an array of MatchDocumentSummary objects. We can illustrate what this object represents by looking at a typical search results page, such as the one on this website:
Each matching document in the list would be represented by a MatchDocumentSummary instance. This suggests that SearchHandle could then be used, for example, as the model (or to drive the model) in an MVC-based web application. Our utility code is only concerned with printing text to the console, but the basic task is the same: iterate through each level of this hierarchy and do something useful at each level.
Next, we drill down into each search result and call getMatchLocations():
A MatchLocation object represents a range of text in the document that includes a search "hit":
For each MatchLocation, we call getSnippets():
A MatchSnippet object represents a range of text within a location that either is or isn't highlighted:
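The three-level walk described above can be sketched without a server connection. The Summary, Location, and Snippet classes below are hypothetical stand-ins for MatchDocumentSummary, MatchLocation, and MatchSnippet; they are not the MarkLogic classes, and their fields are assumptions chosen only to show the shape of the nested loops. Highlighted snippets are wrapped in square brackets so the hits stand out on a plain console.

```java
import java.util.Arrays;
import java.util.List;

class ResultsWalk {
    // Stand-in for MatchSnippet: a run of text that is or isn't highlighted.
    static final class Snippet {
        final String text;
        final boolean highlighted;
        Snippet(String text, boolean highlighted) {
            this.text = text;
            this.highlighted = highlighted;
        }
    }

    // Stand-in for MatchLocation: a range of text containing a search hit.
    static final class Location {
        final List<Snippet> snippets;
        Location(Snippet... snippets) {
            this.snippets = Arrays.asList(snippets);
        }
    }

    // Stand-in for MatchDocumentSummary: one matching document.
    static final class Summary {
        final String uri;
        final List<Location> locations;
        Summary(String uri, Location... locations) {
            this.uri = uri;
            this.locations = Arrays.asList(locations);
        }
    }

    // Walks summaries -> locations -> snippets and builds console-ready text.
    static String render(List<Summary> summaries) {
        StringBuilder out = new StringBuilder();
        for (Summary summary : summaries) {
            out.append(summary.uri).append('\n');
            for (Location location : summary.locations) {
                out.append("  "); // indent each location under its document
                for (Snippet snippet : location.snippets) {
                    out.append(snippet.highlighted
                        ? "[" + snippet.text + "]"
                        : snippet.text);
                }
                out.append('\n');
            }
        }
        return out.toString();
    }

    // Made-up sample data: one document with one location and one hit.
    static List<Summary> sample() {
        return Arrays.asList(
            new Summary("/example/a.xml",
                new Location(
                    new Snippet("An ", false),
                    new Snippet("index", true),
                    new Snippet(" entry", false))));
    }
}
```

Calling ResultsWalk.render(ResultsWalk.sample()) produces the document URI on one line followed by the indented text "An [index] entry". Against a real SearchHandle, the same loops would run over the arrays returned by getMatchResults(), getMatchLocations(), and getSnippets().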
In addition to getMatchResults(), the SearchHandle class provides other useful methods for building a search application, such as getFacets(), getMetrics(), and getTotalResults().