MarkLogic Java Client API
The MarkLogic Java Client API is an open-source library that allows developers to quickly, easily, and reliably access MarkLogic from their Java applications.
- Faster development and less custom code with out-of-the-box data management, search, and alerting
- Pure Java query builder and conveniences for POJOs, JSON, XML, and binary I/O
- Built-in extensibility for moving performance-critical code to the database
- Always open-source and developed on GitHub
Deeper dive
System requirements and installation
The generally available 3.0 release of the Java Client API will be hosted on the Maven Central repository. This will allow Maven and Gradle to resolve the built JAR and all of its dependencies.
Add the dependency to your project. For Maven, that means the following entry in your pom.xml:
<dependency>
  <groupId>com.marklogic</groupId>
  <artifactId>java-client-api</artifactId>
  <version>3.0.1</version>
</dependency>
You can also download the source code yourself and compile a JAR or reference it directly from your IDE, for example as a “Project” in Eclipse’s Build Path settings for Java projects.
0. Configuration
The following examples use a shared DatabaseClient configuration. The Java Client API communicates with MarkLogic over HTTP. DatabaseClient instances are designed to be shared across application threads. In a typical production implementation, you’d likely inject a configured client singleton into data service helper classes using something like Spring. Take a look at the Samplestack reference architecture for an example.
package com.marklogic.examples;

import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;

public class Configuration {
  private static DatabaseClient client = DatabaseClientFactory.newClient(
      "localhost", 8000, // Every instance comes with a REST Client API instance
                         // pre-installed on port 8000
      "Documents",       // Each connection can specify its database at runtime
      "admin", "********",
      DatabaseClientFactory.Authentication.DIGEST);

  public static DatabaseClient exampleClient() {
    return client;
  }
}
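The idea of sharing one client across helper classes can be sketched without Spring or a running server. The snippet below is only an illustration under stated assumptions: StubClient is a hypothetical stand-in for a thread-safe client such as DatabaseClient, and ClientHolder, UserService, and TagService are invented names, not part of the Java Client API. It shows plain constructor injection of a single lazily created instance.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedClientSketch {

  // Hypothetical stand-in for a thread-safe client such as DatabaseClient.
  static final class StubClient {
    private static final AtomicInteger CREATED = new AtomicInteger();
    final int id = CREATED.incrementAndGet();

    static int createdCount() {
      return CREATED.get();
    }
  }

  // Holder idiom: the JVM initializes INSTANCE once, on first use, in a thread-safe way.
  static final class ClientHolder {
    private static final StubClient INSTANCE = new StubClient();

    static StubClient get() {
      return INSTANCE;
    }
  }

  // Helper classes receive the shared client through their constructors
  // instead of creating their own connections.
  static final class UserService {
    private final StubClient client;

    UserService(StubClient client) {
      this.client = client;
    }

    StubClient client() {
      return client;
    }
  }

  static final class TagService {
    private final StubClient client;

    TagService(StubClient client) {
      this.client = client;
    }

    StubClient client() {
      return client;
    }
  }

  // Wires two services to the same client and reports whether they share it.
  public static boolean servicesShareOneClient() {
    StubClient shared = ClientHolder.get();
    UserService users = new UserService(shared);
    TagService tags = new TagService(shared);
    return users.client() == tags.client();
  }

  public static void main(String[] args) {
    System.out.println(servicesShareOneClient());
  }
}
```

A dependency injection container would do the wiring in servicesShareOneClient() for you; the point is only that every helper holds a reference to the same client rather than opening its own.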
1. POJO Façade
The POJO façade allows an application to work with plain Java objects while the Java Client API manages marshaling and unmarshaling documents from the database and mapping queries to fields. The POJO façade is not a full-featured ORM (ODM?) system. For example, it won’t automatically fetch related objects or manage circular dependencies, and it provides no client-side caching. It’s designed to be a low-overhead way to persist and query simple domain objects. For more complicated scenarios, you’re better off working directly with JSON or XML documents and manually assembling object graphs.
package com.marklogic.examples;

import com.acme.Tag;
import com.acme.User;
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.pojo.PojoRepository;

public class Ex01_POJO {
  public static void main(String[] args) {
    // A little phony here. A real app probably wouldn't
    // generate these inline.
    // Create com.acme.User instances.
    User shauna = new User();
    shauna.setName("Shauna Weber");
    shauna.setAddress("760 Forest Place, Glenshaw, Michigan, 1175");
    shauna
        .setAbout("Kitsch fingerstache XOXO, Carles chambray 90's meh cray disrupt Tumblr. Biodiesel craft beer sartorial meh put a bird on it, literally keytar blog vegan paleo. Chambray messenger bag +1 hoodie, try-hard actually banjo bespoke distillery pour-over Godard Thundercats organic. Kitsch wayfarers Pinterest American Apparel. Hella Shoreditch blog, shabby chic iPhone tousled paleo before they sold out keffiyeh Portland Marfa twee dreamcatcher. 8-bit Vice post-ironic plaid. Cornhole Schlitz blog direct trade lomo Pinterest.");
    shauna.setActive(true);
    shauna.setBalance(2774.31);
    shauna.setGender("female");
    shauna.setAge(29);
    shauna.setGUID("6e1c7304-09a1-4436-ba77-ae1e3b8856f7");

    User peters = new User();
    peters.setName("Peters Barnett");
    peters.setAddress("749 Green Street, Tyro, Illinois, 2856");
    peters
        .setAbout("Letterpress Echo Park fashion axe occupy whatever before they sold out, Pinterest pickled cliché. Ethnic stumptown food truck wolf, ethical Helvetica Marfa hashtag. Echo Park photo booth banh mi ennui, organic VHS 8-bit fixie. Skateboard irony dreamcatcher mlkshk iPhone cliche. Flannel ennui YOLO artisan tofu. Hashtag irony Shoreditch letterpress, selvage scenester YOLO. Locavore fap bicycle rights, drinking vinegar Tonx bespoke paleo 3 wolf moon readymade direct trade ugh wolf asymmetrical beard plaid.");
    peters.setActive(false);
    peters.setBalance(1787.45);
    peters.setGender("male");
    peters.setAge(38);
    peters.getTags().add(new Tag("ex"));
    peters.getTags().add(new Tag("ex"));
    peters.getTags().add(new Tag("ut"));
    peters.getTags().add(new Tag("exercitation"));
    peters.getTags().add(new Tag("Lorem"));
    peters.getTags().add(new Tag("magna"));
    peters.getTags().add(new Tag("non"));
    peters.getTags().add(new Tag("aute"));
    peters.getTags().add(new Tag("nisi"));
    peters.setGUID("34a23649-ec61-478f-90ab-5f01a55120ce");

    DatabaseClient client = Configuration.exampleClient();

    // Create a repository specific to User classes.
    // The repository takes care of all of the serialization/deserialization
    // between POJOs and documents in the database.
    PojoRepository<User, String> userRepo = client.newPojoRepository(
        User.class, String.class);
    userRepo.write(shauna, "fake data");
    userRepo.write(peters, "fake data");
  }
}
User
package com.acme;

import java.util.HashSet;
import java.util.Set;

import com.marklogic.client.pojo.annotation.Id;
import com.marklogic.client.pojo.annotation.PathIndexProperty;
import com.marklogic.client.pojo.annotation.PathIndexProperty.ScalarType;

public final class User {
  public String guid;
  private String name;
  private String about;
  private String address;
  private Boolean active;
  private Double balance;
  private Integer age;
  private String gender;
  private Set<Tag> tags;

  public User() {
    super();
    this.tags = new HashSet<Tag>();
  }

  @Id
  public String getGUID() {
    return guid;
  }

  public void setGUID(String guid) {
    this.guid = guid;
  }

  @PathIndexProperty(scalarType = ScalarType.STRING)
  public String getGender() {
    return gender;
  }

  public void setGender(String gender) {
    this.gender = gender;
  }

  @PathIndexProperty(scalarType = ScalarType.DOUBLE)
  public Double getBalance() {
    return balance;
  }

  public void setBalance(Double balance) {
    this.balance = balance;
  }

  public String getAddress() {
    return address;
  }

  public void setAddress(String address) {
    this.address = address;
  }

  public Boolean isActive() {
    return active;
  }

  public void setActive(Boolean active) {
    this.active = active;
  }

  public String getName() {
    return name;
  }

  public void setName(String name) {
    this.name = name;
  }

  public String getAbout() {
    return about;
  }

  public void setAbout(String about) {
    this.about = about;
  }

  @PathIndexProperty(scalarType = ScalarType.INT)
  public Integer getAge() {
    return age;
  }

  public void setAge(Integer age) {
    this.age = age;
  }

  public Set<Tag> getTags() {
    return tags;
  }

  public void setTags(Set<Tag> tags) {
    this.tags = tags;
  }

  @Override
  public String toString() {
    return "User [guid=" + guid + ", name=" + name + ", about=" + about
        + ", address=" + address + ", active=" + active + ", balance="
        + balance + ", age=" + age + ", gender=" + gender + ", tags="
        + tags + "]";
  }
}
Tag
package com.acme;

public class Tag {
  private String label;
  private String description;

  public Tag() {
    super();
  }

  public Tag(String label) {
    super();
    this.label = label;
  }

  public String getLabel() {
    return label;
  }

  public void setLabel(String label) {
    this.label = label;
  }

  public String getDescription() {
    return description;
  }

  public void setDescription(String description) {
    this.description = description;
  }

  @Override
  public String toString() {
    return "Tag [label=" + label + ", description=" + description + "]";
  }
}
2. POJO Query
The PojoRepository provides a means for querying typed POJOs. It does not re-hydrate an entire object graph or cache results, but is an easy way to work with simple POJOs, complementary to the other JSON or XML document I/O capabilities.
package com.marklogic.examples;

import com.acme.User;
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.pojo.PojoPage;
import com.marklogic.client.pojo.PojoQueryBuilder;
import com.marklogic.client.pojo.PojoRepository;
import com.marklogic.client.query.PojoQueryDefinition;

public class Ex02_POJOQuery {
  public static void main(String[] args) {
    DatabaseClient client = Configuration.exampleClient();

    // Create a repository specific to User classes.
    // The repository takes care of all of the serialization/deserialization
    // between POJOs and documents in the database.
    PojoRepository<User, String> userRepo = client.newPojoRepository(
        User.class, String.class);

    PojoQueryBuilder<User> qb = userRepo.getQueryBuilder();
    PojoQueryDefinition query = qb.or(
        qb.word("about", "pickled cliche"),
        qb.value("gender", "female"));

    PojoPage<User> page = userRepo.search(query, 1);
    for (User user : page) {
      System.out.println(user.toString());
    }
  }
}
3. POJO Index Annotations
The User class above is annotated with @PathIndexProperty annotations. For example, the gender getter method specifies a string path index using a standard Java annotation in the context of the domain class.
@PathIndexProperty(scalarType = ScalarType.STRING)
public String getGender() {
  return gender;
}
The com.marklogic.client.pojo.util.GenerateIndexConfig utility generates index configurations directly from your domain classes. For example, the above annotation will create a string path range index on the gender property. GenerateIndexConfig handles the details of mapping your fields to their serialized structures in the database (JSON documents, in particular). Without these indexes in place, some scalar queries will fail.
To parse the annotations and generate an index configuration, run the GenerateIndexConfig utility. The -classes parameter is a space-separated list of fully qualified POJO class names, and -file specifies where to save the JSON output. (You’ll also have to specify a classpath, depending on your environment.)
java com.marklogic.client.pojo.util.GenerateIndexConfig -classes "com.acme.User" -file User.json
That output will look something like the following. The details aren’t actually important. Your POJO queries are written in terms of your domain classes, not their JSON serialization. The configuration generator handles the intricacies of the mapping.
{
  "range-path-index" : [
    {
      "path-expression" : "com.acme.User/balance",
      "scalar-type" : "double",
      "collation" : "",
      "range-value-positions" : "false",
      "invalid-values" : "ignore"
    },
    {
      "path-expression" : "com.acme.User/age",
      "scalar-type" : "int",
      "collation" : "",
      "range-value-positions" : "false",
      "invalid-values" : "ignore"
    },
    {
      "path-expression" : "com.acme.User/gender",
      "scalar-type" : "string",
      "collation" : "http://marklogic.com/collation/",
      "range-value-positions" : "false",
      "invalid-values" : "ignore"
    }
  ],
  "geospatial-path-index" : [ ],
  "geospatial-element-pair-index" : [ ]
}
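To make the mapping from annotated getters to path expressions more concrete, here is a rough sketch of how such a generator could work using reflection. It is not the GenerateIndexConfig implementation: the @RangeIndexed annotation, the nested User class, and describeIndexes are all hypothetical stand-ins chosen so the example runs without the MarkLogic library on the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PathIndexSketch {

  // Hypothetical stand-in for @PathIndexProperty; not the library's annotation.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface RangeIndexed {
    String scalarType();
  }

  // Cut-down domain class mirroring part of the User example.
  static class User {
    private String gender;
    private Integer age;
    private String name;

    @RangeIndexed(scalarType = "string")
    public String getGender() {
      return gender;
    }

    @RangeIndexed(scalarType = "int")
    public Integer getAge() {
      return age;
    }

    // Not annotated, so no index entry is produced for it.
    public String getName() {
      return name;
    }
  }

  // Builds one "<class name>/<property> : <scalar type>" line per annotated getter.
  static List<String> describeIndexes(Class<?> type) {
    List<String> lines = new ArrayList<>();
    for (Method method : type.getDeclaredMethods()) {
      RangeIndexed index = method.getAnnotation(RangeIndexed.class);
      String name = method.getName();
      if (index == null || !name.startsWith("get") || name.length() <= 3) {
        continue;
      }
      // getGender -> gender
      String property = Character.toLowerCase(name.charAt(3)) + name.substring(4);
      lines.add(type.getName() + "/" + property + " : " + index.scalarType());
    }
    // Reflection does not guarantee method order, so sort for stable output.
    Collections.sort(lines);
    return lines;
  }

  public static void main(String[] args) {
    for (String line : describeIndexes(User.class)) {
      System.out.println(line);
    }
  }
}
```

The real utility also emits collations and the other index properties shown in the JSON above; the sketch only covers the step of deriving a path and scalar type from each annotated getter.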
Finally, to apply the generated index configuration to the database you invoke the REST management API.
curl -i --digest --user admin:'********' \
  -H 'Content-Type: application/json' \
  -d '@User.json' \
  -X PUT 'http://localhost:8002/manage/LATEST/databases/Documents/properties'
To summarize the POJO configuration workflow:
- Annotate your POJO getters with @PathIndexProperty, specifying the appropriate data type.
- As part of your deployment automation, generate database configuration using the GenerateIndexConfig utility in Java.
- Apply the configuration to the target MarkLogic instance by sending a PUT request to the databases/[database]/properties endpoint.
3. Bulk Writes
Bulk writes allow an application to gather multiple inserts or updates into a single request, amortizing the fixed network overhead across many documents. You can accumulate documents in a DocumentWriteSet and periodically write them to the database as a single transaction.
package com.marklogic.examples;

import java.io.IOException;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.document.DocumentWriteSet;
import com.marklogic.client.document.GenericDocumentManager;
import com.marklogic.client.io.DocumentMetadataHandle;
import com.marklogic.client.io.JacksonHandle;

public class Ex03_BulkWrite {
  private static final int BATCH_SIZE = 500;

  public static void main(String[] args) throws IOException {
    DatabaseClient client = Configuration.exampleClient();
    GenericDocumentManager docMgr = client.newDocumentManager();

    ObjectMapper mapper = new ObjectMapper();
    JsonFactory factory = mapper.getFactory();

    // Create a new collection to manage bulk writes.
    DocumentWriteSet writeSet = docMgr.newWriteSet();

    // Set the default metadata (e.g. collections, permissions, quality)
    // that will apply to the entire set.
    DocumentMetadataHandle meta = new DocumentMetadataHandle();
    meta.withCollections("dummy batched data");
    writeSet.addDefault(meta);

    for (int i = 0; i < 1031; i++) {
      // Construct a JSON document inline. This isn't very realistic.
      // Typically you'd read from the file system or another service. The
      // I/O handle adapters make it easy to switch input sources, though.
      JsonParser jp = factory.createParser("{\"k\":\"v1-" + i + "\"}");
      JsonNode content = mapper.readTree(jp);

      // Create an I/O adapter for this document's JSON. Each document gets
      // its own handle so that entries already in the set keep their content.
      writeSet.add("/" + i + ".json", meta, new JacksonHandle(content));

      // Periodically write a batch. Each batch is written in its own
      // transaction. The ideal batch size will depend on
      // the characteristics of your data and your infrastructure.
      // (i + 1) is the number of documents added so far, so a batch is
      // flushed after every BATCH_SIZE documents.
      if ((i + 1) % BATCH_SIZE == 0) {
        docMgr.write(writeSet);
        System.out.println("Wrote batch");
        writeSet.clear();
      }
    }

    // Cleanup if there's anything that hasn't been flushed from the
    // WriteSet.
    if (!writeSet.isEmpty()) {
      docMgr.write(writeSet);
      System.out.println("Wrote remainder");
    }
  }
}
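The accumulate-and-flush pattern behind bulk writes can be looked at in isolation from MarkLogic. The following is a minimal sketch, assuming a generic flush callback in place of docMgr.write(writeSet); writeInBatches and BatchingSketch are hypothetical names introduced for illustration. It returns the size of each flushed batch so the batch boundaries are easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchingSketch {

  /**
   * Passes items to flush in groups of at most batchSize and returns
   * the size of every group that was flushed, in order.
   */
  static <T> List<Integer> writeInBatches(Iterable<T> items, int batchSize,
      Consumer<List<T>> flush) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<Integer> flushedSizes = new ArrayList<>();
    List<T> pending = new ArrayList<>();
    for (T item : items) {
      pending.add(item);
      // Flush as soon as the pending batch is full.
      if (pending.size() >= batchSize) {
        flush.accept(new ArrayList<>(pending));
        flushedSizes.add(pending.size());
        pending.clear();
      }
    }
    // Flush whatever is left over after the loop.
    if (!pending.isEmpty()) {
      flush.accept(new ArrayList<>(pending));
      flushedSizes.add(pending.size());
    }
    return flushedSizes;
  }

  public static void main(String[] args) {
    List<String> uris = new ArrayList<>();
    for (int i = 0; i < 1031; i++) {
      uris.add("/" + i + ".json");
    }
    // The no-op lambda stands in for a real write to the database.
    List<Integer> sizes = writeInBatches(uris, 500, batch -> { });
    System.out.println(sizes); // prints [500, 500, 31]
  }
}
```

Counting the pending items, rather than testing the loop index, keeps the batch boundaries independent of where the numbering starts.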
4. Bulk Read
Bulk read allows an application to efficiently read sets of raw documents and/or metadata. Reads can be filtered by a QueryDefinition to provide additional precision.
package com.marklogic.examples;

import com.marklogic.client.DatabaseClient;
import com.marklogic.client.document.DocumentPage;
import com.marklogic.client.document.DocumentRecord;
import com.marklogic.client.io.JacksonParserHandle;
import com.marklogic.client.query.QueryDefinition;
import com.marklogic.client.query.StructuredQueryBuilder;

public class Ex04_BulkRead {
  public static void main(String[] args) {
    DatabaseClient client = Configuration.exampleClient();

    // Build a structured query. Here we're doing a pretty broad collection
    // query on all of the data loaded in Example 1.
    StructuredQueryBuilder builder = client.newQueryManager()
        .newStructuredQueryBuilder();
    QueryDefinition query = builder.and(builder.collection("fake data"));

    DocumentPage page = client.newDocumentManager().search(query, 1);

    // Iterate through the results, which include the raw documents,
    // available with a ReadHandle.
    for (DocumentRecord doc : page) {
      System.out.println(doc.getContent(new JacksonParserHandle()));
    }
  }
}
Documentation for bulk write and read is available as part of MarkLogic 8.
5. JavaScript Extensions
The Java Client API supports extensibility with code run on the server. Developers can deploy modules written in JavaScript or XQuery and invoke them in the context of the Java Client API. The two primary extension points are resource extensions and transformations. Resource Extensions provide fluent wrappers for server-side code so that Java developers can access the custom code from pure Java. Transformations integrate into the document management and search workflow to provide hooks for modifying data as it goes into or comes out of the server.
Samplestack, a reference application that will ship along with MarkLogic 8, includes an implementation of a server-side transformation written in JavaScript.
/* transforms search response */

/* given an iterator of ids, get the reputations of these
 * and return a mapping of ids to reputations */
function joinReputations(ids) {
  var results = cts.search(
    cts.andQuery([
      cts.collectionQuery("com.marklogic.samplestack.domain.Contributor"),
      cts.jsonPropertyValueQuery("id", ids)
    ]));
  var returnObject = {};
  for (var result of results) {
    var nextObject = result.toObject();
    var ownerObject = nextObject["com.marklogic.samplestack.domain.Contributor"]
    if (ownerObject.id == null) {
      returnObject.id = "N/A";
    } else {
      returnObject.id = ownerObject.id;
    }
    if (ownerObject.reputation == null) {
      returnObject.reputation = 0;
    } else {
      returnObject.reputation = ownerObject.reputation;
    }
    returnObject.originalId = ownerObject.originalId;
    returnObject.userName = ownerObject.userName;
    returnObject.displayName = ownerObject.displayName;
  }
  return returnObject;
}

/* this function requires bulk input */
function searchTransform(context, params, input) {
  var outputObject = input.toObject();
  var ownerIds = input.xpath("./owner/id");
  xdmp.log("OWNERIDS" + ownerIds);
  if (ownerIds.count > 0) {
    var joinedOwner = joinReputations(ownerIds);
    if (joinedOwner.id == undefined) {
      return outputObject;
    } else {
      outputObject.owner = joinedOwner;
      return outputObject;
    }
  } else {
    /* search response here */
    var results = outputObject.results;
    for (var i = 0; i < results.length; i++) {
      var result = results[i];
      var matches = result.matches;
      for (var j = 0; j < matches.length; j++) {
        var match = matches[j];
        var source = "question";
        if (match.path.indexOf("answers") > -1) {
          source = "answer";
        } else if (match.path.indexOf("tags") > -1) {
          source = "tags";
        }
        match.source = source;
        if (source == "answer") {
          match.id = "answerid";
        } else {
          match.id = fn.doc(match.uri).root.id
        }
        match.path = null;
      }
    }
    return outputObject;
  }
};

exports.transform = searchTransform;
7. Eval (and Invoke)
While the Java Client API provides rich functionality out-of-the-box and several built-in ways to extend its functionality with server-side code, it’s sometimes convenient to be able to evaluate a string of JavaScript or XQuery or invoke a remote XQuery or JavaScript module. These are lower-level mechanisms than resource extensions and not integrated into the document management and query workflow like custom transformations. In general, you should look to resource extensions and transformations for long-term maintainability.
The following example demonstrates how to evaluate Server-Side JavaScript from the Java Client API. The remote code uses an external variable, percent, that’s passed in from Java. It also shows how to write to the database from an eval’ed module. You can even run the eval as part of an existing multi-statement transaction using the transaction(Transaction transaction) method.
package com.marklogic.examples;

import com.marklogic.client.DatabaseClient;
import com.marklogic.client.eval.EvalResult;
import com.marklogic.client.eval.EvalResultIterator;

public class Ex07_Eval {
  public static void main(String[] args) {
    DatabaseClient client = Configuration.exampleClient();

    final String javascript =
        "declareUpdate(); " // Tell the server we're writing to the database
        + "var total = 0; "
        + "for(var d of fn.collection('fake data')) {" // Loop through the JSON documents in the collection
        + " total += d.toObject().balance * percent;" // Update the total based on a percentage of the balance property
        + "}"
        + "xdmp.documentInsert('/total.json', {total: total}); " // Write a document with the updated total
        + "total;"; // Return the total

    EvalResultIterator eval = client.newServerEval().javascript(javascript)
        .addVariable("percent", 0.08).eval();

    for (EvalResult result : eval) {
      System.out.println(result.getString());
    }
  }
}
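As a sanity check on what the eval should return, the same arithmetic can be reproduced locally. The sketch below assumes that the "fake data" collection holds only the two users written in the POJO example (balances 2774.31 and 1787.45) and that percent is 0.08; percentTotal and EvalTotalSketch are hypothetical names, and nothing here talks to the server. Under those assumptions the total comes out at roughly 364.94.

```java
public class EvalTotalSketch {

  /** Sums balance * percent over all balances, mirroring the server-side loop. */
  static double percentTotal(double[] balances, double percent) {
    double total = 0;
    for (double balance : balances) {
      total += balance * percent;
    }
    return total;
  }

  public static void main(String[] args) {
    // Balances of the two example users; an assumption about the collection's contents.
    double[] balances = {2774.31, 1787.45};
    System.out.println(percentTotal(balances, 0.08));
  }
}
```

If the collection contains other documents, or documents without a balance property, the server-side result will differ from this local figure.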
Use the modulePath() method on the ServerEvaluationCall instance to specify a JavaScript or XQuery main module to be invoked remotely.
If you have used XCC, eval and invoke should be familiar. One significant advantage of using the Java Client API over XCC is its integration with the other capabilities that the Client APIs provide. For example, a Client API invocation can process its return values using the I/O handle classes, just like the rest of the library. If you’re beginning a new Java application with MarkLogic 8, you should start with the Java Client API rather than XCC.