Documents
The basic unit of organization in MarkLogic is a Document, and the Node.js Client API enables you to store and search documents. The Node.js API supports documents encoded in JSON and in XML.
The set of JSON keys, objects, and arrays, or XML elements and attributes you use in your documents is up to you. MarkLogic does not require adherence to any schemas.
MarkLogic also supports documents encoded in binary form or as plain text. We refer to this encoding (JSON, XML, text, or binary) as the document's Format.
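To make the two structured formats concrete, here is a small sketch of the same information expressed as JSON and as XML. The property and element names (name, style, tags) are made up for illustration; MarkLogic does not prescribe them.

```javascript
// Hypothetical sample content; the field and element names are illustrative only.
const beerJson = {
  name: 'Example Ale',
  style: 'pale ale',
  tags: ['hoppy', 'delicious']
};

// The same information expressed as an XML document (held here as a string).
const beerXml = [
  '<beer style="pale ale">',
  '  <name>Example Ale</name>',
  '  <tag>hoppy</tag>',
  '  <tag>delicious</tag>',
  '</beer>'
].join('\n');

console.log(JSON.stringify(beerJson, null, 2));
console.log(beerXml);
```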
URIs
A document's URI is a key that you choose when you insert a document into the database. Each document has a unique URI. You use this URI to retrieve or refer to the document later. Typically document URIs begin with a slash like /beer.
Beyond the URI, MarkLogic maintains some additional metadata associated with each document.
Organization
How does MarkLogic organize documents in the database? Logically, MarkLogic provides two concepts: Collections and Directories. You can think of collections as unordered sets; if you are familiar with tags, a collection works much the same way. A collection can hold multiple documents, and a document can belong to multiple collections.
Directories are similar in concept to the notion of directories or folders in file systems. They are hierarchical and membership is implicit based on the path syntax of URIs.
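Because directory membership follows from the URI path, you can work out which directories contain a document just by looking at its URI. The helper below is a plain JavaScript sketch of that idea, not part of the Node.js Client API; the function name and the sample URI are hypothetical.

```javascript
// List every directory that implicitly contains a document, from the
// root down to its immediate parent. Directory paths end with a slash.
function directoriesOf(uri) {
  const parts = uri.split('/');
  parts.pop(); // drop the document name, keep only the path segments
  const dirs = [];
  let path = '';
  for (const part of parts) {
    path += part + '/';
    dirs.push(path);
  }
  return dirs;
}

console.log(directoriesOf('/drinks/beer/ipa.json'));
```

Collections, by contrast, cannot be derived from the URI; a document belongs to a collection only when you assign it to one.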
API Basics
Ok, so that's what is stored in MarkLogic. What does the API look like? The first step whenever you want to interact with MarkLogic is to get a DatabaseClient instance. Note that this tutorial uses admin/admin for the username and password; for a real deployment, you'd want a more secure password.
You can connect as a different user, to a different port, or to a different database. See Getting Started with the Node Client API for project setup.
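A minimal sketch of creating the client might look like the following. The host, port, and authentication type are assumptions about a default local installation, and the connect() wrapper is a hypothetical helper; the marklogic module is passed in as an argument so the sketch does not depend on the package being installed when it is loaded.

```javascript
// Build a DatabaseClient. `marklogic` is the module returned by
// require('marklogic'); `overrides` lets you change any setting.
function connect(marklogic, overrides) {
  const settings = Object.assign({
    host: 'localhost',   // assumed host
    port: 8000,          // assumed port of a REST-enabled app server
    user: 'admin',       // tutorial-only credentials
    password: 'admin',
    authType: 'DIGEST'   // assumed authentication scheme
  }, overrides);
  return marklogic.createDatabaseClient(settings);
}
```

In a project with the package installed, you would call it as `const db = connect(require('marklogic'));`, or pass something like `{ port: 8010, user: 'writer' }` as the second argument to connect differently.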
CRUD
Yep, that's Create, Read, Update, and Delete. We use the term "Insert" instead of "Create" but that doesn't keep us from saying CRUD for fun. MarkLogic provides a simple Node.js API for CRUD.
The first step is to use our DatabaseClient instance (db) to write a document by calling the write() method:
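As a sketch, a write could look like this. The URI /beer.json, the drinks collection, and the document content are illustrative choices; db is assumed to be a DatabaseClient you have already created.

```javascript
// Insert (or replace) one JSON document. `db` is a DatabaseClient.
function writeBeer(db) {
  return db.documents.write({
    uri: '/beer.json',                 // illustrative URI
    contentType: 'application/json',
    collections: ['drinks'],           // illustrative collection
    content: {
      name: 'Example Ale',
      description: 'A delicious pale ale'
    }
  }).result(); // result() returns a promise for the server response
}
```

With a real client you might call `writeBeer(db).then(console.log)` to inspect the response.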
The response identifies the document that was written, including its URI.
To get the document back, call the read() method:
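A read might be sketched as follows, again assuming db is an existing DatabaseClient and that a document was stored at the illustrative URI /beer.json. The read resolves to an array of document descriptors, so the contentsOf() helper (a hypothetical name) pulls out just the content of each.

```javascript
// Extract the content from each document descriptor in a read result.
function contentsOf(documents) {
  return documents.map(function (doc) {
    return doc.content;
  });
}

// Read one document by URI and resolve to an array of contents.
function readBeer(db) {
  return db.documents.read('/beer.json').result().then(contentsOf);
}
```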
Updating (replacing) the document works exactly the same as creating a document: use the write() method.
To delete the document, call the remove() method:
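A delete follows the same pattern. This sketch assumes the same db client and the illustrative URI /beer.json used for the write.

```javascript
// Delete one document by URI. `db` is a DatabaseClient.
function removeBeer(db) {
  return db.documents.remove('/beer.json').result();
}
```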
Search
Beyond basic retrieval by URI, MarkLogic provides extremely robust support for search-style queries. Results can be sorted by relevance or by a scalar. Result items can be paged (viewing a small number at a time), returned documents can be snippeted (showing a short content blurb containing the matching terms) and highlighted (to perhaps bold the matching word occurrences).
When executing a search, you can choose to retrieve only a simple description of the matching documents, or to fetch the documents as well, which avoids repeated calls. If you fetch the documents as part of the search, you can request the same subsetting and transformation that are available for single-document retrievals.
Ok, that sounds cool. Let's move on to some sample code.
To execute any kind of query, call the query() method of db.documents and construct the query with a queryBuilder instance (qb).
String Query
To run a search that retrieves documents containing the word "delicious":
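One way to write that search is sketched below. It assumes db is an existing DatabaseClient and qb is the query builder exposed by the module as marklogic.queryBuilder; both are passed in as arguments, and the function name is hypothetical.

```javascript
// Run a string query for the word "delicious".
// `db` is a DatabaseClient; `qb` is marklogic.queryBuilder.
function searchDelicious(db, qb) {
  return db.documents.query(
    qb.where(
      qb.parsedFrom('delicious') // parse the text as a search-box style query
    )
  ).result(); // resolves to the matching documents
}
```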
If you only want documents with this word that are in your drinks collection, you can do:
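Restricting the search to a collection can be sketched by adding a collection query alongside the string query. As before, db and qb (marklogic.queryBuilder) are assumed to exist already, and the function name is hypothetical.

```javascript
// Find documents in the "drinks" collection that match "delicious".
// Queries passed to where() together must all match.
function searchDeliciousDrinks(db, qb) {
  return db.documents.query(
    qb.where(
      qb.collection('drinks'),
      qb.parsedFrom('delicious')
    )
  ).result();
}
```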
For more examples, see Getting Started with the Node Client API.