MLJAM: Wire Protocol Documentation

Jason Hunter

Last updated May 10, 2006

This article describes the wire protocol between the client and server. It's documented here for completeness and to aid the project's ongoing development. Casual, and even expert, users do not need to understand the wire protocol, and the protocol is always subject to change.

The XQuery client drives the interaction using xdmp:http-get() and xdmp:http-post() calls to the remote Java server. Each request includes:

The context id
An action verb
An optional name parameter in the query string
An optional POST body holding code or data

Requested URLs always follow this pattern: http://localhost:8080/mljam/contextid/verb. The /mljam portion of the path matches the servlet; the remainder, containing the context id and verb, is extra path info.
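To make the URL layout concrete, here is a minimal sketch of how a servlet might split the extra path info (the string a servlet container exposes through HttpServletRequest.getPathInfo()) into the context id and verb. The class name PathInfoSketch and its parse method are hypothetical names chosen for illustration; this is not the actual MLJAM server code.

```java
// Hypothetical helper for illustration; not taken from the MLJAM source.
final class PathInfoSketch {

    /** Splits extra path info such as "/12345/set-string" into { contextId, verb }. */
    static String[] parse(String pathInfo) {
        if (pathInfo == null || !pathInfo.startsWith("/")) {
            throw new IllegalArgumentException("Expected /contextid/verb but got: " + pathInfo);
        }
        // Drop the leading slash, then split on the remaining slashes.
        String[] parts = pathInfo.substring(1).split("/");
        // Exactly two non-empty segments are accepted: the context id and the verb.
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Expected /contextid/verb but got: " + pathInfo);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = parse("/12345/set-string");
        System.out.println("context id = " + parts[0] + ", verb = " + parts[1]);
    }
}
```

Rejecting anything other than exactly two segments is a choice made for this sketch; the real servlet may be more lenient.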

Start

The jam:start() function doesn't actually connect on the wire. Instead, it creates an internal hashmap (actually represented in XML) to remember the mapping between context ids and web locations. If the caller doesn't provide a context id, the function generates a random one for the session.
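The bookkeeping described above can be sketched in a few lines. The real client is written in XQuery and keeps its map as XML, so the Java below is only an illustration of the idea; the class ContextRegistrySketch, its method names, and the use of a UUID as the random id are all assumptions, not details of jam.xqy.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical illustration of the context-id bookkeeping; not the jam.xqy implementation.
final class ContextRegistrySketch {

    // Maps each context id to the web location of its server.
    private final Map<String, String> locations = new HashMap<>();

    /** Records the location without touching the network; generates an id if none is given. */
    String start(String baseUrl, String contextId) {
        String id = (contextId != null) ? contextId : UUID.randomUUID().toString();
        locations.put(id, baseUrl);
        return id;
    }

    /** Builds the request URL for a verb in a previously started context. */
    String urlFor(String contextId, String verb) {
        String baseUrl = locations.get(contextId);
        if (baseUrl == null) {
            throw new IllegalStateException("Unknown context id: " + contextId);
        }
        return baseUrl + "/" + contextId + "/" + verb;
    }

    public static void main(String[] args) {
        ContextRegistrySketch registry = new ContextRegistrySketch();
        String id = registry.start("http://localhost:8080/mljam", "12345");
        System.out.println(registry.urlFor(id, "get"));
    }
}
```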

Set

The jam:set() function makes a POST request to the server, sending the variable name in the query string and the value in the request body. It may choose from a handful of verbs. It uses set-string when the body contains a string and set-binary when the body contains binary content. For other data types (such as numerics, dates, durations, etc.) it uses the eval verb and sends BeanShell script code to the server. The server executes this code to assign the variable.

This little trick makes it easy to pass complicated data structures to the server without a lot of custom parsing or protocol work. It's a technique that's too expensive for large strings or binaries, which is why there are the set-string and set-binary optimizations.

The request returns 204 (No Content) on success, which tells the client that no further action is necessary and that it should return empty().

Here, for example, is a simplified version of what goes over the wire to the server during a call to jam:set-in("str", "Literal string", "12345"):

POST http://localhost:8080/mljam/12345/set-string?name=str

Literal string

If instead the call had been jam:set-in("str", (xs:double(1), xs:integer(2)), "12345") we'd see:

POST http://localhost:8080/mljam/12345/eval

unset("str"); str = new Object[] { 1D, 2L };
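The verb selection and the generated assignment code can be sketched as follows. In MLJAM this logic lives in the XQuery client, so the Java here is only a stand-in to show the shape of the decision. The class SetRequestSketch and its methods are hypothetical, only Long and Double items are handled, and special values such as NaN are ignored.

```java
// Hypothetical illustration of how a set request might be assembled; not the jam.xqy code.
final class SetRequestSketch {

    /** Strings and binaries get their own verbs; everything else goes through eval. */
    static String verbFor(Object value) {
        if (value instanceof String) return "set-string";
        if (value instanceof byte[]) return "set-binary";
        return "eval";
    }

    /** Renders one item as a Java literal. This sketch supports only Long and Double. */
    static String literal(Object item) {
        if (item instanceof Long) return item + "L";
        if (item instanceof Double) return item + "D";
        throw new IllegalArgumentException("Unsupported item in this sketch: " + item);
    }

    /** Builds script text that clears the variable, then assigns it an Object[]. */
    static String evalBody(String name, Object[] items) {
        StringBuilder sb = new StringBuilder();
        sb.append("unset(\"").append(name).append("\"); ");
        sb.append(name).append(" = new Object[] { ");
        for (int i = 0; i < items.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(literal(items[i]));
        }
        sb.append(" };");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: unset("str"); str = new Object[] { 1.0D, 2L };
        System.out.println(evalBody("str", new Object[] { 1.0, 2L }));
    }
}
```

Note that this sketch writes the double as 1.0D rather than the 1D shown on the wire above; both are valid literals for the same value.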

Eval

The jam:eval() function makes a POST request to the server sending in its request body the Java code for evaluation. The server doesn't care if the eval verb was used to set a variable or just execute code. The server returns a 204 (No Content) status code on success. On error? We'll discuss that at the end.

Get

The jam:get() function is one of the few actions using the HTTP GET method. It issues a GET request and passes the variable name as the "name" parameter on the query string.

The response from the server to the client depends on the type of variable held by Java. If it's a string, the server sends a response with the string in the response body and a content type of text/plain; charset=UTF-8. That way MarkLogic knows to process it as a string. If it's a binary, the server sends the bytes in the body and a content type of application/binary-encoded. For any other type, it sends executable XQuery in the response, using a content type of x-marklogic/xquery. That custom type tells the jam.xqy client to evaluate the response body in order to get the value. It's the same trick used in jam:set() but in the opposite direction. The status code 200 (OK) indicates success.

For example:

GET http://localhost:8080/mljam/12345/get?name=x

200 OK
Content-type: x-marklogic/xquery

(xs:double(1), "somestring")
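Here is a minimal sketch of how a server might pick the content type and render a non-string value as XQuery. The class GetResponseSketch and its methods are hypothetical, and the type coverage (Double, Long, String, and Object[] sequences) is an assumption made to keep the example short; the real server handles more types, and this sketch ignores NaN and infinite doubles.

```java
// Hypothetical illustration of the get response logic; not the MLJAM servlet code.
final class GetResponseSketch {

    /** Chooses the content type from the Java type, following the rules in the text. */
    static String contentTypeFor(Object value) {
        if (value instanceof String) return "text/plain; charset=UTF-8";
        if (value instanceof byte[]) return "application/binary-encoded";
        return "x-marklogic/xquery";
    }

    /** Renders a value as an XQuery expression for the client to evaluate. */
    static String toXQuery(Object value) {
        if (value instanceof Double) return "xs:double(" + value + ")";
        if (value instanceof Long) return "xs:integer(" + value + ")";
        if (value instanceof String) {
            // Inside an XQuery string literal, & is written &amp; and " is doubled.
            String s = ((String) value).replace("&", "&amp;").replace("\"", "\"\"");
            return "\"" + s + "\"";
        }
        if (value instanceof Object[]) {
            Object[] items = (Object[]) value;
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < items.length; i++) {
                if (i > 0) sb.append(", ");
                sb.append(toXQuery(items[i]));
            }
            return sb.append(")").toString();
        }
        throw new IllegalArgumentException("Unsupported type in this sketch: " + value);
    }

    public static void main(String[] args) {
        Object value = new Object[] { 1.0, "somestring" };
        System.out.println("Content-type: " + contentTypeFor(value));
        // Prints: (xs:double(1.0), "somestring")
        System.out.println(toXQuery(value));
    }
}
```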

Security of Code Passing

Is it safe to pass a data value as a code string to evaluate? Normally not. But this situation is a little different than normal.

We're deliberately creating a server designed to execute arbitrary code passed in from a client. The server in this scenario must completely trust the client (and restrict access for all other untrustworthy clients). We're not opening up any additional risk or requiring any more trust by doing XQuery -> Java variable sets using code.

Doing the opposite, sending XQuery code to the client for evaluation, does raise the level of trust required of the client. Sending code to the client requires that the client trust the server as much as the server trusts the client. Can we make that assumption? For the environments where we envision MLJAM, we think so. The same trust happens in Ajax when a JavaScript client talks to its server using JSON.

Eval-get

The jam:eval-get() function is a simple hybrid: it sends the code to the server using POST just like jam:eval() but returns a response body just like jam:get().

Unset

The jam:unset() function sends a POST request including a name parameter in the query string specifying the variable to unset. It uses POST, rather than GET, because it changes the server state in a possibly destructive way. It returns a 204 (No Content) status code.

Get-stdout and Get-stderr

The jam:get-stdout() and jam:get-stderr() functions make simple GET requests using the verbs get-stdout or get-stderr. The server returns the string value in the response body with a content type text/plain; charset=UTF-8.

Source

The jam:source() function sends a POST request using the verb source and provides the name of the file to source as the name parameter in the query string. The server returns 204 (No Content) on success.

End

The jam:end() function makes a simple POST request to the server using the end verb. The request doesn't include a request body. After removing the associated Interpreter from the internal memory, the server returns 204 (No Content).

Errors

Should the server ever need to send an error to the client (in response to any of the above calls) it does so by setting the status code to 200 (OK), setting the content type to x-marklogic/xquery, and sending a response body holding an error() call for the client to evaluate. For example:

200 OK
Content-type: x-marklogic/xquery

error('Token Parsing Error: Lexical error at line 2, column 29')

This naturally causes the client to generate the specified error.
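Because the error message ends up inside an XQuery string literal, a server building such a body has to watch for quotes in the message. The sketch below shows one way to do that; the class ErrorResponseSketch is hypothetical, and whether the real server escapes messages this way is an assumption on our part.

```java
// Hypothetical illustration of building the error response body; not the MLJAM servlet code.
final class ErrorResponseSketch {

    /** Wraps a message in an error() call, escaping it for a single-quoted XQuery string. */
    static String errorBody(String message) {
        // In an XQuery string literal, & is written &amp; and a single quote is doubled.
        String escaped = message.replace("&", "&amp;").replace("'", "''");
        return "error('" + escaped + "')";
    }

    public static void main(String[] args) {
        // Prints: error('Token Parsing Error: Lexical error at line 2, column 29')
        System.out.println(errorBody("Token Parsing Error: Lexical error at line 2, column 29"));
    }
}
```

Without the escaping step, a message containing an apostrophe would end the string literal early, and the client would report a syntax error instead of the intended one.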
