Using the built-in Java HTTP client to query Elasticsearch

Nov 25, 2020

22 minutes read

TL;DR: This post explains how to use the built-in Java HTTP client (available since Java 11) to query Elasticsearch within a Javalin-based web application. One of the goals of this post is to reduce dependencies. Fewer dependencies also means less code to take care of. You can find the complete sample code of this blog post in the javalin-elasticsearch-sample-app repository.

Why not the HLRC?

As you probably know, Elasticsearch features a RestClient as well as a high level rest client (from now on referred to as HLRC). If you want to know more about this, I blogged an introduction earlier this year.

The HLRC basically features a java class for every available request & response in Elasticsearch - it is a full abstraction of the Elasticsearch APIs and uses the Apache HTTP client under the hood.

One of the main reasons for the Apache HTTP client is its compatibility. Any application still running on Java 8 can make use of that client.

However, for a recent experiment I wanted to figure out whether I could use the features of newer Java versions like text blocks, records and the built-in HTTP client instead of the existing implementation. Also, I basically only needed the ability to index and search documents, and did not need any other Elasticsearch endpoint, like admin functionality or index management.

The new Java HTTP client

In Java, there has always been the famous HttpURLConnection class. It was quite the thing in its day, but calling it an HTTP client would not be fair to all the other fancy implementations created over the years. It also had a couple of notorious failure behaviors, like timeouts not kicking in after the headers from a client were received but before the body was sent. Also, HTTP GET requests with a body were not supported.

Over the years there were a couple of alternatives. The most common ones are likely the Apache HTTP Client and OkHttp from Square, with a couple of notable ones over the years like AsyncHttpClient - which is based on netty.

However, since Java 9 the JDK has included a new HTTP client, which went from incubator to GA status in Java 11. Most notably it supports HTTP/2, but more importantly it is finally fully asynchronous using CompletableFuture.

With this building block right inside the JDK, all we need to do is to create JSON based requests and use that client to send them over to Elasticsearch.

Effects of reducing the number of dependencies

Even though I mentioned at the beginning that there is less code to take care of, there’s more to this aspect. Fewer dependencies might mean fewer security holes, fewer moving parts in your application, and less work to upgrade. And there’s one more aspect: when you use common dependencies, you might need different versions of a single dependency. One of the common workarounds for this problem is shading, basically moving the dependencies into a different package at build time of your own project. While this works, it will increase the size of your project quite a bit.

The Elasticsearch HLRC basically has all the dependencies that Elasticsearch has, like Lucene, Netty, Jackson, log4j2, jodatime - all of these are common in other projects as well and have potential for a Jar conflict. And yes, this could also be solved via JPMS or custom classloaders, but I rarely see this anywhere.

The application

Today’s sample application is a web application that features two endpoints. The first is a POST operation to the /person endpoint that receives a JSON body representing a person, consisting of firstname, lastname and an employer field.

The second endpoint allows searching for persons by specifying a q parameter to the /search endpoint and returns a JSON array containing the found hits.

Writing a small Elasticsearch client

Just a side note: I will use Java 15 features (including preview ones) in these examples, so if you build the sample repository with your IDE, make sure the IDE has preview features enabled - everything should be configured properly in the Gradle project.

Writing an Elasticsearch client basically consists of three tasks:

Creating proper JSON before it is sent to Elasticsearch
Sending the data via HTTP
Receiving the JSON response and deserializing it

There is one more thing I try to do: not using reflection. I am not a big fan of reflection, and I learned to avoid it in the Elasticsearch code base. Everything related to JSON parsing there is done via pull parsing, which comes at a huge complexity price, even though there are some helper classes to deal with it.

So, let’s try to go step by step.

JSON creation via templates

Instead of programmatically creating JSON for queries, I have taken a different route by using a templating language. Elasticsearch also has a search template functionality, that uses mustache under the hood. However, a couple of months ago I found a new template engine named jte, which is compile time checked and has been added to Javalin as well.

So my basic idea is to use a template to render JSON. JTE already has two different content types, one for rendering plain text and one for rendering HTML with proper escaping. However, you cannot add your own types, so we need to use the plain text one and properly JSON escape the input parameters. So, what do we end up with? A renderer like this one:

public class Renderer {

    private final TemplateEngine templateEngine;
    private final JsonFactory factory;

    public Renderer(ObjectMapper mapper) {
        this.factory = mapper.getFactory();
        final ResourceCodeResolver resolver =
            new ResourceCodeResolver("templates");
        this.templateEngine = 
            TemplateEngine.create(resolver, ContentType.Plain);
    }

    // JSON escape each user content, but make sure no escaping
    // happens when reading input from the template
    private static final class JsonStringOutput extends StringOutput {
        @Override
        public void writeUserContent(String value) {
            super.writeUserContent(escape(value));
        }
    }

    String render(final String templateName, final Map<String, Object> params) {
        try (StringOutput output = new JsonStringOutput()) {
            templateEngine.render(templateName + ".jte", params, output);
            return output.toString();
        }
    }

    // escape method down here...
}

There is a static escape() method in the class to JSON escape user input in the template. I pretty much used the minimal-json implementation to make sure that any input is properly escaped.
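To give an idea of what such an escape() method has to do, here is a minimal sketch. It is not the implementation from the sample repository (which, as said, follows minimal-json); the JsonEscape class name is made up for illustration, and the sketch only covers the characters that must be escaped inside a JSON string: the double quote, the backslash and control characters.

```java
final class JsonEscape {

    private JsonEscape() {}

    // Escapes a value so it can be placed between double quotes in a JSON document.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                case '\b': sb.append("\\b"); break;
                case '\f': sb.append("\\f"); break;
                default:
                    if (c < 0x20) {
                        // remaining control characters become \u00XX
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

The important property is that a quote in user input can never terminate the JSON string in the template, which is exactly what the escaping tests further down check.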

What does a template look like? The search.jte file looks like this:

@param String query
{
  "query" : {
    "query_string" : {
      "query" : "${query}",
      "default_field":"name.first"
    }
  }
}

You need to specify the parameters that you want to use via @param definitions at the top. Also, if you use classes of specific types, those need to be imported. You can also write custom tags, but I do not think that is needed for this use-case.

So, why did I pick a template language, instead of hand writing my JSON? I find the generator pattern (which I will use later, when serializing a class just for comparison) much harder to read. Imagine writing the above query using the Jackson generator. This would already be quite a few lines of Java, and it would be much harder to build up the structure of that JSON in your head while reading that Java code.

Using a template language - and assuming that your queries are rather static - the JSON looks much more readable to me than anything that I create within Java code. This comes at the cost of some templating overhead, but I feel that this is OK, especially when the queries become longer and more complex, and because JTE is rather fast.

With this foundation we can now get started using the Java HTTP client.

The real Elasticsearch HTTP client

As every request contains all the information required to be sent to an HTTP server, the HTTP client creation looks like this

this.client = HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(5))
        .build();

I was mildly shocked to read the following in the Javadoc regarding the timeout for each request (the connect timeout above only covers establishing the initial connection to an endpoint, not each request)

The effect of not setting a timeout is the same as setting an infinite Duration, i.e. block forever.

I was surprised to find such an infinite timeout as default value, to be honest.
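To make the difference between the two timeouts explicit, here is a small sketch. The Timeouts class and its method names are only for illustration; the two builder calls are the ones used throughout this post.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

final class Timeouts {

    private Timeouts() {}

    // Client level: limits how long establishing a new connection may take.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
    }

    // Request level: limits how long we wait for the response of this request.
    // Without this call the request would wait forever.
    static HttpRequest newRequest(String uri) {
        return HttpRequest.newBuilder()
                .GET()
                .uri(URI.create(uri))
                .timeout(Duration.ofSeconds(10))
                .build();
    }
}
```

So every request built by the client should set its own timeout, which is what the search() and index() methods below do.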

The constructor for the ElasticsearchClient now looks like this:

private final HttpClient client;
private final String endpoint;
private final Map<String, String> headers;
private final Renderer renderer;
private final Parser parser;

private ElasticsearchClient(Renderer renderer, Parser parser,
                            String endpoint, Map<String, String> headers) {
    this.renderer = renderer;
    this.parser = parser;
    this.client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();
    this.endpoint = endpoint.endsWith("/") ? 
        endpoint.substring(0, endpoint.length()-1) : endpoint;
    // map might be immutable, so create a new one
    this.headers = new HashMap<>(headers);
    this.headers.putIfAbsent("Content-Type", "application/json");
}

The headers parameter can contain custom headers, for example for Basic authentication, as seen later. The endpoint should be something like http://localhost:9200.

With this as the foundation, let’s implement a basic search, using the Renderer class from above.

private static final String INDEX = "persons";

public SearchResponse search(String templateName, String query) throws IOException, InterruptedException {
    final String body = renderer.render(templateName, Map.of("query", query));
    final HttpRequest.Builder requestBuilder = HttpRequest.newBuilder()
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .uri(URI.create(endpoint + "/" + INDEX + "/_search"))
            .timeout(Duration.ofSeconds(10));
    headers.forEach((key, value) -> requestBuilder.setHeader(key, value));
    final HttpResponse<byte[]> response = client.send(requestBuilder.build(),
               HttpResponse.BodyHandlers.ofByteArray());
    return parser.toSearchResponse(response.body());
}

After rendering the body, the HTTP request gets created, enriched with the headers and then sent off with the client. The toSearchResponse() method converts a response byte array to a POJO again.

JSON to POJO parsing

So, what is doing the magic parsing to POJOs? First, because this project uses Java 15 features, we can make use of records for our helper classes like SearchResponse, SearchHit and Person, making them super short.

public record SearchResponse(List<SearchHit> hits) {}

public record SearchHit(String index, String id, float score, Person person) {}

public record Person(String firstName, String lastName, String employer) {}
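Records derive their accessors, equals(), hashCode() and toString() from the component list, which is why the tests further down can compare Person instances by value. A minimal sketch, reusing the Person record from above (the RecordDemo wrapper is hypothetical and only there to keep the example self-contained; it needs Java 16+, or Java 15 with --enable-preview):

```java
final class RecordDemo {

    private RecordDemo() {}

    record Person(String firstName, String lastName, String employer) {}

    static boolean sameValue() {
        Person a = new Person("Alexander", "Reelsen", "Elastic");
        Person b = new Person("Alexander", "Reelsen", "Elastic");
        // records compare by component values, not by identity
        return a.equals(b) && a.hashCode() == b.hashCode();
    }
}
```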

We also need a JSON parser that converts an HTTP response byte array read from Elasticsearch into a Java record:

public class Parser {

    private static final JsonPointer hitsArray =
        JsonPointer.compile("/hits/hits");
    private static final JsonPointer hitsTotalValue =
        JsonPointer.compile("/hits/total/value");
    private static final JsonPointer hitSource =
        JsonPointer.compile("/_source");

    private final ObjectMapper mapper;

    public Parser(ObjectMapper mapper) {
        this.mapper = mapper;
    }

    SearchResponse toSearchResponse(byte[] data) throws IOException {
        final JsonNode node = mapper.readTree(data);
        boolean hasHits = node.at(hitsTotalValue).longValue() > 0;
        if (hasHits) {
            final JsonNode hits = node.at(hitsArray);
            List<SearchHit> searchHits = new ArrayList<>(hits.size());
            hits.forEach(hit -> {
                Person person = Person.parse(hit.at(hitSource));
                searchHits.add(new SearchHit(hit.get("_index").asText(), 
                                             hit.get("_id").asText(), 
                                             hit.get("_score").floatValue(), 
                                             person));
            });
            return new SearchResponse(searchHits);
        }

        return new SearchResponse(Collections.emptyList());
    }
}

So, instead of introspecting the class or the object via reflection, this piece of code uses the Jackson ObjectMapper in combination with JSON pointers. I had no idea that JSON Pointer was an RFC (RFC 6901) until a week ago. Regardless, this makes poking around in complex JSON structures a breeze and allows for easy extraction of entities from an Elasticsearch query response. The person parsing code assumes the following JSON structure

{
  "name": {
    "first": "Alexander",
    "last": "Reelsen"
  },
  "employer": "Elastic"
}

This parsing logic is the foundation within the self written client. Serializing via a template, parsing using records and JSON pointers and sending the data using the built-in Java HTTP client.

Just to show the alternative, the Renderer class features the classic Jackson serialization using a JSON generator

public byte[] searchResponse(SearchResponse searchResponse) throws IOException {
    try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
         JsonGenerator generator = factory.createGenerator(bos)) {
        generator.writeStartArray();
        for (SearchHit hit : searchResponse.hits()) {
            generator.writeStartObject();
            generator.writeObjectFieldStart("name");
            generator.writeStringField("first", hit.person().firstName());
            generator.writeStringField("last", hit.person().lastName());
            generator.writeEndObject();
            generator.writeStringField("employer", hit.person().employer());
            generator.writeEndObject();
        }
        // close the array opened above, otherwise the closing ']' is missing
        generator.writeEndArray();
        generator.flush();
        return bos.toByteArray();
    }
}

As mentioned earlier, it is much harder to figure out the structure of the JSON by looking at the code, but this is the fastest way of serialization. Considering that you also have to ensure all resources are properly closed and flushed, maybe using templates for serialization is not such a bad idea either! 😀

This kind of serialization is also needed to index a POJO back into Elasticsearch. This would be a viable index() method in the client class:

public void index(Person person) throws IOException, InterruptedException {
    final HttpRequest.Builder requestBuilder = HttpRequest.newBuilder()
            .POST(HttpRequest.BodyPublishers.ofString(toJson(person)))
            .uri(URI.create(endpoint + "/" + INDEX + "/_doc/"))
            .timeout(Duration.ofSeconds(10));
    headers.forEach((key, value) -> requestBuilder.setHeader(key, value));

    final HttpResponse<byte[]> response = client.send(requestBuilder.build(), HttpResponse.BodyHandlers.ofByteArray());
    // This is really bad error handling, you need to bubble the Elasticsearch client side exception up as well!
    if (response.statusCode() != 201) {
        throw new RuntimeException("Error indexing new person: " + response.statusCode());
    }
}

private String toJson(Person person) throws IOException {
    // unfortunately we cannot use records in JTE yet
    // so we have to serialize each getter into its own field
    return renderer.render("person", Map.of("firstName", person.firstName(), "lastName", person.lastName(), "employer", person.employer()));
    // if the above is fixed, or records are not a preview feature anymore, we can go with this instead and fix the template
    //return renderer.render("person", Map.of("person", person));
}

The corresponding JTE person template looks like this

@param String firstName
@param String lastName
@param String employer
{ "name" : { "first" : "${firstName}", "last" : "${lastName}"}, "employer" : "${employer}" }

Once previews can be enabled for template compilation this can be changed to

@import model.Person;
@param Person person
{ 
  "name" : {
    "first" : "${person.firstName()}", 
    "last" : "${person.lastName()}"
  },
  "employer" : "${person.employer()}" 
}

Let’s move on to integrate this into a small Javalin app.

Integrating with Javalin

Javalin is more than a small web server for testing. It’s built on top of Jetty and you can write full web applications with it. One of the features I really enjoy is the fast startup time compared to something like a Spring Boot application.

Let’s set up a proper build.gradle file first, including all the dependencies. This also includes a few features we will need later for testing and packaging.

plugins {
  id 'java'
  id 'application'
  id 'com.github.johnrengelman.shadow' version '6.0.0'
}

repositories {
  jcenter()
}

group = 'de.spinscale.javalin'
version = '0.1.0-SNAPSHOT'

sourceCompatibility = 15
targetCompatibility = 15

// enable previews for records
tasks.withType(JavaCompile) {
  options.compilerArgs += '--enable-preview'
}

tasks.withType(Test) {
  jvmArgs += "--enable-preview"
}

dependencies {
  compile 'io.javalin:javalin:3.12.0'
  compile 'gg.jte:jte:1.4.0'
  compile 'org.slf4j:slf4j-simple:1.8.0-beta4'
  compile "com.fasterxml.jackson.core:jackson-databind:2.10.3"

  testCompile "org.assertj:assertj-core:3.18.1"
  testCompile 'org.mockito:mockito-core:3.6.0'

  // use @Testcontainers annotation to manage lifecycle in tests
  testImplementation "org.testcontainers:junit-jupiter:1.15.0"
  testCompile "org.testcontainers:elasticsearch:1.15.0"

  // jupiter support
  testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
  testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
}

test {
    useJUnitPlatform()
}

application {
  mainClassName = 'app.App'
}

jar {
  manifest {
    attributes 'Main-Class': 'app.App'
  }
}

The most important part here is setting the Java 15 language level plus enabling preview features when compiling - otherwise the project will not compile. You can also see the required run-time dependencies and JUnit 5 + AssertJ + Testcontainers for testing.

Let’s build our Javalin App class:

public class App {

    public static void main(String[] args) {
        final ObjectMapper mapper = new ObjectMapper();
        final Renderer renderer = new Renderer(mapper);
        final Parser parser = new Parser(mapper);
        final ElasticsearchClient client = 
          ElasticsearchClient.newBuilder(renderer, parser)
            .fromEnvironment().build();

        Javalin app = Javalin.create().start(7000);

        final String result = "{\"healthy\":\"ok\"}";
        app.get("/", 
            ctx -> ctx.contentType("application/json").result(result));

        app.get("/search", ctx -> {
            final SearchResponse searchResponse = 
                client.search("search", ctx.queryParam("q"));
            ctx.contentType("application/json").status(200)
                .result(renderer.searchResponse(searchResponse));
        });

        app.post("/person", ctx -> {
            final Person person = parser.toPerson(ctx.bodyAsBytes());
            client.index(person);
            ctx.status(200);
        });
    }
}

Testing

There are different components that need to be tested.

Escaping of the JSON input
Rendering of templates
Sending data to an endpoint
Parsing JSON
Serializing JSON
Integration tests against a real Elasticsearch

Writing unit tests

Let’s start with testing our renderer as well as JSON escaping

public class RendererTests {

    private static final Renderer renderer = new Renderer(new ObjectMapper());

    @Test
    public void testEscape() {
        assertThat(Renderer.escape("a")).isEqualTo("a");
        assertThat(Renderer.escape("ab")).isEqualTo("ab");
        assertThat(Renderer.escape("test")).isEqualTo("test");
        assertThat(Renderer.escape(" test")).isEqualTo(" test");
        assertThat(Renderer.escape(" test ")).isEqualTo(" test ");
        assertThat(Renderer.escape("\"")).isEqualTo("\\\"");
        assertThat(Renderer.escape("\"a")).isEqualTo("\\\"a");
        assertThat(Renderer.escape("\"a\"")).isEqualTo("\\\"a\\\"");
        assertThat(Renderer.escape("a\"b")).isEqualTo("a\\\"b");
        assertThat(Renderer.escape("{\"spam\":\"eggs\"}"))
            .isEqualTo("{\\\"spam\\\":\\\"eggs\\\"}");
    }

    @Test
    public void testRendering() {
        String data = renderer.render("test", Collections.emptyMap());
        assertThat(data).isEqualTo("{\"hello\":\"world\"}");
    }

    @Test
    public void testRenderingWithDifferentArguments() {
        String data = renderer.render("test", Map.of("name", "test"));
        assertThat(data).isEqualTo("{\"hello\":\"test\"}");
    }

    @Test
    public void testRenderingEscaping() {
        String data = renderer.render("test", Map.of("name", "world\",\"foo\":\"bar"));
        assertThat(data).isNotEqualTo("{\"hello\":\"world\",\"foo\":\"bar\"}");
    }
}

The corresponding test.jte template looks like this

@param String name
{"hello":@if(name != null)"${name}"@else"world"@endif}

Next up is testing the parser

public class ParserTests {

    private static final Parser parser = new Parser(new ObjectMapper());

    @Test
    public void testSearchResponseParsing() throws Exception {
        final byte[] data = sampleSearchResponse();
        final SearchResponse response = parser.toSearchResponse(data);
        assertThat(response.hits()).hasSize(2);

        assertThat(response.hits().get(0).id()).isEqualTo("first");
        assertThat(response.hits().get(0).index()).isEqualTo("foo");
        Person firstPerson = new Person("first", "last", "Elastic");
        assertThat(response.hits().get(0).person()).isEqualTo(firstPerson);

        assertThat(response.hits().get(1).id()).isEqualTo("second");
        assertThat(response.hits().get(1).index()).isEqualTo("bar");
        Person secondPerson = new Person("2nd", "2nd last", "2nd Elastic");
        assertThat(response.hits().get(1).person()).isEqualTo(secondPerson);
    }

    private static byte[] sampleSearchResponse() {
        return """
               ... <original search response here>
               """.getBytes(Charsets.UTF_8);
    }
}

Properly testing the sending of HTTP requests requires either another mock web server that checks if the request was correct or a heavy integration test, that uses Elasticsearch. This is what testcontainers can be used for.
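The sample project goes the integration test route, but the mock web server variant can be sketched without any extra dependency, using the HTTP server that ships with the JDK (com.sun.net.httpserver). This is only an illustration and not part of the sample repository; the class and method names are made up. The stub records the method, path and body it receives so a test can assert on them.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicReference;

final class StubServerExample {

    private StubServerExample() {}

    // Starts a throwaway server on a random port, sends one search request
    // to it and returns "<method> <path> <body>" as seen by the server.
    static String captureSearchRequest(String body) {
        AtomicReference<String> captured = new AtomicReference<>();
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                String received = new String(
                        exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
                captured.set(exchange.getRequestMethod() + " "
                        + exchange.getRequestURI().getPath() + " " + received);
                byte[] response = "{}".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, response.length);
                exchange.getResponseBody().write(response);
                exchange.close();
            });
            server.start();

            // stick to HTTP/1.1, which is all the JDK stub server speaks
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            HttpRequest request = HttpRequest.newBuilder()
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .uri(URI.create("http://127.0.0.1:"
                            + server.getAddress().getPort() + "/persons/_search"))
                    .header("Content-Type", "application/json")
                    .build();
            client.send(request, HttpResponse.BodyHandlers.discarding());
            return captured.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }
}
```

A test like this checks the request that was sent, but it cannot tell you whether Elasticsearch accepts the query, so it complements the integration test rather than replacing it.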

TestContainers for integration tests

If you’ve been reading through my other blog posts this year, you may have already reached the conclusion that I like Testcontainers very much. I do think it is a really nice way of running complex services easily within your tests, always at the possible expense of having fewer unit tests, even though those are much faster than starting up Elasticsearch within your tests. On my three-year-old notebook, starting Elasticsearch in a test easily takes 15 seconds.

The final ElasticsearchIntegrationTests class now sends data to Elasticsearch. We can use the index() and search() methods of our client to check if everything works as expected. Let’s take a look at the following test:

@Testcontainers
@Tag("slow")
public class ElasticsearchIntegrationTests {

    private static final ObjectMapper mapper = new ObjectMapper();
    private static final Renderer renderer = new Renderer(mapper);
    private static final Parser parser = new Parser(mapper);

    @Container
    private ElasticsearchContainer container = 
        new ElasticsearchContainer("docker.elastic.co/elasticsearch/elasticsearch:7.10.0");

    @Test
    public void testIndexAndSearch() throws Exception {
        final String endpoint = "http://" + container.getHttpHostAddress();
        ElasticsearchClient client = ElasticsearchClient
            .newBuilder(renderer, parser).withUri(endpoint).build();

        Person person = new Person("first", "last", "employer");
        client.index(person);

        refresh();

        SearchResponse response = client.search("search", "first");
        assertThat(response.hits()).hasSize(1);
        assertThat(response.hits().get(0).person()).isEqualTo(person);

        response = client.search("search", "non-existing");
        assertThat(response.hits()).isEmpty();
    }

    private void refresh() throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5)).build();
        final String uri = "http://" + 
            container.getHttpHostAddress() + "/persons/_refresh";

        final HttpRequest request = HttpRequest.newBuilder()
                .POST(HttpRequest.BodyPublishers.noBody())
                .uri(URI.create(uri))
                .header("Content-Type", "application/json")
                .timeout(Duration.ofSeconds(10))
                .build();

        client.send(request, HttpResponse.BodyHandlers.ofByteArray());
    }
}

The test spins up a docker container, and configures the ElasticsearchClient instance to point to that docker container.

The second noticeable thing in this test is the refresh() method. It calls the refresh API so that the freshly indexed document becomes searchable immediately, instead of the test having to wait for the next periodic refresh.

Running ./gradlew clean check now takes a daunting 25s for less than ten tests to run. In order to fix this, you can use the @Tag annotation to mark tests using Testcontainers and then exclude them from regular test runs. Gradle requires you to add another task that only runs that kind of test.

test {
    useJUnitPlatform {
      excludeTags 'slow'
    }
}

task integTest(type: Test) {
  useJUnitPlatform {
    includeTags 'slow'
  }
}

So, now we got our tests up and running, let’s get the final piece into place and run the application against an Elasticsearch instance.

Creating the Elasticsearch client builder

To ease the creation of an Elasticsearch client, all the code so far used

ElasticsearchClient.newBuilder(renderer, parser).fromEnvironment().build();

Let’s take a look at what this is doing, as we have not covered that builder yet. The most important functionality the builder has to cover is the ability to either connect to an Elastic Cloud cluster by specifying a cloud id, or to properly set the URI, as well as the authentication information. Authentication can either be a username and a password or an API key.

public static Builder newBuilder(Renderer renderer, Parser parser) {
    return new Builder(renderer, parser);
}

public static class Builder {

    private String authorizationHeader;
    private String uri;
    private final Renderer renderer;
    private final Parser parser;

    public Builder(Renderer renderer, Parser parser) {
        this.renderer = renderer;
        this.parser = parser;
    }

    // pretty much copied from
    // https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java#L143-L177
    public Builder withCloudId(String cloudId) {
        // parse cloud id
        // ...
        return this;
    }

    public Builder withUri(String uri) {
        this.uri = uri;
        return this;
    }

    public Builder withAuth(String username, String password) {
        String input = username + ":" + password;
        String value = Base64.getEncoder().encodeToString(input.getBytes(Charsets.UTF_8));
        this.authorizationHeader = "Basic " + value;
        return this;
    }

    public Builder withApiKey(String apiKey) {
        String value = Base64.getEncoder()
            .encodeToString(apiKey.getBytes(Charsets.UTF_8));
        this.authorizationHeader = "ApiKey " + value;
        return this;
    }

    // build the client from the existing env vars with different priorities
    public Builder fromEnvironment() {
        final String cloudId = System.getenv("ELASTICSEARCH_CLOUD_ID");
        final String endpoint = System.getenv("ELASTICSEARCH_URL");
        if (cloudId != null) {
            withCloudId(cloudId);
        } else if (endpoint != null) {
            withUri(endpoint);
        } else {
            withUri("http://localhost:9200");
        }
        final String apiKey = System.getenv("ELASTICSEARCH_API_KEY");
        if (apiKey != null) {
            withApiKey(apiKey);
        } else {
            final String username = System.getenv("ELASTICSEARCH_USERNAME");
            final String password = System.getenv("ELASTICSEARCH_PASSWORD");
            if (username != null && password != null) {
                withAuth(username, password);
            }
        }

        return this;
    }

    public ElasticsearchClient build() {
        Map<String, String> headers = authorizationHeader != null ? Map.of("Authorization", authorizationHeader) : Collections.emptyMap();
        return new ElasticsearchClient(renderer, parser, uri, headers);
    }
}

If you need to, you can build any combination using the builder yourself, no matter whether you use an API key or basic authentication, or connect to a cloud instance or another one. The fromEnvironment() method reads from several environment variables - this can be useful when your application is deployed in a container and Vault is used to inject credentials into environment variables.
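The cloud id parsing is left out of the builder listing above. As a rough sketch of what it involves, and assuming the cloud id layout I remember from the RestClient code (an optional label, a colon, then the Base64 encoding of domain$elasticsearchId$kibanaId), it could look like this. The CloudIds class and toEndpoint() are hypothetical names, not part of the sample repository.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class CloudIds {

    private CloudIds() {}

    // Turns "<label>:<base64 of domain$esId$kibanaId>" into an HTTPS endpoint.
    static String toEndpoint(String cloudId) {
        // the human readable label before the colon is optional
        String encoded = cloudId.contains(":")
                ? cloudId.substring(cloudId.indexOf(':') + 1)
                : cloudId;
        String decoded = new String(
                Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        String[] parts = decoded.split("\\$");
        if (parts.length != 3) {
            throw new IllegalArgumentException("cloud id could not be decoded: " + cloudId);
        }
        // assumption: parts[0] is the domain, parts[1] the Elasticsearch cluster id
        return "https://" + parts[1] + "." + parts[0];
    }
}
```

The resulting string could then be handed to withUri(), so the rest of the client does not need to know whether it talks to a cloud cluster or a local one.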

Let’s build a single jar and start this up with the proper credentials to connect to a cloud cluster.

Packaging into a single jar

As you could already see in the above build.gradle we use the shadowJar plugin to package the whole application into a single jar, using shading as well. This makes it easier to ship and start. Just run

./gradlew clean check shadowJar

And you will end up with a file in build/libs. This file is named javalin-elasticsearch-client-0.1.0-SNAPSHOT-all.jar. However, the best part of this file is its size: on my machine it is 6.5 MB. For a Java application including all dependencies this is refreshingly small. If you did the same but included the HLRC dependency of Elasticsearch, the jar has a size of about 33.4 MB for me - which frankly is still not big compared to many other existing web applications. A small Spring Boot application I wrote recently is about 80 MB in size.

Starting the container

Now with the uber jar being built, you can simply start like this

java --enable-preview -jar \
  build/libs/javalin-elasticsearch-client-0.1.0-SNAPSHOT-all.jar

You need to enable preview features here as well.

The above call will fail, because no Elasticsearch endpoint has been configured. You can either set ELASTICSEARCH_URL="http://localhost:9200" manually or set it in your .envrc and have a tool like direnv load it - which is my preferred way.

After doing so, you should see something like this

java --enable-preview -jar build/libs/javalin-elasticsearch-client-0.1.0-SNAPSHOT-all.jar
[main] INFO io.javalin.Javalin -
           __                      __ _
          / /____ _ _   __ ____ _ / /(_)____
     __  / // __ `/| | / // __ `// // // __ \
    / /_/ // /_/ / | |/ // /_/ // // // / / /
    \____/ \__,_/  |___/ \__,_//_//_//_/ /_/

        https://javalin.io/documentation

[main] INFO org.eclipse.jetty.util.log - Logging initialized @459ms to org.eclipse.jetty.util.log.Slf4jLog
[main] INFO io.javalin.Javalin - Starting Javalin ...
[main] INFO io.javalin.Javalin - Listening on http://localhost:7000/
[main] INFO io.javalin.Javalin - Javalin started in 114ms \o/

That’s a neat fast startup. Make sure you have an Elasticsearch cluster running locally and send the following curl request:

curl -v 'localhost:7000/person' -X POST \
  -d '{"name":{"first":"Alexander", "last":"Reelsen"},"employer":"Elastic"}'

Next up, you can search for that entry via

curl 'localhost:7000/search?q=alexander'
[{"name":{"first":"Alexander","last":"Reelsen"},"employer":"Elastic"}]

In order to make sure this works with Elastic Cloud, let’s create a cloud cluster and an API key

POST /_security/api_key
{
  "name": "persons-api-key",
  "role_descriptors": {
    "persons_role": {
      "index": [
        {
          "names": [ "persons" ],
          "privileges": [ "read", "write", "create_index" ]
        }
      ]
    }
  }
}

Returns something like

{
  "id" : "v3bv-XUBsPFf-JVRLYsW",
  "name" : "persons-api-key",
  "api_key" : "VKgZ9DR5TuGGfZVMOk2r-Q"
}

export ELASTICSEARCH_CLOUD_ID="my-deployment:dXMtZWFzdC0xLmF3cy5mb3VuZC5pbyRmYmY3ZThlOWMyNWE0OTNlYjhiYWQ0ZTcyZjAzZjc3OSQ4ZDA5ZGI3YjllY2Y0OTUxYTk1MTNiNjQxNmExODNhMA=="
export ELASTICSEARCH_API_KEY="v3bv-XUBsPFf-JVRLYsW:VKgZ9DR5TuGGfZVMOk2r-Q"
java --enable-preview -jar \
    build/libs/javalin-elasticsearch-client-0.1.0-SNAPSHOT-all.jar

With the cloud ID and API key exported and the Javalin app started as shown above, you can now index your data again on the cloud cluster and query it as well, by sending the same requests as above.
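For the curious, here is a rough sketch of what a client typically does with these two values; the sample app may well do it differently. It assumes the usual Cloud ID layout (a name, a colon, then a base64 string that decodes to host$elasticsearch-id$kibana-id) and the ApiKey authorization scheme, where the id:api_key pair is base64 encoded. The class and method names are made up for illustration.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not taken from the sample repository.
public final class CloudConfig {

    private CloudConfig() {
    }

    // Turns "name:base64(host$es-id$kibana-id)" into https://es-id.host
    static URI endpointFromCloudId(String cloudId) {
        // Everything after the first colon is the base64 encoded part
        String encoded = cloudId.substring(cloudId.indexOf(':') + 1);
        String decoded = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        String[] parts = decoded.split("\\$");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unexpected cloud ID format");
        }
        return URI.create("https://" + parts[1] + "." + parts[0]);
    }

    // Builds the Authorization header value from an "id:api_key" pair
    static String apiKeyHeader(String idAndKey) {
        String encoded = Base64.getEncoder()
                .encodeToString(idAndKey.getBytes(StandardCharsets.UTF_8));
        return "ApiKey " + encoded;
    }
}
```

The header value would then be set via HttpRequest.Builder.header("Authorization", ...) on every request sent to the cloud endpoint.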

Summary

Keeping a small dependency tree while querying Elasticsearch is possible. When checking out the dependencies of this project, it is pretty much Jetty, a few Kotlin jars (Javalin is written in Kotlin) plus JTE and Jackson. Also, I learned about JSON pointers, which are great helpers when deserializing JSON to a POJO without doing annotation & reflection based object mapping.

That said, like any project that can be implemented within a blog post, this one is full of gross oversimplifications, and maybe you will need more dependencies, bringing back those that you tried to save in your project. If that is the case, it probably does not make a lot of sense to try to stay small; just go with the HLRC. Pro-tip: Always stay pragmatic!

I enjoyed diving into JTE as a template language and do have a use-case for this kind of application at the moment, as my application only needs to index a few documents and otherwise merely queries Elasticsearch using the exact same query with slightly changing inputs.

Next steps

There are a lot of things to improve in the demo code base. First, to make proper use of system resources, you should rather go async with the HTTP client and pass a future to the Javalin result, as this is fully supported.
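A minimal sketch of the async variant, using only the built-in HTTP client, could look like the following. The class and method names and the way the search URI is assembled are assumptions for illustration and not the code from the sample repository.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper showing the non-blocking flavor of the HTTP client.
public final class AsyncSearch {

    private AsyncSearch() {
    }

    // Builds a POST request against the _search endpoint of the given index
    static HttpRequest searchRequest(URI baseUri, String index, String jsonBody) {
        return HttpRequest.newBuilder(baseUri.resolve("/" + index + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    // sendAsync returns immediately, the future completes with the response body
    static CompletableFuture<String> searchAsync(HttpClient client, HttpRequest request) {
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
    }
}
```

The future returned by searchAsync is what you would hand over to Javalin as the result instead of blocking the request thread; mapping the raw response body to your own objects can be chained with another thenApply.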

If you wait a couple of JDK releases, maybe Project Loom will be the salvation for blocking I/O use-cases. There is a small benchmark with Javalin called Loomylin, which compares different workloads using Loom, Futures and blocking I/O.

A single strategy for JSON serialization might be a good idea as well. I have mixed the Jackson JSON generator style here with templates on purpose, but at some point you should probably decide to go with one of those. If you keep the JSON generator, then having an interface for a common serialization sounds like a good idea as well. Also, JTE will hopefully soon support preview features in its template compiler to support records properly, see this GitHub issue - also thanks to the JTE author Andreas, who is always super responsive.

Feel free to improve and leave PRs at the GitHub repository, but especially go ahead and fork it. I hope you learned something while reading this.

Resources

Code sample repository
JTE templating language
Elasticsearch HLRC documentation
JSON template: I found this while researching for this post and found the idea pretty nice, but JSON pointers work as well for me.
Argo: I asked for reflection-free JSON parsers on Twitter and someone pointed me to this library, which I had never heard of. I have to admit that the fact that it’s a SourceForge link already drove me away from it a bit. Also, there are several other libraries named Argo that parse JSON in various languages, so watch out.
Which Java HTTP client should I use in 2020?

Final remarks

If you made it down here, wooow! Thanks for sticking with me. You can follow or ping me on Twitter or GitHub, or reach me via email (just to tell me you read this whole thing :-).

If there is anything to correct, drop me a note, and I am happy to do so and append to this post!

The same applies to questions. If you have a question, go ahead and ask!

If you want me to speak about this, drop me an email!


Backend developer, productivity fan, likes the JVM, full text search, distributed databases & systems