TL;DR: In this article we will go from nothing to a fully running Spring Boot application querying Elastic App Search, which has crawled the contents of a website. We will spin up the cluster and configure the application step by step.
Clone the sample repo
The full example, ready to test out via Terraform, is available in a public GitHub repository.
git clone https://github.com/spinscale/spring-boot-app-search
Spinning up a cluster
In order to have an example up and running, we have to create an API key in Elastic Cloud as described in the terraform provider setup.
Once that is done, run
terraform init
terraform validate
terraform apply
and grab a coffee before the serious work begins. After a few minutes you should see your instance up and running in the Elastic Cloud UI.
Configuring the Spring Boot application
Before we continue, let's make sure we can build and run the Java application. All you need is Java 17 installed, and you can go ahead and run
./gradlew clean check
This will download all the dependencies, run the tests, and fail. This is expected, as we have not yet indexed any data into our App Search instance. Before we can do that, we have to change the configuration and index some data. Let's start by editing the src/main/resources/application.properties file (the snippet below only shows the parameters that require changing!):
appsearch.url=https://dc3ff02216a54511ae43293497c31b20.ent-search.westeurope.azure.elastic-cloud.com
appsearch.engine=web-crawler-search-engine
appsearch.key=search-untdq4d62zdka9zq4mkre4vv
feign.client.config.appsearch.defaultRequestHeaders.Authorization=Bearer search-untdq4d62zdka9zq4mkre4vv
If you do not want to enter any password, you can log into the Kibana instance via the Elastic Cloud UI and then go to Enterprise Search > App Search. You can extract the appsearch.key and the feign... search key from the Credentials page within App Search. The same applies for the Endpoint, which is shown right at the top.
Now, when running ./gradlew clean check, the correct App Search endpoint is hit, but we still have not indexed any data. Let's do that now!
Configuring the Crawler
Before setting up a crawler, we need to create a container for our documents. In App Search this container is called an engine, so let's create one. Name your engine web-crawler-search-engine, so it matches the application.properties file.
After that, let's configure a crawler by clicking Use The Crawler. Now add a domain. You can add your own domain here; I went with my personal blog spinscale.de.
When clicking on Validate Domain, a few checks are done, and then the domain gets added to the engine.
The final step is to trigger a crawl manually so data gets indexed right now. Click on Start a crawl.
Now wait a minute and check the engine overview to see whether documents have been added. With data indexed into our engine, let's rerun the test via ./gradlew check and see if it passes. It should pass now, and you can also see a recent API call from the test at the bottom of the engine overview.
Before starting our app, let's take a quick look at the test code:
@SpringBootTest(classes = SpringBootAppSearchApplication.class, webEnvironment = SpringBootTest.WebEnvironment.NONE)
class AppSearchClientTests {

    @Autowired
    private AppSearchClient appSearchClient;

    @Test
    public void testFeignAppSearchClient() {
        final QueryResponse queryResponse = appSearchClient.search(Query.of("seccomp"));
        assertThat(queryResponse.getResults()).hasSize(4);
        assertThat(queryResponse.getResults().stream().map(QueryResponse.Result::getTitle))
            .contains("Using seccomp - Making your applications more secure",
                "Presentations",
                "Elasticsearch - Securing a search engine while maintaining usability",
                "Posts"
            );
        assertThat(queryResponse.getResults().stream().map(QueryResponse.Result::getUrl))
            .contains("https://spinscale.de/posts/2020-10-27-seccomp-making-applications-more-secure.html",
                "https://spinscale.de/presentations.html",
                "https://spinscale.de/posts/2020-04-07-elasticsearch-securing-a-search-engine-while-maintaining-usability.html",
                "https://spinscale.de/posts/"
            );
    }
}
This test fires up the Spring application without binding to a port, injects the AppSearchClient bean, and runs a search for seccomp.
Starting up the application
All right, time to get up and running and check if our application starts up:
./gradlew bootRun
You should see a few log messages, most importantly one telling you that your application has started:
2022-03-16 15:43:01.573 INFO 21247 --- [ restartedMain] d.s.s.SpringBootAppSearchApplication : Started SpringBootAppSearchApplication in 1.114 seconds (JVM running for 1.291)
You can now open the app in a browser and take a look, but I would like to take a look at the Java code first.
Defining an interface only for our search client
In order to query the App Search endpoint from within Spring Boot, all we need to implement is an interface, thanks to Feign. We do not have to care about JSON serialization or creating HTTP connections; we can work with POJOs only. This is the definition of our App Search client:
@FeignClient(name = "appsearch", url="${appsearch.url}")
public interface AppSearchClient {
@GetMapping("/api/as/v1/engines/${appsearch.engine}/search")
QueryResponse search(@RequestBody Query query);
}
The client uses the application.properties definitions for the url and engine, so that none of this needs to be specified as part of the API call. This client also uses the headers defined in the application.properties file. This way, no application code contains URLs, engine names or custom auth headers.
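For this to work, Feign support needs to be switched on somewhere in the application. A minimal sketch of what that typically looks like with Spring Cloud OpenFeign is shown below; the class name is taken from the test above, but the exact setup in the repository may differ slightly:
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.openfeign.EnableFeignClients;

@SpringBootApplication
// scans for @FeignClient interfaces such as AppSearchClient
@EnableFeignClients
public class SpringBootAppSearchApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringBootAppSearchApplication.class, args);
    }
}
With this in place, Spring creates a ready-to-use implementation of the AppSearchClient interface at startup, which is why the test above can simply have it injected.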
The only classes requiring more implementation are Query, which models the request body, and QueryResponse, which models the search response. I have opted to only model the absolutely needed fields in the response, even though it usually contains much more JSON. Whenever I need more data, I can add it to the QueryResponse class.
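As a rough idea of what that can look like: App Search wraps every field of a result in an object with a raw property, so a minimal response class could unwrap those fields as sketched below. This is an illustrative sketch using plain Jackson annotations; the actual class in the repository may model the fields differently:
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;

import java.util.List;
import java.util.Map;

@JsonIgnoreProperties(ignoreUnknown = true)
public class QueryResponse {

    @JsonProperty("results")
    private List<Result> results;

    public List<Result> getResults() {
        return results;
    }

    @JsonIgnoreProperties(ignoreUnknown = true)
    public static class Result {

        // every field of a result comes back as an object like {"raw": "..."}
        @JsonProperty("title")
        private Map<String, Object> title;

        @JsonProperty("url")
        private Map<String, Object> url;

        @JsonProperty("meta_description")
        private Map<String, Object> metaDescription;

        public String getTitle() {
            return title == null ? null : String.valueOf(title.get("raw"));
        }

        public String getUrl() {
            return url == null ? null : String.valueOf(url.get("raw"));
        }

        public String getDescription() {
            return metaDescription == null ? null : String.valueOf(metaDescription.get("raw"));
        }
    }
}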
The Query class only consists of the query field for now:
public class Query {

    private final String query;

    public Query(String query) {
        this.query = query;
    }

    public String getQuery() {
        return query;
    }

    public static Query of(String query) {
        return new Query(query);
    }
}
So, finally, let’s run some searches from within the application.
Server side queries and rendering
The sample application implements three ways of querying the App Search instance and integrating the results into the Spring Boot application. The first one sends a search term to the Spring Boot app, which forwards the query to App Search and then renders the results via Thymeleaf, the standard templating dependency in Spring Boot.
This is the controller:
@Controller
@RequestMapping(path = "/")
public class MainController {

    private final AppSearchClient appSearchClient;

    public MainController(AppSearchClient appSearchClient) {
        this.appSearchClient = appSearchClient;
    }

    @GetMapping("/")
    public String main(@RequestParam(value = "q", required = false) String q,
                       Model model) {
        if (q != null && q.trim().isBlank() == false) {
            model.addAttribute("q", q);
            final QueryResponse response = appSearchClient.search(Query.of(q));
            model.addAttribute("results", response.getResults());
        }
        return "main";
    }
}
Looking at the main() method, there is a check for the q parameter. If it exists, the query is sent to App Search and the model is enriched with the results. Then the main.html Thymeleaf template is rendered. It looks like this:
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout"
layout:decorate="~{layouts/base}">
<body>
<div layout:fragment="content">
<div>
<form action="/" method="get">
<input autocomplete="off" placeholder="Enter search terms..."
type="text" name="q" th:value="${q}" style="width:20em" >
<input type="submit" value="Search" />
</form>
</div>
<div th:if="${results != null && !results.empty}">
<div th:each="result : ${results}">
<h4><a th:href="${result.url}" th:text="${result.title}"></a></h4>
<blockquote style="font-size: 0.7em" th:text="${result.description}"></blockquote>
<hr>
</div>
</div>
</div>
</body>
</html>
The template checks for the results variable and, if it is set, iterates through that list. For every result the same snippet of title, link and description is rendered.
Using htmx for dynamic page updates
As you can see in the top navigation, we can switch between three ways of searching. When clicking on the second one, named htmx based search, the execution model changes slightly.
Instead of reloading the whole page, only the part with the results is replaced with what the server returns. The nice part about this is that it can be done without writing any JavaScript. This is possible thanks to the awesome htmx library. Citing its description:
htmx gives you access to AJAX, CSS Transitions, WebSockets and Server Sent Events directly in HTML, using attributes, so you can build modern user interfaces with the simplicity and power of hypertext
In this example only a tiny subset of htmx is used. Let's take a look at the two endpoint definitions first: one for rendering the HTML and one for returning only the HTML snippet needed to update that part of the page.
// htmx based search using server side rendering for responses only without page reload
@GetMapping("/htmx")
public String htmx() {
    return "htmx-main";
}

@PostMapping("/htmx-search")
public String htmxSearch(@RequestParam("q") String q, Model model) {
    if (q != null && q.trim().isBlank() == false) {
        final QueryResponse response = appSearchClient.search(Query.of(q));
        model.addAttribute("results", response.getResults());
    }
    return "htmx-search-results";
}
The first renders the htmx-main template, whereas the second endpoint renders the results. The htmx-main template looks like this:
<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org"
xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout"
layout:decorate="~{layouts/base}">
<body>
<div layout:fragment="content">
<div>
<form action="/search" method="get">
<input type="search"
autocomplete="off"
id="searchbox"
name="q" placeholder="Begin Typing To Search Articles..."
hx-post="/htmx-search"
hx-trigger="keyup changed delay:500ms, search"
hx-target="#search-results"
hx-indicator=".htmx-indicator"
style="width:20em">
<span class="htmx-indicator" style="padding-left: 1em;color:red">Searching... </span>
</form>
</div>
<div id="search-results">
</div>
</div>
</body>
</html>
The magic happens in the hx- attributes of the <input> HTML element. Reading this out loud, this translates to:
- Trigger an HTTP request only if there was no typing activity for 500ms.
- Then send an HTTP POST request to /htmx-search.
- While waiting, show the .htmx-indicator element.
- Render the response into the element with the ID #search-results.
Just think about the sheer amount of JavaScript you would need for all the logic around key listeners, showing an element while waiting for a response, or sending the AJAX request.
The other big advantage is being able to use your favourite server-side rendering solution to create the HTML being returned. This means we can stay in the Thymeleaf cosmos instead of having to implement some client-side templating language. This makes the htmx-search-results template very simple; it just iterates through the results:
<div th:each="result : ${results}">
<h4><a th:href="${result.url}" th:text="${result.title}"></a></h4>
<blockquote style="font-size: 0.7em" th:text="${result.description}"></blockquote>
<hr>
</div>
One difference to the first example is that the URL of this search never changes, so you cannot bookmark it. Even though there is history support in htmx, I left it out for the sake of this example, as it requires some more careful implementation to do it right.
Searching via the browser against App Search
Now on to the last example. This one is vastly different, because it does not involve the Spring Boot server side at all: everything is done in the browser, using Alpine.js. The server-side endpoint looks as simple as possible:
@GetMapping("/alpine")
public String alpine() {
return "alpine-js";
}
The alpine-js.html template requires some more explanation, but let's take a look first:
<!DOCTYPE html>
<html
xmlns:layout="http://www.ultraq.net.nz/thymeleaf/layout"
layout:decorate="~{layouts/base}">
<body>
<div layout:fragment="content" x-data="{ q: '', response: null }">
<div>
<form @submit.prevent="">
<input type="search" autocomplete="off" placeholder="Begin Typing To Search Articles..." style="width:20em"
x-model="q"
@keyup="client.search(q).then(resultList => response = resultList)">
</form>
</div>
<template x-if="response != null && response.info.meta != null && response.info.meta.request_id != null">
<template x-for="result in response.results">
<template x-if="result.data != null && result.data.title != null && result.data.url != null && result.data.meta_description != null ">
<div>
<h4><a class="track-click" :data-request-id="response.info.meta.request_id" :data-document-id="result.data.id.raw" :data-query="q" :href="result.data.url.raw" x-text="result.data.title.raw"></a></h4>
<blockquote style="font-size: 0.7em" x-text="result.data.meta_description.raw"></blockquote>
<hr>
</div>
</template>
</template>
</template>
<script th:inline="javascript">
var client = window.ElasticAppSearch.createClient({
searchKey: [[${@environment.getProperty('appsearch.key')}]],
endpointBase: [[${@environment.getProperty('appsearch.url')}]],
engineName: [[${@environment.getProperty('appsearch.engine')}]]
});
document.addEventListener("click", function(e) {
const el = e.target;
if (!el.classList.contains("track-click")) return;
client.click({
query: el.getAttribute("data-query"),
documentId: el.getAttribute("data-document-id"),
requestId: el.getAttribute("data-request-id")
});
});
</script>
</div>
</body>
</html>
The first major difference is the actual use of JavaScript to initialize the ElasticAppSearch client, using the configured properties from the application.properties file. Once that client is initialized, we can use it in the HTML attributes.
The code initializes two variables to be used:
<div layout:fragment="content" x-data="{ q: '', response: null }">
The q variable will contain the query from the input form, and the response variable will contain the response from a search. The next interesting part is the form definition:
<form @submit.prevent="">
<input type="search" autocomplete="off" placeholder="Search Articles..."
x-model="q"
@keyup="client.search(q).then(resultList => response = resultList)">
</form>
Using <input x-model="q" ...> binds the q variable to the input, so it is updated whenever the user types. There is also a keyup event handler that executes a search using client.search() and assigns the outcome to the response variable. So once the client search returns, the response variable won't be empty anymore. Lastly, @submit.prevent="" ensures that the form is not submitted.
Next up are the <template> tags, which are rendered once the response is set and iterate through the response.results list. For each element, the following HTML snippet is rendered:
<div>
<h4><a class="track-click"
:data-request-id="response.info.meta.request_id"
:data-document-id="result.data.id.raw"
:data-query="q"
:href="result.data.url.raw"
x-text="result.data.title.raw">
</a></h4>
<blockquote style="font-size: 0.7em"
x-text="result.data.meta_description.raw"></blockquote>
<hr>
</div>
This rendering is slightly different from the two server-side rendering implementations, because it contains added functionality to track clicked links. The important parts for rendering are the :href and x-text attributes, which set the link and its text. The other :data attributes are used for tracking clicks.
Tracking clicks
So, why would you want to track link clicks? Simple: it's one way to figure out whether your search results are good, by measuring whether your users click them. This is also why there is some more JavaScript included in this HTML snippet. Let's first take a look at how this looks in Kibana.
You can see the Click analytics at the bottom, which have tracked a click after I searched for crystal and clicked the first link. By clicking on that term you can see which document has been clicked and basically follow the click trail of your users.
So, how is this implemented in our little app? By using a JavaScript click listener for certain links. This is the snippet:
document.addEventListener("click", function(e) {
const el = e.target;
if (!el.classList.contains("track-click")) return;
client.click({
query: el.getAttribute("data-query"),
documentId: el.getAttribute("data-document-id"),
requestId: el.getAttribute("data-request-id")
});
});
If a clicked link has the track-click class, a click event is sent using the ElasticAppSearch client. This event contains the original query term, as well as the documentId and requestId that were part of the search response and were rendered into the <a> element in the template above.
We could also add this functionality to the server-side rendered variants by sending that information when a user clicks a link, so this is not exclusive to the browser at all. For the sake of simplicity I skipped the implementation here.
Summary
I hope you enjoyed this introduction to Elastic App Search from a developer perspective and the different possibilities of integrating it into your applications. Make sure to check out the GitHub repository and follow along with the example.
You can use terraform with the Elastic Cloud Provider to get up and running in no time in Elastic Cloud.
Final remarks
You can follow or ping me on Twitter or GitHub, or reach me via email (just to tell me you read this whole thing :-).
If there is anything to correct, drop me a note, and I am happy to do so and append to this post!
Same applies for questions. If you have a question, go ahead and ask!
If you want me to speak about this, drop me an email!