TL;DR: Today we will dive into the two Terraform providers available from Elastic. Terraform lets you define your infrastructure as code and keep it in repositories, so changes are easy to make and to track.
Getting started
Infrastructure as code (IaC) means managing your IT infrastructure via files, so that changes can be applied automatically, without manual intervention, and are properly documented at all times. Terraform has become something of a standard, with competitors such as CloudFormation and Pulumi. Today we will focus on Terraform in combination with Elastic Cloud, so you do not need to use the Cloud UI to spin up, change, or remove clusters.
Creating a Cloud API Key
In order to use Terraform, you need a Cloud API key. Make sure this is not an API key for an Elastic Cloud instance but one for your Elastic Cloud account. The Elastic Cloud provider describes the configuration in more detail in the Authentication part of its documentation.
Initial Terraform configuration
Let’s start with an almost minimal main.tf file:
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.4.0"
    }
  }
}
provider "ec" {
}
resource "ec_deployment" "custom-deployment-id" {
  name                   = "My deployment identifier"
  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"
  elasticsearch {}
  kibana {}
}
After storing this you can run
terraform init
terraform validate
terraform apply -auto-approve
The last call takes one to three minutes to bring up the Kibana and Elasticsearch nodes. With that, our instance is up and running. To actually make use of it, however, defining outputs in main.tf is very useful:
output "elasticsearch_endpoint" {
  value = ec_deployment.custom-deployment-id.elasticsearch[0].https_endpoint
}
output "elasticsearch_username" {
  value = ec_deployment.custom-deployment-id.elasticsearch_username
}
output "elasticsearch_password" {
  value     = ec_deployment.custom-deployment-id.elasticsearch_password
  sensitive = true
}
output "kibana_endpoint" {
  value = ec_deployment.custom-deployment-id.kibana[0].https_endpoint
}
Now, after running terraform apply again, you can see the endpoints in terraform output. With this output, you can run scripts that parse it and then perform actions such as index or template creation.
I used this Crystal snippet in a sample application:
require "json"
require "http/client"

# Capture the JSON form of all Terraform outputs
command = "terraform output -json"
io = IO::Memory.new
Process.run(command, shell: true, output: io)
tf = JSON.parse io.to_s

# Each output is an object whose "value" key holds the actual value
elasticsearch_endpoint = tf["elasticsearch_endpoint"]["value"].as_s
elasticsearch_username = tf["elasticsearch_username"]["value"].as_s
elasticsearch_password = tf["elasticsearch_password"]["value"].as_s
kibana_endpoint = tf["kibana_endpoint"]["value"].as_s

# Build an HTTP client for Elasticsearch that uses basic authentication
uri = URI.parse elasticsearch_endpoint
client = HTTP::Client.new uri
client.basic_auth elasticsearch_username, elasticsearch_password
This way, the location of the Terraform state is irrelevant as well.
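The same idea carries over to other languages. As a minimal sketch (not part of the original sample application), the Python helpers below flatten the JSON printed by terraform output -json into a plain dictionary. The function names parse_terraform_outputs and read_terraform_outputs are made up for illustration, and the second one assumes the terraform binary is on the PATH and that it is run against an initialized working directory.

```python
import json
import subprocess


def parse_terraform_outputs(raw_json: str) -> dict:
    """Flatten the JSON from `terraform output -json` into {name: value}."""
    outputs = json.loads(raw_json)
    # Each entry looks like {"sensitive": ..., "type": ..., "value": ...}
    return {name: entry["value"] for name, entry in outputs.items()}


def read_terraform_outputs(workdir: str = ".") -> dict:
    """Run `terraform output -json` in workdir and return the flattened outputs.

    Assumes the terraform binary is on the PATH (hypothetical helper).
    """
    result = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=workdir,
        check=True,           # raise if terraform exits with an error
        capture_output=True,
        text=True,
    )
    return parse_terraform_outputs(result.stdout)
```

Calling read_terraform_outputs() in the directory that contains main.tf would then give access to values such as the entry named elasticsearch_endpoint. Keep in mind that the JSON output includes sensitive values such as the password in plain text, so it should not be written to logs.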
Another feature of the Elastic Cloud provider is setting up remote clusters. Let’s take a look at how this works:
Setting up remote clusters for CCS/CCR
CCS/CCR functionality requires so-called remote clusters to be set up, where one cluster can access another cluster. The following sample snippet omits the setup and any unneeded output parts and shows only what is required.
resource "ec_deployment" "cluster_1" {
  name                   = "cluster_1"
  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"
  elasticsearch {}
  kibana {}
}
resource "ec_deployment" "cluster_2" {
  name                   = "cluster_2"
  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"
  elasticsearch {
    remote_cluster {
      deployment_id = ec_deployment.cluster_1.id
      alias         = ec_deployment.cluster_1.name
      ref_id        = ec_deployment.cluster_1.elasticsearch.0.ref_id
    }
  }
  kibana {}
}
In this setup cluster_2 will have a remote connection to cluster_1.
Time to set up cross-cluster replication. Note: The comment line above each command indicates which cluster it needs to be executed on!
# Cluster 1
PUT my-leader-index
# Cluster 2
PUT my-follower-index/_ccr/follow?wait_for_active_shards=1
{"remote_cluster":"cluster_1","leader_index":"my-leader-index"}
# Cluster 1
PUT /my-leader-index/_doc/my-doc?refresh=true
{"key":"value"}
# Cluster 2, repeat until hit count > 0, should take less than a second usually
GET /my-follower-index/_search
Now we have a follower index in cluster 2 based on the remote cluster setup in terraform.
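When this check is scripted rather than typed into the console, the "repeat until hit count > 0" step becomes a small polling loop. The following is only a sketch under stated assumptions: wait_for_hits is a hypothetical helper, and count_hits stands for any caller-supplied function that runs the search against my-follower-index on cluster 2 and returns the hit count.

```python
import time
from typing import Callable


def wait_for_hits(
    count_hits: Callable[[], int],
    attempts: int = 10,
    delay: float = 0.5,
) -> int:
    """Call count_hits until it reports at least one hit, then return the count.

    count_hits is supplied by the caller (hypothetical); it should query the
    follower index and return the number of hits.
    """
    for attempt in range(attempts):
        hits = count_hits()
        if hits > 0:
            return hits
        # Sleep between attempts, but not after the last one
        if attempt < attempts - 1:
            time.sleep(delay)
    raise TimeoutError(f"no hits after {attempts} attempts")
```

Bounding the number of attempts keeps a misconfigured remote cluster from stalling a demo script forever; the defaults here are arbitrary.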
Cross-cluster search works in a similar way, leveraging the same remote cluster connection:
# Cluster 1
PUT /cluster1-index/_doc/1?refresh=true
{"cluster":"cluster1","name":"my name"}
# Cluster 2
PUT /cluster2-index/_doc/1?refresh=true
{"cluster":"cluster2","name":"my name"}
# Cluster 2
GET /cluster2-index,cluster_1:cluster1-index/_search
Beyond managing the instances themselves, you may also want to change the configuration or setup of a particular cluster. This is where the second Terraform provider, the elasticstack provider, helps: it lets you configure a cluster regardless of whether it runs in the cloud or on-premises.
Using the elasticstack provider
The elasticstack provider lets you manage parts of the Elastic Stack, for example cluster settings, index and component templates, users and roles, or ingest pipelines and processors. In this example we will create an ingest pipeline with two processors:
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.4.0"
    }
    elasticstack = {
      source  = "elastic/elasticstack"
      version = "0.3.3"
    }
  }
}
provider "ec" {
}
resource "ec_deployment" "custom-deployment-id" {
  name                   = "custom-deployment-id"
  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"
  elasticsearch {}
  kibana {}
}
provider "elasticstack" {
  elasticsearch {
    username  = ec_deployment.custom-deployment-id.elasticsearch_username
    password  = ec_deployment.custom-deployment-id.elasticsearch_password
    endpoints = [ec_deployment.custom-deployment-id.elasticsearch[0].https_endpoint]
  }
}
data "elasticstack_elasticsearch_ingest_processor_set" "set_field_terraform" {
  field = "pipeline-source"
  value = "terraform"
}
data "elasticstack_elasticsearch_ingest_processor_grok" "grok_the_log" {
  field    = "message"
  patterns = ["%%{TIMESTAMP_ISO8601:@timestamp} %%{LOGLEVEL:level} %%{GREEDYDATA:message}"]
}
resource "elasticstack_elasticsearch_ingest_pipeline" "ingest" {
  name = "my-ingest-pipeline"
  processors = [
    data.elasticstack_elasticsearch_ingest_processor_set.set_field_terraform.json,
    data.elasticstack_elasticsearch_ingest_processor_grok.grok_the_log.json
  ]
}
This creates a pipeline named my-ingest-pipeline after bringing up the cluster. You could now go to the Management part in Kibana and see that the pipeline has been created, or just run the following simulate pipeline call:
POST _ingest/pipeline/my-ingest-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2022-03-03T12:34:56.789Z INFO ze message"
      }
    }
  ]
}
This will return the following document as part of the response:
"_source" : {
  "@timestamp" : "2022-03-03T12:34:56.789Z",
  "level" : "INFO",
  "pipeline-source" : "terraform",
  "message" : "ze message"
}
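To make the effect of the two processors more tangible, here is a rough Python approximation of what the pipeline does to a single document. This is only an illustration: the regular expression is a simplified stand-in for the grok patterns (the real TIMESTAMP_ISO8601 and LOGLEVEL definitions accept more variants), and simulate_pipeline is a made-up name, not an Elasticsearch API.

```python
import re

# Simplified stand-in for
# "%{TIMESTAMP_ISO8601:@timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}T\S+)\s+"
    r"(?P<level>[A-Za-z]+)\s+"
    r"(?P<message>.*)$"
)


def simulate_pipeline(source: dict) -> dict:
    """Approximate the set and grok processors on one document source."""
    doc = dict(source)  # leave the caller's dictionary untouched

    # "set" processor: add a constant field
    doc["pipeline-source"] = "terraform"

    # "grok" processor: split the message into timestamp, level and the rest
    match = LOG_LINE.match(doc["message"])
    if match is None:
        raise ValueError("message does not match the expected log format")
    doc["@timestamp"] = match.group("timestamp")
    doc["level"] = match.group("level")
    doc["message"] = match.group("message")
    return doc
```

Passing the sample message from the simulate call through this function yields the same four fields as the response above, which is a quick way to reason about a pattern before deploying it.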
There is one final step, which is not yet as easy as it should be, but the following little trick does the job, at least for most of my PoCs.
Adding a dashboard
The elasticstack Terraform provider currently only covers the Elasticsearch parts of the setup, but the stack has other parts as well, such as Kibana.
One of my favorite ways of demoing the Elastic Stack is to provide the whole example in a GitHub repository and make it as easy as possible to get up and running.
Part of this is installing a dashboard that shows the data that was part of the demo. So how can this be done without any Kibana helpers in the Terraform elasticstack provider? By using a curl command as part of a null_resource. This adds some platform dependency, as well as the requirement that curl is available wherever Terraform is executed.
terraform {
  required_version = ">= 1.0.0"
  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.4.0"
    }
  }
}
provider "ec" {
}
data "local_file" "dashboard" {
  filename = "${path.module}/dashboard.ndjson"
}
resource "ec_deployment" "custom-deployment-id" {
  name                   = "custom-deployment-id"
  region                 = "gcp-europe-west3"
  version                = "8.1.2"
  deployment_template_id = "gcp-memory-optimized-v2"
  elasticsearch {}
  kibana {}
}
resource "null_resource" "store_local_dashboard" {
  provisioner "local-exec" {
    command = "curl -X POST -u ${ec_deployment.custom-deployment-id.elasticsearch_username}:${ec_deployment.custom-deployment-id.elasticsearch_password} ${ec_deployment.custom-deployment-id.kibana[0].https_endpoint}/api/saved_objects/_import?overwrite=true -H \"kbn-xsrf: true\" --form file=@dashboard.ndjson"
  }
  depends_on = [ec_deployment.custom-deployment-id]
  triggers = {
    dashboard_sha1 = "${sha1(file("dashboard.ndjson"))}"
  }
}
You may also notice the triggers part in the null_resource: it takes changes to the dashboard file into account and runs the curl command again if the SHA-1 checksum of the file changes.
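The trigger logic itself is simple enough to sketch outside of Terraform. The Python snippet below illustrates the idea: hash the file content and re-run the import only when the digest differs from the one recorded last time. The names dashboard_sha1 and needs_reimport are hypothetical, and the digest should only be expected to match Terraform's sha1(file(...)) for files that are valid UTF-8 text.

```python
import hashlib
from typing import Optional


def dashboard_sha1(content: bytes) -> str:
    """Return the hex-encoded SHA-1 digest of the dashboard file content."""
    return hashlib.sha1(content).hexdigest()


def needs_reimport(previous_digest: Optional[str], content: bytes) -> bool:
    """Decide whether the import has to run again.

    previous_digest is the digest stored after the last import, or None if
    the dashboard has never been imported.
    """
    return previous_digest != dashboard_sha1(content)
```

Terraform stores the previous value of the triggers map in its state, which plays the role of previous_digest here.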
Summary
I hope you enjoyed the ride across the ec terraform provider and the elasticstack terraform provider. They are both in development and you can follow the corresponding GitHub repositories (ec, elasticstack).
Also, if you encounter any issues, please create an issue in the corresponding GitHub repository and provide feedback. Thank you and happy terraforming!
Final remarks
You can follow or ping me on Twitter or GitHub, or reach me via email (just to tell me you read this whole thing :-).
If there is anything to correct, drop me a note, and I am happy to do so and append to this post!
The same applies to questions. If you have a question, go ahead and ask!
If you want me to speak about this, drop me an email!