February 9, 2023

Deploying HarperDB on DigitalOcean & Linode with Terraform

UPDATE: HarperDB is now available on the DigitalOcean Marketplace.
Since the initial publication of this article, Linode has been acquired and rebranded as Akamai Connected Cloud.

Diving into HarperDB, a distributed database platform designed for the edge, on DigitalOcean and Linode.

In a previous article, I deployed HarperDB on AWS and GCP via HarperDB Studio and Terraform. Given HarperDB’s excellent Docker support, deploying it on any Linux VM was trivial. To test it out on other cloud providers, I decided to deploy HarperDB on DigitalOcean and Linode as well.

You can follow along or clone each GitHub repo:

DigitalOcean

To get started, we first need to create a DigitalOcean account and generate a personal access token. After signing up for a free account, navigate to the API section and click Generate New Token.

Copy the token and set the variable in your terminal:

$ export DIGITALOCEAN_ACCESS_TOKEN=

Now we need to set the Terraform provider for Digital Ocean:

terraform {
  required_version = ">= 1.0.0"

  required_providers {
    digitalocean = {
      source  = "digitalocean/digitalocean"
      version = "~> 2.0"
    }
  }
}

provider "digitalocean" {}

Terraform will automatically pick up DIGITALOCEAN_ACCESS_TOKEN and pass it to the provider.

Next, we need to define a VM and firewall rules to allow SSH and access to ports 9925 and 9926 (HarperDB’s operations API and custom functions ports).

resource "digitalocean_droplet" "harperdb" {
  image     = "ubuntu-18-04-x64"
  name      = "harperdb"
  region    = "nyc1"
  size      = "s-1vcpu-1gb"
  user_data = file("harperdb.yaml")
}

resource "digitalocean_firewall" "web" {
  name = "harperdb"

  droplet_ids = [digitalocean_droplet.harperdb.id]

  inbound_rule {
    protocol         = "tcp"
    port_range       = "22"
    source_addresses = ["0.0.0.0/0", "::/0"]
  }

  inbound_rule {
    protocol         = "tcp"
    port_range       = "9925-9926"
    source_addresses = ["0.0.0.0/0", "::/0"]
  }

  inbound_rule {
    protocol         = "icmp"
    source_addresses = ["0.0.0.0/0", "::/0"]
  }

  outbound_rule {
    protocol              = "tcp"
    port_range            = "1-65535"
    destination_addresses = ["0.0.0.0/0", "::/0"]
  }

  outbound_rule {
    protocol              = "icmp"
    destination_addresses = ["0.0.0.0/0", "::/0"]
  }
}

We are creating an Ubuntu 18.04 VM in the NYC1 region, with the user_data field populated from the harperdb.yaml file. This file bootstraps users, authorized keys, and Docker, and then starts the HarperDB container.

#cloud-config
groups:
  - ubuntu: [root,sys]
  - hashicorp

# Add users to the system. Users are added after groups are added.
users:
  - default
  - name: terraform
    gecos: terraform
    shell: /bin/bash
    primary_group: hashicorp
    sudo: ALL=(ALL) NOPASSWD:ALL
    groups: users, admin
    lock_passwd: false
    ssh_authorized_keys:
    # Add the ssh-rsa key
    #  - ssh-rsa ... email@example.com

runcmd:
  - sudo apt-get update
  - sudo apt-get install -yq \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
  - sudo mkdir -p /etc/apt/keyrings
  - curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg --yes
  - echo \
    "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
  - sudo apt-get update
  - sudo apt-get install -yq docker-ce docker-ce-cli containerd.io docker-compose-plugin
  - sudo mkdir harperdb
  - sudo chmod 755 harperdb
  - sudo docker run -d \
    -v $(pwd)/harperdb:/home/harperdb/hdb \
    -e HDB_ADMIN_USERNAME=HDB_ADMIN \
    -e HDB_ADMIN_PASSWORD=password \
    -p 9925:9925 \
    -p 9926:9926 \
    harperdb/harperdb
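
The article later reads the droplet’s IP address from the console output produced by an outputs.tf file that is not shown above. A minimal sketch of it might look like the following (the ipv4_address attribute comes from the DigitalOcean provider):

```hcl
# outputs.tf — print the droplet's public IPv4 address after `terraform apply`
output "ip_address" {
  description = "Public IP of the HarperDB droplet"
  value       = digitalocean_droplet.harperdb.ipv4_address
}
```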

Before running Terraform, we need to create an SSH key to access the VM; adding a key after the fact would cause Terraform to recreate the instance.

Create an SSH key pair by running the following:

ssh-keygen -t rsa -C "your_email@example.com" -f ./tf-digitalocean

Then grab the public key (the one starting with ssh-rsa) and paste it under ssh_authorized_keys.

Now we’re ready to run the plan and apply!

$ terraform plan -out plan.out
$ terraform apply "plan.out"

The IP address of the DigitalOcean VM will be printed to the console via the outputs.tf file. Copy that IP address, and we can interact with the database as usual:

curl --location --request POST <DROPLET_IP>:9925 \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Basic SERCX0FETUlOOnBhc3N3b3Jk' \
  --data-raw '{
    "operation": "create_schema",
    "schema": "dev"
}'
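
The Authorization header above is plain HTTP Basic auth built from the HDB_ADMIN:password credentials set in the docker run command. You can reproduce the token yourself with the base64 utility:

```shell
# Base64-encode "username:password" to get the Basic auth token.
# -n suppresses echo's trailing newline so it isn't encoded too.
echo -n 'HDB_ADMIN:password' | base64
# SERCX0FETUlOOnBhc3N3b3Jk
```

If you changed HDB_ADMIN_PASSWORD when starting the container, swap in your own credentials here.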

Linode

Deploying HarperDB onto Linode is also simple. Create a free Linode account and click on API Tokens under your profile. Click on “Create a Personal Access Token” to generate a token for Terraform to use.

To pass this token to the Linode Terraform provider, we will use a Terraform variables file. Create terraform.tfvars and paste in the following:

token = ""
root_pass = "bogusPassword$"

The root_pass field will be used to bootstrap the VM on Linode.
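
For these values to resolve as var.token and var.root_pass, the variables also need to be declared, for example in a variables.tf along these lines (a sketch; the descriptions are mine):

```hcl
# variables.tf — declarations matching terraform.tfvars
variable "token" {
  description = "Linode personal access token"
  type        = string
  sensitive   = true
}

variable "root_pass" {
  description = "Root password for the Linode instance"
  type        = string
  sensitive   = true
}
```

Marking both as sensitive keeps Terraform from echoing them in plan output.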

Then in main.tf we will configure the Linode provider:

terraform {
  required_providers {
    linode = {
      source = "linode/linode"
      version = "1.27.1"
    }
  }
}

provider "linode" {
  token = var.token
}

Finally, we will create an Ubuntu 18.04 VM in the us-east region and open up SSH and ports 9925-9926:

resource "linode_instance" "harperdb" {
  image = "linode/ubuntu18.04"
  region = "us-east"
  type = "g6-standard-1"
  root_pass = var.root_pass
}

resource "linode_firewall" "harperdb_firewall" {
  label = "harperdb"

  inbound {
    label    = "ssh"
    action   = "ACCEPT"
    protocol = "TCP"
    ports    = "22"
    ipv4     = ["0.0.0.0/0"]
    ipv6     = ["::/0"]
  }

  inbound {
    label    = "harperdb"
    action   = "ACCEPT"
    protocol = "TCP"
    ports    = "9925-9926"
    ipv4     = ["0.0.0.0/0"]
    ipv6     = ["::/0"]
  }

  inbound_policy = "DROP"

  outbound_policy = "ACCEPT"

  linodes = [linode_instance.harperdb.id]
}
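
As with DigitalOcean, an output can expose the instance’s IP so it doesn’t have to be copied from the Linode console. A minimal sketch (ip_address is the Linode provider’s attribute for the primary IPv4 address):

```hcl
# outputs.tf — print the Linode's public IPv4 address after `terraform apply`
output "linode_ip" {
  description = "Public IP of the HarperDB Linode"
  value       = linode_instance.harperdb.ip_address
}
```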

Run the plan and create the infrastructure:

$ terraform plan -out plan.out
$ terraform apply "plan.out"

Once the VM is created, navigate to the Linode console and click Launch LISH Console. Then we can log in as root with the password defined in the variables file.

Now we just need to install Docker and create a directory for persistence.

First, follow the official instructions to install Docker: https://docs.docker.com/engine/install/ubuntu/

Then create the harperdb directory and run the container:

$ mkdir harperdb

$ chmod 777 harperdb

$ sudo docker run -d \
  -v $(pwd)/harperdb:/home/harperdb/hdb \
  -e HDB_ADMIN_USERNAME=HDB_ADMIN \
  -e HDB_ADMIN_PASSWORD=password \
  -p 9925:9925 \
  -p 9926:9926 \
  harperdb/harperdb

Now we can use curl to create a schema as before (replace the IP address with your Linode instance’s IP):

$ curl --location --request POST 34.150.228.207:9925 \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Basic SERCX0FETUlOOnBhc3N3b3Jk' \
--data-raw '{
    "operation": "create_schema",
    "schema": "dev"
}'

Final Thoughts

Now that HarperDB is running on DigitalOcean and Linode, we can link them via HarperDB Studio as we did before with AWS and GCP. To allow HTTP connections, enable mixed content in your browser.

Then click Register User-Installed Instance and fill out the information:

After registering, you should see the schema we created via curl. Finally, to enable clustering, navigate to the cluster view and create the same cluster user on every instance. Once the pub/sub topology is defined, changes made to one node will be synced to all subscribers in the cluster.

HarperDB Studio simplifies the management of multiple HarperDB instances running on one or more clouds. By connecting the databases we deployed to DigitalOcean and Linode via HarperDB Studio, we can set up data synchronization strategies to unlock interesting use cases. Imagine running some HarperDB instances on the edge and syncing up data on the cloud as needed.

Finally, once you are done testing, make sure to delete all created infrastructure via terraform destroy to avoid further charges.

Note: You can find additional prebuilt templates for multi-cloud deployment, authentication, API caching, and more in the HarperDB Development Repo.