Notes on a Kubernetes-based MQTT workflow on Raspberry Pi

As you may have guessed from my previous posts, I am currently busy putting together a Kubernetes-based, cloud-native software stack for working with Raspberry Pis and other IoT devices. This post is a quick overview of a use case where we want to track sensor data from a device and visualize it on a dashboard, but do so in a cloud-native way using Kubernetes primitives.

In this case we have a device topology with a central hub running a Kubernetes cluster (running on a Raspberry Pi here, but it can be any k8s cluster) and several IoT devices scattered over the network (RPi Zeros in the schematic below), communicating with the central hub via MQTT messaging. MQTT is a standard for IoT messaging and is well suited for this scenario.

Example schematic of the device topology

So the general idea is that the user communicates with the entire system through the Kubernetes interface and the devices do so using MQTT messaging.

Given this topology, the objective is to achieve a cloud-native workflow where we “configure” the topology using Kubernetes primitives and let the devices publish gauge data over MQTT. There are several things working together, but a high-level breakdown of the components is as follows:

  • MQTT server from EMQX as a message broker
  • DAPR sidecars for consuming MQTT messages in application pods
  • Prometheus for metrics gathering
  • Grafana for data visualization
  • And finally, we write a Kubernetes CRD (a custom resource definition) to express this workflow (the CRD deserves a separate post, so I may cover it in the future)

Once we put all this together and trigger MQTT publishing from the device, the data can be visualized on a Grafana dashboard. Let’s break this down and see how this works.

Motivation

The idea of using Kubernetes for control-plane activities is very appealing to me for a variety of reasons. A few are listed below:

  • Robustness and fault tolerance from core k8s behavior, thanks to its reconciliation loops and state-machine facilities.
  • API extension using CRDs, which means authentication and authorization can be delegated to k8s RBAC.
  • Rich ecosystem of tools that can be easily managed via helm.

This allows us to install EMQX, DAPR, Prometheus, Grafana, and our custom CRD operator (let’s call it the device operator, since it manages IoT devices) using helm.

Individual instances of devices and their gauges can be configured as follows:

apiVersion: edge.deoras-labs.io/v1beta1
kind: Device
metadata:
  name: pi-zero-w
spec:
  gauges:
  - name: temperature
    help: "temperature sensor"
  - name: humidity
    help: "humidity sensor"

Once we create an instance of this device in a namespace, it configures a corresponding Prometheus metric in the background and also configures the pipeline for capturing data from the device on an MQTT topic bound to this metric.

Kubernetes allows hiding all this complexity behind a simple CRD
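
To give a flavor of what gets automated here, below is a minimal sketch of dynamic gauge registration using Python’s prometheus_client library. This is purely illustrative of the idea, not the operator’s actual code; register_device and the hardcoded port are my own placeholders.

# Illustrative sketch of "configuring a Prometheus metric in the
# background" using prometheus_client; the real operator may differ.
from prometheus_client import Gauge, start_http_server

gauges = {}  # registry of gauges keyed by (device, gauge name)

def register_device(device, namespace, gauge_specs):
    """Create one Prometheus gauge per entry in a Device's spec.gauges."""
    for spec in gauge_specs:
        key = (device, spec["name"])
        if key not in gauges:
            # Prometheus metric names cannot contain hyphens, so the
            # k8s-style names are sanitized with underscores here.
            gauges[key] = Gauge(
                spec["name"],
                spec.get("help", ""),
                namespace=namespace.replace("-", "_"),
                subsystem=device.replace("-", "_"),
            )

# Expose /metrics for Prometheus to scrape, then register the example
# device from the manifest above.
start_http_server(8000)
register_device("pi-zero-w", "iot-system", [
    {"name": "temperature", "help": "temperature sensor"},
    {"name": "humidity", "help": "humidity sensor"},
])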

Besides all these factors, we get out-of-the-box integration with a lot of open-source tools that speak the k8s language, such as those for management, alerting, logging, and automation.

Publishing

The contract for the device to interact with the system should be simple. The device should only be required to know a few coordinates in order to publish data:

  • MQTT host and port (and auth credentials if any required for connecting)
  • MQTT topic name
  • Device name and namespace
  • Gauge metric name
  • A JWT admission token (just so we can rotate/revoke such tokens from the central console for a publishing device… but I won’t cover this for now)

With these coordinates, the device publishes an event payload that looks like this:
{
  "specversion": "1.0",
  "source": "pi-zero-w",
  "topic": "sensor-gauges",
  "traceid": "5298e1ab-d707-45b7-8ed1-3a39aa9aa3f3",
  "data": {
    "metric": {
      "metricType": 1,
      "gauge": {
        "name": "temperature",
        "value": 42,
        "namespace": "iot-system",
        "subsystem": "pi-zero-w"
      }
    }
  },
  "id": "bd9ace6d-40ef-4700-a7e5-15bd59d9b2f2",
  "datacontenttype": "application/json",
  "type": "com.dapr.event.sent",
  "pubsubname": "mqtt-pubsub"
}

The device can populate the sensor value (currently 42 in the example above) and publish this on MQTT. The payload structure is compliant with the format standardized by the CloudEvents spec.
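
For illustration, a device-side publisher could look something like the sketch below, using the paho-mqtt Python client. The broker address and the sensor value are placeholders for the coordinates and readings of a real device.

# Sketch of a device-side publisher using the paho-mqtt client; host,
# port, and the gauge value are placeholders.
import json
import uuid
import paho.mqtt.client as mqtt

payload = {
    "specversion": "1.0",
    "source": "pi-zero-w",
    "topic": "sensor-gauges",
    "traceid": str(uuid.uuid4()),
    "id": str(uuid.uuid4()),
    "datacontenttype": "application/json",
    "type": "com.dapr.event.sent",
    "pubsubname": "mqtt-pubsub",
    "data": {
        "metric": {
            "metricType": 1,
            "gauge": {
                "name": "temperature",
                "value": 42,  # replace with an actual sensor reading
                "namespace": "iot-system",
                "subsystem": "pi-zero-w",
            },
        },
    },
}

client = mqtt.Client()
client.connect("emqx.example.local", 1883)  # MQTT host and port
client.publish("sensor-gauges", json.dumps(payload), qos=2)
client.disconnect()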

Behind the scenes

Let’s briefly go over what happens behind the scenes. Once the IoT device publishes a message to the MQTT broker, it gets intercepted by the DAPR sidecar, which is listening for messages on a topic. It is DAPR where compliance of the message payload with the CloudEvents standard is enforced, and once that check passes, the message gets forwarded to the app that DAPR is the sidecar for.

Without going into much detail, the core idea here is that the application is made aware of a new MQTT message and can then take appropriate action. In this case, the message payload is parsed and the data is forwarded to a Prometheus metric.
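
As a rough sketch, assuming a Python application built with Flask and prometheus_client (the actual application code is not shown in this post), the receiving end could look like this. DAPR POSTs each message to the route configured in the Subscription resource shown further below.

# Sketch of the application endpoint behind the DAPR sidecar; DAPR
# POSTs each CloudEvents-wrapped MQTT message to the subscribed route.
from flask import Flask, jsonify, request
from prometheus_client import Gauge

app = Flask(__name__)

# A single pre-registered gauge for illustration; the real system
# registers gauges dynamically from the Device resource.
temperature = Gauge("temperature", "temperature sensor",
                    namespace="iot_system", subsystem="pi_zero_w")

@app.route("/", methods=["POST"])
def on_message():
    event = request.get_json()
    gauge = event["data"]["metric"]["gauge"]
    if gauge["name"] == "temperature":
        temperature.set(gauge["value"])
    # Acknowledge the message so the sidecar does not retry it.
    return jsonify({"status": "SUCCESS"})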

Prometheus is pre-configured to scrape metrics from such application pods in both a static and a dynamic way: the static configuration binds Prometheus to a so-called target for scraping metrics, and the dynamic configuration registers new metrics. Since things can get fairly complex with so many moving parts, most of this is automated using CRDs. For instance, DAPR is configured to bind to MQTT as follows:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: mqtt-pubsub
  namespace: device-operator-system
spec:
  metadata:
  - name: url
    value: tcp://emqx-headless.mqtt-system.svc.cluster.local:1883
  - name: qos
    value: 2
  - name: retain
    value: "false"
  - name: cleanSession
    value: "false"
  type: pubsub.mqtt
  version: v1

The application gets bound to an MQTT topic using another CRD, as follows:

apiVersion: dapr.io/v1alpha1
kind: Subscription
metadata:
  name: inbox
  namespace: device-operator-system
scopes:
- device-operator
spec:
  pubsubname: mqtt-pubsub
  route: /
  topic: sensor-gauges

Furthermore, Prometheus can be configured to scrape metrics as follows:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
    control-plane: controller-manager
    release: prometheus
  name: device-operator-controller-manager-metrics-monitor
  namespace: device-operator-system
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    path: /metrics
    port: https
    scheme: https
    tlsConfig:
      insecureSkipVerify: true
  selector:
    matchLabels:
      control-plane: controller-manager

As you can see, most of this configuration is done via custom k8s resources, or CRDs, which makes it easier to put these pieces together and present a simple interface to the user.

Summary

Kubernetes is a rich ecosystem, and the ability to extend its API opens the door to wonderful possibilities for edge computing. This was a high-level post that skipped a lot of the details, but I hope it gave a good overview of how such a system using MQTT messaging can be put together with a Kubernetes-based workflow… more details later, stay tuned!
