Building Go binaries with TensorFlow: On packaging a Keras model for serving


Let’s walk through a quick setup that defines a workflow for serving a Keras model packaged in a Go binary with TensorFlow as the backend engine. Keras is a high-level Python API that makes it very easy to construct a deep learning model, and it can seamlessly operate with TensorFlow as one of its supported ML engines. Typically, all such high-level work is done in Python; however, it is quite useful to be able to serve an ML model as a compiled binary without requiring the end user to install any dependencies on their system. So we will break the workflow into two steps:

  • Step 1: Training and exporting the model in Python using Keras and TensorFlow
  • Step 2: Serving the exported model in Go

Input Data

In order to demonstrate the workflow, we can construct random data and train a model against it. A simple approach is to construct random input data x and random coefficients c, and compute the output y as a linear combination over the n input variables:

y_k = c_1*x_k1 + c_2*x_k2 + ... + c_n*x_kn

This means that x is a matrix with n columns, c is a vector with n values, and y is a vector with the same number of rows as the input x. For any arbitrary input x, if we know the coefficients c we can predict the output y exactly; the machine learning model, however, will approximate this relationship with a sequential deep learning model with a few layers.

Preparing a CSV with input training data is trivial. Doing so in Julia involves creating a random vector c of coefficients and a random input matrix x with 1024 rows and 10 columns:

julia> c = rand(Int8, 10);
julia> x = randn(1024, 10);

We can then produce the output vector y and save it to a CSV file:

julia> using CSV, DataFrames;
julia> y = x*c;
julia> df = DataFrame([x y], :auto);
julia> CSV.write("training_data.csv", df);

At this point a sequential deep learning model with a few dense layers can be built and trained on this data using the Keras API, and then exported in TensorFlow’s SavedModel format, which produces a saved_model.pb protobuf file.

As will become clear later, it is good to annotate layers with names. This can be done as follows:

model.add(Dense(1, name='out'))

Once we have the model exported as a protobuf file, we can inspect the graph nodes to identify the input and output node labels. These labels will allow us to inject new data into the model from Go code and fetch the resulting predictions back out.

The code below enumerates the node labels:

package main

import (
    _ "embed"
    "fmt"
    "log"

    tf "github.com/galeone/tensorflow/tensorflow/go"
)

//go:embed saved_model.pb
var savedModel []byte

func main() {
    // import the graph
    g := tf.NewGraph()
    if err := g.Import(savedModel, ""); err != nil {
        log.Fatal(err)
    }
    // print available operations in the graph
    for _, operation := range g.Operations() {
        fmt.Println(operation.Name())
    }
}

For the model we just built, the labels are:

x
sequential/dense/MatMul/ReadVariableOp/resource
sequential/dense/MatMul/ReadVariableOp
sequential/dense/MatMul
sequential/dense/BiasAdd/ReadVariableOp/resource
sequential/dense/BiasAdd/ReadVariableOp
sequential/dense/BiasAdd
sequential/dense/Relu
sequential/dense_1/MatMul/ReadVariableOp/resource
sequential/dense_1/MatMul/ReadVariableOp
sequential/dense_1/MatMul
sequential/dense_1/BiasAdd/ReadVariableOp/resource
sequential/dense_1/BiasAdd/ReadVariableOp
sequential/dense_1/BiasAdd
sequential/dense_1/Relu
sequential/dense_2/MatMul/ReadVariableOp/resource
sequential/dense_2/MatMul/ReadVariableOp
sequential/dense_2/MatMul
sequential/dense_2/BiasAdd/ReadVariableOp/resource
sequential/dense_2/BiasAdd/ReadVariableOp
sequential/dense_2/BiasAdd
sequential/dense_2/Relu
sequential/out/MatMul/ReadVariableOp/resource
sequential/out/MatMul/ReadVariableOp
sequential/out/MatMul
sequential/out/BiasAdd/ReadVariableOp/resource
sequential/out/BiasAdd/ReadVariableOp
sequential/out/BiasAdd
Identity

These labels are strings that need to be specified via the TensorFlow Go API. The node x is the input to the model. Even though there are many nodes, the output layer is easy to spot because we annotated it with the name out earlier. The output data can be fetched at the node sequential/out/MatMul.

Using the model from Go code is now possible by reading a new set of input data and “feeding” it to the session. The variable data below has the same data type and dimensions as the training input, i.e., it has 10 columns, an arbitrary number of rows, and is of type float32.
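For illustration, data could simply be filled with random values of the right shape. This is a minimal hypothetical sketch (for the comparison at the end of this post, the same x that was generated in Julia would be loaded instead):

    // hypothetical input: 5 rows of 10 float32 columns filled with
    // random values, matching the shape the model was trained on
    data := make([][]float32, 5)
    for i := range data {
        data[i] = make([]float32, 10)
        for j := range data[i] {
            data[i][j] = rand.Float32() // import "math/rand"
        }
    }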

// prepare input data
x, err := tf.NewTensor(data)
if err != nil {
    log.Fatal(err)
}

// prepare data feed specifying names of the operation
feeds := map[tf.Output]*tf.Tensor{
    g.Operation("x").Output(0): x,
}

Similarly, “fetches” can be prepared to pull graph computation output from TensorFlow back into the Go runtime. As you can see, the operation label defines where we pull the data from:

// prepare data outputs from tensorflow run
fetches := []tf.Output{
    g.Operation("sequential/out/MatMul").Output(0),
}

Finally, we create a session and run the computation:

// create a session backed by the imported graph
sess, err := tf.NewSession(g, nil)
if err != nil {
    log.Fatal(err)
}

// run session feeding feeds and fetching fetches
out, err := sess.Run(feeds, fetches, nil)
if err != nil {
    log.Fatal(err)
}
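
The returned slice out contains one tensor per fetch, and its value can be converted back into native Go types. A minimal sketch, assuming the predictions are then written to rand_out.csv next to the expected values x*c for the comparison below:

    // out[0] corresponds to the single fetch above; for k input rows
    // its value is a [][]float32 of shape (k, 1)
    predictions := out[0].Value().([][]float32)
    for _, p := range predictions {
        fmt.Println(p[0]) // one predicted value per input row
    }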

The output of the model can then be compared against the values computed directly from the coefficients and verified on a scatter plot:

julia> out = CSV.read("rand_out.csv", DataFrame)
1024×2 DataFrame
  Row │ expected    out
      │ Float64     Float64
──────┼───────────────────────
    1 │  -42.0058    -44.6357
    2 │ -313.967    -317.834
    3 │   99.6379     97.7438
    4 │   44.3145     44.4152
    5 │   62.5908     57.3141
Scatter plot of expected vs. computed values

Capturing the TensorFlow computation graph and serving it from Go enables a very powerful software delivery paradigm, and paves the way for smoother cloud-native deliveries with smaller container images.
