Building Go binaries with TensorFlow on arm64: Summary of steps

TensorFlow is a very powerful computing platform, particularly for working with matrices, linear algebra and machine learning. Such computations can be verbose and difficult to express in low-level languages such as C and Go, so we need a way to design them in a high-level language. TensorFlow, with its Python interface, lets us express complex computations and at the same time provides a path to seamless integration with low-level languages for production use cases.
The real beauty of TensorFlow is the ability to export a computation graph that is language agnostic. This makes it possible to express complex computation sequences in Python and then export those steps to be integrated in production via other languages such as Go or Java.
This post is a quick summary of the steps I needed to take to build TensorFlow for the arm64 architecture and then integrate a computation graph using the Go programming language. The final goal is a binary that can be easily packaged in a container. Let’s first define a sample computation problem:
0.6047 0.9405 0.6646 0.4377 0.4246
0.6868 0.0656 0.1565 0.0970 0.3009
0.5152 0.8136 0.2143 0.3807 0.3181
0.4689 0.2830 0.2931 0.6791 0.2186
0.2032 0.3609 0.5707 0.8625 0.2931
Let’s say we have a matrix of data as shown above and we need to compute its inverse. It is trivial to perform this computation in Python using TensorFlow:
import tensorflow as tf

x = [
    [0.6047, 0.9405, 0.6646, 0.4377, 0.4246],
    [0.6868, 0.0656, 0.1565, 0.0970, 0.3009],
    [0.5152, 0.8136, 0.2143, 0.3807, 0.3181],
    [0.4689, 0.2830, 0.2931, 0.6791, 0.2186],
    [0.2032, 0.3609, 0.5707, 0.8625, 0.2931],
]

y = tf.linalg.inv(x)
print(y)
which outputs the inverse of x as follows:
tf.Tensor(
[[ 1.3268485 -0.13446523 -1.5245299 3.4717727 -2.7188532 ]
[ 0.56479335 -1.2117027 0.9216604 0.4177374 -0.88607126]
[ 2.958812 -0.5985456 -3.399865 1.1653783 -0.8511223 ]
[-1.0250404 -0.5553139 0.57051307 1.3094081 0.45926073]
[-4.360103 4.384766 4.863161 -9.043573 6.693542 ]], shape=(5, 5), dtype=float32)
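For contrast, here is roughly what the same inversion looks like when written by hand in Go, without TensorFlow. This is only a sketch: it uses Gauss-Jordan elimination with partial pivoting, chosen here for illustration and not necessarily what TensorFlow does internally, and the names sample and invert are made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// sample is the 5x5 matrix from the Python example above.
var sample = [][]float64{
	{0.6047, 0.9405, 0.6646, 0.4377, 0.4246},
	{0.6868, 0.0656, 0.1565, 0.0970, 0.3009},
	{0.5152, 0.8136, 0.2143, 0.3807, 0.3181},
	{0.4689, 0.2830, 0.2931, 0.6791, 0.2186},
	{0.2032, 0.3609, 0.5707, 0.8625, 0.2931},
}

// invert returns the inverse of a square matrix using Gauss-Jordan
// elimination with partial pivoting. The input is not modified.
func invert(a [][]float64) ([][]float64, error) {
	n := len(a)
	// Build the augmented matrix [A | I].
	aug := make([][]float64, n)
	for i := range aug {
		if len(a[i]) != n {
			return nil, errors.New("matrix is not square")
		}
		aug[i] = make([]float64, 2*n)
		copy(aug[i], a[i])
		aug[i][n+i] = 1
	}
	for col := 0; col < n; col++ {
		// Pick the row with the largest absolute value in this column.
		pivot := col
		for r := col + 1; r < n; r++ {
			if math.Abs(aug[r][col]) > math.Abs(aug[pivot][col]) {
				pivot = r
			}
		}
		if math.Abs(aug[pivot][col]) < 1e-12 {
			return nil, errors.New("matrix is singular or nearly singular")
		}
		aug[col], aug[pivot] = aug[pivot], aug[col]
		// Scale the pivot row so the pivot becomes 1.
		p := aug[col][col]
		for j := range aug[col] {
			aug[col][j] /= p
		}
		// Eliminate this column from every other row.
		for r := 0; r < n; r++ {
			if r == col {
				continue
			}
			f := aug[r][col]
			for j := range aug[r] {
				aug[r][j] -= f * aug[col][j]
			}
		}
	}
	// The right half of the augmented matrix now holds the inverse.
	inv := make([][]float64, n)
	for i := range inv {
		inv[i] = aug[i][n:]
	}
	return inv, nil
}

func main() {
	inv, err := invert(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, row := range inv {
		for _, v := range row {
			fmt.Printf("%8.4f ", v)
		}
		fmt.Println()
	}
}
```

One line of Python turns into several dozen lines of Go, and that is for a single, well-understood operation. Longer pipelines get unwieldy fast, which is the motivation for exporting a graph instead.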
Now that the hello-world works in Python, the problem is to do the same in Go using the TensorFlow dynamic library and its C API. Let’s fast forward to the point where we are able to build such a binary. Inspecting the binary (say main) shows its dependency on various system libraries and on the TensorFlow dynamic library at /usr/local/lib/libtensorflow.so.2.
$ ldd main
linux-vdso.so.1 (0x00007ffe9d769000)
libtensorflow.so.2 => /usr/local/lib/libtensorflow.so.2 (0x00007f0e1e6ea000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0e1e6b9000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0e1e4ea000)
libtensorflow_framework.so.2 => /usr/local/lib/libtensorflow_framework.so.2 (0x00007f0e1c6a9000)
libm.so.6 => /lib64/libm.so.6 (0x00007f0e1c565000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f0e1c55e000)
librt.so.1 => /lib64/librt.so.1 (0x00007f0e1c551000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f0e1c332000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f0e1c317000)
/lib64/ld-linux-x86-64.so.2 (0x00007f0e2cbae000)
Executing the binary prints the original data followed by its inverse:
$ ./main
input data:
0.6047 0.9405 0.6646 0.4377 0.4246
0.6868 0.0656 0.1565 0.0970 0.3009
0.5152 0.8136 0.2143 0.3807 0.3181
0.4689 0.2830 0.2931 0.6791 0.2186
0.2032 0.3609 0.5707 0.8625 0.2931

inverse:
1.3263 -0.1338 -1.5241 3.4710 -2.7183
0.5647 -1.2114 0.9217 0.4173 -0.8857
2.9600 -0.5993 -3.4010 1.1674 -0.8529
-1.0257 -0.5548 0.5711 1.3083 0.4602
-4.3596 4.3834 4.8627 -9.0423 6.6930
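As a quick sanity check, multiplying the input by the printed inverse should give something close to the identity matrix. The sketch below hard-codes the rounded values from the output above; identityError is a helper name made up for this example. Because the printed values are rounded to four decimals, the product is only approximately the identity.

```go
package main

import (
	"fmt"
	"math"
)

// input and inverse are copied from the program output above (4 decimals).
var input = [5][5]float64{
	{0.6047, 0.9405, 0.6646, 0.4377, 0.4246},
	{0.6868, 0.0656, 0.1565, 0.0970, 0.3009},
	{0.5152, 0.8136, 0.2143, 0.3807, 0.3181},
	{0.4689, 0.2830, 0.2931, 0.6791, 0.2186},
	{0.2032, 0.3609, 0.5707, 0.8625, 0.2931},
}

var inverse = [5][5]float64{
	{1.3263, -0.1338, -1.5241, 3.4710, -2.7183},
	{0.5647, -1.2114, 0.9217, 0.4173, -0.8857},
	{2.9600, -0.5993, -3.4010, 1.1674, -0.8529},
	{-1.0257, -0.5548, 0.5711, 1.3083, 0.4602},
	{-4.3596, 4.3834, 4.8627, -9.0423, 6.6930},
}

// identityError returns the largest absolute deviation of a*b from the
// identity matrix.
func identityError(a, b [5][5]float64) float64 {
	var worst float64
	for i := 0; i < 5; i++ {
		for j := 0; j < 5; j++ {
			var sum float64
			for k := 0; k < 5; k++ {
				sum += a[i][k] * b[k][j]
			}
			want := 0.0
			if i == j {
				want = 1.0
			}
			worst = math.Max(worst, math.Abs(sum-want))
		}
	}
	return worst
}

func main() {
	// The printed values are rounded, so expect a small nonzero error.
	fmt.Printf("max |A*inv(A) - I| = %.2e\n", identityError(input, inverse))
}
```

A small error is expected here; a large one would suggest the data was fed or reshaped in the wrong order.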
The same program also built and ran on arm64, which lets us use TensorFlow on a Raspberry Pi running a 64-bit OS. To build this binary, we need three things:
- TensorFlow dynamic library and its C API
- A Go library to interface with TensorFlow
- A Go wrapper code for data input and output (shown below)
Building TensorFlow library for arm64
The TensorFlow C library is available as a prebuilt tarball for various operating systems; however, arm64 is currently not supported. It is fairly easy to build the library from source.
Start with a large enough compute instance in the cloud (or do this on your laptop if it is powerful enough). Building the library takes some time and compiles thousands of targets, so I found it easier to set up a cloud virtual machine for the build. The configuration of the virtual machine looked as follows:
- 16 vCPU, 64GB memory
- Ubuntu 21.04 OS with 100GB disk
- Bazel v3.7.2
- TensorFlow v2.5.0
- Python 3.8.8
bazel is the build system for TensorFlow, and each TensorFlow version depends strictly on a particular bazel version.
Install build tools:
sudo apt-get update && \
sudo apt-get install -y zip build-essential
Download bazel:
wget https://github.com/bazelbuild/bazel/releases/download/3.7.2/bazel-3.7.2-installer-linux-x86_64.sh
chmod 755 bazel-3.7.2-installer-linux-x86_64.sh
sudo ./bazel-3.7.2-installer-linux-x86_64.sh
Download Python:
wget https://repo.anaconda.com/archive/Anaconda3-2021.05-Linux-x86_64.sh
chmod 755 Anaconda3-2021.05-Linux-x86_64.sh
./Anaconda3-2021.05-Linux-x86_64.sh
exit # and login again
At this point Python is set up but might not be activated, so it’s good to exit, log back in and confirm the version:
$ which python
/home/<user>/anaconda3/bin/python
$ python --version
Python 3.8.8
Download TensorFlow code:
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow
git checkout tags/v2.5.0
Configure build parameters:
./configure # accepting the defaults for most prompts
The build can now be started for the default arch as follows:
bazel build -c opt //tensorflow/tools/lib_package:libtensorflow
To build for arm64:
bazel build -c opt --config=elinux_aarch64 //tensorflow/tools/lib_package:libtensorflow
This can take some time but should produce a tarball at bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz.
We are now ready to integrate with Go code, but first we need to install the library on the target machine, which is a Raspberry Pi in my case:
# on a Raspberry Pi running a 64-bit OS
sudo tar -C /usr/local -xzf libtensorflow.tar.gz
sudo ldconfig
export LD_LIBRARY_PATH="/usr/local/lib"
Preparing Computation Graph
Now that the TensorFlow library is installed, we can start to prepare a computation graph in Python and export it for integration with Go. The computation graph is essentially a declarative manifest that defines inputs, outputs and operations on the data. The TensorFlow documentation covers graphs in more detail.
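To make the "declarative manifest" idea concrete, here is a toy model of a graph in plain Go. It is not how TensorFlow represents graphs; the node and graph types and the run method are invented for this sketch, and it only handles scalar Add and Mul. The point is that the graph names its inputs and operations up front, and values are supplied only when it is run.

```go
package main

import (
	"fmt"
)

// node is one operation in a toy computation graph. A placeholder has no
// inputs and takes its value from the feeds map at run time.
type node struct {
	op     string   // "Placeholder", "Add" or "Mul"
	inputs []string // names of the nodes this operation reads from
}

// graph maps operation names to nodes; it describes work without doing it.
type graph map[string]node

// run evaluates the named node, pulling placeholder values from feeds.
func (g graph) run(name string, feeds map[string]float64) (float64, error) {
	n, ok := g[name]
	if !ok {
		return 0, fmt.Errorf("unknown operation %q", name)
	}
	if n.op == "Placeholder" {
		v, ok := feeds[name]
		if !ok {
			return 0, fmt.Errorf("no value fed for placeholder %q", name)
		}
		return v, nil
	}
	if len(n.inputs) != 2 {
		return 0, fmt.Errorf("operation %q needs exactly two inputs", name)
	}
	a, err := g.run(n.inputs[0], feeds)
	if err != nil {
		return 0, err
	}
	b, err := g.run(n.inputs[1], feeds)
	if err != nil {
		return 0, err
	}
	switch n.op {
	case "Add":
		return a + b, nil
	case "Mul":
		return a * b, nil
	}
	return 0, fmt.Errorf("unsupported op %q", n.op)
}

func main() {
	// (x + y) * x, declared once and evaluated with whatever is fed in.
	g := graph{
		"x":   {op: "Placeholder"},
		"y":   {op: "Placeholder"},
		"sum": {op: "Add", inputs: []string{"x", "y"}},
		"out": {op: "Mul", inputs: []string{"sum", "x"}},
	}
	v, err := g.run("out", map[string]float64{"x": 2, "y": 3})
	fmt.Println(v, err) // prints "10 <nil>"
}
```

The real Go API follows the same pattern of feeding inputs and fetching outputs, but operations are looked up by name on the imported graph and the feeds map is keyed by tf.Output values instead of strings.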
In our case, we need to express matrix inversion in the graph. To do this we start with the following Python code:
import tensorflow as tf

# define python function over input array
# reshape array into matrix
def inv(x, dim):
    y = tf.reshape(x, shape=dim)
    return tf.linalg.inv(y)

# wrap python function
tfFuncInv = tf.function(inv)

# get graph implying that input data would be of
# arbitrary length and will be reshaped into
# a matrix
g = tfFuncInv.get_concrete_function(
    tf.ragged.constant(
        [],
        dtype=tf.float64,
    ),
    tf.constant([2,2]),
).graph

# print graph manifest to visually inspect it
print(g.as_graph_def())

# finally export the graph as a protobuf file
tf.io.write_graph(g, "./", "graph.pb", as_text=False)
The file graph.pb now contains a description of what needs to happen to the input data. We can now integrate it into Go and feed it data dynamically.
package main

import (
	_ "embed"
	"fmt"
	"log"
	"math/rand"

	tf "github.com/galeone/tensorflow/tensorflow/go"
)

//go:embed graph.pb
var def []byte

func main() {
	// import the graph
	g := tf.NewGraph()
	if err := g.Import(def, ""); err != nil {
		log.Fatal(err)
	}

	// print available operations in the graph
	for i, operation := range g.Operations() {
		fmt.Println(i, operation.Name())
	}

	data := make([]float64, 25)
	for i := range data {
		data[i] = rand.Float64()
	}

	// prepare input data
	x, err := tf.NewTensor(data)
	if err != nil {
		log.Fatal(err)
	}

	// prepare shape of the matrix
	shape, err := tf.NewTensor([]int32{5, 5})
	if err != nil {
		log.Fatal(err)
	}

	// prepare data feed specifying names of the operation
	feeds := map[tf.Output]*tf.Tensor{
		g.Operation("x").Output(0):   x,
		g.Operation("dim").Output(0): shape,
	}

	// prepare data outputs from tensorflow run
	fetches := []tf.Output{
		g.Operation("MatrixInverse").Output(0),
	}

	// start new session
	sess, err := tf.NewSession(
		g,
		&tf.SessionOptions{},
	)
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	// run session feeding feeds and fetching fetches
	out, err := sess.Run(feeds, fetches, nil)
	if err != nil {
		log.Fatal(err)
	}

	// reshape output data as vector
	y := out[0]
	if err := y.Reshape([]int64{25}); err != nil {
		log.Fatal(err)
	}

	yRaw, ok := y.Value().([]float64)
	if !ok {
		log.Fatal("type assertion error")
	}

	var k int

	// print input
	fmt.Println("input data:")
	k = 0
	for i := 0; i < 5; i++ {
		for j := 0; j < 5; j++ {
			fmt.Printf("%.4f ", data[k])
			k++
		}
		fmt.Println()
	}
	fmt.Println()

	// print output
	fmt.Println("inverse:")
	k = 0
	for i := 0; i < 5; i++ {
		for j := 0; j < 5; j++ {
			fmt.Printf("%.4f ", yRaw[k])
			k++
		}
		fmt.Println()
	}
}
The code can now be built using go build main.go and it will run as shown earlier in this post.
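The program passes the matrix around as a flat vector in row-major order, so element (i, j) of a matrix with cols columns sits at index i*cols + j. The two print loops in main could be folded into a small helper along these lines. This is a sketch that needs only the standard library; formatMatrix is a name made up for this example, not part of the TensorFlow bindings.

```go
package main

import (
	"fmt"
	"strings"
)

// formatMatrix renders a flat, row-major slice as rows x cols lines of text.
// Element (i, j) lives at index i*cols + j.
func formatMatrix(flat []float64, rows, cols int) (string, error) {
	if rows < 0 || cols < 0 || rows*cols != len(flat) {
		return "", fmt.Errorf("have %d values, want %d x %d", len(flat), rows, cols)
	}
	var sb strings.Builder
	for i := 0; i < rows; i++ {
		for j := 0; j < cols; j++ {
			fmt.Fprintf(&sb, "%.4f ", flat[i*cols+j])
		}
		sb.WriteString("\n")
	}
	return sb.String(), nil
}

func main() {
	s, err := formatMatrix([]float64{1, 2, 3, 4, 5, 6}, 2, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(s)
}
```

Checking the length against rows*cols up front catches a shape mismatch before it turns into an index-out-of-range panic halfway through printing.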
Have fun!