Make KVM, Docker, and TensorFlow Play Nice

Notes on getting KVM, Docker, and TensorFlow to cooperate.

By default, a KVM VM does not expose the CPU flags needed to run the TensorFlow Docker image. In particular, the TensorFlow Docker image is compiled with AVX support, so the guest CPU has to advertise the avx flag.
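A quick way to confirm the problem from inside the guest is to look at the flags line of /proc/cpuinfo. The following is a minimal sketch, not part of the original setup: the function names parse_cpu_flags and missing_flags are made up for illustration, and the default required list contains only avx because that is the one flag named above.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=("avx",)):
    """Return the required flags that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    # Run this inside the guest. On systems without /proc/cpuinfo the
    # text stays empty and every required flag is reported as missing.
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""
    print("missing:", missing_flags(text))
```

Note that /proc/cpuinfo spells some flags with underscores (sse4_1, sse4_2), while libvirt uses dots (sse4.1, sse4.2).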

The solution:

  • Use virsh capabilities on the host to get a list of host CPU capabilities, then
  • Use virsh edit to manually add the necessary CPU flags as <feature> tags under the <cpu> tag.
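The first step can be scripted. Below is a rough sketch that pulls the feature names out of saved virsh capabilities output (for example, virsh capabilities > caps.xml) and keeps the SIMD-related ones. The helper names, the SIMD_NAMES selection, and the sample XML are all assumptions for illustration. One caveat: as far as I know, features already implied by the reported CPU model are not necessarily listed as separate feature entries, so a name missing from this list does not prove the host lacks it.

```python
import xml.etree.ElementTree as ET

# SIMD-related names in libvirt's spelling. This particular selection is
# an assumption; adjust it to taste.
SIMD_NAMES = ("ssse3", "sse4.1", "sse4.2", "avx", "avx2", "f16c", "fma")


def host_features(capabilities_xml):
    """Return feature names listed under <host><cpu> in capabilities XML."""
    root = ET.fromstring(capabilities_xml)
    return [f.get("name") for f in root.findall("./host/cpu/feature")]


def simd_features(capabilities_xml, wanted=SIMD_NAMES):
    """Return the wanted names the host lists explicitly, in `wanted` order."""
    listed = set(host_features(capabilities_xml))
    return [name for name in wanted if name in listed]


if __name__ == "__main__":
    # Hypothetical, heavily trimmed sample; real output is much longer.
    sample = """
    <capabilities>
      <host>
        <cpu>
          <arch>x86_64</arch>
          <feature name='ht'/>
          <feature name='avx2'/>
          <feature name='f16c'/>
        </cpu>
      </host>
    </capabilities>
    """
    print(simd_features(sample))
```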

I elected to add all of the SIMD capabilities, including half-precision (FP16) conversion, which is the f16c flag.

For an AMD Threadripper 1950X, the resulting <cpu> tag looks like this:

<cpu mode='host-model'>
  <model fallback='allow'/>
  <feature policy='require' name='sse4.1'/>
  <feature policy='require' name='sse4.2'/>
  <feature policy='require' name='avx'/>
  <feature policy='require' name='f16c'/>
  <feature policy='require' name='avx2'/>
  <feature policy='require' name='ssse3'/>
</cpu>
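If you would rather generate that fragment than type it, here is a small sketch using Python's standard xml.etree.ElementTree. The function name build_cpu_element is made up for illustration. The output still has to be pasted into the domain XML by hand via virsh edit. ElementTree writes attributes with double quotes, which is equivalent XML.

```python
import xml.etree.ElementTree as ET


def build_cpu_element(flags, mode="host-model"):
    """Return a <cpu> fragment that requires each flag in `flags`."""
    cpu = ET.Element("cpu", {"mode": mode})
    ET.SubElement(cpu, "model", {"fallback": "allow"})
    for name in flags:
        ET.SubElement(cpu, "feature", {"policy": "require", "name": name})
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(cpu)
    return ET.tostring(cpu, encoding="unicode")


if __name__ == "__main__":
    # The same flags as in the hand-written <cpu> tag above.
    print(build_cpu_element(
        ["sse4.1", "sse4.2", "avx", "f16c", "avx2", "ssse3"]))
```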

Test run:

pabs@hive:~> time docker run --rm -it tensorflow/tensorflow:latest-py3 \
  python3 -c "import tensorflow as tf; tf.enable_eager_execution()
print(tf.reduce_sum(tf.random_normal([1000, 1000])))"
2019-04-06 12:25:16.576095: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-04-06 12:25:16.627588: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3393620000 Hz
2019-04-06 12:25:16.629909: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x395bf00 executing computations on platform Host. Devices:
2019-04-06 12:25:16.629968: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
tf.Tensor(-95.5094, shape=(), dtype=float32)

real	0m1.780s
user	0m0.024s
sys	0m0.012s