syntax.us Let the syntax do the talking

Question:
How to install TensorFlow linked with Cuda 7.5 on Ubuntu 14.04?

I bought a laptop recently with an Nvidia GPU chip in it.

I knew that an Nvidia GPU could speed up machine learning calculations via the Cuda API:

https://developer.nvidia.com/cuda-gpus

I detected the chip with this shell command:
lspci | grep -i nvidia
I saw this:
01:00.0 3D controller: NVIDIA Corporation GM204M [GeForce GTX 980M] (rev a1)
I was curious, 'How to install Cuda on this laptop?'

I wanted to run Ubuntu, but I assumed that the Ubuntu 14.04 Desktop environment might interfere with Cuda.

So, I started by installing Ubuntu 14.04 Server instead of Ubuntu Desktop.

The steps I followed to install Cuda 7.5 on my laptop running Ubuntu 14.04 Server are listed in this post:

nvidia_cuda75_ubuntu

Next I studied TensorFlow instructions:

https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#requirements

I noticed this statement:

The GPU version (Linux only) works best with Cuda Toolkit 7.5 and cuDNN v4. Other versions are supported.

I found the cuDNN software at this URL:

https://developer.nvidia.com/rdp/cudnn-download

To get the cuDNN download, I registered for a developer program.

The site presented me with a choice.

Do I want cuDNN 4.0 or cuDNN 5.1?

cuDNN 4.0 seemed the better choice, based on the TensorFlow requirements page:
  • cudnn-7.0-linux-x64-v4.0-prod.tgz
  • cudnn-7.5-linux-x64-v5.1-rc.tgz
After I downloaded cudnn-7.0-linux-x64-v4.0-prod.tgz, I did this:
cd ~/Downloads
mkdir tmp
cd tmp
tar zxf ../cudnn-7.0-linux-x64-v4.0-prod.tgz
# Become root, then copy the cuDNN headers and libraries into the Cuda 7.5 install:
su
rsync -av cuda/include/ /usr/local/cuda/include/
rsync -av cuda/lib64/   /usr/local/cuda/lib64/
In my mind, that finished the installation of cuDNN 4.0 into my Cuda 7.5 environment.
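A quick way to verify the copy is to look for the cuDNN header and libraries from Python. This is just a sanity-check sketch of my own (not part of the cuDNN instructions); it only assumes the /usr/local/cuda prefix used in the rsync commands above:

# check_cudnn_files.py -- confirm the cuDNN files landed under /usr/local/cuda.
import glob
import os

cuda_home = "/usr/local/cuda"

print("cudnn.h present:", os.path.exists(os.path.join(cuda_home, "include", "cudnn.h")))
print("libcudnn files :", glob.glob(os.path.join(cuda_home, "lib64", "libcudnn.so*")))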

Next I tried to install this TensorFlow ubuntu package into my Anaconda Python environment:

https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0-cp35-cp35m-linux_x86_64.whl

The above package name contains the token 'gpu', which indicates it is built to work with Cuda.

I failed to install the above TensorFlow 0.9 Ubuntu package into my Anaconda Python environment.

I like Anaconda, but I am content to use a plain install of Python, so I moved Anaconda out of the way:
cd ~/
mv anaconda3 anaconda3UNUSED


Next, I downloaded the Python source, built and installed it, then downloaded and installed pip:

cd ${HOME}/Downloads

wget https://www.python.org/ftp/python/3.5.2/Python-3.5.2.tgz

tar zxf Python-3.5.2.tgz
cd      Python-3.5.2

./configure --prefix=${HOME}/py35

make

make install

cd ${HOME}/py35/bin

ln -s python3 python

export PATH=${HOME}/py35/bin:$PATH

echo 'export PATH=${HOME}/py35/bin:$PATH' >> ${HOME}/.bashrc

cd ~/Downloads

/usr/bin/curl https://bootstrap.pypa.io/get-pip.py > get-pip.py

python get-pip.py
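
Before installing TensorFlow, a quick sanity check (my own addition, not part of the original steps) confirms that the shell now picks up the freshly built interpreter:

# sanity_check.py -- confirm the new interpreter is the one on PATH.
import sys
print(sys.executable)   # expect a path under ${HOME}/py35/bin
print(sys.version)      # expect 3.5.2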

Then, I installed TensorFlow 0.9:
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0-cp35-cp35m-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL
Next, I tried to use it:

dan@srvr1404:~/dl $ 
dan@srvr1404:~/dl $ python
Python 3.5.2 (default, Jul 13 2016, 22:45:49) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dan/py35/lib/python3.5/site-packages/tensorflow/__init__.py", line 23, in <module>
    from tensorflow.python import *
  File "/home/dan/py35/lib/python3.5/site-packages/tensorflow/python/__init__.py", line 48, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/home/dan/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
    _pywrap_tensorflow = swig_import_helper()
  File "/home/dan/py35/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
  File "/home/dan/py35/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/home/dan/py35/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcudart.so.7.5: cannot open shared object file: No such file or directory
>>> 
>>> 
I went on a quest for libcudart.so.7.5 and found it here:
/usr/local/cuda/lib64/libcudart.so.7.5
Long ago I learned that when Linux complains about a missing shared library with a name like libcudart.so.7.5, I need to tell the dynamic loader where the file resides.

I do this with an environment variable named LD_LIBRARY_PATH.

I should also point out that the TensorFlow documentation mentions this variable:

https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#optional-linux-enable-gpu-support

I added some syntax to ${HOME}/.bashrc with these shell commands:
echo 'export CUDA_HOME=/usr/local/cuda' >> ${HOME}/.bashrc
echo 'export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}' >> ${HOME}/.bashrc
bash
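As an extra check of my own (not from the TensorFlow docs), I can ask Python's ctypes to open the library directly; if the dynamic loader can now find libcudart.so.7.5 this succeeds, otherwise it raises OSError. It assumes LD_LIBRARY_PATH is exported in the current shell:

import ctypes

# Raises OSError if the dynamic loader still cannot find the library.
ctypes.CDLL("libcudart.so.7.5")
print("libcudart.so.7.5 loaded")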
Then, inside a Python prompt, I tried to import tensorflow again:

dan@srvr1404:~/dl $ export CUDA_HOME=/usr/local/cuda
dan@srvr1404:~/dl $ export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
dan@srvr1404:~/dl $ python
Python 3.5.2 (default, Jul 13 2016, 22:45:49) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
>>> 
It worked.

Yay!

I wanted further evidence that TensorFlow would use Cuda and the GPU on my laptop.

I started this effort by studying this page:

https://www.tensorflow.org/versions/r0.9/tutorials/mnist/beginners/index.html

I copied syntax from the above page into a simple Python script:

# my_mnist.py

# This script should use TensorFlow to learn from MNIST data and then calculate predictions.

# Also my intent is to look for evidence that TensorFlow is using the GPU on my laptop.

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("/tmp/MNIST_data/", one_hot=True)

import tensorflow as tf

# I will use x to hold image pixels:
x = tf.placeholder(tf.float32, [None, 784])

# W holds the weights and b holds the biases; both start at zero.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# y holds the predicted probabilities: softmax applied to the linear model xW + b.
y = tf.nn.softmax(tf.matmul(x, W) + b)

# y_ should hold observed y values.
y_ = tf.placeholder(tf.float32, [None, 10])

# The loss function, cross_entropy, is the mean over the batch of -sum(y_ * log(y)).
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# TensorFlow can automatically apply the backpropagation algorithm.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# 0.5 is the learning-rate

init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Train for 1000 steps; each step uses a mini-batch of 100 images.
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

# correct_prediction compares the predicted digit (argmax of y) with the true label (argmax of y_).
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

# accuracy is the fraction of test images classified correctly.
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

'bye'
After I ran the above I saw this output:

dan@ss79:~/ks/b/tf $ 
dan@ss79:~/ks/b/tf $ time python my_mnist.py
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting /tmp/MNIST_data/train-images-idx3-ubyte.gz
Extracting /tmp/MNIST_data/train-labels-idx1-ubyte.gz
Extracting /tmp/MNIST_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/MNIST_data/t10k-labels-idx1-ubyte.gz
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] 
successful NUMA node read from SysFS had negative value (-1), 
but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX 980M
major: 5 minor: 2 memoryClockRate (GHz) 1.1265
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.93GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] 
Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980M, pci bus id: 0000:01:00.0)

0.9118

real	0m3.135s
user	0m3.452s
sys	0m1.580s
dan@ss79:~/ks/b/tf $ 
dan@ss79:~/ks/b/tf $ 
The log line about creating TensorFlow device /gpu:0 made it obvious that, on my laptop, TensorFlow was using Cuda and my GPU. (The 0.9118 printed at the end is the model's accuracy on the MNIST test set, about 91%.)
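
For even more direct evidence, TensorFlow can report which device each operation lands on. The sketch below is my own addition, adapted from the device-placement logging pattern in the TensorFlow documentation; it assumes the same TensorFlow 0.9 install:

# device_placement.py -- ask TensorFlow to log where each op runs.
import tensorflow as tf

# A tiny graph: a 2x3 matrix multiplied by a 3x2 matrix.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

# log_device_placement=True prints one line per op naming its device,
# e.g. gpu:0 when the op runs on the GPU.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))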

