syntax.us Let the syntax do the talking
Blog Contact Posts Questions Tags Hire Me

Question:
How to install TensorFlow 1.1 linked with Cuda 8.0.61, cuDNN 5.1 on Ubuntu 16.04?

I started by buying an HP Envy laptop with an NVIDIA GPU in it.

http://www.google.com/search?q=HP+ENVY+17-s143cl

Next, I changed the BIOS settings so the laptop boots in Legacy mode instead of UEFI mode:

http://www.google.com/search?q=On+HP+Laptop+BIOS+how+to+switch+from+UEFI+to+Legacy

Then, I installed Ubuntu 16.04 Desktop from a USB stick:

http://www.google.com/search?q=How+to+install+Ubuntu+16.04+from+USB+stick

Earlier, I had made the USB stick on another Ubuntu laptop using a utility called 'Startup Disk Creator':

http://www.google.com/search?q=Ubuntu+16.04+Startup+Disk+Creator

After I installed Ubuntu on the HP laptop, I added some of my favorite packages:
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install autoconf bison build-essential libssl-dev libyaml-dev    \
libreadline6-dev zlib1g-dev libncurses5-dev libffi-dev libgdbm3  sqlite3 curl \
libgdbm-dev libsqlite3-dev gitk postgresql postgresql-server-dev-all aptitude \
libpq-dev emacs wget openssh-server ruby ruby-dev libbz2-dev linux-headers-$(uname -r) \
r-base r-base-dev
After setting up Ubuntu on the laptop, I searched for TensorFlow installation instructions.

I found the following URL:

https://www.tensorflow.org/install/install_linux

The TensorFlow site changes frequently, so the page at that URL may no longer match what I saw.


Next, I studied this page:

http://docs.nvidia.com/cuda/cuda-installation-guide-linux/


I used the lspci shell command to verify that Ubuntu could see the GPU inside the laptop:
dan@h78:~/tf0513 $ lspci | grep -i nvidia
01:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)
dan@h78:~/tf0513 $
dan@h78:~/tf0513 $
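The same check can be scripted. The sketch below is a hypothetical helper (the name find_nvidia_devices is not part of the original steps) that filters lspci-style text for NVIDIA controllers. The sample line is the one printed above.

```python
import re

# Sample text copied from the lspci output shown above.
SAMPLE_LSPCI = "01:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)"


def find_nvidia_devices(lspci_text):
    """Return (pci_slot, description) pairs for lines that mention NVIDIA."""
    devices = []
    for line in lspci_text.splitlines():
        if "nvidia" not in line.lower():
            continue
        # lspci prints "<slot> <class>: <vendor and device>".
        match = re.match(r"^(\S+)\s+(.*)$", line.strip())
        if match:
            devices.append((match.group(1), match.group(2)))
    return devices


print(find_nvidia_devices(SAMPLE_LSPCI))
```

On a live machine the input could come from subprocess.check_output(["lspci"]).decode() instead of the sample string.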
Next, I studied this page:

http://developer.nvidia.com/cuda-downloads


I clicked the download link and inspected the downloaded file:
dan@h78:~/dl $ ll cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
-rw-rw-r-- 1 dan dan 1913589814 May 12 21:18 cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
dan@h78:~/dl $
dan@h78:~/dl $
I installed it with a shell command:
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
I ran two more shell commands:
sudo apt-get update
sudo apt-get install cuda
I updated my PATH:
export CUDA_HOME=/usr/local/cuda
export PATH=${CUDA_HOME}/bin:$PATH
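Those two exports only affect the current shell, so it can help to confirm they are in place before building anything. The following is a minimal sketch, assuming the default /usr/local/cuda location used above. The function name cuda_env_problems is made up for illustration.

```python
import os


def cuda_env_problems(env, cuda_home_default="/usr/local/cuda"):
    """Return a list of human-readable problems with CUDA-related variables in env."""
    problems = []
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        problems.append(
            "CUDA_HOME is not set (expected something like %s)" % cuda_home_default
        )
        # Fall back to the default so the PATH check below is still meaningful.
        cuda_home = cuda_home_default
    wanted = os.path.join(cuda_home, "bin")
    path_entries = [p.rstrip("/") for p in env.get("PATH", "").split(os.pathsep) if p]
    if wanted not in path_entries:
        problems.append("%s is not on PATH" % wanted)
    return problems


# An empty list means both variables look fine in the current environment.
print(cuda_env_problems(dict(os.environ)))
```

To make the settings survive a new terminal session, the two export lines can be appended to ~/.bashrc.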
I copied the samples folder to my home folder:
cd $CUDA_HOME
rsync -a samples ~/
I compiled and linked them:
cd ~/samples/
make
I tried the deviceQuery demo:

root@h78:~/samples $ find . -name deviceQuery -print
./bin/x86_64/linux/release/deviceQuery
./1_Utilities/deviceQuery
./1_Utilities/deviceQuery/deviceQuery
root@h78:~/samples $ 

root@h78:~/samples $ bin/x86_64/linux/release/deviceQuery
bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce 940MX"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 4044 MBytes (4240965632 bytes)
  ( 3) Multiprocessors, (128) CUDA Cores/MP:     384 CUDA Cores
  GPU Max Clock rate:                            1242 MHz (1.24 GHz)
  Memory Clock rate:                             900 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 1048576 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce 940MX
Result = PASS
root@h78:~/samples $
root@h78:~/samples $
I considered the above output to be good evidence that CUDA 8.0 was installed on my laptop.
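Rather than reading the whole report by eye, the summary at the end can be checked programmatically. This is only a sketch: summarize_device_query is a hypothetical helper, and it assumes the last two lines of the report keep the format shown above.

```python
import re

# Last two lines of the deviceQuery output shown above.
SAMPLE_TAIL = (
    "deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, "
    "CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce 940MX\n"
    "Result = PASS\n"
)


def summarize_device_query(output):
    """Extract driver/runtime versions, device count and PASS/FAIL from deviceQuery text."""

    def grab(pattern):
        match = re.search(pattern, output)
        return match.group(1) if match else None

    num_devs = grab(r"NumDevs = (\d+)")
    return {
        "driver": grab(r"CUDA Driver Version = ([\d.]+)"),
        "runtime": grab(r"CUDA Runtime Version = ([\d.]+)"),
        "num_devices": int(num_devs) if num_devs is not None else None,
        # Anything other than the literal word PASS counts as a failure.
        "passed": grab(r"Result = (\w+)") == "PASS",
    }


print(summarize_device_query(SAMPLE_TAIL))
```

A mismatch between the driver and runtime versions, or a result other than PASS, would be a reason to revisit the CUDA install before moving on to cuDNN.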

Next, I returned my attention to the TensorFlow page:

https://www.tensorflow.org/install/install_linux

It told me I need:

cuDNN v5.1

That software was easy to find at the URL below after I had created a free account:

https://developer.nvidia.com/cudnn

After I downloaded it I saw this file:

cudnn-8.0-linux-x64-v5.1.tgz
I untarred it:
tar xf cudnn-8.0-linux-x64-v5.1.tgz
I installed it with rsync shell commands:
sudo rsync -av cuda/include/ /usr/local/cuda/include/
sudo rsync -av cuda/lib64/   /usr/local/cuda/lib64/
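One way to confirm the copy worked is to read the version macros from the installed header. The sketch below assumes cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain integer macros. The function name and the example header fragment (including its patch level) are illustrative, not taken from the real file.

```python
import re


def cudnn_version_from_header(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+%s\s+(\d+)" % macro, header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Illustrative header fragment; the patch level here is made up.
EXAMPLE_HEADER = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      1
#define CUDNN_PATCHLEVEL 0
"""
print(cudnn_version_from_header(EXAMPLE_HEADER))
```

Against the real install, the input would be open("/usr/local/cuda/include/cudnn.h").read(), and the first two numbers should come back as 5 and 1.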
Next I installed libcupti-dev:
sudo apt-get install libcupti-dev
Then I installed Python 3.6.1 under my home folder:

wget https://www.python.org/ftp/python/3.6.1/Python-3.6.1.tar.xz
tar xf Python-3.6.1.tar.xz
cd     Python-3.6.1/
./configure --prefix=${HOME}/py36
make
make install
cd ~/py36/bin/
ln -s python3.6 python
ln -s pip3.6 pip
export PATH=${HOME}/py36/bin:$PATH
which pip
which python
Next, I installed TensorFlow 1.1.0 with GPU support (CUDA and cuDNN):
pip install tensorflow-gpu
I verified the TensorFlow version:
dan@h79:~/tf0513 $ pip list |grep tensorflow
tensorflow-gpu (1.1.0)
dan@h79:~/tf0513 $
dan@h79:~/tf0513 $
I created a simple Python script:
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess  = tf.Session()
print(sess.run(hello))
I ran it:

dan@h79:~/tf0513 $ python tftest.py 
2017-05-14 11:35:21.580082: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-14 11:35:21.580109: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-14 11:35:21.580113: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-05-14 11:35:21.580116: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-14 11:35:21.580120: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-05-14 11:35:21.898045: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-05-14 11:35:21.898501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce 940MX
major: 5 minor: 0 memoryClockRate (GHz) 1.2415
pciBusID 0000:01:00.0
Total memory: 3.95GiB
Free memory: 3.60GiB
2017-05-14 11:35:21.898530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-05-14 11:35:21.898538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-05-14 11:35:21.898554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0)
b'Hello, TensorFlow!'
dan@h79:~/tf0513 $ 
dan@h79:~/tf0513 $
I considered the above output to be solid evidence that my HP Envy laptop was running a copy of TensorFlow 1.1 with CUDA 8.0.61.
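For a check that does not rely on reading log lines, TensorFlow can be asked directly which devices it sees. This is a hedged sketch for the TensorFlow 1.x series: it uses device_lib.list_local_devices() from tensorflow.python.client, while gpu_names, local_gpu_names and the stand-in objects are hypothetical names added for illustration.

```python
from types import SimpleNamespace


def gpu_names(devices):
    """Return the names of entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_gpu_names():
    """Ask TensorFlow for its local devices; needs tensorflow-gpu installed."""
    # Imported here so gpu_names() can be used without TensorFlow present.
    from tensorflow.python.client import device_lib

    return gpu_names(device_lib.list_local_devices())


# Stand-in objects that mimic the two attributes used above (hypothetical values).
fake = [
    SimpleNamespace(name="/cpu:0", device_type="CPU"),
    SimpleNamespace(name="/gpu:0", device_type="GPU"),
]
print(gpu_names(fake))
```

On the laptop itself, calling local_gpu_names() should return a non-empty list if the GPU build is working. An empty list would suggest TensorFlow fell back to the CPU.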

