syntax.us Let the syntax do the talking

Question:
How to install Nvidia Cuda 7.5 on CentOS 7?

I encountered a desktop recently with an Nvidia GPU chip in it:

http://www.google.com/search?q=HP+Pavilion+All-in-One+PC+27-inch+intel-i7

http://www.google.com/search?q=HP+Pavilion+27-a027c+Nvidia+GT930A

http://support.hp.com/us-en/product/HP-Pavilion-27-a000-All-in-One-Desktop-PC-series-(Touch)/11059032/document/c05145192/

So I booted the desktop up on a copy of CentOS 7:

http://isoredirect.centos.org/centos/7/isos/x86_64/CentOS-7-x86_64-DVD-1511.iso

I configured CentOS to boot into a text-only console rather than the GUI:
sudo systemctl set-default multi-user.target
sudo shutdown -r now
I did not want to deal with a broken CentOS GUI because of Cuda GPU contention issues.

Eventually I may learn how to configure a host so it can support both Cuda and a GUI.

Anyway, I knew that an Nvidia GPU could speed up machine learning calculations via the Cuda API:

https://developer.nvidia.com/cuda-gpus

I detected the chip with this shell command:
lspci | grep -i nvidia
I saw this:
dan@localhost.localdomain:~ $
dan@localhost.localdomain:~ $ lspci | grep -i nvidia
01:00.0 3D controller: NVIDIA Corporation Device 134e (rev a2)
dan@localhost.localdomain:~ $
dan@localhost.localdomain:~ $
I wondered how to install Cuda on this desktop.

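I started by preparing CentOS. I blacklisted the open-source nouveau driver, then installed build tools, matching kernel headers, and the EPEL repository:
# sudo does not apply to a shell redirect, so write the file through sudo tee
echo 'blacklist nouveau'         | sudo tee    /etc/modprobe.d/blacklist-nouveau.conf
echo 'options nouveau modeset=0' | sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf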
sudo yum groupinstall 'Development Tools'
sudo yum install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
sudo yum install epel-release
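Before installing cuda, it can help to confirm that nouveau is really out of the way. The sketch below assumes the usual Linux convention that loaded modules are listed in /proc/modules; the function name module_loaded is my own, not part of any tool. The blacklist normally takes effect only after a reboot, so a "still loaded" result right after editing the file is expected. If nouveau is still loaded after a reboot, Nvidia's Linux installation guide describes regenerating the initramfs (sudo dracut --force) as a further step.

```shell
# module_loaded NAME [FILE]: succeed if NAME is listed as a loaded module.
# FILE defaults to /proc/modules; the optional second argument only exists
# so the function can be tried against a sample file.
# A missing or unreadable FILE is treated as "not loaded".
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded nouveau; then
    echo "nouveau is still loaded; reboot before installing cuda"
else
    echo "nouveau is not loaded"
fi
```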
Then I downloaded an RPM from Nvidia:
mkdir -p ~/Downloads
cd       ~/Downloads
wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-rhel7-7-5-local-7.5-18.x86_64.rpm
I found that link on this page:

https://developer.nvidia.com/cuda-downloads

I installed it with this syntax:
cd ~/Downloads
sudo rpm -i cuda-repo-rhel7-7-5-local-7.5-18.x86_64.rpm
Next, I installed cuda:
sudo yum clean all
sudo yum install cuda
That command installed software in this location:
/usr/local/cuda-7.5/
Also it installed a convenient soft-link:
/usr/local/cuda
It looked like this:

dan@localhost.localdomain:~ $ 
dan@localhost.localdomain:~ $ ll /usr/local/
total 12
drwxr-xr-x. 13 root root 4096 Jul 13 19:10 ./
drwxr-xr-x. 13 root root 4096 Jul 12 22:22 ../
drwxr-xr-x.  2 root root    6 Aug 12  2015 bin/
lrwxrwxrwx.  1 root root    8 Jul 13 19:10 cuda -> cuda-7.5/
drwxr-xr-x. 13 root root 4096 Jul 13 19:10 cuda-7.5/
drwxr-xr-x.  2 root root    6 Aug 12  2015 etc/
drwxr-xr-x.  2 root root    6 Aug 12  2015 games/
drwxr-xr-x.  2 root root    6 Aug 12  2015 include/
drwxr-xr-x.  2 root root    6 Aug 12  2015 lib/
drwxr-xr-x.  2 root root    6 Aug 12  2015 lib64/
drwxr-xr-x.  2 root root    6 Aug 12  2015 libexec/
drwxr-xr-x.  2 root root    6 Aug 12  2015 sbin/
drwxr-xr-x.  5 root root   46 Jul 12 22:22 share/
drwxr-xr-x.  2 root root    6 Aug 12  2015 src/
dan@localhost.localdomain:~ $ 
dan@localhost.localdomain:~ $ ll /usr/local/cuda-7.5/
total 40
drwxr-xr-x. 13 root root 4096 Jul 13 19:10 ./
drwxr-xr-x. 13 root root 4096 Jul 13 19:10 ../
drwxr-xr-x.  3 root root 4096 Jul 13 19:10 bin/
drwxr-xr-x.  5 root root   52 Jul 13 19:10 doc/
drwxr-xr-x.  4 root root   33 Jul 13 19:06 extras/
lrwxrwxrwx.  1 root root   28 Jul 13 19:06 include -> targets/x86_64-linux/include/
lrwxrwxrwx.  1 root root   24 Jul 13 19:06 lib64 -> targets/x86_64-linux/lib/
drwxr-xr-x.  8 root root 4096 Jul 13 19:09 libnsight/
drwxr-xr-x.  7 root root 4096 Jul 13 19:09 libnvvp/
-rw-r--r--.  1 root root  365 Aug 15  2015 LICENSE
drwxr-xr-x.  7 root root   80 Jul 13 19:06 nvvm/
-rw-r--r--.  1 root root  365 Aug 15  2015 README
drwxr-xr-x. 11 root root 4096 Jul 13 19:10 samples/
drwxr-xr-x.  3 root root   16 Jul 13 19:06 share/
drwxr-xr-x.  2 root root 4096 Jul 13 19:06 src/
drwxr-xr-x.  3 root root   25 Aug 15  2015 targets/
drwxr-xr-x.  2 root root   42 Jul 13 19:06 tools/
-rw-r--r--.  1 root root   20 Aug 15  2015 version.txt
dan@localhost.localdomain:~ $ 
dan@localhost.localdomain:~ $ 
dan@localhost.localdomain:~ $ 
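The listing above includes a small version.txt file, which is a handy way to confirm which toolkit the /usr/local/cuda link points at. This is a minimal sketch: the CUDA_HOME variable and the cuda_version function are names I made up, and the "CUDA Version x.y.z" line format is an assumption about what the file contains.

```shell
# cuda_version FILE: print the version number from a line such as
# "CUDA Version 7.5.18" (that line format is an assumption).
cuda_version() {
    sed -n 's/^CUDA Version \([0-9.]*\).*/\1/p' "$1"
}

# CUDA_HOME is a hypothetical variable; it defaults to the soft-link.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

if [ -f "$CUDA_HOME/version.txt" ]; then
    echo "symlink target:  $(readlink -f "$CUDA_HOME")"
    echo "toolkit version: $(cuda_version "$CUDA_HOME/version.txt")"
else
    echo "no version.txt under $CUDA_HOME"
fi
```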
I found useful HTML docs here:

dan@localhost.localdomain:~ $
dan@localhost.localdomain:~ $
dan@localhost.localdomain:~ $ ll /usr/local/cuda-7.5/doc/html/
total 236
drwxr-xr-x. 47 root root   4096 Jul 13 19:10 ./
drwxr-xr-x.  5 root root     52 Jul 13 19:10 ../
drwxr-xr-x.  5 root root     52 Jul 13 19:10 common/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cublas/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-binary-utilities/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-c-best-practices-guide/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-compiler-driver-nvcc/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-c-programming-guide/
drwxr-xr-x.  2 root root   4096 Jul 13 19:10 cuda-driver-api/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 cuda-gdb/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-installation-guide-linux/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-installation-guide-mac-os-x/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-installation-guide-microsoft-windows/
drwxr-xr-x.  2 root root   4096 Jul 13 19:10 cuda-math-api/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 cuda-memcheck/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-quick-start-guide/
drwxr-xr-x.  2 root root   4096 Jul 13 19:10 cuda-runtime-api/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 cuda-samples/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 cuda-toolkit-release-notes/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 cufft/
drwxr-xr-x.  2 root root   8192 Jul 13 19:10 cupti/
drwxr-xr-x.  3 root root   4096 Jul 13 19:10 curand/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 cusolver/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 cusparse/
drwxr-xr-x.  2 root root   4096 Jul 13 19:10 debugger-api/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 eula/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 floating-point/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 gpudirect-rdma/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 incomplete-lu-cholesky/
-rw-r--r--.  1 root root  32526 Aug 15  2015 index.html
drwxr-xr-x.  2 root root     23 Jul 13 19:10 inline-ptx-assembly/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 kepler-tuning-guide/
drwxr-xr-x.  2 root root  12288 Jul 13 19:10 libdevice-users-guide/
drwxr-xr-x.  2 root root   4096 Jul 13 19:10 libnvvm-api/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 maxwell-compatibility-guide/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 maxwell-tuning-guide/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 npp/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 nsight-eclipse-edition-getting-started-guide/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 nvblas/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 nvrtc/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 nvvm-ir-spec/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 optimus-developer-guide/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 parallel-thread-execution/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 profiler-users-guide/
drwxr-xr-x.  3 root root     38 Jul 13 19:10 ptx-writers-guide-to-interoperability/
drwxr-xr-x.  3 root root   4096 Jul 13 19:10 search/
-rw-r--r--.  1 root root 138760 Aug 15  2015 sitemap.xml
drwxr-xr-x.  2 root root     23 Jul 13 19:10 thrust/
drwxr-xr-x.  2 root root     23 Jul 13 19:10 video-decoder/
dan@localhost.localdomain:~ $ 
dan@localhost.localdomain:~ $ 
Then I added /usr/local/cuda/bin to my PATH by putting this line in ~/.bashrc:
export PATH=/usr/local/cuda/bin:$PATH
I reloaded ~/.bashrc in my current shell:
. ~/.bashrc
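Sourcing ~/.bashrc more than once prepends the same directory to PATH each time. A possible alternative for the export line is sketched below; prepend_path is a hypothetical helper name, and the check at the end only reports whether nvcc can be found.

```shell
# prepend_path DIR: put DIR at the front of PATH unless it is already there.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already on PATH, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path /usr/local/cuda/bin
export PATH

# Report whether the Cuda compiler is now reachable.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found on PATH"
fi
```

With this in ~/.bashrc, running . ~/.bashrc repeatedly leaves a single /usr/local/cuda/bin entry in PATH.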
Next, I copied the samples to my home folder:
cd /usr/local/cuda/
rsync -a samples ~/
I found a Makefile there so I ran make:
cd ~/samples/
make
I saw this:

dan@localhost.localdomain:/usr/local/cuda $ 
dan@localhost.localdomain:/usr/local/cuda $ 
dan@localhost.localdomain:/usr/local/cuda $ rsync -a samples ~
dan@localhost.localdomain:/usr/local/cuda $ cd ~/samples
dan@localhost.localdomain:~/samples $ ll
total 140
drwxr-xr-x. 11 dan dan  4096 Jul 13 19:10 ./
drwx------. 21 dan dan  4096 Jul 16 18:50 ../
drwxr-xr-x. 47 dan dan  4096 Jul 13 19:10 0_Simple/
drwxr-xr-x.  6 dan dan  4096 Jul 13 19:10 1_Utilities/
drwxr-xr-x. 12 dan dan  4096 Jul 13 19:10 2_Graphics/
drwxr-xr-x. 20 dan dan  4096 Jul 13 19:10 3_Imaging/
drwxr-xr-x. 10 dan dan  4096 Jul 13 19:10 4_Finance/
drwxr-xr-x.  9 dan dan  4096 Jul 13 19:10 5_Simulations/
drwxr-xr-x. 30 dan dan  4096 Jul 13 19:10 6_Advanced/
drwxr-xr-x. 28 dan dan  4096 Jul 13 19:10 7_CUDALibraries/
drwxr-xr-x.  6 dan dan    90 Jul 13 19:10 common/
-rw-r--r--.  1 dan dan 96407 Aug 14  2015 EULA.txt
-rw-r--r--.  1 dan dan  2652 Jul 13 19:10 Makefile
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 


dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ make
make[1]: Entering directory `/home/dan/samples/0_Simple/UnifiedMemoryStreams'
/usr/local/cuda-7.5/bin/nvcc -ccbin g++ -I../../common/inc -m64
-Xcompiler -fopenmp -gencode arch=compute_30,code=sm_30 -gencode
arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37
-gencode arch=compute_50,code=sm_50 -gencode
arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -o
UnifiedMemoryStreams.o -c UnifiedMemoryStreams.cu

snip ...

ptxas info    : 'device-function-maxrregcount' is a BETA feature (target: sm_52)
ptxas info    : 'device-function-maxrregcount' is a BETA feature (target: sm_52)
mkdir -p ../../bin/x86_64/linux/release
cp simpleDevLibCUBLAS ../../bin/x86_64/linux/release
make[1]: Leaving directory `/home/dan/samples/7_CUDALibraries/simpleDevLibCUBLAS'
Finished building CUDA samples
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
The executables landed here:
~/samples/bin/x86_64/linux/release/
They looked like this:

dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ ll ~/samples/bin/x86_64/linux/release/
total 441900
drwxrwxr-x. 2 dan dan      8192 Jul 16 19:14 ./
drwxrwxr-x. 3 dan dan        28 Jul 16 18:51 ../
-rwxrwxr-x. 1 dan dan    674092 Jul 16 19:03 alignedTypes*
-rwxrwxr-x. 1 dan dan    557953 Jul 16 18:51 asyncAPI*
-rwxrwxr-x. 1 dan dan    563354 Jul 16 18:56 bandwidthTest*
-rwxrwxr-x. 1 dan dan    563146 Jul 16 19:13 batchCUBLAS*
-rwxrwxr-x. 1 dan dan    861062 Jul 16 18:58 bicubicTexture*
-rwxrwxr-x. 1 dan dan    814610 Jul 16 18:58 bilateralFilter*
-rwxrwxr-x. 1 dan dan    826691 Jul 16 18:56 bindlessTexture*
-rwxrwxr-x. 1 dan dan    747492 Jul 16 19:01 binomialOptions*
-rwxrwxr-x. 1 dan dan     54297 Jul 16 19:01 binomialOptions_nvrtc*
-rwxrwxr-x. 1 dan dan    595672 Jul 16 19:00 BlackScholes*
-rwxrwxr-x. 1 dan dan     45782 Jul 16 19:00 BlackScholes_nvrtc*
-rwxrwxr-x. 1 dan dan    991773 Jul 16 18:58 boxFilter*
-rwxrwxr-x. 1 dan dan   6323618 Jul 16 19:13 boxFilterNPP*
-rwxrwxr-x. 1 dan dan   3778441 Jul 16 19:04 cdpAdvancedQuicksort*
-rwxrwxr-x. 1 dan dan   3667694 Jul 16 19:04 cdpBezierTessellation*
-rwxrwxr-x. 1 dan dan  41500691 Jul 16 19:04 cdpLUDecomposition*
-rwxrwxr-x. 1 dan dan   4142249 Jul 16 19:04 cdpQuadtree*
-rwxrwxr-x. 1 dan dan   3626510 Jul 16 18:51 cdpSimplePrint*
-rwxrwxr-x. 1 dan dan   3643251 Jul 16 18:51 cdpSimpleQuicksort*
-rwxrwxr-x. 1 dan dan    560224 Jul 16 18:51 clock*
-rwxrwxr-x. 1 dan dan     35029 Jul 16 18:51 clock_nvrtc*
-rwxrwxr-x. 1 dan dan    568579 Jul 16 19:05 concurrentKernels*
-rwxrwxr-x. 1 dan dan   3557332 Jul 16 19:13 conjugateGradient*
-rwxrwxr-x. 1 dan dan    548091 Jul 16 19:13 conjugateGradientPrecond*
-rwxrwxr-x. 1 dan dan    543612 Jul 16 19:13 conjugateGradientUM*
-rwxrwxr-x. 1 dan dan    664956 Jul 16 18:58 convolutionFFT2D*
-rwxrwxr-x. 1 dan dan    620487 Jul 16 18:59 convolutionSeparable*
-rwxrwxr-x. 1 dan dan    596799 Jul 16 18:59 convolutionTexture*
-rwxrwxr-x. 1 dan dan    565243 Jul 16 18:51 cppIntegration*
-rwxrwxr-x. 1 dan dan    577577 Jul 16 18:52 cppOverload*
-rwxrwxr-x. 1 dan dan    305363 Jul 16 18:59 cudaDecodeGL*
-rwxrwxr-x. 1 dan dan    547568 Jul 16 18:52 cudaOpenMP*
-rwxrwxr-x. 1 dan dan    529718 Jul 16 19:13 cuHook*
-rwxrwxr-x. 1 dan dan    588674 Jul 16 19:13 cuSolverDn_LinearSolver*
-rwxrwxr-x. 1 dan dan    597646 Jul 16 19:13 cuSolverRf*
-rwxrwxr-x. 1 dan dan    592700 Jul 16 19:13 cuSolverSp_LinearSolver*
-rwxrwxr-x. 1 dan dan    761177 Jul 16 18:59 dct8x8*
-rwxrwxr-x. 1 dan dan    538598 Jul 16 18:56 deviceQuery*
-rwxrwxr-x. 1 dan dan    534759 Jul 16 18:56 deviceQueryDrv*
-rwxrwxr-x. 1 dan dan    605204 Jul 16 18:59 dwtHaar1D*
-rwxrwxr-x. 1 dan dan    692394 Jul 16 18:59 dxtc*
-rwxrwxr-x. 1 dan dan    865691 Jul 16 19:05 eigenvalues*
-rwxrwxr-x. 1 dan dan    595956 Jul 16 19:05 fastWalshTransform*
-rwxrwxr-x. 1 dan dan    632160 Jul 16 19:03 FDTD3d*
-rwxrwxr-x. 1 dan dan    876734 Jul 16 19:01 fluidsGL*
-rwxrwxr-x. 1 dan dan   3079713 Jul 16 19:13 freeImageInteropNPP*
-rwxrwxr-x. 1 dan dan    860099 Jul 16 19:03 FunctionPointers*
-rwxrwxr-x. 1 dan dan   3098985 Jul 16 19:13 histEqualizationNPP*
-rwxrwxr-x. 1 dan dan    650473 Jul 16 18:59 histogram*
-rwxrwxr-x. 1 dan dan    713011 Jul 16 18:58 HSOpticalFlow*
-rwxrwxr-x. 1 dan dan    937342 Jul 16 19:00 imageDenoising*
-rwxrwxr-x. 1 dan dan    552250 Jul 16 18:52 inlinePTX*
-rwxrwxr-x. 1 dan dan     35037 Jul 16 18:52 inlinePTX_nvrtc*
-rwxrwxr-x. 1 dan dan    799859 Jul 16 19:06 interval*
-rwxrwxr-x. 1 dan dan    574077 Jul 16 19:13 jpegNPP*
-rwxrwxr-x. 1 dan dan    542188 Jul 16 19:13 libcuhook.so.1*
-rwxrwxr-x. 1 dan dan   2419708 Jul 16 19:06 lineOfSight*
-rwxrwxr-x. 1 dan dan   1071664 Jul 16 18:56 Mandelbrot*
-rwxrwxr-x. 1 dan dan   2150836 Jul 16 18:57 marchingCubes*
-rwxrwxr-x. 1 dan dan    580973 Jul 16 18:52 matrixMul*
-rwxrwxr-x. 1 dan dan    543173 Jul 16 18:52 matrixMulCUBLAS*
-rwxrwxr-x. 1 dan dan    552454 Jul 16 18:52 matrixMulDrv*
-rwxrwxr-x. 1 dan dan    605491 Jul 16 19:06 matrixMulDynlinkJIT*
-rw-rw-r--. 1 dan dan     40837 Jul 16 18:52 matrixMul_kernel64.ptx
-rwxrwxr-x. 1 dan dan     39231 Jul 16 18:52 matrixMul_nvrtc*
-rwxrwxr-x. 1 dan dan   1348702 Jul 16 19:12 MC_EstimatePiInlineP*
-rwxrwxr-x. 1 dan dan   1329161 Jul 16 19:12 MC_EstimatePiInlineQ*
-rwxrwxr-x. 1 dan dan    606423 Jul 16 19:12 MC_EstimatePiP*
-rwxrwxr-x. 1 dan dan    606540 Jul 16 19:13 MC_EstimatePiQ*
-rwxrwxr-x. 1 dan dan   1460814 Jul 16 19:13 MC_SingleAsianOptionP*
-rwxrwxr-x. 1 dan dan    838702 Jul 16 19:07 mergeSort*
-rwxrwxr-x. 1 dan dan  54093908 Jul 16 19:13 MersenneTwisterGP11213*
-rwxrwxr-x. 1 dan dan   1708078 Jul 16 19:01 MonteCarloMultiGPU*
-rwxrwxr-x. 1 dan dan   1435964 Jul 16 19:01 nbody*
-rwxrwxr-x. 1 dan dan    803833 Jul 16 19:07 newdelete*
-rw-rw-r--. 1 dan dan     11785 Jul 16 18:59 NV12ToARGB_drvapi64.ptx
-rwxrwxr-x. 1 dan dan    854344 Jul 16 19:02 oceanFFT*
-rwxrwxr-x. 1 dan dan    604657 Jul 16 18:56 p2pBandwidthLatencyTest*
-rwxrwxr-x. 1 dan dan   2186717 Jul 16 19:02 particles*
-rwxrwxr-x. 1 dan dan    834128 Jul 16 19:00 postProcessGL*
-rwxrwxr-x. 1 dan dan    540591 Jul 16 19:07 ptxjit*
-rwxrwxr-x. 1 dan dan    612308 Jul 16 19:01 quasirandomGenerator*
-rwxrwxr-x. 1 dan dan     50030 Jul 16 19:01 quasirandomGenerator_nvrtc*
-rwxrwxr-x. 1 dan dan   4085890 Jul 16 19:08 radixSortThrust*
-rwxrwxr-x. 1 dan dan    787071 Jul 16 19:13 randomFog*
-rwxrwxr-x. 1 dan dan    857834 Jul 16 19:00 recursiveGaussian*
-rwxrwxr-x. 1 dan dan   2067653 Jul 16 19:08 reduction*
-rwxrwxr-x. 1 dan dan    574847 Jul 16 19:08 scalarProd*
-rwxrwxr-x. 1 dan dan    592173 Jul 16 19:08 scan*
-rwxrwxr-x. 1 dan dan  12067531 Jul 16 19:11 segmentationTreeThrust*
-rwxrwxr-x. 1 dan dan    626774 Jul 16 19:11 shfl_scan*
-rwxrwxr-x. 1 dan dan    556275 Jul 16 18:52 simpleAssert*
-rwxrwxr-x. 1 dan dan     34974 Jul 16 18:52 simpleAssert_nvrtc*
-rwxrwxr-x. 1 dan dan    566361 Jul 16 18:52 simpleAtomicIntrinsics*
-rwxrwxr-x. 1 dan dan     41169 Jul 16 18:52 simpleAtomicIntrinsics_nvrtc*
-rwxrwxr-x. 1 dan dan    556249 Jul 16 18:53 simpleCallback*
-rwxrwxr-x. 1 dan dan    588770 Jul 16 18:53 simpleCubemapTexture*
-rwxrwxr-x. 1 dan dan    538551 Jul 16 19:13 simpleCUBLAS*
-rwxrwxr-x. 1 dan dan    799786 Jul 16 19:00 simpleCUDA2GL*
-rwxrwxr-x. 1 dan dan    561176 Jul 16 19:13 simpleCUFFT*
-rwxrwxr-x. 1 dan dan    568631 Jul 16 19:13 simpleCUFFT_2d_MGPU*
-rwxrwxr-x. 1 dan dan 187064153 Jul 16 19:14 simpleCUFFT_callback*
-rwxrwxr-x. 1 dan dan    556872 Jul 16 19:13 simpleCUFFT_MGPU*
-rwxrwxr-x. 1 dan dan  41352797 Jul 16 19:14 simpleDevLibCUBLAS*
-rwxrwxr-x. 1 dan dan    803799 Jul 16 18:57 simpleGL*
-rwxrwxr-x. 1 dan dan    572896 Jul 16 19:11 simpleHyperQ*
-rwxrwxr-x. 1 dan dan    555908 Jul 16 18:53 simpleIPC*
-rwxrwxr-x. 1 dan dan    580591 Jul 16 18:53 simpleLayeredTexture*
-rwxrwxr-x. 1 dan dan    561135 Jul 16 18:53 simpleMultiCopy*
-rwxrwxr-x. 1 dan dan    555826 Jul 16 18:53 simpleMultiGPU*
-rwxrwxr-x. 1 dan dan    556803 Jul 16 18:53 simpleOccupancy*
-rwxrwxr-x. 1 dan dan    555784 Jul 16 18:54 simpleP2P*
-rwxrwxr-x. 1 dan dan    590218 Jul 16 18:54 simplePitchLinearTexture*
-rwxrwxr-x. 1 dan dan    560208 Jul 16 18:54 simplePrintf*
-rwxrwxr-x. 1 dan dan    582674 Jul 16 18:54 simpleSeparateCompilation*
-rwxrwxr-x. 1 dan dan    565027 Jul 16 18:54 simpleStreams*
-rwxrwxr-x. 1 dan dan    641315 Jul 16 18:54 simpleSurfaceWrite*
-rwxrwxr-x. 1 dan dan    579337 Jul 16 18:55 simpleTemplates*
-rwxrwxr-x. 1 dan dan     45469 Jul 16 18:55 simpleTemplates_nvrtc*
-rwxrwxr-x. 1 dan dan    632346 Jul 16 18:55 simpleTexture*
-rwxrwxr-x. 1 dan dan    786404 Jul 16 18:57 simpleTexture3D*
-rwxrwxr-x. 1 dan dan    563086 Jul 16 18:55 simpleTextureDrv*
-rw-rw-r--. 1 dan dan     18876 Jul 16 18:55 simpleTexture_kernel64.ptx
-rwxrwxr-x. 1 dan dan    573374 Jul 16 18:55 simpleVoteIntrinsics*
-rwxrwxr-x. 1 dan dan     39588 Jul 16 18:55 simpleVoteIntrinsics_nvrtc*
-rwxrwxr-x. 1 dan dan    560361 Jul 16 18:55 simpleZeroCopy*
-rwxrwxr-x. 1 dan dan   2905482 Jul 16 19:03 smokeParticles*
-rwxrwxr-x. 1 dan dan    822017 Jul 16 18:58 SobelFilter*
-rwxrwxr-x. 1 dan dan   1400482 Jul 16 19:01 SobolQRNG*
-rwxrwxr-x. 1 dan dan    679081 Jul 16 19:12 sortingNetworks*
-rwxrwxr-x. 1 dan dan    602796 Jul 16 19:00 stereoDisparity*
-rwxrwxr-x. 1 dan dan    556037 Jul 16 19:03 StreamPriorities*
-rwxrwxr-x. 1 dan dan    567585 Jul 16 18:55 template*
-rwxrwxr-x. 1 dan dan    547523 Jul 16 18:55 template_runtime*
-rwxrwxr-x. 1 dan dan   1219094 Jul 16 19:12 threadFenceReduction*
-rwxrwxr-x. 1 dan dan    546206 Jul 16 19:12 threadMigration*
-rw-rw-r--. 1 dan dan       581 Jul 16 19:12 threadMigration_kernel64.ptx
-rwxrwxr-x. 1 dan dan    632129 Jul 16 19:12 transpose*
-rwxrwxr-x. 1 dan dan    557139 Jul 16 18:51 UnifiedMemoryStreams*
-rwxrwxr-x. 1 dan dan    546985 Jul 16 18:55 vectorAdd*
-rwxrwxr-x. 1 dan dan    546060 Jul 16 18:55 vectorAddDrv*
-rw-rw-r--. 1 dan dan      1162 Jul 16 18:55 vectorAdd_kernel64.ptx
-rwxrwxr-x. 1 dan dan     39203 Jul 16 18:55 vectorAdd_nvrtc*
-rwxrwxr-x. 1 dan dan   1008983 Jul 16 18:57 volumeFiltering*
-rwxrwxr-x. 1 dan dan    822753 Jul 16 18:57 volumeRender*
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
I needed about 25 minutes to run make.
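Most of that time goes into samples I did not need right away. A shorter route, sketched here under some assumptions, is to build a single sample directory and let make use every CPU core. The build_sample helper and the SAMPLES_DIR variable are names I invented, and 1_Utilities/deviceQuery is my assumption about where the deviceQuery source lives in the samples tree.

```shell
# SAMPLES_DIR is a hypothetical variable; it defaults to the copy made with rsync.
SAMPLES_DIR="${SAMPLES_DIR:-$HOME/samples}"

# Number of parallel make jobs: one per CPU core, or 1 if nproc is missing.
JOBS="$(nproc 2>/dev/null || echo 1)"

# build_sample SUBDIR: run make inside one sample directory only.
build_sample() {
    sample_dir="$SAMPLES_DIR/$1"
    if [ ! -f "$sample_dir/Makefile" ]; then
        echo "no Makefile in $sample_dir" >&2
        return 1
    fi
    make -C "$sample_dir" -j "$JOBS"
}

build_sample 1_Utilities/deviceQuery || echo "skipped: sample not found or build failed"
```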

I read this page:

http://docs.nvidia.com/cuda/cuda-installation-guide-linux/#running-binaries

I tried this command line:

dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ ll ~/samples/bin/x86_64/linux/release/deviceQuery
-rwxrwxr-x. 1 dan dan 538598 Jul 16 18:56 /home/dan/samples/bin/x86_64/linux/release/deviceQuery*
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 


dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ ~/samples/bin/x86_64/linux/release/deviceQuery
/home/dan/samples/bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Graphics Device"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    5.0
  Total amount of global memory:                 4096 MBytes (4294836224 bytes)
  ( 3) Multiprocessors, (128) CUDA Cores/MP:     384 CUDA Cores
  GPU Max Clock rate:                            902 MHz (0.90 GHz)
  Memory Clock rate:                             2000 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 1048576 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.5, NumDevs = 1, Device0 = Graphics Device
Result = PASS
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
dan@localhost.localdomain:~/samples $ 
I liked the look of that.
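The same installation guide also suggests running bandwidthTest, which appears in the list of executables above. The loop below is a rough sketch that runs both programs and looks for the "Result = PASS" line; BIN_DIR and sample_passed are names I chose for illustration, and I am assuming bandwidthTest ends with the same PASS line that deviceQuery prints.

```shell
# BIN_DIR is a hypothetical variable pointing at the built samples.
BIN_DIR="${BIN_DIR:-$HOME/samples/bin/x86_64/linux/release}"

# sample_passed: succeed if the text on stdin contains the PASS line.
sample_passed() {
    grep -q '^Result = PASS'
}

for sample in deviceQuery bandwidthTest; do
    if [ -x "$BIN_DIR/$sample" ]; then
        if "$BIN_DIR/$sample" | sample_passed; then
            echo "$sample: PASS"
        else
            echo "$sample: FAIL"
        fi
    else
        echo "$sample: not built"
    fi
done
```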

I also compared it to the same report from my Alienware laptop:

dan@srvr1404:~/samples/bin/x86_64/linux/release $ 
dan@srvr1404:~/samples/bin/x86_64/linux/release $ ll deviceQuery
-rwxrwxr-x 1 dan dan 542694 Jul 11 16:13 deviceQuery*
dan@srvr1404:~/samples/bin/x86_64/linux/release $ 
dan@srvr1404:~/samples/bin/x86_64/linux/release $ ./deviceQuery
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 980M"
  CUDA Driver Version / Runtime Version          7.5 / 7.5
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 4096 MBytes (4294770688 bytes)
  (12) Multiprocessors, (128) CUDA Cores/MP:     1536 CUDA Cores
  GPU Max Clock rate:                            1126 MHz (1.13 GHz)
  Memory Clock rate:                             2505 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 2097152 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 7.5, CUDA Runtime Version = 7.5, NumDevs = 1, Device0 = GeForce GTX 980M
Result = PASS
dan@srvr1404:~/samples/bin/x86_64/linux/release $ 
dan@srvr1404:~/samples/bin/x86_64/linux/release $ 
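The GPU in the Alienware laptop looks more powerful: 12 multiprocessors and 1536 CUDA cores versus 3 and 384, a 256-bit memory bus versus 64-bit, a 1126 MHz clock versus 902 MHz, and compute capability 5.2 versus 5.0.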
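The core totals in both reports are just the multiprocessor count times the cores per multiprocessor. The small sketch below redoes that arithmetic with the numbers from the two deviceQuery outputs; cuda_cores is a name I made up. Core count is only one rough indicator, so the ratio should not be read as a benchmark.

```shell
# cuda_cores MPS CORES_PER_MP: total CUDA cores, computed the way the
# deviceQuery report presents them (multiprocessors x cores per MP).
cuda_cores() {
    echo $(( $1 * $2 ))
}

desktop=$(cuda_cores 3 128)     # HP desktop GPU
laptop=$(cuda_cores 12 128)     # GeForce GTX 980M

echo "desktop: $desktop cores, laptop: $laptop cores"
echo "ratio: $(( laptop / desktop ))x"
```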

Next, I worked on the task of connecting my Cuda software to Machine Learning software.

If you have questions, email me: bikle101@gmail.com

