I installed the CUDA toolkit on my computer and started a BOINC project on the GPU. BOINC shows that it is running on the GPU, but is there a tool that can show me more details about what is running there, such as GPU usage and memory usage?
Answers:
Method 1
For Nvidia GPUs there is a tool, nvidia-smi, that can show memory usage, GPU utilization and GPU temperature. It also lists compute processes and has a few more options, but my graphics card (GeForce 9600 GT) is not fully supported.
Sun May 13 20:02:49 2012
+------------------------------------------------------+
| NVIDIA-SMI 3.295.40   Driver Version: 295.40         |
|-------------------------------+----------------------+----------------------+
| Nb.  Name                     | Bus Id        Disp.  | Volatile ECC SB / DB |
| Fan   Temp   Power Usage /Cap | Memory Usage         | GPU Util. Compute M. |
|===============================+======================+======================|
| 0.  GeForce 9600 GT           | 0000:01:00.0  N/A    |       N/A        N/A |
|   0%   51 C  N/A   N/A /  N/A |  90%  459MB /  511MB |  N/A      Default    |
|-------------------------------+----------------------+----------------------|
| Compute processes:                                               GPU Memory |
|  GPU  PID     Process name                                       Usage      |
|=============================================================================|
|  0.           Not Supported                                                 |
+-----------------------------------------------------------------------------+
Method 2
On Linux, nvidia-smi -l 1 will continually print the GPU usage info, with a refresh interval of 1 second.
Method 3
I recently wrote a simple command-line utility called gpustat (a wrapper around nvidia-smi); please take a look at https://github.com/wookayin/gpustat.

Method 4
For Intel GPUs there is intel-gpu-tools from the http://intellinuxgraphics.org/ project, which provides the command intel_gpu_top (among other things). It is similar to top and htop, but specifically for Intel GPUs.
render busy: 18%: ███▋ render space: 39/131072
bitstream busy: 0%: bitstream space: 0/131072
blitter busy: 28%: █████▋ blitter space: 28/131072
task percent busy
GAM: 33%: ██████▋ vert fetch: 0 (0/sec)
GAFS: 3%: ▋ prim fetch: 0 (0/sec)
VS: 0%: VS invocations: 559188 (150/sec)
SF: 0%: GS invocations: 0 (0/sec)
VF: 0%: GS prims: 0 (0/sec)
DS: 0%: CL invocations: 186396 (50/sec)
CL: 0%: CL prims: 186396 (50/sec)
SOL: 0%: PS invocations: 8191776208 (38576436/sec)
GS: 0%: PS depth pass: 8158502721 (38487525/sec)
HS: 0%:
TE: 0%:
GAFM: 0%:
SVG: 0%:
Method 5
nvidia-smi does not work on some Linux machines (it returns N/A for many properties). You can use nvidia-settings instead (this is also what mat kelcey used in his Python script).
nvidia-settings -q GPUUtilization -q useddedicatedgpumemory
You can also use:
watch -n0.1 "nvidia-settings -q GPUUtilization -q useddedicatedgpumemory"
for continuous monitoring.
Method 6
For Linux, I use this HTOP like tool that I wrote myself. It monitors and gives an overview of the GPU temperature as well as the core / VRAM / PCI-E & memory bus usage. It does not monitor what’s running on the GPU though.
Method 7
I have a GeForce GTX 1060 video card and I found that the following command gives me info about card utilization, temperature, fan speed and power consumption:
$ nvidia-smi --format=csv --query-gpu=power.draw,utilization.gpu,fan.speed,temperature.gpu
You can see the list of all query options with:
$ nvidia-smi --help-query-gpu
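If you want these readings inside a script rather than on the terminal, the CSV output is easy to parse. The following is only a minimal sketch, assuming nvidia-smi is on the PATH; it uses the csv,noheader,nounits format variant so that the header row and unit suffixes are dropped. The names parse_gpu_csv and query_gpus are hypothetical helpers, not part of any NVIDIA tool.

```python
import subprocess

# Same fields as the command above; see `nvidia-smi --help-query-gpu` for more.
FIELDS = ["power.draw", "utilization.gpu", "fan.speed", "temperature.gpu"]


def parse_gpu_csv(text, fields=FIELDS):
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        row = {}
        for name, value in zip(fields, values):
            try:
                row[name] = float(value)
            except ValueError:
                # Unsupported properties come back as text such as "[N/A]".
                row[name] = None
        rows.append(row)
    return rows


def query_gpus(fields=FIELDS):
    """Run nvidia-smi once and return the parsed readings (needs an NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(fields),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)
```

Calling query_gpus() returns a list with one dictionary per card; properties the card does not report are stored as None instead of raising an error.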
Method 8
You can use nvtop, it’s similar to htop but for NVIDIA GPUs. Link: https://github.com/Syllo/nvtop
Method 9
For completeness, AMD has two options:
- fglrx (closed source drivers):
$ aticonfig --odgc --odgt
- mesa (open source drivers): you can use RadeonTop. It shows your GPU utilization, both as the total activity percent and for individual blocks.
Method 10
Recently, I have written a monitoring tool called nvitop, the interactive NVIDIA-GPU process viewer.

It is written in pure Python and is easy to install.
Install from PyPI:
pip3 install --upgrade nvitop
Install the latest version from GitHub (recommended):
pip3 install git+https://github.com/XuehaiPan/nvitop.git#egg=nvitop
Run as a resource monitor:
nvitop -m
nvitop will show the GPU status like nvidia-smi but with additional fancy bars and history graphs.
For the processes, it uses psutil to collect process information and displays the USER, %CPU, %MEM, TIME and COMMAND fields, which is much more detailed than nvidia-smi. It also responds to user input in monitor mode, so you can interrupt or kill your processes on the GPUs.
nvitop comes with a tree-view screen and an environment screen:


In addition, nvitop can be integrated into other applications. For example, integrate into PyTorch training code:
import os
from nvitop.core import host, CudaDevice, HostProcess, GpuProcess
from torch.utils.tensorboard import SummaryWriter

device = CudaDevice(0)
this_process = GpuProcess(os.getpid(), device)
writer = SummaryWriter()
for epoch in range(n_epochs):  # n_epochs comes from your own training setup
    # some training code here
    # ...
    this_process.update_gpu_status()
    writer.add_scalars(
        'monitoring',
        {
            'device/memory_used': float(device.memory_used()) / (1 << 20),  # convert bytes to MiBs
            'device/memory_percent': device.memory_percent(),
            'device/memory_utilization': device.memory_utilization(),
            'device/gpu_utilization': device.gpu_utilization(),
            'host/cpu_percent': host.cpu_percent(),
            'host/memory_percent': host.virtual_memory().percent,
            'process/cpu_percent': this_process.cpu_percent(),
            'process/memory_percent': this_process.memory_percent(),
            'process/used_gpu_memory': float(this_process.gpu_memory()) / (1 << 20),  # convert bytes to MiBs
            'process/gpu_sm_utilization': this_process.gpu_sm_utilization(),
            'process/gpu_memory_utilization': this_process.gpu_memory_utilization(),
        },
        global_step  # step counter maintained by your training loop
    )
See https://github.com/XuehaiPan/nvitop for more details.
Note: nvitop is released under the GPLv3 License. Please feel free to use it as a package or dependency for your own projects. However, if you want to add or modify some features of nvitop, or copy some source code of nvitop into your own code, the source code should also be released under the GPLv3 License.
Method 11
I have had processes terminate (probably killed or crashed) yet continue to hold resources without being listed in nvidia-smi. Usually these processes were just taking up GPU memory.
If you think you have a process using resources on a GPU and it is not being shown in nvidia-smi, you can try running this command to double check. It will show you which processes are using your GPUs.
sudo fuser -v /dev/nvidia*
This works on EL7. Ubuntu or other distributions might list their NVIDIA devices under another name or location.
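The same check can be roughly approximated from Python by walking /proc and looking for file descriptors that point at the NVIDIA device nodes. This is only a sketch under a few assumptions: it runs on Linux, the devices live under /dev/nvidia*, and it only inspects open file descriptors (fuser also reports memory-mapped files, which this does not). The function name pids_holding and its parameters are hypothetical.

```python
import os


def pids_holding(prefix="/dev/nvidia", proc_root="/proc"):
    """Map PID -> sorted list of open paths that start with `prefix`."""
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission (try sudo)
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor closed while we were looking
            if target.startswith(prefix):
                holders.setdefault(int(entry), set()).add(target)
    return {pid: sorted(paths) for pid, paths in holders.items()}
```

Run as root, pids_holding() returns a dictionary such as PID to device paths for every process that still has an NVIDIA device open; without root it only sees your own processes.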
Method 12
Glances has a plugin which shows GPU utilization and memory usage.
http://glances.readthedocs.io/en/stable/aoa/gpu.html
Uses the nvidia-ml-py3 library: https://pypi.python.org/pypi/nvidia-ml-py3
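If you would rather read the numbers yourself, the same library can be called directly. The sketch below is not how Glances does it internally; it only assumes that the nvidia-ml-py3 package is installed (it provides the pynvml module) and that an NVIDIA driver is present. The helper names to_mib and read_gpus are hypothetical.

```python
def to_mib(n_bytes):
    """Convert a byte count to MiB, rounded to one decimal place."""
    return round(n_bytes / (1 << 20), 1)


def read_gpus():
    """Return one dict per GPU with utilization and memory figures via NVML."""
    import pynvml  # assumption: installed through the nvidia-ml-py3 package

    pynvml.nvmlInit()
    try:
        readings = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            readings.append({
                "index": index,
                "gpu_util_percent": util.gpu,
                "mem_used_mib": to_mib(mem.used),
                "mem_total_mib": to_mib(mem.total),
            })
        return readings
    finally:
        # Always release NVML, even if a query above failed.
        pynvml.nvmlShutdown()
```

The import sits inside read_gpus() so that the module can still be loaded on a machine without the library; on older cards some of these queries may raise an NVML "not supported" error.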
Method 13
For OS X
Including Mountain Lion
Excluding Mountain Lion
The last version of atMonitor to support GPU related features is atMonitor 2.7.1.
– and the link to 2.7.1 delivers 2.7b.
For the more recent version of the app, atMonitor – FAQ explains:
To make atMonitor compatible with MacOS 10.8 we have removed all GPU related features.
I experimented with 2.7b a.k.a. 2.7.1 on Mountain Lion with a MacBookPro5,2 with NVIDIA GeForce 9600M GT. The app ran for a few seconds before quitting; it showed temperature but not usage.

Method 14
For NVIDIA on Linux I use the following Python script, which takes an optional delay and repeat count like iostat and vmstat:
https://gist.github.com/matpalm/9c0c7c6a6f3681a0d39d
$ gpu_stat.py 1 2
{"util":{"PCIe":"0", "memory":"10", "video":"0", "graphics":"11"}, "used_mem":"161", "time": 1424839016}
{"util":{"PCIe":"0", "memory":"10", "video":"0", "graphics":"9"}, "used_mem":"161", "time":1424839018}
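Because each output line is a standalone JSON object, the samples are easy to post-process. Here is a small sketch that assumes the line format shown above (all readings are strings except the timestamp); parse_samples and mean_util are hypothetical helper names, not part of the gist.

```python
import json


def parse_samples(lines):
    """Parse gpu_stat.py-style JSON lines into dicts with integer readings."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between samples
        raw = json.loads(line)
        samples.append({
            "time": int(raw["time"]),
            "used_mem": int(raw["used_mem"]),
            "util": {name: int(value) for name, value in raw["util"].items()},
        })
    return samples


def mean_util(samples, key="graphics"):
    """Average one utilization figure over all samples."""
    return sum(s["util"][key] for s in samples) / len(samples)
```

Passing sys.stdin to parse_samples lets you pipe the script's output straight into a summary.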
Method 15
The following function appends information such as PID, user name, CPU usage, memory usage, GPU memory usage, program arguments and run time of processes that are being run on the GPU, to the output of nvidia-smi:
function better-nvidia-smi () {
    nvidia-smi
    # Join the per-process GPU memory table (keyed on PID, column 1)
    # with the ps listing (PID is column 3).
    join -1 1 -2 3 \
        <(nvidia-smi --query-compute-apps=pid,used_memory \
                     --format=csv \
          | sed "s/ //g" | sed "s/,/ /g" \
          | awk 'NR<=1 {print toupper($0)} NR>1 {print $0}' \
          | sed "/\[NotSupported\]/d" \
          | awk 'NR<=1{print $0;next}{print $0| "sort -k1"}') \
        <(ps -a -o user,pgrp,pid,pcpu,pmem,time,command \
          | awk 'NR<=1{print $0;next}{print $0| "sort -k3"}') \
        | column -t
}
Example output:
$ better-nvidia-smi
Fri Sep 29 16:52:58 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 378.13                 Driver Version: 378.13                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:01:00.0     N/A |                  N/A |
| 32%   49C    P8    N/A /  N/A |    872MiB /   976MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  Graphics Device     Off  | 0000:06:00.0     Off |                  N/A |
| 23%   35C    P8    17W / 250W |    199MiB / 11172MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1      5113    C   python                                       187MiB   |
+-----------------------------------------------------------------------------+
PID   USED_GPU_MEMORY[MIB]  USER    PGRP  %CPU  %MEM  TIME      COMMAND
9178  187MiB                tmborn  9175  129   2.6   04:32:19  ../path/to/python script.py args 42
Method 16
You can use
nvidia-smi pmon -i 0
to monitor every process on GPU 0, including compute/graphics mode, SM usage, memory usage, encoder usage and decoder usage.
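The pmon output is a whitespace-separated table whose first comment line names the columns, so it can be turned into dictionaries with little code. The sketch below assumes that layout (a header line starting with "#", the command name in the last column) and uses the -c 1 option to take a single sample; newer drivers print extra columns, which is why the column names are read from the header rather than hard-coded. parse_pmon and sample_pmon are hypothetical names.

```python
import subprocess


def parse_pmon(text):
    """Parse `nvidia-smi pmon` output into a list of dicts keyed by column name."""
    columns = None
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.lstrip().startswith("#"):
            if columns is None:
                # The first comment line carries the column names.
                columns = line.lstrip("# ").split()
            continue  # the second comment line only holds units
        if columns is None:
            continue  # data before any header; nothing to key it on
        # The command name is last, so let it absorb any extra fields.
        values = line.strip().split(None, len(columns) - 1)
        if len(values) == len(columns):
            rows.append(dict(zip(columns, values)))
    return rows


def sample_pmon(gpu_index=0):
    """Take one pmon sample for a GPU (needs an NVIDIA driver)."""
    cmd = ["nvidia-smi", "pmon", "-i", str(gpu_index), "-c", "1"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_pmon(out)
```

Values are kept as strings because pmon prints "-" for figures that do not apply, for example encoder usage on a pure compute process.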
Method 17
This script is more readable and is designed for easy mods and extensions.
You can replace gnome-terminal with your favorite terminal window program.
#! /bin/bash

# When re-invoked by watch with --guts, print both query tables and exit.
if [ "$1" = "--guts" ]; then
    echo; echo " ctrl-c to gracefully close"
    f "$a"
    f "$b"
    exit 0; fi

# easy to customize here using "nvidia-smi --help-query-gpu" as a guide
a='--query-gpu=pstate,memory.used,utilization.memory,utilization.gpu,encoder.stats.sessionCount'
b='--query-gpu=encoder.stats.averageFps,encoder.stats.averageLatency,temperature.gpu,power.draw'
p=0.5 # refresh period in seconds
s=110x9 # view port as width_in_chars x line_count

c="s/^/ /; s/, +/\t/g"        # indent each line and turn ", " into tabs
t="`echo '' |tr '\n' '\t'`"   # a literal tab, used as the column separator
function f() { echo; nvidia-smi --format=csv "$1" |sed -r "$c" |column -t "-s$t" "-o "; }
export c t a b; export -f f
gnome-terminal --hide-menubar --geometry=$s -- watch -t -n$p "`readlink -f "$0"`" --guts
#
License: GNU GPLv2, TranSeed Research
Method 18
I didn’t see it in the available answers (except maybe in a comment), so I thought I’d add that you can get a nicer refreshing nvidia-smi with watch. This refreshes the screen with each update rather than scrolling constantly.
watch -n 1 nvidia-smi
for one second interval updates. Replace the 1 with whatever you want, including fractional seconds:
watch -n 5 nvidia-smi
watch -n 0.1 nvidia-smi
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.


