Thanks!
As I understand it, it bind-mounts the /dev/nvidia devices and the CUDA toolkit binaries inside the container, giving it direct access just as if it was running on the host. It’s not virtualized, just running under a different namespace so the VRAM is still being managed by the host driver. I would think the same restrictions exist in containers that would apply for running CUDA applications normally on the host. Personally I’ve had up to 4 containers run GPU processes at the same time on 1 card.
And yes, Nvidia hosts it’s own GPU accelerated container images for PyTorch, Tensorflow and a bunch of others on the NGC. They also have images with the full CUDA SDK on their dockerhub.
Canadian islands because I ran out of bears.