Rendering on a GPU Cluster with Cycles
Disclaimer: This tutorial is tailored to the cluster system I have access to. The output of the commands is unique to each system, and so are the cluster commands themselves. Also, since OpenCL is still not fully supported in Blender, this guide focuses on CUDA rendering only. Use it wisely, and refer to your own cluster documentation for the equivalents of qsub, interactive testing, nvidia-smi, … Feel free to drop a line if this helps you or if you have any complementary information.
Blender has no command-line way to set the render device Cycles should use. That means you may not be able to take full advantage of your GPU renderfarm out of the box.
This is not a problem if you use the same computer to save the file and to render it. However, if you need to render your file on a remote computer (in this case a renderfarm), the machine where you save the file may not have the hardware available on the target computer. So you can't rely on the 'render device' settings that are saved with your .blend.
Luckily we can use Python to set the proper rendering settings before producing your image.
Getting the List of GPUs
The ssh terminal I use for login doesn't have access to any GPU, so it is no good for testing. Taking advantage of the qsub system available in the cluster, the first step is to ask for an interactive session to get a list of the available GPUs. In an ssh session on your cluster, run:
$ qsub -q interactive -I -V -l walltime=01:00:00,nodes=1:gpus=3
qsub: waiting for job 125229.parallel-admin to start
qsub: job 125229.parallel-admin ready
At this point the shell terminal is back and we can ask Blender what devices it can see.
First copy the code below and paste it into a new file called available_devices.py:
import bpy
bpy.context.user_preferences.system.compute_device = 'BLABLABLA'
To use this script in Blender, run:
$ blender -b -P available_devices.py
Here this will return the following list: ('Intel Xeon CPU', 'Tesla M2070', 'Tesla M2070', 'Tesla M2070', 'Tesla M2070 (3x)')
And also the following intentional error:
TypeError: bpy_struct: item.attr = val: enum "BLABLABLA" not found in ('CUDA_0', 'CUDA_1', 'CUDA_2', 'CUDA_MULTI_2')
This "error" only serves to reveal the internal names of the CUDA devices. In my case I want to render with the 'Tesla M2070 (3x)', so I should set system.compute_device to 'CUDA_MULTI_2'. Apart from the CPU entry at the start of the output list, which has no CUDA name, the error list and the output list are in the same order, so the GPU entries correspond one-to-one.
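The pairing described above can be sketched in plain Python, outside Blender. This is only an illustration: the function names parse_enum_identifiers and pair_devices are hypothetical, and the rule "an entry whose name contains CPU has no CUDA identifier" is an assumption drawn from the output on my cluster, not something Blender guarantees.

```python
import ast
import re


def parse_enum_identifiers(error_message):
    """Extract the tuple of identifiers from an 'enum ... not found in (...)' error."""
    match = re.search(r"not found in (\(.*\))", error_message)
    if match is None:
        raise ValueError("no identifier list found in the error message")
    # The list is printed as a Python tuple literal, so it can be parsed safely.
    return ast.literal_eval(match.group(1))


def pair_devices(device_names, identifiers):
    """Pair each GPU name with the identifier found at the same position.

    Assumption: names containing 'CPU' have no CUDA identifier and are skipped.
    """
    gpu_names = [name for name in device_names if "CPU" not in name]
    if len(gpu_names) != len(identifiers):
        raise ValueError("GPU names and identifiers cannot be paired one-to-one")
    # A list of pairs is used because several GPUs may share the same name.
    return list(zip(identifiers, gpu_names))
```

Feeding it the list and the error message shown above pairs 'CUDA_MULTI_2' with 'Tesla M2070 (3x)', which is the device I picked by hand.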
Now to create the actual setup script. Paste the following code into a new file called cuda_setup.py (replace CUDA_MULTI_2 with the device you want to use on your system):
import bpy
bpy.context.user_preferences.system.compute_device_type = 'CUDA'
bpy.context.user_preferences.system.compute_device = 'CUDA_MULTI_2'
To use this script in Blender and test-render your file:
$ blender -b Nave.blend -o //nave_###.png -P ~/cuda_setup.py -f 1
Remember, the order of the arguments matters. Any argument added after the -f will be parsed only after the render is over. For the complete list of available arguments visit the Blender Wiki.
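If you launch Blender from a Python wrapper rather than typing the command, it helps to build the argument list in one place so the order cannot drift. A minimal sketch, where build_render_command is a hypothetical helper name and the file names are the ones used in this tutorial:

```python
import os.path


def build_render_command(blend_file, output_pattern, setup_script, frame):
    """Return the argument list for a single-frame background render.

    The output path (-o) and the setup script (-P) are placed before -f,
    because -f starts the render and later arguments are parsed too late.
    """
    return [
        "blender", "-b", blend_file,
        "-o", output_pattern,
        # No shell is involved when passing a list, so expand "~" ourselves.
        "-P", os.path.expanduser(setup_script),
        "-f", str(frame),
    ]
```

The resulting list can be handed to subprocess.call(), for example subprocess.call(build_render_command("Nave.blend", "//nave_###.png", "~/cuda_setup.py", 1)).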
For the final render you need to use the above command as part of a qsub job dispatching file (the -P cuda_setup.py part). Since Cycles doesn’t recognize all the cluster nodes as rendering devices, you need to split your job into batches, to have an instance of Blender running on every node. This is outside the scope of this guide though.
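The job dispatching itself depends on your cluster, but the frame-splitting arithmetic is the same everywhere. Here is a rough sketch under my own assumptions: split_frames and batch_commands are hypothetical names, each batch is meant to run on one node, and the animation is rendered with Blender's -s (start frame), -e (end frame) and -a (render animation) arguments.

```python
def split_frames(first, last, batches):
    """Split the inclusive frame range first..last into contiguous chunks.

    Returns at most `batches` (start, end) pairs whose sizes differ by
    no more than one frame.
    """
    total = last - first + 1
    if total <= 0 or batches <= 0:
        raise ValueError("need a non-empty frame range and at least one batch")
    batches = min(batches, total)  # never create empty batches
    base, extra = divmod(total, batches)
    chunks = []
    start = first
    for i in range(batches):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        chunks.append((start, start + size - 1))
        start += size
    return chunks


def batch_commands(blend_file, output_pattern, setup_script, first, last, batches):
    """Build one Blender argument list per batch of frames."""
    commands = []
    for start, end in split_frames(first, last, batches):
        commands.append([
            "blender", "-b", blend_file,
            "-o", output_pattern,
            "-P", setup_script,
            # -s and -e must come before -a, which starts the render.
            "-s", str(start), "-e", str(end), "-a",
        ])
    return commands
```

Each argument list would then become the Blender line of one job file, submitted with whatever your cluster uses in place of qsub.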
Just to be Sure
In case you think the GPUs may not be in use, you can check. First connect via ssh to the interactive node you were using for the test. Next run the NVIDIA SMI program, nvidia-smi.
In my case the output showed no GPU power being spared. Sorry, but you can't mine bitcoin here.
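The same check can be scripted. The sketch below assumes a driver whose nvidia-smi supports the --query-gpu and --format=csv options and reports utilization as a number; older drivers or cards may not. The function names and the 5% threshold are my own choices for illustration.

```python
import subprocess

# Ask for one CSV line per GPU: index, name, utilization in percent.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_usage(csv_text):
    """Turn 'index, name, utilization' lines into (int, str, int) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        index, rest = line.split(",", 1)
        name, utilization = rest.rsplit(",", 1)
        rows.append((int(index), name.strip(), int(utilization)))
    return rows


def idle_gpus(rows, threshold=5):
    """Return the indices of GPUs whose utilization is below the threshold."""
    return [index for index, _name, utilization in rows if utilization < threshold]


def query_gpu_usage():
    """Run nvidia-smi on the current node and parse its output."""
    output = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_gpu_usage(output)
```

Running idle_gpus(query_gpu_usage()) on the render node while Blender is working should return an empty list if every GPU is busy.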
This tutorial is intended for my own records, but also to help anyone stuck in the same situation, though you will likely have to adjust something for your own setup. I tested this with Blender 2.65a running on CentOS in a cluster based on the HP ProLiant SL390 server architecture.