Lots of things! Since 3D convolutional networks are very limited in their maximum resolution, most of the interesting things you can do involve learning on RGB+D images via a 2D CNN. A lot of tasks on images (segmentation, identification, etc.) become easier when you have even partial depth data as input.
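For what it's worth, the usual way to feed RGB+D into an ordinary 2D CNN is just to treat depth as a fourth input channel next to R, G and B. A minimal numpy sketch (the function name, the normalization scheme, and the `max_depth` cutoff are my own illustrative choices, not anything specific to a particular system):

```python
import numpy as np

def make_rgbd_input(rgb, depth, max_depth=10.0):
    """Stack an (H, W, 3) RGB image and an (H, W) depth map into an
    (H, W, 4) array suitable as 4-channel input to a 2D CNN.

    rgb:       uint8 array, shape (H, W, 3)
    depth:     float array of metric depths, shape (H, W);
               zeros can mark missing sensor readings
    max_depth: hypothetical clipping range, in the same units as depth
    """
    rgb_norm = rgb.astype(np.float32) / 255.0            # colour -> [0, 1]
    depth_norm = np.clip(depth.astype(np.float32) / max_depth, 0.0, 1.0)
    # dstack concatenates along the channel axis: 3 colour + 1 depth = 4
    return np.dstack([rgb_norm, depth_norm])

# Example: a 4x4 image with a uniform 1 m depth map
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
depth = np.ones((4, 4), dtype=np.float32)
x = make_rgbd_input(rgb, depth)
print(x.shape)  # (4, 4, 4)
```

The first layer of the network then just needs `in_channels=4` instead of 3; everything downstream is a standard 2D CNN.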
I've done a lot of work in this area, and I can say that this is significantly faster and higher-quality than the other Kinect-based 3D reconstruction techniques out there, such as RGBDemo ( http://www.youtube.com/watch?v=Cldf7UdFq1k ).
It's also clear just how much of an advantage having a 3D sensor is for reconstruction when you compare this against 2D-camera-based 3D reconstruction software like Photofly.