I believe 2.4 GB/s is the spec rate for this chip. Their demo board mixes three 1080p feeds into a single 1080p output, which suggests it is pretty usable.
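As a rough sanity check on that demo, here is a back-of-the-envelope sketch. The frame rate (60 fps), pixel size (4 bytes), and the idea that each stream crosses the memory bus exactly once are my assumptions, not anything Xilinx has stated; only the 2.4 GB/s figure and the three-in, one-out 1080p setup come from above.

```python
def stream_bytes_per_sec(width, height, bytes_per_pixel, fps):
    """Raw bandwidth of one uncompressed video stream, in bytes per second."""
    return width * height * bytes_per_pixel * fps

SPEC_BYTES_PER_SEC = 2.4e9   # the 2.4 GB/s figure quoted above

# Assumed, not from the demo: 4 bytes per pixel at 60 frames per second.
per_stream = stream_bytes_per_sec(1920, 1080, bytes_per_pixel=4, fps=60)

# Three input feeds plus one mixed output, each assumed to cross the bus once.
total = 4 * per_stream

print(f"per stream: {per_stream / 1e9:.3f} GB/s")
print(f"four streams: {total / 1e9:.3f} GB/s")
print(f"share of spec rate: {total / SPEC_BYTES_PER_SEC:.0%}")
```

Under those assumptions the four streams come to roughly 2 GB/s, which fits under the spec rate but without much headroom. A lower frame rate or a packed pixel format would change the picture considerably.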
One of the key differences between this chip and other architectures is that the ARM system boots first and has some ability to do reconfiguration. (All of the key subsystems the ARM core needs to boot are hard blocks.)
I don't hold out a lot of hope for open floorplanning tools from Xilinx, but they do have a Linux toolchain, so hopefully it will be possible to do native development.
Xilinx's block diagram shows the FPGA having direct access to both the peripheral bus and the memory bus, including the cache coherency port, so I'd guess you can transfer data to the ARM cores fairly fast (as fast as you can get something from DRAM into L2 cache).