It depends on what you want to do. FPGAs excel at periodic, "always on" workloads that need deterministic timing and low latency. If you don't need that, and only care about total throughput rather than energy efficiency, then Nvidia will sell you more TFLOPS per chip.
The energy efficiency of FPGAs can't be overstated. Reducing the clock and voltage to levels comparable to an FPGA will kill your GPU's TFLOPS, and the control overhead and the energy spent on data movement are unavoidable in a GPU.
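To put rough numbers on the clock/voltage point, here is a minimal sketch using the textbook first-order model for dynamic power, P ≈ C·V²·f. The function names and the input figures are made up for illustration, not measurements of any real GPU or FPGA, and the model ignores leakage as well as the fixed control and data-movement overhead mentioned above.

```python
def scale_dynamic(power_w, throughput, v_ratio, f_ratio):
    """Scale dynamic power and throughput under the first-order model P ~ C * V^2 * f.

    power_w and throughput are hypothetical baseline figures;
    v_ratio and f_ratio are new/old voltage and clock ratios.
    """
    new_power = power_w * v_ratio ** 2 * f_ratio  # power falls with V^2 and f
    new_throughput = throughput * f_ratio         # throughput falls linearly with f
    return new_power, new_throughput


def energy_per_op(power_w, throughput):
    """Energy per unit of work (power divided by rate)."""
    return power_w / throughput


# Illustrative baseline: 100 W for 10 units of throughput.
base_power, base_tput = 100.0, 10.0

# Halve both voltage and clock (assumed ratios, chosen for easy arithmetic).
low_power, low_tput = scale_dynamic(base_power, base_tput, v_ratio=0.5, f_ratio=0.5)

print(low_power, low_tput)
print(energy_per_op(base_power, base_tput), energy_per_op(low_power, low_tput))
```

In this idealized model, energy per operation improves with V², but throughput drops in step with the clock. Any overhead that doesn't scale down with voltage would eat into that gain, which the sketch does not capture.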