> Word counting is primarily IO-bound, and it is much too expensive to ferry the file contents all the way to the GPU over the (relatively) slow PCI Express bus just to do a relatively meagre amount of computation.
After seeing that it's possible to play Crysis using software rendering on an AMD Rome CPU with 128 hardware threads [1], might this lead to some vindication for AMD sticking with OpenCL (assuming such a CPU is exposed via OpenCL)? Or is it just simpler to ignore that (in general and for Futhark) and use regular threads for parallelizing across many CPU cores?
CUDA is proprietary to NVIDIA, and is pretty much the standard for GPU computing. AMD's been chipping away with OpenCL, Vulkan/GLSL, HCC (https://github.com/RadeonOpenCompute/hcc/wiki), etc., but without much luck so far. I wouldn't say AMD's been "sticking with" OpenCL; if anything, it seems like they will deprecate it in a few years, as the plan is to fold OpenCL into Vulkan.
[1] https://news.ycombinator.com/item?id=21339652
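For what it's worth, the "regular threads" option for word counting could look roughly like the sketch below: split the input at whitespace so no word straddles two chunks, count each chunk in a worker, and sum the results. The names `split_at_whitespace`, `count_words`, and `parallel_word_count` are made up for illustration and are not from Futhark or any library. Note that in CPython the GIL will likely limit the actual speedup from threads, so treat this as a picture of the split/count/sum structure rather than a benchmark-ready implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_at_whitespace(data: bytes, parts: int) -> list:
    """Cut data into roughly `parts` chunks, pushing each cut forward
    to the next whitespace byte so that no word is split in two."""
    chunks = []
    n = len(data)
    step = max(1, n // parts)
    start = 0
    while start < n:
        end = min(n, start + step)
        # Move the cut forward until it lands on whitespace (or the end).
        while end < n and not data[end:end + 1].isspace():
            end += 1
        chunks.append(data[start:end])
        start = end
    return chunks


def count_words(chunk: bytes) -> int:
    """Count whitespace-separated words in one chunk."""
    return len(chunk.split())


def parallel_word_count(data: bytes, workers: int = 4) -> int:
    """Count words by summing per-chunk counts computed in a thread pool."""
    chunks = split_at_whitespace(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_words, chunks))
```

Usage would be something like `parallel_word_count(open(path, "rb").read(), workers=8)`, where `path` is whatever file you want to count. Reading the file is still the part that the IO-bound argument above is about.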