Yes, at least in Poland. You can either use Blik (mobile payments integrated with your bank account) or get redirected to your bank's website for a fast wire transfer.
From personal experience, it doesn't work that well. The available 3D models [1] are not detailed enough, and the effort to make realistic and diverse renders is huge. It will kinda work, but it won't generalize well to real images.
That is something I don't really understand. A 3D model gives you the complete geometry of each part in one dataset, which should make it easier to learn and recognize that part in the real world, shouldn't it?
Is it because these ML algos lack a way to internally interpret 3D models for learning?
On top of that, you reduce the data-labeling effort, as each model would come with the relevant part ID, shape, and color.
Of course, fine-tuning can be done afterwards with real-world photos to increase robustness.
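To illustrate the labeling point: since the generator controls the scene, every rendered image can be emitted with its ground-truth label for free. A minimal sketch in Python, where render_part() is a hypothetical helper wrapping a Blender (or other) render call:

    import json
    import random

    # Synthetic data comes pre-labeled, because the generator already
    # knows the part ID, shape, and color of everything it renders.
    parts = [
        {"part_id": "3001", "shape": "brick 2x4", "color": "red"},
        {"part_id": "3020", "shape": "plate 2x4", "color": "blue"},
    ]

    dataset = []
    for part in parts:
        for i in range(10):  # a few random viewpoints per part
            yaw = random.uniform(0.0, 360.0)
            image_path = f"renders/{part['part_id']}_{i:03d}.png"
            # render_part(part, yaw=yaw, out_path=image_path)  # hypothetical
            dataset.append({"image": image_path, "yaw": yaw, **part})

    # The label file falls out of the generation loop; no human annotation needed.
    with open("labels.json", "w") as f:
        json.dump(dataset, f, indent=2)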
I think it's because we classify single images, which lose the 3D information. We expect models to work from a single image, but not even humans do that (we rely on stereoscopic vision) if we can help it.
Computer vision has become very good, but whatever exactly it is doing, it is still not the same as human vision, and this is one of the places where that really sticks out. A human can learn to recognize a real-world object from a bunch of super-perfectly-pristine 3D renders. Heck, we can do it from a single pristine 3D render, and not even necessarily a high-quality/high-resolution one. Whatever computer vision is doing, it is something less "powerful" than human vision, which we then make up for by throwing a lot more computation and resources at it, and that covers over a lot of the problems.
If you can figure out exactly why that is, you will at the very least get a very well-cited paper out of it, if not win some sort of award. It's not a complete mystery what the problem is, but the solution is unknown.
Because we don't know what the difference is, we can't fix up the 3D renders. "Just make it noisy" certainly isn't it. (We in fact have a lot of experience throwing noise at things; the whole Stable Diffusion approach to AI image generation is based on that principle at its core.) It has to be the right sort of noise, or the right distortions, to represent the real world, and nobody can tell you exactly what that is.
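For what it's worth, the usual attempt at "the right sort of noise" is domain randomization: aggressively randomize color, blur, perspective, and crop on the renders so the model stops caring about renderer-specific detail. A sketch with torchvision (the magnitudes are illustrative guesses, not known-good settings; tuning them is exactly the open problem):

    import torchvision.transforms as T

    # Domain-randomized augmentations applied to renders before training.
    augment = T.Compose([
        T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
        T.RandomPerspective(distortion_scale=0.3, p=0.5),
        T.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
        T.RandomResizedCrop(224, scale=(0.6, 1.0)),
        T.ToTensor(),
    ])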
Yes, you can train an ML model on rendered data, but the model tends to fixate on rendering artifacts, and performance doesn't transfer to real-world images. Plus, it's very difficult to generate scenes with the variety and complexity of the real world, so your trained model will fail to generalize to all the distractions in natural scenes.
Yes, there are techniques for all these problems, but none of them are good, reliable, or easy to get right.
Yes, I think it should be possible without any technological hurdles. It's just some work to set it up.
(DL models are trained with one type of camera and used with another type of camera all the time, which is sort of similar. Renderers like the ones in Blender are pretty good, and they should work well with LEGO bricks, which are relatively simple objects. If despite all this it doesn't work, you could degrade both the real and the generated images, e.g. by quantization, with the aim of bringing them closer together.)
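A minimal sketch of that degradation idea with Pillow: quantize both real photos and renders down to a small shared palette, so both domains lose camera- or renderer-specific detail (the file names, 32-color budget, and 128x128 size are arbitrary choices for illustration):

    from PIL import Image

    def degrade(in_path, out_path, colors=32, size=(128, 128)):
        img = Image.open(in_path).convert("RGB").resize(size)
        # Palette quantization throws away fine color detail in both
        # domains alike, hopefully bringing them closer together.
        img = img.quantize(colors=colors).convert("RGB")
        img.save(out_path)

    degrade("real_photo.jpg", "real_degraded.png")
    degrade("render.png", "render_degraded.png")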
RebrickNet is an abandoned project that supports only 300 parts. I recommend Brickognize [1], which recognizes all 85k Lego parts, minifigures, and sets.
That's a weirdly minimalistic site. I mean, it's nice that the front page is exactly what you want to see, but I wish the About page showed some more information: what the site can do, some examples, some technology bits, etc.