
>I now strongly encourage explicit loops for everything, and hope the compiler unrolls it properly.

I get why this is a thing. Sometimes an unrolled loop is faster. But if this is really an issue, why isn't there a [UnRoll] modifier or a preprocessor or something that handles that for you?

Something like this:

  for (int i = 0; i < x; i++;) {
    dothing(x[i]);
  }
versus:

  unroll for (int i = 0; i < x; i++;) {
    dothing(x[i]);
  }
Only the compiler / preprocessor would unroll the second one. You have the best of both worlds with a reduced chance of subtle errors.


This is essentially the thinking behind the 'register' keyword. The idea was to make it possible to mark which variables were supposed to go in registers and which could be stored in memory. That may have made sense back in the 70's, but these days, the compiler's heuristic is usually better. This also applies to the 'inline' keyword. Maybe you're right... but maybe you're wrong and inlining the function blows the cache, etc.

I think the same logic applies to a putative 'unroll' keyword. Even if it's a short-term win, the environmental properties that make it a win are likely to change before the code is retired. To me, that argues for relying on the heuristic.

One note on this: MSVC has both the usual 'inline' keyword and a stronger, proprietary '__forceinline' keyword. __forceinline overrides the heuristic and forces the function to be inlined even if the compiler doesn't agree it makes sense. I can see how that kind of compiler-specific annotation might be useful tactically (i.e. you've found the compiler making the wrong choice on a specific platform and you want to overrule it). But not a full-fledged language keyword...
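
For what it's worth, here is roughly how people wrap that kind of annotation so it stays out of the code proper. This is only a sketch: FORCE_INLINE, clamp_to_byte and sum_clamped are names made up for illustration, and the GCC/Clang branch uses their always_inline attribute, which is not something the comment above mentions.

```c
/* FORCE_INLINE is a hypothetical macro name, not a standard one. */
#if defined(_MSC_VER)
#  define FORCE_INLINE static __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE static inline __attribute__((always_inline))
#else
#  define FORCE_INLINE static inline  /* plain hint; the compiler decides */
#endif

/* Small helper we insist on inlining (illustrative only). */
FORCE_INLINE int clamp_to_byte(int v)
{
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

/* Sums n samples after clamping each one to the range 0..255. */
int sum_clamped(const int *samples, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += clamp_to_byte(samples[i]);
    return total;
}
```

Keeping the override behind one macro means that if the heuristic gets better later, you delete one line instead of hunting through the code.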


And letting the compiler decide means you can choose between -Os and -O3 later depending on how your constraints change. Really, for performance the only keywords you should be using are 'const' and 'static'. Both just tell the compiler it's free to make certain kinds of optimizations that it might not otherwise be able to prove are allowed.
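
A tiny sketch of what that looks like in practice, with made-up names (square, weights, weighted_square_sum). The comments describe what the compiler is permitted to do, not what any particular compiler will actually do.

```c
/* static: internal linkage, so every caller is visible in this file and the
   compiler may inline the function and drop the standalone copy. */
static int square(int v)
{
    return v * v;
}

/* static const object: it cannot legally be modified and is not visible to
   other files, so its values may be folded directly into the code. */
static const int weights[4] = {1, 2, 3, 4};

/* Computes sum of weights[i] * in[i]^2 over four inputs (illustrative). */
int weighted_square_sum(const int *in)
{
    int total = 0;
    for (int i = 0; i < 4; i++)
        total += weights[i] * square(in[i]);
    return total;
}
```

Whether the loop ends up unrolled or the table folded away is then up to -Os vs -O3, which is the point.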


Many compilers let you supply pragmas that give hints about how many iterations a loop will usually run, whether the count will be a multiple of x, etc. The compiler can then use this info to decide whether to apply various optimisations. I think it's not part of C because it's kind of an underlying detail: explicitly telling the compiler how to optimise something shouldn't really be part of the language, and pragmas are good for this.
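
As a concrete case, this is about as close as you get to the 'unroll for' from the original question. It's a sketch under assumptions: it uses Clang's '#pragma unroll' and GCC's '#pragma GCC unroll' (GCC 8 or later), sum_ints is a made-up function, and the factor 4 is arbitrary. It only requests an unroll factor; iteration-count hints are spelled differently per vendor and aren't shown.

```c
#include <stddef.h>

/* Sums n ints. The pragma asks for the loop to be unrolled by 4; compilers
   that match neither branch simply see a plain loop. */
int sum_ints(const int *data, size_t n)
{
    int total = 0;
#if defined(__clang__)
#  pragma unroll 4
#elif defined(__GNUC__) && __GNUC__ >= 8
#  pragma GCC unroll 4
#endif
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

The loop body and its result are the same either way, so you don't get the subtle errors of unrolling by hand.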


This kind of preprocessor / compiler-specific keyword is available in most embedded C/C++ compilers.

See http://www.keil.com/support/man/docs/armcc/armcc_chr13591249...

For example.


The compiler has heuristics for unrolling, and ought to do it automatically when appropriate. Sometimes unrolling hurts performance, as you make the code larger and therefore reduce i-cache performance.


You mean something like...

    __attribute__((optimize("unroll-loops")))
? :-)
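
In case it helps, here is how that attribute attaches to a whole function. Treat it as a sketch: UNROLL_LOOPS and scale_in_place are made-up names, and the guard is there because the 'optimize' attribute is GCC-specific (Clang does not implement it). As far as I recall, the GCC manual also describes 'optimize' as a debugging aid rather than something for production code.

```c
/* UNROLL_LOOPS is a hypothetical macro; it expands to nothing on compilers
   other than GCC, so the function still builds everywhere. */
#if defined(__GNUC__) && !defined(__clang__)
#  define UNROLL_LOOPS __attribute__((optimize("unroll-loops")))
#else
#  define UNROLL_LOOPS
#endif

/* Multiplies each of the n values by factor, in place. On GCC the whole
   function is compiled as if -funroll-loops had been passed for it. */
UNROLL_LOOPS
void scale_in_place(int *values, int n, int factor)
{
    for (int i = 0; i < n; i++)
        values[i] *= factor;
}
```

Note that this applies per function, not per loop, so it's coarser than the 'unroll for' idea.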


btw, there shouldn't be a semicolon past i++ (i.e. the code won't compile)

any compiler worth its salt should be able to unroll w/o explicit demand from the developer.



