What's your C migration plan? (wingolog.org)
40 points by lucasr on Oct 13, 2011 | 57 comments


You know what I've learned over the years? While many people are very bad at writing C, I'm pretty good at it. So C is not a bad language; there are just a lot of bad programmers around using it.

(And if you're wondering how to write good C? Make object ownership explicit. Never have malloc and free more than a page of code apart. Don't expose data structures as API. Make errors put your routine into a defined state that the caller can understand. Use bstrings instead of cstrings.

And although I don't write much C++, the more I learn about it, the more I believe it's possible to write good C++. Most people won't spend the time it takes to write good software, and when you skimp on that in C or C++ the result is an absolute disaster. But if you go slowly, plan, think, exercise care, and review your work regularly, you can get good code that runs fast.

You can write a safe Haskell application in 10x less time than you can write a safe C++ application, though.)
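
As a minimal sketch of what the ownership advice above might look like in practice (the struct and function names are illustrative, not taken from the thread): the creating routine owns the memory, the matching free sits right next to the matching malloc, and failure leaves the object in a defined, testable state.

    #include <stdlib.h>
    #include <string.h>

    /* A buffer that owns its memory; whoever calls buf_init must
     * eventually call buf_free on the same object. */
    struct buf {
        char  *data;   /* owned by this struct */
        size_t len;
    };

    /* On failure, *b is left zeroed -- a defined state the caller can test. */
    int buf_init(struct buf *b, const char *src)
    {
        memset(b, 0, sizeof *b);
        size_t n = strlen(src);
        b->data = malloc(n + 1);
        if (b->data == NULL)
            return -1;               /* error reported, object still valid */
        memcpy(b->data, src, n + 1);
        b->len = n;
        return 0;
    }

    /* The matching free lives right next to the matching malloc. */
    void buf_free(struct buf *b)
    {
        free(b->data);
        memset(b, 0, sizeof *b);
    }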


If defects are at best constant over number of lines of code, then just the extra situps C makes you do are inevitably going to create reliability issues. But we all know that it's not that simple, and that some languages are more defect-prone than others; that leaves you to make the case that C is more resilient than most high level languages.

Amusingly (just in this context), when Joe Damato makes fun of Ruby reliability, what he's actually making fun of is the terrible C code MRI is built out of.

It's interesting to look at how Apple has dealt with this problem. iOS developers write a dialect of C --- in fact, a dialect that is nominally less safe than C++. But idiom in iOS keeps most iOS programs away from unsafe code patterns. You can try to tokenize a string in an iPhone program by taking its char* (even that is a pain to get because of character encoding) and then strsepping it, but the whole rest of the programming environment works against you when you do.
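
A rough sketch of the strsep() pattern being described, with the usual trap called out in comments (strsep is a BSD/glibc extension, not ISO C; the request-line input is purely illustrative):

    #define _DEFAULT_SOURCE          /* strsep/strdup are not ISO C */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        /* strsep() writes NULs into the buffer you hand it, so you must
         * pass a private, writable copy -- never a string literal or a
         * buffer somebody else still owns. */
        char *copy = strdup("GET /index.html HTTP/1.1");
        if (copy == NULL)
            return 1;

        char *cursor = copy;         /* strsep advances this pointer */
        char *tok;
        while ((tok = strsep(&cursor, " ")) != NULL)
            printf("token: %s\n", tok);

        free(copy);                  /* free the original pointer, not cursor */
        return 0;
    }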


Hail, god amongst men. Once upon a time I thought I was a very good C programmer. Today I know myself to be a terrible programmer. I surround myself with safety mechanisms such as -Wall -Wextra -Werror and the clang analyzer, but even these tools cannot stop me from shooting myself in the foot time and time again. Please enlighten me, such that I may learn to write safe code in an unsafe language infallibly.


The only effective safety mechanism for C code is size. You must treat everything you write in C as though it's a tiny standalone library, and then build your application from those libraries. "Frameworks" do not work here. (Unless, that is, you're going to go insane and implement a refcounting gc and double-indirection pointers, like C++ programmers do. That works, but at that point, you've lost all speed and simplicity benefits and you might as well just use your favorite Java substitute instead. Remember: you write C because you want other stuff to use it. When you implement your own memory model and semantics, you lose that. And then you have if statements, integers, goto, and segfaults.)
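
A sketch of the "tiny standalone library" shape being described, using a made-up counter module: one opaque type, a handful of functions, and nothing about its internals visible to the application that links it.

    /* counter.h -- nothing about the representation leaks out */
    struct counter;                             /* opaque to callers */

    struct counter *counter_new(void);          /* NULL on allocation failure */
    void            counter_add(struct counter *c, long n);
    long            counter_value(const struct counter *c);
    void            counter_destroy(struct counter *c);

    /* counter.c -- the whole implementation fits on a few lines */
    #include <stdlib.h>

    struct counter { long value; };

    struct counter *counter_new(void) { return calloc(1, sizeof(struct counter)); }
    void counter_add(struct counter *c, long n) { c->value += n; }
    long counter_value(const struct counter *c) { return c->value; }
    void counter_destroy(struct counter *c) { free(c); }

The application is then built by composing many such pieces, each small enough to review in one sitting.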


Each time you put a bug in, make sure to take it out afterwards. Nobody cares how many bugs you put in, only how many are left :)


Oh! It's so easy. I can't believe I didn't think to do that before.

... but every time I do find a bug, I am surprised to find it. Thus it stands to reason there are bugs I have not yet found. How do I find all of them?


Ah, well you seemed most concerned about the sheer quantity of foot-shot-off type bugs you put in, so I thought I'd put your mind at rest about that first. Sadly, that part is inevitable.

As for finding them before anybody else does, I suggest all of the following, which is what I do, because I've found it to work for me:

* Never run outside the debugger unless you have previously verified inside the debugger that the run will work - this ensures that you are in a position to fully investigate any problems that might occur. (Some programs can't be verified ahead of time, because they're non-deterministic. (Hopefully due to non-deterministic sources of input, rather than anything in the code!) You just never run these outside the debugger.)

* Never run an optimised build if you can help it (this is what the QA department and your users are for :) - again, this ensures you are in a good position to investigate any issues you discover. (Note - you should feel confident about using the debugger with an optimised build, most likely using the disassembly view, but I figure life is too short to be doing this all the time.)

* assert, lots, about everything, particularly pointers into buffers and indexes into arrays.

* -Wall, -W4, -Whatever

* Investigate memory manager's maximally-anally-retentive mode. If it doesn't have one, find a memory manager that does, and use that one instead. Switch it on and fix every complaint it might have.

* Find every debug option and #define for every library you use, and switch all of them on.

* Try to avoid having any multithreading.

* Make sure you know as many dark corners of the language as possible, and commit them to memory, so that you'll be able to spot them when people start making use of them accidentally.

* Add in some kind of crash dump/stack trace/etc. display to your program, so if it should crash during actual use then you stand some chance of getting some useful information back. (Avoid frame pointer omission for this reason. A rough sketch of one way to do this appears below.)

Unfortunately I can't guarantee that this will necessarily work, but I've found the above to at least have helped me minimise the sort of creepy, freakish bugs that you only get in C and its ilk.
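
As a concrete illustration of the crash dump bullet above, here is a rough sketch using the glibc/BSD backtrace() extension (not ISO C, and real handlers need more care than this; the function names are just an example):

    #include <execinfo.h>
    #include <signal.h>
    #include <unistd.h>

    /* On a crash, dump whatever stack trace we can before dying.
     * Build without frame pointer omission (e.g. -fno-omit-frame-pointer)
     * so the trace is actually useful. */
    static void crash_handler(int sig)
    {
        void *frames[64];
        int   n = backtrace(frames, 64);

        /* backtrace_symbols_fd avoids malloc, which may be unusable here */
        backtrace_symbols_fd(frames, n, STDERR_FILENO);
        _exit(128 + sig);
    }

    static void install_crash_handler(void)
    {
        signal(SIGSEGV, crash_handler);
        signal(SIGABRT, crash_handler);
    }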


Care. I recommend "Programming Pearls" as a good introduction to this concept.


How careful are you with your code? What would you be willing to stake on that answer?


I'm not saying I care about my code. I'm just saying that one can write bug-free code if one takes the time to do it.


Do you have a piece of code you feel confident about, such that you'd be able to say "here is an existence proof of carefully-written resilient C code of some significant length"? I'd like to take you up on an offer to validate that proof.


I do, but it's unfortunately internal at work. Everything I write for fun has so many library dependencies that you just have to cross your fingers and hope that it will all work as documented :)


Very true. I think most hate for C is really misplaced or for completely the wrong reasons.

I like to think I'm a good C programmer, and I use most of the techniques you describe. But lately I've been wondering whether I'm actually just writing object-oriented code in an inefficient manner. Do you try to avoid falling into that OOP pitfall, or do you embrace OOP, and if so, what reasons do you have to use C over Java or C++?


C is a language for people that care about programming. Like, you treat it as a lifestyle, not a career or side activity. You read books on it cover to cover. You continually seek to simplify and make your code more clear.

If that describes you, it's a damn fine tool.


This is just nonsense. C is perfect for what it is supposed to do: systems programming, the kind of software you write when creating operating systems, debuggers, and device drivers. It will hardly ever stop being useful for these tasks.

People that make this kind of complaint were using C for the wrong type of software. If you want to write a web application in C you just need to reevaluate your tool set.


This objection is a hyperbolic response to an imprecisely phrased blog post.

The author points out that C is gradually being superseded by languages like Python. The author is right. Every year, less new C is being written, and for good reason.

You point out that there are obviously environments (like your life support valve driver, as Julio Capote says) where C is going to be the most appropriate choice for years to come.

We can be reasonable people and accept that there's truth to both of these sentiments, or we can synthesize a phony controversy in which we compete to defend extremes and get nowhere.

If we're reasonable, I think the author's point is well taken. Less C/C++ is written every year. In part that's due to the web and the increasing power of (what we used to think of as) "embedded" environments. But in part, it's because people who would have reached for C as their go-to language for some cases are (properly) going to stop doing that.

As someone who spends most of his working days looking at other people's projects for flaws like the ones this guy is alluding to, particularly Python and Ruby projects: when you want to find something particularly fun and gruesome, you go straight for the C extensions.


Going purely on the number of processors that run exclusively on C code, I'd say more new C is being written, and for good reason. I'm referring to the firmware running on the dozens of devices you have in your home, the ones you don't see, don't even know exist.

These devices will only continue to grow, so more C will be written not less. More Python, Ruby etc will be written too, because more software will be written in general. The proliferation of software isn't a zero sum game and probably won't be for a very long time.


I think it's pretty clear that we're talking about the proportion of all software written that will happen to be written in C. If you'd like to talk about something else, that's fine, but it's not what I'm talking about and I'm not interested in the semantic debate.


You could make the same argument about most languages - programmers have more viable languages to choose from than they did just 10 years ago. But even on this count, C's share of the 'market' is increasing or stable, and not decreasing.

My point is you can't talk about decreasing C usage without acknowledging that a huge chunk, probably a majority, of C code is for embedded applications. That segment is increasing, not decreasing. And C is used on the lion's share of embedded apps, possibly upwards of 90%.


While I agree with your comment, I would also point out that the fastest, most reactive and most scalable webapps I have seen were written in C.

Some people out there actually know how to use C, and instead of battling with async JS to get a 5000 hits/s hello world, they code a webapp that can easily handle hundreds of thousands of requests/sec without a blink.

Granted, it's less and less common.


Interesting, do you have an example?



You're suggesting that Varnish is what one would normally think of as a "web application"? Not Twitter?


Exactly. It's sad that in the modern world there are people (smart, productive people even) who are so far removed from system stuff that they just plain forget that this software still needs to be written.

Now, is C the best possible tool for these jobs? Certainly not. But it's the one we have and it's got a pretty fantastic track record.


As regards quality and reliability in consumer software, C has a godawful track record.

The heart valve code inherits constraints that make C a lot safer: it has an extraordinarily limited feature set, its functional interfaces are simple, it changes rarely, and the time-to-market pressures it faces are dwarfed by other factors like certification and manufacturing.

Consumer software is richly functional, has complex interfaces, changes constantly, and is written under ridiculous scheduling pressure. In that environment, C does indeed give you software that is as likely to enable someone to install a trojan on your system as it is to properly render an image.


Kind of ironic considering how much security industry related software is written in C/C++ (snort, dragon, bro, nessus, nmap, ...)


You want to have a conversation about the code quality of a typical C-code security tool? :)


To say that wingo has forgotten that systems software needs to be written is both insulting and flat-out wrong. He is a language implementor (Guile). Of course he knows that it is still needed and what it is needed for.


I doubt that the tech industry will ever move completely away from C. (I had a friend who was doing graduate work in physics and he was required to do all his work in Fortran, believe it or not, so old languages do stick around).

For "mainstream" (read: enterprise) software, yes, C may fall out of favor, but it's a great language and here to stay.

There are just too many cases where C is the right tool for the right job and it would be silly to use a different language.

I'm thinking particularly of embedded software. While languages like Lua are great for embedded scripting, they still need something to script against.

So, C isn't going anywhere and I expect that 100 years from now it will still be a worthwhile language to program in.

(Also, C is very much the Latin of modern programming. Learning it helps your understanding of so many other things. In case you're curious, Lisp is Greek and I mean that affectionately).


You think in 100 years we'll be writing in a language whose only real data type is an integer, overloaded to represent human language, memory addresses, and (by extension) the identities of functions?

What a depressing thought.

I love writing C code (the model of programming where virtually anything is possible by modifying and reinterpreting integers is often a fun one), and it's my best language, but I hope we grow out of it sooner than later.


As long as we're running our software on a Von Neumann machine, there will be a need for a language which is just a thin layer on top of it. Considering we already have one, it seems hard to believe that it will go away without a fundamental change in how computers work. (Which is not to say we won't have better languages - just that we will always have the bare-metal one as well)


C will go away alongside modern hardware design "standards." Until that happens someone will always find a use for talking as directly as possible to the hardware (which, of course, doesn't care about types at all). Whether that'll take a hundred years, or a thousand, or ten I don't know.

Of course it's possible to write a lot of things in, say, Python, then rewrite the slow bits in C. The number of programs that absolutely must be written close to the metal is pretty small in absolute terms.


> C isn't going anywhere and I expect that 100 years from now it will still be a worthwhile language to program in.

I often wonder what a more radical evolution of C might look like. What will "ISO C[20]99" look like? gcc has many clever compiler warnings, but a safer superset/subset language that is mostly source compatible might be valuable.


C is the best thing we have that's truly cross platform and has consistent performance everywhere without any company to claim ownership of it.

Until that changes, C is here to stay.


The nice thing about an article like this is that anyone who is receptive to it truly shouldn't be writing C code, and anyone who should be writing C code is (in almost all likelihood) not receptive to it.


As a game developer this post is rather amusing. I have far more questions about what happens behind the scenes on the python line than the C line.


It really depends on what you are writing. If you're trying to write the next great startup web app then, yeah, use python or ruby or perl--C is probably too finicky for the fast pace of a web startup. But if you're writing systemd (an "init" replacement) you'd be the object of ridicule if you wrote it in any of those languages.


I've recently been using Python more, rather than Java, and I find myself writing C more than ever. I always seem to find some speed/memory issue in my programs that goes away when rewritten in C. Maybe I'm just not proficient enough in Python to avoid them yet. I would say it's still a net gain over raw C.


Is it that easy to write in C and call from Python that you don't think twice about it?


Yes, it actually is. Writing Python extensions in C was always easy, but now there's ctypes, and you don't even need to do that anymore. You can just call a shared library function directly from Python.
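
A minimal sketch of what that looks like from the C side (the library name and function are hypothetical; the Python call is shown only in a comment, since ctypes needs no binding code at all):

    /* fastdot.c -- an ordinary shared library with no Python-specific code.
     * Build with something like:  cc -shared -fPIC -o libfastdot.so fastdot.c
     *
     * From Python, ctypes can then call it directly, roughly:
     *     lib = ctypes.CDLL("./libfastdot.so")
     *     lib.dot.restype = ctypes.c_double
     *     lib.dot(a, b, n)      # after declaring argtypes as well
     */
    double dot(const double *a, const double *b, int n)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += a[i] * b[i];
        return sum;
    }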


You might want to use a profiler first, then try PyPy. Then Cython.


I didn't mean to imply I didn't profile (otherwise I wouldn't know what parts to write in C). I've been doing profiling (and trying various non-C ideas on what may help first). I'll have to look into PyPy and Cython, thanks.


What can this possibly mean? C is everywhere. It is in the code that runs the VM of your favorite language. It is in most compilers. It provides basic system services. Why should we have a plan to migrate from something that is everywhere?


But I don't want to use his language, I want to hack on it with him. To see how he solves problems. So, C.

Unless it's something like Factor, where most of the implementation is written in Factor.


Changing languages will not save you from security holes, only change the kind of mistakes you have to worry about. Only knowing how to write secure code can save you from security holes. You cannot ignore how computers work, no matter how many layers of abstraction and libraries you pile on in an effort to shield yourself from reality.


Changing languages will absolutely save you from classes of security holes. Java programmers do not have to keep a working model of object memory lifecycles in their head to build safe programs. C programmers, on the other hand, still do have to care about the metacharacters of the HTML DOM and SQL. In fact, because it's a bitch to rewrite retained char* strings in place without pegging malloc() to the top of your gprof profile, C programs are actually slightly more prone to the kinds of security holes you're implying plague languages like Java.

Even if you stretch to find a class of flaw that only affects a language like Perl (for instance, the ease with which Perl allows you to write code where metacharacters will allow an attacker to stuff commands into a shell), you simply have to go back 10 years or so to see the 8lgm posts that did the same thing to C programs all over the Internet.
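
To illustrate the point about rewriting retained char* strings: even a single escape like turning '<' into "&lt;" makes the output longer than the input, so it cannot be done in place and forces an allocate-and-copy dance. A sketch (illustrative only, not a complete HTML escaper):

    #include <stdlib.h>
    #include <string.h>

    /* Returns a newly malloc'd copy of `in` with '<' escaped as "&lt;",
     * or NULL on allocation failure. The caller owns the result. */
    char *escape_lt(const char *in)
    {
        size_t extra = 0;
        for (const char *p = in; *p; p++)
            if (*p == '<')
                extra += 3;              /* "&lt;" is 3 chars longer than "<" */

        char *out = malloc(strlen(in) + extra + 1);
        if (out == NULL)
            return NULL;

        char *q = out;
        for (const char *p = in; *p; p++) {
            if (*p == '<') {
                memcpy(q, "&lt;", 4);
                q += 4;
            } else {
                *q++ = *p;
            }
        }
        *q = '\0';
        return out;
    }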


And yet, this still ignores the rather amusing fact that the more libraries and VMs and whatnot in between you and the computer hardware, the more chances your program will suffer a security hole because of a mistake in one of those libraries. Not too long ago someone found a bug in PHP that would cause it to go into an infinite loop if you put in a wrong number. These kinds of things happen, so if you're going to write secure code, the only thing that will save you is being really good at writing secure code on whatever platform you are choosing to write code for. The only thing that changes is the types of security holes you have to worry about. Instead of buffer overflows, you have to know that function x does x and in some cases function y is more secure but only if there's a solar eclipse, etc. The higher-level the language you use, the more complexity the system has, which increases the possible security flaws even as the language reduces other security flaws. The end result is simply that any given language will move the problem around, not actually solve it.


The bug you're talking about is a result of C code. I'm not sure what point you're hoping to make by observing that the C code underpinning high level languages is prone to vulnerabilities; I feel like that point is rather more supportive of my argument.


It's fitting that his site is timing out.


I am unable to load the article. Is there a mirror?



My suspicion is that this is not a C programmer at all. Claiming that you've "written loads of it" certainly lends the author a cloak of credibility, but I don't buy it.

"This one, you really can't tell."

A competent C programmer can absolutely discern what is happening... more important, though, is that the two examples don't do the same thing.

I stopped reading at that point.


Upon hitting his website, you were one single click away (the "software" link, next to "about") from learning that he's the co-maintainer of Guile, a C implementation of Scheme.


The author of that (badly timed, btw) post probably does not know that Python is implemented in C.


The author of that post is a C programmer. This isn't just snark; it's dumb snark.


A C programmer posting that kind of nonsense is even more stupid than a non-C programmer doing it.


No, CPython is implemented in C. Python has been implemented in many languages, including C# (IronPython), Java (Jython) and in Python itself (PyPy). The latter, I might add, is maturing incredibly rapidly.



