I think this is an awesome patch. Hopefully more people will follow suit. The first time someone was trying to describe a configuration using 'master' and 'slave' to me, the terminology was really distracting, and I could barely pay attention without remembering all of the lessons about U.S. history from 8th grade. I hope this change in terminology will help make software development more welcoming and accessible to more people.
This is an awesome blog post! Could the fact that the bug was stochastic have something to do with multithreading? Also, how do you use the BIOS to zero out the section of memory?
No, given the early stage of boot at which it crashed, no threading was happening. The randomness was because the initial zeroing out of the kernel's global and static variables might or might not happen, as a result of a physically random process (electrical discharge), instead of being ensured by software.
Most bootloaders (well, a BIOS usually refers to one step before the bootloader, but still) have a pretty primitive command shell, through which you issue the commands telling it how to load the initial kernel (e.g. from storage, or over the network). My guess would be she had to add a line to the boot script that zeroed out the relevant RAM; that, or rewrite the bootloader and add a loop in machine code to zero out the memory.
There was already code written to zero out the BSS, shared across all the bootloaders for PowerPC; the call to it had just gotten lost when our enthusiastic fellow kernel dev rewrote bootloaders for platforms they couldn't test. I assume I just added the call to the existing code back in.
No, as the post states, the non-determinism is due to the fact that DRAM cells lose their charge over time unless they are constantly refreshed. When the system is rebooted after having been powered off for a long time, the DRAM cells are all discharged, and thus uninitialized memory will be 0. The kernel was relying on the memory in the bss section to be 0, but was not actually zeroing it out. Therefore, the code would only work if the memory actually was 0 due to being discharged.
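For anyone curious what "zeroing out the BSS" amounts to, here is a minimal sketch in C. It is not the actual PowerPC bootloader code: the names `zero_region` and `flag_is_clear` are made up for illustration, and a real bootloader would take the start and end addresses from linker-script symbols, whose names vary by platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clear every byte in the half-open range [start, end).
 * In a real bootloader, start and end would be the bounds of the kernel's
 * BSS section; here they are just pointers supplied by the caller. */
static void zero_region(uint8_t *start, uint8_t *end)
{
    for (uint8_t *p = start; p < end; p++) {
        *p = 0;
    }
}

/* Stand-in for kernel code that assumes a static variable starts at 0.
 * If the memory was never cleared, this check depends on whatever
 * happened to be left in RAM. */
static int flag_is_clear(const uint8_t *flag)
{
    return *flag == 0;
}
```

If the call to something like `zero_region` is skipped, `flag_is_clear` only returns true when the RAM happens to hold zeros already, which matches the "works after a long power-off, fails otherwise" behavior described above.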
Based on the first two paragraphs of the "independent" investigator's website, http://www.ryaa.com/, the "independent" investigator seems to be biased in favor of the company. It could be argued that GitHub is using their CEO as a scapegoat in order to avoid having to confront a possibly sexist internal culture. I wish companies and individuals were not afraid to address their sexist cultures or thoughts. Living in a postmodern society, it's almost impossible not to have a sexist thought - it's what we do with these thoughts that matters. I look forward to seeing the new initiatives GitHub will be launching, and hope the initiatives will bring about real change in company culture, and cause people to question their beliefs. Meanwhile, I'm trying to decide if I want to switch to a different company to host my code. Any ideas?
I've noticed a trick when watching long HD YouTube videos that I need to pause and rewind several times. If I'm signed in to my web browser, the video is downgraded to a lower quality after a while of watching, pausing, and rewinding. To avoid this degradation, I use a browser in incognito mode, and the quality stays HD. Does anyone know why this occurs?