Month: July 2007

Kernel Debugging at scale

This paper will discuss the difficulties and methods involved in debugging the Linux kernel on huge clusters. Intermittent errors that occur once every few years are hard to debug and become a real problem when running across 1000s of machines simultaneously. The more we scale clusters, the more reliability becomes critical. Many of the normal debugging luxuries like a serial console or physical access are unavailable. Instead, we need a new strategy for addressing thorny intermittent race conditions. This paper presents the case for a new set of tools that are critical to solve these problems and also very useful in a broader context. It then presents the design for one such tool created from a hybrid of a Google internal tool and the open source LTTng project. Real world case studies are included.

How to deal with rare error conditions that are hard to reproduce.
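
The scaling claim in the abstract is easy to check with back-of-the-envelope arithmetic. Here is a rough sketch, assuming machines fail independently at a constant average rate; the fleet sizes and the three-year interval are made-up illustrative numbers, not figures from the paper, and `expected_failures_per_day` is a hypothetical helper name.

```python
def expected_failures_per_day(machines: int, years_between_failures: float) -> float:
    """Expected fleet-wide failures per day.

    Assumes every machine fails independently at the same constant
    average rate (one failure per `years_between_failures` years).
    """
    failures_per_machine_per_day = 1.0 / (years_between_failures * 365.0)
    return machines * failures_per_machine_per_day


# Hypothetical fleet sizes and failure interval, chosen only for illustration.
for machines in (1, 1_000, 10_000):
    rate = expected_failures_per_day(machines, years_between_failures=3.0)
    print(f"{machines:>6} machines: {rate:.4f} failures/day")
```

With these assumed numbers, a bug that a single machine hits once every three years shows up roughly nine times a day across 10,000 machines.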

Where would Jesus queue?

AT&T’s rivals, Verizon and Sprint, issued “talking points” to their salespeople, with helpful hints for impugning the iPhone’s divinity. They lost customers anyway. Executives at Motorola and other phonemakers were spotted in various stages of shock and awe at the cultural impact that the launch of a handset—a handset!—could have. Honchos in all sorts of industries have long studied keynote speeches by Steve Jobs, Apple’s boss, for ways to cast spells on audiences; now they also need to work out how he outsourced his product marketing to an entire nation of volunteers.

Awesome. Channeling Boing Boing, calling it the Jesusphone that it is.

40 Gbps broadband

Sigbritt Löthberg’s home has been supplied with a blistering 40 Gigabits per second connection, many thousands of times faster than the average residential link and the first time ever that a home user has experienced such a high speed. Fiber technology makes such high speed connections technically and commercially viable. “The most difficult part of the whole project was installing Windows on Sigbritt’s PC.”