Archive for the 'Tech' Category

Now, say …

… just how many Space Shuttle Programs can you fund with the insane amounts of money Europe is about to spend to pull Greece out of the hole they’ve dug for themselves? Same for the countries that will undoubtedly follow the same path in the coming weeks.

Yeah, fuck that.

Point is: there are so many better ways to spend that money (that we don’t have anyway).

Farewell, Space Shuttle Program

Well, today was the day and I’ve just watched the last ever Space Shuttle launch.

Let’s just say there’s something very, very wrong with this world and leave it at that.

Hats off to all the men and women who dedicated their lives, or part thereof, to these magnificent birds over the past 30+ years, from the original design studies back in the day to the transition & retirement operations that will go on for another year.

We’ll most probably never get to see any other machine pushing boundaries like the Shuttle has done, and this is a gigantic loss for us all. One day people will realize…

If you aren’t familiar with the challenges the Shuttle faced, with the advances in science and countless engineering disciplines it brought, or more generally with the Shuttle and its history, go read Wings in Orbit, a book by the people who built and operated the Space Shuttle, freely available in PDF format from NASA.

Did you know that the Space Shuttle Main Engines (SSME, RS-25) are among the most powerful and most efficient liquid-fuelled engines ever built? They were designed 40 years ago. I won’t make a list, but this is true of a number of other components. What the hell have we been doing all these years?

QLogic QLE73xx InfiniBand adapters, QDR, ib_qib, OFED 1.5.2 and Debian Squeeze

A few weeks ago, I had to look into getting a QLogic QLE7342 InfiniBand adapter working on a Debian Squeeze system, with OFED 1.5.2. This post will probably save quite some time for anyone trying to do the same; it applies to all the QLogic adapters supported by the ib_qib kernel module.

Note: on the ibverbs side of things, the adapters are supported by the ipath plugin, just like older QLogic adapters (that use the ib_ipath kernel module).

First of all: grab a recent version of the ofa-kernel package. The ib_qib module in ofa-kernel from OFED 1.5.2 did not work at all for me. I used the ofa-kernel snapshot from 20110203.

There are two things to know about the QLogic adapters:

  • they don’t autodetect the fabric speed by default; a dedicated utility has to be used to set the speed(s)
  • the driver exports a sysfs-like pseudo-filesystem, ipathfs, that needs to be mounted

QLogic offers a complete IB stack based on OFED and dubbed “QLogic OFED+”. The complete package weighs in at over 500 MB and contains the QLogic-blessed drivers, OFED stack, libraries and, of utmost importance to us, the QLogic-specific utilities.

Unfortunately, as simple as the QLogic utilities are, they don’t come with source. The package also only exists for RHEL 4 and 5.

So, go to the QLogic support website, select your hardware and download the QLogic OFED+ Host Software for RHEL 5 package. In the tarball, one directory contains the OFED stack and another contains all the utilities (QLogic-Tools.*).

For setting the adapter speed, you’ll want the iba_portconfig utility along with the initscript (shipped as iba_portconfig.sh). The desired adapter speed is set in the initscript by choosing the proper arguments to iba_portconfig (-s 1 for SDR, -s 2 for DDR, -s 4 for QDR or any combination thereof).
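As a minimal sketch of the "any combination thereof" part: I'm assuming here that the three values are bit flags that get OR-ed together, which is my reading of the numbers and not something taken from QLogic documentation. The helper name `ib_speed_mask` is made up for the example; check the result against the utility's own help before putting it in the initscript.

```shell
# Hypothetical helper: combine the speed values quoted above
# (SDR=1, DDR=2, QDR=4) into a single number for "iba_portconfig -s".
# Assumption: "any combination" means the values are OR-ed as bit flags.
ib_speed_mask() {
    mask=0
    for speed in "$@"; do
        case "$speed" in
            sdr|SDR) mask=$((mask | 1)) ;;
            ddr|DDR) mask=$((mask | 2)) ;;
            qdr|QDR) mask=$((mask | 4)) ;;
            *) echo "unknown speed: $speed" >&2; return 1 ;;
        esac
    done
    echo "$mask"
}
```

With that assumption, `ib_speed_mask ddr qdr` gives the value to pass to -s for a port that should negotiate either DDR or QDR.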

The iba_portconfig initscript must run after the drivers have been loaded. Which brings us to loading the driver and mounting the ipathfs.

This is all handled by the openibd initscript provided in ofa-kernel (under ofed_scripts/) and its companion script dedicated to QLogic adapters, truescale.cmds.

This initscript will load the OFED stack and drivers; if a QLogic adapter is present, it’ll mount the ipathfs in the right place.

Voilà, once this is done, the adapter should happily report itself ACTIVE/LinkUp.
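To check the ACTIVE/LinkUp part without any vendor tool, the kernel exposes the port state under /sys/class/infiniband. Here is a small sketch that reads it; the function name is mine, and the sysfs root is a parameter only so the function can be tried on a fake directory tree.

```shell
# Print "<device> port <n>: <state> / <phys_state>" for every InfiniBand port.
# $1 (optional): sysfs root, defaults to /sys/class/infiniband.
ib_port_states() {
    root="${1:-/sys/class/infiniband}"
    for port in "$root"/*/ports/*; do
        # Skip the unexpanded glob when no adapter is present.
        [ -r "$port/state" ] || continue
        dev=${port%/ports/*}
        dev=${dev##*/}
        # The files look like "4: ACTIVE" and "5: LinkUp"; drop the numeric prefix.
        state=$(sed 's/^[0-9]*: *//' "$port/state")
        phys=$(sed 's/^[0-9]*: *//' "$port/phys_state")
        echo "$dev port ${port##*/}: $state / $phys"
    done
}
```

If the speed hasn't been set properly, I'd expect the port to show up here in some state other than ACTIVE / LinkUp.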

Kodak scanner drivers for Linux – not there yet

So, Kodak announced the availability of Linux drivers for some of its scanners, together with a frontend application.

While the effort is laudable, they’re not quite there yet. Let’s have a look.

First of all, and contrary to what a few people believe after reading the press release, the drivers themselves aren’t open source, let alone Free Software. They’re all proprietary; read it again, the press release clearly states that only the frontend application is GPLv2.

Now let’s look at what Kodak offers in its drivers packages. I’ve downloaded LinuxSoftware_i1200_v3.5.tar.gz from the Kodak website; these are the drivers for the Kodak i1200 series.

The archive contains a set of RPM packages, one Debian package corresponding to one of the RPM packages, and a setup script.

One of the packages contains the OpenUSB library, which is released under the LGPL; sources aren’t there, I haven’t asked for the sources, don’t plan on doing so, and am not implying anything regarding license compliance whatsoever. I’ll note that I haven’t seen any written offer for the sources, though, be it in the archive (there’s not even a README in there) or on the website.

So, although Ubuntu is listed as supported, we can say that, bar that single package, no Debian packages are provided.

The setup script actually relies on alien to convert the RPM packages if it thinks it’s running on a Debian-based system, which is determined by the availability of dpkg in the PATH. Using alien is far from ideal.
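For illustration, this is roughly what a "dpkg is in the PATH" test looks like, next to a slightly stricter alternative. This is a sketch and not the code from Kodak's script; both function names are mine, and treating /etc/debian_version as the marker of a Debian-based system is my assumption.

```shell
# Detection as described above: succeeds as soon as a dpkg binary is in the PATH.
# Note that dpkg can be installed on non-Debian systems too.
has_dpkg() {
    command -v dpkg >/dev/null 2>&1
}

# Stricter variant (assumption: Debian and its derivatives ship /etc/debian_version).
# $1 (optional): path of the marker file, overridable for testing.
is_debian_based() {
    [ -f "${1:-/etc/debian_version}" ]
}
```

Neither check makes alien a good idea, of course; it only decides which broken path the script takes.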

The packages ship libraries and components under /usr/lib, /usr/local/lib and even /opt, pretty randomly it seems. Let’s repeat here that /usr/local and /opt are strictly reserved for the local administrator and packages must not install anything there. There are other oddities in the paths used by the software stack, like /var/kodak.

It looks like the SANE backend is only a bridge to Kodak’s TWAIN data source, so you’d actually need the whole stack to use it.

All in all, Kodak does no better, I’d even venture to say it does worse, than Epson (Avasys). At least part of Epson’s backend is Free Software, even if it relies on proprietary libraries for some scanners, though unfortunately most if not all of the recent ones.

I also have to mention that Epson is offering both RPM and Debian packages, and by that I mean proper Debian packages. Kodak is far from that; even the RPMs aren’t that great (I think the RPM packages even lack dependency information, based on what I’ve read in the setup script, though I haven’t checked).

What Kodak offers here is a GPL frontend for TWAIN that not many people will be using, while keeping the drivers proprietary. Unfortunately their SANE backend isn’t really a backend, and even if it were open source it’d still rely on the rest of the TWAIN software stack.

So, in a nutshell, while I’m very happy to see yet another vendor go down the road of Linux support for its products, I am immensely disappointed by the proprietary nature of the drivers and the low quality of the packages.

I was, of course, investigating the drivers to possibly package them for Debian; as you’ve probably understood, it’s not possible, either technically or legally.

And I kept a little gem from the setup script for the end:

# We need to create a link for the libdbus-1 requirement for openusb,
# but only on Fedora.
if [ -f /etc/fedora-release ]; then
    ln -s /lib/libdbus-1.so.3 /lib/libdbus-1.so.2 2>> /dev/null
fi

As we all know, sonames and soversions are useless and they just keep getting in the way. Duh.

Wow.

Put down the crack pipe. Really. Wow.
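For the record, here is why that symlink is wrong, as a small sketch. The number after ".so." is the soversion, and by convention a change in it means the ABI changed incompatibly. The two helper names below are mine.

```shell
# Extract the soversion (the first number after ".so.") from a library name.
soversion() {
    case "$1" in
        *.so.*) v=${1##*.so.}; echo "${v%%.*}" ;;
        *) return 1 ;;
    esac
}

# Succeeds only when both names carry the same soversion, i.e. when one
# can stand in for the other by convention.
same_abi() {
    [ "$(soversion "$1")" = "$(soversion "$2")" ]
}
```

`same_abi libdbus-1.so.3 libdbus-1.so.2` fails, which is exactly what the dynamic linker is trying to tell you before the symlink lies to it.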

MacBook Pro power savings

Matthew: Thanks for the followup and the summary of the MacBook status wrt power consumption.

I’m running a 2.6.22 kernel with the hrt patchset and a set of patches including your patch to appletouch. Your patch works great and the trackpad’s behaviour is a bit better with it.

I know pretty well what the problems are on my machine, but your summary will probably be useful to a lot of people. I know some of the issues can be fixed or worked around and others cannot.

Also note that I can get something like 3h45 out of my battery under OS X with my standard workload, which is pretty good. If OS X can do that, there’s no reason Linux can’t, however bad Apple’s choices are wrt hardware.

And to make it clear once again, my previous post didn’t criticize the C-states or anything else, but the fact that I see PowerTOP being used more as a blame shifting tool than as a debugging/profiling tool. I do maintain that UHCI suffers from a bad design, though; OHCI is a much, much better spec.

PowerTOP: demystifying Intel’s latest marketing coup

A couple of weeks ago, Intel released the PowerTOP utility for Linux, a tool similar to the well known top(1) utility that can help you understand why your laptop battery runtime is as low as 1 hour. You need a tickless kernel (2.6.21 at least on i386, 2.6.22 + hrt patchset on amd64) for that to work, and timer statistics must be enabled.
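A quick way to check those two requirements on a running system is to grep the kernel config, as sketched below. The option names CONFIG_NO_HZ (tickless) and CONFIG_TIMER_STATS (timer statistics) are my mapping of the requirements above, and the /boot/config-<release> location is a distribution convention, so treat both as assumptions. The function names are mine too.

```shell
# kernel_has OPTION [CONFIG_FILE]
# Succeeds when OPTION is built in (=y) in the given kernel config file.
# Assumption: the config is installed as /boot/config-<release>.
kernel_has() {
    grep -q "^$1=y" "${2:-/boot/config-$(uname -r)}" 2>/dev/null
}

# Report the options PowerTOP needs; an optional config file path is passed through.
powertop_prereqs() {
    for opt in CONFIG_NO_HZ CONFIG_TIMER_STATS; do
        if kernel_has "$opt" "$@"; then
            echo "$opt: ok"
        else
            echo "$opt: missing"
        fi
    done
}
```

Running `powertop_prereqs` with no argument checks the running kernel; pass a path to check a config file you are about to build from.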

This is an extremely bright marketing coup from Intel, for essentially two reasons:

  • with PowerTOP, they can now say “hey look, the CPU is in C3 xx.x% of the time, it’s saving power!!!” and have some real hard data to back that up;
  • with PowerTOP, they can also say “hey look, the CPU cannot reach the C3 state for long enough periods because foobar is waking it up too often! That’s not the CPU’s fault!”, again with some real hard data to back that up.

In that, PowerTOP is an incredible PR spinning tool that can shift the blame away from the Intel CPU in your machine to about everything else in your machine – including software!

They’ve done extensive and intensive internal testing, which turned up some bugs and misbehaviours here and there, which is truly great.

They’re also announcing incredibly low “CPU wakeups per second” values, so low that there is no way you can reach such low values on any kind of real world setup. These numbers must be seen in the context of lab testing on otherwise idle machines.

On the “CPU wakeups per second” front, you’ll most probably see that the #1 offender is your UHCI USB controller. A UHCI USB controller produces a thousand interrupts per second, effectively waking up the CPU a thousand times per second even when nothing happens.

So, now it’s all the USB controller’s fault! Ah, wait, UHCI is an Intel spec! (UHCI is a joint effort by Intel and others, Intel being the lead.) Fortunately, to be able to put the USB host controller to sleep, you need to put all the USB devices attached to it to sleep too. Here comes the (in)famous CONFIG_USB_SUSPEND kernel option introduced six or so months ago, the one that makes (made, I’ve worked around it in SANE last weekend) your scanner produce only “black scans” and prevents your printer from working properly. So now it’s all the USB devices’ fault.

Brilliant, isn’t it?

To be honest and complete, once again, CONFIG_USB_SUSPEND exposed a lot of hardware bugs. The USB spec is probably one of the most violated specs these days (together with ACPI I’d say, though it looks like this one is improving lately).

Please don’t get me wrong: PowerTOP is a very useful tool that has its uses, the tickless kernel is of course a very important feature, and CONFIG_USB_SUSPEND would be the icing on the cake if USB devices weren’t so ignorant of the spec.

But the Intel marketing crap behind that really needs to stop. Produce better chips which consume less power, period. My PowerBook G4 could run 4h30 on battery (a machine designed 7 years ago). My Core2 Duo MacBook Pro, with a slightly bigger battery, can only run 1h40 with the same software and workload, with cpufreq enabled etc, etc. Granted, the ATI video card is a big sucker here too.

(I wrote this entry to share my thoughts on the matter, as a reaction to the many mails I’m getting about PowerTOP and the MacBooks, so as to have a reference to point people to, instead of rewriting the same things again and again.)

WTF Award of the week

Quoting from the Release Notes for the firmware v1.54.2 for the Thomson ST2030 VoIP telephone, in the “Relevant known limitations” section:

3.1 The display doesn’t show the name of the incoming call when received display name starts with the letter “O” (uppercase).

This raised quite a few eyebrows here, and raised quite a few concerns too. I refuse to even imagine how this kind of bug can happen.

And for those who will have a hard time believing this, we did the test, and, indeed, the CallerID name doesn’t appear on the phone when it starts with an “O”. Shrug.

Once again, XFS saves the day

The server hosting this blog just suffered what can only be described as catastrophic hardware failure, thanks to the crap that are Maxtor hard drives, Dell CERC 1.5 RAID controllers and the Linux driver for those (aacraid).

So, the machine got shut down for a moment on Thursday night, just long enough to move the rack around in the datacenter, and one of the disks did not come back in one piece.

But it did come back online after a reboot, prompting an array rebuild by the controller which took 12.5 hours for 250 GB (RAID 1). And the drive failed again 10 minutes after the rebuild was completed, confusing both the controller and the driver. At that point I could still log into the machine, but any command would trigger nice I/O errors all over the place.
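To put that rebuild time in perspective, here is the arithmetic as a one-liner. It's a sketch using decimal units (1 GB = 1000 MB); the function name is mine.

```shell
# rebuild_rate SIZE_GB HOURS
# Prints the average throughput in MB/s, assuming decimal units.
rebuild_rate() {
    LC_ALL=C awk -v gb="$1" -v h="$2" \
        'BEGIN { printf "%.1f\n", gb * 1000 / (h * 3600) }'
}
```

`rebuild_rate 250 12.5` prints 5.6, so the controller averaged about 5.6 MB/s while mirroring the array.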

Fortunately, the FS got shut down by XFS automatically on I/O errors and I haven’t lost a single bit of data. The machine came back just fine with the faulty drive removed.

And you thought a RAID controller would help in this case. Obviously this one doesn’t. I need to mention that this isn’t an isolated case: we had already observed this behaviour before this incident, on another machine suffering from a faulty drive in just the same way, so it’s reproducible.

Data recovery, or why I love XFS

So, here it is, the hard drive in my mother’s laptop (HP nx9105, 10 months old) just broke down this week, with some important data on it (I do have backups, but it broke down between two backups, with some important changes to the data: accounting and payroll).

The laptop was running an AMD64 Sarge on an XFS filesystem. Fortunately, XFS shuts the filesystem down when the underlying block device fails, so the damage is quite limited. dd_rescue to the rescue: only 40 kB of the 40 GB partition couldn’t be recovered. An xfs_repair run later, the FS mounts with only 30 files in lost+found: 10 of them are device nodes, 10 are empty files, another 5 are sockets and the remaining 5 are mails from anacron.
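For reference, the sequence looks roughly like this when the partition is rescued to an image file on a healthy disk. This is a sketch under that assumption: the device and file names are placeholders, and the function only prints the commands so nothing runs by accident.

```shell
# recovery_plan FAILING_PARTITION IMAGE_FILE
# Prints (does not run) the recovery steps; both arguments are placeholders
# to be replaced with the real device and a file on a healthy disk.
recovery_plan() {
    src="$1"
    img="$2"
    echo "dd_rescue $src $img"         # copy whatever is readable, skipping bad sectors
    echo "xfs_repair -n -f $img"       # dry run on the image: report, change nothing
    echo "xfs_repair -f $img"          # actual repair, on the copy only
    echo "mount -o loop,ro $img /mnt"  # look at the result read-only
}
```

The important design choice is to never run xfs_repair on the dying drive itself, only on the copy. If xfs_repair complains about a dirty log, it will ask you to mount the filesystem once to replay it before repairing.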

I just love this filesystem. SGI, you guys rock. All of this just went as planned when I installed the machine in July last year; I chose XFS for its ability to not screw up the FS when the underlying block device crashes and for the powerful and reliable set of maintenance tools that goes with it.