My tribute to kernel's IRQPOLL

This is something I should have done long ago. To be honest, having very successfully put several encoding appliances into production thanks to this "kinda hackish Linux kernel trick", I owe part of my IT career success to the irqpoll kernel option...
Yesterday, destiny confronted me with my own ingratitude and forgetfulness, and punished me with an evening of headache and worry until, once again, irqpoll came to the rescue... Clearly, the time had come to do something about it, so here's my most sincere tribute to irqpoll.

About irqpoll kernel option

Citations are probably needed here first. In the kernel documentation on kernel options, in the interrupt options section, I found:

irqpoll - Extended fix to interrupt problems.

When an interrupt is not handled, search all known interrupt handlers for it and also check all handlers on each timer interrupt. This is intended to get systems with badly broken firmware running.

That basically means that, given a poor hardware arrangement, a buggy driver, some BIOS incompatibility, or whatever other cause leads to an interrupt problem or conflict during system operation, the kernel does not leave the conflicting device unusable. Instead it takes over, actively tracks down the source of the unhandled interrupt by checking all known handlers, and magnanimously lets the device operate "normally", fully transparently to it... GREAT!!!

The root cause of the mess is that, unlike early systems where IRQs, DMA channels and the like were rigidly controlled by hardware or BIOS (I remember using jumpers to configure IRQs!), today's systems are more flexible: interrupt lines can be shared, and interrupt handlers sort out who raised each interrupt.
The kernel has to keep a set of interrupt handlers under control, recording which handler corresponds to each device, and device drivers have to register and service their handlers properly too.
The problem, as stated earlier, comes when IRQ handlers are not registered, are wrongly registered, or hardware drivers do not honor the IRQ handlers assigned to them.
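
To see which handlers are currently registered on shared lines, /proc/interrupts is the place to look. The following is only a minimal sketch: shared_irqs is a hypothetical helper name, and it assumes the usual layout where handlers sharing one line are listed separated by ", ".

```shell
# List IRQ lines that more than one handler is registered on.
# Assumption: in /proc/interrupts, handlers sharing a line are joined by ", ".
# $1 = file to read (defaults to /proc/interrupts).
shared_irqs() {
    awk '$1 ~ /^[0-9]+:$/ && /, / { print }' "${1:-/proc/interrupts}"
}

# Usage: shared_irqs
```

A shared line is not a problem in itself; it only tells you which devices would be affected if one of them misbehaves.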

My first and best use case

My first use case for irqpoll came when I was involved in the development of Linux-driven audio encoding devices.
We were using OEM hardware to build a rack-size, high-performance, low-cost audio encoder, and everything worked great: early prototypes encoded more simultaneous streams than anyone had expected, and we were still far from stressing the CPU...
So we were asked to double capacity in a more realistic and useful way, by adding a second PCI audio card to double the inputs and outputs.

Our hardware case provider included a good-looking, robust PCI riser card, and two cards fitted very nicely in our encoder... so we were happy, expecting to meet the demand easily, only to discover that, although the second card was fully recognised, it simply didn't work...
No errors, no warnings, no faults at runtime... it simply didn't work. We obviously tried several cards, and in the end we discovered that the riser card was poorly designed and led to an IRQ conflict.

The key to discovering it: dmesg
The key to solving it once and for all!!!: irqpoll

Dmesg is your friend

Fortunately, IRQ problems usually show up early, at boot time, so dmesg will most likely tell you something is wrong. Just look for lines like this (in this example, replace 'N' with a valid IRQ number):

irq N: nobody cared (try booting with the "irqpoll" option)

As you can see, the kernel itself even suggests solving the issue with the irqpoll option.
So, running something like this:

dmesg | grep irqpoll

will lead you directly to the problem, if it exists.
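
If you also want to know which IRQ numbers are affected, the same message can be parsed a little further. This is just a sketch: nobody_cared_irqs is a hypothetical helper name, and it assumes the message keeps the "irq N: nobody cared" wording shown above.

```shell
# Print the IRQ numbers named in "nobody cared" kernel messages.
# Reads kernel log text on stdin, so it works with dmesg or a saved log.
# Assumption: the message format is "irq N: nobody cared ...".
nobody_cared_irqs() {
    sed -n 's/.*irq \([0-9][0-9]*\): *nobody cared.*/\1/p' | sort -un
}

# Usage: dmesg | nobody_cared_irqs
```

Each number it prints is a candidate to look up in /proc/interrupts.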

A second setback, and yet again irqpoll to the rescue

Yesterday I was happily playing with my Alix board running Debian Jessie, and I was determined to get a serial console up and running, since the migration to systemd has rendered my old knowledge of setting up serial consoles obsolete.

After some googling, I finally got my serial console up and running, so I thought I should share it on my blog!... but, to my nasty surprise, the Internet connection was lost...
Console output and /var/log/syslog started to flood with messages like this:

[ 2058.004197] via-rhine 0000:00:12.0 eth0: Transmit timed out, status 0003, PHY status 786d, resetting...

I was puzzled, and I seriously considered that the integrated Ethernet card had failed.
Googling around, I found a lot of past kernel problems with that device, most of them related to wrong interrupt handling, but... it couldn't be! It had been running fine!
I found it very suspicious that the fault appeared as soon as I set the serial console up, and I didn't want to disable it: that is not a decent solution! I spent almost an hour checking IRQs and reading about the setserial command, and finally I decided to reset my mind, go out into my garden and do some work there...
With my mind cleared of stress and worry, and convinced that there was some kind of IRQ problem, I decided to go step by step, starting with a careful inspection of the boot output... and bingo!

.... irq 10: nobody cared (try booting with the "irqpoll" option)

I knew from my previous investigations that IRQ 10 was precisely the IRQ assigned to eth0... "irqpoll" option... IRQ conflicts and headaches... I had a déjà vu!
Indeed: after booting with the irqpoll option, the Ethernet adapter returned to normal, and now it all works perfectly!!
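
One way to check which device sits on a given IRQ is to read its row in /proc/interrupts. The following is a minimal sketch rather than a record of the exact commands used that evening; irq_owner is a hypothetical helper name.

```shell
# Show the /proc/interrupts row for one IRQ number.
# $1 = IRQ number, $2 = file to read (defaults to /proc/interrupts).
irq_owner() {
    awk -v irq="$1" '$1 == irq ":" { print }' "${2:-/proc/interrupts}"
}

# Usage: irq_owner 10
```

The last column of the row names the driver or device registered on that line.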

Adding irqpoll to Grub's kernel boot options

I could not end this blog post without showing how to pass irqpoll to the Linux kernel without having to manually stop Grub's countdown before booting and type it into the kernel options... here we go:

Provided you're on a modern Debian or similar distro using Grub2, you control the boot configuration through the file /etc/default/grub, and apply it by running the command update-grub.
On older systems with legacy Grub, you'll have to edit /boot/grub/menu.lst instead.

Anyhow, in /etc/default/grub, look for the following line, which holds the kernel boot options:

GRUB_CMDLINE_LINUX_DEFAULT="...options here...."

Typically you'll find no options there, or a single quiet option, so I suggest commenting out the original line as a config backup and adding a new one with irqpoll appended to your options.
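
As an illustration only, assuming the original line held nothing but the quiet option (yours may differ, so keep whatever options you already had), the edited part of /etc/default/grub could look like this:

```shell
# /etc/default/grub (fragment)
# Original line kept as a backup; "quiet" is an assumed starting value.
#GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX_DEFAULT="quiet irqpoll"
```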


Then apply the new boot configuration to Grub by running update-grub as root, and reboot for the change to take effect.
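
Once the machine is back up, it is worth confirming that the option really reached the kernel. A minimal sketch, assuming /proc/cmdline is readable; has_irqpoll is a hypothetical helper name:

```shell
# Run as root to regenerate the Grub configuration and reboot:
#   update-grub && reboot

# After the reboot, check that the running kernel was started with irqpoll.
# $1 = file to read (defaults to /proc/cmdline).
has_irqpoll() {
    grep -qw irqpoll "${1:-/proc/cmdline}"
}

# Usage: has_irqpoll && echo "irqpoll is active"
```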


IRQ problems are gone!!!!!