Monday, November 7, 2011

Parallels Virtuozzo Linux 4.7 kernel panic

After upgrading from PVCfL 4.0 / 4.6 to version 4.7, kernel panics started to occur randomly on almost all hardware nodes.
I configured kdump on the nodes to collect a kernel dump at the next kernel panic, and contacted the Parallels support team.
When the next kernel panic occurred and the dump was created, they analyzed it.

The feedback we got was the following:
The latest saved dump points to general protection fault caught by `mysqld`. This looks like just recently found issue - PCLIN-30321. I will pass it to the maintenance team.
As for the type of installation - there is no difference if this is a clean installation or an upgrade from the previous version of Virtuozzo.
The issue is quite complex (General Protection Fault from a process in a container is quite difficult to investigate), and it would take some time for maintenance / development team to check the case.
Though, the investigation might take significant amount of time and therefore the results will be provided through the separate ticket once they are ready.
You can also check the release notes for Virtuozzo kernel updates (CU-2.6.32-042stabXXX), and monitor the status of the request PCLIN-30321 which was assigned for this crash issue.

The panic is not always caused by 'mysqld'. I've seen a kernel panic caused by another process as well, but most were indeed due to mysqld.
We don't know how long it will take for this to be fixed, so we decided to migrate the containers which cause the kernel panics to a 4.0 or 4.6 node.

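You can find out which container caused the kernel panic with the crash tool: open the dump, run dmesg, and look for the "Process" line. Its veid field is the ID of the container in which the crashing process ran.
An example: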

[root@virtuozzo]# crash /root/vmlinux-2.6.32-042stab037.1 vmcore
crash 4.1.2-8.el5.centos

      KERNEL: /root/vmlinux-2.6.32-042stab037.1
    DUMPFILE: vmcore
        CPUS: 8
        DATE: Sat Nov  5 10:00:13 2011
      UPTIME: 10 days, 10:12:02
LOAD AVERAGE: 19.68, 15.35, 14.01
       TASKS: 2239
    NODENAME: virtuozzo
     RELEASE: 2.6.32-042stab037.1
     VERSION: #1 SMP Fri Sep 16 22:18:06 MSD 2011
     MACHINE: x86_64  (1995 Mhz)
      MEMORY: 32 GB
       PANIC: ""
         PID: 62872
     COMMAND: "mysqld"
        TASK: ffff88024d6892c0  [THREAD_INFO: ffff8802d65e4000]
         CPU: 5

crash> dmesg
[900722.171220] Process mysqld (pid: 62872, veid=7420232, threadinfo ffff8802d65e4000, task ffff88024d6892c0)
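
When several dumps have to be checked, the "Process" line can be picked out of saved dmesg output with a small script. The following is only a sketch: it assumes the dmesg output from crash has been saved as text, and that the line has the same layout as in the example above. The function name find_crashing_containers and the regular expression are my own and not part of any Parallels tool.

```python
import re

# Matches the "Process ..." line shown in the dmesg output above.
# Assumption: the layout is "Process <command> (pid: <pid>, veid=<veid>, ...".
PROCESS_LINE = re.compile(
    r"Process (?P<command>\S+) \(pid: (?P<pid>\d+), veid=(?P<veid>\d+),"
)


def find_crashing_containers(dmesg_text):
    """Return a (command, pid, veid) tuple for every matching line."""
    results = []
    for line in dmesg_text.splitlines():
        match = PROCESS_LINE.search(line)
        if match:
            results.append(
                (
                    match.group("command"),
                    int(match.group("pid")),
                    int(match.group("veid")),
                )
            )
    return results


if __name__ == "__main__":
    # The line from the crash session above, used as sample input.
    sample = (
        "[900722.171220] Process mysqld (pid: 62872, veid=7420232, "
        "threadinfo ffff8802d65e4000, task ffff88024d6892c0)"
    )
    for command, pid, veid in find_crashing_containers(sample):
        print(f"container {veid}: {command} (pid {pid})")
```

The veid values it reports are the containers to consider for migration to a 4.0 or 4.6 node.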

Update 15/11/2011
On 14/11/2011, Parallels released a kernel update that contains a fix for this problem. More information on