Thread: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Replies: 39 - Last Post: Oct 1, 2009 8:28 PM by: Eric Hammond
Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Feb 12, 2009 11:16 AM PST
I did some digging around during lunch today, and I think I may have finally figured out how to make Debian Lenny and XFS coexist peaceably in EC2. This is going to be important, because Lenny is supposedly going to become the next stable version this weekend.
http://lists.debian.org/debian-devel-announce/2009/02/msg00000.html
The kernel panic seems to be caused by the XFS log version 2, which was made the default in the lenny version of xfsprogs (it was log version 1 in etch). If you force your xfs filesystem to use log version 1 like it was under etch, you should not experience the kernel panics*.
In order to force your new XFS filesystem to use log version 1, create it using the following command:
mkfs.xfs -l version=1 DEVICE_NAME
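To confirm which log version a filesystem actually ended up with, the log line printed by mkfs.xfs (or by xfs_info on a mounted filesystem) can be parsed. This is only a minimal sketch: it assumes the log line has the form `log =internal log bsize=4096 blocks=2560, version=1`, and `xfs_log_version` is a hypothetical helper name, not part of xfsprogs.

```shell
#!/bin/sh
# Sketch: print the XFS log version found in mkfs.xfs or xfs_info output
# read from stdin. xfs_log_version is a hypothetical helper name.
xfs_log_version() {
    # The log section is the line that starts with "log"; the version
    # appears on that line as "version=N". Nothing is printed if no such
    # line is found.
    sed -n 's/^log[[:space:]].*version=\([0-9][0-9]*\).*/\1/p'
}
```

On a mounted filesystem this could be used as `xfs_info /vol | xfs_log_version`, where `/vol` is whatever mount point you chose.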
Here are the results of the testing that I did over lunch:
domU-12-31-39-03-B9-D6:~# cat /etc/debian_version
5.0
domU-12-31-39-03-B9-D6:~# mkdir /vol
domU-12-31-39-03-B9-D6:~# mkfs.xfs -l version=1 /dev/sdh
meta-data=/dev/sdh               isize=256    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
domU-12-31-39-03-B9-D6:~# mount /dev/sdh /vol
domU-12-31-39-03-B9-D6:~# cd /vol
If I had been using log version 2, the following command would have caused a kernel panic:
domU-12-31-39-03-B9-D6:/vol# touch file
domU-12-31-39-03-B9-D6:/vol# ls
file
domU-12-31-39-03-B9-D6:/vol# rm file
Ok, let's copy a few hundred MB of files to our xfs EBS volume:
domU-12-31-39-03-B9-D6:/vol# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sdh 1014M 160K 1014M 1% /vol
domU-12-31-39-03-B9-D6:/vol# cp -R /usr /var .
domU-12-31-39-03-B9-D6:/vol# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sdh 1014M 410M 605M 41% /vol
domU-12-31-39-03-B9-D6:/vol# rm -rf *
Ok, that went well. Now, let's make a mysql database on our xfs EBS volume and dump 100MB or so of data into it:
domU-12-31-39-03-B9-D6:/vol# invoke-rc.d mysql stop
Stopping MySQL database server: mysqld.
domU-12-31-39-03-B9-D6:/vol# cp -R --preserve=all /var/lib/mysql .
domU-12-31-39-03-B9-D6:/vol# cat > /etc/mysql/conf.d/local.cnf <<EOM
> [mysqld]
> datadir=/vol/mysql
> EOM
domU-12-31-39-03-B9-D6:/vol# invoke-rc.d mysql start
Starting MySQL database server: mysqld.
Checking for corrupt, not cleanly closed and upgrade needing tables..
domU-12-31-39-03-B9-D6:/vol# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sdh 1014M 21M 994M 3% /vol
domU-12-31-39-03-B9-D6:/vol# mysql -u root -p < ~/g_s.sql
Enter password:
domU-12-31-39-03-B9-D6:/vol# df -h .
Filesystem Size Used Avail Use% Mounted on
/dev/sdh 1014M 134M 881M 14% /vol
domU-12-31-39-03-B9-D6:/vol# mysql -u root -p spamassassin
Enter password:
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 30
Server version: 5.0.51a-24 (Debian)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> select count(*) from bayes_token;
+----------+
| count(*) |
+----------+
| 187301 |
+----------+
1 row in set (0.00 sec)
mysql> Bye
And just to come full-circle, let's create an XFS filesystem on EBS using log version 2, and cause a kernel panic:
domU-12-31-39-03-B9-D6:/vol# cd ..
domU-12-31-39-03-B9-D6:/# invoke-rc.d mysql stop
Stopping MySQL database server: mysqld.
domU-12-31-39-03-B9-D6:/# umount /vol
domU-12-31-39-03-B9-D6:/# mkfs.xfs -f /dev/sdh # Make a toxic xfs filesystem
meta-data=/dev/sdh               isize=256    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
domU-12-31-39-03-B9-D6:/# mount /dev/sdh /vol
domU-12-31-39-03-B9-D6:/# touch /vol/NO_CARRIER
And the instance dies, as many others have reported.
Hopefully this was able to help somebody. Please do your own testing and let me know if this works for you.
* According to the manpage for mkfs.xfs, using log version 2 gives a performance improvement if you are using RAID5, so you give up that benefit if you use this technique. On the other hand, if you were happy with XFS performance under etch (and thus using log version 1), then you'll probably be OK with log version 1 under lenny. All of my tests were performed using ami-115db978 (a.k.a. Eric Hammond's top-notch Alestic Debian 5.0 base AMI).
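For anyone who wants to repeat the basic test, the sequence from this post can be wrapped in a small script. The following is only a sketch under stated assumptions: `run` and `xfs_smoke_test` are hypothetical helper names, the device and mount point are placeholders, and by default it only prints the commands instead of running them. Setting DRY_RUN=0 runs them for real, which destroys whatever is on the device.

```shell
#!/bin/sh
# Sketch of the log-version-1 smoke test from this post.
# run and xfs_smoke_test are hypothetical helper names.

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Create a log-version-1 XFS filesystem, mount it, and do the first write.
# With log version 2, that first write (touch) was what triggered the panic.
xfs_smoke_test() {
    device=$1        # placeholder, e.g. /dev/sdh (the EBS volume in the post)
    mountpoint=$2    # placeholder, e.g. /vol
    run mkfs.xfs -l version=1 "$device" &&
        run mkdir -p "$mountpoint" &&
        run mount "$device" "$mountpoint" &&
        run touch "$mountpoint/file" &&
        run rm "$mountpoint/file"
}
```

Running `xfs_smoke_test /dev/sdh /vol` prints the five commands for review; `DRY_RUN=0 xfs_smoke_test /dev/sdh /vol` executes them as root.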
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Feb 12, 2009 5:41 PM PST
in response to: Lenny
Greetings,
We have been working to resolve this issue and believe the root cause is an incompatibility between the options used to compile the FC8 kernel and the XFS kernel module. We have recompiled the xfs module and have found this to work in preliminary testing with a few selected AMIs which are based on the AKI aki-a71cf9ce. This rebuilt XFS kernel module supports both Version 1 and Version 2 log. We have tested this only on the Version 2 log, however.
To obtain this copy of XFS and install it, first download the following script from S3:
https://s3.amazonaws.com:443/public-linux.2.6.21.7-2.fc8xen-xfs/download_xfs.sh
Executing the script will create an xfs subdirectory under $HOME, then download and extract two tarballs, which contain the xfs kernel module and the xfsprogs.
This is what you should see when running ./download_xfs.sh:
./download_xfs.sh
./download_xfs.sh: downloading index (for your reference) ....
./download_xfs.sh: downloading xfsprogs (may take quite a while, please wait) ....
./download_xfs.sh: MD5 checksum for xfsprogs-2.10.2-fc8-build.tar.gz ok: 3d47985b54a67782492c1bdbcabc2171
./download_xfs.sh: downloading linux_fs (may take quite a while, please wait) ....
./download_xfs.sh: MD5 checksum for linux-fs-modules-build-2.6.21.7-2.fc8xen-xfs.tar.gz ok: 70c424e98acdc9b323f3ee6f4a38a26c
./download_xfs.sh: unpacking xfsprogs tarball ...
./download_xfs.sh: unpacking linux_fs tarball ...
./download_xfs.sh: xfs.ko kernel module is at /root/xfs/linux-2.6.21.i386/fs/xfs/xfs.ko
-rw-r--r-- 1 root root 9351967 2009-02-03 18:53 /root/xfs/linux-2.6.21.i386/fs/xfs/xfs.ko
./download_xfs.sh: mkfs.xfs utility is at /root/xfs/xfsprogs-2.10.2/mkfs/mkfs.xfs
-rwxr-xr-x 1 root root 1194124 2009-02-03 19:47 /root/xfs/xfsprogs-2.10.2/mkfs/mkfs.xfs
./download_xfs.sh: xfs_repair utility is at /root/xfs/xfsprogs-2.10.2/repair/xfs_repair
-rwxr-xr-x 1 root root 1984378 2009-02-03 19:47 /root/xfs/xfsprogs-2.10.2/repair/xfs_repair
To use the rebuilt tools, do an insmod of $HOME/xfs/linux-2.6.21.i386/fs/xfs/xfs.ko and use the mkfs.xfs and xfs_repair tools listed by the download output.
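Since the module tarball is named after a specific kernel build, it may be worth guarding the insmod with a check on the running kernel. The following is a sketch only: the expected release string is an assumption taken from the tarball name in the download output, and `kernel_matches` and `load_rebuilt_xfs` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: load the rebuilt xfs.ko only when the running kernel matches the
# release the module tarball is named after.
# EXPECTED_KERNEL is an assumption taken from the tarball name
# (linux-fs-modules-build-2.6.21.7-2.fc8xen-xfs.tar.gz).
EXPECTED_KERNEL="2.6.21.7-2.fc8xen"
XFS_MODULE="$HOME/xfs/linux-2.6.21.i386/fs/xfs/xfs.ko"

# $1: a kernel release string such as the output of `uname -r`
kernel_matches() {
    [ "$1" = "$EXPECTED_KERNEL" ]
}

load_rebuilt_xfs() {
    if ! kernel_matches "$(uname -r)"; then
        echo "running kernel $(uname -r) is not $EXPECTED_KERNEL; not loading" >&2
        return 1
    fi
    # If an xfs module is already loaded it may be the stock one, so stop
    # and let the operator decide whether to rmmod it first.
    if grep -q '^xfs ' /proc/modules; then
        echo "an xfs module is already loaded; remove it first (rmmod xfs)" >&2
        return 1
    fi
    insmod "$XFS_MODULE"
}
```

After `load_rebuilt_xfs` succeeds (as root), the rebuilt mkfs.xfs and xfs_repair can be called by the full paths shown in the download output.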
We would like to hear your feedback on whether or not this fixes your issues with XFS.
Best Regards,
--Fio (Fio@AWS)
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Feb 12, 2009 5:55 PM PST
in response to: fio@AWS
Yay! Two solutions!
For the record on this thread, I'll mention that the new official Ubuntu 2.6.27 kernel currently in beta is going to be a third solution for this problem (at least for Ubuntu Intrepid; I haven't tested it against Debian Lenny yet).
You can sign up for the official Ubuntu beta here:
http://ubuntu.com/ec2
--
Eric Hammond
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Feb 20, 2009 7:13 PM PST
in response to: fio@AWS
I want to do CentOS with XFS over EBS. I read with concern the article referenced by wolfgang127us and his post at:
http://developer.amazonwebservices.com/connect/thread.jspa?messageID=117837&tstart=0
I gather your download_xfs.sh solution only applies to FC8 and not to a CentOS AMI?
Is there really a problem with XFS on CentOS, and if so, is there a parallel solution to this one or should I just try it with log version 1?
Thanks,
Tim
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Feb 20, 2009 8:36 PM PST
in response to: Tim Gardner
Tim:
The problem may only appear on some CentOS versions (as it does on only some Debian and some Ubuntu, mostly the newest ones).
The download solution from Amazon worked when I tested it on Debian Lenny and Ubuntu Intrepid. I imagine it's worth trying if you run into problems on CentOS and if "mkfs.xfs -l version=1" doesn't work for you.
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 17, 2009 12:11 PM PDT
in response to: fio@AWS
Hey there - it looks like I had this problem with a Hardy AMI. I'm now using Hardy ami-71fd1a18, so will this fix work on that AMI as well? Is there, or when will there be, an Ubuntu AMI that includes this fix? I guess I'm asking for an update on the status of this problem, because it sounds like you guys only 'thought' it was fixed as of Feb 12, 2009. Does this actually fix it? And can this problem also occur with instance storage?
Thanks!
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 18, 2009 9:08 PM PDT
in response to: fio@AWS
Fio,
Can you please provide details on how the xfs module was recompiled? I am currently maintaining my own set of kernel modules for Lenny, and need to update my scripts.
Also, does this same issue (and fix) apply to the 64-bit architecture?
Thanks,
Matt
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 19, 2009 12:12 AM PDT
in response to: fio@AWS
I'm also curious to learn why the xfsprogs were recompiled, and why we should use the updated mkfs.xfs and xfs_repair. Should we also update the other components of xfsprogs, or just these two, and if so, why only these two?
Any insight you can provide would be appreciated.
Thanks,
Matt
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 19, 2009 12:31 AM PDT
in response to: Matt Olson
I just tested usage of mkfs.xfs from the standard Debian Lenny xfsprogs package in combination with the kernel module supplied by Fio@AWS, and everything seems to work. So it appears that perhaps all we need is the kernel module?
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 19, 2009 11:34 AM PDT
in response to: Matt Olson
Fio is out this week; he will respond when he returns.
Andrew
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 19, 2009 6:24 PM PDT
in response to: AndrewC@AWS
OK, thanks for the update, Andrew.
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 19, 2009 6:34 PM PDT
in response to: Matt Olson
Matthew:
When Fio originally came up with this solution he told me he had only tested with both the recompiled kernel module and the recompiled tools. He stated that he wasn't sure if they were both needed.
Based on your tests it sounds like the recompiled kernel module works with the normal xfsprogs tools.
Given this information, I think it would be a good idea to include the recompiled xfs kernel module in the Debian Lenny and Ubuntu Intrepid AMIs I release on http://alestic.com as soon as I get a free weekend to build it and test.
When the official Ubuntu kernels are released, this workaround would no longer be needed on Intrepid and hopefully they would work with Lenny as well.
--
Eric Hammond
http://twitter.com/esh
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 19, 2009 10:21 PM PDT
in response to: Eric Hammond
Eric,
Thanks for the info. I only tested file system creation with mkfs.xfs, but haven't had occasion to use any of the other tools. So once Fio gets back, it will be helpful to know what the reason was for recompiling the tools. If there is no strong evidence that this is required, I'd be inclined to stick with the stable package distributed with Debian.
As for the kernel module, I agree it would be good to bundle the recompiled module in the Alestic images. It seems to work for me.
I too am using custom built modules for my own images (based on your instructions at http://groups.google.com/group/ec2ubuntu/web/compiling-2-6-21-kernel-modules-from-source-for-amazon-ec2) so it seems that we both need to know how they were compiled. Alternatively, I suppose you could just bundle the binary that Fio compiled, but I'd like to be able to compile from source. Plus, I'm just curious for more information about this bug.
--Matt
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 20, 2009 6:46 PM PDT
in response to: fio@AWS
Even with Fio's *workaround* we noticed some hangs after instance reboot - insmod - mount. Oddly enough, the system didn't hang instantly but only after 2-5 minutes, and, unlike with the original xfs module, I was able to reboot it using ec2 API calls. I was able to reproduce this 2x but no more. I may be wrong. You may wish to take a look at it after reboot and waiting for some minutes.
Otherwise it seems to work fine. I'm curious as well what you have changed in the xfs module! Thanks!
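One way to look for a delayed hang like this is to keep writing small timestamped records to the volume after the reboot, insmod, and mount, and watch for the point where they stop. This is a sketch only, not something that was run in this thread: `heartbeat` is a hypothetical helper name, and the directory, count, and interval are placeholders.

```shell
#!/bin/sh
# Sketch: append a timestamp to a log on the XFS volume at a fixed interval,
# so a delayed hang shows up as the point where the beats stop.
# heartbeat is a hypothetical helper name.
heartbeat() {
    dir=$1        # directory on the XFS volume, e.g. /vol
    count=$2      # number of beats to write
    interval=$3   # seconds to wait between beats
    i=0
    while [ "$i" -lt "$count" ]; do
        stamp=$(date '+%Y-%m-%d %H:%M:%S')
        echo "$stamp" >> "$dir/heartbeat.log" || return 1
        sync    # push the write out to the device rather than leaving it cached
        i=$((i + 1))
        echo "beat $i at $stamp"
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

For example, `heartbeat /vol 60 10` writes one record every 10 seconds for about 10 minutes, which covers the 2-5 minute window described above.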
Re: Debian Lenny and XFS causes kernel panic: Suggested Workaround
Posted: Mar 20, 2009 11:25 PM PDT
in response to: Norbert Mocsnik
I'm pretty sure that Fio didn't do anything but get the FC8 version of the sources and compile them. He can verify.
I am curious what you meant by emphasizing workaround. I infer that you are not satisfied that this solution is addressing the source of the problem. Can you help me understand more what you were looking for instead?
Andrew