XFS: "Internal error XFS_WANT_CORRUPTED_GOTO"
Posted: Thu Mar 19, 2015 2:09 pm
[PROBLEM]
Our GlusterFS share had filled up almost completely when I suddenly got I/O errors on directory listings, with some of the folders on the bricks not showing up.
Checking /var/log/messages gave hints that XFS was having trouble:
Code: Select all
Mar 19 04:39:15 vm-gluster-1 kernel: XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 1327 of file fs/xfs/xfs_alloc.c. Caller 0xffffffffa014404d
Mar 19 04:39:15 vm-gluster-1 kernel:
Mar 19 04:39:15 vm-gluster-1 kernel: Pid: 615, comm: xfsalloc/1 Not tainted 2.6.32-431.5.1.el6.x86_64 #1
Mar 19 04:39:15 vm-gluster-1 kernel: Call Trace:
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa016ce5f>] ? xfs_error_report+0x3f/0x50 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa014404d>] ? xfs_alloc_ag_vextent+0xad/0x100 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa0141679>] ? xfs_alloc_lookup_eq+0x19/0x20 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa01431df>] ? xfs_alloc_ag_vextent_size+0x38f/0x630 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa014404d>] ? xfs_alloc_ag_vextent+0xad/0x100 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa0144a9c>] ? xfs_alloc_vextent+0x2bc/0x610 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa014f2c8>] ? xfs_bmap_btalloc+0x398/0x700 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa014f63e>] ? xfs_bmap_alloc+0xe/0x10 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa0155622>] ? __xfs_bmapi_allocate+0x102/0x340 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff81527c20>] ? thread_return+0x4e/0x76e
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa01559a0>] ? xfs_bmapi_allocate_worker+0x70/0xa0 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffffa0155930>] ? xfs_bmapi_allocate_worker+0x0/0xa0 [xfs]
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff81094d10>] ? worker_thread+0x170/0x2a0
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff8109b290>] ? autoremove_wake_function+0x0/0x40
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff81094ba0>] ? worker_thread+0x0/0x2a0
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff8109aee6>] ? kthread+0x96/0xa0
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff8109ae50>] ? kthread+0x0/0xa0
Mar 19 04:39:15 vm-gluster-1 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Mar 19 04:39:15 vm-gluster-1 kernel: XFS (vdd): Internal error xfs_trans_cancel at line 1948 of file fs/xfs/xfs_trans.c. Caller 0xffffffffa0178c5d
This only happened on the one of three identical software RAIDs that was completely full (64 kB free). Coincidence, or does XFS run into problems when it runs out of space?
These are storage partitions, so I did not reserve any additional space for "root" (as is good practice on system disks to avoid exactly such issues).
I'm not too happy with this.
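A quick way to see how close a brick is to this condition is to check both free space and free inodes directly on the brick mount point (the brick path below is a placeholder, not from the original setup):

```shell
# Check remaining space on a brick mount point (placeholder path).
df -h /bricks/brick1

# XFS can also run out of inodes before it runs out of blocks,
# so check those too.
df -i /bricks/brick1
```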

[SOLUTION]
1) Unmount the Gluster share, then stop the Gluster volume:
Code: Select all
$ umount $YOUR_GLUSTER_MOUNTPOINT
$ gluster volume stop $VOLUME_NAME
2) Unmount the affected XFS brick and run "xfs_repair":
First, I ran xfs_repair in no-modify mode ("-n"), so I could see what it would do:
Code: Select all
$ xfs_repair -n /dev/sdX
It listed some inodes/files that had come in while the disk was running out of space; some were clearly recognizable as rsync temp files.
After checking the output of the dry run, I got the following message when running xfs_repair without "-n":
Code: Select all
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
I mounted the XFS brick and unmounted it again, as described in the message, and then re-ran the same xfs_repair call:
Code: Select all
$ xfs_repair /dev/sdX
3) Mount the repaired XFS brick, then restart the Gluster volume and remount the share.
That's it. Seems to have worked.
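A sketch of step 3, using the same placeholder variables as in step 1 (the device, brick path and server name below are examples, not taken from this setup):

```shell
# Remount the repaired XFS brick (device and path are placeholders).
mount /dev/sdX /bricks/brick1

# Start the Gluster volume again.
gluster volume start $VOLUME_NAME

# Remount the Gluster share on the client side.
mount -t glusterfs $SERVER:/$VOLUME_NAME $YOUR_GLUSTER_MOUNTPOINT
```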

How to prevent this in the future?
It seems that "cluster.min-free-disk" only imposes a soft limit that logs a warning, rather than preventing data from being written to the bricks.
XFS doesn't support reserved blocks for the superuser (as EXT3/4 do), so a viable option might be to implement user-based "quota" rules?
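One possibility along those lines, untested here, is GlusterFS's own quota feature, which enforces a limit at the volume level so the underlying XFS filesystems never actually fill up (the volume name and limit below are examples):

```shell
# Enable the quota feature on the volume (example volume name).
gluster volume quota myvolume enable

# Cap usage on the volume root somewhat below the bricks' raw capacity,
# leaving XFS headroom for its allocator and metadata.
gluster volume quota myvolume limit-usage / 900GB
```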
Links: