I'm having problems with a glusterfs share crashing when I try to copy files to certain subfolders.
Once the mount crashes, rsync, for example, gives the following error:
Transport endpoint is not connected (107)
In the log files for that volume in /var/log/glusterfs, I get the following messages:
gfid different on data file on dlp-storage-client-X
[...]
multiple subvolumes (dlp-storage-client-X and dlp-storage-client-Y) have file XXX
The detailed log looks like this:
[2015-01-27 00:24:11.553397] W [dht-common.c:1580:dht_lookup_linkfile_cbk] 0-dlp-storage-dht: /part2/video/12/00/12-00122_B03/HIRES/12-00122_b03.mpg: gfid different on data file on dlp-storage-client-0
[2015-01-27 00:24:11.554221] W [dht-common.c:1335:dht_lookup_everywhere_cbk] 0-dlp-storage-dht: /part2/video/12/00/12-00122_B03/HIRES/12-00122_b03.mpg: gfid differs on subvolume dlp-storage-client-0
[2015-01-27 00:24:11.554299] W [dht-common.c:1335:dht_lookup_everywhere_cbk] 0-dlp-storage-dht: /part2/video/12/00/12-00122_B03/HIRES/12-00122_b03.mpg: gfid differs on subvolume dlp-storage-client-2
[2015-01-27 00:24:11.554318] W [dht-common.c:1397:dht_lookup_everywhere_cbk] 0-dlp-storage-dht: multiple subvolumes (dlp-storage-client-0 and dlp-storage-client-2) have file /part2/video/12/00/12-00122_B03/HIRES/12-00122_b03.mpg (preferably rename the file in the backend, and do a fresh lookup)
[2015-01-27 00:24:11.554558] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 14805548: LOOKUP() /part2/video/12/00/12-00122_B03/HIRES/12-00122_b03.mpg => -1 (Stale NFS file handle)
[2015-01-27 00:24:11.559353] E [dht-helper.c:1240:dht_inode_ctx_get] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/xlator/cluster/distribute.so(dht_lookup_everywhere_done+0xa15) [0x7fb8150234c5] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/xlator/cluster/distribute.so(dht_layout_preset+0x59) [0x7fb815009879] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x34) [0x7fb81500bc44]))) 0-dlp-storage-dht: invalid argument: inode
[2015-01-27 00:24:11.559397] E [dht-helper.c:1259:dht_inode_ctx_set] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/xlator/cluster/distribute.so(dht_lookup_everywhere_done+0xa15) [0x7fb8150234c5] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/xlator/cluster/distribute.so(dht_layout_preset+0x59) [0x7fb815009879] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/xlator/cluster/distribute.so(dht_inode_ctx_layout_set+0x52) [0x7fb81500bc62]))) 0-dlp-storage-dht: invalid argument: inode
[2015-01-27 00:24:11.559419] W [fuse-bridge.c:397:fuse_entry_cbk] 0-glusterfs-fuse: Received NULL gfid for /part2/video/12/00/12-00122_B03/HIRES/12-00122_b03.mpg. Forcing EIO
[2015-01-27 00:24:11.559475] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 14805551: LOOKUP() /part2/video/12/00/12-00122_B03/HIRES/12-00122_b03.mpg => -1 (Input/output error)
pending frames:
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-01-27 00:24:11
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.4.6
/lib/x86_64-linux-gnu/libc.so.6(+0x321e0)[0x7fb8199371e0]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/xlator/protocol/client.so(client3_3_lookup_cbk+0x88)[0x7fb81526c948]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_handle_reply+0xa4)[0x7fb81a4cb1a4]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0xcd)[0x7fb81a4cb52d]
/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23)[0x7fb81a4c7bf3]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/rpc-transport/socket.so(+0x88b6)[0x7fb81651c8b6]
/usr/lib/x86_64-linux-gnu/glusterfs/3.4.6/rpc-transport/socket.so(+0xaf9c)[0x7fb81651ef9c]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x63b7a)[0x7fb81a736b7a]
/usr/sbin/glusterfs(main+0x3f5)[0x7fb81ab7efe5]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7fb819923ead]
/usr/sbin/glusterfs(+0x63b9)[0x7fb81ab7f3b9]
The log entry says clearly what is wrong:
multiple subvolumes (dlp-storage-client-X and dlp-storage-client-Y) have file XXX
It actually also mentions a possible solution:
preferably rename the file in the backend, and do a fresh lookup
The system is a 64bit Debian 7 (Wheezy), with glusterfs-client package 3.4.6-1 installed from standard repositories.
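Since the "multiple subvolumes" warning always has the same shape, the affected paths can be pulled out of the client log mechanically. The following is only a rough sketch in Python, not something taken from glusterfs itself: the function name find_duplicates is made up for illustration, and the pattern is based purely on the 3.4.6 log line quoted above, so other versions may word the message differently.

```python
import re

# Pattern derived from the warning quoted in this post (glusterfs 3.4.6):
# "multiple subvolumes (A and B) have file /path (preferably rename ...)"
DUPLICATE_RE = re.compile(
    r"multiple subvolumes \((?P<first>\S+) and (?P<second>\S+)\) "
    r"have file (?P<path>.+?) \(preferably rename"
)


def find_duplicates(log_lines):
    """Return {path: set of subvolume names} for every duplicate-file warning.

    find_duplicates is a hypothetical helper, not part of glusterfs.
    """
    found = {}
    for line in log_lines:
        match = DUPLICATE_RE.search(line)
        if match:
            subvolumes = found.setdefault(match.group("path"), set())
            subvolumes.add(match.group("first"))
            subvolumes.add(match.group("second"))
    return found
```

An open log file can be passed in directly, since iterating over a file object yields its lines. Note that this only finds files that have already been looked up through the mount, so it will probably not list every duplicate.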
In a previous glusterfs version (3.4.0), a file listing (ls) in this situation showed the identical filenames multiple times. In v3.4.6, the mount crashes as shown above.
In my case, I'm dealing with a lot of such duplicate files on bricks (*), scattered across several subfolders.
Renaming them manually is therefore very time-consuming, so I'm looking for a quicker, scriptable solution.
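As a starting point for scripting this, the duplicates can also be located directly on the bricks instead of waiting for them to show up in the log. The sketch below is an assumption-laden illustration, not a tested fix: the function name find_duplicate_files is hypothetical, the brick directories have to be reachable from one machine (or the script run per server and the results merged), and zero-byte files are skipped on the assumption that they are DHT link files rather than real data. It only reports candidates and does not rename or delete anything.

```python
import os
from collections import defaultdict


def find_duplicate_files(brick_roots):
    """Return {relative path: [brick roots]} for files found on more than one brick.

    find_duplicate_files is a hypothetical helper; brick_roots are the
    backend directories of the bricks, not the glusterfs mount point.
    """
    locations = defaultdict(list)
    for root in brick_roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Do not descend into gluster's internal .glusterfs directory.
            dirnames[:] = [d for d in dirnames if d != ".glusterfs"]
            for name in filenames:
                full = os.path.join(dirpath, name)
                if os.path.islink(full) or not os.path.isfile(full):
                    continue
                # Assumption: zero-byte entries are DHT link files, not data.
                if os.path.getsize(full) == 0:
                    continue
                locations[os.path.relpath(full, root)].append(root)
    # Keep only paths that hold data on at least two bricks.
    return {rel: bricks for rel, bricks in locations.items() if len(bricks) > 1}
```

The result could then be reviewed by hand before deciding which copy to rename in the backend, as the log message suggests.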
...to be continued.
(*) NOTE: The problem was very likely caused by manual interference on my side when importing data from a previous glusterfs test setup into a newly set up machine. So it's very likely not glusterfs's fault, but human error.