I've got a strange issue with a diskless clusterhat setup: the diskless blades freeze at random. The kernel keeps running, and so do all running processes, but any read of a file that is not already cached hangs. Sometimes the read recovers, sometimes it does not.
No additional software is installed (yet), only what ships on the images.
The blades are diskless; no SD cards are used. No NFS errors are logged. The issue occurs at random, anywhere from 1 minute to 30 minutes of uptime.
I would be very thankful for any tips. Kernel log:
[ 870.119809] mmc0: timeout waiting for hardware interrupt.
[ 870.119833] [2a99a4c3] CMD 37 0
[ 870.119842] [2a99a4c3] REQ> d8963d38 0
[ 870.119848] [2a99a4cf] TSK< d8963d38 0
[ 870.119853] [2a99a4d6] TSK> d8963d38 0
[ 870.119859] [2a99a4e9] REQ< d8963d38 10800
[ 870.119864] [2a99a4ea] CMD< 37 0
[ 870.119869] [2a99a4ef] FCM< d8963d38 d8963d98
[ 870.119874] [2a99aafc] CMD 37 0
[ 870.119879] [2a99aafc] REQ> d8963d38 0
[ 870.119884] [2a99ab09] TSK< d8963d38 0
[ 870.119890] [2a99ab0f] TSK> d8963d38 0
[ 870.119895] [2a99ab1e] IOS< 30d40 0
[ 870.119901] [2a99ab25] REQ< d8963e00 10800
[ 870.119906] [2a99ab26] CMD< 1 0
[ 870.119911] [2a99ab2a] FCM< d8963e00 d8963e60
[ 870.119916] [2a99b139] CMD 1 0
[ 870.119921] [2a99b13a] REQ> d8963e00 0
[ 870.119927] [2a99b146] TSK< d8963e00 0
[ 870.119932] [2a99b14e] TSK> d8963e00 0
[ 870.119937] [2a99b15e] IOS< 0 0
[ 870.119942] [2a99b87a] IOS< 0 0
[ 870.119947] [2a99eb82] IOS< 1dd11 0
[ 870.119952] [2a9a1eac] RST< 0 0
[ 870.119957] [2a9a6cd9] REQ< d8963e08 10800
[ 870.119963] [2a9a6cda] CMD< 34 c00
[ 870.119968] [2a9a6ce1] FCM< d8963e08 d8963e68
[ 870.119973] [2a9a76aa] CMD 34 0
[ 870.119978] [2a9a76ab] REQ> d8963e08 0
[ 870.119984] [2a9a76bb] TSK< d8963e08 0
[ 870.119989] [2a9a76c4] TSK> d8963e08 0
[ 870.119994] [2a9a76db] REQ< d8963e08 10800
[ 870.119999] [2a9a76dc] CMD< 34 80000c08
[ 870.120005] [2a9a76e3] FCM< d8963e08 d8963e68
[ 870.120010] [2a9a80ca] CMD 34 0
[ 870.120015] [2a9a80cb] REQ> d8963e08 0
[ 870.120020] [2a9a80d7] TSK< d8963e08 0
[ 870.120026] [2a9a80de] TSK> d8963e08 0
[ 870.120031] [2a9a80ed] IOS< 1dd11 0
[ 870.120036] [2a9a880d] REQ< d8963e30 10800
[ 870.120041] [2a9a880e] CMD< 0 0
[ 870.120047] [2a9a8815] FCM< d8963e30 d8963e90
[ 870.120052] [2a9a8a23] FCM> d8963e30 0
[ 870.120057] [2a9a8a24] CMD 0 0
[ 870.120062] [2a9a8a24] REQ> d8963e30 0
[ 870.120067] [2a9a8a34] TSK< d8963e30 0
[ 870.120072] [2a9a8a3c] TSK> d8963e30 0
[ 870.120077] [2a9a8f4d] IOS< 1dd11 0
[ 870.120083] [2a9a9452] REQ< d8963e30 10800
[ 870.120088] [2a9a9452] CMD< 8 1aa
[ 870.120094] [2a9a9457] FCM< d8963e30 d8963e90
[ 870.120099] [2a9a9e3f] CMD 8 0
[ 870.120104] [2a9a9e40] REQ> d8963e30 0
[ 870.120109] [2a9a9e4e] TSK< d8963e30 0
[ 870.120114] [2a9a9e55] TSK> d8963e30 0
[ 870.120120] [2a9a9e6a] REQ< d8963dd8 10800
[ 870.120125] [2a9a9e6b] CMD< 5 0
[ 870.120130] [2a9a9e70] FCM< d8963dd8 d8963e38
[ 870.120135] [2a9aa85c] CMD 5 0
[ 870.120140] [2a9aa85c] REQ> d8963dd8 0
[ 870.120145] [2a9aa86a] TSK< d8963dd8 0
[ 870.120151] [2a9aa86e] TSK> d8963dd8 0
[ 870.120157] [2a9aa87c] REQ< d8963dd8 10800
[ 870.120162] [2a9aa87d] CMD< 5 0
[ 870.120167] [2a9aa882] FCM< d8963dd8 d8963e38
[ 870.120172] [2a9ab258] CMD 5 0
[ 870.120178] [2a9ab258] REQ> d8963dd8 0
[ 870.120183] [2a9ab266] TSK< d8963dd8 0
[ 870.120188] [2a9ab26c] TSK> d8963dd8 0
[ 870.120193] [2a9ab27b] REQ< d8963dd8 10800
[ 870.120198] [2a9ab27c] CMD< 5 0
[ 870.120203] [2a9ab281] FCM< d8963dd8 d8963e38
[ 870.120208] [2a9abc69] CMD 5 0
[ 870.120213] [2a9abc69] REQ> d8963dd8 0
[ 870.120219] [2a9abc74] TSK< d8963dd8 0
[ 870.120224] [2a9abc78] TSK> d8963dd8 0
[ 870.120230] [2a9abc84] REQ< d8963dd8 10800
[ 870.120235] [2a9abc85] CMD< 5 0
[ 870.120240] [2a9abc89] FCM< d8963dd8 d8963e38
[ 870.120245] [2a9ac64d] CMD 5 0
[ 870.120250] [2a9ac64d] REQ> d8963dd8 0
[ 870.120255] [2a9ac657] TSK< d8963dd8 0
[ 870.120261] [2a9ac661] TSK> d8963dd8 0
[ 870.120266] [2a9ac676] REQ< d8963d38 10800
[ 870.120271] [2a9ac677] CMD< 37 0
[ 870.120276] [2a9ac67b] FCM< d8963d38 d8963d98
[ 870.120282] [2a9ad060] CMD 37 0
[ 870.120287] [2a9ad060] REQ> d8963d38 0
[ 870.120292] [2a9ad06c] TSK< d8963d38 0
[ 870.120297] [2a9ad074] TSK> d8963d38 0
[ 870.120303] [2a9ad085] REQ< d8963d38 10800
[ 870.120308] [2a9ad086] CMD< 37 0
[ 870.120313] [2a9ad08a] FCM< d8963d38 d8963d98
[ 870.120318] [2a9ada61] CMD 37 0
[ 870.120324] [2a9ada62] REQ> d8963d38 0
[ 870.120329] [2a9ada6d] TSK< d8963d38 0
[ 870.120334] [2a9ada73] TSK> d8963d38 0
[ 870.120339] [2a9ada84] REQ< d8963d38 10800
[ 870.120344] [2a9ada84] CMD< 37 0
[ 870.120350] [2a9ada88] FCM< d8963d38 d8963d98
[ 870.120355] [2a9ae454] CMD 37 0
[ 870.120360] [2a9ae455] REQ> d8963d38 0
[ 870.120365] [2a9ae460] TSK< d8963d38 0
[ 870.120370] [2a9ae466] TSK> d8963d38 0
[ 870.120376] [2a9ae477] REQ< d8963d38 10800
[ 870.120381] [2a9ae478] CMD< 37 0
[ 870.120386] [2a9ae47c] FCM< d8963d38 d8963d98
[ 870.120391] [2a9aee86] CMD 37 0
[ 870.120397] [2a9aee86] REQ> d8963d38 0
[ 870.120402] [2a9aee91] TSK< d8963d38 0
[ 870.120407] [2a9aee98] TSK> d8963d38 0
[ 870.120412] [2a9aeea5] IOS< 1dd11 0
[ 870.120418] [2a9aeeac] REQ< d8963e00 10800
[ 870.120422] [2a9aeead] CMD< 1 0
[ 870.120428] [2a9aeeb1] FCM< d8963e00 d8963e60
[ 870.120433] [2a9af88d] CMD 1 0
[ 870.120438] [2a9af88d] REQ> d8963e00 0
[ 870.120443] [2a9af898] TSK< d8963e00 0
[ 870.120449] [2a9af89e] TSK> d8963e00 0
[ 870.120453] [2a9af8ae] IOS< 0 0
[ 870.120459] [2aab4968] IOS< 0 0
[ 870.120463] [2aab7c6a] IOS< 61a80 0
[ 870.120469] [2aabaf78] RST< 0 0
[ 870.120474] [2aabfda2] REQ< d8963e08 10800
[ 870.120480] [2aabfda4] CMD< 34 c00
[ 870.120485] [2aabfdab] FCM< d8963e08 d8963e68
[ 870.120490] [2aac0772] CMD 34 0
[ 870.120495] [2aac0773] REQ> d8963e08 0
[ 870.120501] [2aac0785] TSK< d8963e08 0
[ 870.120506] [2aac078e] TSK> d8963e08 0
[ 870.120511] [2aac07a6] REQ< d8963e08 10800
[ 870.120516] [2aac07a7] CMD< 34 80000c08
[ 870.120522] [2aac07ac] FCM< d8963e08 d8963e68
[ 870.120527] [2aac118a] CMD 34 0
[ 870.120532] [2aac118a] REQ> d8963e08 0
[ 870.120538] [2aac119a] TSK< d8963e08 0
[ 870.120543] [2aac11a2] TSK> d8963e08 0
[ 870.120548] [2aac11b1] IOS< 61a80 0
[ 870.120553] [2aac18d3] REQ< d8963e30 10800
[ 870.120558] [2aac18d4] CMD< 0 0
[ 870.120564] [2aac18db] FCM< d8963e30 d8963e90
[ 870.120569] [2aac1978] FCM> d8963e30 0
[ 870.120574] [2aac1978] CMD 0 0
[ 870.120579] [2aac1979] REQ> d8963e30 0
[ 870.120585] [2aac1989] TSK< d8963e30 0
[ 870.120590] [2aac1990] TSK> d8963e30 0
[ 870.120595] [2aac1eaa] IOS< 61a80 0
[ 870.120600] [2aac23bc] REQ< d8963e30 10800
[ 870.120606] [2aac23bc] CMD< 8 1aa
[ 870.120611] [2aac23c1] FCM< d8963e30 d8963e90
[ 870.120616] [2aac2714] CMD 8 0
[ 870.120621] [2aac2714] REQ> d8963e30 0
[ 870.120626] [2aac2721] TSK< d8963e30 0
[ 870.120631] [2aac2729] TSK> d8963e30 0
[ 870.120637] [2aac273d] REQ< d8963dd8 10800
[ 870.120642] [2aac273e] CMD< 5 0
[ 870.120647] [2aac2743] FCM< d8963dd8 d8963e38
[ 870.120652] [2aac2a43] CMD 5 0
[ 870.120657] [2aac2a43] REQ> d8963dd8 0
[ 870.120663] [2aac2a4f] TSK< d8963dd8 0
[ 870.120669] [2aac2a54] TSK> d8963dd8 0
[ 870.120674] [2aac2a63] REQ< d8963dd8 10800
[ 870.120679] [2aac2a64] CMD< 5 0
[ 870.120684] [2aac2a68] FCM< d8963dd8 d8963e38
[ 870.120689] [2aac2d92] CMD 5 0
[ 870.120694] [2aac2d92] REQ> d8963dd8 0
[ 870.120700] [2aac2d9f] TSK< d8963dd8 0
[ 870.120705] [2aac2da4] TSK> d8963dd8 0
[ 870.120710] [2aac2db0] REQ< d8963dd8 10800
[ 870.120715] [2aac2db1] CMD< 5 0
[ 870.120721] [2aac2db5] FCM< d8963dd8 d8963e38
[ 870.120726] [2aac30de] CMD 5 0
[ 870.120731] [2aac30de] REQ> d8963dd8 0
[ 870.120736] [2aac30ea] TSK< d8963dd8 0
[ 870.120741] [2aac30ef] TSK> d8963dd8 0
[ 870.120746] [2aac30fc] REQ< d8963dd8 10800
[ 870.120751] [2aac30fd] CMD< 5 0
[ 870.120757] [2aac3101] FCM< d8963dd8 d8963e38
[ 870.120762] [2aac340b] CMD 5 0
[ 870.120767] [2aac340c] REQ> d8963dd8 0
[ 870.120772] [2aac3419] TSK< d8963dd8 0
[ 870.120777] [2aac3421] TSK> d8963dd8 0
[ 870.120783] [2aac3438] REQ< d8963d38 10800
[ 870.120788] [2aac3438] CMD< 37 0
[ 870.120794] [2aac343d] FCM< d8963d38 d8963d98
[ 870.120799] [2aac3761] CMD 37 0
[ 870.120804] [2aac3761] REQ> d8963d38 0
[ 870.120809] [2aac376e] TSK< d8963d38 0
[ 870.120814] [2aac3775] TSK> d8963d38 0
[ 870.120820] [2aac3789] REQ< d8963d38 10800
[ 870.120825] [2aac3789] CMD< 37 0
[ 870.120830] [2aac378e] FCM< d8963d38 d8963d98
[ 870.120835] [2aac3aac] CMD 37 0
[ 870.120840] [2aac3aac] REQ> d8963d38 0
[ 870.120845] [2aac3ab7] TSK< d8963d38 0
[ 870.120851] [2aac3abe] TSK> d8963d38 0
[ 870.120856] [2aac3ad0] REQ< d8963d38 10800
[ 870.120861] [2aac3ad1] CMD< 37 0
[ 870.120867] [2aac3ad6] FCM< d8963d38 d8963d98
[ 870.120872] [2aac3df7] CMD 37 0
[ 870.120877] [2aac3df7] REQ> d8963d38 0
[ 870.120882] [2aac3e03] TSK< d8963d38 0
[ 870.120887] [2aac3e0a] TSK> d8963d38 0
[ 870.120893] [2aac3e1d] REQ< d8963d38 10800
[ 870.120898] [2aac3e1d] CMD< 37 0
[ 870.120903] [2aac3e22] FCM< d8963d38 d8963d98
[ 870.120908] [2aac4140] CMD 37 0
[ 870.120913] [2aac4140] REQ> d8963d38 0
[ 870.120919] [2aac414c] TSK< d8963d38 0
[ 870.120924] [2aac4153] TSK> d8963d38 0
[ 870.120930] [2aac4162] IOS< 61a80 0
[ 870.120935] [2aac4169] REQ< d8963e00 10800
[ 870.120940] [2aac416a] CMD< 1 0
[ 870.120945] [2aac416f] FCM< d8963e00 d8963e60
[ 870.120950] [2aac448f] CMD 1 0
[ 870.120955] [2aac448f] REQ> d8963e00 0
[ 870.120961] [2aac449b] TSK< d8963e00 0
[ 870.120966] [2aac44a2] TSK> d8963e00 0
[ 870.120971] [2aac44b2] IOS< 0 0
[ 870.120976] [2aac4bca] IOS< 0 0
[ 870.120982] [2aac7f01] IOS< 493e0 0
[ 870.120987] [2aaca8f7] RST< 0 0
[ 870.120992] [2aacf724] REQ< d8963e08 10800
[ 870.120997] [2aacf725] CMD< 34 c00
[ 870.121002] [2aacf72a] FCM< d8963e08 d8963e68
[ 870.121007] [2aad00ff] CMD 34 0
[ 870.121013] [2aad00ff] REQ> d8963e08 0
[ 870.121018] [2aad010f] TSK< d8963e08 0
[ 870.121023] [2aad0119] TSK> d8963e08 0
[ 870.121028] [2aad012c] REQ< d8963e08 10800
[ 870.121034] [2aad012d] CMD< 34 80000c08
[ 870.121039] [2aad0132] FCM< d8963e08 d8963e68
[ 870.121044] [2aad0b12] CMD 34 0
[ 870.121050] [2aad0b13] REQ> d8963e08 0
[ 870.121055] [2aad0b1f] TSK< d8963e08 0
[ 870.121060] [2aad0b27] TSK> d8963e08 0
[ 870.121065] [2aad0b35] IOS< 493e0 0
[ 870.121070] [2aad123c] REQ< d8963e30 10800
[ 870.121076] [2aad123d] CMD< 0 0
[ 870.121081] [2aad1245] FCM< d8963e30 d8963e90
[ 870.121086] [2aad1321] FCM> d8963e30 0
[ 870.121091] [2aad1322] CMD 0 0
[ 870.121096] [2aad1322] REQ> d8963e30 0
[ 870.121101] [2aad132f] TSK< d8963e30 0
[ 870.121107] [2aad1337] TSK> d8963e30 0
[ 870.121112] [2aad184a] IOS< 493e0 0
[ 870.121118] [2aad1d52] REQ< d8963e30 10800
[ 870.121123] [2aad1d53] CMD< 8 1aa
[ 870.121128] [2aad1d58] FCM< d8963e30 d8963e90
[ 870.121133] [2aad2164] CMD 8 0
[ 870.121138] [2aad2165] REQ> d8963e30 0
[ 870.121143] [2aad2175] TSK< d8963e30 0
[ 870.121149] [2aad217f] TSK> d8963e30 0
[ 870.121154] [2aad2194] REQ< d8963dd8 10800
[ 870.121159] [2aad2196] CMD< 5 0
[ 870.121164] [2aad219b] FCM< d8963dd8 d8963e38
[ 870.121170] [37844c6b] TIM< 0 0
[ 870.121183] mmc0:>cmd op 5 arg 0x0 flags 0x2e1 - resp 00000000 00000000 00000000 00000000, err 0
[ 870.121187] mmc0: =========== REGISTER DUMP ===========
[ 870.121192] mmc0: SDCMD 0x00004005
[ 870.121195] mmc0: SDARG 0x00000000
[ 870.121199] mmc0: SDTOUT 0x00024978
[ 870.121203] mmc0: SDCDIV 0x00000340
[ 870.121207] mmc0: SDRSP0 0xffffffff
[ 870.121211] mmc0: SDRSP1 0x0000ff7f
[ 870.121214] mmc0: SDRSP2 0xffffffff
[ 870.121218] mmc0: SDRSP3 0xffffffff
[ 870.121222] mmc0: SDHSTS 0x00000040
[ 870.121226] mmc0: SDVDD 0x00000001
[ 870.121230] mmc0: SDEDM 0x00010800
[ 870.121235] mmc0: SDHCFG 0x0000040a
[ 870.121239] mmc0: SDHBCT 0x00000000
[ 870.121243] mmc0: SDHBLC 0x00000000
[ 870.121246] mmc0: ===========================================
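
In case it helps to compare register dumps between freezes, here is a minimal Python sketch that pulls the register values out of a log like the one above. The function name `parse_register_dump` and the regular expression are my own assumptions based on the format of this log, not part of any driver tooling, and the sample only repeats a few lines from the dump.

```python
import re

# Matches register lines such as "[ 870.121192] mmc0: SDCMD 0x00004005".
# Assumption: register names start with "SD" and values are printed as hex.
REGISTER_LINE = re.compile(r"mmc\d+:\s+(SD[A-Z0-9]+)\s+0x([0-9a-fA-F]+)\s*$")


def parse_register_dump(log_text):
    """Return {register_name: int_value} for every register line in log_text."""
    registers = {}
    for line in log_text.splitlines():
        match = REGISTER_LINE.search(line)
        if match:
            registers[match.group(1)] = int(match.group(2), 16)
    return registers


# A few lines copied from the dump above, used only as sample input.
SAMPLE = """\
[ 870.121187] mmc0: =========== REGISTER DUMP ===========
[ 870.121192] mmc0: SDCMD 0x00004005
[ 870.121207] mmc0: SDCDIV 0x00000340
[ 870.121222] mmc0: SDHSTS 0x00000040
[ 870.121246] mmc0: ===========================================
"""

if __name__ == "__main__":
    for name, value in parse_register_dump(SAMPLE).items():
        print(f"{name} = 0x{value:08x}")
```

Feeding it the saved output of `dmesg` from two different freezes gives two dictionaries that can be compared key by key.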