| TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 25/10/13 08:45 | Hi guys. I run a fresh CentOS 6.4 installation with TokuDB 7.0.3. The TokuDB installation was painless, and I managed to start the server with the engine enabled ("TokuDB | YES"). Next I shut down the server and moved the datadir to a new XFS partition:
mv /var/lib/mysql /data/mysql
cd /var/lib
ln -s /data/mysql mysql
Then I started the server:
tail -f /var/log/mysql/error.log
131025 16:28:26 [Note] Plugin 'FEEDBACK' is disabled.
131025 16:28:26 [ERROR] TokuDB: cant open rollback
131025 16:28:27 [ERROR] Plugin 'TokuDB' init function returned error.
131025 16:28:27 [ERROR] Plugin 'TokuDB' registration as a STORAGE ENGINE failed.
131025 16:28:27 [Note] Event Scheduler: Loaded 0 events
131025 16:28:27 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
Version: '5.5.30-tokudb-7.0.3-MariaDB' socket: '/tmp/mysql.sock' port: 3306 MariaDB Server
131025 16:28:44 [ERROR] TokuDB: cant open rollback
131025 16:28:44 [ERROR] Plugin 'TokuDB' init function returned error.
131025 16:28:44 [ERROR] Plugin 'TokuDB' registration as a STORAGE ENGINE failed.
TokuDB wouldn't start.
cat /sys/kernel/mm/transparent_hugepage/enabled
always [never]
sestatus
SELinux status: disabled
Not sure what to do. Any ideas? Thanks |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 25/10/13 09:43 | When I start MySQL with default-storage-engine disabled , I have even more descriptive ERROR log: 131025 17:33:19 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql/ 131025 17:33:19 [Note] Plugin 'InnoDB' is disabled. 131025 17:33:19 [Note] Plugin 'FEEDBACK' is disabled. Fri Oct 25 17:33:19 2013 Tokudb recovery starting in env /var/lib/mysql/ Fri Oct 25 17:33:19 2013 Tokudb recovery failed -30975 /home/tcallaghan/temp/mariadb/mariadb/storage/tokudb/ft-index/portability/file.cc:468 file_fsync_internal: Assertion `get_error_errno() == EINTR' failed (errno=22) : Invalid argument Backtrace: (Note: toku_do_assert=0x0x7fe28f1612c0) /usr/local/mysql/lib/plugin/ha_tokudb.so(+0x128118)[0x7fe28f161118] /usr/local/mysql/lib/plugin/ha_tokudb.so(+0x1282bd)[0x7fe28f1612bd] /usr/local/mysql/lib/plugin/ha_tokudb.so(+0x1278a2)[0x7fe28f1608a2] /usr/local/mysql/lib/plugin/ha_tokudb.so(_Z23toku_logger_maybe_fsyncP10tokulogger10__toku_lsnib+0x23a)[0x7fe28f10251a] /usr/local/mysql/lib/plugin/ha_tokudb.so(_Z23toku_log_end_checkpointP10tokuloggerP10__toku_lsniS1_mjj+0x141)[0x7fe28f131311] /usr/local/mysql/lib/plugin/ha_tokudb.so(_ZN12checkpointer18log_end_checkpointEv+0x31)[0x7fe28f0cf5b1] /usr/local/mysql/lib/plugin/ha_tokudb.so(_ZN12checkpointer14end_checkpointEPFvPvES0_+0x65)[0x7fe28f0cfc35] /usr/local/mysql/lib/plugin/ha_tokudb.so(_Z15toku_checkpointP12checkpointerP10tokuloggerPFvPvES3_S5_S3_19checkpoint_caller_t+0x155)[0x7fe28f0bcd05] /usr/local/mysql/lib/plugin/ha_tokudb.so(+0x70d71)[0x7fe28f0a9d71] /usr/local/mysql/lib/plugin/ha_tokudb.so(+0x49d85)[0x7fe28f082d85] /usr/local/mysql/bin/mysqld(_Z24ha_initialize_handlertonP13st_plugin_int+0x48)[0x7005a8] /usr/local/mysql/bin/mysqld[0x5d99d9] /usr/local/mysql/bin/mysqld(_Z11plugin_initPiPPci+0x923)[0x5dcca3] /usr/local/mysql/bin/mysqld[0x55785e] /usr/local/mysql/bin/mysqld(_Z11mysqld_mainiPPc+0x35b)[0x55835b] 
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7fe2a04a4cdd] /usr/local/mysql/bin/mysqld[0x51bbed] Engine status function not available Memory usage: Arena 0: system bytes = 0 in use bytes = 0 Total (incl. mmap): system bytes = 0 in use bytes = 0 max mmap regions = 0 max mmap bytes = 0 131025 17:33:19 [ERROR] mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see http://kb.askmonty.org/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. Server version: 5.5.30-tokudb-7.0.3-MariaDB key_buffer_size=104857600 read_buffer_size=131072 max_used_connections=0 max_threads=2002 thread_count=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8587127 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x0x0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... 
stack_bottom = 0x0 thread_stack 0x48000 mysys/stacktrace.c:247(my_print_stacktrace)[0xada50b] sql/signal_handler.cc:153(handle_fatal_signal)[0x6fdf4e] ??:0(??)[0x7fe2a11ed500] ??:0(??)[0x7fe2a04b88e5] ??:0(??)[0x7fe2a04ba0c5] ??:0(??)[0x7fe28f161217] ??:0(??)[0x7fe28f1612bd] ??:0(??)[0x7fe28f1608a2] ??:0(??)[0x7fe28f10251a] ??:0(??)[0x7fe28f131311] ??:0(??)[0x7fe28f0cf5b1] ??:0(??)[0x7fe28f0cfc35] ??:0(??)[0x7fe28f0bcd05] ??:0(??)[0x7fe28f0a9d71] ??:0(??)[0x7fe28f082d85] sql/handler.cc:476(ha_initialize_handlerton(st_plugin_int*))[0x7005a8] sql/sql_plugin.cc:1354(plugin_initialize)[0x5d99d9] sql/sql_plugin.cc:1648(plugin_init(int*, char**, int))[0x5dcca3] sql/mysqld.cc:4289(init_server_components)[0x55785e] sql/mysqld.cc:4847(mysqld_main(int, char**))[0x55835b] ??:0(??)[0x7fe2a04a4cdd] ??:0(_start)[0x51bbed] The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. 131025 17:33:19 mysqld_safe mysqld from pid file /var/lib/mysql//MAI-R6-DED24.pid ended |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 25/10/13 14:24 | OK guys. As I said, the server starts fine and registers TokuDB when the datadir is located in /var/lib/mysql. If the datadir is on a different XFS partition, TokuDB fails to initialize. Strangely enough, it was fixed after I disabled direct I/O with tokudb_directio = off. Odd.
|
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 28/10/13 09:13 | Try compiling this program:
$ cat testdirect.cc
#include <stdio.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
int main(void) {
    /* O_CREAT needs a mode argument */
    int fd = open("directio.test.file", O_RDWR+O_CREAT+O_DIRECT, 0644);
    assert(fd != -1);
    int r = fsync(fd);
    assert(r != -1);
    return 0;
}
and running:
strace -f ./testdirect |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 28/10/13 09:44 | Hi Rich, This is the output I've got:
execve("./testdirect.cc", ["./testdirect.cc"], [/* 23 vars */]) = -1 ENOEXEC (Exec format error)
dup(2) = 3
fcntl(3, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0a98ebc000
lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
write(3, "strace: exec: Exec format error\n", 32strace: exec: Exec format error
) = 32
close(3) = 0
munmap(0x7f0a98ebc000, 4096) = 0
exit_group(1) = ?
Thanks
|
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 28/10/13 10:07 | You need to compile it first:
gcc -g -o testdirect testdirect.cc
Then run the resulting binary (./testdirect), not the source file. |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 28/10/13 10:34 | http://pastebin.com/013Mdmmj
|
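| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 28/10/13 10:51 | Here is a version of the program that creates a file in the current directory, writes into it, and fsyncs it.
$ cat testdirect.c
#define _GNU_SOURCE /* exposes O_DIRECT when built as C */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
int main(void) {
    const off_t offset = 0;
    const size_t bufalign = 512;
    const size_t bufsize = 512;
    /* open the test file as in the first program */
    int fd = open("directio.test.file", O_RDWR+O_CREAT+O_DIRECT, 0644);
    assert(fd != -1);
    /* direct I/O needs an aligned buffer */
    char *buf = (char *) memalign(bufalign, bufsize);
    assert(buf);
    memset(buf, 0, bufsize);
    ssize_t wr = pwrite(fd, buf, bufsize, offset);
    assert(wr == (ssize_t) bufsize);
    int r = fsync(fd);
    assert(r != -1);
    free(buf);
    return 0;
} |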
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 29/10/13 02:22 | OK, so it looks like this issue is related to the software RAID1 we are running on two of our boxes. It looks like direct I/O doesn't work properly on software RAID. We are going to investigate a bit further, but everything points to that. The question remains: why doesn't TokuDB catch this error and fall back to buffered I/O, instead of failing outright?
|
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 29/10/13 05:12 | TokuDB could detect this error and print an error message rather than crash. I will create a ticket for this. IMO, if TokuDB is configured for direct I/O and it does not work, then it should error out rather than ignoring the direct I/O variable. This method is more likely to get the administrator's attention that something in the storage configuration does not work. |
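The "detect this error and print an error message rather than crash" idea can be sketched roughly as follows. The backtrace earlier in the thread shows an assertion in file_fsync_internal that only tolerates EINTR, so the EINVAL (errno=22) from fsync aborts the server. The helper below is hypothetical and is not TokuDB code; the name fsync_or_report and the message format are assumptions made for illustration. It retries on EINTR and hands any other errno back to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not TokuDB code: fsync a descriptor, retrying when
 * the call is interrupted (EINTR) and reporting any other failure to `log`
 * instead of aborting. Returns 0 on success or the errno value on failure.
 */
int fsync_or_report(int fd, FILE *log) {
    assert(log != NULL);
    for (;;) {
        if (fsync(fd) == 0)
            return 0;
        if (errno == EINTR)
            continue;               /* interrupted by a signal: try again */
        int err = errno;            /* save it before fprintf can change it */
        fprintf(log, "fsync(fd=%d) failed: %s (errno=%d)\n",
                fd, strerror(err), err);
        return err;
    }
}
```

A caller could then refuse to initialize the engine with a clear message when the return value is nonzero, which matches the "error out rather than ignore the direct I/O variable" preference stated above.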
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 29/10/13 08:18 | We added a function to testdirect.c that queries the xfs file system for some parameters and prints them out. Can you try it?
#include <xfs/xfs.h>
/* Ask XFS for its direct I/O constraints on this file. */
static void do_dioinfo(int fd) {
    struct dioattr s;
    int r = xfsctl("directio.test.file", fd, XFS_IOC_DIOINFO, &s);
    if (r == 0) {
        /* the dioattr fields are 32-bit unsigned, so print them with %u */
        printf("required mem alignment = %u\n", s.d_mem);
        printf("required disk alignment = %u\n", s.d_miniosz);
        printf("max io size = %u\n", s.d_maxiosz);
    }
}
/* in main(), right after the file is opened: */
int fd = open("directio.test.file", O_RDWR+O_CREAT+O_DIRECT, 0644);
assert(fd != -1);
do_dioinfo(fd);
|
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 29/10/13 09:38 | Hi Rich, I've got a compilation error: Thanks.
|
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 29/10/13 10:40 | It looks like XFS is reporting that it needs 4096 byte buffer alignment on one of your systems. The attached program uses the XFS memory alignment for its memory allocations. Please compile and run this program. If it works, then we know what we need to fix. $ cat ~/testdirect.c
static off_t offset = 0; static size_t bufalign = 512; static size_t bufsize = 512;
bufalign = s.d_mem;
bufsize = s.d_miniosz;
/* size_t needs %zu, and %p expects a void pointer */
printf("buf=%p bufalign=%zu bufsize=%zu\n", (void *) buf, bufalign, bufsize);
|
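For readers without the XFS headers, the same question (is 512 byte alignment enough, or is 4096 needed?) can be approached by probing. The sketch below is not the attached program: the function names and the scratch-file approach are assumptions. It issues one direct write per candidate, using a buffer that is aligned to the candidate but deliberately not to twice the candidate, and reports the first candidate that works. If the headers do not expose O_DIRECT, it falls back to buffered I/O, in which case the probe tells you nothing about direct I/O.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* asks glibc to expose O_DIRECT */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* assumption: degrade to buffered I/O */
#endif

/*
 * Try one write whose buffer address and length are multiples of `align`
 * but deliberately not of 2 * align, then fsync. Returns 1 if both calls
 * succeed, 0 otherwise.
 */
int dio_write_works(int fd, size_t align) {
    void *base = NULL;
    if (posix_memalign(&base, 2 * align, 2 * align) != 0)
        return 0;
    char *buf = (char *) base + align;   /* aligned to align, not 2 * align */
    memset(buf, 0, align);
    ssize_t wr = pwrite(fd, buf, align, 0);
    int ok = (wr == (ssize_t) align) && fsync(fd) == 0;
    free(base);
    return ok;
}

/*
 * Returns the smaller of 512 and 4096 that works on a scratch file at
 * `path`, or 0 if neither works or the file cannot be opened.
 */
size_t probe_dio_alignment(const char *path) {
    static const size_t candidates[] = { 512, 4096 };
    size_t found = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_DIRECT, 0644);
    if (fd == -1)
        return 0;
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        if (dio_write_works(fd, candidates[i])) {
            found = candidates[i];
            break;
        }
    }
    close(fd);
    unlink(path);               /* remove the scratch file */
    return found;
}
```

Calling probe_dio_alignment with a file name inside the new datadir partition would be the interesting case; the result depends entirely on the file system and the devices underneath it.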
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 29/10/13 12:38 | Hi Rich, Please check: http://pastebin.com/6cSTgFYE Thanks, I
|
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 29/10/13 12:46 | It looks like the 4096 byte memory alignment for direct I/O is necessary. On all of our machines, 512 byte memory alignment is good enough, and TokuDB assumes that all over the place. What components in your storage stack require 4096 byte memory alignment? |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 30/10/13 06:04 | Hi Rich, Thanks for that. I've asked our hosting guys to look into it. Could you give us any clues, or a more specific question or more detail, so we can get an answer? Thanks, I
|
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 30/10/13 06:45 | perhaps your XFS file system was made with 4096 byte sectors rather than 512 byte sectors. xfs_info will tell you. on my system, sectsz=512. |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Egor Shevtsov | 30/10/13 07:03 |
meta-data=/dev/md0               isize=256    agcount=32, agsize=14408240 blks
         =                       sectsz=4096  attr=2, projid32bit=0
data     =                       bsize=4096   blocks=461063647, imaxpct=5
         =                       sunit=16     swidth=48 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=8192, version=2
         =                       sectsz=4096  sunit=4 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
I checked my other servers: there, sectsz=512. All the servers were formatted with the same command:
mkfs.xfs -f -d su=64k,sw=3 -l size=32m,su=16k -L /data /dev/vg/lv_mysql
So why is sectsz different on those 2 new boxes, and what would be the best way to format XFS?
fdisk -l
Disk /dev/md0: 1888.5 GB, 1888516698112 bytes
2 heads, 4 sectors/track, 461063647 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk identifier: 0x00000000
|
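The logical and physical sector sizes that fdisk prints can also be read programmatically, which makes it easy to compare boxes. This is a minimal sketch assuming Linux and its BLKSSZGET and BLKPBSZGET ioctls from linux/fs.h; the function name query_sector_sizes is made up for illustration, and opening a device such as /dev/md0 normally requires root.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>       /* BLKSSZGET, BLKPBSZGET */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Read the logical and physical sector sizes of a block device.
 * Returns 0 on success, -1 if the device cannot be opened or does not
 * answer the ioctls (for example, because it is not a block device).
 */
int query_sector_sizes(const char *dev, int *logical, unsigned int *physical) {
    assert(dev != NULL && logical != NULL && physical != NULL);
    int fd = open(dev, O_RDONLY);
    if (fd == -1)
        return -1;
    int rc = 0;
    if (ioctl(fd, BLKSSZGET, logical) == -1 ||
        ioctl(fd, BLKPBSZGET, physical) == -1)
        rc = -1;
    close(fd);
    return rc;
}
```

On the device from the fdisk output above, one would expect this to report 512 for the logical size and 4096 for the physical size.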
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 30/10/13 07:27 | Please try asking the experts here: http://oss.sgi.com/archives/xfs/ |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Alexey Zilber | 01/11/13 04:37 | Hi Egor, In my opinion, and I could be wrong, it's best (at least in my situation) to just use the defaults and not use su/sw. I've had a lot of discussion on the xfs forums about it. Your mileage may vary, but here's an example. I'm using SSD, and say I have a 4 SSD raid-10 volume that I format with su/sw. I use LVM with XFS on top of it. It's all good until I need to add more drives. I can expand lvm over a new 4 SSD raid-10 volume, and I can do xfs_growfs on it, but I cannot change su/sw as my volume grows. I find that unacceptable since people tend to grow their volumes. Just my opinion. Back to your original question though. What is your tokudb_data_dir set to? and the rest of the tokudb parameters? -Alex |
| Re: TokuDB failed to initialize after datadir moved to a new partition | Rich Prohaska | 01/11/13 05:44 | We captured this issue in https://github.com/Tokutek/ft-index/issues/99. Here is a good summary of the problem: http://people.redhat.com/msnitzer/docs/io-limits.txt. The current TokuDB assumes 512 byte alignment. |
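To make the 512 versus 4096 point concrete, here is a small sketch of allocating a direct I/O buffer from a queried alignment instead of a hard-coded 512. It is not taken from TokuDB or from the linked ticket; the names dio_round_up, dio_is_aligned and alloc_dio_buffer are hypothetical. A buffer aligned to 4096 also satisfies a 512 byte requirement, but not the other way round.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t dio_round_up(size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* True if p sits on an align-byte boundary. */
int dio_is_aligned(const void *p, size_t align) {
    return ((uintptr_t) p % align) == 0;
}

/*
 * Allocate a zero-filled buffer for direct I/O. `align` is meant to come
 * from the storage stack (4096 in this thread) rather than a fixed 512.
 * The usable size, rounded up to a multiple of align, is stored in
 * *size_out. Returns NULL on failure; release the buffer with free().
 */
void *alloc_dio_buffer(size_t align, size_t min_size, size_t *size_out) {
    void *buf = NULL;
    size_t size = dio_round_up(min_size, align);
    if (posix_memalign(&buf, align, size) != 0)
        return NULL;
    memset(buf, 0, size);
    *size_out = size;
    return buf;
}
```

For example, alloc_dio_buffer(4096, 512, &n) returns a 4096-aligned buffer and sets n to 4096, so both the address and the transfer length meet the stricter requirement.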