A few questions before upgrading from ext3 to ext4


scaramanga

Feb 4, 2014, 3:28:46 PM
to al...@googlegroups.com

A bit of background first: I use one drive as a backup for the other, with a nightly rsync cron job doing the copy, on my DNS-323 (HW rev C1) running Alt-F 0.1 RC3.

I'd like to upgrade my FSs from ext3 to ext4, but I'd like to do it one drive at a time, just to be on the safe side.
The thing is, I read that in ext4 file modification times are much more precise. Since rsync uses them to decide which files have changed and need backing up, I'm concerned that upgrading just the main drive could result in an excessive/redundant copy of 1Gig of data or, alternatively, that new files with precise modification times would be backed up again and again, because the backup FS would keep the less accurate (=different) file modification time.

Are my concerns correct? What do you suggest I do?

João Cardoso

Feb 5, 2014, 1:31:43 PM
There are some questions whose answers I don't know:

Does the hardware support the increased time accuracy of ext4?
Does rsync support or use it?

Although busybox is not involved here, it looks like it does not use that extra info.
The following command sequence, on an ext4 filesystem, shows that no fractional time is used:

while true; do touch /mnt/sda2/fred.txt; stat /mnt/sda2/fred.txt|grep ^C; done

Change: 2014-02-05 15:52:58.000000000
Change: 2014-02-05 15:52:58.000000000
Change: 2014-02-05 15:52:58.000000000
Change: 2014-02-05 15:52:59.000000000
Change: 2014-02-05 15:52:59.000000000
Change: 2014-02-05 15:52:59.000000000
Change: 2014-02-05 15:52:59.000000000

while on my desktop computer it displays:

Change: 2014-02-05 15:58:07.994402153 +0000
Change: 2014-02-05 15:58:07.996402115 +0000
Change: 2014-02-05 15:58:07.999402058 +0000
Change: 2014-02-05 15:58:08.001402021 +0000
Change: 2014-02-05 15:58:08.003401983 +0000
Change: 2014-02-05 15:58:08.005401945 +0000
Change: 2014-02-05 15:58:08.007401908 +0000

So the question is whether rsync is using that and, in the first place, what time resolution the kernel has on this hardware.
The Linux kernel seems to be using nanoseconds:

dmesg | grep clock

shows

sched_clock: 32 bits at 166MHz, resolution 5ns, wraps every 25769ms
Switching to clocksource orion_clocksource


Regarding the conversion: when converting from ext3 to ext4 the existing file timestamps shouldn't change, i.e. the extra (sub-second) time should stay at zero, and rsync will find no change.
For new or changed files there might be problems if the source and destination are on filesystems with different time resolutions (and if rsync uses them).
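
To make the concern concrete, here is a minimal sketch of the two ways a sync tool could compare files. It is not rsync's code; the struct and function names are hypothetical, and the only assumption is that the ext3 side stores whole seconds while the ext4 side may store nanoseconds.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical record of what a sync tool compares for one file. */
struct file_info {
    int64_t size;          /* file size in bytes */
    struct timespec mtime; /* modification time: seconds + nanoseconds */
};

/* Comparison at whole-second granularity: nanoseconds are ignored. */
bool unchanged_by_second(const struct file_info *a, const struct file_info *b)
{
    return a->size == b->size && a->mtime.tv_sec == b->mtime.tv_sec;
}

/* Comparison at full nanosecond granularity. */
bool unchanged_by_nanosecond(const struct file_info *a, const struct file_info *b)
{
    return a->size == b->size
        && a->mtime.tv_sec == b->mtime.tv_sec
        && a->mtime.tv_nsec == b->mtime.tv_nsec;
}

/* Model a copy stored on a filesystem that keeps only whole seconds. */
struct file_info store_whole_seconds(struct file_info f)
{
    f.mtime.tv_nsec = 0;
    return f;
}
```

With a source file whose mtime has a non-zero nanosecond part, the whole-second comparison treats the truncated backup copy as unchanged, while the nanosecond comparison would flag it on every run. That second case is the "again and again" scenario from the question.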

In any case, the Alt-F Filesystem Maintenance convert utility does not change individual files, only the fs structure. After an fsck, it uses:

tune2fs -m 0 -O extents,uninit_bg,dir_index

when converting from ext3 to ext4, so you have to use 'chattr' so that existing files will benefit from ext4.
This is only relevant if the file will change. If it is not likely to change, like a photo, music or movie, then there will be no improvement, I believe.


What do you suggest I do?

Hard to tell.
As I have a small old ext3 partition on two of my disks (sda4/sdb4), I would run a test on them first.

Also, I'm just making educated guesses, based on my background knowledge. I'm not an expert on any particular subject, I just like to play all instruments well :-)

PS: don't forget to tell us about your findings

Added: well, it looks like rsync does not transfer sub-second changes, at least when using network transfers. I have verified this using:

# ./popo # a simple C program that sets "fred.txt" dates to known values
# stat fred.txt

Access: 1970-01-01 01:00:01.000010000 +0100
Modify: 1970-01-01 01:00:10.000100000 +0100
Change: 2014-02-05 18:02:43.699088298 +0000

# rsync -av fred.txt root@dns-325:/mnt/sdb4/ # transfer to ext4 under Alt-F
# rm fred.txt
# rsync -av root@dns-325:/mnt/sdb4/fred.txt . # transfer back
# stat fred.txt

Access: 2014-02-05 18:02:50.816952769 +0000
Modify: 1970-01-01 01:00:10.000000000 +0100
Change: 2014-02-05 18:02:50.816952769 +0000

You can see that the Modify date has lost its sub-second value (the 'popo' program set it to second 10, microsecond 100 of the "epoch", Jan 1, 1970).
From http://article.gmane.org/gmane.network.rsync.general/23846/match=sub+second (and others) this seems to be confirmed, although the rsync source code seems to use sub-second system calls (but I haven't followed the code to see whether they apply only to local filesystems or to network transfers as well).
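
The 'popo' source is not included in this thread. A minimal sketch of what such a helper might look like, assuming the POSIX.1-2008 utimensat() call and the same values seen in the stat output above (the function name is made up for illustration):

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>    /* AT_FDCWD */
#include <sys/stat.h> /* utimensat() */
#include <time.h>     /* struct timespec */

/* Set the access and modification times of `path` to known values.
 * Returns 0 on success, -1 on error (errno is set by utimensat). */
int set_known_times(const char *path)
{
    struct timespec times[2];

    times[0].tv_sec = 1;       /* atime: second 1 of the epoch ... */
    times[0].tv_nsec = 10000;  /* ... plus 10 microseconds */
    times[1].tv_sec = 10;      /* mtime: second 10 of the epoch ... */
    times[1].tv_nsec = 100000; /* ... plus 100 microseconds */

    return utimensat(AT_FDCWD, path, times, 0);
}
```

Calling set_known_times("fred.txt") from a small main() and then running stat on the file should show whether the filesystem kept the sub-second part. The change time cannot be set this way; the kernel updates it on its own.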

Also, using rsync to network synchronize files between ext3<->ext4 worked as expected, i.e., a second rsync didn't transfer any files.

Now you only have to create a small ext4 on a USB pen and confirm this for local rsync transfers.

gee, I was really bored with what I was doing :-)


 

scaramanga

Feb 6, 2014, 9:02:01 AM
to al...@googlegroups.com
Joao, you rock! Hard!
I'm glad to see that my question isn't dumb but in fact interesting. Still, I never intended for anyone to go to such lengths to help out with it, just to share some previous knowledge or a suggestion.
I completely forgot about sd[ab]4. I tune2fs-ed them to noauto some time back (https://groups.google.com/forum/#!msg/alt-f/XyFQ-7Z_nQw/WkqQPeYu7YcJ) and completely forgot about them.

I'll use them to experiment and report back.
(Now that I looked that old thread up, I think I might move Alt-F to sda4)

Much obliged,
Sharon


On Wednesday, February 5, 2014 6:33:33 PM UTC+2, João Cardoso wrote:


On Tuesday, February 4, 2014 8:28:46 PM UTC, scaramanga wrote:
What do you suggest I do?

João Cardoso

Feb 7, 2014, 10:57:27 AM
to al...@googlegroups.com


On Thursday, February 6, 2014 2:02:01 PM UTC, scaramanga wrote:
Joao, you rock! Hard!
I'm glad to see that my question isn't dumb, but in fact interesting,

Yes, it is. So interesting that I kept digging into this subject.

First I confirmed that rsync does not use sub-second file modification dates, either for network or for local fs operations.

Then, as busybox 'stat' does not show sub-second file information, I wrote a small C program (based on the stat() manual page) that does.
When using it under Alt-F (RC4 to be), I noticed that sub-second file timestamps have a 10 millisecond granularity. A quick search then took me to this post on stackoverflow, which explains why:

The current time within the Linux kernel is cached, and generally only updated on a timer interrupt. So if your timer interrupt is running at 10 milliseconds, the cached time will only be updated once every 10 milliseconds. When an update does occur, the accuracy of the resulting time will depend on the clock source available on your hardware.

And the Alt-F Linux kernel is compiled with

CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ=y
CONFIG_HZ=100

thus the 10ms time granularity:

while true; do touch fred.txt ; ./stat fred.txt | grep mod; done

Last file modification:   2014-02-07 15:33:15.120000000
Last file modification:   2014-02-07 15:33:15.120000000
Last file modification:   2014-02-07 15:33:15.130000000
Last file modification:   2014-02-07 15:33:15.140000000
Last file modification:   2014-02-07 15:33:15.150000000
Last file modification:   2014-02-07 15:33:15.150000000
Last file modification:   2014-02-07 15:33:15.160000000

However, not everything is explained yet. Given the HZ setting I would expect the 10ms granularity, but given the timer resolution I would expect finer resolution:

When an update does occur, the accuracy of the resulting time will depend on the clock source available on your hardware.
 
And on my desktop computer (HZ=1000) I really have nanosecond resolution:

Change: 2014-02-07 15:46:21.712241050 +0000
Change: 2014-02-07 15:46:21.714241012 +0000
Change: 2014-02-07 15:46:21.717240955 +0000
Change: 2014-02-07 15:46:21.719240917 +0000
Change: 2014-02-07 15:46:21.722240859 +0000

So it looks like the timer hardware with ns resolution is not being used under Alt-F. Nothing that important, I will not change it ;-)
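
The arithmetic behind the 10ms figure, plus one way to ask the kernel directly, can be sketched as follows. This assumes Linux and its CLOCK_REALTIME_COARSE clock, which is only updated on ticks. Whether file timestamps are taken from exactly that clock is an assumption based on the explanation quoted above, not something verified here. The function names are made up.

```c
#define _GNU_SOURCE
#include <time.h>

/* Length of one timer tick in nanoseconds for a given CONFIG_HZ value.
 * Returns -1 if hz is not positive. */
long tick_period_ns(long hz)
{
    if (hz <= 0)
        return -1;
    return 1000000000L / hz;
}

/* Resolution in nanoseconds of the kernel's tick-updated wall clock.
 * Returns -1 if the clock is not available (e.g. not on Linux). */
long coarse_clock_resolution_ns(void)
{
#ifdef CLOCK_REALTIME_COARSE
    struct timespec res;

    if (clock_getres(CLOCK_REALTIME_COARSE, &res) != 0)
        return -1;
    return (long)res.tv_sec * 1000000000L + res.tv_nsec;
#else
    return -1;
#endif
}
```

tick_period_ns(100) gives 10000000 ns, i.e. the 10ms steps seen under Alt-F, and tick_period_ns(1000) gives 1000000 ns for a HZ=1000 kernel. Printing coarse_clock_resolution_ns() on each machine would be one way to compare that against what the running kernel reports.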

 
but I never intended for anyone to go to such length to help out with it,

Well, to be honest it was not entirely just to help you; it raised my curiosity and I wanted to share my findings :-)

I attach the compressed 'stat' binary and C source code, so others can use it.

stat.gz
stat.c
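
The attached stat.c is not reproduced here. A minimal sketch of the same idea, assuming the POSIX.1-2008 st_mtim field of struct stat, could look like this. The function names are hypothetical, and the time is formatted in UTC to keep the example simple.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Format a timespec as "YYYY-MM-DD HH:MM:SS.nnnnnnnnn" (UTC).
 * Returns 0 on success, -1 on conversion failure or a too-small buffer. */
int format_timespec(const struct timespec *ts, char *buf, size_t len)
{
    struct tm tm;
    char date[32];
    int n;

    if (gmtime_r(&ts->tv_sec, &tm) == NULL)
        return -1;
    if (strftime(date, sizeof date, "%Y-%m-%d %H:%M:%S", &tm) == 0)
        return -1;
    n = snprintf(buf, len, "%s.%09ld", date, (long)ts->tv_nsec);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Print the last modification time of `path` including nanoseconds.
 * Returns 0 on success, -1 on error. */
int print_mtime(const char *path)
{
    struct stat st;
    char buf[64];

    if (stat(path, &st) != 0) {
        perror(path);
        return -1;
    }
    if (format_timespec(&st.st_mtim, buf, sizeof buf) != 0)
        return -1;
    printf("Last file modification:   %s\n", buf);
    return 0;
}
```

A main() that calls print_mtime(argv[1]) is enough to use it in the touch loop shown earlier. The same pattern works for st_atim and st_ctim.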