
gmirror+gjournal often makes inconsistent file systems


Eugene Grosbein

Sep 9, 2011, 1:17:06 AM
Hi!

For a long time I have experienced the same UFS2 filesystem problems on several 8.2 systems
running gmirror+gjournal+async. After an unclean shutdown, kernel panic, or power failure,
gjournal makes fsck skip its checks, and that is why I use it.

But quite often my /var partition (and sometimes others) still suffers severe damage,
and running with such a /var mounted read-write leads to further panics, hangs, and so on.

For example, I have an 8.2-STABLE system with ad4 and ad6 drives combined into /dev/mirror/gm0.
I just removed ad6 from the mirror, ran fsck -y manually on all its filesystems,
shut the machine down cleanly again, and booted it next time from ad6,
keeping the mirror with ad4 neither mounted nor checked.

Then I ran fsck -y /dev/mirror/gm0.journals1e (/var on the mirrored drive)
and got LOTS of bad errors on a presumably clean file system.
Of course, I had seen the same errors while checking ad6 after it was removed from the running mirror.
I have gmirror's auto-sync feature turned ON. I tried turning it OFF, but that only
increased the frequency of such damage left unfixed after a reboot.

It seems that gjournal cannot handle system crashes reliably, can it?
I basically run it without any manual tuning. I have also tried tuning it, without luck;
it works nicely when there are no unclean shutdowns, but dealing with them is what it is here for in the first place.

# fsck -t ffs -y /dev/mirror/gm0.journals1e
** /dev/mirror/gm0.journals1e
** Last Mounted on /var
** Phase 1 - Check Blocks and Sizes
3955872 DUP I=989242
3955873 DUP I=989242
3955874 DUP I=989242
3955875 DUP I=989242
3955876 DUP I=989242
3955877 DUP I=989242
3955878 DUP I=989242
3955879 DUP I=989242
3955880 DUP I=989242
3955881 DUP I=989242
3955882 DUP I=989242
EXCESSIVE DUP BLKS I=989242
CONTINUE? yes

INCORRECT BLOCK COUNT I=989242 (448 should be 424)
CORRECT? yes

3955888 DUP I=989289
3955889 DUP I=989289
3955890 DUP I=989289
3955891 DUP I=989289
3955892 DUP I=989289
3955893 DUP I=989289
3955894 DUP I=989289
3955895 DUP I=989289
** Phase 1b - Rescan For More DUPS
3955872 DUP I=989242
3955873 DUP I=989242
3955874 DUP I=989242
3955875 DUP I=989242
3955876 DUP I=989242
3955877 DUP I=989242
3955878 DUP I=989242
3955879 DUP I=989242
3955880 DUP I=989242
3955881 DUP I=989242
3955888 DUP I=989242
3955889 DUP I=989242
3955890 DUP I=989242
3955891 DUP I=989242
3955892 DUP I=989242
3955893 DUP I=989242
3955894 DUP I=989242
3955895 DUP I=989242
** Phase 2 - Check Pathnames
DUP/BAD I=989289 OWNER=root MODE=100640
SIZE=14367 MTIME=Sep 9 11:30 2011
FILE=/log/kernel.log

REMOVE? yes

DUP/BAD I=989242 OWNER=root MODE=100640
SIZE=202631 MTIME=Sep 8 19:52 2011
FILE=/log/mpd.log.0

REMOVE? yes

** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=376866 OWNER=root MODE=140666
SIZE=0 MTIME=Sep 5 12:27 2011
CLEAR? yes

UNREF FILE I=376868 OWNER=root MODE=140666
SIZE=0 MTIME=Sep 7 20:30 2011
CLEAR? yes

UNREF FILE I=376869 OWNER=root MODE=140666
SIZE=0 MTIME=Sep 8 11:17 2011
CLEAR? yes

UNREF FILE I=376870 OWNER=root MODE=140666
SIZE=0 MTIME=Sep 8 12:11 2011
CLEAR? yes

BAD/DUP FILE I=989242 OWNER=root MODE=100640
SIZE=202631 MTIME=Sep 8 19:52 2011
CLEAR? yes

UNREF FILE I=989259 OWNER=root MODE=100640
SIZE=648 MTIME=Aug 27 00:00 2011
RECONNECT? yes

BAD/DUP FILE I=989289 OWNER=root MODE=100640
SIZE=14367 MTIME=Sep 9 11:30 2011
CLEAR? yes
LINK COUNT FILE I=989293 OWNER=root MODE=100640
SIZE=961 MTIME=Sep 9 11:26 2011 COUNT 1 SHOULD BE 2
ADJUST? yes

UNREF FILE I=989327 OWNER=root MODE=100640
SIZE=114 MTIME=Aug 27 00:00 2011
RECONNECT? yes

** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes

SUMMARY INFORMATION BAD
SALVAGE? yes

BLK(S) MISSING IN BIT MAPS
SALVAGE? yes

1188 files, 90007 used, 4987072 free (360 frags, 623339 blocks, 0.0%
fragmentation)

***** FILE SYSTEM IS CLEAN *****

***** FILE SYSTEM WAS MODIFIED *****
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stabl...@freebsd.org"

Lev Serebryakov

Sep 9, 2011, 4:21:15 AM
Hello, Eugene.
You wrote on September 9, 2011, 9:17:06:

> # fsck -t ffs -y /dev/mirror/gm0.journals1e
I may be wrong, but I have many times encountered strong advice not
to gjournal a whole disk, but to set up gjournal on a per-FS basis.
And it seems that you first create one big journal and then slice/partition/newfs
it into several FSes.
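The per-file-system layout Lev describes would, as a rough sketch (the device and partition names here are hypothetical examples, not taken from the thread), look like:

```shell
# Hypothetical sketch: journal each partition separately instead of
# journaling the whole disk or mirror.
gjournal load                        # load the geom_journal kernel module

# One gjournal provider per file system:
gjournal label /dev/mirror/gm0s1e    # creates /dev/mirror/gm0s1e.journal
newfs -J /dev/mirror/gm0s1e.journal  # UFS2 sits directly on the provider
mount -o async /dev/mirror/gm0s1e.journal /var
```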

--
// Black Lion AKA Lev Serebryakov <l...@FreeBSD.org>

Eugene Grosbein

Sep 9, 2011, 5:40:51 AM
09.09.2011 15:21, Lev Serebryakov wrote:
> Hello, Eugene.

> You wrote on September 9, 2011, 9:17:06:
>
>> # fsck -t ffs -y /dev/mirror/gm0.journals1e
> I may be wrong, but I've encountered strong advice not
> to gjournal whole disk, but make gjournal on per-FS basis, many times.
> And it seems, that your first create big journal, and splice/partition/newfs
> it for several FSes.

Yes, I did. Should this kind of partitioning not work too?

Eugene Grosbein

Lev Serebryakov

Sep 9, 2011, 6:02:43 AM
Hello, Eugene.
You wrote on September 9, 2011, 13:40:51:

>>> # fsck -t ffs -y /dev/mirror/gm0.journals1e
>> I may be wrong, but I've encountered strong advice not
>> to gjournal whole disk, but make gjournal on per-FS basis, many times.
>> And it seems, that your first create big journal, and splice/partition/newfs
>> it for several FSes.
> Yes, I did. Should this kind of partitioning not work too?
I'm not sure whether it should work or not. But the common
answer/advice on the mailing lists is not to do that, and to use one gjournal per
FS.
I think freebsd-fs@ could give a more qualified answer.

--
// Black Lion AKA Lev Serebryakov <l...@FreeBSD.org>

Eugene Grosbein

Sep 9, 2011, 6:31:49 AM
Dear Pawel Jakub,

09.09.2011 12:17, Eugene Grosbein writes:
> [...]
> # fsck -t ffs -y /dev/mirror/gm0.journals1e
Please explain: is such a partitioning scheme supported?
physical drive - geom_mirror - geom_journal - geom_part_mbr - geom_part_bsd - journalled UFS2

If not, mounting such a UFS2 file system should warn us, shouldn't it?
There are no warnings now.

Eugene Grosbein

Pawel Jakub Dawidek

Sep 9, 2011, 9:35:10 AM
On Fri, Sep 09, 2011 at 05:31:49PM +0700, Eugene Grosbein wrote:
> Please explain if such partitioning is supported?
> physical drive - geom_mirror - geom_journal - geom_part_mbr - geom_part_bsd - journalled UFS2

No. Journaling will only work properly for UFS if the file system is placed
directly on the gjournal provider. You configured slices and several
partitions on one gjournal provider, which simply cannot work, as
one UFS file system has to talk to one gjournal provider.

> If not, mounting such UFS2 should warn us, shouldn't it?
> No warnings now.

It might be a bit hard for the kernel to detect, but it could at least be better
documented.
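Applied to a setup like Eugene's, the supported ordering would be to partition the mirror first and then attach one journal per file system. A hypothetical sketch (partition sizes and names are made up for illustration):

```shell
# Unsupported (what failed here): journal the whole mirror, then slice it:
#   gm0 -> gjournal -> MBR slice -> bsdlabel partitions -> several UFS
# Supported: partition the mirror first, then one gjournal per UFS:
gpart create -s GPT /dev/mirror/gm0              # partition the mirror first
gpart add -t freebsd-ufs -s 20G /dev/mirror/gm0  # example partition for /var
gjournal label /dev/mirror/gm0p1                 # journal just this partition
newfs -J /dev/mirror/gm0p1.journal               # UFS directly on the provider
mount -o async /dev/mirror/gm0p1.journal /var
```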

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com

Kevin Oberman

Sep 9, 2011, 1:53:00 PM
On Fri, Sep 9, 2011 at 6:35 AM, Pawel Jakub Dawidek <p...@freebsd.org> wrote:
> On Fri, Sep 09, 2011 at 05:31:49PM +0700, Eugene Grosbein wrote:
>> Please explain if such partitioning is supported?
>> physical drive - geom_mirror - geom_journal - geom_part_mbr - geom_part_bsd - journalled UFS2
>
> No. It will only work properly for journaling UFS if UFS is placed
> directory on gjournal provider. You configured slices and several
> pratitions on one gjournal provider, which simply cannot work, as
> one UFS file system has to talk to one gjournal provider.
>
>> If not, mounting such UFS2 should warn us, shouldn't it?
>> No warnings now.
>
> It might be a bit hard to tell, but it could be at least better
> documented.

Yes, the documentation could be better, especially the gjournal(8) man page
example, where the entire disk (da0) is tied to a single journal. At least
change the example to read:
gjournal load
gjournal label da0p5
newfs -J /dev/da0p5.journal
mount -o async /dev/da0p5.journal /mnt

It MIGHT be better to use da0s1g, but I think we want to move toward GPT,
so I suggest that type of example. Either way, make it clear that a file
system, not a whole drive, is the appropriate application. I am quite aware
of this, as I just created my first gjournal file system last night and was
briefly confused by this.
--
R. Kevin Oberman, Network Engineer - Retired
E-mail: kob...@gmail.com