Directory size calculation in Krusader

421 views
Skip to first unread message

Andrey Matveyakin

unread,
Aug 4, 2012, 4:13:37 PM8/4/12
to krusade...@googlegroups.com
Hello!

I'm working on improving directory size calculation in Krusader at this time. I want to share a draft with you. I would be glad to hear any feedbacks: which changes are good and which are are not? What else could be done about it?

I'm not posting the patches to the review board because they are not ready for pushing, it's a draft, as I have already said. I even copypasted some files from KIO to make it easier to preview the changes: you wouldn't have to recompile kdelibs to see the result. But this is definitely a bad move for a release version :-)
Another problem is, you might want to investigate separate changes, while the review board doesn't allow to send a chain of commits as far as I understand.

To sum up, I've taught Krusader to use KIO::DirectorySizeJob for space calculation.

The main benefits are:
1. It's now possible calculate size of a remote directory (fixes bug 272735).
2. Size is calculated in background on space hit.

The smaller benefits (some of them are not actually related to the principal changes):
3. “0” is shown instead of “<DIR>” when size is really zero.
4. “≥” is shown when size is calculated inaccurately (this requires modifing kdelibs and the patch is not finished).
5. “Calculate All Directories' Sizes” action added.
6. KrCalcSpaceDialog fixed: size in bytes is now written.
7. Code simplified: an analog of DirectorySizeJob removed from KrCalcSpaceDialog.

Changes:
8. Computed size value has changed: it now includes directories' self sizes. So, an empty folder is reported to occupy 4096 bytes on my ext4 volume.
I can't say whether it is good or bad. A advantage is, the size reported is now consistent with other KDE apps, such as file properties dialog, and consistency is always welcomed, isn't it?
My be, it worth adding a setting whether to include directory's sizes in total size or not. But I don't find it necessary.

Sadly, a huge drawback comes too:
9. Size calculation significantly slowed down.
Nearly two or three times. That's an inadmissible regression. So, it would be very nice to fix it. All the more, all KDE will benefit from it, not just Krusader.

I'm not sure where the slowdown comes from: DirectorySizeJob is certainly more complicated that current Krusader's algorithm: it works over network, handles hard liks correclty, etc. But I think, the problem is different.
DirectorySizeJob internally calls ListJob and simply sums up sizes of all item recieved from it. ListJob itself starts a Slave to list a directory. The Slave emits listEntries signal when it finds new entries.
I find it strange that this signal is emitted so frequently: nearly once per millisecond according to my measuring, often passing 1—4 or even 0 entries. As far as I understand, it shouldn't happen so often. At least, there is a
  static const int minimum_updatetime = 100;
line in
  void SlaveBase::listEntry( const UDSEntry& entry, bool _ready )
function.
It seems, it was planned not to send updates more often that every 100 milliseconds, but the delay isn't working for some reason. Or may be, I didn't caught the idea of that code.

Anyway, it seems to me, that the delay should be added. I'm going to investigate the problem closer in the nearest future.

Andrey.
krusader_directory_size_calculation.tar

jan

unread,
Aug 5, 2012, 8:28:46 AM8/5/12
to krusade...@googlegroups.com
On Sun, 5 Aug 2012 00:13:37 +0400
Andrey Matveyakin <a.matv...@gmail.com> wrote:

> Hello!
>
> I'm working on improving directory size calculation in Krusader at
> this time. I want to share a draft with you. I would be glad to hear
> any feedbacks: which changes are good and which are are not? What
> else could be done about it?
>
> I'm not posting the patches to the review board because they are not
> ready for pushing, it's a draft, as I have already said. I even
> copypasted some files from KIO to make it easier to preview the
> changes: you wouldn't have to recompile kdelibs to see the result.
> But this is definitely a bad move for a release version :-)
> Another problem is, you might want to investigate separate changes,
> while the review board doesn't allow to send a chain of commits as
> far as I understand.
>
> To sum up, I've taught Krusader to use KIO::DirectorySizeJob for space
> calculation.
>

Hi Andrey,

I also thought about using KIO::DirectorySizeJob, though I haven't
further looked into it.
But is seems to be the right way to me.

> The main benefits are:
> 1. It's now possible calculate size of a remote directory (fixes bug
> 272735).
> 2. Size is calculated in background on space hit.
>
> The smaller benefits (some of them are not actually related to the
> principal changes):
> 3. "0" is shown instead of "<DIR>" when size is really zero.
> 4. "≥" is shown when size is calculated inaccurately (this requires
> modifing kdelibs and the patch is not finished).
> 5. "Calculate All Directories' Sizes" action added.
> 6. KrCalcSpaceDialog fixed: size in bytes is now written.
> 7. Code simplified: an analog of DirectorySizeJob removed from
> KrCalcSpaceDialog.
>
> Changes:
> 8. Computed size value has changed: it now includes directories' self
> sizes. So, an empty folder is reported to occupy 4096 bytes on my ext4
> volume.
> I can't say whether it is good or bad. A advantage is, the size
> reported is now consistent with other KDE apps, such as file
> properties dialog, and consistency is always welcomed, isn't it?

I'd say yes.

> My be, it worth adding a setting whether to include directory's sizes
> in total size or not. But I don't find it necessary.
>
No, not necessary.

> Sadly, a huge drawback comes too:
> 9. Size calculation significantly slowed down.
> Nearly two or three times. That's an inadmissible regression. So, it
> would be very nice to fix it. All the more, all KDE will benefit from
> it, not just Krusader.
>
> I'm not sure where the slowdown comes from: DirectorySizeJob is
> certainly more complicated that current Krusader's algorithm: it
> works over network, handles hard liks correclty, etc. But I think,
> the problem is different. DirectorySizeJob internally calls ListJob
> and simply sums up sizes of all item recieved from it. ListJob itself
> starts a Slave to list a directory. The Slave emits listEntries
> signal when it finds new entries. I find it strange that this signal
> is emitted so frequently: nearly once per millisecond according to my
> measuring, often passing 1--4 or even 0 entries. As far as I
> understand, it shouldn't happen so often. At least, there is a
> static const int minimum_updatetime = 100;
> line in
> void SlaveBase::listEntry( const UDSEntry& entry, bool _ready )
> function.
> It seems, it was planned not to send updates more often that every 100
> milliseconds, but the delay isn't working for some reason. Or may be,
> I didn't caught the idea of that code.
>
> Anyway, it seems to me, that the delay should be added. I'm going to
> investigate the problem closer in the nearest future.
>
> Andrey.
>

That (what SlaveBase::listEntry() is doing) doesn't sound right, would
be great if you can find out what's going wrong there.

Btw: I did some changes to KrCalcSpaceDialog (and many other parts too)
in a separate branch (to be merged after 2.4 release), so it would be
best if you worked on top of that when you do bigger changes.
I just need to figure out how to upload branches first ;-)

Jan



Andrey Matveyakin

unread,
Aug 5, 2012, 10:13:45 AM8/5/12
to krusade...@googlegroups.com


2012/8/5 jan <jan_l...@gmx.de>
Does KDE policy prevent developers from simply doing
  git push origin your_branch_name
?
 

Jan



--
You received this message because you are subscribed to the Google Groups "krusader-devel" group.
To post to this group, send email to krusade...@googlegroups.com.
To unsubscribe from this group, send email to krusader-deve...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



Yuri Chornoivan

unread,
Aug 5, 2012, 10:19:22 AM8/5/12
to krusade...@googlegroups.com
2012/8/5 Andrey Matveyakin <a.matv...@gmail.com>:
I guess, no. But can you close the committed review requests (or use
REVIEW: <number> at the first line of commit message)?

Yuri

jan

unread,
Aug 5, 2012, 10:59:03 AM8/5/12
to krusade...@googlegroups.com
On Sun, 5 Aug 2012 18:13:45 +0400
No idea, I will try.

Jonas Bähr

unread,
Aug 5, 2012, 2:13:06 PM8/5/12
to krusade...@googlegroups.com
Hi

Am 05.08.2012 um 16:13 schrieb Andrey Matveyakin:
> 2012/8/5 jan <jan_l...@gmx.de>
[...]
> > Btw: I did some changes to KrCalcSpaceDialog (and many other parts
> too)
> > in a separate branch (to be merged after 2.4 release), so it would
> be
> > best if you worked on top of that when you do bigger changes.
> > I just need to figure out how to upload branches first ;-)
>
> Does KDE policy prevent developers from simply doing
> git push origin your_branch_name
> ?

Another option would be to publish a personal clone of the krusader
[1]. This way you could share branches that are no "official
branches". On the developer machine you only need one repository with
several remotes [2,3,4].

I suggest to share "experimental" branches (or feature branches on
which several people are working) via personal repositories between
developers and push only release branches into the official
repository. If you have work that should go into the main line after
the 2.4 release we could create an 2.4 branch in the official repo and
to the integration/bugfixing there (with regular merging into the
master). New stuff can then go into mater directly.
Opinions? How do other projects handle these issues? The Amarok team
seems to have much experience with it...

[1] http://community.kde.org/Sysadmin/GitKdeOrgManual#Personal_repositories
[2] http://techbase.kde.org/Development/Git/Recipes#Tracking_Branches
[3] http://stackoverflow.com/questions/849308/pull-push-from-multiple-remote-locations
[4] http://techbase.kde.org/Getting_Started/Sources/Amarok_Git_Tutorial

bye,
Jonas

jan

unread,
Aug 5, 2012, 3:08:02 PM8/5/12
to krusade...@googlegroups.com
On Sun, 5 Aug 2012 20:13:06 +0200
Jonas Bähr <jonas...@web.de> wrote:

> Hi
>
> Am 05.08.2012 um 16:13 schrieb Andrey Matveyakin:
> > 2012/8/5 jan <jan_l...@gmx.de>
> [...]
> > > Btw: I did some changes to KrCalcSpaceDialog (and many other
> > > parts
> > too)
> > > in a separate branch (to be merged after 2.4 release), so it
> > > would
> > be
> > > best if you worked on top of that when you do bigger changes.
> > > I just need to figure out how to upload branches first ;-)
> >
> > Does KDE policy prevent developers from simply doing
> > git push origin your_branch_name
> > ?
>
> Another option would be to publish a personal clone of the krusader
> [1]. This way you could share branches that are no "official
> branches". On the developer machine you only need one repository
> with several remotes [2,3,4].
>

Hi Jonas,

thanks for the suggestion.

> I suggest to share "experimental" branches (or feature branches on
> which several people are working) via personal repositories between
> developers and push only release branches into the official
> repository. If you have work that should go into the main line after
> the 2.4 release we could create an 2.4 branch in the official repo
> and to the integration/bugfixing there (with regular merging into
> the master). New stuff can then go into mater directly.
> Opinions? How do other projects handle these issues? The Amarok team
> seems to have much experience with it...
>
> [1]
> http://community.kde.org/Sysadmin/GitKdeOrgManual#Personal_repositories
> [2] http://techbase.kde.org/Development/Git/Recipes#Tracking_Branches
> [3]
> http://stackoverflow.com/questions/849308/pull-push-from-multiple-remote-locations
> [4]
> http://techbase.kde.org/Getting_Started/Sources/Amarok_Git_Tutorial
>
> bye,
> Jonas
>

I'd suggest to continue to use master for bugfixes.
With the 2.4.0 release we can create a 2.4 branch and then cherry-pick
bugfixes from master to go into the 2.4.x releases.
But I will have a look at how the Amarok team does it.

Jan

jan

unread,
Aug 5, 2012, 3:42:11 PM8/5/12
to krusade...@googlegroups.com
I just realized that that the warning fixes by Andrey caused a lot of
conflicts when merging them in my post-2.4 branch. So it would be
better in future to commit all non-bugfix changes to post-2.4.
Or we go by what you have suggested - I'll have to think about that.

Andrey Matveyakin

unread,
Aug 5, 2012, 3:55:20 PM8/5/12
to krusade...@googlegroups.com
2012/8/5 jan <jan_l...@gmx.de>
--
You received this message because you are subscribed to the Google Groups "krusader-devel" group.
To post to this group, send email to krusade...@googlegroups.com.
To unsubscribe from this group, send email to krusader-deve...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Yes, that's a problem of using a single branch — the longer you keep your changes waiting for next release the more conflicts arise.

I think, we should have at least two branches in the main repository. I my view, merging release branch to master and cherry-picking bugfixes from master to release are both fine.

But there's still a question: where should we keep commits too experimental to be even in master? I like Jonas' idea of using clones. Let's keep official repo clean.

As of the conflict with warning fixes, if you have any troubles resolving them, it might be easier to revert my commits and fix the warnings anew when you push you changes — making gcc happy isn't a hard task.

Andrey

Andrey Matveyakin

unread,
Aug 5, 2012, 6:19:04 PM8/5/12
to krusade...@googlegroups.com
2012/8/5 Andrey Matveyakin <a.matv...@gmail.com>

By the way, I think there's at least one thing we can borrow from amarok's rules (though I can't say they follow it very good themselves): I suggest to use
  git pull --rebase
instead of
  git pull
to sync local changes with main repository.

A don't want to start a "merge vs rebase" holy war here, there's certainly nothing criminal in merges, but as for me, the history looks cleaner when it's straightforward. I find it both easier to observe and to handle in scripts.

And what do you think?

Andrey

jan

unread,
Aug 8, 2012, 5:18:06 PM8/8/12
to krusade...@googlegroups.com
Hi Andrey,

> Yes, that's a problem of using a single branch — the longer you keep
> your changes waiting for next release the more conflicts arise.
>
> I think, we should have at least two branches in the main repository.
> I my view, merging release branch to master and cherry-picking
> bugfixes from master to release are both fine.
>
> But there's still a question: where should we keep commits too
> experimental to be even in master? I like Jonas' idea of using
> clones. Let's keep official repo clean.
>

I will likely create a clone.

> As of the conflict with warning fixes, if you have any troubles
> resolving them, it might be easier to revert my commits and fix the
> warnings anew when you push you changes — making gcc happy isn't a
> hard task.

Thanks, but they are already merged :-)

>
> Andrey
>

Jan

jan

unread,
Aug 8, 2012, 5:40:55 PM8/8/12
to krusade...@googlegroups.com
I'm not sure yet, I fear that rebasing could mess things up because
it rewrites history.


Jonas Bähr

unread,
Aug 9, 2012, 3:42:22 PM8/9/12
to krusade...@googlegroups.com
Hi,

Am 08.08.2012 um 23:40 schrieb jan:
> On Mon, 6 Aug 2012 02:19:04 +0400
> Andrey Matveyakin <a.matv...@gmail.com> wrote:
>> [...]
>> A don't want to start a "merge vs rebase" holy war here, there's
>> certainly nothing criminal in merges, but as for me, the history
>> looks cleaner when it's straightforward. I find it both easier to
>> observe and to handle in scripts.
>>
>> And what do you think?
>>
>> Andrey
>>
>
> I'm not sure yet, I fear that rebasing could mess things up because
> it rewrites history.

As long as you did not publish your changes yet it's fine, I would
even recommend it. Once you pushed them into a public repository you
can't rebase any more (at least you're strongly discouraged) for
exactly that reason. In fact, git doesn't let you push such things
unless you --force it. But that's not allowed in most public repos,
Krusader's included.

bye,
Jonas

Andrey Matveyakin

unread,
Aug 16, 2012, 5:44:42 PM8/16/12
to krusade...@googlegroups.com
2012/8/9 Jonas Bähr <jonas...@web.de>
--
You received this message because you are subscribed to the Google Groups "krusader-devel" group.
To post to this group, send email to krusader-devel@googlegroups.com.
To unsubscribe from this group, send email to krusader-devel+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.


Yes, rebasing is quite harmless until you use it only for local changes that were never pushed anywhere. To be safe, you can create a backup branch before rebasing and keep it until you test if everything went fine (I usually do so).

Andrey
Reply all
Reply to author
Forward
0 new messages