db files

Cotton Seed

Mar 30, 2014, 1:17:47 PM
to clearsk...@googlegroups.com
What's the intended design for the configuration and share db files? To
store configuration in ~/.clearskies/ and store the db file for a share
in $SHARE/.clearskies/?

Perhaps I should explain where I'm coming from. To test* the tracker
server, I needed a tracker client. So I created a dummy clearskies
client that just connects to the tracker.

But to really test the tracker, I need clients to create, remove and
attach shares. So I added a control server to the clearskies client and
wrote a rudimentary command-line control client in Python in the spirit
of Steven's old ruby command line client.

Now I'm ready to implement create_share in the clearskies client. I get
the basic idea from the share tests, but Share takes an explicit dbpath,
and I'm not sure what the design is here.

After this, I'd be inclined to make a first cut at access IDs so I can
see the whole tracker lifecycle roughly working.

Also, Dropbox DMCA takedowns? We need to get cracking!
https://news.ycombinator.com/item?id=7495888

Best,
Cotton

*I know TDD is all the rage, but I'm inclined to top-down testing. I
like to see a rough skeleton of the system working so I know the
internal organization is relatively stable before I write careful unit
tests.

Daniel Cachapa

Mar 30, 2014, 5:27:33 PM
to ClearSkies Dev List
I'd argue against storing any files under the shared directories:
- It pollutes the user's filesystem
- It would make it impossible to share read-only directories, say for
auto-backup of your pictures folder with limited access control
- It will cause problems in systems with multiple users

I understand the attractiveness of storing conf files in the shares
themselves -- btsync does it -- but I think it's worth the extra work
to have them constrained to a single directory. ~/.clearskies/ sounds
like the best bet.
Daniel Cachapa

Pedro Larroy

Mar 30, 2014, 8:51:07 PM
to clearsk...@googlegroups.com
My original idea is that if you have a share, say "share_dir", there's either a "share_dir.db", or you can keep the db in memory if you just want a temporary client without any I/O. This is already tested (see the tables created in share.cpp).

Then there's another sqlite3 database for the daemon configuration which has things like port, settings and references to the known shares the daemon should attach on boot. (see conf.cpp)
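
As a rough illustration of the split described above, here is a minimal Python sketch of a daemon configuration database that remembers which shares to attach on boot. The table names, column names, and function names are hypothetical and are not taken from conf.cpp; passing ":memory:" gives the throwaway, no-I/O variant.

```python
import sqlite3


def open_conf_db(path=":memory:"):
    """Open (or create) a daemon configuration database.

    The schema here is an assumption made for illustration only.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings ("
        " key TEXT PRIMARY KEY,"
        " value TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shares ("
        " share_path TEXT PRIMARY KEY,"
        " manifest_path TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def attach_share(conn, share_path, manifest_path):
    """Remember a share so the daemon can re-attach it on boot."""
    conn.execute(
        "INSERT OR REPLACE INTO shares (share_path, manifest_path) VALUES (?, ?)",
        (share_path, manifest_path),
    )
    conn.commit()


def known_shares(conn):
    """Return (share_path, manifest_path) pairs, sorted by share path."""
    return conn.execute(
        "SELECT share_path, manifest_path FROM shares ORDER BY share_path"
    ).fetchall()
```

Because the manifest location is just a column, the same table can point at a db next to the share, in a central directory, or in memory.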

Pedro.
--
Pedro Larroy Tovar   |    http://pedro.larroy.com/

Pedro Larroy

Mar 30, 2014, 8:53:12 PM
to clearsk...@googlegroups.com
@daniel: where to store the share database should be set in the configuration database (see conf.cpp), so it should be possible to keep everything in any directory: in .clearskies, in some other single directory, or next to the share, for example if the share is on a USB drive.

Pedro.

Daniel Cachapa

Mar 31, 2014, 4:37:45 AM
to ClearSkies Dev List
A config option for that would certainly be fine, if the default is to
store all configurations in a single dir.

I remember the discussion on the problems with shares on volumes which
might not be always available, such as usb drives.
Perhaps those shares can be marked as "volatile" by the user, and
there we'd make the exception of writing the db inside the share.

The problem with these approaches is that they hurt user experience:
the user needs to be aware of the technical details of the application
to know what those options even mean, let alone when to use them.

I wonder if there isn't a cleaner way of doing this without depending
on the user. We can certainly distinguish between a directory which
has suddenly become empty (the user deleted the contents) and a
directory which isn't there anymore (the host volume has been
unmounted).
The problem, I guess is on Unix systems when you're sharing the root
of a volume. In that case, unmounting the volume may leave an empty
mount directory behind, though it should be possible to detect if
we're sharing the root of a volume and to know if it's mounted by the
system or not.
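
A minimal Python sketch of that distinction, assuming the client records at attach time whether the share was the root of a volume (the `was_mount_point` flag and both function names are hypothetical):

```python
import os


def share_state(share_path):
    """Classify a share directory as 'missing', 'empty', or 'present'."""
    if not os.path.isdir(share_path):
        return "missing"
    with os.scandir(share_path) as entries:
        has_entries = any(True for _ in entries)
    return "present" if has_entries else "empty"


def volume_probably_unmounted(share_path, was_mount_point):
    """Guess whether a share is a leftover mount directory.

    If the share was a mount point when it was attached and no longer is,
    the volume was most likely unmounted rather than emptied by the user.
    """
    return was_mount_point and not os.path.ismount(share_path)
```

A client could treat "missing" or "probably unmounted" as "share temporarily unavailable" and only propagate deletions when the share is "empty" on a volume that is still mounted.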

Daniel Cachapa

Dmitry Yakimenko

Mar 31, 2014, 5:16:21 AM
to clearsk...@googlegroups.com
On Monday, March 31, 2014 2:51:07 AM UTC+2, Pedro Larroy wrote:
> Then there's another sqlite3 database for the daemon configuration which has things like port, settings and references to the known shares the daemon should attach on boot. (see conf.cpp)

Shouldn't this be human readable so it could be edited? 

Pedro Larroy

Mar 31, 2014, 6:53:34 AM
to clearsk...@googlegroups.com

We have one config db for the daemon and then one manifest database for each share, which can be hundreds of MB for big shares. I thought the default would be a file with a name derived from the shared directory, such as 'shared' and 'shared__cs.db' in the same parent dir. I think it makes more sense to have the default together with the shared directory because:

1. You can erase the manifest db if you want.
2. Copy the manifest db together with the share to move it somewhere (ex USB)
3. Don't blow up a hidden dir in the users home by default.
4. Don't blow up some /system dir in android. The manifest is in the sdcard together with the share.

You can always associate a share with a manifest in a different location set in conf.db and as seen in the cs::Share ctor.

$ cs attach shared

$ cs attach shared ~/.cs/share__cs.db
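
The naming rule behind those two commands could look roughly like this in Python. The function names are hypothetical; only the 'shared' to 'shared__cs.db' convention comes from the message above.

```python
from pathlib import PurePosixPath


def default_manifest_path(share_dir):
    """Return the sibling manifest path, e.g. 'shared' -> 'shared__cs.db'."""
    share = PurePosixPath(share_dir)
    # with_name() replaces the last path component, so the manifest lands
    # in the share's parent directory, not inside the share itself.
    return share.with_name(share.name + "__cs.db")


def resolve_manifest(share_dir, manifest=None):
    """Mimic `cs attach SHARE [MANIFEST]`: an explicit manifest wins."""
    if manifest:
        return PurePosixPath(manifest)
    return default_manifest_path(share_dir)
```

Note that this sketch raises ValueError for a share at the filesystem root ("/"), since the root has no name to derive a sibling from.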

Maybe you have other ideas worth discussing. Those were my original thoughts. I didn't get why this approach might hurt user experience.  Maybe you can elaborate.

Regards.

Pedro.

Pedro Larroy

Mar 31, 2014, 7:07:23 AM
to clearsk...@googlegroups.com

I think an sqlite file is ok for human interaction as a config file, especially when you can configure it through simple commands in the command-line client, easily through a GUI in another language, or through the remote control protocol.

Don't you think? I think it's better to skip the headache of parsing some config file. Of course it could also be JSON; that's another option.

I don't have a strong opinion on this. But at the very least we need a db to store the shares that the daemon has attached.

Pedro.

Pedro Larroy

Mar 31, 2014, 7:10:05 AM
to clearsk...@googlegroups.com

You can see how I thought this to work in the share unit test. Nothing set in stone. We can change to something better if somebody has a better idea.

Pedro.

Daniel Cachapa

Mar 31, 2014, 8:00:19 AM
to ClearSkies Dev List
> I think it
> makes more sense to have the default together with the shared directory
> because:
>
> 1. You can erase the manifest db if you want.

You can do that as well in the user's config dir. In fact, erasing the
db is akin to removing the share, something we can easily do from the
UI.
On the other hand, when the user finds files in his share that he
didn't expect, he might delete those files not knowing they're related
to clearskies.

> 2. Copy the manifest db together with the share to move it somewhere (ex
> USB)

This seems to be an advantage only for very large shares, but is this
something we expect to happen a lot (moving very large shares around)?

> 3. Don't blow up a hidden dir in the users home by default.

I see this as the biggest benefit of having the shares together.
Though I'd say this can be fixed by allowing a configurable settings
path for clearskies. It's standard procedure in Unix, and will have
the benefit of making Clearskies a portable app. In fact, this will
also somewhat fix problem 2.

> 4. Don't blow up some /system dir in android. The manifest is in the sdcard
> together with the share.

Android offers the possibility to request internal or external private
storage for apps, depending on the intended usage:
http://developer.android.com/guide/topics/data/data-storage.html
Essentially, for something like this, we'd use the app's external
private storage area.

I'm not aware if there would be any problems in other platforms, but
it seems clear to me that if you can write your data into the shared
directories, you will also be able do it in some other external config
directory.

> Maybe you have other ideas worth discussing. Those were my original
> thoughts. I didn't get why this approach might hurt user experience. Maybe
> you can elaborate.

Let me try to enumerate the problems I see with putting the db with the share:
1. It's invasive - for example, I curate very carefully my music and
photo folders, and don't like to see unrelated files being put there
2. It's unclear to the user which app the file belongs to, and if
deleting it will break anything (she may not even be using clearskies
anymore)
3. The above can lead to the user deleting the file, which will break the app
4. Clearskies is a protocol, not an app. If two running apps
implementing clearskies share the same folder then a) they'll share
the db file, leading to conflicts, or b) have different filenames,
which will force each implementation to somehow recognize clearskies
manifests and not share them
5. Uninstalling the app will be anything but clean. Deleting the
binaries and config directory won't be enough. The user (or our
uninstall program) will have to go through every shared folder and
delete the manifest. Putting issues with automated deletion aside, if
the folder is unavailable, garbage will be left behind
6. It has privacy implications: essentially you're leaving behind a
log of all the files which have existed in every share

It's also unclear to me the side-effects of having:
1. Two users in the same system both running clearskies on the same share
2. A user create sub-shares - say he shares his entire photo album
with his immediate family, but also wants to share a particular event
with a larger group of friends
3. Using different synching software on the same share. Say I'm
sharing my entire photo album with Dropbox, and the particular event
with Clearskies. Dropbox won't recognize our manifest and will replicate it
just like any other file.
4. Restoring an old share from backup. Say that share has since been
destroyed and re-created in the network

In all, I don't like the idea of apps writing private files deeply
into the filesystem. If all apps were to do it, we'd have dozens of
configuration files splattered across our computers.
Windows used to do this (or still does, don't know) when storing the
image caches in hidden files inside the directory. Gnome did it right
IMO: the caches are stored in a single config directory in the user's
home. If the user wants to clear the cache for any reason, it's a
simple task of deleting that one directory.
This also protects the user's privacy since it doesn't leave cached
images in what may be publicly accessible folders.

I hope this clears up my position.

Pedro Larroy

Mar 31, 2014, 9:38:20 AM
to clearsk...@googlegroups.com

I think you have given good arguments. I agree now that the default behaviour should be to keep all the manifests in the config dir. You did miss the point that the manifest is not inside the shared folder but a sibling in its parent directory, which invalidates some of your points, but not all.

What should this directory be?  .clearskies/shares ?

Pedro.

Daniel Cachapa

Mar 31, 2014, 9:47:20 AM
to ClearSkies Dev List
Ah, sorry I did miss that point.
Unfortunately I've been very busy lately and haven't been able to keep
proper track of the project other than lurking here in the list. I
hope to be able to contribute again soon.

.clearskies/shares sounds good to me.
Daniel Cachapa

Dmitry Yakimenko

Mar 31, 2014, 1:08:32 PM
to clearsk...@googlegroups.com
I think all files, and especially configs, should be human readable when possible. It's then also possible to commit them to revision control. I don't think an sqlite db is ok for human interaction, though a command-line interface via a tool is ok, similar to git config.
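
To make the "git config"-style idea concrete, here is a small Python sketch of get/set helpers over an INI-style text file, using the standard configparser module. The file format, the function names, and the "daemon"/"port" names in the usage note are assumptions for illustration, not anything the project has settled on.

```python
import configparser


def config_set(path, section, key, value):
    """Set section.key = value in an INI-style file, creating it if needed."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently ignored
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, key, str(value))
    with open(path, "w") as handle:
        parser.write(handle)


def config_get(path, section, key, default=None):
    """Return section.key as a string, or default if it is not set."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.get(section, key, fallback=default)
```

For example, `config_set("clearskies.conf", "daemon", "port", 12345)` leaves a file with a `[daemon]` section that can be edited by hand or kept under revision control.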

Dima.

Dmitry Yakimenko

Mar 31, 2014, 1:13:27 PM
to clearsk...@googlegroups.com
I think the .db should be inside the share, similar to the .git or .hg directory inside a repo. It can be hidden with a dot on Unix and marked hidden on Windows. Then it's possible to move a share which is possibly still being synced to another drive/device. Many programs do that. Rsync creates some temporary files, and wget does it too. I'm not sure about Dropbox. BTSync does it as well.

Dima.

Daniel Cachapa

Mar 31, 2014, 1:35:42 PM
to ClearSkies Dev List
I wrote an extensive list on what I see as disadvantages of storing
the db with the share. I think that if we go that route, then those
issues would need to be addressed.
The fact that other software does it is not a strong argument, in my opinion.

Having said that, I use rsync to mirror my computer's HD to a private
server, and have never seen any leftover files from those actions.
Same goes for wget, though I haven't used it extensively.

- Dropbox does create private files, but Dropbox uses a different
paradigm from ours: all of its shares are kept under a dedicated
folder which is created and maintained by the app.
However, DB doesn't create any other files inside any subfolders. In
fact, you can create symlinks to other folders in your system, and DB
will happily sync them without the need for any extra config files.
In a way, DB behaves as I was proposing for CS, since by default the
Dropbox folder is created in the user's home folder. The only
difference to clearskies is that DB keeps the shares inside their
config directory, and we will have them elsewhere in the system.

- Git, hg and other source control systems are special cases --
they're not really shared folders but rather somewhat of a
self-contained filesystem. If you try to mix source control systems in
the same folder you're just asking for trouble.

- Btsync does create config files inside the shares, but that is wrong
IMO. I could list other things btsync does which I also consider
wrong, so I'm not sure that we should be following them too closely
:-)
Daniel Cachapa

Dmitry Yakimenko

Mar 31, 2014, 1:48:41 PM
to clearsk...@googlegroups.com
I read your list and I agree with many points there. 

Rsync and wget create temporaries in the same folder where they download files to. This is what I meant. So, technically if you have that folder in Dropbox, it would try to constantly resync the downloaded file.

I guess it's just personal preferences. I like when things are self contained. I like when I can install software by copying it (or even better just running it without copying). The same goes for shared folders. I'd like to be able to just copy the shared (possibly not yet completely) folder to another device and keep sharing it without needing to muck about with configs and side files. I had to deal with that with BTSync and I didn't like that after copying it didn't pick up everything automatically.

I would not like to be left with garbage in my home folder when I erase a shared folder with rm -rf. 

I especially would not want an app to create a sibling to the shared folder. Either pollute ~/.clearskies or put it inside the share. Not outside of it.

Just an opinion. I hope we can choose the best approach with time. We can implement one, try the other later, see what's best.

Dima.

Daniel Cachapa

Mar 31, 2014, 3:28:43 PM
to ClearSkies Dev List
> Rsync and wget create temporaries in the same folder where they download
> files to. This is what I meant. So, technically if you have that folder in
> Dropbox, it would try to constantly resync the downloaded file.

Ah, those would be temp files then. I guess we'll need something
similar, but that's orthogonal to the config file placement.

> I guess it's just personal preferences. I like when things are self
> contained. I like when I can install software by copying it (or even better
> just running it without copying). The same goes for shared folders. I'd like
> to be able to just copy the shared (possibly not yet completely) folder to
> another device and keep sharing it without needing to muck about with
> configs and side files. I had to deal with that with BTSync and I didn't
> like that after copying it didn't pick up everything automatically.

I also like things to be self-contained, which is why I like them to
be all kept in the same folder. I guess we just have slightly
different views on self-containment :-) I do see your point though.
I don't see rebuilding the manifest as a big issue. Really, for most
shares it'll be a matter of seconds, and for really long ones it'll be
a one-time thing anyway. How often are you setting up huge shares by
copying between computers?

My experience with btsync was the reverse: I didn't like that it was
leaving those .btsync folders everywhere. It was bad on my PC, but
even worse on my mobile when I was experimenting with it.
Particularly bad was when I was having problems with btsync and wanted
to revert to a clean slate. It took me quite some time to discover
those hidden directories inside my shares.

> I would not like to be left with garbage in my home folder when I erase a
> shared folder with rm -rf.

This can be solved by the client. Deleting a share does not delete it
from the list of configured shares (its volume might have been
unmounted). The entry can be marked with a warning sign indicating
that, and you can remove the share there. Removing the share should
clean the manifest from your home folder automatically.

> I especially would not want an app to create a sibling to the shared folder. Either
> pollute ~/.clearskies or inside the share. Not outside of it.

Agreed.

> Just an opinion. I hope we can choose the best approach with time. We can
> implement one, try the other later, see what's best.

Also agreed. I just thought that if we can come to an agreement now,
we could avoid wasting effort later on changing the approach.

Steven Jewel

Mar 31, 2014, 7:03:15 PM
to clearsk...@googlegroups.com
On 03/31/2014 01:28 PM, Daniel Cachapa wrote:
>> Rsync and wget create temporaries in the same folder where they download
>> files to. This is what I meant. So, technically if you have that folder in
>> Dropbox, it would try to constantly resync the downloaded file.
>
> Ah, those would be temp files then. I guess we'll need something
> similar, but that's orthogonal to the config file placement.

On unix it's important to keep the temporary files in the same directory
as the files they are going to replace, or possibly a subdirectory like
./.tmp/. The reason why is that in order to get atomic replacement with
rename(), you need the file to be on the same mount point.

This is why rsync creates temporary files the way it does.

It's possible to download the files in /tmp or in the clearskies config
directory, and then move them to ~/share/filename.tmp.!sync, and then
rename.
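
The same-directory temporary file trick can be sketched in Python as follows. This is an illustration of the general technique, not code from any clearskies implementation; the function name and the temporary file naming are assumptions.

```python
import os
import tempfile


def atomic_write(dest_path, data):
    """Replace dest_path with data so readers see the old or the new file,
    never a partially written one."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temporary file next to the destination so that both are
    # on the same mount point; rename() cannot cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, dest_path)  # rename() under the hood on POSIX
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Downloading into /tmp first, as suggested above, would add one extra copy into the share's directory before the final rename.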

Steven

Steven Jewel

Mar 31, 2014, 7:09:43 PM
to clearsk...@googlegroups.com
On 03/31/2014 07:38 AM, Pedro Larroy wrote:
> What should this directory be? .clearskies/shares ?
>

With clearskies-ruby we put all of the files in
~/.local/share/clearskies, with a separate database file for each share.

Probably it'd be better to put the configuration in ~/.config/clearskies
and the shares in ~/.local/share/clearskies.

I imagine OS X doesn't follow the XDG, so on there it'd be better to do
what is normal for the platform. Likewise on mobile and windows we'd
want to defer to the platform to know where to put these files.

Also to address Cotton's original question, for clearskies-ruby we had a
CLEARSKIES_DIR environment variable that could be used when testing to
override the config directory. This let us launch multiple copies at
the same time.
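
A sketch of how that lookup might work on a Unix-like system, in Python. The XDG_CONFIG_HOME and XDG_DATA_HOME variables and their defaults come from the XDG Base Directory spec; letting CLEARSKIES_DIR override both locations is an assumption made here, and the function name is hypothetical.

```python
import os


def clearskies_dirs(environ=None):
    """Return the config and data directories for this run."""
    env = os.environ if environ is None else environ

    # Assumption: a single override moves everything into one directory,
    # which makes it easy to run several test instances side by side.
    override = env.get("CLEARSKIES_DIR")
    if override:
        return {"config": override, "data": override}

    home = env.get("HOME") or os.path.expanduser("~")
    config_home = env.get("XDG_CONFIG_HOME") or os.path.join(home, ".config")
    data_home = env.get("XDG_DATA_HOME") or os.path.join(home, ".local", "share")
    return {
        "config": os.path.join(config_home, "clearskies"),
        "data": os.path.join(data_home, "clearskies"),
    }
```

Other platforms (OS X, Windows, mobile) would need their own branch that asks the platform for the right location.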

Steven

Pedro Larroy

Mar 31, 2014, 7:24:03 PM
to clearsk...@googlegroups.com
I thought against having the manifest *inside* the share as this special file needs to be handled so it's not synchronized.

Pedro

Pedro Larroy

Mar 31, 2014, 7:26:04 PM
to clearsk...@googlegroups.com
I think we can make this a configurable option so people can choose what they prefer. It should be quite easy.

Pedro.

Steven Jewel

Mar 31, 2014, 7:52:55 PM
to clearsk...@googlegroups.com
On 03/31/2014 11:48 AM, Dmitry Yakimenko wrote:
> I guess it's just personal preferences. I like when things are self
> contained. I like when I can install software by copying it (or even
> better just running it without copying). The same goes for shared
> folders. I'd like to be able to just copy the shared (possibly not yet
> completely) folder to another device and keep sharing it without needing
> to muck about with configs and side files. I had to deal with that with
> BTSync and I didn't like that after copying it didn't pick up everything
> automatically.

That model works well for git and its .git directory because git is
initiated by the user, and the user will initiate it in the right
directory. Clearskies doesn't work the same way. If a directory is
moved it won't know what to sync anymore, since it won't be able to find it.

We can use filesystem events to sometimes detect when the entire share
is renamed or moved, but that won't work everywhere and in all cases.

If we did embed all the share information within the share directory,
we'd still have to search all the hard drives to try and find it if it
disappears, which most users wouldn't appreciate [1].

I'd recommend for clearskies_core we stick with a simple text file
config for non-share-specific options, and then leave the issue you're
mentioning (not wanting to muck with config files) to the GUI [2].

Steven

[1]: Remember how Windows 95 used to do this when you clicked on a
shortcut and it pointed to something that no longer existed? It'd have
some flashlight animation while it tried to find the program in
question, which never worked.

[2]: I'd recommend keeping the GUI(s) out of the clearskies_core
repository, and instead have them be separate projects, as we have
discussed before.

Having a daemon as well as a command-line interface like
clearskies-ruby seems like the right amount of interface to have in the
core project.