Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re (and a patch) : Coda git repository available

0 views
Skip to first unread message

u-m...@aetey.se

unread,
Apr 22, 2016, 3:33:34 AM4/22/16
to
On Thu, Apr 21, 2016 at 02:25:28PM -0400, Jan Harkes wrote:
> On Thu, Apr 21, 2016 at 02:16:02PM +0200, u-m...@aetey.se wrote:
> > The only problem we hit there was not with clog but with memory alignment
> > in cfs (which we then fixed).
>
> Looking forward to seeing patches related to that.

Attaching.

> > There is no http involved (it would be a large overhead without any reason).
> >
> > This data does not have to be secured against MitM, the worst which can
> > happen is disruption of communication / DOS which MitM always can achieve.
>
> Depends on what the configuration contains.

Correct. That's why we can not come to a conclusion without looking at
the actual code / specification. We can return to the subject when we
look at the corresponding patch later.

> > With ViceGetVolumeLocation() we make a similar mistake (!), we do
> > not allow for the returned string to be invalidated "properly". There
> > should have been some kind of a validity promise (say a TTL) included.
> > A possibility to change host names in a realm setup without confusing
> > the clients is a Good Thing (TM), why forbid this by design?
>
> GetVolumeLocation returns a DNS hostname, DNS records have a TTL. There
> is nothing here forbidden by design, we just don't duplicate something
> that is already there.

In your scheme there are several mappings to do:
volume/server -[1]-> dnsname -[2]-> ip

It was not about [2] but about [1] which "risks to fail to become invalidated"
when it is stale.

I understand now that you implied to reresolve volume/server->dnsname
when the client loses contact with a server.

We in our code trigger reresolution when a server becomes unreachable
(for "too long").

Then indeed no special expiration time is needed.

> It just seems like we're having the same discussions over and over

You are right, we have already presented the arguments a couple
of years ago and also commented on them, no need to duplicate.

> again, and yet it always results in to us disagreeing and you just

No, not _that_ bad. We actually agree on most things, we just put the
discussion effort into the ones where we do not :) the other ones go
unnoticed.

Think also, if I wouldn't value your comments, I wouldn't ask for them.

> doing your thing anyway,

That's how the things get done. One implements what one feels is the
most suitable approach.

> Jan

Best regards,
Rune
0001-Avoid-unaligned-memory-access-in-cfs-important-on-p.patch

Jan Harkes

unread,
Apr 22, 2016, 11:56:39 AM4/22/16
to
On Fri, Apr 22, 2016 at 09:32:17AM +0200, u-m...@aetey.se wrote:
> coda-src/vtools/cfs.cc | 27 +++++++++++++++++++--------
> 1 files changed, 19 insertions(+), 8 deletions(-)
>
> --- a/coda-src/vtools/cfs.cc
> +++ b/coda-src/vtools/cfs.cc
> @@ -1826,8 +1826,8 @@ static void ListVolume(int argc, char *argv[], int opslot)
> VolumeStatus *vs;
> char *volname, *omsg, *motd;
> VolumeStateType conn_state;
> - int conflict, cml_count;
> - unsigned int age, time;
> + int32_t conflict, cml_count;
> + uint32_t age, time;
> uint64_t cml_bytes;
> char *ptr;
> int local_only = 0;
> @@ -1859,25 +1859,36 @@ static void ListVolume(int argc, char *argv[], int opslot)
> cml_count, offlinemsg, motd, age, time, cml_bytes) */
> ptr = piobuf; /* invariant: ptr always point to next obj
> to be read */
> +/* we hope that piobuf is sufficiently well aligned */

It should because as far as I remember to make it easier on the kernel
it is always aligned on a page boundary and always smaller than the size
of a memory page.

> vs = (VolumeStatus *)ptr;
> ptr += sizeof(VolumeStatus);
> volname = ptr; ptr += strlen(volname)+1;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is the real culprit, if those strings were padded to the next
32-bit boundary none of the rest would even be necessary.

> - conn_state = (VolumeStateType)*(int32_t *)ptr;
> +/* avoid unaligned memory accesses */
> + { int32_t temp;
> + memcpy(&temp, ptr, sizeof(int32_t));
> + conn_state = (VolumeStateType)temp;
> + }
> +/* conn_state = (VolumeStateType)*(int32_t *)ptr; */

I am somewhat surprised you aren't getting the same alignment errors
when this is packed on the venus side, because the code there is pretty
much identical and even has a comment about possible alignment issues.

/* do we have to worry about alignment? */
*(int32_t *)cp = (int32_t)conn_state; cp += sizeof(int32_t);
...

This just needs to be fixed by padding the strings out.

I can't believe this was the only place that had an alignment problem
like this, those ioctls are badly packed all over the place.

Jan

Jan Harkes

unread,
Apr 22, 2016, 12:10:51 PM4/22/16
to
On Fri, Apr 22, 2016 at 11:55:46AM -0400, Jan Harkes wrote:
> This just needs to be fixed by padding the strings out.

Actually I'm going to fix this by reshuffling the packed fields
alignment issues pop up when an N-byte type is not read at an N-byte
boundary. I think if I move the strings to the end everything else
should fall into place.

Jan

u-m...@aetey.se

unread,
Apr 22, 2016, 1:25:48 PM4/22/16
to
For my taste consequently treating the data as a byte array and
packing/unpacking explicitly is safer in the long run and easier to
modify without bothering about alignment.

But of course as long as the constraints on data member ordering are
documented/commented, just go ahead, aligned access is faster and the
code smaller (for cfs this hardly makes any difference though :).

There are no compatibility problems either as cfs and venus being in
use together are assumed to be built together (which does not sound
positive to me, I would prefer very much to have a venus<->cfs protocol
specification instead, but this is the status quo anyway).

Rune

u-m...@aetey.se

unread,
Apr 22, 2016, 1:41:09 PM4/22/16
to
On Fri, Apr 22, 2016 at 11:55:46AM -0400, Jan Harkes wrote:
> I can't believe this was the only place that had an alignment problem
> like this, those ioctls are badly packed all over the place.

May be we are especially lucky, but this was the only visible problem.

Our changes in venus (for non-ip-based server addressing) might have
influenced this but with a quick look I do not see anything relevant
there.

Rune

Jan Harkes

unread,
Apr 22, 2016, 2:05:41 PM4/22/16
to
On Fri, Apr 22, 2016 at 07:40:00PM +0200, u-m...@aetey.se wrote:
> On Fri, Apr 22, 2016 at 11:55:46AM -0400, Jan Harkes wrote:
> > I can't believe this was the only place that had an alignment problem
> > like this, those ioctls are badly packed all over the place.
>
> May be we are especially lucky, but this was the only visible problem.

This particular VIOC_GETVOLSTAT pioctl call is used in 4 different
places in cfs and 2 places in repair, your fix addressed one place. Also
things like the pioctl in/out structures seem to be defined at the spot
of the implementation which makes it very easy to make a change
somewhere that is not picked up elsewhere.

Considering how many places need to change I'm no longer sure I want to
reshuffle and am starting to think that having common pioctl pack/unpack
functions in a shared place might be a better approach. Going to have to
ponder this a bit.

Jan

u-m...@aetey.se

unread,
Apr 22, 2016, 4:12:00 PM4/22/16
to
On Fri, Apr 22, 2016 at 02:04:51PM -0400, Jan Harkes wrote:
> On Fri, Apr 22, 2016 at 07:40:00PM +0200, u-m...@aetey.se wrote:
> > May be we are especially lucky, but this was the only visible problem.
>
> This particular VIOC_GETVOLSTAT pioctl call is used in 4 different
> places in cfs and 2 places in repair, your fix addressed one place. Also

It is likely that we never exercised repair on ARM machines.

(The repair process has unfortunately never been robust even on intel :(
but we did not have the resources to analyze and fix it.
I don't think we ever lost data but it is not unusual to see repair
misbehave.)

> things like the pioctl in/out structures seem to be defined at the spot
> of the implementation which makes it very easy to make a change
> somewhere that is not picked up elsewhere.
>
> Considering how many places need to change I'm no longer sure I want to
> reshuffle and am starting to think that having common pioctl pack/unpack
> functions in a shared place might be a better approach. Going to have to
> ponder this a bit.

Nice, thanks for this cleanup!

Rune

Jan Harkes

unread,
Apr 22, 2016, 9:35:30 PM4/22/16
to
I don't have a cleanup yet, maybe thank whoever resolves
https://github.com/cmusatyalab/coda/issues/1.

I am busy trying to get binary builds working again. The library dance
is all coming back to me,

- build tarballs for lwp, rpc2, rvm, coda.
- build lwp library as deb and rpm packages for 32 and 64 bit.
(really should build separate versions for each stable and lts
distribution wheezy, jessie, trusty, wily, xenial, fc-22, fc-23,
rhel-6, rhel-7, but that is 18 different combinations and it
doesn't cover freebsd/netbsd and such for which I don't even have
build VMs set up)
- find a problem in the rpm build.

- rebuild tarballs for lwp, rpc2, rvm, coda.
- build lwp library for debian/ubuntu/fedora/rhel.
- push lwp into private repos.
- build rpc2 library for debian/ubuntu/fedora/rhel.
- discover a problem with how rpc2 or lwp was built or packaged, go
back to the beginning.

- rebuild tarballs for lwp, rpc2, rvm, coda.
- build lwp library for debian/ubuntu/fedora/rhel.
- push lwp into private repos.
- build rpc2 library for debian/ubuntu/fedora/rhel.
- push rpc2 into private repos.
- build rvm library for debian/ubuntu/fedora/rhel.
- push rvm into private repos.
- discover that I somehow managed to not install the right lwp
library on the build host so I have to rebuild rpc2 and rvm and
push new versions.
...

It doesn't help that the packaging files are part of the source tree so
any packaging problem requires a rebuild of the source archive, that was
probably mistake #1. The fact that any scripts I used to use are no
longer working as I moved to a new desktop at least twice since the last
release. Right now it is actually easier and far more reliable to just
build from git than it is to build these packages from source tarballs.

Jan

Adam Goode

unread,
Apr 23, 2016, 3:47:33 AM4/23/16
to
I recommend removing the specfiles from your repository. They are distro-specific and should be stored in the distro's repositories. Here are Fedora's:

Although it looks like coda has been retired from Fedora from lack of use.

If you want to do binary packaging, you might consider making a Docker package.


Adam

 
Jan


u-m...@aetey.se

unread,
Apr 23, 2016, 3:57:09 AM4/23/16
to
On Fri, Apr 22, 2016 at 09:34:45PM -0400, Jan Harkes wrote:
> I am busy trying to get binary builds working again.

> - build lwp library as deb and rpm packages for 32 and 64 bit.
> (really should build separate versions for each stable and lts
> distribution wheezy, jessie, trusty, wily, xenial, fc-22, fc-23,
> rhel-6, rhel-7, but that is 18 different combinations and it
> doesn't cover freebsd/netbsd and such for which I don't even have
> build VMs set up)

Why not build a single binary Coda client package which works on all
reasonable distros (including all large ones besides rhel, who for an
inexplicable reason omits the Coda kernel module)?

This would save a lot of effort.

Did you check the Aetey installer?

http://www.aetey.se/dl/cocli-2.1-6.9.5+20140810-linux-ia32-1.bin
(1.5MB and installs literally in a couple of seconds)

It is all open source of course:

http://www.aetey.se/dl/cocli-2.1-6.9.5+20140810-linux-ia32-1.src.tar

(or see the corresponding s/ia32/x86_64/ versions)

Even doing a sligtly adapted variations for different distros
would be a lot less pain than yours now, because the client does not
have _any_ dependencies on the distro, besides the "rc interface" to be
started on boot.

> - rebuild tarballs for lwp, rpc2, rvm, coda.
> - build lwp library for debian/ubuntu/fedora/rhel.
> - push lwp into private repos.
> - build rpc2 library for debian/ubuntu/fedora/rhel.
> - discover a problem with how rpc2 or lwp was built or packaged, go
> back to the beginning.

> - rebuild tarballs for lwp, rpc2, rvm, coda.
> - build lwp library for debian/ubuntu/fedora/rhel.
> - push lwp into private repos.
> - build rpc2 library for debian/ubuntu/fedora/rhel.
> - push rpc2 into private repos.
> - build rvm library for debian/ubuntu/fedora/rhel.
> - push rvm into private repos.
> - discover that I somehow managed to not install the right lwp
> library on the build host so I have to rebuild rpc2 and rvm and
> push new versions.
> ...

This kind of effort is not necessary, honestly.

My 1 cent...

Rune

Jan Harkes

unread,
Apr 23, 2016, 10:19:34 AM4/23/16
to
Actually, it is vital for the overall health of the project. I would not
have found out that when librpc2 build with the lua extensions that we
have to add libm as a library dependency and various other things.

The same reason why building on a different architecture (arm) or with a
different compiler (clang) can surface hidden problems.

Jan

u-m...@aetey.se

unread,
Apr 23, 2016, 12:32:35 PM4/23/16
to
On Sat, Apr 23, 2016 at 10:18:46AM -0400, Jan Harkes wrote:
> On Sat, Apr 23, 2016 at 09:56:08AM +0200, u-m...@aetey.se wrote:
> > This kind of effort is not necessary, honestly.
> >
> > My 1 cent...
>
> Actually, it is vital for the overall health of the project. I would not
> have found out that when librpc2 build with the lua extensions that we
> have to add libm as a library dependency and various other things.

It is in practice virtually unseen that some upstream source builds
out-of-the-box on every usable distribution and gets it "just right".

My concern is that your time is invaluable for hacking on Coda internals,
preferably not on idiosyncrasies of different environments.

But this is your time and your decisions.

> The same reason why building on a different architecture (arm) or with a
> different compiler (clang) can surface hidden problems.

Every benefit has its cost...

Hope you soon will be done with this packaging jungle!

Best regards,
Rune

Jan Harkes

unread,
Apr 27, 2016, 4:04:02 PM4/27/16
to
On Fri, Apr 22, 2016 at 09:06:02PM -0500, Adam Goode wrote:
> On Fri, Apr 22, 2016 at 8:34 PM, Jan Harkes <jaha...@cs.cmu.edu> wrote:
> > It doesn't help that the packaging files are part of the source tree so
> > any packaging problem requires a rebuild of the source archive, that was
> > probably mistake #1. The fact that any scripts I used to use are no
...
> I recommend removing the specfiles from your repository. They are
> distro-specific and should be stored in the distro's repositories. Here are
> Fedora's:
> http://pkgs.fedoraproject.org/cgit/rpms/lwp.git/
> http://pkgs.fedoraproject.org/cgit/rpms/rpc2.git/
> http://pkgs.fedoraproject.org/cgit/rpms/rvm.git/
> http://pkgs.fedoraproject.org/cgit/rpms/coda.git/
>
> Although it looks like coda has been retired from Fedora from lack of use.

Good recommendation. I 'forked' a package we already had for another
project in our group and turned it into a Coda packaging repository that
holds all the Debian/Ubuntu/Fedora/RHEL packaging related chaos.

https://github.com/cmusatyalab/coda-packaging

The good part is that now we have proper package signing and something
of a manageable workflow to consistently build binary and source packages.

The not-so-good part is that it became pretty much impossible to
correctly build from separate source archives and so lots of things got
shuffled around.

liblwp/librpc2/librvm -> coda-common package *1
codaconfedit -> coda-common package *2
lib*-dev -> not packaged
rvmutl/rdstool -> coda-server package *3

*1 no actual conflicts with the old library packages because we install
them under /usr/lib/coda/.
*2 conflicts with old coda-client and coda-server packages, but makes
installing clients and servers on the same machine more straightforward
without needing special diversions.
*3 only used by the server anyway, does introduce a conflict with the
old rvm-tools package.

Also because I based the RPM build on the Fedora spec file, the
enterprise linux family (RHEL/CentOS/...) is currently not building. el6
is broken because it doesn't have systemd, and el7 is broken because it
probably uses a different package name for the readline5 library
hopefully in a couple of iterations those platforms will build again.

> If you want to do binary packaging, you might consider making a Docker
> package.

Not sure if you are joking or being serious here.

Jan

Adam Goode

unread,
Apr 27, 2016, 5:24:20 PM4/27/16
to
It's a reasonable thing to investigate. You can essentially pick whichever Linux distribution to start from (like https://hub.docker.com/_/debian/) , then build a coda overlay and package that all up. Cross-distro coda installation then becomes:

docker pull cmusatyalab/coda



 

Jan


0 new messages