KristalliProtocol work-in-progress specification


Jukka Jylänki

Jan 20, 2010, 1:03:25 PM
to realxt...@googlegroups.com
Hello,

There has been a long-running track of research in rex on how to extend
the bottom-level OSUDP protocol layer to suit rex-specific scene
synchronization, and on how to fix the various performance problems the
current protocol is known to have. A recent design/testing period resulted
in some ideas for a protocol architecture that would be more suitable for
use in rex. The description of the architecture, as well as a reference
library implementation (C++, Winsock2), is now hosted at
http://clb.demon.fi/Kristalli/ .

The documentation contains an API reference as well as some code snippets,
but due to lack of time the full code base (and downloadable binaries for
the test samples) is not there yet. These will be uploaded later, but for
now, we welcome any feedback on the protocol design itself.

Best Regards,

Jukka

Morgaine

Jan 20, 2010, 1:53:45 PM
to realxt...@googlegroups.com
Is Kristalli in use somewhere?

We're heading towards TCP-based transport for VWRAP (the IETF working group and interop protocol for next-gen SL), but the only mechanism on the table at the moment is horribly kludged HTTP of the COMET variety which just isn't going to have the desired throughput and latency performance, so I'm interested in better suggestions.


Morgaine.




=============================
> --
> http://groups.google.com/group/realxtend-dev
> http://wiki.realxtend.org
> http://dev.realxtend.org
>

Toni Alatalo

Jan 20, 2010, 2:06:11 PM
to realxt...@googlegroups.com
Morgaine wrote:

> Is Kristalli in use somewhere?

It was used in a game that Jukka was developing long ago, and in a
distributed processing thing he did recently.

We have been thinking that we could perhaps merge it with MXP, 'cause at
least the goals are the same, but we haven't looked at the similarities /
differences in the actual protocols yet. MXP has been revising its specs,
working on a transport layer spec, and thinking of using Google protobufs
for defining packets (Kristalli uses its own XML format for that, but they
don't differ much iirc; we could perhaps switch to using protobufs with
Kristalli too).

> We're heading towards TCP-based transport for VWRAP
> <https://www.ietf.org/mailman/listinfo/ogpx> (the IETF working group
> and interop protocol for next-gen SL), but the only mechanism on the
> table at the moment is horribly kludged HTTP of the COMET variety
> which just isn't going to have the desired throughput and latency
> performance, so I'm interested in better suggestions.

In Kristalli, whether you use TCP or UDP is set by a single option in the
call where you open the connection (or thereabouts); both work. I think
Jukka has been using it mostly with UDP.
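
As a rough illustration of "one option selects the transport", the idea could look like this with plain Python sockets. This is not Kristalli's actual API (the reference implementation is C++/Winsock2); the function name `open_connection` and its `transport` parameter are made up for this sketch.

```python
import socket

# Hypothetical helper, not part of KristalliProtocol:
# a single argument picks the transport.
_SOCKET_TYPES = {"tcp": socket.SOCK_STREAM, "udp": socket.SOCK_DGRAM}


def open_connection(host, port, transport="udp"):
    """Return a socket connected to (host, port) over the chosen transport."""
    try:
        sock_type = _SOCKET_TYPES[transport.lower()]
    except KeyError:
        raise ValueError(
            "transport must be 'tcp' or 'udp', got %r" % (transport,)
        ) from None
    sock = socket.socket(socket.AF_INET, sock_type)
    try:
        # For UDP, connect() only fixes the default peer; nothing is sent yet.
        sock.connect((host, port))
    except OSError:
        sock.close()
        raise
    return sock
```

The caller's code stays the same for both transports; only the one argument changes, e.g. `open_connection("example.org", 2345, "tcp")` (host and port here are placeholders).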

COMET style is what I've so far assumed could be an alternative for at
least things like inventory: we already have webdav for that in Taiga
now, but something comet style would add change notifications and
possibly other events to that. We'll soon see how Kristalli performs for
that, as the guys are working on a proof of concept demo thing.

> Morgaine.

~Toni


Toni Alatalo

Jan 20, 2010, 2:11:58 PM
to realxt...@googlegroups.com
Toni Alatalo wrote:

> COMET style is what I've so far assumed could be an alternative for at
> least things like inventory, like we already have webdav for that in
> Taiga now but something comet style would add change notifications and
> possible other events to that. We'll soon see how Kristalli performs
> for that as the guys are working on a proof of concept demo thing.

Err, actually that upcoming demo is about asset transfers: an asset storage
protocol that supports things like viewing what assets are available
etc., so similar to inventory in that sense, but not like the SL
inventory, 'cause it actually is the asset store and not just asset
references.

But otherwise the Kristalli protocol is general, something that could
work for all networking, like MXP.

>> Morgaine.
>
> ~Toni

same.

Ryan McDougall

Jan 20, 2010, 4:08:23 PM
to realxt...@googlegroups.com, Antti Ilomäki

As always a comparison about how this protocol compares to the many
existing implementations such as

* Enet (http://enet.bespin.org/)[1,2]
* SST (http://pdos.csail.mit.edu/uia/sst/)[1]
* protocol buffers (http://code.google.com/apis/protocolbuffers/)[3]
* apache thrift (http://incubator.apache.org/thrift/)
* RakNet (http://www.jenkinssoftware.com/)
* ...

[1] used in sirikata
[2] used in intensity engine
[3] used in MXP

is of course required. This should include the amount of time required
to complete the code base versus the amount of time required to learn
an existing library.

You also reference the suitability of a network library for use within
reX. Can you also link to a discussion and analysis of realXtend's
requirements for a network layer and a comparison matrix of your
protocol as well as some of the above alternatives?

Observations about the directions of existing popular systems such as
Second Life or Unreal Engine would also be appreciated, as decisions
as profound as those regarding the protocol layer are never made
lightly; especially given the Steering Group's oft repeated insistence
that realXtend (Naali) should never be caught depending on a single
back-end server/protocol.

Cheers,

Ryan McDougall

Jan 20, 2010, 4:43:05 PM
to realxt...@googlegroups.com, Antti Ilomäki
2010/1/20 Ryan McDougall <semp...@gmail.com>:

Also, your library appears to be windows-only. Do you have any plans
for Linux or Mac ports?

Cheers,

Toni Alatalo

Jan 20, 2010, 5:21:28 PM
to realxt...@googlegroups.com
Ryan McDougall wrote:

> As always a comparison about how this protocol compares to the many
> existing implementations such as
>

Some quick remarks based on what I've learned so far:

> * Enet (http://enet.bespin.org/)[1,2]

This may well be a viable candidate. I haven't used it yet, and Jukka
also only knew it superficially the last time we talked about it.
Certainly something to consider, may well be identical.

> * protocol buffers (http://code.google.com/apis/protocolbuffers/)[3]

AFAIK this is only for serialization, defining packets, not a transport
layer protocol. Kristalli covers both and, as said in an earlier mail,
could adopt protobufs for describing packets. Right now it just has what
was written earlier, before protobufs even existed I think.

> * RakNet (http://www.jenkinssoftware.com/)
>

This is the only one of these tools which I've actually used, and happily so.
From a user point of view (I just used the API when I made a networked
game and didn't look into the internals) it seems identical to Kristalli
based on what I know: it gives the means to open a connection, describe
packets, and then send those packets either as reliable or unreliable
etc. and receive them.

The difference is that RakNet is commercial, as in they sell licenses to
use it, even though it is open source in the sense that the source is
available and small indie projects can use it for free. So AFAIK
businesswise not compatible with the bsd style 'do whatever you want, no
strings attached' in reX.

> is of course required. This should include the amount of time required
> to complete the code base versus the amount of time required to learn
> an existing library.
>

In the prototyping Jukka has done now, I think the motivation for using
Kristalli has been exactly that it required least time: he had already
written it earlier for other reasons, so it already existed and he knew
it by heart, so it has been easy to use. Fair enough.

But completing it to be what reX needs is another story, more on
that below.

> You also reference the suitability of a network library for use within
> reX. Can you also link to a discussion and analysis of realXtend's
> requirements for a network layer and a comparison matrix of your
> protocol as well as some of the above alternatives?
>

I don't know if such a thing exists yet; it is something that should be
done as part of this research.

Briefly, on a high level, I know that we and AFAIK everyone else want
efficiency and robustness: fast unreliable packets, reliable packets,
and ways to prioritize. Also a way to combine packets.

And a way to define your own packets as first-class citizens in the
protocol for your application, so that all that prioritization and
combining etc. works well for them too.
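
To make the wish list above a bit more concrete, here is a minimal sketch of a send queue that orders messages by priority and combines several small ones into a single datagram. Everything in it is an assumption for illustration: the 5-byte header layout, the 1400-byte budget, and the names `OutQueue` and `split_datagram` are not taken from Kristalli or any other library. Reliability is only carried as a flag; acknowledgement and resend logic is left out.

```python
import heapq
import itertools
import struct

MAX_DATAGRAM = 1400  # assumed payload budget per datagram, in bytes
# Assumed frame header: message id, flags (bit 0 = reliable), body length.
_HEADER = struct.Struct("!HBH")


class OutQueue:
    """Priority-ordered send queue that combines small messages."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # keeps FIFO order within a priority

    def push(self, msg_id, body, priority=0, reliable=False):
        frame = _HEADER.pack(msg_id, 1 if reliable else 0, len(body)) + body
        if len(frame) > MAX_DATAGRAM:
            raise ValueError("message too large (no fragmentation in this sketch)")
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), frame))

    def next_datagram(self):
        """Pop as many queued frames as fit, highest priority first."""
        parts, size = [], 0
        while self._heap and size + len(self._heap[0][2]) <= MAX_DATAGRAM:
            frame = heapq.heappop(self._heap)[2]
            parts.append(frame)
            size += len(frame)
        return b"".join(parts)


def split_datagram(data):
    """Inverse of the combining step: yield (msg_id, reliable, body) tuples."""
    offset = 0
    while offset < len(data):
        msg_id, flags, length = _HEADER.unpack_from(data, offset)
        offset += _HEADER.size
        yield msg_id, bool(flags & 1), data[offset:offset + length]
        offset += length
```

Because application-defined messages go through the same `push` call as built-in ones, they get the same prioritization and combining, which is the "first-class citizen" point.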

> Also, your library appears to be windows-only. Do you have any plans
> for Linux or Mac ports?

Obviously it is a requirement if this is to be adopted. Reportedly it would
not be a big job: just replacing the Winsock API calls with Unix
sockets (which are afaik identical on Linux and Mac), not many lines of
code. But it is not required for a proof-of-concept demo, and the
current impl is his reference impl.

Besides other platforms, I think the protocol needs to be implemented in
several languages as well. At least in c# for opensim, unless we just
use a native implementation as a library there too.

It has seemed promising in the MXP community that there are
implementations also in Java, and folks who test C++ impls on Solaris
etc. - I think the protocol is exactly the thing we want everyone to be
able to talk (for example, Java is quite a nice way to write OpenGL-using
games that launch from a web browser; it's perhaps not talked about that
much, but those games are out there and some of them are quite big and work
well .. both gfx and networking. my 9 year old likes power football :)

So I'm happy this spec and ref impl are out there, so that e.g. the MXP
folks defining their own transport layer, or the Sirikata people who use
enet and I guess know it well, can read it if they want - but I agree that
we must also do our homework and read those similar things.

~Toni

Toni Alatalo

Jan 20, 2010, 5:39:29 PM
to realxt...@googlegroups.com
Toni Alatalo wrote:
> Certainly something to consider, may well be identical.

One difference seems to be that Enet is UDP only, whereas in Kristalli
you can use either UDP or TCP, just by defining it in a parameter in a
single call.

Whether that means it would be better to use Kristalli, or just use Enet
and add TCP support to it if it's needed some day, I don't know. Depends
on how they compare otherwise, and how necessary TCP is.

> ~Toni

same.

Ryan McDougall

Jan 21, 2010, 1:45:29 AM
to realxt...@googlegroups.com

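> In the prototyping Jukka has done now, I think the motivation for using
> Kristalli has been exactly that it required least time: he had already
> written it earlier for other reasons, so it already existed and he knew
> it by heart so has been easy to use. Fair enough.

But Jukka is the only person for whom that is true.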

> But completing for it to be what reX needs is another story, more on that
> below.
>
>> You also reference the suitability of a network library for use within
>> reX. Can you also link to a discussion and analysis of realXtend's
>> requirements for a network layer and a comparison matrix of your
>> protocol as well as some of the above alternatives?
>>
>
> I don't know if such a thing exists yet, something that should be done as
> part of this research.

This part absolutely has to come first. Making a library and then
finding a use is the cart before the horse.

Prioritization is the fundamental element of technical leadership, and
without having a proper analysis of the problem, and a complete
proposal for consideration, it is impossible to do anything but make
random guesses.

It may be that some people know the problem domain well, or find the
networking library high priority, but this is a group of many people
and limited resources. The problem must be well understood by all
parties, and commonly prioritized. If we can't manage that, we might
as well be trained monkeys wearing engineer suits.

This proposal starts out on the wrong foot, not least because it
references problems with something called "OSUDP" (presumably OpenSim
UDP), a protocol that doesn't exist. Likely what was meant is LLUDP
(Linden Lab's Second Life UDP protocol).

When I was hired by the SG to organize and prioritize work over a year
ago, after a *lengthy* consultation about their wishes, I decided
against changing the underlying protocol, and made that decision clear
and well known. The reasoning was a) all parties had previously agreed
that SL/SLUDP was an acceptable solution for legacy reX b) while I
could think of many protocol improvements on many levels for VWs, I
could think of none that was so important that it would justify making
Yet Another Incompatible Protocol.

Ultimately what people (users) care most about is the experience. The
UI, the 3D graphics, ease of use, immersion, communication, etc. How
that is accomplished is irrelevant to them. If those user-visible
features can be accomplished without resorting to YAIP (and the
implied diversion of resources that should otherwise be applied to
making people really enjoy reX), then YAIP is not justified. If they
cannot be accomplished let's make a list and justify the expenditure.

If anyone wants to overturn this very reasonable, year+ old,
SG-influenced decision, then they really have to make their case
first; which is what should have been done with the research time. Not
doxygen.

> Briefly on a high level I know that we and AFAIK everyone else wants
> efficiency and robustness - fast unreliable packets, and reliable packets,
> and ways to prioritize. Also to combine packets.

Engineers want packet combining. Users want beautiful avatars,
reliability, and good performance; ie. features.

With limited resources one must compromise. If packet combining is
something users will notice, then the case first must be made to the
satisfaction of all parties. Preferably backed up with some real
numbers from profiling.

> And a way to define own packets as first-class citizens in the protocol for
> your application, so that all that priorization and combining etc. works
> well for them too.
>
>> Also, your library appears to be windows-only. Do you have any plans
>> for Linux or Mac ports?
>
> Obviously it is a requirement if this will be adopted, reportedly would not
> be a big job - just the replace to the winsock api calls to unix sockets
> (which are afaik identical on linux and mac), not many lines of code. But it
> is not required for a proof-of-concept demo, and the current impl is his
> reference impl.

Not a big job, yet so many projects spend so much time on it.
Everything we do in naali is cross platform, yet we need a dedicated
person to keep linux running, and we still have no mac support.

Maintenance is the majority of effort over the lifecycle of a software project.

> Besides other platforms, I think the protocol needs to be implemented in
> several languages as well. At least in c# for opensim, unless we just use a
> native implementation as a library there too.

OpenSim already provides an easy to use application host. I've also
made it clear that if new protocols are to be considered, they must be
OpenSim-based. Anything else is just Not Invented Here syndrome.

> It has seemed promising in the MXP community that there are implementations
> in also Java and folks who test c++ impls on Solaris etc. - I think exactly
> the protocol is the thing we want everyone to be able to talk (for example
> java is quite a nice way to write opengl using games that launch from a web
> browser, it's perhaps not talked that much but those games are out there and
> some of them are quite big and work well .. both gfx and networking. my 9
> year old likes power football :)
>
> So I'm happy this spec and ref impl is out there, so if e.g. MXP folks
> defining their own transport layer, or Sirikata people who use enet and I
> guess know it well, can read it if they want - but I agree that we must also
> do our homework and read those similar things.
>
> ~Toni

I am happy the spec is out, but I am not confident how many people are
interested in discussing YAIP design. MXP has precisely this problem
-- it's not bad stuff, it's just no one has found it so useful as to
make it a justifiable option. Everyone has a limited budget.

Cheers,

Toni Alatalo

Jan 21, 2010, 1:58:22 AM
to realxt...@googlegroups.com
Ryan McDougall wrote:

>> In the prototyping Jukka has done now, I think the motivation for using
>> Kristalli has been exactly that it required least time: he had already
>>
> But Jukka is the only person for whom that is true.
>

Yes, and AFAIK he is also pretty much the only person doing this
research and experimentation so far.

> This part absolutely has to come first. Making a library and then
> finding a use is the cart before the horse.
>

Like said the protocol already existed.

> Prioritization is the fundamental element of technical leadership, and
>

Yes, and this is a major decision that we'll have to make quite soon.
There are different plans and needs for reX tech - one idea has indeed
been to first complete all the basics well with the current tech,
exactly to focus on features, ui etc., and not touch the protocol
department yet (like during spring), so that we would have something
actually usable as soon as possible, and then return to possibly
switching protocols in autumn or so. Another approach would be to switch
the protocol first, and then implement the features using it, but that
seems like a slower road to something really usable, though arguably a
better foundation for the longer-term future.

~Toni

Ryan McDougall

Jan 21, 2010, 2:24:54 AM
to realxt...@googlegroups.com
On Thu, Jan 21, 2010 at 8:58 AM, Toni Alatalo <ant...@kyperjokki.fi> wrote:
> Ryan McDougall wrote:
>>>
>>> In the prototyping Jukka has done now, I think the motivation for using
>>> Kristalli has been exactly that it required least time: he had already
>>>
>>
>> But Jukka is the only person for whom that is true.
>>
>
> Yes, and AFAIK he is also pretty much the only person doing this research
> and experimentation so far.

But the research isn't for one person, it's for the whole project.

>> This part absolutely has to come first. Making a library and then
>> finding a use is the cart before the horse.
>>
>
> Like said the protocol already existed.

And like I said, that's not material yet.

>> Prioritization is the fundamental element of technical leadership, and
>>
>
> Yes and this is a major decision that we'll have to make quite soon. There
> are different plans and needs for reX tech - one idea has been indeed to
> first complete all the basics well with the current tech, exactly to focus
> on features, ui etc., and not touch the protocol department yet (like during
> spring). So that we would have something actually usable as soon as
> possible. Then return to possibly switching protocols in autumn or so.
> Another approach would be to switch the protocol first, and then implement
> the features using it, but that seems like a slower road to get something
> really usable. But arguably a better foundation for longer term future.

Yes, one could argue that. That's why we need proposals to know what
we're arguing. As of yet, there is no proposal, so there's nothing to
argue.

> ~Toni

Cheers,

Toni Alatalo

Jan 21, 2010, 3:13:22 AM
to realxt...@googlegroups.com
Ryan McDougall wrote:

>> Like said the protocol already existed.
>>
>
> And like I said, that's not material yet.
>

It was something he could use to make a proof of concept demo. Yes, it's
not a finished solution for something reX could just start using.

> Yes, one could argue that. That's why we need proposals to know what
> we're arguing. As of yet, there is no proposal, so there's nothing to
> argue.
>

Agreed. Basically how I see this is that the demo will give us more
information, and is hence a sensible part of research.

One thing of interest there is asset downloads - afaik sludp is slow and
unreliable for that. I have understood that it is a real problem that
after logging in to a remote server you may end up with an incomplete
world with missing objects and textures, and it takes long for them to
download if they come. It was a major problem in one demo situation at
least last year (Tuomo told me when he was setting it up). That's one
reason why we (and especially you :) have worked on getting assets over
http, 'cause that's reliable and at least the thruput for a single
download is AFAIK pretty much as fast as it can be - hopefully we get
that pipeline completed soon.

But Jukka is arguing that it can be done even better, for a large number
of small assets which is the case in Ogre/reX data, and wants to make a
demo of that and it hasn't been much work (at least from the project
budget). Besides thruput, there is the issue of change notifications,
which is not a prob in a sludp/kristalli style connection, but AFAIK is
not solved for webdav yet (apart from polling which e.g. file system
mounts of webdav dirs does, but that doesn't scale well enough?) - I've
understood that something comet style could be it there.

This is something we could measure and compare: use a) slviewer &/
libomv &/ naali code against opensim, b) some http client(s) (wget?
naali?) against a web server, and c) the upcoming Kristalli storage
client-server demo, to fetch a fair bunch of assets (like 500 meshes which
refer to 300 materials which use 400 textures, or so). From Australia to
Finland and vice versa if our friends down under participate?-) Then see
whether all the data came and how fast.
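
A minimal sketch of such a measurement harness, assuming each client under test can be wrapped in a function that takes an asset id and returns the bytes (or `None` when the asset never arrives). The name `run_fetch_benchmark` and the shape of the result dictionary are made up here. Fetching is sequential for simplicity; a real client would likely pipeline requests, so absolute numbers from this would be pessimistic.

```python
import time


def run_fetch_benchmark(fetch, asset_ids):
    """Fetch every asset with `fetch(asset_id) -> bytes or None` and summarise.

    `fetch` is a stand-in for whichever client is being measured
    (an SLUDP viewer library, an HTTP client, a Kristalli demo client, ...).
    """
    asset_ids = list(asset_ids)
    started = time.perf_counter()
    received_bytes = 0
    missing = []
    for asset_id in asset_ids:
        try:
            data = fetch(asset_id)
        except Exception:
            data = None  # count failures as missing rather than aborting the run
        if data is None:
            missing.append(asset_id)
        else:
            received_bytes += len(data)
    elapsed = time.perf_counter() - started
    return {
        "requested": len(asset_ids),
        "received": len(asset_ids) - len(missing),
        "missing": missing,
        "bytes": received_bytes,
        "seconds": elapsed,
    }
```

Running the same list of asset ids through each client gives directly comparable answers to the two questions above: did everything arrive, and how long did it take.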

I'm curious about the unreliability and slowness of sludp that people
sometimes mention (Tommi H. the prev time I heard) - I have seen it
sometimes myself, like the day before yesterday I logged on several times
to the same region on sciencesim and let Naali sit there a while, to
finally get at least most of the textures for nice screenshots. But I
don't recall experiencing this in SL; it might just be that I didn't
notice or don't remember, of course. I didn't find anything about this
with quick googling now. I do recall people saying things like 'reliable
packets in sludp are not really reliable' etc. but don't know if that's
actually the case. I think this is a major point regarding whether a
change is needed - having incomplete scenes is just not acceptable.

~Toni

lasse...@ludocraft.com

Jan 21, 2010, 3:40:26 AM
to realxt...@googlegroups.com
> I'm curious about the unreliability and slowness of sludp that people
> sometimes mention (Tommi H. the prev time I heard) - I have seen it
> sometimes myself, like the day before yesterday logged on several times
> to the same region on sciencesim and let Naali sit there a while, to
> finally get at least most of the textures for nice screenshots. But I
> don't recall experiencing this in SL, might be just that didn't notice
> or don't remember of course. Didn't find anything about this with quick
> googling now. Do recall people saying things like 'reliable packets in
> sludp are not really reliable' etc. but don't know if that's actually
> the case. I think this a major point regarding whether a change is
> needed - having incomplete scenes is just not acceptable.
>
> ~Toni

I believe part of the problem is that OpenSim is tuned to serve a SL based
client, and there is an "unspoken" and undocumented part of how SLUDP is
supposed to be used. While Naali uses the same protocol, it does not
conform to the same usage & request patterns (and obviously it can't,
without examining SL code.)

Also reX (even the old viewer) generally stretches the protocol more
because traditionally all assets other than textures were small
(notecards, scripts and such), but then we started using meshes for
instance.

- Lasse, realXtend developer

Ryan McDougall

Jan 21, 2010, 5:04:31 AM
to realxt...@googlegroups.com
On Thu, Jan 21, 2010 at 10:40 AM, <lasse...@ludocraft.com> wrote:
>> I'm curious about the unreliability and slowness of sludp that people
>> sometimes mention (Tommi H. the prev time I heard) - I have seen it
>> sometimes myself, like the day before yesterday logged on several times
>> to the same region on sciencesim and let Naali sit there a while, to
>> finally get at least most of the textures for nice screenshots. But I
>> don't recall experiencing this in SL, might be just that didn't notice
>> or don't remember of course. Didn't find anything about this with quick
>> googling now. Do recall people saying things like 'reliable packets in
>> sludp are not really reliable' etc. but don't know if that's actually
>> the case. I think this a major point regarding whether a change is
>> needed - having incomplete scenes is just not acceptable.
>>
>> ~Toni
>
> I believe part of the problem is that OpenSim is tuned to serve a SL based
> client, and there is an "unspoken" and undocumented part of how SLUDP is
> supposed to be used. While Naali uses the same protocol, it does not
> conform to the same usage & request patterns (and obviously it can't,
> without examining SL code.)

Right, but this problem is actively and with good will being
addressed. The reason why so many SL-isms remain in the code base is a
lack of concrete use cases, not a lack of will to go beyond SL.

Having the most cursory conversation with a core OpenSim dev will lend
insight into the legion failings of SLUDP.

> Also reX (even the old viewer) generally stretches the protocol more
> because traditionally all assets other than textures were small
> (notecards, scripts and such), but then we started using meshes for
> instance.

Yes, and our first step to solving this is using HTTP/WebDAV assets.
This could be helped if there was a developer from ludo willing to
work on the server.

> - Lasse, realXtend developer

Ryan McDougall

Jan 21, 2010, 5:14:47 AM
to realxt...@googlegroups.com
On Thu, Jan 21, 2010 at 10:13 AM, Toni Alatalo <ant...@kyperjokki.fi> wrote:
> Ryan McDougall wrote:
>>>
>>> Like said the protocol already existed.
>>>
>>
>> And like I said, that's not material yet.
>>
>
> It was something he could use to make a proof of concept demo. Yes, it's not
> a finished solution for something reX could just start using.
>
>> Yes, one could argue that. That's why we need proposals to know what
>> we're arguing. As of yet, there is no proposal, so there's nothing to
>> argue.
>>
>
> Agreed. Basically how I see this is that the demo will give us more
> information, and is hence a sensible part of research.
>
> One thing of interest there is asset downloads - afaik sludp is slow and
> unreliable for that, have understood that it is a real problem that after
> logging in to a remote server you may end up with an incomplete world with
> missing objects and textures, and it takes long for them to download if they
> come. Was a major problem in one demo situation at least last year (Tuomo
> told me when he was setting it up). That's one reason why we (and especially
> you :) have worked on getting assets over http, 'cause that's reliable and
> at least the thruput for a single download is AFAIK pretty much as fast as
> it can be - hopefully we get that pipeline completed soon.

Doubleplusagreed. I hope we can get SciSim (with WebDAV
inventory/assets) done asap, and leave UDP assets to the junk heap of
history.

This has always been a priority, but circumstance and lack of
resources have conspired against us.

> But Jukka is arguing that it can be done even better, for a large number of
> small assets which is the case in Ogre/reX data, and wants to make a demo of
> that and it hasn't been much work (at least from the project budget).
> Besides thruput, there is the issue of change notifications, which is not a
> prob in a sludp/kristalli style connection, but AFAIK is not solved for
> webdav yet (apart from polling which e.g. file system mounts of webdav dirs

UDP and WebDAV are not mutually exclusive. If a prim associated with
an asset changes its asset, then the change notification is sent over
SLUDP. Nothing new here.

There is one and only one architectural disadvantage to immutable
assets: you cannot know when a stale asset is no longer referenced
anywhere without (costly) enumeration of all primitives.

There is at least one architectural disadvantage to mutable assets: it
doesn't scale beyond a single authoritative server. Therefore it's not
internet scalable. Therefore it doesn't pass the SG requirements.

This is the same discussion I've had 10 times already.
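
To make the immutable-asset trade-off concrete, here is a small sketch. It assumes assets are identified by a hash of their content, which is an assumption for illustration only; nothing in this thread says how asset ids are assigned. The names `asset_id`, `AssetStore` and `find_unreferenced` are made up. The point to notice is that finding stale assets needs a pass over every primitive's references.

```python
import hashlib


def asset_id(data):
    """Id derived from the bytes themselves, so an id never changes meaning."""
    return hashlib.sha256(data).hexdigest()


class AssetStore:
    """Toy immutable asset store: content is never modified in place."""

    def __init__(self):
        self._assets = {}

    def put(self, data):
        key = asset_id(data)
        self._assets.setdefault(key, data)  # same bytes always map to same id
        return key

    def get(self, key):
        return self._assets[key]

    def find_unreferenced(self, prims):
        """Return ids no prim refers to.

        `prims` is an iterable where each item is the collection of asset
        ids one primitive references. Every prim has to be visited, which
        is the costly enumeration mentioned above.
        """
        referenced = set()
        for prim_refs in prims:
            referenced.update(prim_refs)
        return set(self._assets) - referenced
```

"Changing" an asset here means storing new bytes under a new id and pointing the prim at it; the old id stays valid for anyone who cached it, and only becomes garbage once no prim references it.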

> does, but that doesn't scale well enough?) - I've understood that something
> comet style could be it there.

Let's leave long poll by the side of the road to die, where it
belongs. It's entirely superseded by HTML5 web sockets.

> This is something we could measure and compare: use a) slviewer &/ libomv &/
> naali code against opensim, b) some http client(s) (wget? naali?) against a
> web server, and the c) upcoming Kristalli storage client-server demo, to
> fetch a fair bunch assets (like 500 meshes which refer to 300 materials
> which use 400 textures, or so). From Australia to Finland and vice versa if
> our friends down under participate?-) Then see whether all the data came and
> how fast.

Yes this would at least allow a reasonable discussion.

> I'm curious about the unreliability and slowness of sludp that people
> sometimes mention (Tommi H. the prev time I heard) - I have seen it
> sometimes myself, like the day before yesterday logged on several times to
> the same region on sciencesim and let Naali sit there a while, to finally
> get at least most of the textures for nice screenshots. But I don't recall
> experiencing this in SL, might be just that didn't notice or don't remember
> of course. Didn't find anything about this with quick googling now. Do
> recall people saying things like 'reliable packets in sludp are not really
> reliable' etc. but don't know if that's actually the case. I think this is a
> major point regarding whether a change is needed - having incomplete scenes
> is just not acceptable.

Agreed with this last point.

> ~Toni
>

Cheers,

Frisby, Adam

Jan 21, 2010, 6:03:52 AM
to realxt...@googlegroups.com
> Having the most cursory conversation with a core OpenSim dev will lend
> insight into the legion failings of SLUDP.

Hi! *waves*

SLUDP has a number of failings, but my favourites are:
- No capability to send important information via TCP. This means, if you have a bit of data which will not be superseded in the near future and you need to ensure delivery on time & reliably - you are going to run up against the entire internet routing infrastructure as your enemy.
- Re-try logic is broken. The viewer seems to respond to retries randomly (at best) and because of the way this works, you may end up resending the same packet more times than was strictly necessary.
- Requires tracking a huge number of packets to ensure reliable delivery, which results in a good deal of computational processing when attempting to handle thousands of packets a second.
- Too noisy. Clients send a whole bunch of info (such as AgentUpdates) even when they don't have anything particularly new to say; results in a ton of wasted bandwidth.
- Highly undocumented. It's not a standard; many aspects are a mystery even to LL; the way certain things work like redelivery logic were determined through a lot of trial & error.

UDP is great for one thing - and one thing only. Object motion updates. These can be superseded in the next frame, so sending them unreliably over UDP makes sense - they are fundamentally streaming data.

However, Object motion updates represent 1/500th of the packets defined in SLUDP; and the other 499 packets would benefit tremendously from reliable delivery over TCP. It would greatly simplify the processing aspects (letting routers & OS-level networking stacks handle your reliability) and generally result in a less cryptic protocol.
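
The "can be superseded in the next frame" property is what makes unreliable delivery safe for motion updates: the receiver only needs the newest state and can drop anything older. A minimal sketch, assuming each update carries a per-object sequence number (the class and method names are invented for this example, and sequence-number wraparound is ignored):

```python
class LatestStateReceiver:
    """Keep only the newest motion update per object; drop stale ones."""

    def __init__(self):
        self._latest = {}  # object_id -> (sequence, state)

    def on_update(self, object_id, sequence, state):
        """Return True if the update was applied, False if it was out of date."""
        current = self._latest.get(object_id)
        if current is not None and sequence <= current[0]:
            return False  # late or duplicate datagram, safe to ignore
        self._latest[object_id] = (sequence, state)
        return True

    def state_of(self, object_id):
        entry = self._latest.get(object_id)
        return None if entry is None else entry[1]
```

No acknowledgements or resends are needed for this kind of data: a lost update is simply replaced by the next one.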

Honestly, I think the protocol could be simplified significantly too - this would lead to other benefits, like much easier extensibility (SLUDP does not support extensions or defining new packets for example.)

My personal advice:
- WebDAV is awesome for inventory. Keep at that. It's a good idea.
- HTTP is pretty good for asset delivery (it has some overhead, but the ubiquity and suitability for caching make up for that.)
- Look at something like RTP for sending streaming updates (e.g. the aforementioned Object Motion Updates).
- Implement something semi-standard like protobufs-over-tcp for normal messages.

There are no good standards out there right now for the 'world representation' part of the protocol; so don't fear making something new if you have to. Just make sure it is extensible by third parties - this may mean exchanging lists of understood messages as part of the handshake; but see what you can do about hitting as many as possible with the 'core' set.
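One way the "exchange lists of understood messages" idea could look is sketched below. The message names, the `CORE_MESSAGES` set and the `negotiate` function are all hypothetical and only serve the illustration; no existing protocol is being described.

```python
# Hypothetical 'core' set that every peer is assumed to implement.
CORE_MESSAGES = frozenset({"ObjectUpdate", "ChatMessage", "AssetRequest"})


def negotiate(local_supported, remote_supported):
    """Return the message types both peers understand.

    Raises ValueError if the peer does not cover the core set.
    """
    common = set(local_supported) & set(remote_supported)
    missing_core = CORE_MESSAGES - common
    if missing_core:
        raise ValueError("peer lacks core messages: %s" % sorted(missing_core))
    return common


# Each side advertises its list during the handshake; extensions are optional.
client = CORE_MESSAGES | {"VoiceFrame", "ScriptDebug"}
server = CORE_MESSAGES | {"VoiceFrame", "TerrainPatch"}
agreed = negotiate(client, server)
```

After the exchange, each side only sends message types found in `agreed`, so third-party extensions degrade gracefully when the peer does not know them.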

Adam

Jukka Jylänki

Jan 21, 2010, 12:00:12 PM
to realxt...@googlegroups.com
On Wed, 20 Jan 2010 20:53:45 +0200, Morgaine
<morgain...@googlemail.com> wrote:

> Is Kristalli in use somewhere?

KristalliProtocol has been used in a game project before (which is not
much use here since it's closed source), to implement a distributed
computation cluster (http://clb.demon.fi/PointPacker/) in a programming
competition that was held last year, and for some experimentations
(customized asset server, voip architecture). I am hoping to separate some
of these to self-contained code samples so it would be possible to see how
it is used in practice. It might be a while before I have the chance to do
so, so I wanted to give a "heads-up" on the documentation bits already.

On Wed, 20 Jan 2010 21:11:58 +0200, Toni Alatalo <ant...@kyperjokki.fi>
wrote:

> Err actually that upcoming demo is about asset transfers, asset storage
> protocol that supports things like viewing what assets are available
> etc. so similar to inventory in that sense, but not like the SL
> inventory 'cause it actually is the asset store and not just asset
> references.

Just to clarify, the asset storage system that runs on top of this
protocol is not being developed as part of the core realxtend-dev project,
but with external resources. As such, there are no plans to merge it with,
or replace anything existing in the core.

On Wed, 20 Jan 2010 23:08:23 +0200, Ryan McDougall <semp...@gmail.com>
wrote:

> As always a comparison about how this protocol compares to the many
> existing implementations such as
>

> * apache thrift (http://incubator.apache.org/thrift/)
> * RakNet (http://www.jenkinssoftware.com/)
> * ...
>
> [1] used in sirikata
> [2] used in intensity engine
> [3] used in MXP
>

> is of course required. This should include the amount of time required
> to complete the code base versus the amount of time required to learn
> an existing library.
>

> You also reference the suitability of a network library for use within
> reX. Can you also link to a discussion and analysis of realXtend's
> requirements for a network layer and a comparison matrix of your
> protocol as well as some of the above alternatives?
>

> Observations about the directions of existing popular systems such as
> Second Life or Unreal Engine would also be appreciated, as decisions
> as profound as those regarding the protocol layer are never made
> lightly; especially given the Steering Group's oft repeated insistence
> that realXtend (Naali) should never be caught depending on a single
> back-end server/protocol.

At Toni's request, I documented the internals of a previously
existing protocol I wrote, since he estimated it to be valuable
information when discussing other protocols as well. As you well note,
this task does not include comparison of this system to the ones you
mentioned. A prerequisite for this kind of comparison is the existence of
a reference documentation, and with this documentation, anyone can now
make the comparisons themselves. People will believe their own words
better than those of others, especially when there is a danger that mine
can be biased in this respect.

Now, some developers might have been jumping to conclusions that this
protocol has already been decided to replace the existing implemented
architecture. This has not even been proposed; in fact, the e-mail I
posted does not come with any kind of proposal attached at all.

On Wed, 20 Jan 2010 23:43:05 +0200, Ryan McDougall <semp...@gmail.com>
wrote:

> Also, your library appears to be windows-only. Do you have any plans
> for Linux or Mac ports?

In the core development, no ports of any kind have been planned. Designing
before implementing is important, so that work is not wasted when
directions change.

Thanks for the feedback,

Jukka

Ryan McDougall

Jan 21, 2010, 1:52:02 PM
to realxt...@googlegroups.com

Not really.

Most of us have specialities, and I wouldn't pretend one of mine is
protocol design. As a group we rely on the good will of individuals to
protect the collective best interest. That is what is required.

> those of others, especially when there is a danger that mine can be biased
> in this respect.

I would rather trust your word than fear that it's biased.

> Now, some developers might have been jumping to conclusions that this
> protocol has already been decided to replace the existing implemented
> architecture. This has not even been proposed; in fact, the e-mail I posted
> does not come with any kind of proposal attached at all.

Yes, the fact that the original email contained over-general
assertions, most specifically that SLUDP is broken (granted) and
Kristalli fixes what's broken, yet came with no proposal to debate,
was entirely the problem. Numerous people all had a read of the
original email and their interpretation wasn't substantially different
than mine. Perhaps you could have avoided this misunderstanding by
speaking with me earlier.

> On Wed, 20 Jan 2010 23:43:05 +0200, Ryan McDougall <semp...@gmail.com>
> wrote:
>
>> Also, your library appears to be windows-only. Do you have any plans
>> for Linux or Mac ports?
>
> In the core development, no ports of any kind have been planned. Designing
> before implementing is important, so that work is not wasted when directions
> change.
>
> Thanks for the feedback,
>
>        Jukka
>

Toni Alatalo

Jan 21, 2010, 7:04:16 PM
to realxt...@googlegroups.com
Jukka Jylänki wrote:

> design/testing period resulted in some ideas for a protocol
> architecture that would be more suitable for use in rex. The
> description of the architecture as well as a reference library
> implementation (C++, Winsock2) is now hosted at
> http://clb.demon.fi/Kristalli/ .

Took a first look now, I found you've done a good job with the docs
there - nice clear explanations etc. Some thoughts, also kind of newbie
questions 'cause I haven't really dealt with udp on this level before:

- Ping is unreliable, that i can understand and seems normal, 'cause it
can be used to measure packet loss etc. But "*ConnectionLostTimeout*:
PingInterval * 3" - does that mean that three consecutive lost ping
packets cuts the whole connection, or am I misunderstanding this
somehow? Ah well it'd be 15 seconds .. so not a short time, just bad
luck if all those 3 packets would be dropped even though some packets
would be getting through? I guess with wireless connections packet drop
can be quite high sometimes .. would it be better to use some reliable
packet to see whether the conn is alive, or is this the best way?

- *Disconnect* is unreliable - why? The server will see the
disconnect anyway when the connection is actually closed? I don't see
why messages like this would be unreliable, they are not streaming data
like voice or movements, but session management stuff which I would have
thought is good with reliable packets.

Also there seems to be a copy-paste typo on the writing a client page,
as it says "for the duration that the server is running" when it means
to say that the client (or the connection?) is running.

BTW are the MXP folks lurking here, did you take a note of this? Last I
understood you were in progress of working on a transport layer spec?
That's actually a main reason why I asked Jukka to document this
protocol he'd written earlier, so we could see how it matches MXP goals
and is similar/different w.r.t. what you've been doing there.

I think thanks to reading this I'm also better equipped to read Enet and
MXP etc, just took a quick glance of Enet earlier today but could read
more later (i expect it is small too, like Kristalli is, both in spec
and code).

Also seemed interesting that the Sirikata folks are planning to do
something with websockets - not using Enet in that case I figure, 'cause
it's udp only. I wonder if same Kristalli server code could handle both
udp clients and websocket using tcp clients, 'cause it's just a mode for
it and it can operate in either. Of course on TCP the protocol's own transport
layer becomes (almost) non-existent, 'cause TCP is the Transmission Control
Protocol already .. like the minimality of the Kristalli over tcp spec
also shows .. so I think e.g. the Sirikata folks just put the Google
protobuf packets to the socket.

> Jukka

~Toni

Jukka Jylänki

Jan 26, 2010, 2:53:45 PM
to realxt...@googlegroups.com
On Fri, 22 Jan 2010 02:04:16 +0200, Toni Alatalo <ant...@kyperjokki.fi>
wrote:

> Jukka Jylänki wrote:


>> design/testing period resulted in some ideas for a protocol
>> architecture that would be more suitable for use in rex. The
>> description of the architecture as well as a reference library
>> implementation (C++, Winsock2) is now hosted at
>> http://clb.demon.fi/Kristalli/ .
>
> Took a first look now, I found you've done a good job with the docs
> there - nice clear explanations etc. Some thoughts, also kind of newbie
> questions 'cause I haven't really dealt with udp on this level before:
>
> - Ping is unreliable, that i can understand and seems normal, 'cause it
> can be used to measure packet loss etc. But "*ConnectionLostTimeout*:
> PingInterval * 3" - does that mean that three consecutive lost ping
> packets cuts the whole connection, or am I misunderstanding this
> somehow? Ah well it'd be 15 seconds .. so not a short time, just bad
> luck if all those 3 packets would be dropped even though some packets
> would be getting through? I guess with wireless connections packet drop
> can be quite high sometimes .. would it be better to use some reliable
> packet to see whether the conn is alive, or is this the best way?

The way the documentation is meant to be read is that if the channel is
silent for the period of 'PingInterval * 3', it is declared lost.
So essentially yes, it can happen that in an otherwise idle channel, two
consecutive ping messages are lost and the third one does not get there in
time, which will cause a disconnection. If there are other messages going
through in the channel, this will not cause a disconnection by timeout.
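The silence rule described above can be sketched roughly as follows. This is an illustration only, not code from the reference library: the class and method names are hypothetical, and the 5 second ping interval is inferred from the 15 second figure mentioned in the question.

```python
PING_INTERVAL = 5.0                          # seconds (assumed value)
CONNECTION_LOST_TIMEOUT = PING_INTERVAL * 3  # 15 seconds of silence


class LivenessTracker:
    """Declares a channel lost after a period with no inbound traffic."""

    def __init__(self, now):
        self.last_heard = now

    def on_message_received(self, now):
        # Any message counts, not only pings, so a busy channel
        # survives lost ping messages.
        self.last_heard = now

    def is_lost(self, now):
        return now - self.last_heard >= CONNECTION_LOST_TIMEOUT


tracker = LivenessTracker(now=0.0)
lost_before_timeout = tracker.is_lost(now=14.9)
tracker.on_message_received(now=12.0)   # some other message arrives
lost_after_traffic = tracker.is_lost(now=20.0)
lost_after_silence = tracker.is_lost(now=27.0)
```

Timestamps are passed in explicitly so the rule can be checked without a real clock.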

> - *Disconnect* is unreliable - why? The server will see the
> disconnect anyway when the connection is actually closed? I don't see
> why messages like this would be unreliable, they are not streaming data
> like voice or movements, but session management stuff which I would have
> thought is good with reliable packets.

To simplify the message handling mechanism, Disconnect is not acked
through the built-in PacketAck message, but using an explicit
DisconnectAck message. If a client wants to disconnect, it can wait for a
grace period, or can just forcibly close down without waiting for an Ack
message.
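As a rough sketch of the two shutdown paths just described: the client either waits for DisconnectAck until a grace period expires, or closes immediately. The Disconnect and DisconnectAck names come from the documentation; the class, the state names and the 2 second grace period are assumptions made for illustration.

```python
class DisconnectingClient:
    """Client-side disconnect handling (hypothetical sketch)."""

    def __init__(self, send, grace_period=2.0):
        self.send = send                  # callable that transmits a message name
        self.grace_period = grace_period  # seconds to wait for DisconnectAck
        self.state = "connected"
        self.deadline = None

    def disconnect(self, now, wait_for_ack=True):
        self.send("Disconnect")
        if wait_for_ack:
            self.state = "disconnecting"
            self.deadline = now + self.grace_period
        else:
            self.state = "closed"         # forcible close, no waiting

    def on_message(self, name):
        if self.state == "disconnecting" and name == "DisconnectAck":
            self.state = "closed"

    def tick(self, now):
        # Give up waiting once the grace period has passed.
        if self.state == "disconnecting" and now >= self.deadline:
            self.state = "closed"


sent = []
acked = DisconnectingClient(sent.append)
acked.disconnect(now=0.0)
acked.on_message("DisconnectAck")

timed_out = DisconnectingClient(sent.append)
timed_out.disconnect(now=0.0)
timed_out.tick(now=1.0)
state_during_grace = timed_out.state
timed_out.tick(now=2.0)
```

Either way the client ends up closed, so a lost Disconnect or DisconnectAck only costs the grace period on the client and the timeout on the server.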

Thanks for the replies. These will be taken into account when revising the
document.

Jukka

Tommi Laukkanen

Jan 30, 2010, 2:45:42 AM
to realXtend-dev
Hello Jukka

I have a couple of questions about the Kristalli transport layer
specification:

* Three step handshake to form connection.
-> What are the advantages of this approach compared to simple
connection request and response message pair?

* Message prioritization.
-> How is this practically handled with ordered messages? Can you set
the priority per message or per message type?

Regards,
Tommi

jukka....@ludocraft.com

Jan 31, 2010, 10:02:57 AM
to realxt...@googlegroups.com
> Hello Jukka
>
> I have a couple of questions about the Kristalli transport layer
> specification:
>
> * Three step handshake to form connection.
> -> What are the advantages of this approach compared to simple
> connection request and response message pair?

Nothing major, just the same deal as with TCP three-way handshake. The
extra last message explicitly allows the server to properly transition
from a pending state into an ok state, without having to wait for a
timeout to realize if there was a problem. I see both methods (2-way and
3-way) as viable.
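The server-side benefit of the third message can be sketched as a small state machine. The message names ("ConnectRequest", "ConnectResponse", "ConnectConfirm"), the state names and the 5 second pending timeout are hypothetical placeholders, not the names used in the specification.

```python
class ServerConnection:
    """Server view of a three-step handshake (hypothetical sketch)."""

    def __init__(self, pending_timeout=5.0):
        self.state = "closed"
        self.pending_timeout = pending_timeout
        self.pending_since = None

    def on_message(self, name, now):
        """Handle a handshake message; return the reply to send, if any."""
        if self.state == "closed" and name == "ConnectRequest":
            self.state = "pending"
            self.pending_since = now
            return "ConnectResponse"
        if self.state == "pending" and name == "ConnectConfirm":
            # The third message tells the server its response arrived.
            self.state = "ok"
        return None

    def tick(self, now):
        # Without a confirmation the pending slot is released by timeout.
        if self.state == "pending" and now - self.pending_since >= self.pending_timeout:
            self.state = "closed"


confirmed = ServerConnection()
reply = confirmed.on_message("ConnectRequest", now=0.0)
confirmed.on_message("ConnectConfirm", now=0.1)

abandoned = ServerConnection()
abandoned.on_message("ConnectRequest", now=0.0)
abandoned.tick(now=5.0)
```

With only a request and response pair, the server would have to treat the connection as established right after replying, and could only discover a lost response through later silence.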

>
> * Message prioritization.
> -> How is this practically handled with ordered messages? Can you set
> the priority per message or per message type?
>
> Regards,
> Tommi
>

Prioritization comes into play when there is more data to transfer than the
channel bandwidth can handle. High priority messages get transferred first
and low priority messages need to wait. The priority value manages the
order of messages in a priority queue. The priority is a per-message
attribute. If a message depends on another message, there is of course no
sense in sending it with a higher priority than the one it depends on, since
the receiver would have to hold it until that earlier message arrives.
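A minimal sketch of such a per-message priority queue is shown below, using Python's `heapq`. The `SendQueue` class, the byte budget per send round and the "larger number means more urgent" convention are assumptions for illustration, not the reference implementation.

```python
import heapq
import itertools


class SendQueue:
    """Outbound queue ordered by per-message priority (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within a priority

    def enqueue(self, priority, payload):
        # heapq is a min-heap, so the priority is negated to pop the
        # highest priority first.
        heapq.heappush(self._heap, (-priority, next(self._counter), payload))

    def drain(self, budget_bytes):
        """Pop messages in priority order until the byte budget runs out."""
        sent = []
        while self._heap and len(self._heap[0][2]) <= budget_bytes:
            _, _, payload = heapq.heappop(self._heap)
            budget_bytes -= len(payload)
            sent.append(payload)
        return sent


queue = SendQueue()
queue.enqueue(1, b"terrain-chunk")   # low priority bulk data
queue.enqueue(10, b"chat")
queue.enqueue(10, b"move")
first_round = queue.drain(budget_bytes=8)
second_round = queue.drain(budget_bytes=100)
```

The insertion counter keeps messages of equal priority in submission order, which is one simple way to avoid reordering messages that depend on each other when they share a priority.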

Best Regards,

Jukka


Tommi Laukkanen

Jan 31, 2010, 1:54:37 PM
to realXtend-dev
Thanks for the details. Could you elaborate on how Kristalli handles
message dependencies? I understood from the other message that this is
an advanced way of fulfilling the channel requirement. Have you
found it practical to leave the definition of the message
dependencies to the application runtime? I would imagine this includes
creating message dependency trees/lists out of scene graph and other
state modifications.

Currently our rough plan for 0.6 is to pump
object creations, updates and inserts via a single reliable sequenced
channel, large data blocks via a single sequenced reliable background
channel, and movement (and other) signals via unreliable sequenced
object-specific channels. The transport layer will support a
flexible way of using channels though, and the previous sentence describes
how we envision using it in conjunction with our own message layer.
It would be beneficial to define some channels, like the large data
background channel, to have lower priority, and prioritization could be
a good addition to the 0.6 spec.
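The channel layout sketched in words above could be written down roughly like this. Everything here is hypothetical: the `ChannelSpec` fields, the channel names, the message kinds and the priority numbers are placeholders for illustration, not part of the 0.6 spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelSpec:
    reliable: bool
    sequenced: bool
    priority: int  # larger value is sent first (assumed convention)


# One shared channel for scene state, one low priority channel for bulk data.
STATE = ChannelSpec(reliable=True, sequenced=True, priority=10)
BULK = ChannelSpec(reliable=True, sequenced=True, priority=1)


def channel_for(kind, object_id=None):
    """Map a message kind to a (channel key, channel spec) pair."""
    if kind in ("create", "update", "insert"):
        return "state", STATE
    if kind == "large_data":
        return "bulk", BULK
    if kind == "movement":
        # One unreliable sequenced channel per object.
        spec = ChannelSpec(reliable=False, sequenced=True, priority=5)
        return ("movement", object_id), spec
    raise ValueError("unknown message kind: %r" % kind)


bulk_key, bulk_spec = channel_for("large_data")
state_key, state_spec = channel_for("update")
move_key, move_spec = channel_for("movement", object_id=7)
```

Giving the bulk channel a lower priority than the state channel is the addition suggested above; the movement priority is an arbitrary middle value.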

-tommi

Tommi Laukkanen

Jan 31, 2010, 2:00:36 PM
to realXtend-dev
Let's continue on this thread to include Sirikata and other interested
parties:

http://groups.google.com/group/kyoryoku/browse_thread/thread/8bf0f57e888480f5
