[Standards] NEW: XEP-0312 (PubSub Since)

XMPP Extensions Editor

Feb 29, 2012, 9:13:57 PM
to stan...@xmpp.org
Version 0.1 of XEP-0312 (PubSub Since) has been released.

Abstract: This specification defines a publish-subscribe feature that enables a subscriber to automatically receive pubsub and PEP notifications since the last logout time of a specific resource.

Changelog: Initial published version. (psa)

Diff: N/A

URL: http://xmpp.org/extensions/xep-0312.html

Sergey Dobrov

Mar 1, 2012, 7:36:25 AM
to stan...@xmpp.org
A much-anticipated XEP, but a very ambiguous one:
1. "The service MAY include Result Set Management [4] data so that the
subscriber can page through the set of interim notifications, as
described in Section 6.5.3 of XEP-0060."
How can this be done if the service sends separate events in <message> stanzas?

2. "The service SHOULD NOT include items that were removed from the node"
That's OK, but what about retractions and other events that were fired
during the user's absence? I mean, I might want to know which
already-discovered items were retracted.

3. There is no way to retrieve the notifications with a query. That is
needed, for example, for nodes that have no subscription to the user's
presence.

4. A minor, subjective opinion: it would be more useful to work with
some kind of "revision" rather than with time. Clients could then tell
whether they missed something because of connection trouble, or because
some other resource had a higher priority and intercepted the events.
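
A minimal sketch of the revision idea in point 4, assuming the service stamps every notification on a node with a monotonically increasing integer. XEP-0312 defines no such counter; the counter and the function name `missed_revisions` are hypothetical.

```python
def missed_revisions(last_seen: int, incoming: int) -> list:
    """Return the revision numbers skipped between two notifications.

    Assumes a hypothetical per-node counter that grows by one with
    every event; nothing like it is defined in XEP-0312.
    """
    return list(range(last_seen + 1, incoming))


# The client last saw revision 41 and now receives revision 45,
# so revisions 42 to 44 never arrived and should be requested again.
print(missed_revisions(41, 45))
```

With a counter like this a client can detect a gap directly, which a timestamp or a number of seconds cannot do.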


--
With best regards,
Sergey Dobrov,
XMPP Developer and JRuDevels.org founder.

Stephan Maka

Mar 1, 2012, 7:57:12 AM
to stan...@xmpp.org
Hi!

XMPP Extensions Editor wrote:
> Version 0.1 of XEP-0312 (PubSub Since) has been released.

Great that this is approached by somebody!

> Abstract: This specification defines a publish-subscribe feature that
> enables a subscriber to automatically receive pubsub and PEP
> notifications since the last logout time of a specific resource.

For buddycloud we have used MAM[1] so far, but we have given the problem
some thought before.

1. Specifying a relative timespan instead of absolute timestamps avoids
the need for global clock synchronization. That is smart. Until now
we used the timestamp of the latest pubsub notification (ATOM
payload).

2. I've been very reluctant to employ <presence/> for requesting
history, especially presence broadcasts.

Presence will be resent upon changing the online status & text. It
could be important to exclude the XEP-0312 information in all but the
very first presence stanza.

Moreover, presence can be redistributed by a user's server. This
happens only on presence probes, right? Is this still safe enough?
What if the pubsub service needs to be restarted, subsequently
probing for presence of its subscribed users? It will needlessly
replay history.

Admittedly in buddycloud clients only communicate with their one
"inbox" pubsub service, so the MAM <iq/> doesn't add much complexity.
I can see a reason for <presence/> when dealing with many entities
though.

3. MAM specifies replayed notifications to be sent in XEP-0297ish
envelopes, despite that there's no forwarding going on. The history
sender is the notification source. XEP-0312 doesn't specify that and
I guess there is no reason for it?

4. RSM with <presence/> and multiple <message/> stanzas in response
needs to be specified clearly. RSM is only straightforward for
<iq/>-based extensions such as MAM.
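
One way a service might address point 2 is to replay history only for the first available presence it sees from a full JID. The sketch below assumes in-memory state; the class `ReplayTracker` and its methods are hypothetical and not part of XEP-0312.

```python
class ReplayTracker:
    """Tracks which full JIDs have already triggered a history replay."""

    def __init__(self):
        self._online = set()

    def on_available(self, full_jid: str, has_since: bool) -> bool:
        """Return True only for the first presence that carries the payload."""
        first = full_jid not in self._online
        self._online.add(full_jid)
        return first and has_since

    def on_unavailable(self, full_jid: str) -> None:
        """Forget the resource so its next login can replay again."""
        self._online.discard(full_jid)


tracker = ReplayTracker()
print(tracker.on_available("user@example.org/phone", True))  # initial presence
print(tracker.on_available("user@example.org/phone", True))  # status change
```

This covers status and text changes within one session. It does not cover the restart case raised above, because the in-memory set is lost when the service restarts and a later probe response would look like a first presence.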


Stephan

[1] http://doomsong.co.uk/extensions/render/message-archive-management.html

Kevin Smith

Mar 2, 2012, 8:53:21 AM
to XMPP Standards
On Thu, Mar 1, 2012 at 2:13 AM, XMPP Extensions Editor <edi...@xmpp.org> wrote:
> Version 0.1 of XEP-0312 (PubSub Since) has been released.

I have two main problems with this, the first is that using
timestamp-equivalents isn't a reliable way of syncing things on the
network. We have this approach in XEP-0045, and it results in
duplicates quite a lot of the time (without the added buffer period in
312). I suspect it also results in missed messages, although I've
clearly never seen them. The buffer period makes missed messages less
likely (although still possible), but increases the chance of
duplicates. It's not clear to me that this isn't harmful.

The second is that there are many other times a pubsub service will
receive delay inside a presence, and this proposal could result in
duplication within a single stream.

/K

Sergey Dobrov

Mar 2, 2012, 9:05:32 AM
to stan...@xmpp.org
+1 on this problem. I tried to explain the same thing.

Joe Hildebrand

Mar 5, 2012, 4:08:56 AM
to Kevin Smith, XMPP Standards
On 3/2/12 6:53 AM, "Kevin Smith" <ke...@kismith.co.uk> wrote:

> On Thu, Mar 1, 2012 at 2:13 AM, XMPP Extensions Editor <edi...@xmpp.org>
> wrote:
>> Version 0.1 of XEP-0312 (PubSub Since) has been released.
>
> I have two main problems with this, the first is that using
> timestamp-equivalents isn't a reliable way of syncing things on the
> network. We have this approach in XEP-0045, and it results in
> duplicates quite a lot of the time (without the added buffer period in
> 312) I suspect it also results in missed messages, although I've
> clearly never seen them. The buffer period makes missed messages less
> likely (although still possible), but increases the chance of
> duplicates. It's not clear to me that this isn't harmful.

I don't see that as a large problem in pub/sub, since a later publish with
the same ID replaces the previous publish. As long as the receiving server
backs up enough to deal with reasonable clock skew, the worst case is a
small number of republishes, which is usually going to be much better than a
full synchronization, and is never going to be worse.
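
A minimal sketch of the "back up enough" idea, assuming the service stores a publish time for each item. The function names and the 30-second default buffer are illustrative assumptions, not values from XEP-0312.

```python
from datetime import datetime, timedelta, timezone


def replay_cutoff(now, seconds_offline, skew_buffer=30):
    """Earliest publish time to replay, pushed back by a safety buffer."""
    return now - timedelta(seconds=seconds_offline + skew_buffer)


def items_to_replay(items, cutoff):
    """Keep (item_id, published) pairs published at or after the cutoff."""
    return [(item_id, published) for item_id, published in items
            if published >= cutoff]


utc = timezone.utc
now = datetime(2012, 3, 5, 9, 0, 0, tzinfo=utc)
items = [
    ("a", datetime(2012, 3, 5, 7, 0, 0, tzinfo=utc)),
    ("b", datetime(2012, 3, 5, 7, 59, 45, tzinfo=utc)),
    ("c", datetime(2012, 3, 5, 8, 30, 0, tzinfo=utc)),
]
# The client reports one hour offline, so it logged out at about 08:00.
print(items_to_replay(items, replay_cutoff(now, 3600)))
```

Item "b" was published 15 seconds before the logout and is replayed anyway. That is the small number of republishes described above, traded for a lower chance of missing an item.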

> The second is that there are many other times a pubsub service will
> receive delay inside a presence, and this proposal could result in
> duplication within a single stream.

A) Let's change the namespace then.
B) It may be that all of those times are at initial login, in which case
there's no need to change the namespace.

--
Joe Hildebrand

Kevin Smith

Mar 5, 2012, 4:21:34 AM
to Joe Hildebrand, XMPP Standards
On Mon, Mar 5, 2012 at 9:08 AM, Joe Hildebrand <jhil...@cisco.com> wrote:
>> duplicates. It's not clear to me that this isn't harmful.
>
> I don't see that as a large problem in pub/sub, since a later publish with
> the same ID replaces the previous publish.  As long as the receiving server
> backs up enough to deal with reasonable clock skew, the worst case is a
> small number of republishes, which is usually going to be much better than a
> full synchronization, and is never going to be worse.

This assumes that a client will need to keep a cache of the node's
items such that it can do duplicate elimination locally. This seems
icky, but perhaps it's the cost of doing business. Are we sure there
are no edge cases in which a client will be unable to determine that
the stanzas are duplicates? Is there a way for the client to know how
much local caching it needs to do?
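
A minimal sketch of the local duplicate elimination discussed here, assuming every notification carries an item ID and a payload that can be compared as a string. The class `NodeCache` is hypothetical.

```python
_MISSING = object()


class NodeCache:
    """Client-side cache of one node's items, keyed by item ID."""

    def __init__(self):
        self._items = {}

    def apply(self, item_id: str, payload: str) -> bool:
        """Store the item; return False if it is an exact duplicate."""
        if self._items.get(item_id, _MISSING) == payload:
            return False
        self._items[item_id] = payload
        return True


cache = NodeCache()
print(cache.apply("vcard", "<vcard>v1</vcard>"))  # new item
print(cache.apply("vcard", "<vcard>v1</vcard>"))  # replayed duplicate
print(cache.apply("vcard", "<vcard>v2</vcard>"))  # genuine update
```

The cache has to hold every item that might be replayed, so the open question above about how much to cache remains. The approach also does not help for notifications that carry no payload or stable item ID.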

>> The second is that there are many other times a pubsub service will
>> receive delay inside a presence, and this proposal could result in
>> duplication within a single stream.
>
> A) Let's change the namespace then.
> B) It may be that all of those times are at initial login, in which case
> there's no need to change the namespace.

A) Yucky presence payload abuse - but again, maybe it's the cost of
doing business. It's not terrible, anyway, and I partly prefer this,
as it makes the behaviour opt-in.

B) I don't think they are - there's autoaway stuff as well, at least.

/K

Joe Hildebrand

Mar 5, 2012, 4:37:33 AM
to Kevin Smith, XMPP Standards
On 3/5/12 2:21 AM, "Kevin Smith" <ke...@kismith.co.uk> wrote:

> On Mon, Mar 5, 2012 at 9:08 AM, Joe Hildebrand <jhil...@cisco.com> wrote:
>>> duplicates. It's not clear to me that this isn't harmful.
>>
>> I don't see that as a large problem in pub/sub, since a later publish with
>> the same ID replaces the previous publish.  As long as the receiving server
>> backs up enough to deal with reasonable clock skew, the worst case is a
>> small number of republishes, which is usually going to be much better than a
>> full synchronization, and is never going to be worse.
>
> This assumes that a client will need to keep a cache of the node's
> items such that it can do duplicate elimination locally. This seems
> icky, but perhaps it's the cost of doing business.

Oh! Of course that's what I was thinking. The main use case I've got is
large PEP nodes for long-lived stuff like vCards, ala XEP-292, section 5.2.

> Are we sure there
> are no edge cases in which a client will be unable to determine that
> the stanzas are duplicates? Is there a way for the client to know how
> much local caching it needs to do?

No, I'm not positive, but I can't think of any, aside from poorly-written
clients that shouldn't have sent the time in their presence. Don't
correctly-written clients already need to match IDs for incoming
notifications to catch updates?

>>> The second is that there are many other times a pubsub service will
>>> receive delay inside a presence, and this proposal could result in
>>> duplication within a single stream.
>>
>> A) Let's change the namespace then.
>> B) It may be that all of those times are at initial login, in which case
>> there's no need to change the namespace.
>
> A) Yucky presence payload abuse - but again, maybe it's the cost of
> doing business. It's not terrible, anyway, and I partly prefer this,
> as it makes the behaviour opt-in.
>
> B) I don't think they are - there's autoaway stuff as well, at least.

Oh, yeah. I had forgotten about that one. Ew. Ok, let's use a different
namespace.

Now, should we keep the time in seconds, or move to UTC? If we were UTC,
and both sides had good time sync, you could actually be more accurate in
matching, since network latency wouldn't matter. In practice, it's probably
a wash. Number of seconds is likely less bytes and easier to program
correctly.
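
A small worked comparison of the two options, under a simplified model that is an assumption of this sketch: the client clock is off by a constant skew, and the presence takes a fixed latency to reach the service. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cutoff_relative(logout, login, latency):
    """Cutoff the service derives from a 'seconds offline' value.

    The client subtracts two readings of its own clock, so a constant
    skew cancels out; the service then counts back from arrival time.
    """
    seconds_offline = login - logout
    received_at = login + latency
    return received_at - seconds_offline


def cutoff_absolute(logout, client_skew):
    """Cutoff the service derives from a UTC stamp sent by the client."""
    return logout + client_skew


utc = timezone.utc
logout = datetime(2012, 3, 5, 8, 0, 0, tzinfo=utc)
login = logout + timedelta(hours=1)

# Relative form: the error equals the network latency.
print(cutoff_relative(logout, login, timedelta(seconds=2)) - logout)
# Absolute form: the error equals the client's clock skew.
print(cutoff_absolute(logout, timedelta(seconds=5)) - logout)
```

In this model the relative form is wrong by the latency and the absolute form by the clock skew, which matches the point that UTC is only more accurate when both sides have good time sync.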

--
Joe Hildebrand

Kevin Smith

Mar 5, 2012, 4:46:11 AM
to Joe Hildebrand, XMPP Standards
On Mon, Mar 5, 2012 at 9:37 AM, Joe Hildebrand <jhil...@cisco.com> wrote:
> On 3/5/12 2:21 AM, "Kevin Smith" <ke...@kismith.co.uk> wrote:
>> This assumes that a client will need to keep a cache of the node's
>> items such that it can do duplicate elimination locally. This seems
>> icky, but perhaps it's the cost of doing business.
>
> Oh!  Of course that's what I was thinking.  The main use case I've got is
> large PEP nodes for long-lived stuff like vCards, ala XEP-292, section 5.2.

Ah. It suddenly makes much more sense to me.

>> Are we sure there
>> are no edge cases in which a client will be unable to determine that
>> the stanzas are duplicates? Is there a way for the client to know how
>> much local caching it needs to do?
>
> No, I'm not positive, but I can't think of any, aside from poorly-written
> clients that shouldn't have sent the time in their presence.  Don't
> correctly-written clients already need to match IDs for incoming
> notifications to catch updates?

I think it depends on the type of node. Notification-only nodes, I
wouldn't have thought so (and getting 'duplicates' there seems to be a
bad thing, potentially).

> Oh, yeah.  I had forgotten about that one.  Ew.  Ok, let's use a different
> namespace.

OK.

> Now, should we keep the time in seconds, or move to UTC?  If we were UTC,
> and both sides had good time sync, you could actually be more accurate in
> matching, since network latency wouldn't matter.  In practice, it's probably
> a wash.  Number of seconds is likely less bytes and easier to program
> correctly.

I think the problems are broadly similar with both.

/K

Peter Saint-Andre

Mar 8, 2012, 4:12:15 PM
to ke...@kismith.co.uk, XMPP Standards
On 3/5/12 2:46 AM, Kevin Smith wrote:
> On Mon, Mar 5, 2012 at 9:37 AM, Joe Hildebrand <jhil...@cisco.com> wrote:
>> On 3/5/12 2:21 AM, "Kevin Smith" <ke...@kismith.co.uk> wrote:
>>> This assumes that a client will need to keep a cache of the node's
>>> items such that it can do duplicate elimination locally. This seems
>>> icky, but perhaps it's the cost of doing business.
>>
>> Oh! Of course that's what I was thinking. The main use case I've got is
>> large PEP nodes for long-lived stuff like vCards, ala XEP-292, section 5.2.
>
> Ah. It suddenly makes much more sense to me.
>
>>> Are we sure there
>>> are no edge cases in which a client will be unable to determine that
>>> the stanzas are duplicates? Is there a way for the client to know how
>>> much local caching it needs to do?
>>
>> No, I'm not positive, but I can't think of any, aside from poorly-written
>> clients that shouldn't have sent the time in their presence. Don't
>> correctly-written clients already need to match IDs for incoming
>> notifications to catch updates?
>
> I think it depends on the type of node. Notification-only nodes, I
> wouldn't have thought so (and getting 'duplicates' there seems to be a
> bad thing, potentially).
>
>> Oh, yeah. I had forgotten about that one. Ew. Ok, let's use a different
>> namespace.
>
> OK.

WFM. I thought it might be easier to re-use iq:last because we already
have that defined. However, it's unlikely that people would want to use
last login time for synchronization operations other than pubsub (e.g.,
we already have one for MUC history, which is the other major use case I
can think of), so a specialized extension for pubsub-since is fine with
me...

<presence from='jul...@capulet.com/balcony'>
<since xmlns='urn:xmpp:pubsub:since' seconds='86511'/>
</presence>
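
A minimal sketch of how a service might read such a payload with Python's standard `xml.etree.ElementTree`. The element and namespace are only the proposal from this thread, the JID in the usage example is a made-up placeholder, and the function name `seconds_offline` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace as proposed in this thread; it may change before publication.
SINCE_NS = "urn:xmpp:pubsub:since"


def seconds_offline(presence_xml: str):
    """Return the 'seconds' value as an int, or None if absent or invalid."""
    root = ET.fromstring(presence_xml)
    since = root.find(f"{{{SINCE_NS}}}since")
    if since is None:
        return None
    try:
        seconds = int(since.get("seconds", ""))
    except ValueError:
        return None
    return seconds if seconds >= 0 else None


stanza = (
    "<presence from='user@example.org/balcony'>"
    "<since xmlns='urn:xmpp:pubsub:since' seconds='86511'/>"
    "</presence>"
)
print(seconds_offline(stanza))
```

Returning None for a missing or malformed value lets the service treat the stanza as ordinary presence and skip the replay.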

>> Now, should we keep the time in seconds, or move to UTC? If we were UTC,
>> and both sides had good time sync, you could actually be more accurate in
>> matching, since network latency wouldn't matter. In practice, it's probably
>> a wash. Number of seconds is likely less bytes and easier to program
>> correctly.
>
> I think the problems are broadly similar with both.

Agreed, no strong preference here.

Peter

--
Peter Saint-Andre
https://stpeter.im/


Peter Saint-Andre

Mar 8, 2012, 4:25:53 PM
to XMPP Standards
On 3/8/12 2:12 PM, Peter Saint-Andre wrote:
> On 3/5/12 2:46 AM, Kevin Smith wrote:
>> On Mon, Mar 5, 2012 at 9:37 AM, Joe Hildebrand <jhil...@cisco.com> wrote:

<snip/>

>>> Oh, yeah. I had forgotten about that one. Ew. Ok, let's use a different
>>> namespace.
>>
>> OK.
>
> WFM. I thought it might be easier to re-use iq:last because we already
> have that defined. However, it's unlikely that people would want to use
> last login time for synchronization operations other than pubsub (e.g.,
> we already have one for MUC history, which is the other major use case I
> can think of), so a specialized extension for pubsub-since is fine with
> me...
>
> <presence from='jul...@capulet.com/balcony'>
> <since xmlns='urn:xmpp:pubsub:since' seconds='86511'>

Well, this is presence. Let's see if we can shorten it.

<since xmlns='urn:xmpp:pubsub:since' seconds=''/> = 49 chars

<ago xmlns='urn:xmpp:ago' secs=''/> = 35 chars
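
The two character counts can be checked with a few lines of Python. The strings are copied from the message above; the variable names are illustrative.

```python
long_form = "<since xmlns='urn:xmpp:pubsub:since' seconds=''/>"
short_form = "<ago xmlns='urn:xmpp:ago' secs=''/>"

print(len(long_form))                    # 49
print(len(short_form))                   # 35
print(len(long_form) - len(short_form))  # 14 characters saved per presence
```

Both counts exclude the attribute value, which has the same length in either form.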

Joe Hildebrand

Mar 11, 2012, 6:07:11 AM
to XMPP Standards
On 3/8/12 2:25 PM, "Peter Saint-Andre" <stp...@stpeter.im> wrote:

>> <presence from='jul...@capulet.com/balcony'>
>> <since xmlns='urn:xmpp:pubsub:since' seconds='86511'>
>
> Well, this is presence. Let's see if we can shorten it.
>
> <since xmlns='urn:xmpp:pubsub:since' seconds=''/> = 49 chars
>
> <ago xmlns='urn:xmpp:ago' secs=''/> = 35 chars

If you're going to go that far, how about "s" instead of "secs". I don't
love "ago". Some other ideas:

- off
- missed
- age
- rewind
- delta
- diff

None of which is a clear winner to me.

--
Joe Hildebrand

Peter Saint-Andre

Mar 14, 2012, 8:21:05 PM
to XMPP Standards
On 3/11/12 4:07 AM, Joe Hildebrand wrote:
> On 3/8/12 2:25 PM, "Peter Saint-Andre" <stp...@stpeter.im> wrote:
>
>>> <presence from='jul...@capulet.com/balcony'>
>>> <since xmlns='urn:xmpp:pubsub:since' seconds='86511'>
>>
>> Well, this is presence. Let's see if we can shorten it.
>>
>> <since xmlns='urn:xmpp:pubsub:since' seconds=''/> = 49 chars
>>
>> <ago xmlns='urn:xmpp:ago' secs=''/> = 35 chars
>
> If you're going to go that far, how about "s" instead of "secs".

Only saves 3 characters and it's awfully cryptic.

> I don't
> love "ago".

A long time ago in a galaxy far far away...

> Some other ideas:
>
> - off
> - missed
> - age
> - rewind
> - delta
> - diff
>
> None of which is a clear winner to me.

I think 'off' is fine = I went offline S seconds ago.
