* John Kalucki <j...@twitter.com> [101031 20:30]:
> Create two in-memory hash sets of seen ids. Write ids to both. If the id is
> found on write, discard. Alternatively expire them every few tens of
> minutes to bound growth, but provide continuous coverage.
That's what I'm doing now for the Streaming API and it works very well.
But in the Site Streams API, I might receive the same ID several times
in the context of different users (for_user).
E.g., status N mentions users A, B, and C. In addition it is favorited
by user D. If I'm following all 4 users with Site Streams,
I'll see N 4 times in 4 different messages. However, if any of those
messages is repeated, I need to discard the repeats.
So, I can't simply track status IDs like I do in the Streaming API. I
need to track for_user/type/status_id.
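
A minimal sketch of that idea, combining the two-set scheme quoted above
with a composite key. The class name, the helper message_key, the
30-minute default, and the type strings "mention" and "favorite" are all
made up for illustration; the real Site Streams envelope may name these
fields differently.

```python
import time


def message_key(for_user, msg_type, status_id):
    """Composite dedup key: the same status may legitimately arrive once
    per (for_user, type) pair, so all three parts identify a message."""
    return (for_user, msg_type, status_id)


class RotatingDedup:
    """Two in-memory sets of seen keys. Every key is written to both,
    and the sets are cleared alternately, so memory stays bounded while
    at least one set always covers the most recent interval."""

    def __init__(self, ttl_seconds=1800.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._sets = [set(), set()]
        self._next_to_clear = 0
        self._next_rotation = clock() + ttl_seconds

    def _rotate_if_due(self):
        now = self._clock()
        if now >= self._next_rotation:
            # Clear only one set; the other still holds recent keys.
            self._sets[self._next_to_clear].clear()
            self._next_to_clear ^= 1
            self._next_rotation = now + self._ttl

    def seen(self, key):
        """Return True if key was already recorded, then record it."""
        self._rotate_if_due()
        duplicate = any(key in s for s in self._sets)
        for s in self._sets:
            s.add(key)
        return duplicate


if __name__ == "__main__":
    dedup = RotatingDedup()
    # Status 1 delivered for users A and B is kept both times;
    # a repeat of the message for A is flagged as a duplicate.
    for user, kind in [("A", "mention"), ("B", "mention"), ("A", "mention")]:
        print(user, kind, dedup.seen(message_key(user, kind, 1)))
```

With this layout a key is remembered for somewhere between one and two
expiry intervals after it was last seen, depending on where in the
rotation it arrived.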
Or am I missing something here?