Annotation of content on sites like Facebook or Twitter?

Patrik Hoyer

May 20, 2019, 12:20:13 PM

To: dev

Dear hypothes.is developers,

(I only very recently learned of this project. If the questions I am asking here are already answered elsewhere, or if this is the wrong forum, please excuse me and simply point me to the appropriate resources or forum. Thanks!)

Let me start by saying that I believe web annotation is a technology which has enormous potential. Also, I really like the open model used by the hypothes.is project! I am, however, wondering about one big conceptual issue.

If I am not mistaken, the hypothes.is system is specifically targeted at annotating individual webpages (or PDFs) accessible at a fixed URL. In other words, the underlying idea is that a fixed URL yields a (relatively) static webpage (which is almost identical for all users), and it is then possible to leave annotations on this webpage that other users can see (and reply to).

If the above is a reasonably accurate description of the philosophy of the system, then how does the system cope with content on Facebook, Twitter, or similar social media platforms, where the role of the URL is much reduced? On such sites, users may see very different content at the same URL, and conversely may see the same content at different URLs. In other words, the URL is only loosely tied to the content that is displayed. It would seem that the hypothes.is system would not work well in such cases. Or would it?

This brings me to the following question: What if an annotation were not tied to any specific webpage, but more generally to any piece of content seen in the browser? Has such an approach been considered for hypothes.is, and if so, what are the main arguments against it? Is this perhaps a direction that you are considering?

Obviously, there are problems with linking a particular annotation to content; it is not clear how to determine whether a match is "good enough" to say that the text the user is currently viewing is indeed the same as that which was originally annotated. Some kind of "fuzzy matching" is needed, of the sort that is already used by hypothes.is, but it would need to be more general. Perhaps the problem would be alleviated by attaching annotations to links (in the document), and/or to images (the URL of an image, or a hash of an image file, might serve as a fingerprint).
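
To make the "good enough" question concrete, here is a minimal sketch of one way a more general fuzzy match could be scored. It is not how hypothes.is anchors annotations; the function name `best_fuzzy_match`, the sliding word-window approach, and the 0.8 threshold are all assumptions chosen for illustration, using only Python's standard `difflib` module.

```python
from difflib import SequenceMatcher
from typing import Optional, Tuple


def best_fuzzy_match(
    quote: str, page_text: str, threshold: float = 0.8
) -> Optional[Tuple[int, int, float]]:
    """Find the word window in page_text most similar to quote.

    Returns (start_word, end_word, score), or None if no window
    reaches the threshold. The threshold value is an assumption.
    """
    quote_words = quote.split()
    page_words = page_text.split()
    n = len(quote_words)
    if n == 0 or len(page_words) < n:
        return None

    target = " ".join(quote_words).lower()
    best = None
    # Slide a window of the same word count as the quote over the page.
    for i in range(len(page_words) - n + 1):
        window = " ".join(page_words[i:i + n]).lower()
        score = SequenceMatcher(None, target, window).ratio()
        if best is None or score > best[2]:
            best = (i, i + n, score)

    if best is not None and best[2] >= threshold:
        return best
    return None
```

The threshold is the "good enough" knob: raising it reduces false matches at the cost of missing reworded copies of the same content.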

In any case, it would be great if the commenting model of hypothes.is could apply to dynamic content on sites such as Facebook or Twitter, or even to e-mails read through Gmail (for flagging various forms of email-based fraud, for example).

Any thoughts on this?

-Patrik

Steel Wagstaff

May 20, 2019, 12:49:08 PM

To: d...@list.hypothes.is

Hi Patrik,

You might be interested in this article about 'fuzzy anchoring': https://web.hypothes.is/blog/fuzzy-anchoring/ and this (somewhat dated, but still useful) reading list: https://web.hypothes.is/robust-anchoring/. You can also see some simple advice for publishers that helps them understand the meta/microdata that they can include to make their documents more friendly for annotation even if they do not share exactly matching URLs: https://web.hypothes.is/for-publishers/

All best,
Steel

--
You received this message because you are subscribed to the Google Groups "dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dev+uns...@list.hypothes.is.
To post to this group, send email to d...@list.hypothes.is.
To view this discussion on the web visit https://groups.google.com/a/list.hypothes.is/d/msgid/dev/e4f50148-2077-4c60-a4b6-b9a5e3ee3a77%40list.hypothes.is.

Patrik Hoyer

May 20, 2019, 11:30:05 PM

To: dev

Thanks Steel for those links. I must admit I had not read those pages carefully yet; I took a closer look at them now.

I noticed that the pages on fuzzy/robust anchoring are more than 6 years old. I wonder, is this still representative of the current state? The selectors listed do seem quite strongly targeted to a "relatively static" document, as opposed to fluid content, so I would not expect these types of selectors to work so well on social media sites.

Do you (or someone else reading this) know what the status of hypothes.is is concerning sites such as Facebook and Twitter? Is annotation on such platforms considered to work well enough (so that using hypothes.is annotations on such sites is fully recommended and supported)? If not, are there any improvements planned? (Some weeks ago I had a look at the public roadmap and did not see anything related to this there, but I must admit I may simply have missed it.)

Best regards,

Patrik

dan.w...@gmail.com

May 21, 2019, 11:16:51 AM

To: dev

On Monday, May 20, 2019 at 11:30:05 PM UTC-4, Patrik Hoyer wrote:

> Thanks Steel for those links. I must admit I had not read those pages carefully yet; I took a closer look at them now.
>
> I noticed that the pages on fuzzy/robust anchoring are more than 6 years old. I wonder, is this still representative of the current state?

Yes. There are some very subtle differences, but in the areas that are probably important, yes, it's still the same.

> The selectors listed do seem quite strongly targeted to a "relatively static" document, as opposed to fluid content, so I would not expect these types of selectors to work so well on social media sites.

Well, the whole purpose of fuzzy anchoring was to handle documents that might change from one visit to the next, as opposed to static documents. But if you're thinking about a truly fluid "feed" of the sort that you'd find at "facebook.com", then no "overlay"-style third-party annotation solution is ever likely to be effective, unless that solution targets ID-level modules on the page and "knows" about Facebook more natively. Even then, if the content is no longer visible in the feed, you're out of luck. From our perspective, at that point this becomes more a part of Facebook the app than of Hypothesis.

If, however, you're targeting something on Facebook that is more like a normal webpage, like this: https://www.facebook.com/justinamash/posts/1326275734078496

then annotation will work just fine, even if the content changes somewhat.

Similarly, if you're annotating your Twitter feed, then as the feed shifts and the originally annotated tweet falls off the feed, you're never going to successfully anchor to it, no matter how dynamic your algorithm. The original content is simply not in the DOM returned on the feed page. However, if you want to annotate an individual tweet, there's no problem at all.

My key question here is: What, in detail, were you hoping would happen?

Matthew Schneider

May 21, 2019, 1:19:37 PM

To: dev, dan.w...@gmail.com

"you're never going to successfully anchor to it no matter how dynamic your algorithm."

There may be hope. See: https://hyp.is/RPqsBHvqEemJQFcf0qQnIg/web.hypothes.is/blog/fuzzy-anchoring/

Matt

Matthew Schneider

May 21, 2019, 1:23:08 PM

To: dev, dan.w...@gmail.com

Maybe I should have just posted this:

This can be (nearly always) determined using "Matt's Rule of Text" which is as follows: "Any 8 to 10 word string uniquely identifies a document". Yep, mind-boggling, I know.

Here's an example (for this page): https://www.google.com/search?q=%22Assuming%20that%20we%20can%20determine%20that%20we%20are%22

Another example for this page: https://www.google.com/search?q=%22this%20fast%20and%20straightforward%20method%20will%20find%20a%22

And for completeness, one more: https://www.google.com/search?q=%22This%20is%20an%20old%20problem,%20and%20over%20the%22

Note that the last example returns two results. The second result does not include the quoted ",". The first result is (nearly always) the source document.
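
The example links above can be generated mechanically. The following is a minimal sketch that only builds such a quoted-phrase search URL from a run of words; it does not run the search, and whether the phrase really is unique depends on what the search engine has indexed. The function name `phrase_query_url` and its parameters are hypothetical, and the URL format is simply copied from the examples in this message.

```python
from urllib.parse import quote


def phrase_query_url(text: str, start: int = 0, length: int = 8) -> str:
    """Build an exact-phrase search URL from `length` consecutive words.

    The 8-word default follows the "8 to 10 word" rule of thumb; the
    base URL mirrors the example links and is otherwise an assumption.
    """
    words = text.split()
    phrase = " ".join(words[start:start + length])
    # Wrapping the phrase in double quotes asks for an exact match.
    return "https://www.google.com/search?q=" + quote(f'"{phrase}"')


if __name__ == "__main__":
    sample = "Assuming that we can determine that we are looking at a copy"
    print(phrase_query_url(sample))
```

Picking several non-overlapping windows from the same document and checking that they agree on one source would make the lookup less sensitive to a single unlucky phrase.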

This algorithm has some very important ramifications, I feel. One use is here, in this type of application (e.g. documents returning a 404 could be passed to Google for reattachment of annotations). Another pressing one is determining the source of fake news. There are clearly other uses as well.

Please share with those who care. Thank me later. :)

ASIDE: I "discovered" this algorithm in 2003 during my work with PurpleSlurple and QuIP.

Patrik Hoyer

May 22, 2019, 1:27:35 AM

To: dev, dan.w...@gmail.com

Hi Dan, Steel, Matthew, and others,

The use-case I am thinking of is fighting misinformation online. Such misinformation may come in the form of posts on Facebook or Twitter (or similar sites) with the post containing some brief text and in addition an image and a link to a "news story". Users would typically see such posts in their scrollable "feed" (rather than via a direct link to the post).

Typically, the linked-to "news story" would have a proper URL (since there is a link to it), and a fact-checking team could use hypothes.is to indicate in comments that some of the "facts" presented in the story are actually not true. This is very nice and useful. Unfortunately, I believe it is quite common for people to see the post on Facebook and interact with it (share it, like it, react, comment) in ways that spread it further, possibly without ever going to the "news story" itself. Thus, it seems to me that users need to be able to see comments by the fact-checkers directly in the Facebook or Twitter "feed", rather than on the targeted "news story" page.

Of course, Facebook and Twitter already allow various forms of commenting on and replying to posts. Why, then, would the fact-checkers not simply use the existing mechanisms provided by the social media giants themselves? To me, it seems that the commenting/annotating mechanisms that the platforms provide are simply not adequate. One example of this is that, as far as I am aware, comments or annotations do not transfer from one post to another. Thus, two separate posts with essentially the same content (and the same outbound link) are treated as fully distinct, and there is no way to transfer any annotations from one to the other. There are other ways as well in which the existing commenting mechanisms are suboptimal (a similar comparison can be made between how annotations work on hypothes.is and traditional comments at the bottom of blog posts).

The issue of how to anchor an annotation to the content is critical, of course, but I don't think this is an impossible problem. I fully agree with Dan that if a story falls off the "feed" then it is invisible to a user and hence there is no place to anchor an annotation. But in this case there is also no need for the annotation! While perhaps it would seem that posts are quite short-lived and hence users (or fact-checkers) would have little incentive to annotate, my understanding is that various forms of misinformation and misleading memes not only circulate quite widely but also often resurface after some period of lying dormant, so annotations might again find content to attach to long after the annotations are originally written.

Concerning how to match annotations to dynamic content, I agree that the problem is somewhat difficult when considering text-only content. As Matthew pointed out on this thread though, when sufficient textual content is provided the chance of a false match actually becomes quite low. Nevertheless, an easier solution might be to anchor an annotation to an outbound link (the target URL) or to an image/video (either the URL of the file, or a digital fingerprint of the content itself).
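
As a rough illustration of anchoring to a link or an image rather than to a page URL, here is a minimal sketch. Nothing here is part of hypothes.is: `link_key`, `image_key`, and `AnnotationIndex` are hypothetical names, and dropping the query string when normalizing a link is an assumption that would be wrong for sites that put the article ID in the query.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit


def link_key(url: str) -> str:
    """Key for an outbound link: lowercase scheme and host, no query or fragment."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    normalized = urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, "", "")
    )
    return "link:" + normalized


def image_key(data: bytes) -> str:
    """Key for an image: SHA-256 of the exact file bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class AnnotationIndex:
    """In-memory map from content keys to annotation texts."""

    def __init__(self) -> None:
        self._notes: dict = {}

    def annotate(self, key: str, note: str) -> None:
        self._notes.setdefault(key, []).append(note)

    def lookup(self, keys: list) -> list:
        # Collect notes for every key found in the post being viewed.
        return [note for key in keys for note in self._notes.get(key, [])]
```

Two posts that share the same outbound link would then produce the same key and surface the same annotations, wherever they appear in a feed. Note that an exact byte hash stops matching as soon as an image is resized or re-encoded, so a real system would likely need a more tolerant fingerprint.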

Perhaps the kind of annotations I am thinking of are not in the roadmap of hypothes.is. But perhaps they could be? Anyway, I believe it is something to consider... which is why I started this thread here.

Any thoughts? Are there good reasons why annotations of the form I am arguing for here (attached to content only, regardless of URL; perhaps tied explicitly to outbound links or to image/video content) would never work?

Best regards,

Patrik
