another angle of covering the news


Waqas Ahmed

Dec 17, 2014, 5:17:01 PM
to infobi...@googlegroups.com
Hi, I just read about Infobitt on Reddit and it seems interesting.

For the past couple of weeks I have been working on a prototype of my own, tackling the same "problem" but approaching it from a different angle.

Let's start with what the "problem" is. For Wikipedia, the problem was that there was too much information that was unstructured, disorganized and spread across the web. Wikipedia organized it and structured it. The problem is the same for news: there is a lot of unstructured data, opinions and facts scattered all over the web, from small blogs to major media outlets. There are all kinds of biases, spins, motives, propaganda, advocacy, PR and sheer random bullshit from top to bottom. So how do we organize and structure it?

You guys are dealing in "bitts", which I understand to be the smallest unit of information in a news story, and we assume that a bitt is a true fact. That is a giant leap of faith, because my experience as a journalist tells me that news doesn't always work that way. For example, there is a story on RT that says "Crimean rebels fighting Ukrainian forces", and about the same story CNN says something to the effect of "Undercover Russian soldiers infiltrate Crimea". Since this is a war zone, it is hard to get verifiable facts out of either statement, and both might have some truth in them. This brings us to another issue with the news: the meta-news, the news about the news, the context of the news, the news story's relationship with its source, who reports what and why. In this Ukraine issue we can see stark differences, and since we can assume agendas it may become easy to filter some facts from it. We may say, "Oh, RT is the propaganda arm of the Russian government" and disregard it completely, but that is not what we should be doing, because even the worst propagandists have some facts and even they have a side of the story to tell. So how do we encode it all into bitts? How do we encode meta-news into bitts? How do we structure that data?

And it gets trickier. Sometimes there are no clear facts; sometimes news is just the opinions of two different people. Sometimes the biggest news is just a difference between two people's opinions. What are the conditions under which we call an opinion a bitt? "Saddam Hussein is a dangerous man who possesses weapons of mass destruction." or "Saddam Hussein is a threat to world peace": is this a bitt? Is it a verifiable fact? This escalation of opinions turned into a war that still continues. In the run-up to the war, a lot of people were writing op-eds about Iraq and Saddam Hussein while there were no actual facts or events happening. Do we count the sudden influx of op-eds as "news"? It certainly proved to be. So, at what point do we give importance to this opinion, and how does the community decide that?

What I am doing with my prototype:

In my prototype I am not dealing in bitts, or assuming that facts are to be found in news, or even considering that news is a collection of facts. What I assume is that the news is a collection of information, and the source it comes from is also a part of that information. So I am taking in all the information and just structuring it so we are able to analyse it. I am doing it by crawling an expanding list of news sites and putting all that information in a database. I am not being opinionated about it, nor can I be biased about what information I take in. Instead of establishing facts, the news can be organized and curated into a series of related topics. The community can, for example, rate news as positive or negative (a crowd-powered sentiment analysis that works) or as biased or unbiased, and add their own sources, ratings and opinions to make a topic complete and add as much info as possible in a pre-defined structure. As this metadata accumulates, we start getting an actual sense of what is happening. When we do it at a very large scale, news starts making sense, and since our data is structured we can run all types of queries on it. We are then also able to see which media outlet is mostly biased about which topic, and to see patterns in the opinions of different countries and different leaders on different topics. We can get all kinds of beautiful patterns and meaning from news. The bottom line is that the hidden power of all the data can be unleashed if we just structure it enough, and we can use the community to do it. When done this way, the past can even predict the future. We can graph news trends by topic, and we can provide really sophisticated visual representations of the data contained within news everywhere.
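To make the "structured data we can query" idea concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is only an illustration, not the prototype's actual schema: the table names, column names, outlet names and sample rows are all made up for the example. It stores each article together with its source, lets community members attach bias and sentiment ratings, and then asks which outlet is rated biased most often on a given topic.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE article (
    id       INTEGER PRIMARY KEY,
    outlet   TEXT NOT NULL,   -- the source is part of the information
    topic    TEXT NOT NULL,
    headline TEXT NOT NULL
);
CREATE TABLE rating (
    article_id INTEGER NOT NULL REFERENCES article(id),
    rater      TEXT    NOT NULL,
    biased     INTEGER NOT NULL,  -- 1 = rated biased, 0 = rated unbiased
    sentiment  INTEGER NOT NULL   -- -1 negative, 0 neutral, 1 positive
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO article VALUES (?, ?, ?, ?)",
        [
            (1, "Outlet A", "energy", "Sample headline one"),
            (2, "Outlet B", "energy", "Sample headline two"),
            (3, "Outlet A", "sports", "Sample headline three"),
        ],
    )
    conn.executemany(
        "INSERT INTO rating VALUES (?, ?, ?, ?)",
        [
            (1, "u1", 1, -1),
            (1, "u2", 1, -1),
            (1, "u3", 0, 0),
            (2, "u1", 0, 1),
            (2, "u2", 0, 0),
        ],
    )
    return conn


def bias_by_outlet(conn, topic):
    """Return (outlet, share of 'biased' ratings, rating count) for one topic."""
    return conn.execute(
        """
        SELECT a.outlet, AVG(r.biased) AS share_biased, COUNT(*) AS n
        FROM article AS a
        JOIN rating AS r ON r.article_id = a.id
        WHERE a.topic = ?
        GROUP BY a.outlet
        ORDER BY share_biased DESC, a.outlet
        """,
        (topic,),
    ).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for outlet, share, n in bias_by_outlet(demo, "energy"):
        print(f"{outlet}: {share:.0%} of {n} ratings say 'biased'")
```

The same two tables would also support the other queries mentioned above, such as average sentiment per topic over time, simply by changing the grouping columns.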

That is what I have learned about this topic so far. I wanted to share my ideas with you because we are essentially tackling the same problem - news is everywhere, there is a lot of it, and it just doesn't make sense unless you work really hard on it. And people don't have time; they just want to know quickly what is happening. This is exactly the gap that propagandists throughout history have exploited, and it is a gap that technology can eliminate.

I am at a very early stage so my learning may not be very coherent so far, forgive me for the rant. Good luck to you.

 

Tim Chambers

Dec 18, 2014, 3:09:10 PM
to Waqas Ahmed, infobi...@googlegroups.com
Welcome, Waqas. I've been with InfoBitt since the very beginning.

Do you have a link to share? For instance, when I am researching news for bitts, I use http://newsmap.jp/

I'm always on the lookout for additional sources.

I also hope you have some time to engage with the community that is active on our website. We might be able to learn from each other.



Darren Duncan

Dec 18, 2014, 10:22:15 PM
to Tim Chambers, Waqas Ahmed, infobi...@googlegroups.com
This thread reminds me of a project I started working on about 15 years ago and
some relevant things that came out of it.

To be brief ...

I think it is more accurate, with respect to anything one is recording or
presenting, to refer to it as an "assertion" rather than a "fact". An assertion
is a generalization of a fact or a belief.

Fundamentally, any individual detail we record or say qualifies as one of these:

1. The writer asserts they have first-hand experience, that is they are a
primary source.

2. The writer asserts hearsay, that is they assume or believe something is true
but without claiming first-hand experience or a specific other source.

3. The writer asserts that there is another specific record that is their
source, and they name such, for example a web url or other citation. In the
process they are also asserting that both the other record exists and says what
they are claiming it says. They can also say that the other source claims any
of the 3 categories of assertions I've mentioned here, so we essentially have
chains, he-said, she-said, and so on.

With specific respect to Infobitt, each bitt/fact can't logically
assert anything but that "the actor claiming to be this person has made this
other newsy claim and asserts that X third party is the source".

So when you express things in terms of assertions rather than facts, you can
easily take into account conflicting claims or claims with different levels of
verifiability. Moreover, all of this structure is recursive: everything you say
about a source for an assertion is itself an assertion that can have its own
sources, and so on. By recording these chains of sources, you can most
accurately record what you REALLY know. The real power here is that you empower
others to check sources for themselves, and to use chaining to help decide how
much they might trust one thing or another.
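As a rough illustration of the three categories and the recursive chaining
described above, here is a minimal Python sketch. The class, field and function
names (Kind, Assertion, chain) and the sample claims are made up for this
example; it is one possible encoding under those assumptions, not a description
of how Infobitt or any existing system stores its data.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    FIRST_HAND = "first-hand"  # writer claims to be a primary source
    HEARSAY = "hearsay"        # writer believes it but names no source
    CITED = "cited"            # writer names another record as the source


@dataclass(frozen=True)
class Assertion:
    """One recorded claim; hypothetical structure for illustration."""
    writer: str
    claim: str
    kind: Kind
    source: Optional[Assertion] = None  # the cited record, itself an assertion

    def __post_init__(self):
        # A cited assertion must name its source; the other kinds must not.
        if (self.kind is Kind.CITED) != (self.source is not None):
            raise ValueError("source must be given if and only if kind is CITED")


def chain(assertion):
    """Follow an assertion back through its sources, nearest first."""
    links = []
    node = assertion
    while node is not None:
        links.append(node)
        node = node.source
    return links


if __name__ == "__main__":
    # Made-up sample data: a he-said, she-said chain three links long.
    witness = Assertion("Resident", "I saw the bridge close at noon", Kind.FIRST_HAND)
    report = Assertion("News site", "A resident says the bridge closed at noon",
                       Kind.CITED, witness)
    bitt = Assertion("Contributor", "The bridge reportedly closed at noon",
                     Kind.CITED, report)
    for link in chain(bitt):
        print(f"{link.writer} ({link.kind.value}): {link.claim}")
```

A reader can walk the chain to see where a claim bottoms out: in a first-hand
account, in hearsay, or in a citation whose own source was never recorded.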

What I'm talking about is widely useful, whether in news or encyclopedias or
genealogies or science or law etc.

I believe that through properly applying structures like these, giving
information in a way that users can trust due to personal verifiability etc, any
problem can be solved, any problem at all. I see my life's work as being
related to trying to enable this.

Something I like about Infobitt is that it's a public project which actually
applies a fraction of what I'm talking about here, in contrast to say Wikipedia,
because Infobitt actually tries to work on the individual detail level, citing a
source for each, rather than blanket sources for an article, so it is
structurally in a better place to be accurate.

-- Darren Duncan

tbc

Dec 19, 2014, 11:51:38 AM
to infobi...@googlegroups.com, t...@timchambersusa.com, wa...@opensource.com.pk
I'm enjoying this thread [1] very much. Deep, Darren, deep. :) You're getting into epistemology. I'll defer to Dr. Sanger. I'm merely a philosophical autodidact, a huge fan of Michael Polanyi, who has much to say about assertions. His work, and yours, could be applicable to a philosophy of journalism. (Google that phrase to see how far the world has to go to make the concept mature and meaningful.)

Bringing myself out of the clouds and back to InfoBitt's mission [2], which refers to "news" and "facts": here there is little ambiguity. We summarize (or, less ideally, copy) articles from websites that claim authority as sources of news. We expect those news sources to report facts, and we try to tease those facts out and order them. We have an opinion category [3]. Larry uses the term "hard news." I think the site does a good job making the difference transparent.

Not all news sources have the same level of quality. I like the simple definition of quality: fit for use. In the FAQ, Larry writes, "Our democracies will together create a tool that could make us better informed than any in history." I personally work on InfoBitt constantly measuring the quality of the news against that standard. Is InfoBitt helping citizens in democracies to be better informed?


Waqas Ahmed

Dec 22, 2014, 5:50:34 PM
to infobi...@googlegroups.com, wa...@opensource.com.pk, t...@timchambersusa.com
Here's a link to the prototype I am working on:

Since I am scraping news from local media, I can do this:

or check what the local media's priorities are:

Structure news a bit better so it allows us to run more queries in the future: http://mediaobserver.pk/khan-demands-better-resources-for-kp-to-fight-terrorism/

To do: Lots. Give news more structure, allow users to organize content in different ways (topics, personalities etc.), build relationships among news stories and crowdsource it, etc.

I want to give it a more comprehensive feel, like Wikipedia. Right now I feel that Infobitt seems more like Twitter for news than Wikipedia, don't you think?