[Freebase-discuss] Big announcement!

Kirrily Robert

Jul 16, 2010, 2:18:33 PM
to Freebase.com discussion list
Hey everyone,

As you might already have heard, we have a big announcement: Metaweb is joining Google!

You can read about it on our blog and on Google’s blog:
http://blog.freebase.com/2010/07/16/metaweb-joins-google/
http://googleblog.blogspot.com/2010/07/deeper-understanding-with-metaweb.html

As you’ll see from both blog posts, Google is committed to keeping Freebase free and open. In fact, one thing that’s happening is that effective immediately, we’re making our data dumps weekly instead of quarterly. As for everything else, right now things will continue pretty much as usual, and going forward we’re looking forward to making Freebase bigger and better with Google's help.

As you can imagine, this is a ridiculously busy time for us as we settle in at Google, so please be understanding with us for a little while. In the meantime, if you have any Freebase-related questions, please feel free to drop me an email (direct at kir...@metaweb.com if you like) or find me on IRC (irc.freenode.net #freebase).

K.

--
Kirrily Robert
Freebase Community Director
kir...@metaweb.com


Shawn Simister

Jul 16, 2010, 3:32:49 PM
to Freebase.com discussion list
Congratulations to everyone at Metaweb! This is fantastic news. It's
great to see that all your hard work has paid off and that Freebase will
continue to grow and improve under Google. This is certainly an exciting
time for all of us to be part of the open data movement.

Shawn

Kirrily Robert wrote:
> Hey everyone,
>
> As you might already have heard, we have a big announcement: Metaweb is joining Google!
>

> You can read about it on our blog and on Google’s blog:
> http://blog.freebase.com/2010/07/16/metaweb-joins-google/
> http://googleblog.blogspot.com/2010/07/deeper-understanding-with-metaweb.html
>
> As you’ll see from both blog posts, Google is committed to keeping Freebase free and open. In fact, one thing that’s happening is that effective immediately, we’re making our data dumps weekly instead of quarterly. As for everything else, right now things will continue pretty much as usual, and going forward we’re looking forward to making Freebase bigger and better with Google's help.

Temo Chalasani

Jul 16, 2010, 3:38:58 PM
to Freebase.com discussion list
Congratulations to everyone at Metaweb! I'm a big fan of your services
and I wish you all the success after your transition to Google.
Best, Temo.

On 16-Jul-10, at 3:32 PM, Shawn Simister wrote:

> Congratulations to everyone at Metaweb! This is fantastic news. It's
> great to see that all your hard work has paid off and that Freebase
> will continue to grow and improve under Google. This is certainly an
> exciting time for all of us to be part of the open data movement.
>
> Shawn
>
> Kirrily Robert wrote:
>> Hey everyone,
>>
>> As you might already have heard, we have a big announcement:
>> Metaweb is joining Google!
>>
>> You can read about it on our blog and on Google’s blog:
>> http://blog.freebase.com/2010/07/16/metaweb-joins-google/
>> http://googleblog.blogspot.com/2010/07/deeper-understanding-with-metaweb.html
>>
>> As you’ll see from both blog posts, Google is committed to keeping
>> Freebase free and open. In fact, one thing that’s happening is
>> that effective immediately, we’re making our data dumps weekly
>> instead of quarterly. As for everything else, right now things
>> will continue pretty much as usual, and going forward we’re looking
>> forward to making Freebase bigger and better with Google's help.
>>
>> As you can imagine, this is a ridiculously busy time for us as we
>> settle in at Google, so please be understanding with us for a
>> little while. In the meantime, if you have any Freebase-related
>> questions, please feel free to drop me an email (direct at kir...@metaweb.com
>> if you like) or find me on IRC (irc.freenode.net #freebase).
>>
>> K.
>>
>>
>
> _______________________________________________
> You are receiving this message because you are subscribed to the
> Freebase-discuss mailing list.
> To post a message to the list: Freebase...@freebase.com
> To unsubscribe, view archives, etc: http://lists.freebase.com/mailman/listinfo/freebase-discuss


glenn mcdonald

Jul 16, 2010, 3:47:09 PM
to Freebase.com discussion list
Excited about this for both Metaweb and Google.

Will transition to *really* excited about this as soon as the DOJ says it's OK for us to be colleagues!

glenn (Needle/ITA)

Shawn Simister

Jul 16, 2010, 3:59:57 PM
to Freebase.com discussion list
Hmm...so what are the chances that some of the ITA data would make its way into Freebase? I know there are a handful of people who would like to see better coverage of airports.

Shawn

glenn mcdonald wrote:
> Excited about this for both Metaweb and Google.
>
> Will transition to *really* excited about this as soon as the DOJ says it's OK for us to be colleagues!
>
> glenn (Needle/ITA)
>
> On 16 Jul 10, at 2:18pm, Kirrily Robert wrote:
>
>> Hey everyone,
>>
>> As you might already have heard, we have a big announcement: Metaweb is joining Google!
>>
>> You can read about it on our blog and on Google’s blog:
>> http://blog.freebase.com/2010/07/16/metaweb-joins-google/
>> http://googleblog.blogspot.com/2010/07/deeper-understanding-with-metaweb.html
>>
>> As you’ll see from both blog posts, Google is committed to keeping Freebase free and open. In fact, one thing that’s happening is that effective immediately, we’re making our data dumps weekly instead of quarterly. As for everything else, right now things will continue pretty much as usual, and going forward we’re looking forward to making Freebase bigger and better with Google's help.
>>
>> As you can imagine, this is a ridiculously busy time for us as we settle in at Google, so please be understanding with us for a little while. In the meantime, if you have any Freebase-related questions, please feel free to drop me an email (direct at kir...@metaweb.com if you like) or find me on IRC (irc.freenode.net #freebase).
>>
>> K.
>>
>> --
>> Kirrily Robert
>> Freebase Community Director
>> kir...@metaweb.com

Antonio Turdo

Jul 16, 2010, 6:24:42 PM
to freebase...@freebase.com
Show romanization
About ITA, I hope it will be possible to integrate Needlebase with Freebase tools (Gridworks).
Another thing I hope is that this agreement will be a push to the internationalization of Freebase.

Congratulations to Metaweb.

Antonio Turdo.

Antonio Turdo

Jul 16, 2010, 6:30:56 PM
to freebase...@freebase.com
Obviously, "Show romanization" is a stray piece of text; I don't know where it comes from (formatting?).

Antonio Turdo

hangy

Jul 16, 2010, 6:36:09 PM
to Freebase.com discussion list
Antonio Turdo wrote:
> Another thing I hope is that this agreement will be a push to the internationalization of Freebase.

That is just what I thought! :) The global presence of Google is something that could push the I18N and L10N of Freebase at some future date.

-hangy


gabriele renzi

Jul 17, 2010, 4:17:19 AM
to Freebase.com discussion list
On Fri, Jul 16, 2010 at 8:18 PM, Kirrily Robert <kir...@metaweb.com> wrote:
> Hey everyone,
>
> As you might already have heard, we have a big announcement: Metaweb is joining Google!

Congrats to everyone,
I hope you'll be able to keep up your great attitude inside the new
corporate environment and get the most out of it. The amount of stuff that
could be poured into FB from base/froogle/maps/knol/etc is mindbending
:)

<snip>

I noticed the thing about the data dumps becoming weekly, that seems
quite a difference :)
But (I seem to recall this was discussed in the past but can't find
it) if they are this frequent it seems even more useful to provide
downloads of deltas next to the big packages. Any hope? :)

Thad Guidry

Jul 17, 2010, 9:44:29 AM
to Freebase.com discussion list
Gabriele,

Curious. What kind of deltas would be appealing to you? Can you give examples?

-Thad

Philip Kendall

Jul 17, 2010, 9:49:50 AM
to freebase...@freebase.com
On Sat, Jul 17, 2010 at 08:44:29AM -0500, Thad Guidry wrote:
>
> Curious. What kind of deltas would be appealing to you ? Can you give
> examples ?

The diff between two consecutive dumps, to avoid downloading a great big
file, of which 99% is exactly the same data as you downloaded a week
ago.

If we're going to weekly dumps, it may just about be feasible to
construct the deltas via MQL.

Cheers,

Phil

--
Philip Kendall <phi...@shadowmagic.org.uk>
http://www.shadowmagic.org.uk/

Thad Guidry

Jul 17, 2010, 9:55:44 AM
to Freebase.com discussion list
Philip,

How large is the current 12+ million graph as a download now? My thinking and question about the kind of deltas was aimed at partitioning domains and then their deltas. Do we have that capability now? Perhaps in the future?

Learning,
-Thad

Shawn Simister

Jul 17, 2010, 9:57:14 AM
to Freebase.com discussion list
Philip Kendall wrote:
> On Sat, Jul 17, 2010 at 08:44:29AM -0500, Thad Guidry wrote:
>> Curious.  What kind of deltas would be appealing to you ?  Can you give
>> examples ?
>
> The diff between two consecutive dumps, to avoid downloading a great big
> file, of which 99% is exactly the same data as you downloaded a week
> ago.
>
> If we're going to weekly dumps, it may just about be feasible to
> construct the deltas via MQL.
>
> Cheers,
>
> Phil

My personal preference would be PubSubHubbub. That way you just get the deltas you're interested in as they happen.

Shawn
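
For readers unfamiliar with the protocol, a subscription request to a PubSubHubbub hub might be built roughly like this. It is only a sketch under loud assumptions: nothing in this thread says Freebase runs a hub, all three URLs are made-up placeholders, and the helper name is hypothetical. The form fields hub.mode, hub.topic and hub.callback come from the PubSubHubbub spec; some versions of the spec expect further fields.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_subscription_request(hub_url, topic_url, callback_url):
    """Build (but do not send) a form-encoded subscribe request for a hub."""
    form = {
        "hub.mode": "subscribe",       # ask the hub to start a subscription
        "hub.topic": topic_url,        # the feed whose changes we want
        "hub.callback": callback_url,  # where the hub should POST updates
    }
    body = urlencode(form).encode("ascii")
    return Request(
        hub_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical URLs, for illustration only.
req = build_subscription_request(
    "https://hub.example.org/",
    "https://example.org/freebase/changes/film",
    "https://subscriber.example.org/push-callback",
)
```

Sending the request (for example with urllib.request.urlopen) is left out on purpose, since the hub above does not exist.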

gabriele renzi

Jul 17, 2010, 9:57:14 AM
to Freebase.com discussion list
On Sat, Jul 17, 2010 at 3:49 PM, Philip Kendall
<philip-...@shadowmagic.org.uk> wrote:
> On Sat, Jul 17, 2010 at 08:44:29AM -0500, Thad Guidry wrote:
>>
>> Curious.  What kind of deltas would be appealing to you ?  Can you give
>> examples ?
>
> The diff between two consecutive dumps, to avoid downloading a great big
> file, of which 99% is exactly the same data as you downloaded a week
> ago.

Exactly. Since the data dumps are in TSV format, they are suitable for
line-oriented diffs, which can be downloaded individually and
incrementally (and trivially scripted on our side if someone wants to
fetch/apply them less frequently).

For the quarterly dumps this may make little sense, because a lot of the
data would be different, but over one week I guess most content would
be stable.
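
A line-oriented delta of the kind discussed here could be sketched roughly as follows. This is an illustration only, not how the Freebase dumps are actually produced or distributed: the function names are hypothetical, the sample rows are made up, and the sketch assumes a dump can be treated as an unordered set of unique TSV lines.

```python
def compute_delta(old_rows, new_rows):
    """Return (removed, added) rows between two dumps.

    Assumption: each dump is an unordered collection of unique TSV lines,
    so a changed row shows up as one removal plus one addition.
    """
    old_set, new_set = set(old_rows), set(new_rows)
    removed = sorted(old_set - new_set)
    added = sorted(new_set - old_set)
    return removed, added


def apply_delta(old_rows, removed, added):
    """Rebuild the newer dump from the older one plus a delta."""
    removed_set = set(removed)
    kept = [row for row in old_rows if row not in removed_set]
    return kept + list(added)


# Made-up rows for illustration; real dump columns will differ.
last_week = ["topic_1\tname\tAlpha", "topic_2\tname\tBeta"]
this_week = [
    "topic_1\tname\tAlpha",
    "topic_2\tname\tBeta Prime",
    "topic_3\tname\tGamma",
]

removed, added = compute_delta(last_week, this_week)
rebuilt = apply_delta(last_week, removed, added)
```

When most rows are unchanged from week to week, the removed and added lists stay small compared with the full dump, which is the saving being asked for.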

Jason Douglas

Jul 17, 2010, 2:09:28 PM
to Freebase.com discussion list

That's an intriguing idea. We internally use a notification stream for replicating changes across graphds and into the search index (for Suggest). I don't know technically how big a leap it would be to go from that to syndicating that stream externally, though. Worth looking into.

-jason

gabriele renzi

Jul 20, 2010, 8:28:58 AM
to Freebase.com discussion list
Sorry for the late reply,

On Sat, Jul 17, 2010 at 8:09 PM, Jason Douglas <ja...@metaweb.com> wrote:

>> My personal preference would be PubSubHubbub. That way you just get the deltas you're interested in as they happen.
>>
>> Shawn
>
> That's an intriguing idea.  We internally use a notification stream for replicating changes across graphds and into the search index (for Suggest).  I don't know technically how big a leap it would be to go from that to syndicating that stream externally, though.  Worth looking into.

I believe these are different issues:
1. being able to stay always up to date for an MQL-selected subset (I
second the PuSH suggestion), which would be ubercool
2. being able to obtain the full data in an efficient and
incremental fashion, which is simpler but solves my actual problem :)

To be clear: in one of our uses of the FB data we need to process the
whole dump to select a large subset of the topics, integrate it with
our data and compute some metrics, which is a batch process.
Even if updates were available every minute we would still run it as a
scheduled process, so option 1 would not be especially useful, because
I'd need to implement "stashing" on my side. Option 2, which would save
a large part of the download/decompress time for each new release,
would be.

But thanks in any case :)
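
The scheduled batch workflow described above, where a local copy is brought up to date from whatever deltas have accumulated since the last run, might look something like this. It is a rough sketch under assumptions: the delta format (a leading "+" for added rows, "-" for removed rows) is invented for illustration, no such files are published, and both function names are hypothetical.

```python
def parse_delta(delta_lines):
    """Split one delta into (removed, added) row sets.

    Assumed, made-up format: '-' + row for removals, '+' + row for additions.
    """
    removed, added = set(), set()
    for line in delta_lines:
        if line.startswith("-"):
            removed.add(line[1:])
        elif line.startswith("+"):
            added.add(line[1:])
    return removed, added


def catch_up(local_rows, deltas):
    """Apply several deltas, oldest first, to a local copy of the dump."""
    rows = set(local_rows)
    for delta_lines in deltas:
        removed, added = parse_delta(delta_lines)
        rows -= removed
        rows |= added
    return rows


# Two weekly deltas applied in one scheduled run; the rows are made up.
local_copy = ["a\t1", "b\t2"]
pending = [
    ["-b\t2", "+b\t3"],  # week 1: row b changed
    ["+c\t4", "-a\t1"],  # week 2: row c added, row a removed
]
current = catch_up(local_copy, pending)
```

The batch job can then run over the caught-up rows without downloading and decompressing a full dump each time.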
