Dear All,
I am reviving this thread because I would like to ask for some clarification -- if possible. Based on what has already been discussed in this thread, the weight provides a confidence measure on the validity of the assertion where it is found. Fine.
In ConceptNet 4, the equivalent measure was the `score' of an assertion, which was an integer. There it made sense to filter out assertions with a score of 0 or less, and in fact this was common practice in order to get rid of spurious links; see, for example, the paper "AnalogySpace: Reducing the Dimensionality of Common Sense Knowledge" (AAAI 2008).
So, I have what I believe are two natural questions:
(A) Is there a similar universal weight threshold that can be used for ConceptNet 5.6 as a score of 0 was used in ConceptNet 4?
(I assume the answer is `no'.)
(B) What about weight thresholds per dataset?
For example, based on the above descriptions and a quick glance at the code, I can see that the default weight values are:
-- conceptnet 4 : 1.0
-- dbpedia: 0.5 or 1.0
-- emoji: 1.0 (even though no weight is set in the code, all the assertions currently have this value)
-- opencyc: 1.0
-- verbosity: not clear, because it depends on some `score'; there are 4875 distinct weight values, ranging from 0.1 to 15.414
-- wiktionary: 0.25 or 1.0
-- wordnet 3.1: 1.0 or 2.0
Excluding verbosity, which I do not know how to treat at the moment: for all the other datasets mentioned above, I see that when we restrict to the English portions in the case of multiple languages (e.g., `/d/conceptnet/4/en'), the respective weights of the assertions are never smaller than the default values mentioned above (taking the minimum in case of multiple default values).
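To make this concrete, here is roughly how I computed the per-dataset minima -- just a sketch, assuming the tab-separated assertions dump where the fifth column is a JSON blob with `dataset' and `weight' keys (the helper name is mine):

```python
import json
from collections import defaultdict

def min_weight_per_dataset(lines):
    """Record the smallest weight observed for each dataset,
    scanning lines of the tab-separated assertions dump whose
    fifth field is the JSON metadata of the assertion."""
    minima = defaultdict(lambda: float("inf"))
    for line in lines:
        info = json.loads(line.rstrip("\n").split("\t")[4])
        dataset, weight = info["dataset"], info["weight"]
        if weight < minima[dataset]:
            minima[dataset] = weight
    return dict(minima)
```

Running this over the English portions is what gives the observation above: no dataset's minimum falls below its smallest default value.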
This is in sharp contrast to how `score' used to behave in ConceptNet 4. In fact, ConceptNet 4 is part of ConceptNet 5.6, so unless all the assertions that had a score of 0 or less in ConceptNet 4 were dropped when importing it into ConceptNet 5.6 (which would indeed be a good thing), some care may be needed for assertions whose weights are near the default values.
So, the question remains:
Do people still drop some assertions from the datasets, even if this means they may have to use different thresholds for different datasets?
As a last remark, there are assertions in the database with weights of, say, 0.1, or 0.101, or 0.102, or .... Based on all the above default values, it looks like such assertions should probably be dropped when one wants to apply an algorithm that operates on `meaningful' assertions -- regardless of the dataset these assertions come from. But if that is the case, then we should probably drop assertions whose weights are lower than the default values of their individual datasets, correct?
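In code, the per-dataset filtering I have in mind would look something like the following sketch. The threshold table just restates the minimum default values listed above; the function names are mine, and I again assume the tab-separated dump format with the JSON metadata (including `dataset' and `weight') in the fifth column:

```python
import json

# Minimum default weight per dataset, taken from the values listed
# above (using the minimum where a dataset has several defaults).
THRESHOLDS = {
    "/d/conceptnet/4": 1.0,
    "/d/dbpedia": 0.5,
    "/d/emoji": 1.0,
    "/d/opencyc": 1.0,
    "/d/wiktionary": 0.25,
    "/d/wordnet": 1.0,
}

def keep_assertion(dataset, weight):
    """Keep an assertion only if its weight reaches its dataset's
    threshold; datasets without a threshold (e.g. verbosity) are
    kept unconditionally."""
    for prefix, minimum in THRESHOLDS.items():
        if dataset.startswith(prefix):
            return weight >= minimum
    return True

def filter_dump(lines):
    """Yield only the dump lines whose assertions pass keep_assertion."""
    for line in lines:
        info = json.loads(line.rstrip("\n").split("\t")[4])
        if keep_assertion(info["dataset"], info["weight"]):
            yield line
```

Whether these thresholds are actually the right ones per dataset is exactly what I am asking.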
Comments? Thoughts?
Best regards,
Dimitris