Contribute to Chinese ConceptNet


陳櫻仁

Jan 23, 2021, 4:07:04 AM
to conceptnet-users
Hello everyone,
Is there any way I can contribute to Chinese ConceptNet?

I used Chinese ConceptNet in my research (NLG task).
I found so many errors in ConceptNet that I couldn't use it directly.
Therefore, I refined it using several methods (I checked almost all of the assertions manually).
In the attachment ConceptNet_data_cleaning.pdf, I list the errors in ConceptNet and how I dealt with them.
In the attachment ConceptNet_data_cleaning_relation.pdf, I list the modifications made to each relation.
Because the original Chinese ConceptNet was not large enough for my research,
I expanded its size using antonyms and synonyms, from 352,411 to 1,132,030.
The results of the modifications and expansion are shown in the attachment ConceptNet_data_cleaning_result.pdf.
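
For readers wondering what a synonym-based expansion might look like, here is a minimal sketch. It is not the method described in the attached PDFs: the function name `expand_with_synonyms`, the `(start, relation, end)` tuple layout, and the toy data are all assumptions made for illustration. Antonym handling is not shown because the post does not say how antonyms were used.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical in-memory form of an assertion: (start, relation, end).
Assertion = Tuple[str, str, str]


def expand_with_synonyms(
    assertions: Iterable[Assertion],
    synonyms: Dict[str, List[str]],
) -> Set[Assertion]:
    """Return the original assertions plus copies with synonyms substituted."""
    expanded: Set[Assertion] = set()
    for start, relation, end in assertions:
        # Each term may stay as it is or be replaced by any listed synonym.
        start_options = [start] + synonyms.get(start, [])
        end_options = [end] + synonyms.get(end, [])
        for s in start_options:
            for e in end_options:
                expanded.add((s, relation, e))
    return expanded


if __name__ == "__main__":
    # Made-up toy data, not taken from ConceptNet or the attachments.
    seed = [("狗", "CapableOf", "叫")]
    toy_synonyms = {"狗": ["犬"]}
    for assertion in sorted(expand_with_synonyms(seed, toy_synonyms)):
        print(assertion)
```

Substituting synonyms blindly can introduce new errors, so expanded assertions would presumably still need the kind of manual checking described above.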

I used the modified and expanded version of Chinese ConceptNet in my NLG task and recruited 50 participants to rate the coherence of the generated text from 1 to 6
(1: least coherent, 6: most coherent).
The original version scored 1.55 and ours scored 3.23 (all other conditions being the same).

I would like to contribute my refined and expanded Chinese ConceptNet.
The quality is much higher than the original, and it is not tied to any specific domain.
Is there any way I can contribute?
Thanks!
ConceptNet_data_cleaning_result.pdf
ConceptNet_data_cleaning.pdf
ConceptNet_data_cleaning_relation.pdf

Yen-Ling Kuo

Jan 25, 2021, 1:35:21 PM
to conceptn...@googlegroups.com
Hi 櫻仁,

The stats of the cleaned-up data look good to me! And thank you for doing that!
Would you like to set up a meeting to go through the data and the changes you made? I can help you merge the data back into ConceptNet 5.
Feel free to drop me an email to set up a time to discuss this.

@Robyn, I'll check the data and maybe create a pull request for the proposed changes. Is there anything I should keep in mind when making the changes?

Thanks,
Yen-Ling



Elia Robyn Speer

Feb 1, 2021, 4:21:47 PM
to conceptn...@googlegroups.com
Hi, I'm catching up on this now. Some updated and improved Chinese data would be great to have in ConceptNet, especially if vetted by Yen-Ling.

The way the process would typically go is:

- Send me the data, in a format that can be included in the next "conceptnet-raw-data" package
- If the data is in a format that we already use for ConceptNet data, it can be imported as-is with just changes to the inputs in the Snakefile
- If the data is in a different format, we need to add or extend a module in conceptnet5.readers
- We run semantic benchmarks and make sure they don't get statistically worse
- We release the next version of ConceptNet that includes the new edges
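
As a rough illustration of the "don't get statistically worse" step, the sketch below compares benchmark scores before and after adding new data. It is an assumption-laden example, not ConceptNet's actual evaluation code: the `BenchmarkResult` class, the benchmark names, and the numbers are all made up, and "worse" is taken to mean that the new confidence interval lies entirely below the old one, which is a crude and conservative criterion.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class BenchmarkResult:
    """Hypothetical benchmark score with a confidence interval."""
    score: float  # point estimate
    low: float    # lower bound of the confidence interval
    high: float   # upper bound of the confidence interval


def significantly_worse(old: BenchmarkResult, new: BenchmarkResult) -> bool:
    """True if the new interval lies entirely below the old interval."""
    return new.high < old.low


def find_regressions(
    old_results: Dict[str, BenchmarkResult],
    new_results: Dict[str, BenchmarkResult],
) -> List[str]:
    """Names of benchmarks present in both runs that got significantly worse."""
    return sorted(
        name
        for name, old in old_results.items()
        if name in new_results and significantly_worse(old, new_results[name])
    )


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    before = {
        "benchmark-a": BenchmarkResult(0.60, 0.55, 0.65),
        "benchmark-b": BenchmarkResult(0.70, 0.66, 0.74),
    }
    after = {
        "benchmark-a": BenchmarkResult(0.59, 0.54, 0.64),
        "benchmark-b": BenchmarkResult(0.60, 0.56, 0.64),
    }
    print(find_regressions(before, after))
```

With these toy inputs only `benchmark-b` is flagged, because a small drop whose interval still overlaps the old one is not treated as a regression.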


Yen-Ling Kuo

Feb 2, 2021, 12:28:55 PM
to conceptn...@googlegroups.com
Hi Robyn,

It's great to hear that the importing process is the same as the previous one.

I already got the data from Ying-Ren.
Since his changes clean up and extend the current version, I'll merge them with the current version and let you know.
The format I'll use should be the same one as before, so no additional conceptnet5.readers module is needed.
The only tricky part is that he revised and added some frames, so we may need to change zh_frames.json as well.

I'll follow up with you if I have any questions while merging the data.

Thanks,
Yen-Ling

