[Deadline Extended] SHINRA2020-ML Task: Call for Participation (2nd CFP to SHINRA2020-ML Classification Task)

Masako Nomoto

Jul 8, 2020, 2:27:14 AM
to shinr...@googlegroups.com

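To everyone on the SHINRA2019 mailing list:

This is Nomoto of the SHINRA2020-ML organizing committee.

In parallel with the Japanese structuring task, we are also running a
task (SHINRA2020-ML) that classifies Wikipedia pages in 30 languages into
219 Extended Named Entity categories. The deadline has been extended by
one month, so there is still time. Please consider participating!

* Registration and result submission deadline: August 31 (extended)
The SHINRA2020-ML leaderboard has been released, so please take a look.

Best regards,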

This is the second call for participation in the SHINRA2020-ML
Classification task, which addresses the problem of classifying
Wikipedia entities in 30 languages into 219 categories.

* The registration and result submission deadline has been
extended to August 31, 2020.

* We are pleased to inform you that we have released the
SHINRA2020-ML leaderboard.

Please take this opportunity to have a look at the data.
We look forward to having you join us.

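(Translated from the Japanese version)
========================================================

    SHINRA2020-ML (multilingual classification task): Call for Participation

     Data release: January 2020
     Registration & result submission deadline: August 31, 2020 (extended)
     NTCIR-15 Conference: December 2020
========================================================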

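*Overview

SHINRA is a resource creation project started in 2017. It aims to
structure the knowledge in Wikipedia into a form that computers can
process, and it advances shared-task evaluation and resource creation at
the same time under a framework called Resource by Collaborative
Contribution (RbCC).

SHINRA2020-ML is the first text classification task among the SHINRA
project's shared tasks. It is conducted as one of the NTCIR-15 tasks and
addresses the classification of Wikipedia entity pages in 30 languages.

  [Task introduction video] (approx. 11 min, in English)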
  
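*Task

The task is to classify Wikipedia pages in 30 languages (*1) into the 219
Extended Named Entity categories, using categorized Japanese articles and
the interlanguage links to the corresponding pages in the target
languages.

Participants select one or more target languages, use the Wikipedia pages
reached by interlanguage links from the categorized Japanese pages as
training data, and classify the remaining uncategorized pages that have
no such link.

After the task is over, the results of all participating systems will be
combined (together with the participants) by ensemble learning, and the
results will be published.

We hope that many people will participate and that your goodwill will
make this a better task.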

  (*1): The 30 target languages are English, Spanish, French, German,
  Chinese, Russian, Portuguese, Italian, Arabic,
  Indonesian, Turkish, Dutch, Polish, Persian, Swedish,
  Vietnamese, Korean, Hebrew, Romanian, Norwegian,
  Czech, Ukrainian, Hindi, Finnish, Hungarian, Danish,
  Thai, Catalan, Greek, Bulgarian.

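*Schedule

January 2020 Data release
August 31, 2020 Registration & result submission deadline (extended)
Mid-September 2020 Evaluation results returned to participants (extended)
December 2020 NTCIR-15 Conference (NII, Tokyo)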

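*How to participate

If you are new to SHINRA tasks or are curious about what the task
involves, please try out the data in the "trial datasets".
For how to download the datasets and participate in the task, see the
description on the SHINRA 2020-ML task page.

This task is one of the NTCIR-15 tasks. To participate, please register
via the NTCIR-15 participation instructions page below.

NTCIR-15 task participation guide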

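*Organizers

Chair:
Satoshi Sekine (RIKEN AIP)

Organizing Committee:
Masako Nomoto (RIKEN AIP)
Asuka Sumida (RIKEN AIP)
Kouta Nakayama (University of Tsukuba / RIKEN AIP)
Koji Matsuda (RIKEN AIP / Tohoku University)

PC Members: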
Jiewen Wu (A*STAR, Singapore)
Christophe Gravier (Université de Lyon, France)
Hsin-Hsi Chen (National Taiwan University, Taiwan)
Haizhou Li (National University of Singapore, Singapore)
Virach Sornlertlamvanich (Thammasat University,
Thailand / Musashino University, Japan)
Massimo Poesio (Queen Mary University of London, England)
Rafael Muñoz Guillena (Universitat d’Alacant, Spain)
Min Zhang (Soochow University, China)
Wenliang Chen (Soochow University, China)
Johan Bos (University of Groningen, Netherlands)
Gerhard Weikum (DFKI, Germany)
Asif Ekbal (IIT Patna, India)
Gjergji Kasneci (Tübingen University, Germany)
Vasudeva Varma (IIIT Hyderabad, India)
Asanee Kasetsart (Kasetsart University, Thailand)
Pierpaolo Basile (Università degli Studi di Bari Aldo
Moro, Italy)
David Nadeau (Innodata, Canada)
Murat Can Ganiz (Marmara University, Turkey)
Adrian Iftene (“Alexandru Ioan Cuza” University, Romania)
Tommi A Pirinen (Universität Hamburg, Germany)
Tru Cao (The University of Texas Health Science Center at Houston, USA)
Petya Osenova (Sofia University “St. Kl. Ohridski”, Bulgaria)
Le Hong Phuong (Vietnam National University, Hanoi, Vietnam)
Nguyen Thi Minh Huyen (Vietnam National University, Hanoi,
Vietnam)
Nicolas Heist (Universität Mannheim, Germany)
Zdenek Zabokrtsky (Charles University, Czech Republic)
Tim Finin (University of Maryland, USA)
Su Jian (A*STAR, Singapore)
Manar Alkhatib (The British University in Dubai, United
Arab Emirates)
Key-Sun Choi (Korea Advanced Institute of Science and
Technology, Korea)
Nigel Collier (University of Cambridge, UK)
Ikuya Yamada (Studio Ousia / RIKEN AIP)
Kentaro Inui (Tohoku University / RIKEN AIP)
Tomoya Iwakura (Fujitsu)
Mehrnoush Shamsfard (Shahid Beheshti University, Iran)
Galia Angelova (Bulgarian Academy of Sciences, Bulgaria)
Yusuke Miyao (The University of Tokyo)
Kiril Simov (Bulgarian Academy of Sciences, Bulgaria)
Yukino Baba (University of Tsukuba)
Masaharu Yoshioka (Hokkaido University)
Heng Ji (University of Illinois at Urbana-Champaign, USA)
Miloslav Konopik (University of West Bohemia, Czech Republic)
Steven Skiena (Stony Brook University, USA)
Catherine Legg (Deakin University, Australia)

*Contact

Email (to the organizers):
shinra20...@googlegroups.com

Slack (for task participants and organizers):
http://shinra2020-ml.slack.com [Invitation link]

=====================================================================
CALL FOR TASK PARTICIPATION
SHINRA2020-ML Classification Task

Data release: January 2020
Registration & Result submission deadline: August 31, 2020 (extended)
NTCIR-15 Conference: December 2020
=====================================================================

SHINRA is a resource creation project started in 2017 that aims to structure the
knowledge in Wikipedia.

SHINRA2020-ML is the first text classification shared task in the SHINRA project.
It tackles the challenge of classifying Wikipedia entities in 30 languages into
fine-grained categories. The task is conducted as one of the NTCIR-15 tasks.

[Video] (approx. 11 min):
Introduction to the SHINRA2020-ML task (categorization of Wikipedia in 30 languages into ENE)

TASK OVERVIEW
The task is to classify Wikipedia pages in 30 languages (*1) into 219 categories using
categorized Japanese Wikipedia pages and the interlanguage links to the corresponding
pages in the target languages. The categories are defined in Extended Named Entity (ENE)
ver.8.0, a four-layer ontology for classifying names, time, and numbers.

Participants are expected to select one or more target languages and, for each
language, use the Wikipedia pages linked from the categorized Japanese pages as
training data, then run their system to classify the remaining pages, which are not
linked from the Japanese pages. Please see the TASK DESCRIPTION on the home page for
further details.
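
As a rough illustration of the data split described above, the following Python sketch
separates the pages of one target language into training pages (linked from a categorized
Japanese page) and pages that remain to be classified. The Page class, its field names,
and the label strings are hypothetical and chosen only for this example; the actual data
format is the one given in the task description.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Page:
    """A target-language Wikipedia page (hypothetical structure)."""
    page_id: str
    title: str
    ja_page_id: Optional[str]  # Japanese page linked to this page, or None


def split_by_language_link(
    pages: List[Page],
    ja_labels: Dict[str, List[str]],
) -> Tuple[List[Tuple[Page, List[str]]], List[Page]]:
    """Split pages into (training, targets).

    A page becomes training data when it is linked from a Japanese page
    that has category labels; its labels are copied from that page.
    Every other page is a classification target. Treating pages linked
    to an unlabeled Japanese page as targets is an assumption of this sketch.
    """
    training: List[Tuple[Page, List[str]]] = []
    targets: List[Page] = []
    for page in pages:
        labels = ja_labels.get(page.ja_page_id) if page.ja_page_id else None
        if labels:
            training.append((page, labels))
        else:
            targets.append(page)
    return training, targets


# Toy data; identifiers and labels are placeholders, not real task data.
example_pages = [
    Page("en-1", "Tokyo", "ja-1"),
    Page("en-2", "Some unlinked article", None),
]
example_ja_labels = {"ja-1": ["City"]}
example_training, example_targets = split_by_language_link(
    example_pages, example_ja_labels
)
```

The training list can then be fed to any classifier, and the targets list is what the
system is expected to label.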

After the task is over, we (together with the participants) will combine the results of
all participants by ensemble learning and publish the combined results. This scheme is
called “Resource by Collaborative Contribution (RbCC)”.
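
The announcement does not say which ensemble method will be used. As a minimal sketch
only, assuming simple majority voting as a stand-in and assuming one label per page, the
outputs of several systems could be combined as follows. The function name and the
dictionary layout are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(system_outputs: List[Dict[str, str]]) -> Dict[str, str]:
    """Combine per-system predictions (page_id -> label) by majority vote.

    Ties go to the label that was seen first, because Counter.most_common
    keeps insertion order among equal counts.
    """
    votes: Dict[str, Counter] = {}
    for output in system_outputs:
        for page_id, label in output.items():
            votes.setdefault(page_id, Counter())[label] += 1
    return {
        page_id: counter.most_common(1)[0][0]
        for page_id, counter in votes.items()
    }


# Toy predictions from three hypothetical systems.
example_outputs = [
    {"p1": "Person", "p2": "City"},
    {"p1": "Person", "p2": "Country"},
    {"p1": "City", "p2": "Country"},
]
example_combined = majority_vote(example_outputs)
```

A real ensemble would more likely weight systems by their measured accuracy or learn
the combination from held-out data; plain voting is shown only because it is the
simplest case.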

We hope that many participants will join in a spirit of goodwill.

(*1): The 30 languages are English, Spanish, French, German, Chinese, Russian,
Portuguese, Italian, Arabic, Indonesian, Turkish, Dutch, Polish, Persian, Swedish,
Vietnamese, Korean, Hebrew, Romanian, Norwegian, Czech, Ukrainian, Hindi, Finnish,
Hungarian, Danish, Thai, Catalan, Greek, Bulgarian.

IMPORTANT DATES
January 2020 Data release
August 31, 2020 Registration & Result submission deadline (extended)
Mid-September 2020 Evaluation results returned to participants (extended)
December 2020 NTCIR-15 Conference (NII, Tokyo)

HOW TO PARTICIPATE
We encourage new participants to have a look at the data in the “Trial datasets”.
Instructions for downloading the datasets and participating in the task are given on
the SHINRA2020-ML task page.

Please note that the task is conducted as one of the NTCIR-15 tasks and you have to 
register through the NTCIR-15 registration page to participate in it.

ORGANIZERS
Chair
Satoshi Sekine (RIKEN AIP, Japan)

Organizing Committee
Masako Nomoto (RIKEN AIP, Japan)
Asuka Sumida (RIKEN AIP, Japan)
Kouta Nakayama (University of Tsukuba / RIKEN AIP, Japan)
Koji Matsuda (RIKEN AIP / Tohoku University, Japan)

PC Members
Jiewen Wu (A*STAR, Singapore)
Christophe Gravier (Université de Lyon, France)
Hsin-Hsi Chen (National Taiwan University, Taiwan)
Haizhou Li (National University of Singapore, Singapore)
Virach Sornlertlamvanich (Thammasat University, Thailand / Musashino University,
Japan)
Massimo Poesio (Queen Mary University of London, England)
Rafael Muñoz Guillena (Universitat d’Alacant, Spain)
Min Zhang (Soochow University, China)
Wenliang Chen (Soochow University, China)
Johan Bos (University of Groningen, Netherlands)
Gerhard Weikum (DFKI, Germany)
Asif Ekbal (IIT Patna, India)
Gjergji Kasneci (Tübingen University, Germany)
Vasudeva Varma (IIIT Hyderabad, India)
Asanee Kasetsart (Kasetsart University, Thailand)
Pierpaolo Basile (Università degli Studi di Bari Aldo Moro, Italy)
David Nadeau (Innodata, Canada)
Murat Can Ganiz (Marmara University, Turkey)
Adrian Iftene (“Alexandru Ioan Cuza” University, Romania)
Tommi A Pirinen (Universität Hamburg, Germany)
Tru Cao (The University of Texas Health Science Center at Houston, USA)
Petya Osenova (Sofia University “St. Kl. Ohridski”, Bulgaria)
Le Hong Phuong (Vietnam National University, Hanoi, Vietnam)
Nguyen Thi Minh Huyen (Vietnam National University, Hanoi, Vietnam)
Nicolas Heist (Universität Mannheim, Germany)
Zdenek Zabokrtsky (Charles University, Czech Republic)
Tim Finin (University of Maryland, USA)
Su Jian (A*STAR, Singapore)
Manar Alkhatib (The British University in Dubai, United Arab Emirates)
Key-Sun Choi (Korea Advanced Institute of Science and Technology, Korea)
Nigel Collier (University of Cambridge, UK)
Ikuya Yamada (Studio Ousia / RIKEN AIP, Japan)
Kentaro Inui (Tohoku University / RIKEN AIP, Japan)
Tomoya Iwakura (Fujitsu, Japan)
Mehrnoush Shamsfard (Shahid Beheshti University, Iran)
Galia Angelova (Bulgarian Academy of Sciences, Bulgaria)
Yusuke Miyao (The University of Tokyo, Japan)
Kiril Simov (Bulgarian Academy of Sciences, Bulgaria)
Yukino Baba (University of Tsukuba, Japan)
Masaharu Yoshioka (Hokkaido University, Japan)
Heng Ji (University of Illinois at Urbana-Champaign, USA)
Miloslav Konopik (University of West Bohemia, Czech Republic)
Steven Skiena (Stony Brook University, USA)
Catherine Legg (Deakin University, Australia)

CONTACT
Email to the organizers:
shinra20...@googlegroups.com
Slack for participants and organizers:
http://shinra2020-ml.slack.com
[Invitation link]
