577 results found.
Written
Lexicon,
Language Type:
Multilingual
Languages:
Czech
Availability:
Freely Available
License:
Creative Commons 3.0 - BY - NC - SA
Size:
11656 entries
Production Status:
Existing-used
Use:
Linking lexicons
- Paper title: Automatic Mapping Lexical Resources: A Lexical Unit as the Keystone
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Eduard Bejček | Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics | CZ |
| Author 2 | Václava Kettnerová | Charles University in Prague | CZ |
| Author 3 | Marketa Lopatkova | Charles University in Prague | CZ |
| Main Contact | Eduard Bejček | Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics | None |
Documentation:
http://ufal.mff.cuni.cz/PDT-Vallex/
Written
Corpus,
Language Type:
Bilingual
Languages:
English Russian
Availability:
From Data Center(s)
License:
CC-BY-SA, public domain
Size:
115,000 tokens
Production Status:
Newly created-finished
Use:
Named Entity Recognition
- Paper title: Tagging Location Phrases in Text
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Paul McNamee | Location Phrase Dataset v1.0 | /N |
Documentation:
None
Written
Terminology,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
44 MByte
Production Status:
Existing-used
Use:
Question Answering
- Paper title: How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention
- Paper track: Short paper/
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yue Guan | The Stanford Question Answering Dataset (SQuAD) v2.0 | /N |
Documentation:
None
Language Type:
Monolingual
Languages:
<Not Specified>
Availability:
From Data Center(s)
License:
LDC
Size:
1500 queries Other
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
- Paper title: Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Affiliation (2) | Country (2) |
|---|---|---|---|---|---|
| Author 1 | Xuansong Li | <Not Specified> | None | University of Pennsylvania | None |
| Author 2 | Stephanie Strassel | <Not Specified> | None | LDC | None |
| Author 3 | Heng Ji | <Not Specified> | None | | |
| Author 4 | Kira Griffitt | <Not Specified> | None | | |
| Author 5 | Joe Ellis | <Not Specified> | None | | |
| Main Contact | Xuansong Li | Linguistic Data Consortium, University of Pennsylvania | US | Linguistic Data Consortium at the University of Pennsylvania | US |
Documentation:
yes, English, will soon be publicly available
Written
Corpus,
Language Type:
Monolingual
Languages:
Polish
Availability:
Freely Available
License:
CreativeCommons
Size:
61315 tokens
Production Status:
Existing-updated
Use:
Information Extraction, Information Retrieval
- Paper title: PST 2.0 – Corpus of Polish Spatial Texts
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Michał Marcińczuk | Corpus of Polish Spatial Texts 2.0 (PST 2.0) | /N |
Documentation:
None
Language Type:
Language Independent
Languages:
<Not Specified>
Availability:
Not Available
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-updated
Use:
Information Extraction, Information Retrieval
- Paper title: From Speech to Trees: Applying Treebank Annotation to Arabic Broadcast News
- Paper track: General issues
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Affiliation (2) | Country (2) | Affiliation (3) | Country (3) |
|---|---|---|---|---|---|---|---|
| Author 1 | Mohamed Maamouri | <Not Specified> | None | LDC | None | Linguistic Data Consortium | US |
| Author 2 | Ann Bies | <Not Specified> | None | LDC | None | Linguistic Data Consortium, University of Pennsylvania | None |
| Author 3 | Seth Kulick | <Not Specified> | None | LDC | None | Linguistic Data Consortium | US |
| Author 4 | Wajdi Zaghouani | LDC | None | | | | |
| Author 5 | Dave Graff | LDC | None | | | | |
| Author 6 | Mike Ciul | LDC | None | | | | |
| Main Contact | Ann Bies | LDC | US | Linguistic Data Consortium, University of Pennsylvania | US | | |
Documentation:
<Not Specified>
Language Type:
Monolingual
Languages:
<Not Specified>
Availability:
Contact the author
License:
<Not Specified>
Size:
70,000 words Other
Production Status:
under production
Use:
<Not Specified>
Paper:
N/A
Documentation:
Das, D. and Stede, M. (to appear). Developing the Bangla RST Discourse Treebank. In Proceedings of the 6th Workshop on Recent Advances in RST and Related Formalisms.
Written
Corpus,
Language Type:
Multilingual
Languages:
Italian
Availability:
Freely Available
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Russian
Availability:
Freely Available
License:
OpenSource
Size:
The Russian Dependency Syntax Multi-Treebank corpus was developed under the RU-EVAL-2012 initiative on the evaluation of Russian dependency tree parsers. The test corpus provides a standard for qualitative comparisons between the various dependency parsing schemes used in Russian NLP tools. The corpus includes a sample of 64,800 sentences drawn at random from fiction, news, non-fiction, blogs, etc. (three or more consecutive sentences per source text). The collection is annotated in parallel with a range of parse trees
Production Status:
Newly created-finished
Use:
Evaluation
Paper:
N/A
Documentation:
http://testsynt.soiza.com/




