577 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
One million words
Production Status:
Existing-used
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
English
Availability:
From Data Center(s)
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Language Modelling
Paper:
N/A
Documentation:
<Not Specified>
Written
Treebank,
Language Type:
Multilingual
Languages:
Czech, English
Availability:
Freely Available
License:
CC-BY-NC-SA + LDC
Size:
50K sentences
Production Status:
Existing-updated
Use:
Anaphora, Coreference
- Paper title: Coreference in Prague Czech-English Dependency Treebank
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Anna Nedoluzhko | Charles University in Prague | CZ |
| Author 2 | Michal Novák | Charles University in Prague, Faculty of Mathematics and Physics | CZ |
| Author 3 | Silvie Cinkova | Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics | CZ |
| Author 4 | Marie Mikulová | Charles University in Prague, Faculty of Mathematics and Physics | CZ |
| Author 5 | Jiří Mírovský | Charles University in Prague | CZ |
| Main Contact | Michal Novák | Charles University in Prague, Faculty of Mathematics and Physics | None |
Documentation:
yes, English, http://ufal.mff.cuni.cz/pcedt2.0-coref/
Speech
Treebank,
Language Type:
Monolingual
Languages:
Czech
Availability:
Freely Available
License:
CreativeCommons
Size:
70000 sentences
Production Status:
Newly created-in progress
Use:
- Paper title: Prague Dependency Treebank - Consolidated 1.0
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marie Mikulová | Prague Dependency Treebank of Spoken Czech 2.0 | /N |
Documentation:
https://ufal.mff.cuni.cz/pdtsc2.0
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Document Classification, Text categorisation
- Paper title: Learning from Domain Complexity
- Paper track: Evaluation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Robert Remus | University of Leipzig | DE |
| Author 2 | Dominique Ziegelmayer | University of Cologne | DE |
| Main Contact | Robert Remus | University of Leipzig | None |
Documentation:
John Blitzer, Mark Dredze, Fernando Pereira. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. Association of Computational Linguistics (ACL), 2007.
Written
QA dataset,
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
English, Mandarin Chinese
Availability:
From Data Center(s)
License:
LDC
Size:
2M words
Production Status:
Existing-updated
Use:
Knowledge Discovery/Representation
Paper:
N/A
Documentation:
<Not Specified>
Written
Treebank,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
Size:
None
Production Status:
Existing-used
Use:
Discourse
- Paper title: Implicit Discourse Relation Classification: We Need to Talk about Evaluation
- Paper track: Short/Discourse and Pragmatics
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Najoung Kim | Penn Discourse Treebank 3.0 | /N |
Documentation:
https://catalog.ldc.upenn.edu/docs/LDC2019T05/PDTB3-Annotation-Manual.pdf
Written
Corpus,
Language Type:
Multilingual
Languages:
Czech
Availability:
Freely Available
License:
CreativeCommons
Size:
49431 sentences
Production Status:
Existing-used
Use:
Discourse
- Paper title: Genres in the Prague Discourse Treebank
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Lucie Poláková | Charles University in Prague | CZ |
| Author 2 | Pavlína Jínová | Charles University in Prague | CZ |
| Author 3 | Jiří Mírovský | Charles University in Prague | CZ |
| Main Contact | Lucie Poláková | Charles University in Prague | None |
Documentation:
http://ufal.mff.cuni.cz/pdit/
Written
Lexicon,
Language Type:
Language Independent
Languages:
<Not Specified>
Availability:
Freely Available
License:
CC BY-SA
Size:
70 MByte
Production Status:
Existing-updated
Use:
- Paper title: Methodological Aspects of Developing and Managing an Etymological Lexical Resource: Introducing EtymDB-2.0
- Paper track: Written/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Clémentine Fourrier | EtymDB-2.0 | /N |
Documentation:
Yes, in English




