7693 results found.
<Not Specified>
Written
Corpus,
LTC2011
Language Type:
Multilingual
Languages:
English Slovenian
Availability:
PDF files are available (http://nl.ijs.si/isjt10/); the corpus itself is not currently available.
License:
<Not Specified>
Size:
4.6 MByte
Production Status:
Newly created-in progress
Use:
Topic Detection and Tracking
Paper:
N/A
Documentation:
No.
Microblog/Twitter Summarization Data Set
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
Under Advisement
Size:
More than 750K unique tweets
Production Status:
Newly created-in progress
Use:
Summarisation
Paper:
N/A
Documentation:
Publicly available documentation in English
Test set for Chinese nonlocal dependencies
Written
Corpus,
LREC2018
Language Type:
Multilingual
Languages:
Mandarin Chinese
Availability:
Freely Available
License:
GNU
Size:
1 MByte
Production Status:
Newly created-in progress
Use:
Emotion Recognition/Generation
Paper:
N/A
Documentation:
<Not Specified>
Salience-In-News-And-Tweets
Written
Corpus,
LREC2016
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
<Not Specified>
European Medicines Agency (EMEA) documents
Written
Corpus,
LREC2018
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
325332 sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
Paper:
N/A
Documentation:
<Not Specified>
Mitchell and Lapata 2008 Dataset
Written
Evaluation Data,
COLING2012
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
3600 entries
Production Status:
Existing-used
Use:
Sentence Similarity Task
Paper:
N/A
Documentation:
<Not Specified>
WSJ Penn Treebank
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
<Not Specified>
OpenWordnet-PT
Written
Lexicon,
COLING2012
Language Type:
Trilingual
Languages:
American English Brazilian Portuguese Portuguese
Availability:
Freely Available
License:
CC BY-SA 3.0
Size:
35724
Production Status:
Newly created-in progress
Use:
Word Sense Disambiguation
Paper:
N/A
Documentation:
https://github.com/arademaker/wordnet-br
British National Corpus (BNC)
Written
Corpus,
NAACL2013
Language Type:
Multilingual
Languages:
English
Availability:
<Not Specified>
License:
<Not Specified>
Size:
630
Production Status:
Existing-used
Use:
Language Modelling
Paper:
N/A
Documentation:
<Not Specified>
The RST Basque Treebank
Written
Treebank,
LREC2018
Language Type:
Multilingual
Languages:
Basque
Availability:
Freely Available
License:
N/A
Size:
15566 words
Production Status:
Existing-used
Use:
Discourse
Paper:
N/A
Documentation:
Iruskieta, M.; Aranzabe, M.J.; Diaz de Ilarraza, A.; Gonzalez, I.; Lersundi, M.; Lopez de la Calle, O. 2013. The RST Basque TreeBank: an online search interface to check rhetorical relations. Paper presented at the 4th Workshop "RST and Discourse Studies", Brazil, October 21-23.