7693 results found.
Twitter corpus
Written
Corpus,
COLING2012
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
100000
Production Status:
Existing-used
Use:
Emotion Recognition/Generation
Paper:
N/A
Documentation:
<Not Specified>
CELEX Lexical Database (refurbishment)
Written
a simple Perl script for the production of the changed version,
LREC2016
Language Type:
Multilingual
Languages:
German
Availability:
Freely Available
License:
<Not Specified>
Size:
4000 tokens
Production Status:
Existing-used
Use:
Changes of old standards to new ones
Paper:
N/A
Documentation:
Perl comments
English GigaWord
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
<Not Specified>
Production Status:
<Not Specified>
Use:
Not Applicable
Paper:
N/A
Documentation:
<Not Specified>
Hurricane Irene 2011 Dataset
Written
Corpus,
COLING2012
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
<Not Specified>
Size:
240 MByte
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
NO
Sighan 2005 bakeoff data
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Mandarin Chinese
Availability:
Freely Available
License:
<Not Specified>
Size:
37MB
Production Status:
Existing-used
Use:
Word Segmentation
Paper:
N/A
Documentation:
The document is written in English
UNT Computer Science Short Answer Dataset v 2.0
Written
Evaluation Data,
ACLHT2011
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
LDC
Size:
1.4MB
Production Status:
Newly created-finished
Use:
Evaluation/Validation
Paper:
N/A
Documentation:
http://nlp.cs.qc.cuny.edu/kbp/2010/data.html
Wikipedia
Multimodal/Multimedia
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-updated
Use:
Knowledge Discovery/Representation
Paper:
N/A
Documentation:
<Not Specified>
Balanced Corpus of Contemporary Written Japanese
Written
Corpus,
COLING2012
Language Type:
Multilingual
Languages:
Japanese
Availability:
From Data Center(s)
License:
<Not Specified>
Size:
1286899
Production Status:
Existing-used
Use:
Morphological Analysis
Paper:
N/A
Documentation:
<Not Specified>
Indic Language Transliteration Data
Written
Evaluation Data,
COLING2010
Language Type:
Trilingual
Languages:
Bengali, Hindi, Telugu
Availability:
From Owner
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Newly created-finished
Use:
Transliteration test bench
Paper:
N/A
Documentation:
<Not Specified>
Project Gutenberg
Written
Corpus,
RANLP2011
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
The Full Project Gutenberg License
Size:
331Mbyte
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
<Not Specified>