7693 results found.
Chinese Penn Treebank 6.0
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Mandarin Chinese
Availability:
From Data Center(s)
License:
LDC
Size:
781,351 words, 28,295 trees
Production Status:
Existing-used
Use:
Formalism-to-formalism corpus conversion
Paper:
N/A
Documentation:
Documentation publicly available in English
Fake news dataset
Written
Corpus,
COLING2018
Language Type:
Multilingual
Languages:
English
Availability:
<Not Specified>
License:
Creative Commons
Size:
500 entries
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
Paper:
N/A
Documentation:
<Not Specified>
GRID
Multimodal/Multimedia
Corpus,
IS2013
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
34000 sentences
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
Paper:
N/A
Documentation:
yes. in English. yes.
Hamshahri
Written
Corpus,
COLING2012
Language Type:
Multilingual
Languages:
Iranian Persian
Availability:
Freely Available
License:
OpenSource
Size:
400
Production Status:
Existing-updated
Use:
Language Modelling
Paper:
N/A
Documentation:
<Not Specified>
Providence Corpus for studying Child Directed Speech
Phonemic Transcription, written
Corpus,
COLING2012
Language Type:
Multilingual
Languages:
American English
Availability:
Freely Available
License:
<Not Specified>
Size:
90000
Production Status:
Newly created-finished
Use:
Acquisition
Paper:
N/A
Documentation:
included in submission
Simulated Contact Center Dialogues
Speech
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Japanese
Availability:
Not Available
License:
<Not Specified>
Size:
691 dialogues
Production Status:
Newly created-finished
Use:
Summarisation
Paper:
N/A
Documentation:
<Not Specified>
Hyponymy extraction tool
Not Applicable
Evaluation Data,
COLING2010
Language Type:
Multilingual
Languages:
Japanese
Availability:
Freely Available
License:
GNU GPL
Size:
66 MB
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
Japanese
Ossetic National Corpus
Written
Corpus,
COLING2012
Language Type:
Trilingual
Languages:
English, Ossetian, Russian
Availability:
Freely Available
License:
<Not Specified>
Size:
5000000 tokens
Production Status:
Newly created-in progress
Use:
Corpus Creation/Annotation
Paper:
N/A
Documentation:
<Not Specified>
Ontology created from Wikipedia Animal articles
Written
Ontology,
COLING2010
Language Type:
Multilingual
Languages:
English, Finnish
Availability:
Freely Available
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Semantic Web
Paper:
N/A
Documentation:
<Not Specified>
Icelandic Parsed Historical Corpus (IcePaHC)
Written
Treebank,
ACL2016
Language Type:
Multilingual
Languages:
Icelandic
Availability:
Freely Available
License:
LGPL
Size:
1M words
Production Status:
Existing-used
Use:
Morphological Analysis
Paper:
N/A
Documentation:
Yes, English