7693 results found.
Szeged LVC Corpus
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Hungarian
Availability:
From Owner
License:
<Not Specified>
Size:
250Mbyte
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
available in Hungarian, English documentation in progress
Wall Street Journal (WSJ) Corpus
Speech
Corpus,
IS2013
Language Type:
Multilingual
Languages:
American English
Availability:
From Data Center(s)
License:
LDC
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
Paper:
N/A
Documentation:
<Not Specified>
SIKOR North Saami free corpus
Written
Corpus,
LREC2018
Language Type:
Multilingual
Languages:
Northern Sami
Availability:
Freely Available
License:
CC BY 3.0
Size:
8936437 tokens
Production Status:
Existing-used
Use:
Language Modelling
Paper:
N/A
Documentation:
<Not Specified>
Tigrinya Word lexicon
Written
Lexicon,
COLING2012
Language Type:
Multilingual
Languages:
Tigrinya
Availability:
Not Available
License:
<Not Specified>
Size:
17
Production Status:
Newly created-in progress
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
<Not Specified>
Context-free dataset
Written
Corpus,
COLING2018
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
759 context-response pairs
Production Status:
Newly created-finished
Use:
Dialogue
Paper:
N/A
Documentation:
<Not Specified>
ITAAL speech corpus
Speech
Corpus,
IS2013
Language Type:
Multilingual
Languages:
Italian
Availability:
From Owner
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Newly created-finished
Use:
Speech Recognition/Understanding
Paper:
N/A
Documentation:
<Not Specified>
AOL 2006 Query Log
Written
Query Log,
COLING2010
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
2.1Gbyte
Production Status:
Existing-used
Use:
Session Detection
Paper:
N/A
Documentation:
<Not Specified>
OpenNLP
Written
Tagger/Parser,
COLING2012
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
<Not Specified>
Chinese CCGbank
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Mandarin Chinese
Availability:
To be determined (in the next few months)
License:
To be determined (most likely distributed through the LDC)
Size:
760,000 words, 27,759 trees
Production Status:
Newly created-in progress
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
English documentation is not yet available, but will be provided on release.
en_w_fr_articles_subset_Wikipedia
Written
Evaluation Data,
LREC2014
Language Type:
Multilingual
Languages:
English, French
Availability:
Freely Available
License:
<Not Specified>
Size:
199734
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
Paper:
N/A
Documentation:
See paper