7693 results found.
Baidu Zhidao Corpus
Speech
Corpus,
IJCNLP2011
Language Type:
Multilingual
Languages:
Mandarin Chinese
Availability:
<Not Specified>
License:
<Not Specified>
Size:
260MB
Production Status:
Newly created-in progress
Use:
Text Mining
Paper:
N/A
Documentation:
<Not Specified>
TiGer
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
German
Availability:
Freely Available
License:
<Not Specified>
Size:
40,000 sentences
Production Status:
Existing-used
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
<Not Specified>
Arabic Syntactic Dependency Parser Model
Written
Grammar/Language Model,
ACLHT2011
Language Type:
Multilingual
Languages:
Standard Arabic
Availability:
From Owner
License:
For educational purposes only
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
<Not Specified>
A Uyghur Tokenizer and part-of-speech tagger
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Uyghur
Availability:
From Owner
License:
<Not Specified>
Size:
434,169 (dataset in tokenization); 711,867 (dataset in POS)
Production Status:
Existing-updated
Use:
Machine Translation, SpeechToSpeech Translation
Paper:
N/A
Documentation:
<Not Specified>
TSUBAKI Corpus
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Japanese
Availability:
Within our organization
License:
<Not Specified>
Size:
1TByte
Production Status:
Existing-used
Use:
Knowledge Discovery/Representation
Paper:
N/A
Documentation:
<Not Specified>
Language identifier for Bosnian, Croatian and Serbian
Written
Language Identifier,
COLING2012
Language Type:
Trilingual
Languages:
Bosnian Croatian Serbian
Availability:
Freely Available
License:
LGPL
Size:
6
Production Status:
Newly created-finished
Use:
Language Identification
Paper:
N/A
Documentation:
<Not Specified>
CTB6
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Mandarin Chinese
Availability:
<Not Specified>
License:
<Not Specified>
Size:
11Mbyte
Production Status:
Existing-used
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
<Not Specified>
CLP2010 Testing Dataset of the Chinese Word Sense Induction Task
Written
Evaluation Data,
IJCNLP2011
Language Type:
Multilingual
Languages:
Mandarin Chinese
Availability:
From Owner
License:
<Not Specified>
Size:
100 target words and 5000 instances, 2.34Mbyte
Production Status:
Existing-used
Use:
Word Sense Induction
Paper:
N/A
Documentation:
<Not Specified>
HCRC Map Task Corpus
Speech
Corpus,
RANLP2011
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
150,000 words
Production Status:
Existing-used
Use:
Parsing and Tagging
Paper:
N/A
Documentation:
<Not Specified>
IREX data set for NE recognition
Written
Corpus,
COLING2010
Language Type:
Multilingual
Languages:
Japanese
Availability:
<Not Specified>
License:
<Not Specified>
Size:
<Not Specified>
Production Status:
Existing-used
Use:
Named Entity Recognition
Paper:
N/A
Documentation:
<Not Specified>