|
ABSTRACT
Title |
: |
Discovering suffixes: A Case Study for Marathi Language |
Authors |
: |
Mudassar M. Majgaonker, Tanveer J Siddiqui |
Keywords |
: |
Marathi morphology, Marathi stemmer,
Unsupervised stemmer, Rule-based stemmer, Natural language
processing |
Issue Date |
: |
November 2010 |
Abstract |
: |
Suffix stripping is a pre-processing step required in a
number of natural language processing applications. A stemmer is
a tool that performs this step. This paper presents and
evaluates a rule-based and an unsupervised Marathi stemmer.
The rule-based stemmer uses a set of manually extracted
suffix-stripping rules, whereas the unsupervised approach learns
suffixes automatically from a set of words extracted from raw
Marathi text. The performance of both stemmers is
compared on a test dataset of 1500 manually stemmed
words.
|
Page(s) |
: |
2716-2720 |
ISSN |
: |
0975-3397 |
Source |
: |
Vol. 2, Issue 8 |
|
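The rule-based suffix stripping and the comparison against manually stemmed words mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the suffix list is a handful of common Marathi case markers rather than the paper's rule set, and the names `SUFFIXES`, `strip_suffix`, `min_stem_len`, and `accuracy` are hypothetical. The longest-match strategy and the minimum stem length are common stemmer conventions, not details confirmed by the source.

```python
# Illustrative suffix list (an assumption, NOT the rule set from the paper).
SUFFIXES = ["ला", "ने", "चा", "ची", "चे"]


def strip_suffix(word, suffixes=SUFFIXES, min_stem_len=2):
    """Remove the longest matching suffix from `word`.

    A suffix is stripped only if at least `min_stem_len` code points
    remain, so that very short words are returned unchanged.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def accuracy(stemmer, gold):
    """Fraction of words whose predicted stem equals the manual stem.

    `gold` maps each test word to its manually assigned stem.
    """
    correct = sum(1 for word, stem in gold.items() if stemmer(word) == stem)
    return correct / len(gold)
```

A stemmer built this way could be scored by passing it to `accuracy` together with a dictionary of manually stemmed words; an unsupervised stemmer would differ only in how the suffix list is obtained.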