Document classification
Wikipedia
Document classification (or categorization) is a problem in information science. The task is to assign an electronic document to one or more categories based on its contents. Document classification tasks fall into two main kinds: supervised document classification, where some external mechanism (such as human feedback) provides information on the correct classification of documents, and unsupervised document classification, where the classification must be done entirely without reference to external information. There is also semi-supervised document classification, where only some of the documents are labeled by the external mechanism.
Techniques
Document classification techniques include:
- naive Bayes classifier
- tf-idf
- latent semantic indexing
- support vector machines
- artificial neural networks
- k-nearest neighbor algorithm (kNN)
- decision trees, such as ID3
- concept mining
and approaches based on natural language processing.
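As an illustration of the first technique in the list, the following is a minimal sketch of a supervised naive Bayes classifier in Python. It is not taken from the article: the function names (tokenize, train, classify), the whitespace tokenizer, the add-one (Laplace) smoothing, and the four-message spam/ham training set are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(labeled_docs):
    """Count documents per category and word occurrences per category."""
    doc_counts = Counter()              # category -> number of documents
    word_counts = defaultdict(Counter)  # category -> word -> occurrences
    vocabulary = set()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return doc_counts, word_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log-posterior score."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in doc_counts:
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # skip words never seen during training
            # Add-one smoothing keeps unseen (word, category) pairs nonzero.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set, invented for this example.
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
model = train(training_data)
print(classify("claim your free prize", model))           # spam
print(classify("agenda for the project meeting", model))  # ham
```

The labels here come from an external source (the hand-written training pairs), which is what makes this a supervised setting. A realistic classifier would need far more training data and a better tokenizer.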
Applications
Classification techniques have been applied to spam filtering, a process that tries to distinguish spam e-mail from legitimate messages.
See also
- classification
- Compound term processing
- supervised learning, unsupervised learning
- document retrieval
- information retrieval
- string metrics
- machine learning
- text mining, web mining, concept mining
Further reading
Publications:
- Fabrizio Sebastiani. Machine learning in automated text categorization. ACM Computing Surveys, 34(1):1–47, 2002

- Introduction to document classification
- Bibliography on Automated Text Categorization
- Bibliography on Query Classification
Wikipedia, the free encyclopedia © 2001-2006 Wikipedia contributors
This article is licensed under the GNU Free Documentation License.
Last updated on Tuesday September 09, 2008 at 06:20:32 PDT (GMT -0700)