Automatic detection of text genre

B Kessler, G Nunberg, H Schütze - arXiv preprint cmp-lg/9707002, 1997 - arxiv.org
As the text databases available to users become larger and more heterogeneous, genre
becomes increasingly important for computational linguistics as a complement to topical and
structural principles of classification. We propose a theory of genres as bundles of facets,
which correlate with various surface cues, and argue that genre detection based on surface
cues is as successful as detection based on deeper structural properties.