NeuroCOLT

Neural Networks and Computational Learning Theory

 


NeuroCOLT Technical Report NC-TR-02-122


On the Application of Diffusion Kernels to Text Data

Jaz Kandola
John Shawe-Taylor
Nello Cristianini

ABSTRACT
Kernel methods, such as Support Vector Machines, have been used successfully for text categorization. A standard choice of kernel function has been the inner product between the vector-space representations of two documents, in analogy with classical information retrieval (IR) approaches. In this paper we consider diffusion kernels (Kondor, 2001) and their suitability for text data. We motivate their use from a graph-theoretic framework. We propose an approach based on alignment for selecting the optimal decay parameter $\lambda$ in these kernels. We provide experimental results demonstrating that diffusion kernels are attractive choices for modelling text data.
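The abstract does not give formulas, so the following is only an illustrative sketch of the two ideas it names, not the report's actual procedure. It assumes the diffusion kernel takes the matrix-exponential form $K(\lambda) = \exp(\lambda G)$, where $G$ is the inner-product Gram matrix of the documents, and that "alignment" means the kernel-target alignment $\langle K, yy^\top\rangle_F / (\|K\|_F \, \|yy^\top\|_F)$ for labels $y \in \{-1,+1\}^n$. The function names, the toy document-term matrix, and the candidate grid for $\lambda$ are all hypothetical.

```python
import numpy as np

def diffusion_kernel(base, lam):
    """Return exp(lam * base) for a symmetric matrix (assumed kernel form)."""
    base = (base + base.T) / 2.0           # guard against round-off asymmetry
    evals, evecs = np.linalg.eigh(base)    # base = V diag(evals) V^T
    return (evecs * np.exp(lam * evals)) @ evecs.T

def alignment(K, y):
    """Kernel-target alignment between K and the ideal kernel y y^T."""
    Y = np.outer(y, y)
    return np.sum(K * Y) / (np.linalg.norm(K) * np.linalg.norm(Y))

def select_lambda(base, y, candidates):
    """Pick the candidate decay parameter with the highest alignment."""
    scores = [alignment(diffusion_kernel(base, lam), y) for lam in candidates]
    return candidates[int(np.argmax(scores))], scores

# Hypothetical toy data: 4 documents over 4 terms, two topics.
X = np.array([[2, 1, 0, 0],
              [1, 2, 0, 0],
              [0, 0, 1, 2],
              [0, 0, 2, 1]], dtype=float)
y = np.array([1, 1, -1, -1])

G = X @ X.T                                # standard inner-product kernel
candidates = [0.01, 0.1, 0.5, 1.0]         # hypothetical grid for lambda
best_lam, scores = select_lambda(G, y, candidates)
print(best_lam, scores)
```

In this sketch the labels are used only to score each candidate kernel, so in practice the selection would be made on training data alone.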
