SVM-based generalized multiple-instance learning via approximate box counting

Q Tao, S Scott, NV Vinodchandran, TT Osugi
Proceedings of the twenty-first international conference on Machine learning, 2004
The multiple-instance learning (MIL) model has been very successful in application areas such as drug discovery and content-based image retrieval. Recently, a generalization of this model and an algorithm for this generalization were introduced, showing significant advantages over the conventional MIL model in certain application areas. Unfortunately, this algorithm is inherently inefficient, preventing scaling to high dimensions. We reformulate this algorithm using a kernel for a support vector machine, reducing its time complexity from exponential to polynomial. Computing the kernel is equivalent to counting the number of axis-parallel boxes in a discrete, bounded space that contain at least one point from each of two multisets P and Q. We show that this problem is #P-complete, but then give a fully polynomial randomized approximation scheme (FPRAS) for it. Finally, we empirically evaluate our kernel.
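To make the counting problem in the abstract concrete, the following is a minimal brute-force sketch. It is not the paper's kernel computation and not its FPRAS: it simply enumerates every axis-parallel box and so takes time exponential in the dimension, which is only workable on tiny inputs. The function name `count_boxes`, the choice of the grid {1, ..., s}^d as the discrete, bounded space, and the representation of a box as one closed interval per dimension are all assumptions made for illustration.

```python
from itertools import combinations_with_replacement, product


def count_boxes(P, Q, s):
    """Count axis-parallel boxes in {1..s}^d that contain at least one
    point of P and at least one point of Q (hypothetical helper,
    exhaustive enumeration, exponential in d)."""
    d = len(P[0])
    # Every closed interval [lo, hi] with 1 <= lo <= hi <= s.
    intervals = list(combinations_with_replacement(range(1, s + 1), 2))

    def contains(box, point):
        # A box is a tuple of d intervals, one per coordinate.
        return all(lo <= x <= hi for (lo, hi), x in zip(box, point))

    count = 0
    for box in product(intervals, repeat=d):
        hits_p = any(contains(box, p) for p in P)
        hits_q = any(contains(box, q) for q in Q)
        if hits_p and hits_q:
            count += 1
    return count
```

For example, with `P = [(1, 1)]`, `Q = [(2, 2)]` and `s = 2`, only the box spanning the whole grid contains a point from both multisets, so `count_boxes(P, Q, 2)` returns 1. The number of candidate boxes grows as (s(s+1)/2)^d, which illustrates why exhaustive enumeration does not scale to high dimensions.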
ACM Digital Library