Multiple Instance Learning with Query Bags
eScholarship
Open Access Publications from the University of California

Abstract

In many machine learning applications, precisely labeled data is either burdensome or impossible to collect. Multiple Instance Learning (MIL), in which training data is provided as labeled bags rather than labeled instances, is one approach to dealing with ambiguously labeled data. In this paper we argue that in many applications of MIL (e.g., image, audio, text, bioinformatics) a single bag actually consists of a large or infinite number of instances, such as all points on a low-dimensional manifold. For practical reasons, these bags are subsampled before training. We instead introduce a MIL formulation that directly models the underlying structure of these bags. We propose and analyze the query bag model, in which instances are obtained by repeatedly querying an oracle in a way that can capture relationships between instances. We show that sampling more instances results in better classification performance, which motivates us to develop algorithmic strategies for sampling many instances while maintaining efficiency.
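
To make the idea of a bag as an oracle concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical toy setting: each bag is a circle (a one-dimensional manifold), an instance is an angle on that circle, and a bag is positive if it contains a small "positive arc". The names `make_query_bag`, `instance_score`, `classify_bag`, `detection_rate`, and `POSITIVE_ARC` are all illustrative. Instances are drawn independently here, which is a simplification; the query bag model described above can also capture relationships between instances.

```python
import math
import random

# Assumed width (in radians) of the region that makes a bag positive.
POSITIVE_ARC = 0.2


def make_query_bag(is_positive, rng):
    """Return an oracle; each call yields one instance (an angle on a circle).

    Hypothetical setup: a positive bag covers the whole circle, including the
    positive arc [0, POSITIVE_ARC); a negative bag never produces an angle
    inside that arc.
    """
    def query():
        low = 0.0 if is_positive else POSITIVE_ARC
        return rng.uniform(low, 2 * math.pi)
    return query


def instance_score(theta):
    """Toy instance classifier: 1 if the instance lies on the positive arc."""
    return 1 if theta < POSITIVE_ARC else 0


def classify_bag(query, m):
    """Standard MIL rule: a bag is positive if any of m sampled instances is."""
    return max(instance_score(query()) for _ in range(m))


def detection_rate(m, trials=2000, seed=0):
    """Fraction of positive bags recognized when m instances are sampled."""
    rng = random.Random(seed)
    hits = sum(classify_bag(make_query_bag(True, rng), m) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for m in (1, 10, 100):
        print(m, detection_rate(m))
```

Under these assumptions, a single query lands on the positive arc with probability p = POSITIVE_ARC / (2π), so a positive bag is missed with probability (1 - p)^m, which shrinks as the number of sampled instances m grows. This mirrors, in a very simplified form, the motivation for sampling many instances per bag.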

Pre-2018 CSE ID: CS2009-0949
