Ever since AlexNet's success in the 2012 ImageNet challenge, the medical image analysis community has taken notice of deep learning techniques and has gradually transitioned from systems that use handcrafted features to systems that learn features from data. Histopathology images are widely used to detect and diagnose a variety of cancers. With the growing availability of large-scale gigapixel whole-slide images (WSIs) of tissue specimens, digital pathology has become a very popular application area for deep learning techniques. Nevertheless, challenges remain in current computer-aided histopathology image analysis. Perhaps the biggest challenge is the shortage of annotated data. Deep learning requires large amounts of training data to achieve good performance. However, only pathologists, who have been trained for years, can annotate histopathology images accurately. Labeling histopathology images is therefore both expensive and labor-intensive. Annotations are also scarce at different scales. For example, a semantic segmentation task requires annotations at the ``pixel-wise'' level, and when WSIs are tiled into patches, patch-level labels are needed to provide accurate predictions. In reality, however, most WSI labels are available at the case level at best (\eg final diagnosis).
This dissertation attempts to improve data efficiency in histopathology image analysis. We start with a novel fully-supervised segmentation model for Gleason grading of prostate cancer. This method adopts two branches: an Epithelial Network Head (ENH) for detecting epithelial cells, and a Grading Network Head (GNH) for detecting, segmenting, and classifying the cancerous regions. We then present a series of studies on semi-supervised learning, which makes it possible to leverage unannotated data. We focus on methods using generative adversarial networks (GANs). To this end, we demonstrate a pyramid GAN structure for high-resolution, large-scale histopathology image generation and segmentation in both fully-supervised and semi-supervised scenarios. Finally, we present an active learning framework that reduces the annotations required from the expert while simultaneously handling noisy labels. Extensive experiments demonstrate the effectiveness of these methods, paving the way toward more efficient use of data in histopathology image analysis.