This dissertation illustrates, through concrete examples, how certain information-theoretic ideas and views on learning problems can lead to new algorithms. The three information-theoretic strategies taken in this dissertation are
(1) to abstract out the gist of a learning problem in the infinite-sample limit;
(2) to reduce a learning problem to a probability estimation problem and plug in a "good" probability estimate; and
(3) to adapt and apply relevant results from information theory.
These strategies are applied to three topics in machine learning: representation learning, nearest-neighbor methods, and universal information processing, with two problems studied under each topic.