Conventional recommendation models often use the user-item interaction matrix (e.g. ratings) to predict user preferences and generate recommendations. However, this approach ignores the abundant signals and context (e.g. visual signals or temporal context) available in real-world applications. Moreover, efficiency becomes an essential factor when building large-scale recommendation engines. In this thesis, we seek to extend conventional recommendation frameworks to adapt to new and large-scale application scenarios. Specifically, this thesis covers three directions: (i) Visually-aware Recommendation: we extend recommendation models to visual domains. We develop CNN-based end-to-end learning approaches to make personalized image recommendations and complementary product recommendations. Moreover, beyond recommending existing products, we develop GAN-based preference models to generate new products that match users' preferences; (ii) Dynamic Recommendation: we adapt recommendation models to dynamic environments where user preferences constantly shift. We develop Markov-chain-based and self-attention-based sequential models that respond quickly to changes in users' interests and make more accurate recommendations; (iii) Efficient Recommendation: we consider both time efficiency and space efficiency, where the former seeks to optimize serving latency and the latter seeks to reduce the memory consumption of recommendation models.