This paper examines several machine learning methods for predicting successful item recommendations to users within a network. By evaluating these methods, we compare various industry-standard recommenders in order to construct a baseline model for research purposes. Next, we attempt to generate recommendations with a natural language processing (NLP) approach that leverages available textual data. The aim is to explore whether user preferences can be modeled using the semantic information about a given item. Various approaches are considered to capture this information, and the final recommender is ultimately built with Google’s Universal Sentence Encoder, an NLP model that encodes full sentences into high-dimensional vectors that capture context [CYK18]. In our research, the items being recommended are movies. Each model is trained on a large dataset of film reviews provided by the University of Minnesota. In addition to user reviews, detailed information for each movie, ranging from popularity and genre to descriptive fields like plot overviews, is also used as features. With this, we are able to assess a variety of methods for providing recommendations. Ensemble methods are used to build a baseline model with an RMSE of 0.843. Different NLP algorithms are compared, and a recommender is successfully built on the Universal Sentence Encoder model with performance similar to our baseline, an RMSE of 0.872.
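To make the embedding-based idea concrete, the following is a minimal sketch, not the model built in the paper. It assumes each movie already has a sentence-embedding vector (here replaced by toy 3-dimensional vectors) and predicts a user's rating for an unseen movie as a cosine-similarity-weighted mean of that user's existing ratings. The function names `cosine`, `predict_rating`, and `rmse` are hypothetical, and clipping negative similarities to zero is an assumption of this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def predict_rating(target_vec, rated):
    """Similarity-weighted mean of a user's known ratings.

    rated: list of (embedding, rating) pairs for items the user has rated.
    Negative similarities are clipped to zero (an assumption of this sketch).
    """
    weights = [max(cosine(target_vec, vec), 0.0) for vec, _ in rated]
    total = sum(weights)
    if total == 0.0:
        # No similar item: fall back to the user's plain mean rating.
        return sum(r for _, r in rated) / len(rated)
    return sum(w * r for w, (_, r) in zip(weights, rated)) / total

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Toy 3-dimensional "embeddings" standing in for real sentence vectors.
rated = [([1.0, 0.0, 0.0], 5.0), ([0.0, 1.0, 0.0], 1.0)]
pred_a = predict_rating([1.0, 0.0, 0.0], rated)
pred_b = predict_rating([0.0, 1.0, 0.0], rated)
```

In practice the toy vectors would be replaced by encoder outputs for each movie's plot overview, and `rmse` would be evaluated on held-out ratings, as with the 0.843 and 0.872 figures reported above.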
In the following we unite Adaptive Least Squares (ALS) and Inverse Distance Weighting as a computationally frugal means of modeling very large space-time data. This technique, dubbed weighting by inverse distance with adaptive least squares (WIDALS), boasts several merits, including a small and readily interpretable hyperparameter space and relative ease of implementation. We include RMSE comparisons between WIDALS and various solutions, including the Kalman solution, on small simulated data sets. We culminate our work with a large-scale imputation/model fitting ritual (dubbed "phyning") using WIDALS on 6 contemporaneous climatological data sets, provided by the National Climatological Data Center, possessing 6 × 11669 = 70014 covariates/responses over 1016 time points, accompanied by about 24 million exogenous covariates, for a total of about 96,000,000 scalar data.
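For readers unfamiliar with the inverse distance weighting ingredient, here is a minimal sketch of plain spatial IDW only. It is not WIDALS: the adaptive least squares step and any time dimension are omitted. The function name `idw` and the `power` parameter (set to 2 by default) are assumptions made for illustration.

```python
import math

def idw(query, sites, values, power=2.0):
    """Inverse distance weighted estimate at `query`.

    sites:  list of coordinate tuples with known observations.
    values: observed value at each site, in the same order.
    power:  distance exponent (hypothetical default; larger means more local).
    """
    num = 0.0
    den = 0.0
    for site, value in zip(sites, values):
        d = math.dist(query, site)
        if d == 0.0:
            # Query coincides with an observed site: return its value directly.
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

sites = [(0.0, 0.0), (2.0, 0.0)]
values = [10.0, 20.0]
midpoint = idw((1.0, 0.0), sites, values)  # equidistant, so the plain average
```

The single `power` exponent illustrates why such schemes can have a small, interpretable hyperparameter space: it directly controls how quickly a site's influence decays with distance.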