Data-driven Approaches to Inventory Management
- Author(s): Cao, Ying
- Advisor(s): Shen, Zuo-Jun (Max)
- et al.
With advances in technology and the growing popularity of e-commerce, huge datasets and massive computational power have never been more accessible. Moreover, the rise of machine learning and distributionally robust optimization techniques brings new opportunities for more effective inventory decision making in data-rich environments. The goal of this dissertation is thus to explore data-driven approaches to inventory management problems that are efficient and practical.
We address the challenges in this field from three different angles. First, we propose a flexible model that captures real-world demand processes accurately with as few assumptions as possible. Second, we incorporate additional features into the demand model and derive robust inventory policies with out-of-sample performance guarantees under milder assumptions than those in the current literature. Finally, we explore the use of a group of decomposition algorithms to tackle the computational difficulty that increases as the data size grows. Chapters 2, 3, and 4 each address one of these three directions, respectively.
In Chapter 2, we leverage the universal approximation capability of neural network structures to approximate an arbitrarily complex autoregressive demand process without any parametric assumptions. By adopting a quantile loss in training, we allow our neural network to directly output an estimate of the critical quantile, which is precisely the inventory policy for the classical newsvendor problem. In addition, in contrast to the prevalent feedforward neural networks, which are asymptotically stationary, the special structure we choose is capable of handling nonstationary time series. To the best of our knowledge, this is the first approach that deals with nonstationary time series without any parametric assumptions or preprocessing to capture components such as trend or seasonality. Though theoretical guarantees are sacrificed due to the lack of assumptions on the underlying real process, empirical studies validate the performance of our approach on real-world nonstationary demand processes. Moreover, we establish the optimality of the myopic policy for the multi-period newsvendor problem, where unmet demand and excess inventory can be carried over to the next period, and argue that our approach is also a data-driven solution in that setting.
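The link between the quantile loss and the newsvendor policy can be illustrated with a minimal sketch. It assumes a unit backorder cost `b` and a unit holding cost `h` (hypothetical values chosen here), uses synthetic gamma-distributed demand purely for illustration, and replaces the neural network with a single constant order quantity. Under these assumptions the newsvendor cost is `(b + h)` times the quantile (pinball) loss at level `tau = b / (b + h)`, so minimizing the loss over the data recovers the empirical critical quantile.

```python
import random

def pinball_loss(demands, q, tau):
    """Average quantile (pinball) loss of order quantity q at level tau."""
    total = 0.0
    for d in demands:
        diff = d - q
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(demands)

def newsvendor_cost(demands, q, b, h):
    """Average cost: b per unit of unmet demand, h per unit of leftover stock."""
    total = 0.0
    for d in demands:
        total += b * max(d - q, 0.0) + h * max(q - d, 0.0)
    return total / len(demands)

# Synthetic demand sample (an assumption for illustration, not data from the dissertation).
random.seed(0)
demands = [random.gammavariate(2.0, 50.0) for _ in range(500)]

b, h = 4.0, 1.0          # hypothetical backorder and holding costs
tau = b / (b + h)        # critical ratio of the newsvendor problem

# The empirical pinball loss is piecewise linear in q with kinks at the samples,
# so it suffices to search over the observed demand values.
q_star = min(demands, key=lambda q: pinball_loss(demands, q, tau))
```

In the approach described above, the constant `q` would be replaced by the output of a network fed with past demand, trained on the same loss; the sketch only shows why the trained output targets the critical quantile.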
The second project, in Chapter 3, addresses the data-driven newsvendor problem from a different angle, with the goal of obtaining robust policies that also have theoretical support. We start with a simple linear demand model that incorporates information from other features related to the demand, such as price and materials. To hedge against uncertainty in the demand distribution, we apply the idea of distributionally robust optimization (DRO). We contribute to the current literature on DRO applications in supervised learning by adopting a fixed design interpretation of the features. Thus, as in the neural network approach, we are able to relax the assumption of independent and identically distributed sample points, which makes the method more applicable in real-world scenarios. Then, by leveraging results from fixed design linear regression, we propose a two-step framework to obtain a newsvendor solution. Moreover, we choose the Wasserstein metric to construct the ambiguity set of candidate distributions; with this choice, our data-driven policy can be computed in polynomial time and attains both finite-sample and asymptotic performance guarantees.
Finally, in Chapter 4, we turn to the practical issues of implementing such data-driven approaches when massive datasets are available. Specifically, we consider a group of decomposition algorithms that are suitable for large-scale multi-block convex optimization problems with linear constraints. This problem setting covers many applications in machine learning and data-driven decision making. We focus on a special case of such algorithms that is guaranteed to converge at a linear rate under mild conditions and whose subproblems can be solved in parallel. We modify an adaptive parameter tuning scheme to achieve faster convergence in practice. Lastly, we show that when the parameters are chosen appropriately, global convergence can be established even if the primal subproblems are solved only approximately.
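To make the multi-block setting concrete, the following is a minimal sketch of classical dual decomposition (dual ascent) on a toy problem. It is a stand-in chosen for brevity, not the specific algorithm studied in Chapter 4, and the function name, objective, and step size are assumptions made for illustration. The problem is to minimize the sum of 0.5 * (x_i - c_i)^2 over blocks i subject to the single linear constraint sum_i x_i = b. Given the multiplier, each block update is independent of the others, which is the property that permits parallel subproblems.

```python
def dual_decomposition(c, b, step, iters=100):
    """Dual ascent for: min sum_i 0.5*(x_i - c_i)^2  s.t.  sum_i x_i = b."""
    lam = 0.0          # multiplier of the coupling constraint
    x = list(c)
    for _ in range(iters):
        # Block step: each x_i minimizes 0.5*(x_i - c_i)^2 + lam*x_i on its own,
        # so these updates could run in parallel.
        x = [ci - lam for ci in c]
        # Dual step: gradient ascent on the multiplier using the constraint residual.
        lam += step * (sum(x) - b)
    return x, lam

c = [3.0, 1.0, 4.0, 2.0]   # hypothetical block targets
b = 6.0                    # hypothetical right-hand side of the constraint
n = len(c)

# For this toy problem the residual shrinks by the factor (1 - n*step) per
# iteration, so any step in (0, 2/n) gives geometric (linear-rate) convergence.
x, lam = dual_decomposition(c, b, step=0.5 / n)
```

For this quadratic toy problem the closed-form solution is x_i = c_i - (sum(c) - b) / n, which the iterates approach; richer objectives would require a numerical solver inside each block step.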
We recognize that our results are only preliminary attempts at achieving efficient decision making in a data-driven environment, and we hope that this dissertation can serve as a catalyst for further research in this field. Accordingly, we list a number of directions for future research in the last chapter, after the concluding remarks.