As machine learning powered decision-making becomes increasingly important in our daily lives, it is imperative to strive for fairness in the underlying data processing. We propose a pre-processing algorithm for fair data representation via which L 2 ( ℙ ) -objective supervised learning results in estimations of the Pareto frontier between prediction error and statistical disparity. Particularly, the present work applies the optimal affine transport to approach the post-processing Wasserstein barycenter characterization of the optimal fair L 2 -objective supervised learning via a pre-processing data deformation. Furthermore, we show that the Wasserstein geodesics from learning outcome marginals to their barycenter characterizes the Pareto frontier between L 2 -loss and total Wasserstein distance among the marginals. Numerical simulations underscore the advantages: (1) the pre-processing step is compositive with arbitrary conditional expectation estimation supervised learning methods and unseen data; (2) the fair representation protects the sensitive information by limiting the inference capability of the remaining data with respect to the sensitive data; (3) the optimal affine maps are computationally efficient even for high-dimensional data.