Abstract:
This dissertation develops methods that combine the advantages of discrete choice models and machine learning methods into interpretable econometric models. The aim is to enhance the predictive power of discrete choice models and their flexibility in representing unobserved heterogeneity without weakening their behavioral and economic interpretability. Specifically, this dissertation focuses on bringing machine learning into Latent Class Choice Models (LCCMs), which are widely used in the discrete choice modeling community to model the unobserved behavioral heterogeneity of a population through discrete segments (or latent classes). An LCCM consists of two sub-components: a class membership model that formulates the probability of an individual belonging to a specific segment/class, and a class-specific choice model that estimates the choice probabilities within that class.
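The two-component structure described above can be illustrated with a minimal sketch. It assumes a multinomial logit form for both the class membership model and the class-specific choice models, which is the common textbook LCCM specification; the function and variable names (`lccm_choice_probs`, `gamma`, `beta`) and the random data are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def softmax(v, axis=-1):
    # Numerically stable softmax along the given axis.
    v = v - v.max(axis=axis, keepdims=True)
    e = np.exp(v)
    return e / e.sum(axis=axis, keepdims=True)

def lccm_choice_probs(x, z, gamma, beta):
    """Marginal choice probabilities of a simple LCCM (illustrative only).

    x:     (N, D)    individual characteristics for class membership
    z:     (N, J, M) attributes of the J alternatives
    gamma: (K, D)    class membership coefficients (assumed logit form)
    beta:  (K, M)    class-specific taste coefficients
    """
    class_probs = softmax(x @ gamma.T, axis=1)          # (N, K): P(class k | x)
    utilities = np.einsum("njm,km->nkj", z, beta)       # (N, K, J): systematic utilities
    choice_probs = softmax(utilities, axis=2)           # (N, K, J): P(choice j | class k)
    # Mix the class-specific choice probabilities with the membership probabilities.
    return np.einsum("nk,nkj->nj", class_probs, choice_probs)

# Hypothetical data: 5 individuals, 4 alternatives, 2 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
z = rng.normal(size=(5, 4, 2))
gamma = rng.normal(size=(2, 3))
beta = rng.normal(size=(2, 2))
probs = lccm_choice_probs(x, z, gamma, beta)
```

Each row of `probs` is a probability distribution over the alternatives, obtained by weighting the class-specific choice probabilities by the class membership probabilities.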
The dissertation develops two new Latent Class Choice Models with a flexible class membership component. In each of the two proposed models, the latent classes are defined using a different machine learning clustering technique as opposed to the random utility specification of the LCCM. The first proposed model is titled Gaussian-Bernoulli Mixture – Latent Class Choice Model (GBM-LCCM) while the second proposed model is called Gaussian Process – Latent Class Choice Model (GP-LCCM).
The GBM-LCCM formulates the latent classes using model-based mixture models as an alternative to the traditional random utility specification. The aim is to compare the two approaches on various measures, including prediction accuracy and representation of heterogeneity in the choice process. Mixture models are parametric model-based clustering techniques that have been widely used in areas such as machine learning, data mining, and pattern recognition for clustering and classification problems. An Expectation-Maximization (EM) algorithm is derived for the estimation of the proposed model. Using two different case studies on travel mode choice behavior, the proposed model is compared to its traditional discrete choice model counterpart, the LCCM, on the basis of parameter estimate signs, values of time, statistical goodness-of-fit measures, and cross-validation tests. Results show that mixture models improve the overall performance of LCCMs: out-of-sample prediction accuracy improves by around 3%, heterogeneity is represented better and more flexibly, and parameter estimate signs are more reasonable, without weakening the behavioral and economic interpretability of the choice models.
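As a rough illustration of how an EM algorithm might treat a Gaussian-Bernoulli mixture inside an LCCM, the sketch below computes E-step responsibilities. It is not the derivation from the dissertation. It assumes diagonal Gaussian covariances for the continuous covariates, independent Bernoulli terms for the binary covariates, and that the class-specific likelihood of each observed choice (`choice_lik`) has already been computed; all names and data are hypothetical.

```python
import numpy as np

def e_step(xc, xb, choice_lik, pi, mu, var, p):
    """Posterior class responsibilities (illustrative sketch).

    xc:         (N, Dc) continuous covariates
    xb:         (N, Db) binary covariates coded 0/1
    choice_lik: (N, K)  likelihood of each observed choice under each class
    pi:         (K,)    mixing weights
    mu, var:    (K, Dc) Gaussian means and diagonal variances (assumption)
    p:          (K, Db) Bernoulli success probabilities
    """
    diff = xc[:, None, :] - mu[None, :, :]                        # (N, K, Dc)
    log_gauss = -0.5 * (np.log(2 * np.pi * var)[None] + diff**2 / var[None]).sum(axis=2)
    log_bern = (xb[:, None, :] * np.log(p)[None]
                + (1 - xb[:, None, :]) * np.log1p(-p)[None]).sum(axis=2)
    log_joint = np.log(pi)[None] + log_gauss + log_bern + np.log(choice_lik)
    log_joint -= log_joint.max(axis=1, keepdims=True)             # stabilize before exp
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)                 # rows sum to 1

# Hypothetical data: 6 individuals, 2 classes.
rng = np.random.default_rng(1)
xc = rng.normal(size=(6, 2))
xb = rng.integers(0, 2, size=(6, 3)).astype(float)
choice_lik = rng.uniform(0.1, 0.9, size=(6, 2))
pi = np.array([0.4, 0.6])
mu = rng.normal(size=(2, 2))
var = np.ones((2, 2))
p = rng.uniform(0.2, 0.8, size=(2, 3))
resp = e_step(xc, xb, choice_lik, pi, mu, var, p)
```

In an M-step, responsibilities of this kind would typically serve as weights when updating the mixture parameters and the class-specific choice parameters.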
The second model, the GP-LCCM, formulates the latent classes using Gaussian Processes (GPs), a nonparametric class of probabilistic machine learning. Gaussian Processes are kernel-based algorithms that incorporate expert knowledge by assuming priors over latent functions rather than priors over parameters, which makes them more flexible in addressing nonlinear problems. By integrating a Gaussian Process within the LCCM structure, we aim to improve discrete representations of unobserved heterogeneity. The proposed model assigns individuals probabilistically to behaviorally homogeneous clusters (latent classes) using GPs and simultaneously estimates class-specific choice models by relying on random utility models. Furthermore, we derive and implement an Expectation-Maximization algorithm to jointly estimate the hyper-parameters of the GP kernel function and the class-specific choice parameters by relying on a Laplace approximation and gradient-based numerical optimization methods, respectively. The model is tested on three different mode choice applications and compared against the traditional LCCM and the proposed GBM-LCCM. Results show that the GP-LCCM allows for a more complex and flexible representation of heterogeneity and improves in-sample fit and out-of-sample predictive power by up to 7.6% and 8.8%, respectively. Moreover, behavioral and economic interpretability is maintained at the class-specific choice model level, and local interpretation of the latent classes can still be achieved, although the nonparametric nature of GPs lessens the transparency of the class membership component.
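The idea of placing a prior over latent functions rather than over parameters can be sketched as follows. The example draws one latent function per class from a zero-mean GP prior and maps the function values to class membership probabilities with a softmax. The squared-exponential kernel, the softmax link, and all names and values here are assumptions made for illustration; the abstract does not state which kernel or link the dissertation uses.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    # Squared-exponential kernel (an assumed choice, not confirmed by the source).
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def softmax(v, axis=-1):
    v = v - v.max(axis=axis, keepdims=True)
    e = np.exp(v)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical data: 8 individuals with 2 characteristics, 3 latent classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 2))

# Prior covariance between individuals; a small jitter keeps it positive definite.
K = rbf_kernel(x, x) + 1e-6 * np.eye(8)

# Draw one latent function per class from the GP prior: f = L @ eps, with K = L L^T.
L = np.linalg.cholesky(K)
f = L @ rng.normal(size=(8, 3))          # (N, classes) latent function values

# Map latent values to class membership probabilities.
membership = softmax(f, axis=1)
```

Individuals with similar characteristics receive correlated latent values under this prior, so their membership probabilities tend to be similar without any parametric functional form being imposed on the class membership model.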
The two proposed models are also compared against the LCCM in terms of their forecasting capabilities. Results show that both the GBM-LCCM and the GP-LCCM are capable of providing meaningful forecasts that are, to some extent, similar to those of the traditional LCCM. A demand sensitivity analysis with respect to the cost of some travel mode alternatives is also conducted, and changes of a similar order are obtained for the proposed models and the LCCM in terms of in-sample fit, out-of-sample prediction accuracy, and aggregate forecasts. The sensitivity analysis also highlights the advantage of the proposed models in identifying a higher number of classes than the LCCM, which provides a more in-depth understanding of the behavioral heterogeneity within a population and of the behavioral responses of the different classes to new policies.
Advisor(s):
Abou-Zeid, Maya; Kaysi, Isam; Pereira, Francisco Camara
Description:
Sadek, Salah;
Rodrigues, Filipe;
Awad, Mariette;
Chalak, Ali;
Farooq, Bilal