User:Quaenuncabibis/Biogeme

Biogeme is an open-source software package dedicated to the estimation of discrete choice models. It implements optimization algorithms to perform maximum likelihood estimation of the model parameters. Users specify the model using a modeling language. The current version of the software is distributed as a package for the Python programming language.

History
The development of Biogeme as a freeware package started in 2000. Its precursor was a software package called "Hielow". Developed in the 1990s for Windows, Hielow was designed to estimate nested logit models and came with a graphical user interface.

Versions
Various versions of Biogeme have been released over the years. Each of the three major releases was a complete reimplementation of the software, intended to exploit the most up-to-date technology.

Version 1.0
BisonBiogeme, released in 2000, is stand-alone software written in C++. Its simple modeling language is built with a parser generator called Bison.

Version 2.0
PythonBiogeme, released in 2010, is stand-alone software written in C++ that uses Python as its modeling language.

Version 3.0
PandasBiogeme, released in 2018, is a Python package written in both C++ and Python. It uses Python for the modeling language and Pandas for data management.

Development
Biogeme is developed and maintained by Michel Bierlaire at the Transport and Mobility Laboratory of EPFL (École Polytechnique Fédérale de Lausanne). The latest version of PandasBiogeme is a Python package, developed in Python and in C++.

Distribution
The source code is available on GitHub, and the Python package is available on the Python Package Index. Material, including example models, text and video tutorials, and real data, is available on the Biogeme webpage.

Features
Biogeme gives the modeler considerable flexibility to code the (log) likelihood function of a model using the features of the Python language, together with several utilities that are specific to Biogeme. The software calculates the analytical derivatives of the log likelihood function by automatic differentiation. A library of choice models is directly available from the package. It includes the logit model, the nested logit model, the cross-nested logit model, multivariate extreme value models, and mixtures of logit models. A tutorial is available for the specification of hybrid choice models, that is, choice models with latent variables. Biogeme can handle both cross-sectional and panel data.
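To illustrate what the log likelihood function of a logit model computes, the following is a minimal sketch in plain Python. It does not use the Biogeme API. The function names `logit_probability` and `log_likelihood` and the toy utilities are assumptions made for this illustration only.

```python
import math

def logit_probability(utilities, availability, chosen):
    """Probability of `chosen` under a logit model.

    utilities:    dict mapping alternative id -> utility value
    availability: dict mapping alternative id -> 1 (available) or 0
    """
    if not availability[chosen]:
        raise ValueError("the chosen alternative must be available")
    # Subtract the largest available utility for numerical stability.
    v_max = max(v for alt, v in utilities.items() if availability[alt])
    denominator = sum(
        math.exp(v - v_max) for alt, v in utilities.items() if availability[alt]
    )
    return math.exp(utilities[chosen] - v_max) / denominator

def log_likelihood(observations):
    """Sum of log choice probabilities over (utilities, availability, chosen) triples."""
    return sum(math.log(logit_probability(v, av, c)) for v, av, c in observations)

# Toy data (hypothetical): three alternatives, two observations.
observations = [
    ({1: -1.0, 2: -0.5, 3: -0.2}, {1: 1, 2: 1, 3: 1}, 3),
    ({1: -0.3, 2: -0.8, 3: -1.5}, {1: 1, 2: 1, 3: 0}, 1),
]
print(log_likelihood(observations))
```

In an actual estimation, the utilities would be functions of unknown parameters, and an optimization algorithm would search for the parameter values that maximize this sum.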

It also offers a simulation feature that applies a previously estimated model to the database in order to derive various indicators, such as consumer surplus, elasticities, and market shares.
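Two of these indicators can be sketched in plain Python, as a rough illustration rather than as Biogeme's own implementation. The sketch assumes that the choice probabilities have already been computed for each observation. The elasticity formula shown holds only for a logit model in which the attribute enters the utility linearly. The function names and the numbers are hypothetical.

```python
def market_shares(probabilities):
    """Average the individual choice probabilities for each alternative.

    probabilities: list of dicts, one per observation, mapping
                   alternative id -> predicted choice probability
    """
    n = len(probabilities)
    alternatives = probabilities[0].keys()
    return {alt: sum(p[alt] for p in probabilities) / n for alt in alternatives}

def logit_direct_elasticity(beta, x, probability):
    """Direct point elasticity of a logit probability with respect to an
    attribute x that enters the utility linearly with coefficient beta."""
    return beta * x * (1.0 - probability)

# Hypothetical predicted probabilities for three observations.
predicted = [
    {"train": 0.2, "metro": 0.5, "car": 0.3},
    {"train": 0.1, "metro": 0.6, "car": 0.3},
    {"train": 0.3, "metro": 0.4, "car": 0.3},
]
shares = market_shares(predicted)
print(shares)
print(logit_direct_elasticity(beta=-1.2, x=0.5, probability=0.2))
```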

Biogeme can also handle models with random parameters, which it estimates by simulated maximum likelihood using Monte-Carlo simulation. A library of draw generation functions (random, Halton, antithetic) is available in the package.
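The idea behind Halton and antithetic draws can be sketched as follows. This is an illustration using only the Python standard library, not the draw generators shipped with Biogeme, whose details may differ. The function names and the simple binary logit with one normally distributed coefficient are assumptions made for the example.

```python
import math
from statistics import NormalDist

def halton(index, base):
    """Return the `index`-th element (index >= 1) of the Halton sequence in `base`."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def antithetic(uniform_draws):
    """Pair each uniform draw u with its mirror image 1 - u."""
    return list(uniform_draws) + [1.0 - u for u in uniform_draws]

def simulated_probability(mean, std, n_draws=100):
    """Average a binary logit probability over draws of a coefficient
    that is normally distributed with the given mean and std."""
    uniforms = antithetic([halton(i, 2) for i in range(1, n_draws + 1)])
    normal = NormalDist()
    total = 0.0
    for u in uniforms:
        beta = mean + std * normal.inv_cdf(u)   # draw of the random parameter
        total += 1.0 / (1.0 + math.exp(-beta))  # binary logit probability
    return total / len(uniforms)

print([halton(i, 2) for i in range(1, 5)])
print(simulated_probability(mean=0.5, std=1.0))
```

In simulated maximum likelihood estimation, an average of this kind replaces the integral over the distribution of the random parameter in each observation's contribution to the likelihood.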

The documentation consists of videos, reports, and online documentation of the code.

Example
The following example illustrates how to use Biogeme to estimate a logit model with three alternatives. It uses the Swissmetro dataset, which was collected in a stated preference survey in Switzerland in 1998 and is used for teaching purposes. The code is organized in several parts:

 * Importing the modules.
 * Reading the data.
 * Removing some observations from the data set.
 * Defining the list of parameters to be estimated.
 * Defining new variables.
 * Defining the utility functions.
 * Associating the utility functions with the numbering of alternatives.
 * Associating the availability conditions with the alternatives.
 * Defining the contribution of each observation to the log likelihood function. Here, a logit model is considered.
 * Creating the Biogeme object.
 * Estimating the parameters.
 * Obtaining the results as a Pandas table.

The output is

               Value   Std err      t-test   p-value  Rob. Std err  Rob. t-test  Rob. p-value
ASC_CAR    -0.154603  0.043235   -3.575840  0.000349      0.058163    -2.658079      0.007859
ASC_TRAIN  -0.701147  0.054874  -12.777443  0.000000      0.082562    -8.492375      0.000000
B_COST     -1.083768  0.051830  -20.910063  0.000000      0.068224   -15.885339      0.000000
B_TIME     -1.277885  0.056883  -22.464979  0.000000      0.104255   -12.257328      0.000000
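The columns of this output are related to one another. As a minimal sketch, the following plain Python snippet recomputes the t-test and p-value entries of the ASC_CAR row from its estimate and standard errors. It assumes that each t-test is the ratio of the estimate to its standard error and that the p-values are two-sided and based on the standard normal distribution. The function names are hypothetical, and the snippet is not part of Biogeme.

```python
import math

def t_statistic(value, std_err):
    """Ratio of the estimate to its standard error."""
    return value / std_err

def two_sided_p_value(t):
    """Two-sided p-value under a standard normal distribution."""
    return math.erfc(abs(t) / math.sqrt(2.0))

# ASC_CAR row of the output above.
value, std_err, rob_std_err = -0.154603, 0.043235, 0.058163
t = t_statistic(value, std_err)
rob_t = t_statistic(value, rob_std_err)
print(t, two_sided_p_value(t))
print(rob_t, two_sided_p_value(rob_t))
```

Up to rounding of the inputs, the results agree with the t-test, p-value, Rob. t-test, and Rob. p-value columns reported for ASC_CAR.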

Other software packages for discrete choice models

 * MIXL: Simulated Maximum Likelihood Estimation of Mixed Logit Models for Large Datasets, by Joseph Molloy.
 * LARCH: A Freeware Package for Estimating Discrete Choice Models, by Jeffrey Newman.
 * Apollo: a flexible, powerful and customisable freeware package for choice model estimation and application, by Stephane Hess and David Palma.
 * ALOGIT: Software for estimating and analysing generalised logit choice models, by Andrew Daly.
 * NLOGIT: Estimation and analysis tools for multinomial choice modeling, by William Greene.

Literature

 * Bierlaire (2020) "A short introduction to PandasBiogeme", Report TRANSP-OR 200605, Transport and Mobility Laboratory, École Polytechnique Fédérale de Lausanne.
 * Bierlaire (2018) "Calculating indicators with PandasBiogeme", Report TRANSP-OR 181223, Transport and Mobility Laboratory, École Polytechnique Fédérale de Lausanne.
 * Bierlaire (2019) "Monte-Carlo integration with PandasBiogeme", Report TRANSP-OR 191231, Transport and Mobility Laboratory, École Polytechnique Fédérale de Lausanne.
 * Bierlaire (2018) "Estimating choice models with latent variables with PandasBiogeme", Report TRANSP-OR 181227, Transport and Mobility Laboratory, École Polytechnique Fédérale de Lausanne.