Predictive methods for surgery duration

Predictions of surgery duration (SD) are used to schedule planned (elective) surgeries so that the utilization rate of operating theatres is optimized (maximized subject to policy constraints). An example of such a constraint is that a pre-specified tolerance for the percentage of postponed surgeries (due to unavailable operating room (OR) or recovery room space) must not be exceeded. Because SD prediction and surgery scheduling are tightly linked, scientific research on scheduling methods most often also addresses SD predictive methods, and vice versa. Durations of surgeries are known to have large variability. Therefore, SD predictive methods attempt, on the one hand, to reduce variability (via stratification and covariates, as detailed later) and, on the other, to employ the best available methods to produce SD predictions. The more accurate the predictions, the better the scheduling of surgeries (in terms of the required OR utilization optimization).

An SD predictive method would ideally deliver a predicted SD statistical distribution (specifying the distribution and estimating its parameters). Once the SD distribution is completely specified, various types of information can be extracted from it, for example, the most probable duration (mode) or the probability that SD does not exceed a certain threshold value. In less ambitious circumstances, the predictive method would at least predict some basic properties of the distribution, such as measures of location (mean, median, mode) and of scale (standard deviation or coefficient of variation, CV). Certain percentiles of the distribution may also be the objective of estimation and prediction. Expert estimates, empirical histograms of the distribution (based on historical computer records), and data mining and knowledge discovery techniques often replace the ideal objective of fully specifying the theoretical SD distribution.
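To illustrate what can be extracted once a distribution is fully specified, the following minimal sketch assumes a two-parameter lognormal model for SD with already-estimated parameters. The function name `lognormal_summary` and the parameter values are hypothetical, chosen only for illustration; the formulas are the standard closed-form lognormal results.

```python
import math
from statistics import NormalDist


def lognormal_summary(mu, sigma, threshold, percentile=0.9):
    """Summaries of a two-parameter lognormal SD model (an assumed model).

    mu and sigma are the mean and standard deviation of log(SD);
    they are assumed to have been estimated already.
    """
    std_normal = NormalDist()
    return {
        "mode": math.exp(mu - sigma**2),
        "median": math.exp(mu),
        "mean": math.exp(mu + sigma**2 / 2),
        "cv": math.sqrt(math.exp(sigma**2) - 1),
        # Probability that SD does not exceed the threshold.
        "p_within_threshold": std_normal.cdf((math.log(threshold) - mu) / sigma),
        # Duration below which the requested share of surgeries falls.
        "percentile": math.exp(mu + sigma * std_normal.inv_cdf(percentile)),
    }


# Hypothetical parameters: median of 90 minutes, sigma of 0.4 on the log scale.
summary = lognormal_summary(mu=math.log(90), sigma=0.4, threshold=120)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For a right-skewed distribution such as the lognormal, the mode, median and mean differ, so the choice of which one to report as "the" predicted duration matters for scheduling.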

Reducing SD variability prior to prediction (as alluded to earlier) is commonly regarded as an integral part of an SD predictive method. Most probably, SD has, in addition to random variation, a systematic component; that is, the SD distribution may be affected by various related factors (such as medical specialty, patient condition or age, professional experience and size of the medical team, number of surgeries a surgeon has to perform in a shift, and type of anesthetic administered). Accounting for these factors (via stratification or covariates) would diminish SD variability and enhance the accuracy of the predictive method. Incorporating expert estimates (like those of surgeons) in the predictive model may also help diminish the uncertainty of data-based SD prediction. Often, statistically significant covariates (also referred to as factors, predictors or explanatory variables) are first identified (for example, via simple techniques like linear regression and knowledge discovery), and only later are more advanced big-data techniques, like artificial intelligence and machine learning, employed to produce the final prediction.

Literature reviews of studies addressing surgery scheduling most often also address the related SD predictive methods. Here are some examples (latest first).

The rest of this entry reviews various perspectives associated with the process of producing SD predictions: SD statistical distributions, methods to reduce SD variability (stratification and covariates), predictive models and methods, and surgery as a work-process. The last of these addresses the characterization of surgery as a work-process (repetitive, semi-repetitive or memoryless) and its effect on SD distributional shape.

Theoretical models
The most straightforward SD predictive method comprises specifying a set of existing statistical distributions and, based on available data and distribution-fitting criteria, selecting the best-fitting distribution. A large volume of comparative studies attempts to select the best-fitting models for the SD distribution. The distributions most frequently addressed are the normal, the three-parameter lognormal, the gamma (including the exponential) and the Weibull. Less frequent "trial" distributions (for fitting purposes) are the loglogistic, Burr, generalized gamma and piecewise-constant hazard models. Attempts to present the SD distribution as a mixture distribution have also been reported (normal-normal, lognormal-lognormal and Weibull–Gamma mixtures). Occasionally, predictive methods are developed that are valid for a general SD distribution, or more advanced techniques, like kernel density estimation (KDE), are used instead of the traditional methods (like distribution-fitting or regression-oriented methods). There is broad consensus that the three-parameter lognormal best describes most SD distributions. A new family of SD distributions, which includes the normal, lognormal and exponential as exact special cases, has recently been developed. Here are some examples (latest first).
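The selection step described above can be sketched as follows. This is a simplified illustration, not a method taken from the cited studies: it compares only three candidates (normal, two-parameter lognormal and exponential) because their maximum-likelihood fits have closed forms, and it assumes the Akaike information criterion (AIC) as the fitting criterion. The three-parameter lognormal, gamma and Weibull would need numerical optimization. The data are synthetic.

```python
import math
import random
from statistics import fmean


def normal_loglik(xs):
    """Maximized log-likelihood of a normal model (MLE mean and variance)."""
    n = len(xs)
    m = fmean(xs)
    var = sum((x - m) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(xs):
    """Normal log-likelihood of log(x), plus the change-of-variable term."""
    logs = [math.log(x) for x in xs]
    return normal_loglik(logs) - sum(logs)


def exponential_loglik(xs):
    """Maximized log-likelihood of an exponential model (rate = 1 / mean)."""
    return -len(xs) * (math.log(fmean(xs)) + 1)


def rank_by_aic(xs):
    """Return (model name, AIC) pairs, best-fitting (lowest AIC) first."""
    fits = {
        "normal": (normal_loglik(xs), 2),          # (log-likelihood, n. of parameters)
        "lognormal": (lognormal_loglik(xs), 2),
        "exponential": (exponential_loglik(xs), 1),
    }
    aic = {name: 2 * k - 2 * ll for name, (ll, k) in fits.items()}
    return sorted(aic.items(), key=lambda item: item[1])


# Synthetic durations in minutes, drawn from a lognormal purely for illustration.
rng = random.Random(0)
sample = [rng.lognormvariate(4.5, 0.6) for _ in range(1000)]
for name, value in rank_by_aic(sample):
    print(f"{name}: AIC = {value:.1f}")
```

Other criteria (for example, goodness-of-fit tests or the Bayesian information criterion) could be substituted in `rank_by_aic` without changing the overall structure.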

Using historical records to specify an empirical distribution
As an alternative to specifying a theoretical distribution as a model for SD, one may use historical records to construct a histogram of the available data, and use the related empirical distribution function (the cumulative plot) to estimate various required percentiles (like the median or the third quartile). Historical records or expert estimates may also be used to specify location and scale parameters without specifying a model for the SD distribution.
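A minimal sketch of this empirical approach is given below. The durations are made-up values standing in for historical records, and the percentile rule used (the smallest observed duration whose cumulative proportion reaches the requested level) is one of several common conventions, chosen here as an assumption.

```python
import math
from bisect import bisect_right


def ecdf(durations):
    """Return the empirical distribution function F(t) = share of records <= t."""
    xs = sorted(durations)
    n = len(xs)

    def F(t):
        return bisect_right(xs, t) / n

    return F


def empirical_percentile(durations, p):
    """Smallest observed duration d such that F(d) >= p, for 0 < p <= 1."""
    xs = sorted(durations)
    k = max(1, math.ceil(p * len(xs)))
    return xs[k - 1]


# Hypothetical historical durations, in minutes.
records = [60, 75, 80, 90, 95, 110, 120, 150, 180, 240]
F = ecdf(records)
print("Share of surgeries lasting at most 100 minutes:", F(100))
print("Median:", empirical_percentile(records, 0.5))
print("Third quartile:", empirical_percentile(records, 0.75))
```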

Data mining methods
These methods have recently gained traction as an alternative to specifying in advance a single theoretical model to describe the SD distribution for all types of surgeries. Examples are detailed below ("Predictive models and methods").

Reducing SD variability (stratification and covariates)
To enhance SD prediction accuracy, two major approaches are pursued to reduce SD data variability: stratification and covariates (the latter incorporated in the predictive model). Covariates are often also referred to in the literature as factors, effects, explanatory variables or predictors.

Stratification
The term means that available data are divided (stratified) into subgroups according to a criterion statistically shown to affect the SD distribution. The predictive method then aims to produce SD predictions for the specified subgroups, within which SD has appreciably reduced variability. Examples of stratification criteria are medical specialty, procedure code systems, patient-severity condition, or hospital/surgeon/technology (with the resulting models referred to as hospital-specific, surgeon-specific or technology-specific). Examples of procedure code systems used for this purpose are Current Procedural Terminology (CPT) and the ICD-9-CM Diagnosis and Procedure Codes (International Classification of Diseases, 9th Revision, Clinical Modification).
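The effect of stratification on variability can be sketched as follows. The record layout (a dictionary with a `procedure` label and a `minutes` field), the procedure labels and the durations are all hypothetical, and the coefficient of variation (CV) is used here as the assumed measure of variability.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def cv(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(values) / fmean(values)


def stratify(records, key):
    """Group the durations by the value of the stratification criterion."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["minutes"])
    return dict(groups)


def stratum_summary(records, key):
    """Per-stratum sample size, mean duration and CV."""
    return {
        level: {"n": len(v), "mean": fmean(v), "cv": cv(v)}
        for level, v in stratify(records, key).items()
    }


# Hypothetical records: two procedure types with very different typical durations.
records = (
    [{"procedure": "proc_A", "minutes": m} for m in (40, 45, 50, 55)]
    + [{"procedure": "proc_B", "minutes": m} for m in (180, 200, 220, 240)]
)
print("Pooled CV:", round(cv([r["minutes"] for r in records]), 3))
for level, stats in stratum_summary(records, "procedure").items():
    print(level, {k: round(v, 3) for k, v in stats.items()})
```

In this made-up example the pooled CV is several times larger than the CV within either stratum, which is the kind of reduction stratification aims for.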

Covariates (factors, effects, explanatory variables, predictors)
This approach to reducing variability incorporates covariates in the prediction model. The same predictive method may then be applied more generally, with covariates assuming different values for different levels of the factors shown to affect the SD distribution (usually by affecting a location parameter, like the mean, and, more rarely, also a scale parameter, like the variance). A most basic method of incorporating covariates into a predictive method is to assume that SD is lognormally distributed. The logged data (taking the log of SD data) then represent a normally distributed population, allowing the use of multiple linear regression to detect statistically significant factors. Other regression methods, which do not require data normality or are robust to its violation (generalized linear models, nonlinear regression), and artificial intelligence methods have also been used (references sorted chronologically, latest first).
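The log-transform-then-regress idea can be sketched as below. This is an illustrative ordinary least squares fit written with the standard library only; the covariates (patient age and an emergency indicator), the data and all function names are hypothetical. The sketch estimates coefficients only and omits the significance tests that a full analysis would include.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_log_linear(covariates, durations):
    """Least squares fit of log(duration) on the covariates, with an intercept."""
    X = [[1.0] + list(row) for row in covariates]
    y = [math.log(d) for d in durations]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_median(coefs, row):
    """Back-transform the linear predictor; under lognormality this is the median."""
    return math.exp(coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)))


# Hypothetical data: (age in years, emergency indicator) and duration in minutes.
covariates = [(30, 0), (50, 1), (70, 0), (40, 1), (60, 0), (55, 1)]
durations = [72, 118, 150, 101, 125, 140]
coefs = fit_log_linear(covariates, durations)
print("Coefficients (intercept, age, emergency):", [round(c, 4) for c in coefs])
print("Predicted median for a 45-year-old elective case:",
      round(predict_median(coefs, (45, 0)), 1))
```

Because the regression is on the log scale, exponentiating the fitted value gives the median rather than the mean of the duration; an estimate of the mean would also need the residual variance.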

Predictive models and methods
Following is a representative (non-exhaustive) list of models and methods employed to produce SD predictions (in no particular order). These, or a mixture thereof, may be found in the sample of representative references below:

Linear regression (LR); Multivariate adaptive regression splines (MARS); Random forests (RF); Machine learning; Data mining (rough sets, neural networks); Knowledge discovery in databases (KDD); Data warehouse model (used to extract data from various, possibly non-interacting, databases); Kernel density estimation (KDE); Jackknife; Monte Carlo simulation.
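As an illustration of one item from this list, the following sketch shows kernel density estimation with a Gaussian kernel. The bandwidth uses a common normal-reference rule of thumb, which is an assumption made here rather than a choice taken from the literature on SD; the data and function names are hypothetical.

```python
from statistics import NormalDist, stdev


def rule_of_thumb_bandwidth(xs):
    """Normal-reference rule of thumb: 1.06 * sample std * n^(-1/5)."""
    return 1.06 * stdev(xs) * len(xs) ** (-1 / 5)


def kde_pdf(xs, t, h):
    """Estimated density at t: average of Gaussian kernels centred on the data."""
    return sum(NormalDist(x, h).pdf(t) for x in xs) / len(xs)


def kde_cdf(xs, t, h):
    """Estimated probability that SD does not exceed t."""
    return sum(NormalDist(x, h).cdf(t) for x in xs) / len(xs)


# Hypothetical historical durations, in minutes.
records = [60, 75, 80, 90, 95, 110, 120, 150, 180, 240]
h = rule_of_thumb_bandwidth(records)
print("Bandwidth:", round(h, 1))
print("Estimated P(SD <= 120):", round(kde_cdf(records, 120, h), 3))
```

A Gaussian kernel assigns a small amount of probability to negative durations; applying the estimator to log durations is one way to avoid this.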

Surgery as work-process (repetitive, semi-repetitive, memoryless)
Surgery is a work-process and, as such, it requires inputs to achieve the desired output, a recuperating post-surgery patient. Examples of work-process inputs, from production engineering, are the five M's: "money, manpower, materials, machinery, methods" (where "manpower" refers to the human element in general). Like all work-processes in industry and the services, surgeries have a certain characteristic work-content, which may be unstable to various degrees (within the defined statistical population at which the prediction method aims). This instability is a source of SD variability that affects SD distributional shape (from the normal distribution, for purely repetitive processes, to the exponential, for purely memoryless processes). Ignoring this source may confound its variability with that due to covariates (as detailed earlier). Therefore, just as all work-processes may be partitioned into three types (repetitive, semi-repetitive, memoryless), surgeries may be similarly partitioned. A stochastic model that takes account of work-content instability has recently been developed; it delivers a family of distributions with the normal/lognormal and exponential as exact special cases. This model was applied to construct a statistical process control scheme for SD.