Hyperbolastic functions



The hyperbolastic functions, also known as hyperbolastic growth models, are mathematical functions used in medical statistical modeling. These models were originally developed to capture the growth dynamics of multicellular tumor spheres, and were introduced in 2005 by Mohammad Tabatabai, David Williams, and Zoran Bursac. The precision of hyperbolastic functions in modeling real-world problems is due in part to the flexibility of their point of inflection. These functions can be used in a wide variety of modeling problems, such as tumor growth, stem cell proliferation, pharmacokinetics, cancer growth, the sigmoid activation function in neural networks, and epidemiological disease progression or regression.

The hyperbolastic functions can model both growth and decay curves until the population reaches its carrying capacity. Due to their flexibility, these models have diverse applications in the medical field, with the ability to capture disease progression with an intervening treatment. As the figures indicate, hyperbolastic functions can fit a sigmoidal curve, in which the slowest rates occur at the early and late stages. In addition to sigmoidal shapes, they can also accommodate biphasic situations where medical interventions slow or reverse disease progression; when the effect of the treatment vanishes, the disease begins the second phase of its progression until it reaches its horizontal asymptote.

One of the main characteristics of these functions is that they can not only fit sigmoidal shapes, but can also model biphasic growth patterns that other classical sigmoidal curves cannot adequately capture. This distinguishing feature has advantageous applications in various fields including medicine, biology, economics, engineering, agronomy, and computer-aided system theory.

Function H1
The hyperbolastic rate equation of type I, denoted H1, is given by


 * $$\frac{dP(x)}{dx}= \frac{P(x)\left(M-P(x)\right)}{M} \left(\delta + \frac{\theta}{\sqrt{1+x^2}}\right),$$

where $$x$$ is any real number and $$P\left(x \right)$$ is the population size at $$x$$. The parameter $$M$$ represents carrying capacity, and parameters $$\delta$$ and $$\theta$$ jointly represent growth rate. The parameter $$\theta$$ gives the distance from a symmetric sigmoidal curve. Solving the hyperbolastic rate equation of type I for $$P \left(x \right)$$ gives


 * $$P(x)= \frac{M}{1+ \alpha e^{-\delta x- \theta \operatorname{arsinh}(x)}},$$

where $$\operatorname{arsinh}$$ is the inverse hyperbolic sine function. If one desires to use the initial condition $$P\left(x_0\right)=P_0$$, then $$\alpha$$ can be expressed as


 * $$\alpha=\frac{M-P_0}{P_0} e^{\delta x_0+ \theta \operatorname{arsinh}(x_0)}$$.

If $$x_0=0$$, then $$\alpha$$ reduces to


 * $$\alpha= \frac{M-P_0}{P_0}$$.

In the event that a vertical shift is needed to give a better model fit, one can add the shift parameter $$\zeta$$, which would result in the following formula


 * $$P(x)= \frac{M}{1+ \alpha e^{-\delta x- \theta \operatorname{arsinh}(x)}} + \zeta$$.

The hyperbolastic function of type I generalizes the logistic function: if the parameter $$\theta = 0$$, it reduces to the logistic function. The standard hyperbolastic function of type I is


 * $$P(x)= \frac{1}{1+ e^{-x-\theta \operatorname{arsinh}(x)}}$$.
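As a numerical sanity check, the type I formulas above can be sketched in Python (the function names are mine; the parameterization mirrors the equations, with the vertical shift $$\zeta$$ optional):

```python
import math

def hyperbolastic_h1(x, M=1.0, delta=1.0, theta=0.0, alpha=1.0, zeta=0.0):
    """P(x) = M / (1 + alpha*exp(-delta*x - theta*arsinh(x))) + zeta."""
    return M / (1.0 + alpha * math.exp(-delta * x - theta * math.asinh(x))) + zeta

def logistic(x, M=1.0, delta=1.0, alpha=1.0):
    """Ordinary logistic curve, the theta = 0 special case of H1."""
    return M / (1.0 + alpha * math.exp(-delta * x))
```

With $$\theta = 0$$ the two functions coincide, illustrating how H1 generalizes the logistic curve; setting $$\alpha=(M-P_0)/P_0$$ recovers $$P(0)=P_0$$.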

Function H2
The hyperbolastic rate equation of type II, denoted by H2, is defined as


 * $$\frac{dP(x)}{dx}= \frac{\alpha \delta \gamma P^2 (x) x^{\gamma - 1}}{M} \tanh \left(\frac{M-P(x)}{\alpha P(x)}\right),$$

where $$\tanh$$ is the hyperbolic tangent function, $$M$$ is the carrying capacity, and both $$\delta$$ and $$\gamma>0$$ jointly determine the growth rate. In addition, the parameter $$\gamma$$ represents acceleration in the time course. Solving the hyperbolastic rate equation of type II for $$P\left(x\right)$$ gives


 * $$P(x)=\frac{M}{1+\alpha\operatorname{arsinh} \left(e^{-\delta x^\gamma}\right)}$$.

If one desires to use the initial condition $$ P(x_0)=P_0,$$ then $$\alpha$$ can be expressed as


 * $$\alpha=\frac{M-P_0}{P_0 \operatorname{arsinh} \left(e^{-\delta x_0^\gamma}\right)}$$.

If $$x_0=0$$, then $$\alpha$$ reduces to


 * $$\alpha=\frac{M-P_0}{P_0 \operatorname{arsinh} (1)}$$.

Similarly, in the event that a vertical shift is needed to give a better fit, one can use the following formula


 * $$P(x)=\frac{M}{1+\alpha\operatorname{arsinh} \left(e^{-\delta x^\gamma}\right)}+\zeta$$.

The standard hyperbolastic function of type II is defined as


 * $$P(x)=\frac{1}{1+ \operatorname{arsinh} \left(e^{-x}\right)}$$.
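The type II solution and its initial-condition formula for $$\alpha$$ can be sketched similarly (the helper names are mine; $$x \geq 0$$ is assumed so that $$x^\gamma$$ is real-valued):

```python
import math

def hyperbolastic_h2(x, M=1.0, delta=1.0, gamma=1.0, alpha=1.0, zeta=0.0):
    """P(x) = M / (1 + alpha*arsinh(exp(-delta*x**gamma))) + zeta, for x >= 0."""
    return M / (1.0 + alpha * math.asinh(math.exp(-delta * x ** gamma))) + zeta

def h2_alpha(M, P0, delta, gamma, x0=0.0):
    """alpha chosen so that the curve satisfies the initial condition P(x0) = P0."""
    return (M - P0) / (P0 * math.asinh(math.exp(-delta * x0 ** gamma)))
```

Plugging the computed $$\alpha$$ back into the curve reproduces the initial value $$P_0$$ at $$x_0$$.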

Function H3
The hyperbolastic rate equation of type III is denoted by H3 and has the form


 * $$\frac{dP(t)}{dt}= \left(M-P \left(t \right)\right)\left(\delta \gamma t^{\gamma - 1}+ \frac{\theta}{\sqrt{1+ \theta^2 t^2}}\right)$$,

where $$t > 0$$. The parameter $$M$$ represents the carrying capacity, and the parameters $$\delta,$$ $$\gamma,$$ and $$\theta$$ jointly determine the growth rate. The parameter $$\gamma$$ represents acceleration of the time scale, while the size of $$\theta$$ represents distance from a symmetric sigmoidal curve. The solution to the differential equation of type III is


 * $$P(t)= M- \alpha e^{-\delta t^\gamma- \operatorname{arsinh}(\theta t)}$$.

With the initial condition $$P\left(t_0\right)=P_0$$, we can express $$\alpha$$ as


 * $$\alpha=\left(M-P_0 \right) e^{\delta t_0^\gamma+ \operatorname{arsinh}(\theta t_0)}$$.
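A quick consistency check of the type III solution against its initial condition, using illustrative helper names of my own:

```python
import math

def hyperbolastic_h3(t, M=1.0, delta=1.0, gamma=1.0, theta=1.0, alpha=1.0):
    """P(t) = M - alpha*exp(-delta*t**gamma - arsinh(theta*t)), for t > 0."""
    return M - alpha * math.exp(-delta * t ** gamma - math.asinh(theta * t))

def h3_alpha(M, P0, delta, gamma, theta, t0):
    """alpha chosen so that the curve satisfies P(t0) = P0."""
    return (M - P0) * math.exp(delta * t0 ** gamma + math.asinh(theta * t0))
```

The exponentials in $$\alpha$$ and in $$P(t_0)$$ cancel exactly, so the fitted curve passes through $$(t_0, P_0)$$.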

The hyperbolastic distribution of type III is a three-parameter family of continuous probability distributions with scale parameters $$\delta > 0$$ and $$\theta \geq 0$$, and shape parameter $$\gamma$$. When the parameter $$\theta = 0$$, the hyperbolastic distribution of type III reduces to the Weibull distribution. The hyperbolastic cumulative distribution function of type III is given by


 * $$F(x; \delta, \gamma, \theta)= \begin{cases} 1- e^{-\delta x^\gamma - \operatorname{arsinh}(\theta x)} & x\geq0 ,\\ 0 & x < 0 \end{cases}$$,

and its corresponding probability density function is



 * $$f(x; \delta, \gamma, \theta) = \begin{cases} e^{- \delta x^\gamma - \operatorname{arsinh}(\theta x)}\left(\delta \gamma x^{\gamma-1}+ \frac{\theta}{\sqrt{1+\theta^2 x^2}}\right) & x\geq0 ,\\ 0 & x<0 \end{cases}$$.

The hazard function $$h$$ (or failure rate) is given by


 * $$h\left(x; \delta, \gamma, \theta \right) = \delta\gamma x^{\gamma-1} + \frac{\theta}{\sqrt{1+ x^2\theta^2}}.$$

The survival function $$S$$ is given by


 * $$S(x; \delta, \gamma, \theta)= e^{- \delta x^\gamma- \operatorname{arsinh}(\theta x)}.$$

The standard hyperbolastic cumulative distribution function of type III is defined as


 * $$F\left(x\right)=1-e^{-x- \operatorname{arsinh}(x)}$$,

and its corresponding probability density function is



 * $$f(x) = e^{- x - \operatorname{arsinh}(x)}\left(1+ \frac{1}{\sqrt{1+ x^2}}\right)$$.
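A minimal sketch of the type III distribution functions (the names are mine), checking the Weibull reduction at $$\theta = 0$$ and the identity $$f = h \cdot S$$ numerically:

```python
import math

def h3_cdf(x, delta, gamma, theta):
    # F(x) = 1 - exp(-delta*x**gamma - arsinh(theta*x)) for x >= 0, else 0
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-delta * x ** gamma - math.asinh(theta * x))

def h3_pdf(x, delta, gamma, theta):
    # f(x) = S(x) * h(x): survival function times hazard rate (x > 0 when gamma < 1)
    if x < 0:
        return 0.0
    surv = math.exp(-delta * x ** gamma - math.asinh(theta * x))
    hazard = delta * gamma * x ** (gamma - 1) + theta / math.sqrt(1.0 + theta ** 2 * x ** 2)
    return surv * hazard
```

At $$\theta=0$$ the CDF collapses to the Weibull form $$1-e^{-\delta x^\gamma}$$, and a central difference of the CDF matches the density.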

Properties
If one desires to calculate the point $$x$$ where the population reaches a percentage of its carrying capacity $$M$$, then one can solve the equation
 * $$P(x) = k M$$

for $$x$$, where $$0 < k < 1$$. For instance, the half point can be found by setting $$k= \frac{1}{2}$$.
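Since $$P$$ is increasing for positive $$\delta$$ and $$\theta$$, the equation $$P(x) = kM$$ can be solved numerically, for example by bisection; a sketch for H1 (the helper names are mine):

```python
import math

def h1(x, M, delta, theta, alpha):
    # Hyperbolastic H1 solution (no vertical shift)
    return M / (1.0 + alpha * math.exp(-delta * x - theta * math.asinh(x)))

def fraction_point(k, M, delta, theta, alpha, lo=-50.0, hi=50.0, tol=1e-10):
    # Bisection solve of P(x) = k*M for 0 < k < 1; P is increasing for
    # delta > 0 and theta >= 0, so the root is bracketed by [lo, hi].
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h1(mid, M, delta, theta, alpha) < k * M:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For $$\theta=0$$ the half point has the closed form $$x = \ln(\alpha)/\delta$$, which the bisection recovers.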

Applications


According to stem cell researchers at McGowan Institute for Regenerative Medicine at the University of Pittsburgh, "a newer model [called the hyperbolastic type III or] H3 is a differential equation that also describes the cell growth. This model allows for much more variation and has been proven to better predict growth."

The hyperbolastic growth models H1, H2, and H3 have been applied to analyze the growth of solid Ehrlich carcinoma using a variety of treatments.

In animal science, the hyperbolastic functions have been used for modeling broiler chicken growth.

In the area of wound healing, hyperbolastic models accurately represent the time course of healing; in particular, the hyperbolastic model of type III has been used to determine the size of a recovering wound. Such functions have been used to investigate variations in the healing velocity among different kinds of wounds and at different stages in the healing process, taking into consideration the areas of trace elements, growth factors, diabetic wounds, and nutrition.

Another application of hyperbolastic functions is in the area of stochastic diffusion processes whose mean function is a hyperbolastic curve. The main characteristics of the process are studied and the maximum likelihood estimation for the parameters of the process is considered. To this end, the firefly metaheuristic optimization algorithm is applied after bounding the parametric space by a stagewise procedure. Some examples based on simulated sample paths and real data illustrate this development. A sample path of a diffusion process models the trajectory of a particle embedded in a flowing fluid and subjected to random displacements due to collisions with other particles, which is called Brownian motion. The hyperbolastic function of type III was used to model the proliferation of both adult mesenchymal and embryonic stem cells, and the hyperbolastic mixed model of type II has been used in modeling cervical cancer data. Hyperbolastic curves can be an important tool in analyzing cellular growth, the fitting of biological curves, the growth of phytoplankton, and instantaneous maturity rate.

In forest ecology and management, the hyperbolastic models have been applied to model the relationship between diameter at breast height (DBH) and tree height.

The multivariable hyperbolastic model type III has been used to analyze the growth dynamics of phytoplankton taking into consideration the concentration of nutrients.

Hyperbolastic regressions


Hyperbolastic regressions are statistical models that utilize standard hyperbolastic functions to model a dichotomous or multinomial outcome variable. The purpose of hyperbolastic regression is to predict an outcome using a set of explanatory (independent) variables. These types of regressions are routinely used in many areas including medical, public health, dental, biomedical, as well as social, behavioral, and engineering sciences. For instance, binary regression analysis has been used to predict endoscopic lesions in iron deficiency anemia. In addition, binary regression was applied to differentiate between malignant and benign adnexal mass prior to surgery.

The binary hyperbolastic regression of type I
Let $$Y$$ be a binary outcome variable that can assume one of two mutually exclusive values, success or failure. If we code success as $$Y=1$$ and failure as $$Y=0$$, then, for parameter $$\theta \geq -1$$, parameter vector $$\boldsymbol{\beta} = (\beta_0, \beta_1,\ldots, \beta_p)$$, and a $$p$$-dimensional vector of explanatory variables $$\mathbf{x}_i=(x_{i1},\ x_{i2},\ldots ,\ x_{ip})^T$$ for the $$i^{th}$$ of $$n$$ observations ($$i = 1,2,\ldots,n$$), the hyperbolastic success probability of type I is given by


 * $$\pi(\mathbf{x}_i;\boldsymbol{\beta}) = P(y_i=1|\mathbf{x}_i;\boldsymbol{\beta})=\frac{1}{1+e^{-(\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}})-\theta \operatorname{arsinh}(\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}}) } }$$.

The odds of success is the ratio of the probability of success to the probability of failure. For binary hyperbolastic regression of type I, the odds of success is denoted by $$Odds_{H1}$$ and expressed by the equation


 * $$Odds_{H1}=e^{\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}}+\theta \operatorname{arsinh}(\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}}) }$$.

The logarithm of $$Odds_{H1}$$ is called the logit of binary hyperbolastic regression of type I. The logit transformation is denoted by $$L_{H1}$$ and can be written as


 * $$L_{H1}=\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}} +\theta \operatorname{arsinh}[\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}}] $$.
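The probability, odds, and logit of type I satisfy the usual relationships $$Odds = \pi/(1-\pi)$$ and $$L = \ln(Odds)$$, which a short Python sketch (with hypothetical names) makes concrete:

```python
import math

def h1_success_prob(z, theta):
    # pi = 1 / (1 + exp(-z - theta*arsinh(z))), z = beta0 + sum_s beta_s * x_s
    return 1.0 / (1.0 + math.exp(-z - theta * math.asinh(z)))

def h1_odds(z, theta):
    # Odds of success: exp(z + theta*arsinh(z))
    return math.exp(z + theta * math.asinh(z))

def h1_logit(z, theta):
    # Logit: natural log of the odds
    return z + theta * math.asinh(z)
```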

Shannon information for binary hyperbolastic of type I (H1)
The Shannon information for the random variable $$Y$$ is defined as


 * $$I(y)=-{log}_bP(y)$$

where the base of the logarithm satisfies $$b > 0$$ and $$b \neq 1$$. For a binary outcome, $$b$$ is equal to $$2$$.

For the binary hyperbolastic regression of type I, the information $$I(y)$$ is given by


 * $$I(y)= \begin{cases} -\log_b\frac{1}{1+e^{-Z-\theta \operatorname{arsinh}(Z)}} & y = 1 ,\\ -\log_b\frac{e^{-Z-\theta \operatorname{arsinh}(Z)}}{1+e^{-Z-\theta \operatorname{arsinh}(Z)}} & y = 0 \end{cases}$$,

where $$Z= \beta_0+\sum_{s=1}^{p}\beta_sx_s$$, and $$x_s$$ is the $$s^{th}$$ input data. For a random sample of binary outcomes of size $$n$$, the average empirical information for hyperbolastic H1 can be estimated by


 * $$\overline{I(y)}= \begin{cases} -\frac{1}{n}\sum_{i=1}^{n}{\log_b\frac{1}{1+e^{-Z_i-\theta \operatorname{arsinh}(Z_i)}}} & y = 1 ,\\ -\frac{1}{n}\sum_{i=1}^{n}{\log_b\frac{e^{-Z_i-\theta \operatorname{arsinh}(Z_i)}}{1+e^{-Z_i-\theta \operatorname{arsinh}(Z_i)}}} & y = 0 \end{cases}$$,

where $$Z_i= \beta_0+\sum_{s=1}^{p}\beta_sx_{is}$$, and $$x_{is}$$ is the $$s^{th}$$ input data for the $$i^{th}$$ observation.

Information Entropy for hyperbolastic H1
Information entropy measures the expected information content of a transmitted message or signal. In machine learning applications, it is the number of bits necessary to transmit a randomly selected event from a probability distribution. For a discrete random variable $$Y$$, the information entropy $$H$$ is defined as


 * $$H=-\sum_{y\in Y}{P(y)\ {log}_bP(y)}$$

where $$P(y)$$ is the probability mass function for the random variable $$Y$$.

The information entropy is the mathematical expectation of $$I(y)$$ with respect to the probability mass function $$P(y)$$. Information entropy has many applications in machine learning and artificial intelligence, such as classification modeling and decision trees. For the hyperbolastic H1, the entropy $$H$$ is equal to



 * $$\begin{align} H & = -\sum_{y \in \{0,1\}}{P(Y=y;\mathbf{x},\boldsymbol{\beta})\log_b(P(Y=y;\mathbf{x},\boldsymbol{\beta}))} \\ & = -[\pi(\mathbf{x};\boldsymbol{\beta})\ \log_b(\pi(\mathbf{x};\boldsymbol{\beta}))+(1-\pi(\mathbf{x};\boldsymbol{\beta}))\log_b(1-\pi(\mathbf{x};\boldsymbol{\beta}))] \\ & = \log_b(1+e^{-Z-\theta \operatorname{arsinh}(Z)})-\frac{e^{-Z-\theta \operatorname{arsinh}(Z)}\log_b(e^{-Z-\theta \operatorname{arsinh}(Z)})}{1+e^{-Z-\theta \operatorname{arsinh}(Z)}} \end{align}$$

The estimated average entropy for hyperbolastic H1 is denoted by $$\bar{H}$$ and is given by



 * $$\bar{H}=\frac{1}{n}\sum_{i=1}^{n}\left[\log_b(1+e^{-Z_i-\theta \operatorname{arsinh}(Z_i)})-\frac{e^{-Z_i-\theta \operatorname{arsinh}(Z_i)}\ \log_b(e^{-Z_i-\theta \operatorname{arsinh}(Z_i)})}{1+e^{-Z_i-\theta \operatorname{arsinh}(Z_i)}}\right]$$

Binary Cross-entropy for hyperbolastic H1
The binary cross-entropy compares the observed $$y \in \{0,1\}$$ with the predicted probabilities. The average binary cross-entropy for hyperbolastic H1 is denoted by $$\overline{C}$$ and is equal to

 * $$\begin{align} \overline{C} & =-\frac{1}{n}\sum_{i=1}^{n}[y_i \log_b(\pi(\mathbf{x}_i;\boldsymbol{\beta}))+(1-y_i)\log_b(1-\pi(\mathbf{x}_i;\boldsymbol{\beta}))] \\ &=\frac{1}{n}\sum_{i=1}^{n}[\log_b(1+e^{-Z_i-\theta \operatorname{arsinh}(Z_i)})-(1-y_i)\log_b(e^{-Z_i-\theta \operatorname{arsinh}(Z_i)})] \end{align}$$
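The two expressions for $$\overline{C}$$ are algebraically equivalent, which can be verified numerically; the function names below are mine, and the base $$b$$ (typically 2 for binary outcomes) is a parameter:

```python
import math

def h1_prob(z, theta):
    # pi = 1 / (1 + E), with E = exp(-z - theta*arsinh(z))
    return 1.0 / (1.0 + math.exp(-z - theta * math.asinh(z)))

def cross_entropy_direct(ys, zs, theta, b=2.0):
    # First form: -(1/n) * sum[y*log_b(pi) + (1-y)*log_b(1-pi)]
    total = 0.0
    for y, z in zip(ys, zs):
        pi = h1_prob(z, theta)
        total += y * math.log(pi, b) + (1 - y) * math.log(1.0 - pi, b)
    return -total / len(ys)

def cross_entropy_closed(ys, zs, theta, b=2.0):
    # Second form: (1/n) * sum[log_b(1+E_i) - (1-y_i)*log_b(E_i)]
    total = 0.0
    for y, z in zip(ys, zs):
        E = math.exp(-z - theta * math.asinh(z))
        total += math.log(1.0 + E, b) - (1 - y) * math.log(E, b)
    return total / len(ys)
```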

The binary hyperbolastic regression of type II
The hyperbolastic regression of type II is an alternative method for the analysis of binary data with robust properties. For the binary outcome variable $$Y$$, the hyperbolastic success probability of type II is a function of a $$p$$-dimensional vector of explanatory variables $$\mathbf{x}_i$$ given by


 * $$\pi(\mathbf{x}_i;\boldsymbol{\beta}) = P(y_i=1|\mathbf{x}_i;\boldsymbol{\beta})= \frac{1}{1 + \operatorname{arsinh}[e^{ - (\beta_0 + \sum_{s=1}^{p}{\beta_s x_{is}}) }]}$$.

For the binary hyperbolastic regression of type II, the odds of success is denoted by $$Odds_{H2}$$ and is defined as


 * $$Odds_{H2} = \frac{1}{\operatorname{arsinh}[e^{-(\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}}) } ] }. $$

The logit transformation $$ L_{H2} $$ is given by


 * $$L_{H2}= - \log{( \operatorname{arsinh}[ e^{-(\beta_0+\sum_{s=1}^{p}{\beta_s x_{is}})}])}$$
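The same probability-odds-logit relationships hold for type II; a small sketch with names of my own:

```python
import math

def h2_success_prob(z):
    # pi = 1 / (1 + arsinh(exp(-z)))
    return 1.0 / (1.0 + math.asinh(math.exp(-z)))

def h2_odds(z):
    # Odds of success: 1 / arsinh(exp(-z))
    return 1.0 / math.asinh(math.exp(-z))

def h2_logit(z):
    # Logit: -log(arsinh(exp(-z)))
    return -math.log(math.asinh(math.exp(-z)))
```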

Shannon information for binary hyperbolastic of type II (H2)
For the binary hyperbolastic regression H2, the Shannon information $$I(y)$$ is given by


 * $$I(y) = \begin{cases} -\log_b \frac{1}{1+\operatorname{arsinh}(e^{-Z})} & y = 1 \\ -\log_b \frac{\operatorname{arsinh}(e^{-Z})}{1+\operatorname{arsinh}(e^{-Z})} & y = 0 \end{cases}$$

where $$Z= \beta_0+\sum_{s=1}^{p}\beta_sx_s$$, and $$x_s$$ is the $$s^{th}$$ input data. For a random sample of binary outcomes of size $$n$$, the average empirical information for hyperbolastic H2 is estimated by


 * $$\overline{I(y)}= \begin{cases} -\frac{1}{n}\sum_{i=1}^{n}\log_b \frac{1}{1+\operatorname{arsinh}(e^{-Z_i})} & y = 1 \\ -\frac{1}{n}\sum_{i=1}^{n}\log_b \frac{\operatorname{arsinh}(e^{-Z_i})}{1+\operatorname{arsinh}(e^{-Z_i})} & y=0 \end{cases}$$

where $$Z_i= \beta_0+\sum_{s=1}^{p} \beta_sx_{is}$$, and $$x_{is}$$ is the $$s^{th}$$ input data for the $$i^{th}$$ observation.

Information Entropy for hyperbolastic H2
For the hyperbolastic H2, the information entropy $$H$$ is equal to



 * $$\begin{align} H & = -\sum_{y\in \{0,1\}}{P(Y=y;\mathbf{x}, \boldsymbol{\beta}) \log_b(P(Y=y;\mathbf{x} ,\boldsymbol{\beta}))} \\ & =-[\pi(\mathbf{x};\boldsymbol{\beta})\ \log_b(\pi(\mathbf{x};\boldsymbol{\beta}))+(1-\pi(\mathbf{x};\boldsymbol{\beta}))\log_b(1-\pi(\mathbf{x};\boldsymbol{\beta}))] \\ & =\log_b(1+\operatorname{arsinh}(e^{-Z}))-\frac{\operatorname{arsinh}(e^{-Z}) \log_b (\operatorname{arsinh}(e^{-Z}))}{1+\operatorname{arsinh}(e^{-Z})} \end{align}$$

and the estimated average entropy $$\bar{H}$$ for hyperbolastic H2 is



 * $$\bar{H}=\frac{1}{n}\sum_{i=1}^{n}\left[\log_b(1+\operatorname{arsinh}(e^{-Z_i}))-\frac{\operatorname{arsinh}(e^{-Z_i})\ \log_b(\operatorname{arsinh}(e^{-Z_i}))}{1+\operatorname{arsinh}(e^{-Z_i})}\right]$$

Binary Cross-entropy for hyperbolastic H2
The average binary cross-entropy $$\overline{C}$$ for hyperbolastic H2 is



 * $$\begin{align} \overline{C} & =-\frac{1}{n}\sum_{i=1}^{n}[y_i\log_b(\pi(\mathbf{x}_i;\boldsymbol{\beta}))+(1-y_i)\log_b(1-\pi(\mathbf{x}_i;\boldsymbol{\beta}))] \\ & =\frac{1}{n}\sum_{i=1}^{n}[\log_b(1+\operatorname{arsinh}(e^{-Z_i}))-(1-y_i)\log_b(\operatorname{arsinh}(e^{-Z_i}))] \end{align}$$

Parameter estimation for the binary hyperbolastic regression of type I and II
The estimate of the parameter vector $$\boldsymbol{\beta}$$ can be obtained by maximizing the log-likelihood function


 * $$\boldsymbol{\hat{\beta}} = \underset{\boldsymbol{\beta}}{\operatorname{argmax}} \sum_{i = 1}^{n}[y_i \ln(\pi(\mathbf{x}_i;\boldsymbol{\beta}))+(1-y_i)\ln(1-\pi(\mathbf{x}_i;\boldsymbol{\beta}))]$$

where $$\pi(\mathbf{x}_i;\boldsymbol{\beta})$$ is defined according to one of the two types of hyperbolastic functions used.
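A toy illustration of this maximum-likelihood fit for H1, using numerical-gradient ascent on a tiny synthetic data set (all names and the optimizer choice are mine; a real analysis would use a proper optimization routine):

```python
import math

def h1_prob(z, theta=0.0):
    # Success probability for hyperbolastic H1: 1 / (1 + exp(-z - theta*arsinh(z)))
    return 1.0 / (1.0 + math.exp(-z - theta * math.asinh(z)))

def log_likelihood(beta, X, ys, theta=0.0):
    # Bernoulli log-likelihood; beta[0] is the intercept
    ll = 0.0
    for x, y in zip(X, ys):
        z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
        p = min(max(h1_prob(z, theta), 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ll

def fit_h1(X, ys, theta=0.0, lr=0.05, steps=300, h=1e-6):
    # Crude central-difference gradient ascent on the log-likelihood (illustrative only)
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = []
        for j in range(len(beta)):
            bp, bm = beta[:], beta[:]
            bp[j] += h
            bm[j] -= h
            grad.append((log_likelihood(bp, X, ys, theta) - log_likelihood(bm, X, ys, theta)) / (2 * h))
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta
```

On a separable one-predictor data set, the fitted coefficients raise the log-likelihood above its value at $$\boldsymbol{\beta}=\mathbf{0}$$ and produce a positive slope.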

The multinomial hyperbolastic regression of type I and II
The generalization of the binary hyperbolastic regression to multinomial hyperbolastic regression has a response variable $$y_i$$ for individual $$i$$ with $$k$$ categories (i.e. $$y_i \in \{1,2,\ldots,k\}$$). When $$k=2$$, this model reduces to a binary hyperbolastic regression. For each $$i=1,2,\ldots,n$$, we form $$k$$ indicator variables $$y_{ij}$$ where


 * $$y_{ij}= \begin{cases} 1 & \text{if } y_i = j,\\ 0 & \text{if } y_i \neq j \end{cases}$$, meaning that $$y_{ij}=1$$ whenever the $$i^{th}$$ response is in category $$j$$ and $$0$$ otherwise.

Define the parameter vector $$\boldsymbol{\beta}_j=(\beta_{j0},\beta_{j1},\ldots,\beta_{jp})$$ in a $$(p+1)$$-dimensional Euclidean space and $$\boldsymbol{\beta}=(\boldsymbol{\beta}_1,\ldots,\boldsymbol{\beta}_{k-1})^T$$. Using category 1 as a reference and $$\pi_1(\mathbf{x}_i;\boldsymbol{\beta})$$ as its corresponding probability function, the multinomial hyperbolastic regression of type I probabilities are defined as
 * $$\pi_1(\mathbf{x}_i;\boldsymbol{\beta})=P(y_i=1|\mathbf{x}_i;\boldsymbol{\beta})=\frac{1}{1+\sum_{s=2}^{k}e^{-\eta_s(\mathbf{x}_i;\boldsymbol{\beta})-\theta \operatorname{arsinh}[\eta_s(\mathbf{x}_i;\boldsymbol{\beta})]}}$$

and for $$j = 2,\ldots,k$$,


 * $$\pi_j(\mathbf{x}_i;\boldsymbol{\beta})=P(y_i=j|\mathbf{x}_i;\boldsymbol{\beta})=\frac{e^{-\eta_j(\mathbf{x}_i;\boldsymbol{\beta})-\theta \operatorname{arsinh}[\eta_j(\mathbf{x}_i;\boldsymbol{\beta})]}}{1+\sum_{s=2}^{k}e^{-\eta_s(\mathbf{x}_i;\boldsymbol{\beta})-\theta \operatorname{arsinh}[\eta_s(\mathbf{x}_i;\boldsymbol{\beta})]}}$$

Similarly, for the multinomial hyperbolastic regression of type II we have
 * $$\pi_1(\mathbf{x}_i;\boldsymbol{\beta})=P(y_i=1|\mathbf{x}_i;\boldsymbol{\beta})=\frac{1}{1+\sum_{s=2}^{k}\operatorname{arsinh}[e^{-\eta_s(\mathbf{x}_i;\boldsymbol{\beta})}]}$$

and for $$j = 2,\ldots,k$$,


 * $$\pi_j(\mathbf{x}_i;\boldsymbol{\beta})=P(y_i=j|\mathbf{x}_i;\boldsymbol{\beta})=\frac{\operatorname{arsinh}[e^{-\eta_j(\mathbf{x}_i;\boldsymbol{\beta})}]}{1+\sum_{s=2}^{k}\operatorname{arsinh}[e^{-\eta_s(\mathbf{x}_i;\boldsymbol{\beta})}]}$$

where $$\eta_s(\mathbf{x}_i;\boldsymbol{\beta})=\beta_{s0}+\sum_{l=1}^{p}\beta_{sl}x_{il}$$ with $$s = 2, \dots, k$$ and $$i = 1,\dots,n$$.

The choice of $$\pi_j(\mathbf{x}_i;\boldsymbol{\beta})$$ is dependent on the choice of hyperbolastic H1 or H2.
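The multinomial probabilities of both types sum to one by construction; a brief sketch (function names are mine), taking the linear predictors $$\eta_2,\ldots,\eta_k$$ as given:

```python
import math

def multinomial_h1_probs(etas, theta=0.0):
    # Category probabilities with category 1 as reference;
    # etas = [eta_2, ..., eta_k] are the linear predictors.
    terms = [math.exp(-e - theta * math.asinh(e)) for e in etas]
    denom = 1.0 + sum(terms)
    return [1.0 / denom] + [t / denom for t in terms]

def multinomial_h2_probs(etas):
    # Type II analogue: exp(-eta - theta*arsinh(eta)) is replaced by arsinh(exp(-eta))
    terms = [math.asinh(math.exp(-e)) for e in etas]
    denom = 1.0 + sum(terms)
    return [1.0 / denom] + [t / denom for t in terms]
```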

Shannon Information for multiclass hyperbolastic H1 or H2
For the multiclass $$(j=1, 2, \dots, k)$$, the Shannon information $$I_j$$ is


 * $$I_j=-log_b(\pi_j(\mathbf{x};\boldsymbol{\beta}))$$.

For a random sample of size $$n$$, the empirical multiclass information can be estimated by
 * $$\overline{I_j}=-\frac{1}{n}\sum_{i=1}^{n}{log_b(\pi_j(\mathbf{x_i};\boldsymbol{\beta}))}$$.

Multiclass Entropy in Information Theory
For a discrete random variable $$Y$$, the multiclass information entropy is defined as


 * $$H=-\sum_{y \in Y}{P(y)\ {log}_bP(y)}$$

where $$P(y)$$ is the probability mass function for the multiclass random variable $$Y$$. For the hyperbolastic H1 or H2, the multiclass entropy $$H$$ is equal to


 * $$H=-\sum_{j=1}^{k}{[\pi_j(\mathbf{x};\boldsymbol{\beta}) log_b(\pi_j(\mathbf{x};\boldsymbol{\beta}))]}$$

The estimated average multiclass entropy $$\overline{H}$$ is equal to


 * $$\overline{H}=-\frac{1}{n}\sum_{i=1}^{n}{\sum_{j=1}^{k}{[\pi_j(\mathbf{x_i};\boldsymbol{\beta}) log_b(\pi_j(\mathbf{x_i};\boldsymbol{\beta}))]}}$$

Multiclass Cross-entropy for hyperbolastic H1 or H2
Multiclass cross-entropy compares the observed multiclass output with the predicted probabilities. For a random sample of multiclass outcomes of size $$n$$, the average multiclass cross-entropy $$\overline{C}$$ for hyperbolastic H1 or H2 can be estimated by


 * $$\overline{C}=-\frac{1}{n} \sum_{i=1}^{n}{\sum_{j=1}^{k}{[y_{ij} log_b(\pi_j(\mathbf{x_i};\boldsymbol{\beta}))]}}$$

The log-odds of membership in category $$j$$ versus the reference category 1, denoted by $$\omicron_j(\mathbf{x}_i;\boldsymbol{\beta})$$, is equal to


 * $$\omicron_j(\mathbf{x}_i;\boldsymbol{\beta}) = ln[\frac{\pi_j(\mathbf{x}_i;\boldsymbol{\beta})}{\pi_1(\mathbf{x}_i;\boldsymbol{\beta})}]$$

where $$j=2,\ldots,k$$ and $$i=1,\ldots,n$$. The estimated parameter matrix $$\boldsymbol{\hat{\beta}}$$ of the multinomial hyperbolastic regression is obtained by maximizing the log-likelihood function. The maximum likelihood estimate of the parameter matrix $$\boldsymbol{\beta}$$ is


 * $$\boldsymbol{\hat{\beta}} = \underset{\boldsymbol{\beta}}{\operatorname{argmax}} \sum_{i=1}^n(y_{i1}\ln[\pi_1(\mathbf{x}_i;\boldsymbol{\beta})]+y_{i2}\ln[\pi_2(\mathbf{x}_i;\boldsymbol{\beta})]+\ldots+y_{ik}\ln[\pi_k(\mathbf{x}_i;\boldsymbol{\beta})])$$