User:Hassan142/sandbox

HRDetect (Homologous Recombination Deficiency Detect) is a whole-genome sequencing (WGS)-based model designed to predict BRCA1 and BRCA2 deficiency from six mutational signatures. The model can also identify tumors whose mutation profiles resemble those of tumors with BRCA1 or BRCA2 mutations, a phenotype known as BRCAness. The assay could additionally be used to guide the use of PARP inhibitors in patients with BRCA1/BRCA2 deficiency.

Background
BRCA1/BRCA2

BRCA1 and BRCA2 play crucial roles in maintaining genome integrity, mainly through homologous recombination (HR) repair of DNA double-strand breaks (DSBs). Mutations in BRCA1 and BRCA2 can reduce the capacity of the HR machinery, increase genomic instability, and predispose carriers to malignancies. People with BRCA1 or BRCA2 deficiency have a higher risk of developing certain cancers, such as breast and ovarian cancer. For example, germline defects in BRCA1/BRCA2 account for up to 5% of breast cancer cases.

PARP inhibitors

Poly (ADP-ribose) polymerase (PARP) inhibitors are designed to treat BRCA1- and BRCA2-deficient tumors by exploiting their homologous recombination deficiency. These drugs have mainly been used in breast and ovarian cancers; their clinical efficacy in other cancer types, such as pancreatic cancer, is still being investigated. Accurately identifying patients with BRCA1/BRCA2 deficiency is therefore vital for making optimal use of PARP inhibitors.

HRDetect

HRDetect was developed to detect tumors with BRCA1/BRCA2 deficiency using whole-genome sequencing data from 560 breast cancer samples. The model quantifies six mutational signatures and combines them to predict BRCA1/BRCA2 deficiency. The six signatures, ranked by decreasing weight, are microhomology-mediated indels, the HRD index, base-substitution signature 3, rearrangement signature 3, rearrangement signature 5, and base-substitution signature 8. This weighted approach can also identify BRCAness, which refers to mutational phenotypes displaying homologous recombination deficiency similar to that of tumors with BRCA1/BRCA2 defects.

Input
HRDetect requires four types of inputs:
 * 1) counts of mutations associated with each signature of single-base substitutions
 * 2) proportions of indels with microhomology at the indel breakpoint junction, indels at polynucleotide-repeat tracts, and other complex indels
 * 3) counts of rearrangements associated with each signature of rearrangements RS1–RS6
 * 4) HRD index

Statistical Analysis
HRDetect is based on supervised learning: a lasso logistic regression model classifies samples as having or lacking BRCA1/2 deficiency. The input features are first log-transformed and normalized, and the optimal coefficients are then obtained by minimizing a penalized objective function.

Log Transformation
To account for a high substitution count in some samples, the genomic data is first log transformed:

$$ x^{\prime}=\ln (x+1) $$



Normalization
The transformed data are then normalized so that the values of the different mutational classes are comparable, giving each feature a mean of 0 and a standard deviation (s.d.) of 1:

$$ x^{\prime \prime}=\frac{x^{\prime}-\operatorname{mean}\left(x^{\prime}\right)}{\operatorname{s.d.}\left(x^{\prime}\right)} $$
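The two preprocessing steps above can be sketched in a few lines of Python. This is only an illustration: the function names and the example counts are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not say which one is used.

```python
import math
from statistics import mean, pstdev

def log_transform(values):
    """Apply x' = ln(x + 1) to each raw feature value."""
    return [math.log(x + 1.0) for x in values]

def standardize(values):
    """Rescale one feature to mean 0 and s.d. 1.

    Assumption: population s.d. (pstdev); the source does not specify.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("feature is constant and cannot be standardized")
    return [(v - mu) / sd for v in values]

# Hypothetical substitution counts for one feature across five samples.
raw_counts = [0, 120, 950, 4300, 18000]
normalized = standardize(log_transform(raw_counts))
```

Each feature is processed separately across all samples, so the mean and s.d. are those of one mutational class over the cohort.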

Lasso Logistic Regression Modelling
To distinguish samples affected by BRCA1/BRCA2 deficiency from those that are not, a lasso logistic regression model is used. The coefficients are obtained by minimizing:

$$ \min _{\left(\beta_{0}, \beta\right) \in \mathbb{R}^{p+1}}\left(-\left[\frac{1}{N} \sum_{i=1}^{N}\left(y_{i} \cdot\left(\beta_{0}+x_{i}^{T} \beta\right)-\log \left(1+e^{\beta_{0}+x_{i}^{T} \beta}\right)\right)\right]+\lambda\|\beta\|_{1}\right) $$

where:

 * $$y_{i}$$ : BRCA status of the ith sample; yi = 1 for BRCA1/BRCA2-null samples and yi = 0 otherwise
 * $$\beta_{0}$$ : Intercept, interpreted as the log odds of yi = 1 when xi = 0
 * $$\beta$$ : Vector of weights
 * $$p$$ : Number of features characterizing each sample
 * $$N$$ : Number of samples
 * $$x_{i}^{T}$$ : Vector of features characterizing the ith sample
 * $$\lambda$$ : Penalty promoting the sparseness of the weights
 * $$\|\beta\|_{1}$$ : L1 norm of the vector of weights

The β weights are constrained to be positive to reflect the presence of mutational processes caused by BRCA1/BRCA2 defects.
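As a rough illustration of the objective and the sign constraint, the sketch below evaluates the penalized objective and minimizes it with proximal gradient descent, clipping the weights at zero after each step. The optimizer, step size, iteration count, and toy data are all assumptions made for this example; the source does not state which solver HRDetect uses, and the clipping yields non-negative (not strictly positive) weights.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear(beta0, beta, x):
    """Linear predictor beta0 + x^T beta."""
    return beta0 + sum(b * v for b, v in zip(beta, x))

def objective(beta0, beta, X, y, lam):
    """Negative mean log-likelihood plus lambda times the L1 norm."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = linear(beta0, beta, xi)
        # log(1 + e^z), written to avoid overflow for large |z|
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += yi * z - softplus
    return -total / len(X) + lam * sum(abs(b) for b in beta)

def fit(X, y, lam, step=0.1, iters=2000):
    """Proximal gradient descent (assumed solver, not the source's)."""
    n, p = len(X), len(X[0])
    beta0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        resid = [sigmoid(linear(beta0, beta, xi)) - yi for xi, yi in zip(X, y)]
        grad0 = sum(resid) / n
        grad = [sum(r * xi[j] for r, xi in zip(resid, X)) / n for j in range(p)]
        beta0 -= step * grad0  # the intercept is not penalized
        # Gradient step, L1 shrinkage by step * lam, then clip at zero.
        beta = [max(0.0, b - step * g - step * lam) for b, g in zip(beta, grad)]
    return beta0, beta

# Hypothetical toy data: two normalized features, six samples.
X = [[1.2, 0.8], [0.9, -0.1], [0.7, 1.1], [-0.8, 0.3], [-1.1, -0.9], [-0.9, -1.2]]
y = [1, 1, 1, 0, 0, 0]
beta0_hat, beta_hat = fit(X, y, lam=0.05)
```

Increasing `lam` drives more weights to exactly zero, which is how the lasso penalty selects a small number of informative features.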

HRD Score
Lastly, the fitted model assigns a new sample a probabilistic score by applying the model parameters ($$\beta$$, $$\beta_{0}$$) to its normalized mutational data $$x_{i}^{T}$$:

$$ P\left(C_{i}=B R C A\right)=\frac{1}{1+e^{-\left(\beta_{0}+x_{i}^{T} \beta\right)}} $$

where:

 * $$C_{i}$$ : Variable encoding the status of the ith sample
 * $$\beta_{0}$$ : Intercept weight
 * $$x_{i}^{T}$$ : Vector encoding the features of the ith sample
 * $$\beta$$ : Vector of weights
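The scoring formula can be sketched as follows. The parameter values and the sample vector are hypothetical and chosen only for illustration; they are not the published HRDetect weights. The 0.7 cut-off is the one used in the applications described in this article.

```python
import math

CUTOFF = 0.7  # probabilistic cut-off quoted in this article

def hrdetect_score(x, beta0, beta):
    """Return P(C_i = BRCA) = 1 / (1 + exp(-(beta0 + x^T beta)))."""
    z = beta0 + sum(b * v for b, v in zip(beta, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # stable branch for negative z
    return e / (1.0 + e)

def exceeds_cutoff(score, cutoff=CUTOFF):
    """True when the score is strictly above the cut-off."""
    return score > cutoff

# Hypothetical parameters and normalized features for one sample.
beta0 = -3.0
beta = [2.0, 1.0, 0.5]
sample = [1.5, 0.8, 0.2]

score = hrdetect_score(sample, beta0, beta)
flagged = exceeds_cutoff(score)
```

When every normalized feature is zero, the score reduces to the logistic function of the intercept alone, which matches the interpretation of the intercept as a log odds.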

Application
 Breast Cancer 

 Cohort of 560 Breast cancer samples 

HRDetect was first developed to detect tumors with BRCA1 and BRCA2 deficiency using whole-genome sequencing data from a cohort of 560 breast cancer samples. Within this cohort, 22 patients were known to carry germline BRCA1/BRCA2 mutations. The research group showed that BRCA1/BRCA2-deficiency mutational signatures were present in more breast cancer patients than previously known. The model identified 124 (22%) of the 560 breast cancer patients as showing BRCA1/2 mutational signatures. Apart from the 22 known cases, an additional 33 patients carried germline BRCA1/2 mutations, 22 patients had somatic BRCA1/2 mutations, and 47 showed a functional defect without a detected BRCA1/2 mutation. With a probabilistic cut-off of 0.7, HRDetect assigned a score above the cut-off to 76 patients with BRCA1/2 deficiency, demonstrating a sensitivity of 98.7% for recognizing BRCA1/2-deficient cases.

In contrast, germline mutations of BRCA1/2 are present in only 1–5% of breast cancer cases. These findings suggest that as many as 1 in 5 (20%) breast cancer patients may benefit from PARP inhibitors, far more than the small percentage currently given the treatment.
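The percentages quoted for this cohort can be checked with simple arithmetic. In the sketch below, the variable names are illustrative. The denominator of 77 used for the sensitivity is an assumption: it is obtained by summing the three mutation-carrying groups (22 + 33 + 22), which is consistent with the reported 98.7% but is not stated explicitly in the text.

```python
# Counts quoted for the cohort of 560 breast cancer samples.
cohort_size = 560
known_germline = 22
additional_germline = 33
somatic = 22
functional_only = 47      # functional defect, no BRCA1/2 mutation detected
detected_above_cutoff = 76

# All patients showing BRCA1/2 mutational signatures.
signature_positive = (known_germline + additional_germline
                      + somatic + functional_only)

# Assumed denominator for sensitivity: patients with a germline or
# somatic BRCA1/2 mutation (not stated explicitly in the text).
with_brca_mutation = known_germline + additional_germline + somatic

fraction_positive = signature_positive / cohort_size
sensitivity = detected_above_cutoff / with_brca_mutation

print(f"signature-positive: {signature_positive} ({fraction_positive:.0%})")
print(f"sensitivity: {sensitivity:.1%}")
```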

Cohort of 80 Breast cancer samples

To validate its effectiveness, the research group tested HRDetect in another cohort of 80 breast cancer cases, mainly ER-positive and HER2-negative. The tool identified cases with an HRDetect score exceeding 0.7, including one germline BRCA1 mutation carrier, four germline BRCA2 mutation carriers, and one somatic BRCA2 mutation carrier. Its sensitivity in this cohort was 86%.

 Other cancers 

HRDetect was also tested in other cancers to understand its generalizability.

Ovarian cancer

In a cohort of 73 patients with ovarian cancer, 30 patients were known to carry BRCA1/BRCA2 mutations, and HRDetect assigned 46 (63%) patients a score over 0.7. The sensitivity for detecting BRCA1/2-deficient cancer was almost 100%, and an additional 16 cases were identified.

Pancreatic cancer

In another cohort of 96 patients with pancreatic cancer, 6 cases were known to have a mutation or allele loss, and HRDetect identified 11 (11.5%) patients with a score exceeding the 0.7 cut-off. The study observed a similarly high sensitivity, approaching 100%, with five additional cases identified.