Conv1D-BiLSTM Prediction of CRISPR-Cas9 gRNA Efficiency in BRCA1 and BRCA2 Using AIdit-Provided Features
Journal
2026 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA)
Date Issued
2026-02-05
Author(s)
Hristova, Evgenija
DOI
10.1109/acdsa67686.2026.11467818
Abstract
CRISPR-Cas9 is widely used for genome editing, but its success depends on choosing guide RNAs (gRNAs) that work well. Poorly chosen guides can make editing less effective and may lead to unwanted changes, especially in cancer-related genes such as BRCA1 and BRCA2. Predicting gRNA efficiency in advance improves reliability and saves time and resources when the next step is in vitro research. These genes are central to cancer research, yet finding reliable gRNAs for them is difficult because the available experimental data are limited and gene-specific. Small datasets are challenging to work with, and they underscore the need for reliable methods that avoid misleading outcomes and can guide both research and possible treatments. To overcome the difficulties of working with limited, gene-specific data, we developed a Conv1D-BiLSTM model that estimates CRISPR-Cas9 gRNA efficiency in BRCA1 and BRCA2. The model combines one-hot sequence encoding with AIdit-derived features, including GC content, off-target score, and double-strand break (DSB) predictions. To address the relatively small dataset (~1,000 entries), duplicates and biologically invalid values such as negative efficiency scores were removed, values were normalized separately for each gene, and dropout and L2 regularization were applied. These adjustments limited overfitting and enabled the model to achieve consistent performance across gRNAs from both genes.
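
The data-preparation steps named in the abstract (duplicate removal, filtering of negative efficiency scores, per-gene normalization, and one-hot sequence encoding) might look roughly like the sketch below. It is illustrative only and is not the author's code. The record layout (`gene`, `seq`, `efficiency`), the helper names, the choice of min-max scaling, and the toy sequences and scores are all assumptions; the abstract does not say which normalization was used, and in the paper GC content comes from AIdit rather than being recomputed.

```python
from collections import defaultdict

BASES = "ACGT"  # channel order for the one-hot encoding (an assumption)


def one_hot(seq):
    """Encode a DNA sequence as rows of 4 floats (A, C, G, T).

    Any other symbol (for example N) becomes an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def gc_content(seq):
    """Fraction of G and C bases, shown only to illustrate the feature."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def clean(records):
    """Drop negative efficiency scores and repeated (gene, sequence) pairs."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec["gene"], rec["seq"])
        if rec["efficiency"] < 0 or key in seen:
            continue
        seen.add(key)
        kept.append(dict(rec))
    return kept


def normalize_per_gene(records, field="efficiency"):
    """Min-max scale `field` to [0, 1] separately within each gene."""
    values = defaultdict(list)
    for rec in records:
        values[rec["gene"]].append(rec[field])
    scaled = []
    for rec in records:
        lo, hi = min(values[rec["gene"]]), max(values[rec["gene"]])
        new_value = 0.0 if hi == lo else (rec[field] - lo) / (hi - lo)
        scaled.append({**rec, field: new_value})
    return scaled


# Toy records with made-up sequences and scores, not data from the paper.
records = [
    {"gene": "BRCA1", "seq": "ACGTACGTACGTACGTACGT", "efficiency": 20.0},
    {"gene": "BRCA1", "seq": "ACGTACGTACGTACGTACGT", "efficiency": 20.0},  # duplicate
    {"gene": "BRCA1", "seq": "GGGGCCCCAAAATTTTGGCC", "efficiency": 60.0},
    {"gene": "BRCA2", "seq": "TTTTAAAACCCCGGGGTTAA", "efficiency": -5.0},  # invalid
    {"gene": "BRCA2", "seq": "CATGCATGCATGCATGCATG", "efficiency": 0.5},
    {"gene": "BRCA2", "seq": "GATTACAGATTACAGATTAC", "efficiency": 0.9},
]

prepared = normalize_per_gene(clean(records))
encoded = [one_hot(rec["seq"]) for rec in prepared]  # one 20 x 4 matrix per guide
```

Scaling each gene on its own keeps a difference in score ranges between the BRCA1 and BRCA2 measurements from dominating the training target, which is one plausible reading of "normalized separately for each gene". The encoded matrices, together with the scalar features, would then be passed to the Conv1D-BiLSTM network with dropout and L2 regularization.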
