A Novel Feature Representation Method for Automating Genetic Variant Classification
Authors: Chandra Prasetyo Utomo; Nashuha Insani; Puspa Setia Pratiwi; Muhamad Fathurahman; Ahmad Rusdan Utomo; Achmad Dimas Furqon
Abstract:
Automating genetic variant classification based on the American College of Medical Genetics and Genomics (ACMG) criteria is pivotal for ensuring consistent interpretation of genetic variants, which in turn is crucial for accurate diagnosis and effective treatment. However, manual classification is labor-intensive and error-prone because of the massive volume of data generated by next-generation sequencing (NGS) technologies. This research aims to design and assess an AI-based system that automatically classifies genetic variants using the ACMG criteria. Deploying AI for genetic variant classification presents several challenges; a significant one is combining multiple features and designing an effective feature representation. This paper proposes a novel feature representation method called cancer variant representation (CVR). We aggregated ACMG-related criteria and transformed them into 118,125 features. We evaluated our method with eleven machine learning algorithms on experimental data comprising 23,198 variants. The results showed that our method achieved the highest performance, with an accuracy of 0.9966, compared with existing methods. A more accurate model is expected to improve diagnostic accuracy and the delivery of effective treatment, leading to better patient outcomes.
