Executive Summary : | The research aims to identify etiological risk factors for cancer in the Mizo population by combining diet-lifestyle and genetic data. By matching genomic and diet-lifestyle information, clinicians can develop automated decision-making models that support early disease detection, strengthen cancer diagnosis and treatment protocols, and raise awareness of diet-lifestyle risk factors in this disease-vulnerable population. Machine learning algorithms can efficiently extract quantifiable features from reliable ground-truth data, which may in turn support the development of new cancer therapies. Mizoram, in Northeast India, has a high incidence of the major cancer types, which is associated with environmental risk factors such as high consumption of salted and smoked meat and infection by pathogens. Lifestyle risk factors, including smoked food, excess salt, alcohol, and tobacco products such as "tuibur," have adverse health consequences. Previous epidemiological and gene mutation investigations indicate that cancer is the leading cause of death in the Mizo population.
The study aims to discern risk factors from diet-lifestyle-genome data using machine learning models. Epidemiological data will be collected from Civil Hospital Aizawl and Mizoram State Cancer Institute, while genomic data will be sequenced by the Department of Biotechnology, Mizoram University. The datasets will be analyzed in Python using Jupyter Notebook and the scikit-learn package, with support vector machine, decision tree, naïve Bayes, multilayer perceptron, and random forest algorithms used for model development. |
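
The model-development step described above could be sketched roughly as follows. This is only an illustrative outline, not the study's actual pipeline: the data are synthetic placeholders generated with `make_classification`, the feature names (`feature_0`, ...) are hypothetical, and the hyperparameters, the 5-fold cross-validation, and the use of random forest importances to rank candidate risk factors are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a merged diet-lifestyle-genome table (NOT real data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

# The five algorithm families named in the summary; settings are assumptions.
# SVM and MLP are scale-sensitive, so they are wrapped with a scaler.
models = {
    "support vector machine": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes": GaussianNB(),
    "multilayer perceptron": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    ),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}


def compare_models(models, X, y, folds=5):
    """Return the mean cross-validated accuracy for each model."""
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


scores = compare_models(models, X, y)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")

# One possible way to rank candidate risk factors: random forest importances.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
print("Top candidate features:", [name for name, _ in ranked[:5]])
```

With real data, accuracy alone may be misleading if cases and controls are imbalanced; passing a different `scoring` value (for example `"roc_auc"`) to `cross_val_score` would be one alternative.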