Current Students | Faiz Rizvi
While completing my Master's degree in Biomathematics at Ohio State University, I had the opportunity to take many interesting classes. One of these was Biomedical Informatics, taught by Dr. Kevin Coombes. In this course, I was fascinated by the application of data analysis techniques, such as Hierarchical Agglomerative Clustering and K-Means Clustering, to biological data. In class, we applied Hierarchical Clustering and Principal Component Analysis to RNA-Seq data from patients with prostate cancer. The analysis showed that the data set was skewed and confounded by the chip type used, a result that was not evident in the initial analysis of the data. Such techniques have been essential in my Master's thesis project on sleep inertia in children. A research paper by Van Dongen et al. finds that sleep-deprived patients can be divided into groups: those who are resilient to sleep deprivation, those who are vulnerable to it, and those who fall somewhere in between. My Master's thesis advisor, Dr. Best, and I used these machine learning algorithms to evaluate data from a 10-minute visual psychomotor vigilance task given to children aged 5 to 12, to see whether the same grouping appears. Further, we would like to build a model that uses baseline reaction times to predict the reaction times of patients awakened from deep sleep (as could happen in the case of a fire alarm).
After I graduated from Ohio State, my interest in machine learning and data analysis led me to join the Miraldi Lab at Cincinnati Children's Hospital as a first-year student in the Systems Biology and Physiology PhD program at UC. In Dr. Miraldi's lab we use bioinformatics tools to help devise mathematical models that predict transcription factor activity, among other things. Dr. Miraldi and her colleagues were able to create the Inferelator in Th17 cells and predict transcription factor activity using sc-RNA-Seq data. A key input to the Inferelator is the prior matrix, which consists of known interactions between transcription factors and genes; the program uses it to help prune and predict new edges in the network. In my first semester in Dr. Miraldi's lab I helped generate a prior based on ChIP-seq experimental data available from Dr. Matt Weirauch's lab. With these nearly 24,000 ChIP-seq experiments, we hope that the Inferelator will perform at a much higher level in terms of precision vs. recall. Even with much smaller data sets, the Inferelator is able to outperform current state-of-the-art techniques, so we expect a larger data set to yield better precision vs. recall curves. Moving forward, we hope to create a Convolutional Neural Network (CNN) that uses (sc)ATAC-seq data to predict ChIP-seq profiles. We are interested in (sc)ATAC-seq because it requires significantly fewer cells than ChIP-seq.
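As a rough illustration of what a prior matrix looks like, the sketch below builds a binary gene-by-transcription-factor matrix from a list of known (TF, gene) pairs. The function `build_prior` and all TF and gene names are hypothetical placeholders, and this is only a minimal sketch of the idea, not the Inferelator's actual input format or API.

```python
def build_prior(interactions, genes, tfs):
    """Return a genes x TFs matrix (a list of rows) holding 1 where a
    TF-gene interaction is known and 0 everywhere else."""
    gene_index = {gene: i for i, gene in enumerate(genes)}
    tf_index = {tf: j for j, tf in enumerate(tfs)}
    prior = [[0] * len(tfs) for _ in genes]
    for tf, gene in interactions:
        # Skip pairs that mention a TF or gene outside the modeled sets.
        if tf in tf_index and gene in gene_index:
            prior[gene_index[gene]][tf_index[tf]] = 1
    return prior


# Hypothetical inputs: placeholder names, not real experimental results.
tfs = ["TF_A", "TF_B"]
genes = ["gene1", "gene2", "gene3"]
interactions = [
    ("TF_A", "gene1"),
    ("TF_A", "gene3"),
    ("TF_B", "gene2"),
    ("TF_C", "gene1"),  # TF_C is not in `tfs`, so this pair is ignored
]

prior = build_prior(interactions, genes, tfs)
for gene, row in zip(genes, prior):
    print(gene, row)
```

Each row is a gene and each column a transcription factor, so a 1 marks an edge the network inference can start from. In practice the entries could come from ChIP-seq peaks near each gene, but how peaks are assigned to genes is outside this sketch.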
Van Dongen, H.P.A., Baynard, M.D., Maislin, G., and Dinges, D.F. Systematic Interindividual Differences in Neurobehavioral Impairment from Sleep Loss: Evidence of Trait-Like Differential Vulnerability. SLEEP 27(3).