Genetics and Biochemistry Profiles

F. Alex Feltus

Genetics and Biochemistry


Core Faculty, Biomedical Data Science and Informatics
Faculty Member, Clemson Center for Human Genetics
Faculty Scholar, Clemson University School of Health Research

AG Biotech/Biosystems Research Complex / BRC 302C [Office]


Educational Background

Ph.D., Cell Biology, Vanderbilt University, 2000
B.Sc., Biochemistry, Auburn University, 1992

Profile/About Me

Dr. F. Alex Feltus received a B.Sc. in Biochemistry from Auburn University in 1992, served two years in the Peace Corps, and then completed advanced training in biomedical sciences at Vanderbilt and Emory. Since 2002, he has performed research in bioinformatics, high-performance computing, cyberinfrastructure, network biology, genome assembly, systems genetics, paleogenomics, and bioenergy feedstock genetics. Currently, Feltus is a Professor in Clemson University's Dept. of Genetics & Biochemistry, CEO of Allele Systems LLC, Core Faculty in the CU-MUSC Biomedical Data Science and Informatics (BDSI) program, and a member of the Center for Human Genetics. He also serves on the Internet2 Board of Trustees as well as various "Advance Research Computing" engagement workgroups. Feltus has published numerous scientific articles in peer-reviewed journals and teaches undergraduate and PhD students in bioinformatics, biochemistry, and genetics. At present, he is funded by multiple NSF grants and is engaged in tethering together extremely smart people from diverse technical backgrounds in an effort to propel genomics research from the Excel-scale towards the Exascale.

Research Interests

Our group uses software engineering and computational biology techniques to make useful molecular discoveries in human and plant biological systems. We also engineer elastic advanced compute systems and technologies that run robust genomics workflows, enabling small labs to perform innovative petascale computational biology. The lab is also actively engaged in traditional PhD training and in the development of a scalable asynchronous training platform for data-intensive computing, including but not limited to computational biology.

My lifetime research goal is to reveal the genomic mechanisms underlying phenotype expression. A core aspect of this approach is to identify biomarkers that are able to group interesting biological states (e.g. normal kidney versus renal tumor somatic mutation and/or transcriptome profiles). Given that most traits are controlled by complex cellular control systems, we always seek to identify sets of functionally interacting genes (biomarker systems) that discriminate between biological states. My group focuses on the transcriptome layer (RNA) of gene expression, but we are always seeking methods to integrate data from other genome information orbitals.

A staple data construct of our lab is the gene co-expression network (GCN), where an edge represents a statistically significant RNA expression correlation between two gene products (network nodes). We are active developers of a GCN discovery software application called KINC that is able to identify condition-specific edges from mixed input gene expression matrices (GEMs) (Ficklin et al. [2017]). KINC GCNs are made from GEMs in a bottom-up approach where all gene pairs are tested for correlation. This approach is computationally intensive and is not scalable to millions of samples. Further, traditional GCNs do not detect non-linear relationships missed by correlation tests and do not place genetic relationships in a gene expression intensity context. In response, we developed EdgeScaping (Husain and Feltus [2019]), which constructs and analyzes the pairwise gene intensity network in a holistic, top-down approach where no edges are filtered. EdgeScaping uses a novel technique to convert traditional pairwise gene expression data into an image-based format and allows for exploring non-linear relationships between genes by leveraging deep learning image analysis algorithms. We have applied EdgeScaping to human tumor expression profiles to identify candidate biomarker systems that exhibit conventional and non-conventional interdependent non-linear behavior associated with brain-specific tumor sub-types. EdgeScaping is open source and available at [].
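As a rough illustration of the bottom-up idea described above, the following minimal sketch tests every gene pair in a tiny, made-up GEM for Pearson correlation and keeps the pairs that pass a cutoff. It is not KINC's implementation: the function names (`pearson`, `build_gcn`), the toy expression values, and the fixed `threshold` (used here as a simple stand-in for a proper significance test) are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant gene carries no correlation signal
    return cov / sqrt(var_x * var_y)


def build_gcn(gem, threshold=0.9):
    """Test all gene pairs and keep edges with |r| >= threshold.

    gem: dict mapping gene name -> list of expression values (one per sample).
    threshold: hypothetical cutoff standing in for a significance test.
    Returns a list of (gene_a, gene_b, r) edges.
    """
    edges = []
    # n genes produce n * (n - 1) / 2 pairs, which is why the
    # all-pairs approach becomes expensive for large matrices.
    for gene_a, gene_b in combinations(sorted(gem), 2):
        r = pearson(gem[gene_a], gem[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, r))
    return edges


# Toy GEM: three hypothetical genes measured across five samples.
toy_gem = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = build_gcn(toy_gem)
for gene_a, gene_b, r in edges:
    print(f"{gene_a} -- {gene_b} (r = {r:.3f})")
```

Because the number of pairs grows quadratically with the number of genes, even this simple loop hints at why the all-pairs strategy is computationally intensive on real expression matrices.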

We have been mining RNA expression profiles for biomarker systems from many NIH projects including GTEx and TCGA. We are also leveraging open and protected PsychENCODE (Akbarian et al. [2015]) and SPARK (Feliciano et al. [2018]) datasets to better understand normal and aberrant brain expression patterns. Once we detect biomarker systems using the techniques described above, we try to understand the gene regulatory networks underlying those systems. We are focusing on detecting and understanding biomarker systems for three specific biomedical phenotypes: intellectual disability (e.g. autism spectrum disorder, ASD), brain cancer, and renal cancer.

Genomics databases are swelling, and larger compute systems are needed by my group and thousands of individual life science investigators. Soon, DNA sequencers will replace qPCR machines in research labs and everyone will need terascale/petascale compute systems. In preparation for this disruptive technological event, on par with the rollout of molecular biology into labs in the 1980s, my group is actively engaged in several funded cyberinfrastructure projects: “CC*Data: National Cyberinfrastructure for Scientific Data Analysis at Scale (SciDAS)” NSF[1659300] (Feltus PI); “RCN: Advancing Research and Education Through a National Network of Campus Research Computing Infrastructures - The CaRC Consortium” NSF[1620695] (Feltus PI – Bottum Former PI); “Exposing the Potential of Information Centric Networks for the Life Sciences” Cisco Research (Feltus PI); “CC* NPEO: Toward the National Research Platform” NSF[826967] (Smarr PI, Feltus End User); “DIBBs: EI: SLATE and the Mobility of Capability” NSF[1724821] (Gardner PI, Feltus End User). Along with many others, I am linking these partnerships to help build larger democratized compute systems and to scale the training so people can actually use them. In addition to the workflow engineering outlined above, we are focusing efforts on these cutting-edge cyberinfrastructure areas: scaling out usage of Kubernetes-based compute systems and moving genomics data from traditional data repositories into information centric network systems.

Courses Taught

Computational Genomics
Biomedical Informatics/Medical Bioinformatics
Next-Generation Sequence Analysis
Special Topics in Advanced Biochemistry and Genetics (Network and Systems Genetics)
Essential Elements of Biochemistry
Issues in Research
Senior Seminar
Perl for Bioinformatics

Selected Publications

For a current list of publications, please visit Google Scholar at


Google Scholar
Genetics & Biochemistry Department
Biomedical Data Science & Informatics Program
Internet2 Board of Trustees