At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our 39,000 employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Competency Summary

We are seeking a Genomics AI Engineer to design, build, and deploy machine learning models and AI-powered solutions that accelerate genomics-driven drug discovery and development. This role sits at the intersection of computational biology, AI engineering, and translational science, supporting the integration of diverse multi-omics datasets with clinical and therapeutic data. The ideal candidate combines strong AI engineering skills with a working knowledge of genomics, functional assays, and therapeutics to enable high-impact drug discovery and development.

Key Objectives/Deliverables

- Design, build, and maintain production-grade machine learning pipelines for training, evaluating, and serving AI models on genomic, transcriptomic, and epigenomic data to support drug discovery and target identification.
- Understand and work with AAV vector biology data in the context of gene therapy design and regulatory element characterization.
- Engineer scalable data ingestion and feature engineering pipelines to prepare heterogeneous multi-omics datasets (e.g., GWAS, RNA-seq, eQTL/pQTL, PheWAS) and clinical data for model training and inference.
- Integrate external summary-level data sources, including:
  - GWAS, PWAS, and TWAS summary statistics
  - Phenome-wide association study (PheWAS) datasets
  - Molecular QTL data (eQTL, pQTL, cis-QTLs, and related modalities)
- Build and integrate AI/ML solutions leveraging:
  - AI frameworks over genomics knowledge bases and literature
  - Deep learning models for sequence analysis, variant effect prediction, gene expression modeling, and biomedical NLP
- Partner with computational biologists, biostatisticians, and translational scientists to translate research questions into robust data solutions.

Minimum Position Qualifications

- Bachelor’s or Master’s degree in Bioinformatics, Computational Biology, Computer Science, Data Engineering, or a related quantitative field. Ph.D. preferred.
- Solid understanding of genomics fundamentals: DNA/RNA sequences, genetic variants, regulatory elements, and gene expression.
- Experience working with GWAS, molecular QTL (eQTL/pQTL), PheWAS, TWAS, or PWAS datasets.
- Exposure to therapeutic modality data: siRNA, ASO, mRNA therapeutics, or CRISPR-based approaches.
- Solid understanding of AI/ML fundamentals: supervised and unsupervised learning, deep learning architectures (transformers, GNNs, diffusion models), and model optimization techniques (LoRA, RLHF).
- Strong proficiency in Python for data processing; familiarity with SQL and cloud-based data platforms (AWS, Databricks, or Azure).