Ye Zheng is an NIH/NHGRI K99/R00 fellow and a tenure-track Assistant Professor in the Bioinformatics and Computational Biology Department of the University of Texas MD Anderson Cancer Center. Dr. Zheng received her postdoctoral training at the Fred Hutchinson Cancer Center in both molecular biology and quantitative modeling, mentored by Dr. Steven Henikoff and Dr. Raphael Gottardo. She has also established close collaborations with Dr. Cameron Turtle and Dr. Evan Newell to decipher variations in response to CAR-T cell immunotherapy. Before her postdoctoral training, Dr. Zheng received a Ph.D. in Statistics from the University of Wisconsin-Madison under the supervision of Dr. Sündüz Keleş; her dissertation centered on statistical modeling of three-dimensional chromatin structure (3D genomics) for promoter-enhancer inference.
At MD Anderson Cancer Center, Dr. Zheng leads a quantitative research group dedicated to statistical modeling and computational pipeline development using bulk and single-cell transcriptomics, proteomics, epigenomics, and 3D genomics data to address biological and clinical challenges. Her wet lab specializes in the epigenomic profiling of Formalin-Fixed, Paraffin-Embedded (FFPE) samples.
The goal of Dr. Zheng’s research group is to solve biologically and clinically important, methodologically challenging problems by developing innovative statistical models. The group is actively hiring at all levels, including but not limited to postdocs, Ph.D. and Master's students, technicians, and undergraduate or graduate interns, and is open to discussion and collaboration. This highly interdisciplinary group looks forward to being inspired and motivated by novel and intriguing problems in other disciplines.
Please send your CV/resume, a brief cover letter describing your relevant experience and motivations, a GitHub link to the repository or any other materials that can best demonstrate your programming skills, and any related research manuscripts/writing samples (if applicable) to yzheng8@mdanderson.org.
Ph.D. in Statistics - Minor in Quantitative Biology, 2019
University of Wisconsin - Madison
B.E. in Statistics, 2014
Renmin University of China
Integrative modeling of bulk and single-cell transcriptomics, epigenomics and proteomics:
Wet lab experimental training:
Chimeric antigen receptor T (CAR-T) Cell Immunotherapy:
Statistical consulting for genomics and biomedical studies:
Dissertation Research:
Collaborative Work with the Bresnick Lab:
Projects:
+: co-first authors, ++: co-corresponding authors
Henikoff S+, Zheng Y+, Paranal R, Xu Y, Greene J, Henikoff J, Russell Z, Szulzewsky F, Thirimanne H, Kugel S, Holland E, Ahmad K. RNA Polymerase II at histone genes predicts outcome in human cancer. Under review at Science. (2024) A previous version is available at: https://www.biorxiv.org/content/10.1101/2024.02.28.582647v3.
Henikoff S, Zheng Y, Ahmad K. Mitotic errors do not explain aneuploidy in cancer. Under review at Trends in Genetics. (2024)
[Book] Savonen et al. Choosing Genomics Tools. (2023) Chapter 19 CUT&RUN and CUT&Tag. Full author list. Online book chapter.
Wu S, Furlan S, Mihalas A, Kaya-Okur H, Feroze H, Emerson S, Zheng Y, Carson K, Cimino P, Keene C, Holland E, Sarthy J, Gottardo R, Ahmad K, Henikoff S, Patel A. [Single-cell CUT&Tag analysis of chromatin modifications in differentiation and tumor progression](https://doi.org/10.1038/s41587-021-00865-z). Nature Biotechnology. 2021.
Zheng Y, Ahmad K, Henikoff S. CUT&Tag Data Processing and Analysis Tutorial. Protocols.io. (2020) https://www.protocols.io/view/cut-amp-tag-data-processing-and-analysis-tutorial-e6nvw93x7gmk/v1. (17,459 views, 4,355 exports, and 239 questions)
Liao R+, Zheng Y+, Liu X, Zhang Y, Seim G, Tanimura N, Wilson G, Hematti P, Coon J, Fan J, Xu J, Keleş S++ and Bresnick E++. Discovering How Heme Controls Genome Function Through Heme-omics. Cell Reports. 2020.
Zeng X, Li B, Welch R, Rojo C, Zheng Y, Dewey CN, Keleş S. Perm-seq: Mapping Protein-DNA Interactions in Segmental Duplication and Highly Repetitive Regions of Genomes with Prior-enhanced Read Mapping. PLoS Computational Biology. 2015.
Zheng Y+, Shen S+, Keleş S. Normalization and de-noising of single-cell Hi-C data with BandNorm and scVI-3D. Accepted by Genome Biology. 2022. (+: co-first authors)
Shen S, Zheng Y++, Keleş S++. scGAD: single-cell gene associating domain scores for exploratory analysis of scHi-C data. Bioinformatics. 2022. (++: co-corresponding authors)
Cheng J, Clayton J, Acemel R, Zheng Y, Taylor R, Keleş S, Harley J, Quail E, Gómez-Skarmeta J and Ulgiati D. Regulatory architecture of the RCA gene cluster captures an intragenic TAD boundary and enhancer elements in B cells. Frontiers in Immunology, section B Cell Biology. 2022.
Zheng Y++, Zhou P, Keleş S++. FreeHi-C Spike-in Simulations for Benchmarking Differential Chromatin Interaction Detection. Methods. 2021.
Huang K, Wu Y, Shin J, Zheng Y, Siahpirani A, Lin Y, Ni Z, Chen J, You J, Keleş S, Wang D, Roy S, Lu Q. Transcriptome-wide transmission disequilibrium analysis identifies novel risk genes for autism spectrum disorder. PLOS Genetics. 2021.
Zheng Y, Keleş S. FreeHi-C simulates high-fidelity Hi-C data for benchmarking and data augmentation. Nature Methods. 2020.
The ENCODE Project Consortium, et al. Expanded Encyclopedias of DNA Elements in the Human and Mouse Genomes. Nature. 2020.
The ENCODE Project Consortium, Snyder, M.P., Gingeras, T.R., Moore, J.E., Weng, Z., Gerstein, M.B., Ren, B., Hardison, R.C., Stamatoyannopoulos, J.A., Graveley, B.R., Feingold, E.A. and Pazin, M.J. Perspectives on ENCODE. Nature. 2020.
Zheng Y, Ay F, Keleş S. Generative modeling of multi-mapping reads with mHi-C advances analysis of Hi-C studies. eLife. 2019.
Fiorenza S, Zheng Y, Purushe J, Bock T, Sarthy J, Janssens D, Sheih A, Kimble E, Kirchmeier D, Phi T, Gauthier J, Hirayama A, Riddell S, Wu Q, Gottardo R, Maloney D, Yang J, Henikoff S, Turtle C. Histone marks identify novel transcription factors that parse CAR-T subset-of-origin, clinical potential and expansion. Accepted at Nature Communications. (2024)
Germanos AA, Arora S, Zheng Y, Goddard ET, Coleman IM, Ku AT, Wilkinson S, Amezquita RA, Zager M, Long A, Yang YC, Bielas J, Gottardo R, Ghajar C, Nelson P, Sowalsky A, Setty M, Hsieh A. Defining cellular population dynamics at single cell resolution during prostate cancer progression. eLife. 2022.
Hirayama AV, Zheng Y, Dowling MR, Sheih A, Phi TD, Kirchmeier DR, Chucka AW, Gauthier J, Maloney DG, Gottardo R, Turtle CJ. Long-Term Follow-up and Single-Cell Multiomics Characteristics of Infusion Products in Patients with Chronic Lymphocytic Leukemia Treated with CD19 CAR-T Cells. Blood. 2021.
Vitanza N, Biery M, Myers C, Ferguson E, Zheng Y, Girard E, Przystal J, Park G, Noll A, Pakiam F, Winter C, Morris S, Sarthy J, Cole B, Leary S, Crane C, Lieberman N, Mueller S, Nazarian J, Gottardo R, Brusniak M, Mhyre A, Olson J. Optimal therapeutic targeting by HDAC inhibition in biopsy-derived treatment-naïve diffuse midline glioma models. Neuro-Oncology. 2021.
Soukup AA, Zheng Y, Mehta C, Liu P, Hofmann I, Zhou Y, Zhang J, Choi K, Johnson KD, Keleş S, Bresnick EH. Single-nucleotide human disease mutation inactivates a blood-regenerative GATA2 enhancer. Journal of Clinical Investigation. 2019.
Tanimura N, Liao R, Wilson GM, Dent MR, Cao M, Burstyn JN, Hematti P, Liu X, Zhang Y, Zheng Y, Keleş S, Xu J, Coon J, Bresnick E. GATA/Heme Multi-omics Reveals a Trace Metal-dependent Cellular Differentiation Mechanism. Developmental Cell. 2018.
ADTnorm: R package for normalization and integration tools for CITE-seq cell surface measurement.
scGAD: R package for extracting three-dimensional chromatin interactions at the unit of genes and facilitating the integration of single-cell 3D genomics with other single-cell modalities.
scVI-3D: Python pipeline for normalization and de-noising of single-cell Hi-C data using deep generative modeling.
BandNorm: R package for fast band normalization of single-cell Hi-C data. (Co-developer)
FreeHiC Spike-In: FreeHi-C python pipeline with a user/data-driven spike-in module to allow a comprehensive comparison of differential chromatin interaction detection methods where the ground truth differential chromatin interactions are known.
FreeHiC: Python pipeline using FRagment Interactions Empirical Estimation method for fast simulation of Hi-C and other 3D proximity ligation sequencing data. Major computing parts are accelerated by C.
mHiC: Python pipeline implementing a multi-mapping strategy for Hi-C data that probabilistically assigns reads originating from repetitive regions. Major computing parts are accelerated by C.
permseq: R package for mapping protein-DNA interactions in highly repetitive regions of the genomes with prior-enhanced read mapping.
permseqExample: R package with illustrations and demo runs for the permseq package. Smaller raw data and demo R scripts are provided for quick runs to help users get to know the permseq package.
Daily Usage
Daily Usage
Optional if excel at R
Reproducible Report
Reproducible Research
Daily Usage
STAT 423/623 - Probability Bioinformatics and Genetics (Spring 2024 Guest Lecturer at Rice University):
Gave a lecture to statistics and biostatistics graduate and undergraduate students about statistical analysis in single-cell genomics.
STAT 877 - Statistical Methods for Molecular Biology (Fall 2020 Guest Lecturer at University of Wisconsin - Madison):
Gave a lecture to statistics and biostatistics graduate students about 3D Genomics and Long-range Gene Regulations.
AMSI BioInfoSummer (Winter 2019 Workshop Lecturer at University of Sydney, Australia)
Gave a workshop to the faculty and students attending the AMSI BioInfoSummer conference about basic concepts of 3D Genomics and how to do computational data processing and statistical modeling in a practical manner. Led interactive computational group work to process real Hi-C data using Google Box.
STAT 998 - Statistical Consulting (Fall 2019 Guest Lecturer at University of Wisconsin - Madison):
Led lectures discussing real-world consulting problems with statistics graduate students, using traditional and modern statistical tools.
STAT 877 - Statistical Methods for Molecular Biology (Spring 2019 Guest Lecturer at University of Wisconsin - Madison):
Gave a lecture to statistics and biostatistics graduate students about 3D Genomics and Long-range Gene Regulations.
2017-2018 Single-cell Technologies Journal Club (Organizer and Instructor at University of Wisconsin - Madison):
Gave lectures on single-cell research topics, such as scRNA-seq, scATAC-seq and scHi-C, to graduate students and postdocs with statistics backgrounds, and led paper review discussions.
2017-2018 Three-dimensional Chromatin Interactions Journal Club (Organizer and Instructor at University of Wisconsin - Madison):
Gave lectures on research topics related to 3D chromatin architecture to graduate students and postdocs with statistics backgrounds, and led paper review discussions.
STAT301 - Introduction to Statistical Methods (Fall 2014 Guest Lecturer for Discussion Sections at University of Wisconsin - Madison):
Led undergraduate discussion sections on solving hypothesis testing and statistical estimation problems.
Long Nguyen, Bioinformatics Analyst I at Fred Hutchinson Cancer Center, currently a Master's student at University of Michigan (Feb. 2022 to July 2024):
Integrative analysis of single-cell transcriptomics and proteomics for cell atlas construction from CAR-T cell therapy CITE-seq data, and association of gene and protein markers with clinical responses.
Siqi Shen, Ph.D. Candidate at UW-Madison (June 2020 to Dec. 2023):
Co-mentored with Dr. Sündüz Keleş on single-cell 3D chromatin organization normalization and integrative analysis with single-cell transcriptomics and epigenomics.
Fanding Zhou, VISP student at UW-Madison, currently a Ph.D. student at UC Berkeley (June 2020 to Sep. 2021):
Co-mentored with Dr. Sündüz Keleş on constructing tree-based statistical models for false discovery rate control in differential detection of 3D chromatin organization.
Olivia Rae Steidl, Summer Undergraduate Student at University of Wisconsin - Madison, currently a Ph.D. student at University of Wisconsin - Madison (Summer 2019):
Co-mentored with Dr. Sündüz Keleş on investigating poly(UG) tails at the ends of RNAs and their function in humans using eCLIP-seq data.
Genome Medicine
Science Advances
Nature Biotechnology
Briefings in Bioinformatics
Scientific Reports
eLife
Bioinformatics
PLOS Computational Biology
BMC Bioinformatics
Life Science Alliance
Annals of Applied Statistics
Computational and Structural Biotechnology Journal
Grant Review Committee: 2024 Fred Hutchinson Cancer Center TDS IRC Postdoctoral Fellowship and TDS IRC Pilot Award
Program Committee: 2024 Regulatory and Systems Genomics Conference with DREAM Challenges (RSGDREAM2024)