Bioinformatics Resources and Expertise
UTHSC has deep expertise in bioinformatics and many resources. However, expertise is distributed across centers, departments, and research groups. It can be hard to know whom to ask for advice or help. Do you ask William Taylor and colleagues in the Molecular Resource Center (MRC), bioinformaticists in the Center for Integrative and Translational Genomics (CITG), or members of the Department of Microbiology, Immunology and Biochemistry?
To make it easier for you to find the right resources for specific types of bioinformatics challenges, we have listed some of the people and initial points of contact. These investigators may have additional ideas. If there are other use cases you would like us to add here, please send Rob Williams and Terry Mark-Major a request.
This summary of resources does not include equipment for molecular and genomics/proteomics/metabolomics/metagenomics analysis. However, many of the people on this list may be able to point you in the right direction. Don’t assume that, because you have not heard of a local expert, one is not right around the corner. There is a great deal of world-class expertise in Memphis.
- Array data analysis. USE Case: The MRC has generated array data for 24 of your samples using an Affymetrix or Illumina array (human, mouse, rat). Now you want to analyze these data and generate a ranked list of transcripts that are affected by a treatment. CONTACTS: Robert W. Williams, Weikuan Gu, and Quynh Tran. Rob, Weikuan, or Quynh can help with the first stage of the analysis—generating a ranked list of transcripts/genes. We can also help with normalization, batch correction, and computing false discovery rates. Once you have a ranked list of genes you may want to perform pathway analysis (more below).
- RNA-seq, ChIP-seq, and DNA sequence analysis. USE Case: The MRC has generated FASTQ files for 12 samples using the ABI 5500XL or the Ion Proton system. You have just received notification that your sequence FASTQ files are available to you on an MRC server. Now you want to download copies of these files and align the sequences. Who do you ask for help? The Center for Integrative and Translational Genomics is the right first contact. Please contact our Systems Administrator and bioinformaticist, Mr. Lei Yan, or Quynh Tran in Preventive Medicine. Lei Yan will coordinate with you and your team to copy files to the CITG computer servers and file systems, all of which are housed in the UTHSC Computer Center (Lamar Alexander Building). Please also cc Rob Williams and Ashutosh Pandey. The CITG team or Quynh will help you convert FASTQ files to BAM or SAM files or to RPKM/FPKM files, generate a list of significant genes, and perform pathway analysis.
- Large data set storage. USE Case: You have generated 1–5 TB of proteomics data and need a long-term home for these data with good backup and security. Who do you ask? As in the previous case, please contact our Systems Administrator and bioinformaticist, Mr. Lei Yan. Lei Yan will coordinate with you and your team to copy files to the CITG computer systems, all of which are housed in the UT Computer Center (Lamar Alexander Building). If you need more capacity, we may need to add more space. Jan van der Aa, the Chief Information Officer at UTHSC, is another great contact and can provide solutions for even larger data sets.
- Access to computational resources. USE Case: You have FASTQ files and want to perform the sequence alignment yourself, but you would like to use UTHSC computer clusters. What do we have that you can use? The CITG maintains several clusters that you can use. You should be able to access at least 96 cores on our 184-core HERA system for several days without charge. Please contact our Systems Administrator and bioinformaticist, Mr. Lei Yan, and copy your email request to Rob Williams. Lei Yan will coordinate with you and provide you with the right privileges and credentials to use the systems. You can check the status of the HERA cores.
- Genetics analysis. USE Case: You are thinking about taking advantage of the mouse or rat populations and genetic resources at UTHSC and want to know what types of animal models and data are already available. Who do you contact? Please contact Lu Lu, Terry Mark-Major, or Megan Mulligan. The CITG provides “starter kits” in the form of 60 mice to UT investigators. Please visit this page for the application form. This program has been in place since 2009 and is still going strong. If you want to know how to perform the genetic analysis, please contact Lu Lu, Megan Mulligan, or Rob Williams.
Human genetic analysis: If you have a clinical cohort and need help designing or analyzing an experiment or gaining access to human genetic data, then please contact Khyobeni Mozhui or Rob Williams. Beni and Rob can help you gain access to dbGaP data at NCBI. Lawrence T. Reiter and Mark Ledoux in the Department of Neurology and Alessandro Iannaccone in Ophthalmology also have great expertise in human genetic analysis. Mary Relling’s group at St Jude Children’s Hospital has unrivaled expertise in pharmacogenetic analysis of clinical cohorts.
- Database development for bioinformatics. USE Case: You would like to develop a database system for your lab or research team that can handle the workflow for the next 5 years. Who do you ask for help setting up a system that is more powerful than a mess of Excel spreadsheets? You have several options. For human clinical cohorts, we advise you to contact Ian M. Brooks, Director of Biomedical Informatics. He and his team will have many web-based options for you, including REDCap databases and our own system, Slim-Prim. If you are in the Department of Pediatrics, then you should also talk with Dr. Tee. He helped develop Slim-Prim and is an expert in advanced image databases (his Pivot system). If you just want a simple commercial solution that you can use in your lab for non-clinical samples and tracking experiments, then contact Rob Williams at the CITG.
- Pathway and network analysis. USE Case: You have a long list of transcripts, genes, or proteins that are modulated by treatment or by a genetic difference and you want to explore and analyze the biology of this list. Who can you turn to for ideas and help? Your best first points of contact, in alphabetical order, are:
- Hao Chen in the Department of Pharmacology is an expert in network analysis and next-gen sequence alignment. He has built a powerful web service called Chilibot that can link genes to specific functions.
- Yan Cui was Director of Bioinformatics in the Center for Genomics and Bioinformatics and now holds the same position in the CITG. Yan and his team are experts in network modeling, with a focus on Bayesian networks (see his powerful Bayesian Network Webserver). Yan also develops bioinformatic resources for the analysis of polymorphisms and somatic mutations in microRNAs and their target sites (see his team’s web site).
- Ivan Gerling has expertise using the Ingenuity Pathway Analysis suite (and a license too). This is a web-based system in which users upload lists of genes to answer questions about how they are related to each other and whether or not the list is over-represented with genes of specific types (ontology) and specific pathways. It can also highlight which molecules (transcription factors, drugs, etc.) are common regulators for genes on the lists.
- Weikuan Gu has resources and tools for the analysis of gene lists, with a focus on microarray-based studies.
- Ramin Homayouni and colleagues at the University of Memphis have great tools that combine lists of genes and proteins with biological function (Ramin was in Neurology at UTHSC for many years). Ramin and colleagues have built tools that combine gene lists with PubMed literature searches in a sophisticated way. See his company ComputableGenomix, one big UT bioinformatics success story.
- David R. Nelson in the Department of Microbiology, Immunology, and Biochemistry is an expert in sequence analysis of gene families. He is the world authority on the P450 family of enzymes, but he may be able to provide help and pointers for other families of genes.
- Rob Williams, Lu Lu, and Megan Mulligan have tools for pathway and network analysis that are usually based on gene expression data in human, mouse, rat, and Drosophila. Their tools are available at GeneNetwork.
- Igor Zhulin (Jouline) at UTK and ORNL has terrific resources for the analysis of protein structure and function. Check out his impressive www.SeqDepot.net system and resource. Igor is happy to help colleagues at UTHSC.
- Proteomics and protein analysis. USE Case: You have some proteomics data and you need help in the interpretation and analysis of these complex data. Who should you talk to? First, of course, work with your collaborators or the core that generated the data sets. But if you want advice at an early stage, you should talk with Dominic Desiderio, Sarka Beranova, or Francesco Giorgianni. They are world authorities on proteomic data analysis and should be able to point you in the right direction. Catherine Kaczorowski in the Department of Anatomy and Neurobiology is an expert in membrane-bound proteomes, with a focus on the analysis of brain tissues.
- Boilerplate text for resources on my grant application. USE Case: You would like to insert text on the bioinformatics resources that you will have available for your grant application. Who has boilerplate of the right type that you can use as a starting point? Please contact Rob Williams for a summary of computational hardware you can depend on once your work is funded.
Here is a late 2013 vintage summary of resources. Resources will only get better. Check with Rob or Bill Taylor for updated specifics.
The Center for Integrative and Translational Genomics has Linux servers and Dell PowerVault RAID storage systems (n = 6) for genomic and bioinformatic analysis that are maintained in the main UTHSC machine room in the Lamar Alexander Building. This facility contains three 96U racks. Each rack has a mix of Dell PowerEdge servers (from low-end R610s to high-performance R620 and R815 systems) and dedicated Cisco switches. We also use three additional PowerVault systems (currently configured in RAID5) in this facility for storage associated with DNA sequence, ChIP-seq, RNA-seq, and proteomics data sets.
The main bioinformatic resources we use are provided by the Center for Integrative and Translational Genomics (CITG) and the UT Molecular Resource Center (MRC). The MRC currently has five next-generation sequencers (two ABI 5500XLs, one with the WildFire upgrade; two Ion Torrents; and one Ion Proton) and ancillary equipment for RNA and DNA analysis (e.g., Agilent Bioanalyzers, robots, Fluidigm BioMark HD). MRC facilities are funded by the State of Tennessee. The CITG provides computational support from the initial stages of data analysis (e.g., sequence alignment, as well as storage and backup) and general support for servers and databases. The CITG maintains Linux clusters and mirrors of Galaxy, GeneNetwork, Bayesian Web Server, UCSC Genome Browser, and Sanger's Look-Seq. Both the MRC and CITG have a budget to upgrade equipment and storage.
- Hardcore statistics help. USE Case: You are about to design a complicated experiment involving several treatments, both sexes, multiple ages, multiple genotypes, and multiple batches. Who can help you with the complex ANOVA design and analysis? Who can help you with time-series data or principal component analysis? Most of these more sophisticated statistical methods require a devoted statistician, often someone who has worked on large genomics data sets. The Department of Preventive Medicine has several card-carrying biostatisticians who can help. To ask for help, go to Preventive Medicine's Statistics & Methodology Consulting page and complete the online form. Faculty from Preventive Medicine will then contact you. If you want someone with experience analyzing arrays and RNA-seq data, then talk with Rob Williams, Weikuan Gu, Hao Chen, Yan Cui, Lu Lu, or Megan Mulligan. They may be able to point you to the right expert or collaborator.
- Information security. USE Case: Your data include Protected Health Information consisting of any of the 18 types of identifiers for an individual or for the individual's employer or family member. This information could be used, either alone or in combination with other information, to identify an individual. Who can help you ensure that the data are stored safely and properly secured? If you have questions, contact UTHSC Information Security and the Office of Institutional Compliance (Frank Davison) for assistance.
- Learning more about bioinformatics and related fields at UTHSC. USE Case: You or your colleagues and students want to learn more about opportunities to carry out bioinformatic studies. What special interest groups and meetings are open to you?
- Join BIG. BIG is an acronym for the Biomedical Informatics Group, which is organized by Teresa Waters and funded by the College of Medicine. Free lunch, last Friday of every month in the fifth floor conference room, 920 Madison. Speakers and participants have a very wide range of expertise, from genomics to health systems economics.
- Join the Center for Integrative and Translational Genomics mailing list. Just send an email to Terry Mark-Major. The CITG organizes more than 8 seminars per year on special topics in genomics and bioinformatics. We also host technology seminars by major equipment vendors (Illumina, Affymetrix, ABI, Fluidigm, etc.).
- Attend the 14th Annual UT-KIRN Bioinformatics Summit. This year the meeting is at Lake Barkley Resort in Kentucky (April 11 to 13). These are inexpensive but great meetings with speakers who are luminaries (Stuart Kauffman this year).
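A worked illustration for the array data analysis use case above, which mentions generating a ranked list of transcripts and computing false discovery rates. The sketch below is not the pipeline used by the CITG or MRC; it assumes you already have one raw p-value per transcript and applies the standard Benjamini-Hochberg adjustment. The function names, transcript IDs, and p-values are all hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_transcripts(results, fdr=0.05):
    """Rank transcripts by raw p-value and keep those with q-value <= fdr.

    results: dict mapping a transcript ID to its raw p-value.
    Returns a list of (transcript_id, p_value, q_value) tuples.
    """
    ids = list(results)
    p = [results[t] for t in ids]
    q = benjamini_hochberg(p)
    table = sorted(zip(ids, p, q), key=lambda row: row[1])
    return [row for row in table if row[2] <= fdr]


# Hypothetical p-values for five transcripts (illustration only).
example = {"Tx_A": 0.001, "Tx_B": 0.008, "Tx_C": 0.039, "Tx_D": 0.041, "Tx_E": 0.60}
ranked = rank_transcripts(example, fdr=0.05)
for transcript_id, p_value, q_value in ranked:
    print(transcript_id, p_value, q_value)
```

In practice the p-values would come from a per-transcript test after normalization and batch correction, and the contacts listed for array data analysis can advise on which test and threshold suit your design.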