Available Projects in Bioinformatics, Computational Biology, Genetics and Machine Learning

If you are looking for a project in bioinformatics, computational biology, genetics, or machine learning, my group has many projects available.

Projects are available for students at all levels (undergraduate, Master's, or PhD).

Below are a few potential projects:
1. Complex Traits in Inbred Mouse Strains
2. Genetics of Gene Expression
3. Discovering the Genetic Basis of Human Disease
4. Statistical and Algorithmic Aspects of Motif Discovery
5. Regulatory Aspects of Human Disease
6. Webservers for Genetic Research

Contact me (eeskin at cs dot ucla dot edu) if you find any of these interesting and would like to get started.




Complex Traits in Inbred Mouse Strains

Inbred mouse strains are a powerful and well-studied model for human disease and complex traits. A tremendous amount of information is available for various inbred strains, including phenotypic information stored in the Mouse Phenome Database (MPD) and high-throughput genomic data such as expression microarray data. Recently, several high-density SNP maps have also been developed for inbred mouse strains. These resources, combined with what is already known about mouse genetics in terms of quantitative trait loci (QTLs) and known pathways, make inbred mouse strains an ideal model system. This project combines multiple types of data in order to understand the genetic basis of complex traits. Aspects of this project include performing whole-genome association analysis of the mouse SNP maps over the phenotypes in the MPD and augmenting the association results with information from expression data, known pathways, and QTLs. The goal of this project is to discover regions in the mouse genome associated with phenotypes and to verify that many of the predictions are consistent with genes known to influence specific traits.
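As a rough illustration of what a whole-genome association scan over inbred strains can look like, the sketch below ranks SNPs by how strongly they separate a quantitative phenotype. It is a minimal sketch under stated assumptions, not the group's actual method: the function names, the 0/1 allele coding, the choice of Welch's t statistic, and the toy strains and values are all hypothetical, and no real MPD data are used.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples of phenotype values."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    diff = mean(group_a) - mean(group_b)
    if se == 0:
        # Both groups are constant: no evidence if equal, unbounded if not.
        return 0.0 if diff == 0 else float("inf")
    return diff / se


def scan_snps(genotypes, phenotype):
    """Rank SNPs from strongest to weakest association with a trait.

    genotypes: dict mapping snp_id -> {strain: allele}, allele coded 0 or 1
    phenotype: dict mapping strain -> trait value
    (both layouts are assumptions made for this example)
    """
    results = []
    for snp_id, alleles in genotypes.items():
        ref = [phenotype[s] for s, a in alleles.items() if a == 0 and s in phenotype]
        alt = [phenotype[s] for s, a in alleles.items() if a == 1 and s in phenotype]
        if len(ref) < 2 or len(alt) < 2:
            continue  # need at least two strains per allele to estimate variance
        results.append((snp_id, abs(welch_t(ref, alt))))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Toy data: six hypothetical strains and two hypothetical SNPs.
phenotype = {"s1": 10.0, "s2": 11.0, "s3": 12.0, "s4": 20.0, "s5": 21.0, "s6": 22.0}
genotypes = {
    "snp_a": {"s1": 0, "s2": 0, "s3": 0, "s4": 1, "s5": 1, "s6": 1},
    "snp_b": {"s1": 0, "s2": 1, "s3": 0, "s4": 1, "s5": 0, "s6": 1},
}
ranked = scan_snps(genotypes, phenotype)
for snp_id, score in ranked:
    print(f"{snp_id}: |t| = {score:.2f}")
```

A real analysis would also need to account for the shared ancestry of inbred strains and for testing many SNPs at once, both of which this sketch ignores.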




Genetics of Gene Expression

Many recent studies have demonstrated that genetic variation influences gene expression, that is, which genes are active in any given cell. These gene expression changes may have implications for human disease. Using information on both genetic variation and gene expression from different strains of model organisms, this project will attempt to understand how genetic variation affects gene expression.




Discovering the Genetic Basis of Human Disease through Association Studies

Humans differ by about 0.1% of their genomes. Within this small amount of variation is encoded our genetic predisposition to diseases such as hypertension. By examining populations of diseased and healthy individuals and their variation in genes known to be factors in the disease, we can identify which specific variants are associated with the disease. This type of analysis is called an association study. The goal of designing association studies is to maximise the probability of detecting variation involved in disease while minimising the cost of the study. This project focuses on the design of efficient association studies and involves methodological challenges that are both statistical and algorithmic in nature.
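To make the idea of an association study concrete, the sketch below tests a single variant by comparing allele counts in cases and controls with a Pearson chi-square test on a 2x2 table. This is one common textbook test, shown as an assumption-laden example rather than the method used in this project; the function name and the counts are hypothetical.

```python
from math import erfc, sqrt


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column total must be non-zero")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper tail of the chi-square
    # distribution equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 100 case alleles and 100 control alleles.
chi2, p_value = allelic_chi_square(case_alt=60, case_ref=40,
                                   control_alt=40, control_ref=60)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

The design question described above then becomes how many individuals and which variants to genotype so that a test like this has a good chance of detecting a true effect at an acceptable cost.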




Predicting the Effect of Variation on Molecular Function

Any two humans differ by approximately 0.1% of their genomes. However, only an even smaller fraction of this variation has any biological function. This project attempts to identify which human variation has a molecular function. By identifying this variation, we can reduce the set of variants that are candidates for involvement in genetic diseases. This project will develop techniques to predict the effect of a variant on a gene, such as changing the structure of the protein product or affecting the gene's regulation. This project involves using probabilistic modelling and comparative genomics techniques.



Regulatory Aspects of Human Disease

Complex diseases have many genetic factors that influence the likelihood of developing the disease. Many of these genetic factors are single nucleotide polymorphisms (SNPs) that occur in the regulatory region or promoter of genes that are known to be implicated in the disease. This project attempts to model the human promoter and understand how such SNPs affect the functioning of the promoter. This project builds on several recent works on the modelling of promoters.


Webservers for Genetic Research

A challenge in genetic research is the need to integrate large amounts of different types of genomic data from a variety of sources. This project develops visualisation and integration tools that genetic researchers can use to perform their analyses.