Bioinformatics

GridQTL and CloudQTL were part of a 10-year, BBSRC-funded project to run QTL algorithms on cloud and grid systems accessed via a web-based portal.

A paper on the cloud computing used in the above projects can be seen here and is also available from the arXiv repository.

A project detailing bridged DNA microchip data using novel machine learning algorithms can be found here.

Technical Details:

Refactoring and optimising Bayesian statistical code written in Java, Fortran, C++, MATLAB and R, with an emphasis on parallel processing, optimising reads of big (TB-scale) data, and hooking into a web portal for data persistence and access to grid and cloud systems. For fast reads of large (TB-scale) genomic data we researched and employed a host of technologies: Java threading on multiprocessor systems (including GPUs), MapReduce techniques using R (bigmem & ff), and Apache Hadoop.
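
As an illustration of the threaded-read idea described above, here is a minimal Java sketch. It is not the GridQTL/CloudQTL code: the class name ParallelRecordCount, the methods countNewlines and countRecords, and the choice of counting newline-terminated records are all assumptions made for the example. The file is split into byte ranges, each worker thread scans its own range (a map step), and the partial counts are summed (a reduce step).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; not taken from the projects described above.
final class ParallelRecordCount {

    // Map step: count newline bytes in the byte range [start, end) of the file.
    static long countNewlines(Path file, long start, long end) throws IOException {
        long count = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(1 << 16); // 64 KiB read buffer
            long pos = start;
            while (pos < end) {
                buf.clear();
                // Never read past the end of this worker's range.
                buf.limit((int) Math.min(buf.capacity(), end - pos));
                int n = ch.read(buf, pos); // positional read, safe across threads
                if (n <= 0) {
                    break; // end of file
                }
                for (int i = 0; i < n; i++) {
                    if (buf.get(i) == '\n') {
                        count++;
                    }
                }
                pos += n;
            }
        }
        return count;
    }

    // Split the file into one byte range per worker, scan in parallel, then sum.
    static long countRecords(Path file, int workers) throws Exception {
        long size = Files.size(file);
        long chunk = Math.max(1, (size + workers - 1) / workers);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (long start = 0; start < size; start += chunk) {
                final long s = start;
                final long e = Math.min(size, start + chunk);
                Callable<Long> task = () -> countNewlines(file, s, e);
                parts.add(pool.submit(task));
            }
            long total = 0; // reduce step: add up the partial counts
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Paths.get(args[0]); // path to a text file, one record per line
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(countRecords(file, workers) + " newline-terminated records");
    }
}
```

Counting newline bytes keeps the sketch simple because byte ranges can be split anywhere without caring where a line starts. Parsing real genotype records would also need to handle lines that straddle two ranges, and whether threads actually speed up the read depends on the storage underneath.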

The web portal for GridQTL has over 600 users and 150 citations worldwide!