This website is a Python-based machine learning platform that predicts the strength of σ70 core promoters in Escherichia coli, offering a cost-effective alternative to tedious experiments.
The PSSM was constructed using Biopython routines (Biopython).
Enter the -10 and -35 regions in the input boxes, and the platform will return the strength of the input promoter relative to the strength of the strongest Anderson promoter. The dynamic model is optional; click "Predict" once the sequences are entered.
The advent of genetic engineering over the last few decades has opened up new avenues for biologists. One of them is expressing proteins in an organism of choice and altering the level of protein expression. Escherichia coli has been the lab workhorse and an ideal model organism for many years, given the ease with which it can be grown in a lab, the extensive characterization of its many strains, and its high level of safety. The sigma 70 promoters of the organism are ubiquitously used by genetic engineers to initiate transcription. Yet their characterization in a lab remains a time-consuming and expensive process.
The model is a multivariate linear regression whose parameters were optimized with gradient descent. The training data set is the Anderson promoter collection, developed and characterized by the Anderson lab at UC Berkeley. This set was chosen because it is widely used by academic researchers and by teams that participate in the iGEM (International Genetically Engineered Machine) competition. Because this collection is so thoroughly characterized, its measurements are robust, so a linear model trained on it can be expected to give robust predictions for other promoters that have not been characterized as extensively as the Anderson promoters.
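As a rough illustration of fitting a linear model by gradient descent, here is a minimal sketch in plain Python. It is not the site's actual code: the function names, the toy data, and the choice of a mean-squared-error cost are assumptions made for illustration, and the default learning rate and iteration count are simply the values quoted on this page.

```python
def fit_linear_gd(X, y, lr=0.015, n_iter=10000):
    """Fit y ~ b + w.x by batch gradient descent on the mean squared error."""
    n_samples = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(n_iter):
        # Residual of the current model on every training example.
        errors = [b + sum(wj * xj for wj, xj in zip(w, x)) - yi
                  for x, yi in zip(X, y)]
        # Gradient of the cost J = (1 / 2n) * sum(errors^2).
        grad_b = sum(errors) / n_samples
        grad_w = [sum(e * x[j] for e, x in zip(errors, X)) / n_samples
                  for j in range(n_features)]
        # Step against the gradient.
        b -= lr * grad_b
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
    return w, b


def predict(w, b, x):
    """Evaluate the fitted linear model on one feature vector."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


# Hypothetical toy data (NOT the Anderson measurements): two features per
# promoter, with targets generated from known coefficients so that the fit
# can be checked against them.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [0.0, 1.0]]
y = [0.5 * x1 + 0.25 * x2 - 2.0 for x1, x2 in X]

w, b = fit_linear_gd(X, y)
print(w, b)
```

On this noiseless toy data the fitted weights should land very close to the generating coefficients (0.5, 0.25) and intercept (-2.0). With real data, the features would typically need to be on comparable scales for a fixed learning rate to converge well.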
The input variables are the -10 and -35 motifs present in the promoters. A Position Specific Scoring Matrix (PSSM) is constructed to capture a generative model of the -10 and -35 promoter regions. The resulting scores were then regressed against the set of relative promoter strengths provided by the Anderson lab, using gradient descent to minimize a cost function. The cost function was minimized by running gradient descent for 10,000 iterations at a learning rate of 0.015. In addition, Leave-One-Out Cross-Validation (LOOCV) was performed on the model, and the R² (coefficient of determination) was calculated to be 0.70, indicating the goodness of fit.
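To make the PSSM step concrete, the sketch below builds a log-odds scoring matrix from a few aligned hexamers and scores a query motif against it. The site reports using Biopython for this step; this sketch uses plain Python instead so that it runs without extra dependencies. The example hexamers, the pseudocount of 1, and the uniform background frequency of 0.25 are assumptions for illustration, not values taken from the site.

```python
import math


def build_pssm(sequences, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds score} dict per motif position."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all motif instances must have the same length")
    pssm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases stay finite.
        counts = {base: pseudocount for base in "ACGT"}
        for seq in sequences:
            counts[seq[pos].upper()] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against the background.
        pssm.append({base: math.log2((count / total) / background)
                     for base, count in counts.items()})
    return pssm


def score(pssm, seq):
    """Sum the per-position log-odds scores of a motif instance."""
    if len(seq) != len(pssm):
        raise ValueError("sequence length must match the PSSM length")
    return sum(column[base] for column, base in zip(pssm, seq.upper()))


# Hypothetical -10 hexamers, used only to illustrate the calculation.
minus10_examples = ["TATAAT", "TATAAT", "TACAAT", "GATAAT", "TATACT"]
pssm10 = build_pssm(minus10_examples)

consensus_score = score(pssm10, "TATAAT")
unrelated_score = score(pssm10, "GGGCCC")
print(consensus_score, unrelated_score)
```

A motif close to the examples scores higher than an unrelated one. In the pipeline described above, one such score for the -35 region and one for the -10 region would serve as the two regression inputs.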
This is the dataset of -35 and -10 sequences, along with their respective strengths, that is used to make the predictions.
| Promoter ID | -35 Sequence | -10 Sequence | PSSM -35 | PSSM -10 | Actual Strength | ln(Actual Strength) |
Ashok Palaniappan is very interested in applying computational thinking to solve difficult biological problems. He has demonstrated expertise in the development of computational methods, notably the use of Fourier spectrum analysis to detect periodicity in evolutionary conservation of protein secondary structure.
He has shown the ability to develop novel approaches to difficult biological problems, notably the identification of stage-specific biomarkers in colon cancer tumorigenesis, progression and metastasis. Using differential ligand affinity and the free energy of binding, he and a co-worker computationally analyzed the role of P-glycoprotein polymorphisms in patient resistance to therapy.
Ashok obtained his PhD from the University of Illinois at Urbana-Champaign, USA (2005). He is Senior Assistant Professor in the School of Chemical and Biotechnology, Sastra University, Thanjavur 613401.
1. A. Palaniappan, E. Jakobsson, Fourier analysis of conservation patterns in protein secondary structure, Comput Struct Biotechnol J 2017, 15, 265-271.
2. A. Palaniappan, K. Ramar, S. Ramalingam, Computational identification of novel stage specific biomarkers in colorectal cancer progression, PLOS ONE 2016, 11(5): e0156665. doi:10.1371/journal.pone.0156665
3. S. Varghese, A. Palaniappan, Computational studies of P-glycoprotein polymorphisms in antiepileptic drug resistance mechanism, bioRxiv 2016, 095059; doi: https://doi.org/10.1101/095059.
Ramit is a final-year undergraduate student studying Biotechnology Engineering at Sri Venkateswara College of Engineering, Anna University. He is looking to carve out a career in synthetic biology, given the exciting possibilities that the field offers. He was an integral part of his college's iGEM team in 2016 and is the leader of his college's 2017 iGEM team.
After learning about the exciting prospects of machine learning algorithms, he intends to apply them to biological data sets to draw meaningful inferences or to build tools that can save biologists time and money in the lab. He believes that a true product of engineering is often one born out of a mixture of different fields, and that combining the prowess of two burgeoning fields of the 21st century, synthetic biology and machine learning, could lead to exciting products and services in the future.
Keshav Aditya is a final-year undergraduate student studying Computer Science Engineering at Sri Venkateswara College of Engineering, Anna University. He is part of his college's 2017 iGEM team. He is very passionate about programming and is looking to become a software developer. Apart from this, he is also an excellent sportsman.
He is a Full-Stack Web Developer and Platform Independent Mobile Application Developer. He has also deployed Machine Learning algorithms for various problems. He's starting to explore new and exciting avenues such as deep learning and looks to implement and work with sophisticated learning algorithms in the future.
Ashok : +91 8056037107
Ramit B : +91 9940149332
Keshav Aditya R.P : +91 7299926896
Ramit B: email@example.com
Keshav Aditya R.P: firstname.lastname@example.org