[Repost] Easy Modeller Introduction

2010-10-25 09:02 · pttianya

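Foreword: Modeller is a software package for protein structure prediction: 1) it can be installed on Windows, Mac and Linux; 2) it can build a model from multiple templates; 3) it ships with its own tools for optimizing and analyzing the model once it has been built. Crucially, it is completely free and widely accepted; the latest version at the time of writing is Modeller 9v7. However, the software is entirely command-line driven and relatively complicated to operate, which is inconvenient for those of us used to a graphical user interface (GUI). Kuntal Kumar Bhusan, a talented developer at the University of Hyderabad in India, wrote a GUI for it called Easy Modeller, which makes all of this very simple. The introduction below is quoted from version 1.0 of that software. (The latest version of the software is currently 2.0, which supports every version of Modeller on Windows.)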

Easy Modeller v1.0: A GUI to MODELLER

Developed by: Kuntal Kumar Bhusan

Contact: kuntal.bhusan@gmail.com

Prof. Reddanna Eicosanoids, Inflammation and Cancer Research Group

Department of Animal Sciences, School of Life Sciences, University of Hyderabad

1. Introduction

One of the biggest goals in structural bioinformatics is the prediction of the three-dimensional structure of a protein from its one-dimensional amino acid sequence, that is, determining the shape (known as a fold) that a given sequence will adopt. The problem is further divided according to whether the sequence adopts a new fold or resembles an existing fold (template) in a protein structure database.

Fold recognition is easy when the sequence in question has a high degree of sequence similarity to a sequence of known structure [7]. If the two sequences share evolutionary ancestry, they are said to be homologous. For such sequence pairs we can build a structure for the query protein by using the structure of the known homologous sequence as a template. This is known as comparative modeling. When the query lacks a good template structure, one must attempt to build the protein tertiary structure from scratch; these methods are usually called ab initio methods. In a third fold-prediction scenario, there may be no good sequence similarity to any known structure, yet a structural template may still exist for the given sequence. To clarify this case: someone who already knew the target structure could find the template by structure-structure alignment of the target against the entire structural database. It is important to note that here the target and template need not be homologous. These two cases define the fold prediction (homologous) and fold prediction (analogous) problems in the CASP competition.

Comparative modeling, or homology modeling, is used when there is a clear relationship between the sequence of a query protein (unknown structure) and the sequence of a protein of known structure. The most basic approach to structure prediction for such query proteins is to perform a pairwise sequence alignment against each sequence in protein sequence databases. This can be accomplished using sequence alignment algorithms such as Smith-Waterman [55] or sequence search algorithms (e.g., BLAST [3]). With a good sequence alignment in hand, the challenge in comparative modeling becomes how best to build a three-dimensional structure for the query protein from the template structure. The heart of the process is the selection of a suitable structural template based on pairwise sequence similarity. This is followed by the alignment of the query sequence to the selected template structure to build the backbone of the query protein. Finally, the entire modeled structure is refined by loop construction and side-chain modeling. Several comparative modeling methods, more commonly known as modeler programs, focusing on various parts of the problem have been developed over the past several years [6, 13].
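
As a rough illustration of the pairwise alignment step mentioned above, here is a minimal Smith-Waterman sketch in Python. The scoring scheme (match +2, mismatch -1, linear gap -2) and the example sequences are assumptions chosen for brevity; practical tools typically use substitution matrices such as BLOSUM62 together with affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned part of a, aligned part of b).

    Simplified scoring (assumed for illustration): fixed match/mismatch
    scores and a linear gap penalty.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; a local alignment never drops below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Example with two short, made-up sequences.
score, top, bottom = smith_waterman("HEAGAWGHEE", "PAWHEAE")
print(score)
print(top)
print(bottom)
```

In a comparative modeling workflow, the template would be the database sequence of known structure whose alignment to the query scores best.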

2. What is MODELLER?

MODELLER is a computer program that models three-dimensional structures of proteins and their assemblies by satisfaction of spatial restraints.

More generally, the inputs to the program are restraints on the spatial structure of the amino acid sequence(s) and ligands to be modeled. The output is a 3D structure that satisfies these restraints as well as possible. Restraints can in principle be derived from a number of different sources. These include related protein structures (comparative modeling), NMR experiments (NMR refinement), rules of secondary structure packing (combinatorial modeling), cross-linking experiments, fluorescence spectroscopy, image reconstruction in electron microscopy, site-directed mutagenesis, intuition, residue-residue and atom-atom potentials of mean force, etc. The restraints can operate on distances, angles, dihedral angles, pairs of dihedral angles and some other spatial features defined by atoms or pseudo atoms. Presently, MODELLER automatically derives the restraints only from the known related structures and their alignment with the target sequence.

A 3D model is obtained by optimization of a molecular probability density function (pdf). The molecular pdf for comparative modeling is optimized with the variable target function procedure in Cartesian space that employs methods of conjugate gradients and molecular dynamics with simulated annealing.
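
Written as a formula, and assuming for simplicity that the individual restraints are treated as independent, the molecular pdf and the objective function derived from it might be sketched as follows. The symbols $P$, $p_i$, $f_i$ and $F$ are notation introduced here for illustration and do not come from the text above.

```latex
P(\mathbf{x}) = \prod_i p_i\bigl(f_i(\mathbf{x})\bigr),
\qquad
F(\mathbf{x}) = -\ln P(\mathbf{x}) = -\sum_i \ln p_i\bigl(f_i(\mathbf{x})\bigr)
```

Here $\mathbf{x}$ stands for the Cartesian coordinates of the model and $f_i$ for a restrained feature such as a distance or dihedral angle. Maximizing $P$ is the same as minimizing $F$. For example, a Gaussian restraint on a distance $d$ with mean $\mu$ and standard deviation $\sigma$ contributes the term $(d-\mu)^2 / (2\sigma^2)$ plus a constant.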

MODELLER can also perform multiple comparisons of protein sequences and/or structures, clustering of proteins, and searching of sequence databases. The program is used with a scripting language and does not include any graphics. It is written in standard FORTRAN 90 and will run on UNIX, Windows, or Mac computers.

3. Method for comparative protein structure modeling by MODELLER

MODELLER implements an automated approach to comparative protein structure modeling by satisfaction of spatial restraints [6]. Briefly, the core modeling procedure begins with an alignment of the sequence to be modeled (target) with related known 3D structures (templates). This alignment is usually the input to the program. The output is a 3D model for the target sequence containing all mainchain and sidechain non-hydrogen atoms. Given an alignment, the model is obtained without any user intervention.

First, many distance and dihedral angle restraints on the target sequence are calculated from its alignment with template 3D structures. The form of these restraints was obtained from a statistical analysis of the relationships between many pairs of homologous structures. This analysis relied on a database of 105 family alignments that included 416 proteins with known 3D structure [7]. By scanning the database, tables quantifying various correlations were obtained, such as the correlations between two equivalent Cα-Cα distances, or between equivalent mainchain dihedral angles from two related proteins. These relationships were expressed as conditional probability density functions (pdfs) and can be used directly as spatial restraints. For example, probabilities for different values of the mainchain dihedral angles are calculated from the type of residue considered, from the mainchain conformation of an equivalent residue, and from the sequence similarity between the two proteins. Another example is the pdf for a certain Cα-Cα distance given equivalent distances in two related protein structures. An important feature of the method is that the spatial restraints are obtained empirically, from a database of protein structure alignments.

Next, the spatial restraints and CHARMM energy terms enforcing proper stereochemistry [8] are combined into an objective function. Finally, the model is obtained by optimizing the objective function in Cartesian space. The optimization is carried out with the variable target function method [9], employing conjugate gradients and molecular dynamics with simulated annealing. Several slightly different models can be calculated by varying the initial structure. The variability among these models can be used to estimate the errors in the corresponding regions of the fold. There are additional specialized modeling protocols, such as that for the modeling of loops.
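
The overall flow (derive restraints from a template, combine them into an objective function, optimize in Cartesian space from several starting structures, then compare the resulting models) can be sketched with a toy example. This is not MODELLER's implementation: the five-atom template, the single Gaussian width `SIGMA`, and all function names are hypothetical, there are no CHARMM terms, and plain gradient descent with step halving stands in for the variable target function method with conjugate gradients and simulated annealing.

```python
import math
import random
import statistics

# Hypothetical template: five C-alpha positions in angstroms, invented for illustration.
TEMPLATE = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (5.5, 3.4, 0.0),
    (9.1, 4.2, 1.5),
    (10.4, 7.6, 2.8),
]
SIGMA = 1.0  # assumed width of every Gaussian distance restraint


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def derive_restraints(template):
    """One (i, j, mean distance) restraint per atom pair, copied from the template."""
    n = len(template)
    return [(i, j, distance(template[i], template[j]))
            for i in range(n) for j in range(i + 1, n)]


def objective(coords, restraints):
    """Sum of -ln(Gaussian pdf) terms, dropping the constant normalisation."""
    return sum((distance(coords[i], coords[j]) - mu) ** 2 / (2 * SIGMA ** 2)
               for i, j, mu in restraints)


def gradient(coords, restraints):
    """Derivative of the objective with respect to every Cartesian coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i, j, mu in restraints:
        d = max(distance(coords[i], coords[j]), 1e-9)  # guard against division by zero
        scale = (d - mu) / (SIGMA ** 2 * d)
        for k in range(3):
            delta = scale * (coords[i][k] - coords[j][k])
            grad[i][k] += delta
            grad[j][k] -= delta
    return grad


def minimize(coords, restraints, iterations=500, step=0.1):
    """Gradient descent that accepts a move only if the objective drops."""
    coords = [list(p) for p in coords]
    value = objective(coords, restraints)
    for _ in range(iterations):
        grad = gradient(coords, restraints)
        trial = [[c - step * g for c, g in zip(p, gp)]
                 for p, gp in zip(coords, grad)]
        trial_value = objective(trial, restraints)
        if trial_value < value:
            coords, value = trial, trial_value
        else:
            step *= 0.5  # overshoot: retry with a smaller step
    return coords, value


def build_models(restraints, n_atoms, n_models=3, seed=0):
    """Optimise from several random starting structures."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        start = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(n_atoms)]
        models.append(minimize(start, restraints))
    return models


def distance_spread(models, restraints):
    """Standard deviation of each restrained distance across the models."""
    return [statistics.pstdev([distance(m[i], m[j]) for m, _ in models])
            for i, j, _ in restraints]


restraints = derive_restraints(TEMPLATE)
models = build_models(restraints, len(TEMPLATE))
for index, (_, value) in enumerate(models, start=1):
    print(f"model {index}: objective = {value:.4f}")
print("largest distance spread:", round(max(distance_spread(models, restraints)), 4))
```

Comparing interatomic distances rather than raw coordinates avoids having to superimpose the models first; atom pairs whose distances vary most between models correspond to the regions that the restraints determine least well.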

Original link: https://blog.sciencenet.cn/home.php?mod=space&uid=260508&do=blog&id=376809
