Yang Li

Publication Date


Document Type

Honors Thesis


Computer Science


Kinematics, Proteins-Structure, Ring networks (Computer networks), Robotics, Computational biology, Inverse kinematics, Protein loop closure


Proteins are the building blocks of all living organisms. To perform their biological functions, proteins undergo a variety of internal motions that switch them from one conformation to another. The study of protein motion is thus essential for understanding how proteins function, and it facilitates pharmaceutical research. Since protein motion cannot be observed under a microscope, researchers need efficient tools to simulate it. KINARI, a protein rigidity analysis software package developed in the Linkage Lab, provides tools for analyzing and exploring the rigidity properties of protein structures. Using the rigidity properties, we envision a more efficient way to model protein motion. An important component of KINARI is the curation of input protein data before rigidity analysis. In the curation phase, additional information such as hydrogen atoms and molecular interactions is computed from the raw structural data to ensure the correct modeling of protein structures. The focus of this research is to investigate and prototype an additional curation feature: fixing gaps in protein structural data. Protein structural data often contains "gaps", where the atomic coordinates were not resolved during the structure determination process. Fixing gaps by computing the missing coordinates is therefore desirable for improving the accuracy of rigidity analysis. This research aims to develop a gap completion tool that (i) detects gaps in protein structures presented in the PDB format and, most importantly, (ii) solves for the missing coordinates using inverse kinematics, a well-studied analytical technique in robotics. For (i), a C++ program was written to detect and classify gaps in PDB files. By running the program on 527 sampled PDB files, we found that 57.3% of the tested structures contain at least one gap, and that shorter gaps are much more common than longer ones.
For (ii), we investigated existing inverse kinematics algorithms in the robotics and computational biology literature and devised our own methods for fixing short gaps of 1-2 missing residues. We also developed a Mathematica prototype that implements our methods and tested it on artificially generated gaps from known proteins. Because of the simplicity of our model, only 50% of the 2-residue gaps in the sample were successfully fixed. Nevertheless, our gap completion tool has demonstrated potential for enhancing input protein structures for rigidity analysis with KINARI. In future work, we will refine the gap-fixing methods and port the Mathematica prototype to a C++ implementation to be incorporated into a future version of KINARI.
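As an illustration of gap detection in step (i), the following C++ sketch flags jumps in residue numbering within a chain. It is not the thesis's program, whose detection and classification rules are not given in this abstract. The names `Gap` and `findGaps` are hypothetical, and the heuristic assumes that residue sequence numbers in ATOM records increase by one along a contiguous chain, which does not hold for every PDB file (insertion codes and author-chosen numbering can break it).

```cpp
#include <cassert>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record: a run of residue numbers missing inside one chain.
struct Gap {
    char chain;       // chain identifier (PDB column 22)
    int lastBefore;   // last resolved residue number before the jump
    int firstAfter;   // first resolved residue number after the jump
    int missing() const { return firstAfter - lastBefore - 1; }
};

// Scan PDB text and report every jump in residue numbering within a chain.
// Assumption: consecutive residues in a chain carry consecutive numbers.
std::vector<Gap> findGaps(std::istream& in) {
    std::vector<Gap> gaps;
    std::string line;
    bool havePrev = false;
    char prevChain = ' ';
    int prevSeq = 0;

    while (std::getline(in, line)) {
        // A TER or ENDMDL record ends the current chain or model.
        if (line.compare(0, 3, "TER") == 0 || line.compare(0, 6, "ENDMDL") == 0) {
            havePrev = false;
            continue;
        }
        // Only ATOM records long enough to hold the residue number are used.
        if (line.size() < 26 || line.compare(0, 6, "ATOM  ") != 0) continue;

        char chain = line[21];                      // column 22
        int seq = 0;
        try {
            seq = std::stoi(line.substr(22, 4));    // columns 23-26
        } catch (const std::exception&) {
            continue;                               // skip malformed numbers
        }

        if (havePrev && chain == prevChain && seq > prevSeq + 1) {
            gaps.push_back(Gap{chain, prevSeq, seq});
        }
        havePrev = true;
        prevChain = chain;
        prevSeq = seq;
    }
    return gaps;
}
```

Calling `findGaps` on an input stream opened on a PDB file returns one `Gap` per numbering jump, and `missing()` gives the gap length that a classification by size would need. A more careful detector would probably also consult the REMARK 465 records, which list missing residues, or check the distance between the backbone atoms on either side of a suspected gap.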




vii, 73, 8, 13 p. : col. ill. Honors Project-Smith College, Northampton, Mass., 2011. Includes bibliographical references (p. 70-73)