Equilibrium Dynamics of Proteins

I was a PhD student in Marina Guenza's lab at the University of Oregon (UO) from 2015 to 2021, following my graduation from Centre College with a bachelor's degree in chemical physics. While at the UO, my research focused mostly on extending a method previously developed in the group, the Langevin Equation for Protein Dynamics (LE4PD), for analyzing the equilibrium dynamics of proteins under physiological conditions. A large part of my PhD went into developing an anisotropic version of the LE4PD that, under specific conditions, is analytically mappable onto the essential dynamics method, otherwise known as principal component analysis (PCA) on the alpha-carbon degrees of freedom of the protein. I focused specifically on the equilibrium dynamics of ubiquitin, a small (76-residue) protein whose primary sequence is highly conserved evolutionarily. Ubiquitin functions in the cell as a post-translational modification that, depending on how it is covalently attached to the target protein, can exert various kinds of regulation, the best known of which is marking the targeted protein for degradation by the proteasome.

Fundamentally, the LE4PD is a modified Rouse-Zimm approach to modeling protein dynamics that accounts for residue-specific friction coefficients, long-ranged hydrodynamic interactions between residues in the protein, and a protein-specific potential of mean force between residues that is built using information from atomistic molecular dynamics simulations. The coupled equations of motion for the coarse-grained sites (the alpha-carbons) in real space are subjected to a linear transformation to yield a set of uncoupled equations of motion in a mode space of the same dimensionality. Most of the focus is typically on the first ten or so modes, as they are the most slowly decaying modes in the system. However, all the modes must be taken into account to describe faithfully quantities such as time correlation functions, where the dynamics are mapped back from the mode space to the real space. Even so, analyzing the dynamics in the mode space is critical for interpreting the functional behavior of the protein.
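To make the idea of the mode transformation concrete, here is a minimal sketch using a plain Rouse chain, not the LE4PD itself. It assumes identical beads and springs and leaves out the hydrodynamic interactions, residue-specific friction, and simulation-derived potential of mean force that the LE4PD includes. The function names `rouse_matrix` and `normal_modes` are made up for this illustration.

```python
import numpy as np

def rouse_matrix(n_beads):
    """Connectivity matrix of a linear chain of beads joined by identical springs."""
    a = np.zeros((n_beads, n_beads))
    for i in range(n_beads - 1):
        a[i, i] += 1.0
        a[i + 1, i + 1] += 1.0
        a[i, i + 1] -= 1.0
        a[i + 1, i] -= 1.0
    return a

def normal_modes(a):
    """Diagonalize a symmetric coupling matrix; eigenvalues are returned in ascending order."""
    eigvals, eigvecs = np.linalg.eigh(a)
    return eigvals, eigvecs

n_beads = 76  # one bead per alpha-carbon of ubiquitin
eigvals, eigvecs = normal_modes(rouse_matrix(n_beads))

# The eigenvector matrix is the linear transformation that uncouples the equations:
# in the mode basis the coupling matrix is diagonal.
diagonalized = eigvecs.T @ rouse_matrix(n_beads) @ eigvecs

# Mode 0 has a zero eigenvalue (uniform translation of the whole chain).
# For the internal modes the relaxation time scales as 1 / eigenvalue,
# so the lowest-index internal modes decay most slowly.
relative_tau = 1.0 / eigvals[1:]
print(relative_tau[:10] / relative_tau[0])
```

In this toy model the number of modes equals the number of beads, and the relaxation times fall off quickly with mode index, which is why attention naturally goes to the first handful of modes.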

In the mode space, as well as in the real space, the equation of motion is an overdamped Langevin equation, where the force acting on each mode or bead is dictated by a combination of the hydrodynamics and the correlations between the bond vectors defined between adjacent beads in the protein. The interactions between the beads are modelled as entropic springs, and the beads are also subject to random impacts from the solvent, which enter through the random noise term in the Langevin equation and obey the associated fluctuation-dissipation theorem.
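For orientation, the generic textbook form of an overdamped Langevin equation for a single harmonic mode is sketched below. The symbols (mode coordinate $\xi_a$, friction $\zeta_a$, spring constant $k_a$, random force $F_a$) are chosen for this illustration and are not necessarily the notation used in the LE4PD papers.

```latex
\zeta_a \frac{d\xi_a(t)}{dt} = -k_a\,\xi_a(t) + F_a(t), \qquad
\langle F_a(t) \rangle = 0, \qquad
\langle F_a(t)\,F_a(t') \rangle = 2\,\zeta_a k_B T\,\delta(t - t')
```

The second moment of the random force is the fluctuation-dissipation theorem. For such a mode the time correlation function decays as a single exponential:

```latex
\langle \xi_a(t)\,\xi_a(0) \rangle = \frac{k_B T}{k_a}\, e^{-t/\tau_a}, \qquad
\tau_a = \frac{\zeta_a}{k_a}
```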

However, since the model is coarse-grained at the alpha-carbon level and the springs between beads are assumed to be harmonic, we must account for free-energy barriers along the mode coordinates in some manner if we want to capture correctly the timescales of the important dynamics of the protein. To account for barriers, we take inspiration from Zwanzig's model of diffusion in a rough potential: we calculate the relevant free-energy barrier along each mode and use it to renormalize the diffusion coefficient along that mode. Previously, the barriers were found by approximating a characteristic roughness of the free-energy surface. I took a slightly more precise approach by approximating the slowest dynamics on the first ten modes of ubiquitin using a discrete-state kinetic model known as a Markov state model (MSM). We assume that the slowest modes in the system are slow enough to be amenable to the MSM formulation and that all faster modes, since they cannot be described using the MSM approach, do not possess metastable minima; hence, their free-energy landscapes are merely rough, without any significant energy barriers impeding transport. Since the landscapes for the faster modes are merely rough, we can use the previous heuristic for these modes while treating the kinetics of the slow modes carefully through the MSM construction. This approach allowed for a better interpretation of the slow mode space, and we were able to postulate how these slow mode dynamics are related to the functional dynamics of ubiquitin.
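As a rough illustration of the MSM idea, the sketch below estimates a transition matrix from a discretized trajectory and converts its eigenvalues into implied timescales. It is a toy two-state example, not the LE4PD-MSM workflow. The function names, the synthetic trajectory, and the switching probabilities are all assumptions made for this example, and a real analysis would also need state discretization and lag-time validation.

```python
import numpy as np

def estimate_transition_matrix(dtraj, n_states, lag):
    """Row-normalized transition counts between discrete states at a lag given in frames."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for unvisited states are left as zeros instead of dividing by zero.
    return counts / np.where(row_sums > 0, row_sums, 1.0)

def implied_timescales(t_matrix, lag):
    """t_i = -lag / ln(lambda_i) for every eigenvalue strictly between 0 and 1."""
    eigvals = np.sort(np.linalg.eigvals(t_matrix).real)[::-1]
    slow = eigvals[(eigvals > 0.0) & (eigvals < 1.0 - 1e-12)]
    return -lag / np.log(slow)

# Hypothetical two-state trajectory: two metastable minima along one mode coordinate.
rng = np.random.default_rng(0)
p_leave = np.array([0.01, 0.02])  # assumed per-frame probability of leaving each state
n_frames = 20000
uniform = rng.random(n_frames)
dtraj = np.empty(n_frames, dtype=int)
state = 0
for t in range(n_frames):
    dtraj[t] = state
    if uniform[t] < p_leave[state]:
        state = 1 - state

t_est = estimate_transition_matrix(dtraj, n_states=2, lag=1)
print(t_est)
print(implied_timescales(t_est, lag=1))  # slow relaxation time, in frames
```

The implied timescale of the slowest process plays the role of the barrier-crossing time along a mode. A mode with no metastable states would not yield a well-separated slow eigenvalue, which is the situation described above for the faster modes.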

Figure 1. Visualization of the dynamics and timescales of the slowest four modes of ubiquitin discovered through the LE4PD-MSM analysis technique, along with the biological relevance of each of the regions of ubiquitin involved in the slow dynamics. Maybe, one day in the future, I will improve the quality of this figure, but it seems unlikely that day will come.


This MSM technique can also be combined with the anisotropic version of the LE4PD mentioned above to create the LE4PD-3N approach to modelling protein dynamics. With the LE4PD-3N, we shift from the simulation box frame of reference to the body-centered frame of reference, which removes rotational and translational motions from the system, yielding a set of 3N-6 internal modes of motion. As with the isotropic LE4PD, once the modes are rescaled to account for the free-energy barriers along each of the mode coordinates, the lowest-index modes give the slowest internal dynamics of the protein. Due to the analytical mapping onto PCA in alpha-carbon coordinates, we are able to show the importance of hydrodynamic effects and free-energy barriers to the predicted slow dynamical motions of the protein, including the improved ability of the LE4PD-3N to reproduce the time correlation functions from the simulation trajectory.
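The PCA side of this comparison can be sketched as follows: each frame is superimposed on a reference structure to remove overall translation and rotation, and the covariance matrix of the 3N Cartesian coordinates is then diagonalized. This is a generic essential-dynamics sketch on synthetic data, not the LE4PD-3N. The Kabsch superposition used here is one common choice for the alignment and is an assumption, as are all function names and the toy trajectory.

```python
import numpy as np

def kabsch_align(mobile, reference):
    """Rotate the centered `mobile` (N, 3) coordinates onto the centered `reference`."""
    p = mobile - mobile.mean(axis=0)
    q = reference - reference.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ q)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against improper rotations
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return p @ rot.T

def pca_modes(traj, reference):
    """PCA on the 3N coordinates of a (frames, N, 3) trajectory after alignment."""
    aligned = np.array([kabsch_align(frame, reference) for frame in traj])
    flat = aligned.reshape(len(traj), -1)
    flat = flat - flat.mean(axis=0)
    cov = flat.T @ flat / (len(traj) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # largest-variance modes first
    return eigvals[order], eigvecs[:, order]

def random_rotation(rng):
    """A proper rotation matrix from the QR decomposition of a random matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

# Synthetic "trajectory": a 10-bead structure with small fluctuations,
# with every frame arbitrarily rotated and translated.
rng = np.random.default_rng(1)
reference = rng.normal(size=(10, 3))
traj = np.array([
    (reference + 0.05 * rng.normal(size=reference.shape)) @ random_rotation(rng).T
    + rng.normal(size=3)
    for _ in range(500)
])

pca_eigvals, pca_eigvecs = pca_modes(traj, reference)
print(pca_eigvals[:5])   # variances of the leading modes
print(pca_eigvals[-6:])  # near-zero variances left by removing translation and rotation
```

After alignment, three eigenvalues vanish exactly (translation) and three more are close to zero (rotation), leaving 3N-6 internal modes. Note that PCA eigenvalues are variances only. They carry no timescale information, which is where the friction, hydrodynamics, and barrier terms enter.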

Figure 2. Comparison of the LE4PD-3N's ability to model the time correlation functions from simulation (black) for a sampling of residues along the primary sequence of ubiquitin with (blue) and without (red) hydrodynamic interactions included. Either method is generally superior to PCA (cyan), which decays too quickly: because it neglects free-energy barriers, it does not properly account for the decay timescales along each mode.


The study comparing the LE4PD-3N method with PCA for modeling the dynamics of ubiquitin was followed up by a study comparing the results of the LE4PD-3N method to those of tICA (time-lagged independent component analysis), the time-lagged extension of PCA. We performed a detailed comparison of the predicted slow mode dynamics and kinetics between the two approaches and found that the LE4PD-3N is slightly more robust at predicting the dynamics because it lacks the lag time parameter present in tICA. However, if the lag time in tICA is tuned precisely, it can do very well at predicting the decay of the autocorrelation function in regions of the protein undergoing the slowest dynamical transitions.
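To show where the lag time enters tICA, here is a minimal sketch that solves the time-lagged generalized eigenvalue problem on a synthetic two-component signal. It is a bare-bones illustration, not the analysis from the study. The function names, the autoregressive test signals, and the mixing matrix are assumptions made for this example.

```python
import numpy as np

def tica(data, lag):
    """tICA eigenvalues (descending) and components for a (frames, features) array."""
    x = data - data.mean(axis=0)
    x0, xt = x[:-lag], x[lag:]
    c0 = (x0.T @ x0 + xt.T @ xt) / (2.0 * len(x0))  # instantaneous covariance
    ct = (x0.T @ xt + xt.T @ x0) / (2.0 * len(x0))  # symmetrized time-lagged covariance
    # Whiten with C(0)^(-1/2) so the generalized problem becomes an ordinary symmetric one.
    s, u = np.linalg.eigh(c0)
    whiten = u @ np.diag(s ** -0.5) @ u.T
    eigvals, vecs = np.linalg.eigh(whiten @ ct @ whiten)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], whiten @ vecs[:, order]

def ar1(coeff, n_frames, rng):
    """Autoregressive signal whose autocorrelation decays as coeff ** lag."""
    noise = rng.normal(size=n_frames)
    out = np.zeros(n_frames)
    for t in range(1, n_frames):
        out[t] = coeff * out[t - 1] + noise[t]
    return out

# One slow and one fast hidden process, observed only as linear mixtures.
rng = np.random.default_rng(2)
hidden = np.column_stack([ar1(0.95, 50000, rng), ar1(0.2, 50000, rng)])
observed = hidden @ np.array([[1.0, 0.5], [0.3, 1.0]])

for lag in (1, 5, 20):
    tica_vals, _ = tica(observed, lag)
    slow_timescale = -lag / np.log(tica_vals[0])
    print(lag, tica_vals, slow_timescale)
```

Unlike PCA, the eigenvalues here are autocorrelations at the chosen lag, so the result depends on that choice. In this idealized example the slow timescale is nearly lag-independent, but real protein trajectories are not ideal in this way, which is why the lag time needs tuning.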

Figure 3. Comparison of the LE4PD-3N's ability to model the time correlation functions from simulation (black) for a sampling of residues along the primary sequence of ubiquitin. The predictions of the LE4PD-3N are given in blue, while the predictions of tICA are given at various lag times, with corresponding colors given in each subfigure legend.

