Conformational ensemble modeling of multi-domain proteins with experimental data
Date of Issue: 2017
School of Biological Sciences
The structural variation of multi-domain proteins with flexible components mediates many biological processes. Determining their structures is crucial for understanding the mechanisms of their interactions and functions, but deriving the structural ensemble of a flexible biomolecule from experimental data alone remains a challenge. Computational methods assist in interpreting time- and spatially averaged experimental data so that the structures can be characterized correctly. A conformational ensemble can be determined by selecting, from a given structure pool, the weighted combination of representative structures that best fits the experimental measurements. However, purely numerical fitting tends to over-interpret the experimental data and can recover an ensemble made up of incorrect conformations. In our study, we employ hybrid models to simulate multi-domain proteins: the intra-domain structures are modeled with a structure-based potential, and the inter-domain interactions are described with physics-based energies. For ensemble construction with experimental data, we propose the structure energy as a physics-based regularization that reduces over-fitting. Without energy regularization, ensemble construction is exposed to noise from high-energy structures. We demonstrate that physics-based regularization is important for interpreting experimental data without introducing arbitrary or spurious conformational representations. We applied our computational model and ensemble optimization to two multi-domain protein systems, at the coarse-grained and atomistic levels, respectively. First, we simulated the dengue virus 2 (DENV2) non-structural protein 5 (NS5) at the coarse-grained level and constructed its conformational ensemble from experimental small-angle X-ray scattering (SAXS) data.
The energy regularization helps recover a conformational ensemble from the SAXS profile that reveals the domain-domain orientation and the domain contact interface of NS5. Second, we applied the computational modeling scheme to a di-domain complex of a poly(U)-binding protein (Pub1), followed by ensemble construction with its experimental paramagnetic relaxation enhancement (PRE) data, which correspond to inter-atomic distance restraints. A conformational ensemble of Pub1 was recovered from the low-energy structures under an ensemble-size restraint. This ensemble is in good agreement with the experimental PRE rates and is also supported by experimental chemical shift perturbation data.
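
The idea of energy-regularized ensemble fitting described above can be illustrated with a minimal sketch. The abstract does not give the functional form, so everything here is an assumption made for illustration: the objective is taken to be a reduced chi-square between the weighted-average computed data and the experimental data, plus a linear penalty `lam * sum(w_i * E_i)` on the structure energies, and the weights are optimized on the probability simplex with a simple exponentiated-gradient update. The function name `fit_ensemble_weights`, its parameters, and the toy numbers are all hypothetical and are not taken from the thesis.

```python
import math

def fit_ensemble_weights(profiles, target, sigma, energies, lam,
                         steps=2000, lr=0.01):
    """Fit non-negative weights summing to 1 (illustrative sketch only).

    Assumed objective (not taken from the thesis):
        (1/M) * sum_j ((sum_i w_i * profiles[i][j] - target[j]) / sigma[j])**2
        + lam * sum_i w_i * energies[i]
    """
    n, m = len(profiles), len(target)
    w = [1.0 / n] * n  # start from uniform weights
    for _ in range(steps):
        # Residual between ensemble-averaged and experimental data.
        resid = [sum(w[i] * profiles[i][j] for i in range(n)) - target[j]
                 for j in range(m)]
        # Gradient of the assumed objective with respect to each weight.
        grad = [(2.0 / m) * sum(resid[j] * profiles[i][j] / sigma[j] ** 2
                                for j in range(m)) + lam * energies[i]
                for i in range(n)]
        # Exponentiated-gradient step; shifting by min(grad) avoids overflow
        # and does not change the result after normalization.
        g_min = min(grad)
        w = [w[i] * math.exp(-lr * (grad[i] - g_min)) for i in range(n)]
        total = sum(w)
        w = [x / total for x in w]  # project back onto the simplex
    return w

# Toy pool: structures 0 and 1 give identical computed data but differ in
# energy; structure 2 does not match the target at all.
profiles = [[1.0, 0.5, 0.2], [1.0, 0.5, 0.2], [0.2, 0.5, 1.0]]
target = [1.0, 0.5, 0.2]
sigma = [0.1, 0.1, 0.1]
energies = [0.0, 5.0, 1.0]  # arbitrary units

w_raw = fit_ensemble_weights(profiles, target, sigma, energies, lam=0.0)
w_reg = fit_ensemble_weights(profiles, target, sigma, energies, lam=1.0)
print("no regularization:    ", [round(x, 3) for x in w_raw])
print("energy regularization:", [round(x, 3) for x in w_reg])
```

In this toy case the data alone cannot distinguish structures 0 and 1, so the unregularized fit splits the weight equally between them, while the energy term shifts the weight toward the low-energy structure. A real application would replace the toy lists with computed SAXS or PRE data and model energies, and would need to choose `lam` carefully.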