Alexander Shumway and Faculty Mentor: Mark Transtrum, Department of Physics and Astronomy
Introduction
Mathematical models are ubiquitous in science. Many are nonlinear in their parameters, may have dozens to thousands of parameters, and may make hundreds to thousands of predictions. Such models are therefore theoretically complicated to analyze and computationally expensive to apply.
The standard method of model analysis is a model-by-model approach that relies on the intuition of expert researchers. Recent research, however, has shown that many models, known as sloppy models, are statistically similar despite coming from widely varied fields [4]. This suggests the possibility of developing a general theory of modeling in place of relying on expert intuition. Our research works toward such a theory by leveraging recent advances in sloppy models and information geometry to classify and quantify parameter nonlinearities in complex models.
Methodology
We approach model analysis using the tools of differential geometry and multilinear algebra. A model is treated as an evaluation map from the set of all parameter values (parameter space) to the set of all predictions (prediction space), where both spaces are subsets of high-dimensional Euclidean spaces. The image of parameter space under the evaluation map is a hypersurface in prediction space, known as the model manifold. Parameters can be thought of as control knobs for moving along the surface of the manifold, and they thus determine a coordinate grid on it.
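As an illustration, consider the following minimal sketch in Python (a toy two-parameter model of our own choosing, not one of the test models discussed below):

```python
import numpy as np

# Toy model: a sum of two exponentials observed at three times, so
# parameter space is R^2 and prediction space is R^3.
t = np.array([0.5, 1.0, 2.0])  # measurement times

def evaluation_map(theta):
    """Map parameters theta = (theta_1, theta_2) to the prediction vector y(t)."""
    return np.exp(-theta[0] * t) + np.exp(-theta[1] * t)

# One point on the model manifold, i.e. the image of one parameter vector:
y = evaluation_map(np.array([1.0, 3.0]))
```

Sweeping theta over all of parameter space traces out the model manifold in R^3.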
Essential to our analysis are matrices and their higher-order analogs, known as tensors. The Jacobian matrix is the matrix of first-order partial derivatives of model predictions with respect to the parameters; it encodes the local linearization of the model manifold. Similarly, the Acceleration tensor is the tensor of second-order partial derivatives of model predictions with respect to the parameters, i.e. the Jacobian of the Jacobian; it encodes the local nonlinearity of the model.
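One simple way to compute both objects is by central finite differences; the sketch below uses this approach (the function names are our own, and in practice automatic differentiation may be preferable for large models):

```python
import numpy as np

def jacobian(f, theta, h=1e-6):
    """First-order partials of predictions w.r.t. parameters.
    Returns an array of shape (n_predictions, n_parameters)."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = h
        cols.append((f(theta + e) - f(theta - e)) / (2 * h))
    return np.column_stack(cols)

def acceleration_tensor(f, theta, h=1e-4):
    """Second-order partials: the Jacobian of the Jacobian.
    Returns an array of shape (n_predictions, n_parameters, n_parameters)."""
    theta = np.asarray(theta, dtype=float)
    slabs = []
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = h
        slabs.append((jacobian(f, theta + e) - jacobian(f, theta - e)) / (2 * h))
    return np.stack(slabs, axis=-1)
```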
In our analysis of model nonlinearity, it is convenient to split the nonlinearity into two components: one tangent to the manifold surface and one perpendicular to it. The tangential component, known as parameter effects curvature, corresponds to a warping of the coordinate grid described above. The perpendicular component, known as extrinsic curvature, corresponds to the curvature of the hypersurface itself.
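A sketch of this decomposition, assuming the Jacobian J has full column rank: the tangent space at a point is spanned by the columns of J, so projecting each second-derivative vector onto that span gives the parameter effects part, and the remainder is the extrinsic part.

```python
import numpy as np

def curvature_split(J, A):
    """Split the Acceleration tensor A (shape m x p x p) into its
    tangential (parameter effects) and normal (extrinsic) components."""
    P = J @ np.linalg.solve(J.T @ J, J.T)        # projector onto the tangent space
    A_flat = A.reshape(A.shape[0], -1)           # flatten the two parameter indices
    tangential = (P @ A_flat).reshape(A.shape)   # parameter effects curvature
    normal = A - tangential                      # extrinsic curvature
    return tangential, normal
```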
We use a variety of models from several fields as test cases in our analysis, including the Wnt signaling pathway model [2], a biochemical adaptation model [3], and artificial neural network models of the kind used in machine learning.
Results
As noted above, previous research has found statistical similarity between the Jacobians of sloppy models: the singular values of their Jacobians follow a logarithmic distribution [4]. Our research extends this result to the Acceleration tensor. Using a higher-order singular value decomposition for tensors [1], we found that the higher-order singular values of the Acceleration tensor also follow a logarithmic distribution. Furthermore, the singular vectors of the Jacobian matrix and the Acceleration tensor are closely correlated. Finally, we found that model nonlinearity is dominated by parameter effects curvature, while extrinsic curvature is largely negligible.
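Concretely, the higher-order singular values of a tensor are the ordinary singular values of each of its unfoldings [1]; a minimal sketch:

```python
import numpy as np

def hosvd_singular_values(A):
    """Mode-n singular values of a tensor A: for each mode, unfold A
    into a matrix and take that matrix's ordinary singular values [1]."""
    values = []
    for mode in range(A.ndim):
        unfolding = np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)
        values.append(np.linalg.svd(unfolding, compute_uv=False))
    return values  # for sloppy models, roughly evenly spaced on a log scale
```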
Discussion
The results obtained have a number of implications for improving model analysis, particularly in light of the large computational cost of calculating both the Jacobian matrix and the Acceleration tensor.
The logarithmic distribution of parameter sensitivities in the Acceleration tensor suggests that most parameter combinations can be ignored in a local analysis of model nonlinearity, since most combinations affect model predictions only slightly. Further, the close correlation between the singular vectors of the Jacobian matrix and the Acceleration tensor allows the important parameter directions of the Acceleration tensor to be approximated by computing those of the much simpler Jacobian matrix. Finally, the dominance of parameter effects curvature justifies the further simplification of considering only the nonlinearity tangent to the manifold surface. Together, these results form a clear hierarchy of simplifying approximations for analyzing model nonlinearities.
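The following sketch illustrates the second of these approximations (the cutoff k, the number of important directions retained, is a hypothetical choice left to the analyst):

```python
import numpy as np

def reduced_acceleration(J, A, k):
    """Restrict the Acceleration tensor A (shape m x p x p) to the
    subspace spanned by the top-k right singular vectors of J."""
    _, _, Vt = np.linalg.svd(J, full_matrices=False)
    V_k = Vt[:k].T  # top-k important parameter directions, estimated from J alone
    return np.einsum('mij,ia,jb->mab', A, V_k, V_k)  # shape (m, k, k)
```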
Conclusion
Mathematical models are often extremely complicated, which makes it challenging to use them to gain theoretical insight into the phenomena they model. Our development of the theory behind sloppy models has produced tools that aid model analysis. The fact that extrinsic curvature is largely negligible, for instance, suggests that assuming a flat model manifold is often a reasonable approximation. The correlation between the sensitive directions of the Jacobian matrix and the Acceleration tensor likewise allows modelers to identify the parameter combinations with the most significant nonlinearity.
Further research may investigate how quickly the important parameter combinations vary as one moves through parameter space. This would indicate the scale on which simplifying approximations that retain only the most important parameter combinations remain valid.
Current work seeks to apply the insight gained from our nonlinearity analysis to improve the Levenberg-Marquardt data-fitting algorithm. Other areas of model analysis, such as model reduction and machine learning, also stand to benefit from this theory. We hope the work on the Levenberg-Marquardt algorithm will serve as a starting point for future improvements to theoretical and computational methods of model analysis.
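For reference, a bare-bones step of the standard Levenberg-Marquardt algorithm (the textbook damped Gauss-Newton update that serves as the starting point here, not the improved variant under development):

```python
import numpy as np

def lm_step(residual, jacobian, theta, lam):
    """One standard Levenberg-Marquardt update for least-squares fitting.
    residual and jacobian are callables; lam is the damping parameter."""
    r = residual(theta)                      # residual vector, shape (m,)
    J = jacobian(theta)                      # Jacobian matrix, shape (m, p)
    H = J.T @ J + lam * np.eye(theta.size)   # damped approximate Hessian
    return theta - np.linalg.solve(H, J.T @ r)
```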
References
1. De Lathauwer, Lieven, Bart De Moor, and Joos Vandewalle. “A multilinear singular value decomposition.” SIAM Journal on Matrix Analysis and Applications 21.4 (2000): 1253-1278.
2. Lee, Ethan, et al. “The roles of APC and Axin derived from experimental and theoretical analysis of the Wnt pathway.” PLoS Biology 1.1 (2003): e10.
3. Ma, Wenzhe, et al. “Defining network topologies that can achieve biochemical adaptation.” Cell 138.4 (2009): 760-773.
4. Machta, Benjamin B., et al. “Parameter space compression underlies emergent theories and predictive models.” Science 342.6158 (2013): 604-607.