VCADS

Hybrid modeling focuses on integrating the statistical capabilities of machine learning models with the mechanistic rigor of physics-based models. At the implementation level, hybridization is accomplished by introducing physics-based biases into the learning system at different levels. Accordingly, hybrid approaches can be classified into three primary types: observational bias, learning bias, and inductive bias.

Inductive bias refers to the use of prior knowledge derived from physics to modify the computational graph of a machine learning model. The modification is implemented by embedding artificial neural networks within a customized computational graph. The objective is to enforce consistency between the model and the physics-based prior knowledge of a dynamic system. This entails introducing additional computational steps, including mathematical operations, that encode the physics-derived prior knowledge. With these physics-based constraints built into the computational graph, the machine learning model is better able to capture the underlying principles and relationships of the dynamic system.

We have applied a hybrid approach to model the complex dynamics observed in nonlinear dynamic systems by incorporating principles from classical mechanics [1]. This methodology employs the Euler-Lagrange and Hamiltonian formalisms to improve the learning process from data. This is achieved by modifying the computational graph of machine learning models so that it adheres to the physical laws that govern the dynamics of the system. The modification allows the model to parameterize an intermediate scalar function that has a physical interpretation, such as energy. The hybrid model is essentially an artificial neural network whose computational graph differs from that of a standard neural network in two significant ways. The first modification incorporates an intermediate scalar function representing the Hamiltonian learned from data. The second modification extends the input/output channels to capture the multidimensional dynamics of the system. The main goal of such hybrid reasoning is to improve the extrapolation capability of the model by enforcing conformance with key aspects of the underlying physics in the form of a bias. The results demonstrate that incorporating this physics-based bias enables the hybrid model to produce long-term, physically plausible predictions. The proposed modeling approach also scales well to energy-based modeling of multidimensional dynamic systems.

Fig. 1 shows the computational graphs of a conventional neural network and a Hamiltonian neural network. In the second network, direct predictions of the time derivatives of position and momentum are replaced by the terms ∂HΘ/∂px, -∂HΘ/∂qx, ∂HΘ/∂py, and -∂HΘ/∂qy. This customized network architecture allows a Hamiltonian neural network to learn the conserved quantity (the Hamiltonian, analogous to the total energy) directly from data. In this network, the mapping from position and momentum to the associated time derivatives is learned indirectly. It passes through an intermediate physics-inspired computational step, embedded in the network, that approximates the Hamiltonian and then differentiates it with respect to the inputs according to Hamilton's equations. Furthermore, given the multidimensionality of the dynamic system, a Hamiltonian neural network with an extended number of input/output channels is developed in this study. The extended computational graph makes it possible to account for the effects of each subsystem, and the coupling between them, when learning the Hamiltonian.

Figure 1 Computational graph of (a) a conventional neural network, (b) a Hamiltonian neural network
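The step from a scalar Hamiltonian to the four time derivatives can be illustrated with a minimal sketch. It makes two loud assumptions: `toy_hamiltonian` is a hypothetical hand-written function standing in for the learned HΘ, and central finite differences stand in for the automatic differentiation that a neural network framework would normally provide. Neither is taken from the study itself.

```python
from typing import Callable, Tuple

State = Tuple[float, float, float, float]  # (qx, px, qy, py)


def toy_hamiltonian(state: State) -> float:
    """Hypothetical stand-in for the learned scalar H_theta (not the study's model)."""
    qx, px, qy, py = state
    kinetic = 0.5 * (px ** 2 + py ** 2)
    potential = 0.5 * (qx ** 2 + qy ** 2) + 0.5 * (qx * qy) ** 2
    return kinetic + potential


def hamiltonian_vector_field(
    h: Callable[[State], float], state: State, eps: float = 1e-5
) -> State:
    """Return (dqx/dt, dpx/dt, dqy/dt, dpy/dt) obtained from a scalar H.

    Partial derivatives are approximated by central differences; this is an
    assumption made to keep the sketch dependency-free.
    """

    def partial(index: int) -> float:
        up = list(state)
        down = list(state)
        up[index] += eps
        down[index] -= eps
        return (h(tuple(up)) - h(tuple(down))) / (2.0 * eps)

    dh_dqx, dh_dpx, dh_dqy, dh_dpy = (partial(i) for i in range(4))
    # Hamilton's equations: dq/dt = dH/dp, dp/dt = -dH/dq
    return (dh_dpx, -dh_dqx, dh_dpy, -dh_dqy)
```

Because the outputs are derived from a single scalar function rather than predicted independently, any vector field produced this way conserves that scalar along exact trajectories, which is the property the customized graph is designed to exploit.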

Coupled nonlinear oscillator:

The performance of the proposed hybrid modeling approach is evaluated using a coupled nonlinear oscillator, which represents a complex dynamic system. Coupled nonlinear oscillators are commonly used to describe the dynamics of various physical, chemical, and biological systems. In this study, a coupled oscillator with quartic nonlinearity is considered, specifically used to model free vibrations of stretched strings and dynamics of coupled nonlinear pendulums, among other applications.

The schematic of this system is depicted in Fig. 2, where x and y denote the two degrees of freedom, and m₁ and m₂ represent the masses of the subsystems. The linear stiffness coefficients associated with the x and y degrees of freedom are denoted by kₗₓ and kₗy, respectively. Furthermore, kₙₓ and kₙy indicate the corresponding nonlinear coefficients, while kₖ represents the coefficient of the nonlinear coupling stiffness.
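For orientation, one plausible Hamiltonian consistent with the parameters listed above is written out below. The text does not state the explicit form, so the quartic coupling term in particular is an assumption (a nonlinear spring acting on the relative displacement); other coupling forms, such as a term proportional to q_x² q_y², would fit the same description.

```latex
% Assumed form, not taken from the source: q_x, q_y are the positions (x, y),
% p_x, p_y the conjugate momenta.
H(q_x, p_x, q_y, p_y) =
    \frac{p_x^2}{2 m_1} + \frac{p_y^2}{2 m_2}
  + \frac{1}{2} k_{lx} q_x^2 + \frac{1}{2} k_{ly} q_y^2
  + \frac{1}{4} k_{nx} q_x^4 + \frac{1}{4} k_{ny} q_y^4
  + \frac{1}{4} k_k \left( q_x - q_y \right)^4
```

Under this assumed form, the quartic potential terms give rise to cubic restoring forces, which is the usual meaning of a quartic nonlinearity.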

Figure 2 Schematic of the coupled nonlinear dynamic system

The graphical representation of model development and deployment is illustrated in Fig. 3. The samples, which include initial conditions for each subsystem, are distributed randomly between 𝐸𝑚𝑖𝑛 and 𝐸𝑚𝑎𝑥, where i and j index the development and deployment samples, respectively. In the development process of the model, n initial conditions are selected randomly. Then, the position, momentum, and associated time derivatives of each subsystem are observed over the time range [0, 𝜏1] with a time step of dτ. Each sample includes 𝑞𝑥, 𝑝𝑥, 𝑞𝑦, and 𝑝𝑦 as the inputs and 𝑞̇𝑥, 𝑝̇𝑥, 𝑞̇𝑦, and 𝑝̇𝑦 as the outputs, taken from the n trajectories that start from these initial conditions over the observation time range [0, 𝜏1]. The mapping between these input and output sets fully characterizes the dynamics of the system, and the model is developed to capture it.

In the deployment stage, which evaluates the performance of the model, the model is used to predict the system response. To this end, it is used in an initial value problem (IVP) with unseen initial conditions. Accordingly, m unseen initial conditions are selected randomly. Each initial condition, consisting of 𝑞x⁰, 𝑝x⁰, 𝑞y⁰, and 𝑝y⁰, is used in the IVP along with the developed model to roll out the dynamics of the system. Numerical time integration (INTG) is carried out over the time range [0, 𝜏2] to predict the time evolution of each subsystem. The IVP is solved using the fourth-order Runge-Kutta method as the numerical time integrator. The outputs are the predicted 𝑞x(t), 𝑝x(t), 𝑞y(t), and 𝑝y(t) within the time range [0, 𝜏2]. As discussed earlier, this study focuses on the extrapolation capability of the hybrid model, so it is assumed that 𝜏2 = k𝜏1, where k > 1. The time range [𝜏1, 𝜏2], marked by the yellow region in Fig. 3, covers data to which the model has not been exposed, so it is used for evaluating the extrapolation capability of the model.

Figure 3 Graphical representation of the hybrid model development and deployment
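The deployment roll-out described above can be sketched as a fixed-step fourth-order Runge-Kutta loop. This is a minimal illustration, not the study's implementation: the function names are hypothetical, and `uncoupled_linear_field` is a placeholder for the trained model (two independent unit harmonic oscillators) chosen only so that the example runs on its own.

```python
from typing import Callable, List, Tuple

State = Tuple[float, ...]
VectorField = Callable[[State], State]


def rk4_step(f: VectorField, state: State, dt: float) -> State:
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def rollout(f: VectorField, initial_state: State, tau2: float, dt: float) -> List[State]:
    """Solve the IVP on [0, tau2]; returns the states at t = 0, dt, 2*dt, ..."""
    n_steps = int(round(tau2 / dt))
    trajectory = [tuple(initial_state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory


def uncoupled_linear_field(state: State) -> State:
    """Placeholder for the trained model (an assumption for this sketch only)."""
    qx, px, qy, py = state
    return (px, -qx, py, -qy)
```

In use, `f` would be the trained model's mapping from (qx, px, qy, py) to the four time derivatives, and the part of the returned trajectory beyond 𝜏1 would correspond to the extrapolation range.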

The performance of the hybrid model is evaluated by physics-based and statistical measures. The statistical metric used is the Root Mean Square Error (RMSE), which quantifies the mismatch between the model's predictions and the observations for individual samples in the deployment set. It measures the fitness of the models in predicting each data point along trajectories starting from unseen initial conditions. The results indicate that the Hamiltonian Neural Network (HNN) and the Bayesian Neural Network (BNN) achieve comparable performance in terms of the RMSE, suggesting no significant difference in their predictive capabilities. To assess the compliance of the models with the governing physics, a physical metric based on energy is considered. This metric, known as the energy deviation error (𝐸𝑟𝑅𝑀𝑆𝐸), calculates the Root Mean Square of the error between the predicted and conserved energy for each trajectory. Since energy is conserved for each trajectory regardless of time length, this measure can also evaluate the extrapolation capability of each model.
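The energy deviation error can be sketched as follows, assuming it is the root mean square of the difference between the energy evaluated along a predicted trajectory and the constant energy fixed by the initial condition. The function names and the exact normalization are assumptions made for illustration; the source describes the metric only in words.

```python
import math
from typing import Sequence


def energy_deviation_rmse(predicted_energy: Sequence[float], conserved_energy: float) -> float:
    """RMS deviation of a trajectory's predicted energy from its conserved value."""
    if len(predicted_energy) == 0:
        raise ValueError("predicted_energy must contain at least one sample")
    squared = [(e - conserved_energy) ** 2 for e in predicted_energy]
    return math.sqrt(sum(squared) / len(squared))


def mean_energy_deviation(
    per_trajectory_energy: Sequence[Sequence[float]],
    conserved_energies: Sequence[float],
) -> float:
    """Average the per-trajectory error over a set of deployment trajectories."""
    if len(per_trajectory_energy) != len(conserved_energies):
        raise ValueError("one conserved energy is required per trajectory")
    errors = [
        energy_deviation_rmse(energies, e0)
        for energies, e0 in zip(per_trajectory_energy, conserved_energies)
    ]
    return sum(errors) / len(errors)
```

Because the reference energy is constant along each trajectory, the same function applies unchanged to any time window, whether inside or beyond the range covered by the development data.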

Fig. 4 shows a comparison between the performance of the HNN and BNN using the energy deviation error within two time ranges, [0, 𝜏1] and [𝜏1, 𝜏2]. In this analysis, the HNN and BNN are utilized in an Initial Value Problem (IVP) with a set of unseen initial conditions to obtain Fig. 4a. The IVP is solved within the time range [0, τ₁], which corresponds to the time range of the model development data. The energy deviation error (𝐸𝑟𝑅𝑀𝑆𝐸) is then calculated for each trajectory, and the mean of this physics-based performance metric is compared between the BNN and HNN in Fig. 4a. The results show that the HNN exhibits a relative improvement of 13% compared to the BNN in terms of compliance with the conservation of energy.

Figure 4 Mean value of energy deviation error (𝐸𝑟𝑅𝑀𝑆𝐸) for HNN and BNN for trajectories starting from unseen initial conditions within the time range of (a) [0, τ₁] and (b) [τ₁, τ₂]

The same analysis is conducted within the time range [𝜏1, 𝜏2], which represents the extrapolation range. In this case, the relative improvement increases to 82%. Comparing the values of this physics-based performance measure in the two time ranges reveals a significant improvement in the extrapolation range. This demonstrates the notable capability of the hybrid model to provide physically plausible extrapolation. The incorporation of physics-based knowledge during the development of this model enhances its ability in this regard. In contrast, the analysis of the BNN reveals an accumulation of energy error as the time range expands. This can be attributed to the absence of knowledge about the physics of the dynamic system, leading to a violation of the governing physics and an inability to capture the long-term dynamics of the system. Consequently, the error grows as the model goes beyond the development range.

The results show that the hybrid model outperforms the purely data-driven model by up to 14% and 82% in the energy deviation metric within and outside the training range, respectively. The notable difference in performance outside the training range demonstrates the high extrapolation capability of the hybrid model. In other words, the hybrid model remains consistent with the underlying physics over the range of data to which it has not been exposed. This predictive capability is critically important, since the model can be used for a wide range of applications, such as design, response prediction, optimization, control, diagnostics, and prognostics. It has also been shown that the proposed hybrid model is scalable to multidimensional dynamic systems with coupled subsystems. In high-dimensional systems, energy-based reasoning is preferred for modeling the dynamics. In this approach, unlike Newtonian methods, there is no need to decouple the subsystems and calculate the internal forces under the energy conservation assumption. Evaluating the scalability of the proposed modeling approach for learning the dynamics of higher-dimensional systems is the topic of our ongoing research.