## Abstract

We describe the development of the Intelligent Towing Tank, an automated experimental facility guided by active learning to conduct a sequence of vortex-induced vibration (VIV) experiments, wherein the parameters of each next experiment are selected by minimizing suitable acquisition functions of quantified uncertainties. This constitutes a potential paradigm shift in conducting experimental research, where robots, computers, and humans collaborate to accelerate discovery and to search expeditiously and effectively large parametric spaces that are impracticable with the traditional approach of sequential hypothesis testing and subsequent train-and-error execution. We describe how our research parallels efforts in other fields, providing an orders-of-magnitude reduction in the number of experiments required to explore and map the complex hydrodynamic mechanisms governing the fluid-elastic instabilities and resulting nonlinear VIV responses. We show the effectiveness of the methodology of “explore-and-exploit” in parametric spaces of high dimensions, which are intractable with traditional approaches of systematic parametric variation in experimentation. We envision that this active learning approach to experimental research can be used across disciplines and potentially lead to physical insights and a new generation of models in multi-input/multi-output nonlinear systems.

## INTRODUCTION

It is 3:00 a.m. in mid-winter in Cambridge, Massachusetts. The Massachusetts Institute of Technology (MIT) Sea Grant Hydrodynamics Laboratory is pitch black and vacant, but a periodic sound comes from the Intelligent Towing Tank (ITT) every 2 to 4 min, lasting 1 min. The ITT works continuously day and night without any interruption or supervision.

Here, we describe the robotic ITT, which we began operating only last year. The ITT has already conducted about 100,000 experiments, essentially completing the equivalent of all of a Ph.D. student’s experiments every 2 weeks. Over the past 30 years at a similar MIT laboratory, a typical doctoral student would finish her Ph.D. in about 5 years, having completed no more than a thousand laborious experiments (*1*).

The total number of experiments completed by the ITT in its first year of operation is perhaps comparable with all experiments done collectively to date by all of the different laboratories in the world on the subject of vortex-induced vibrations (VIVs). By deploying the Gaussian process regression (GPR) as the learning algorithm in the ITT, we experimentally studied VIV problems within a much wider parametric input space than previously explored. In doing so, we demonstrated a potential paradigm shift in conducting experimental research, where computers [advancement in artificial intelligence (AI) technology (*2*)], robots [increasing use of laboratory automation (*3*–*5*)], and humans can collaborate in real time to accelerate scientific discovery. In such a shift, robots, computers, and humans may expeditiously and effectively explore very large parametric spaces not possible with the traditional approach of sequential hypothesis testing and train-and-error execution.

Our laboratory is not the only one using this symbiosis between research, machines, and science. At Carnegie Mellon University, researchers are training robots to conduct chemical work. The “robot researcher” will decide how to modify substances and reactions without the need for human intervention (*6*). A number of similar tools have also been developed and used in the life sciences (*7*), e.g., involving an intelligent “robot scientist,” “Adam,” whose role is to generate new hypotheses for functional genomics and test them (*8*, *9*). Another robot scientist, “Eve,” has reportedly tested drugs successfully for malaria (*10*). These robot scientists are similar to what we have achieved with the ITT, where the computer decides which combination of parameters (e.g., speed, amplitude, and frequency) will be investigated and executed next by robots based on the targeted quantity of interest (QoI).

A further development along these lines is the Big Mechanism program of the Defense Advanced Research Projects Agency (DARPA) (*11*). The computer reads tens of thousands of papers and synthesizes a hypothesis that can then be tested in a laboratory by humans, robots, or a combination of all three—computers, robots, and humans. A vivid manifestation of this vision, at least in part of the upstream process, is the recent publication of a book authored by “Beta Writer” on a machine-generated summary of current research on lithium-ion batteries (*12*). The Big Mechanism of DARPA is another upstream component of the entire robot-computer-human research process to accelerate scientific inquiry and discovery. In the current paper, however, we focus on the formulation of the hypothesis testing and particularly on its effective execution, i.e., a downstream process.

Fifty years ago, this type of paradigm shift and potential revolution in scientific research and beyond was envisioned by the interactive computing pioneer J. C. R. Licklider, who hoped that, “in not too many years, human brains and computing machines will be coupled together very tightly, and that the resulting partnership will think as no human brain has ever thought and process data in a way not approached by the information-handling machines we know” (*13*). Today, 20 years into the 21st century, we see such a vision being finally realized, with different scientific fields contributing creatively to this metamorphosis of scientific inquiry and discovery.

In the following sections, we first describe the problem to study, provide an overview of our approach, and present the results. Then, we discuss conclusions and limitations of the current research and main future directions. The Materials and Methods section reviews the GPR learning algorithm used in our facility and defines the hydrodynamic quantities studied in the paper.

### Problem description

The field of fluid-structure interactions is very rich in physical complexity in both hydrodynamic and aerodynamic flows, and one of the canonical problems in the field is VIV (*14*, *15*). VIV occurs when a flexibly mounted bluff body is placed within an oncoming cross-stream that a spontaneous instability in the wake of the body, above a Reynolds number (*Re*) of about *Re* = 50, causes the formation of asymmetric vortical patterns, which induce unsteady loads on the body and hence a vibratory response (*16*). More than a hundred years ago, Strouhal showed that, for subcritical *Re*, alternating sign vortices form behind a circular cylinder at a distinct nondimensional frequency *fd*/*U* of about 0.20 (*17*), where *d* is the diameter of the cylinder, *f* is the frequency of vortex formation, and *U* is the stream velocity. When the cylinder is flexibly mounted, it vibrates harmonically but at a frequency somewhat different than the Strouhal frequency, influenced by the natural frequency of the structure and the effective added mass of the cylinder, which varies with frequency.

The complexity of the physical mechanisms that lead to the vibrations of the structure makes prediction very difficult. Bishop and Hassan made the successful hypothesis that, when a cylinder is forced to vibrate in the cross-flow direction at the frequency and amplitude of a freely vibrating cylinder, it is subjected to identical forces (*18*). Under the assumption that free vibrations are steady state and harmonic, exploring the fluid force dependence in forced vibrations as a function of the principal parameters—viz. the amplitude and frequency of oscillation—would allow the prediction of free vibrations as well. As a result, several studies (*1*, *19*–*22*) followed that mapped the force dependence and made the connection of the force parametric dependence with changes in the vortical patterns (*23*, *24*).

Hence, using force vibration data was shown to be a powerful tool to predict VIV. *Re* was found to influence the vibrational properties, leading to higher amplitudes of free vibrations in the subcritical regime with increasing *Re* (*25*) and then sharp changes within the critical *Re* range, with a return to smooth amplitude variations in the supercritical regime (*26*, *27*); therefore, extensive forced vibration testing as a function of *Re* was required (*28*). The discovery that in-line motions affect cross-flow vibrations substantially (*29*) added further parametric complexity, requiring inclusion of the amplitude of in-line vibration and the phase angle between in-line and cross-flow response. Also, long flexible structures placed in shear currents, such as cables and risers, are subject to multifrequency vortex-induced responses (*30*, *31*); it is fortuitous that strip theory is found to be largely valid; hence, again, forced vibrations can be used to explore the properties of VIV.

In summary, forced vibrations have become a uniquely effective tool in exploring the very complex properties of VIV (*32*), leading to the discovery of important properties and the development of comprehensive databases; however, the number of independent parameters required to predict the vibratory response of flexible structures in sheared flows is large, making a systematic parametric search intractable. That is, assuming that we typically have a 10-dimensional parametric space and we blindly conduct 10 measurements per parameter, this brute-force approach would require 10 billion experiments, which is clearly infeasible. Therefore, we hereby introduce active learning to endow the ITT with intelligence to automatically conduct a sequence of forced vibration experiments to study VIV, wherein the parameters of the next experiment to conduct are selected by minimizing suitable acquisition functions (*33*). In this way, we show that we reduced the experimental burden by several orders of magnitude, requiring only a few thousand experiments, whereas the choice of each next experiment is made by the computer, as we describe in detail in the next section, thus automating and accelerating the experimental effort.

### Approach overview

The experimental facility constructed for our study is the ITT shown in Fig. 1. It consists of a towing tank, a robot, and a computer. The ITT has a towing length of 10 m and a 1 m by 1 m test cross section. The main carriage is installed on two rails aligned with the tank length and is able to reach a constant velocity from 0.01 to 1.50 m/s. On the carriage, a stage is installed with three degrees of freedom, allowing combined trajectories of in-line (align with the towing direction), cross-flow (perpendicular to the towing direction), and rotation motions. The software of the experimental facility is developed with integrated capability of the motion update and trajectory monitoring (Power PMAC system), force measurement (NI DAQ-USB6218 with an ATI-Gamma six-axis force sensor), and GPR (*34*) learning in MATLAB.

Using disciplinary knowledge, we first identify the input parameters and their ranges that may affect the QoIs and pass this information to the ITT. In the future, this first step too could also be automated using approaches like the aforementioned Big Mechanism. Next, we can start exploration and exploitation of the parametric space adaptively and in sequence and automatically perform the corresponding experiments to predict QoIs. The flowchart in Fig. 1 presents the main steps of the adaptive sequential experimentation by the ITT. The process starts with a small number of experiments with inputs randomly selected in the parametric space (the initial number of tests has to be larger than the number of the parameters). After the new experiment, the ITT performs learning with GPR on the existing data to update its prediction on QoIs (a brief overview of GPR is provided in Materials and Methods). Meanwhile, we find the inputs of the next experiment by minimizing the acquisition function, which describes the uncertainty through the standard deviation (SD) as a function of the parameters (*35*). Before running the next experiment, a pause period is enforced to quiet the fluid motion and avoid cross-contamination of the results. Then, after the new data have been gathered, the next iteration of learning process begins.

The learning stops when the prediction of the QoIs is converged. Here, we track the maximum of the SD σ_{max} in the iteration to be stably smaller than a reference level. Because GPR learning is a stochastic process, such a reference of the convergence should be associated with the inherent system uncertainty. We classify the system uncertainty into two types due to (i) modeling and (ii) measurement. The modeling uncertainty arises from the selection of the surrogate model and the optimization methods for the learning, whereas the measurement uncertainty comes from the sensor noise as well as the physical uncertainty associated with the problem, e.g., background turbulence in the water. Ideally, if we can map the unknown function of the QoI perfectly, there will be a zero modeling error, and hence, the predicted uncertainty will converge to the measurement uncertainty. The measurement uncertainty is an inherent property of the experimental facilities (varies by facility) and has to be calibrated beforehand.

In our VIV studies, we quantified the measurement uncertainty of the ITT by evaluating the SD of a baseline case: repeated experiments for the mean drag coefficient *C _{d}* of a stationary circular cylinder in uniform flow at

*Re*= 12,000 (the mean and SD of the results are provided in table S1 and fig. S1). The mean and the SD of

*C*are found to be equal to 1.198 and 0.0398, respectively. We must point out that the variation of

_{d}*C*originates not only from sensor noise but also from the three-dimensional nature of the vortex shedding affecting the correlation length of the vortical structures in the wake of a stationary circular cylinder, whose length varies in the range of three to five diameters (

_{d}*36*). In our experiment, our cylinder length-to-diameter aspect ratio is 12.3, and hence, the vortex shedding process is not fully correlated along the entire model span. This result is comparable with previous literature (

*1*) of

*Re*= 10,000 conducted in a different facility. We choose to multiply this baseline SD with a factor to define the convergence reference level for the measurement uncertainty of all QoIs in our study. Together with the physical arguments and the inherent uncertainty of the facility, the prediction of QoIs is considered converged, and therefore, the ITT learning stops when σ

_{max}of 10 successive iterations is found to be smaller than 3σ

*.*

_{r}Typical results of the GPR learning process described above are shown in Fig. 2, where the QoI is the lift coefficient in-phase with the velocity *C _{lv}* of a cross-flow only forced vibrating rigid cylinder in uniform flow at

*Re*= 12,000. The definitions of the hydrodynamic coefficients are given in Materials and Methods. In Fig. 2, the arrows indicate the GPR learning sequence. With 15 experiments (Fig. 2A.1, the black dots denote the data used in the learning for the contour), the ITT finds that

*C*can be both positive and negative, separated by the red contour line of

_{lv}*C*= 0. The quantified uncertainty guides the next experiment at

_{lv}*f*= 0.1455 and

_{r}*A*/

_{y}*d*= 0.4571, where σ is found as the maximum σ

_{max}in the SD plot (inset). After the new data have been gathered, the ITT updated both its prediction and the quantified uncertainty of the QoI. The ITT observed a larger positive region for

*C*in Fig. 2A.2, and the next experiment selected was performed at

_{lv}*f*= 0.1427 and

_{r}*A*/

_{y}*d*= 1.35. With the increase in the number of experiments, the ITT revealed more details about

*C*versus

_{lv}*f*and

_{r}*A*/

_{y}*d*. In the meantime, the value of σ

_{max}found in each iteration in Fig. 2B was shown to decrease approaching the 3σ

*reference line. Between Fig. 2A.3 of 36 and Fig. 2A.4 of 37 experiments, a new feature of a second positive region of*

_{r}*C*emerged, accompanied with a slight increase of σ

_{lv}_{max}shown in Fig. 2B. Eventually, upon convergence, the ITT has learned the

*C*pattern, whose contours did not change by performing additional experiments. Figure 2A.5 selects the case of 80 experiments as a representative, and the entire evolution of the 176 experiments can be found in fig. S2. The process of the ITT sequential experimentation has been recorded and documented in movie S1.

_{lv}### Kernel selection

The GPR learning performance, viz. the convergence rate, depends on the selection of the surrogate model (see an overview of GPR in Materials and Methods). Figure 3 (A to C) plots the σ_{max} over 200 iterations of GPR learning QoI with different kernel functions for *C _{d}*,

*C*, and

_{lv}*C*, respectively, of a rigid cylinder forced vibration in uniform flow at

_{my}*Re*= 12,000. We see that, in comparing the performance among different kernel functions with a fixed basis function, the learning processes for all three hydrodynamic coefficients converged eventually but with different rates of convergence. After the test of the different basis and kernel function combinations, based on the convergence rate, we identified the best combinations for the different hydrodynamic coefficients and used them for the rest of our study (Table 1).

## RESULTS

Next, we demonstrate how GPR learning and the ITT accelerated the route to discovery by comparing with the traditional approach of manual uniform sampling of parametric space. We also demonstrate how the ITT allows us to explore wider parametric spaces for possible new insights and universal scalings.

### GPR adaptive learning versus uniform sampling

The ITT first learns the three hydrodynamic coefficients (*C _{d}*,

*C*, and

_{lv}*C*) versus

_{my}*f*and

_{r}*A*/

_{y}*d*of a rigid cylinder in cross-flow only forced vibration and uniform flow at

*Re*= 12,000 using a multi-output GPR learning strategy. The results are compared with the reference experiment using uniform sampling on a lattice, which includes 2268 experiments with 28 different values of

*f*, 27 different values of

_{r}*A*/

_{y}*d*, and three repeated runs for each

*f*and

_{r}*A*/

_{y}*d*combination. The multi-output GPR learning process updated the prediction of the multiple QoIs in batches, because between two iterations, multiple experiments were conducted on the basis of the searching of σ

_{max}for each QoI. The learning of a QoI stopped when the convergence criterion was met, and the whole process stopped when all QoIs converged on the basis of the aforementioned criteria.

The comparison of the results is shown in Fig. 4 for *C _{d}* (first row, 75 experiments for GPR learning),

*C*(second row, 77 experiments for GPR learning), and

_{lv}*C*(third row, 90 experiments for GPR learning). To quantify the difference between the two sets of experiments, we selected 30 points in the parametric space, denoted as the blue dots in Fig. 4A.2, to calculate their average value from the reference experiment, as well as the average of the prediction and SD at those 30 points for each iteration of the GPR learning sequential experiments, as follows

_{my}The results are plotted in Fig. 4 for *C _{d}* (A.3),

*C*(B.3), and

_{lv}*C*(C.3). We see that with the increase in the number of experiments, the difference of the prediction average of the corresponding 30 points between the reference experiments (red dashed line) and the GPR learning experiment (black dashed line) in each iteration became smaller, and the

_{my}On the basis of this comparison of the experiments using adaptive GPR learning and uniform sampling, we see that with <5% of the total reference experimental runs, the ITT using the GPR learning strategy was able to capture the major features of the hydrodynamic coefficients of a rigid cylinder in uniform flow at *Re* = 12,000, as follows: (i) *C _{d}* was found to vary from 1 to 4 and increase with

*f*and

_{r}*A*/

_{y}*d*; (ii)

*C*had two positive regions in the parametric space; (iii)

_{lv}*C*was found to change drastically from a negative to a large positive value around

_{my}*f*= 0.16. Previous research has shown, using flow visualization, that such changes are due to a wake mode change from the so-called 2P pattern to 2S pattern (

_{r}*37*).

*Re* effect

The effects of the physical nonlinearities become stronger with *Re* for VIV, so next we focused on learning the *Re* effect of an oscillating cylinder in the cross-flow direction. Previous research has shown that *Re* has played a significant role in the Strouhal number (St) (nondimensional frequency) and the mean drag coefficient *C _{d}* of a stationary smooth circular cylinder (

*38*,

*39*). However, only limited work has been performed on an oscillating cylinder at various

*Re*values, because adding

*Re*as a parameter substantially increases the number of experiments required and hence the complexity of the experimental problem. Nevertheless, limited available studies show that, even within the subcritical regime,

*Re*plays an important role in affecting the fluid forces and the wake states (

*24*). Hence, for the second set of experiments, in addition to

*f*and

_{r}*A*/

_{y}*d*, we included

*Re*as the third input parameter, ranging from

*Re*= 1200 to

*Re*= 19,000 of the subcritical regime.

Figure 5 plots the converged hydrodynamic coefficients (A, *C _{d}*; B,

*C*; and C,

_{lv}*C*) in the three-dimensional parametric space (

_{my}*f*,

_{r}*A*/

_{y}*d*, and

*Re*) from

*Re*= 1200 to

*Re*= 19,000. We found that (i)

*C*did not depend strongly on

_{d}*Re*; (ii) at higher

*Re*, the isosurface of

*C*= 0 depended only on

_{my}*f*, whereas at lower

_{r}*Re*, the isosurface of

*C*= 0 was also a function of

_{my}*A*/

_{y}*d*; (iii)

*C*had two separate positive regions consistently over the entire range of

_{lv}*Re*studied; and (iv)

*C*described the average energy transfer over time between the fluid and structure. Therefore, if the structure has zero damping, when the flexibly mounted cylinder has reached a steady-state vibration, the fluid energy flux will be zero, and hence,

_{lv}*C*will be equal to zero. Figure 5B shows that with increasing

_{lv}*Re*, the maximum

*A*/

_{y}*d*associated with the

*C*= 0 isosurface increased from

_{lv}*A*/

_{y}*d*= 0.75 to

*A*/

_{y}*d*= 1.15 over the studied

*Re*range (see also fig. S3, where we show contours of

*C*from GPR learning at various

_{lv}*Re*number values, and contours of

*C*and

_{d}*C*at various

_{my}*Re*number values are shown in figs. S4 and S5). This results in an increasing amplitude response of the cylinder cross-flow only free vibration in the subcritical

*Re*range when

*Re*increases from 1200 to 19,000, which has been reported in previous work (

*25*). This shows how the physics discovered in forced vibrations with the ITT can explain established results in literature for free vibrations.

It should be noted that our experimental facility is typical of laboratory-size facilities that target subcritical *Re*. Testing at high *Re* requires large facilities, where cost and time required to conduct them rise as the third power of *Re*; our methodologies hold even greater promise for testing at high *Re* because few experiments are possible to cover a wide parametric range.

### Larger parametric space

Using the assumption of strip theory, the hydrodynamic coefficients acquired from the rigid cylinder forced vibration experiment can be used to predict the VIV of a marine riser placed in sheared ocean current profiles (*40*). However, the riser response is not limited to cross-flow vibrations at a single frequency only; it involves an in-line response as well, which is coupled to the cross-flow motion, and, in addition, multiple frequencies may be excited (*30*, *31*). Hence, in the third task, the ITT aimed to learn a single QoI of *C _{lv}* for a rigid cylinder undergoing combined in-line and cross-flow forced vibration in uniform flow at

*Re*= 5715 and at either a single frequency or two, as follows

*A*/

_{y}*d*and

*A*

_{y2}/

*d*were selected within [0.05, 1.2];

*A*/

_{x}*d*and

*A*

_{x2}/

*d*were selected within [0.05, 0.4]; θ and θ

_{2}were selected in [0, 2π];

*V*= 1/

_{r}*f*is the reduced velocity, the inverse of the reduced frequency and was selected in [4, 8]; and

_{r}*V*

_{r2}was selected in [2, 15]. This part is also an important step in parametric analysis and is based on disciplinary knowledge, but future work could also automate this part using the aforementioned concepts for the upstream preparation for experimentation to formulate proper hypotheses and scalings (

*11*–

*13*).

Compared with approximately 10^{8} experiments required for double-frequency tests using uniform sampling strategy, the ITT obtained converged results of *C _{lv}* with 3944 experiments. To show the effect of the second vibration frequency on

*C*associated with the first frequency, we define χ

_{lv}*to be the average value of*

^{j}*C*for the double-frequency experiment with input ((

_{lv}*A*/

_{y}*d*,

*A*/

_{x}*d*,

*V*, θ)

_{r}*, (*

^{i}*A*

_{y2}/

*d*,

*A*

_{x2}/

*d*,

*V*

_{r2}, θ

_{2})

*) that 2400 (*

^{j}*A*/

_{y}*d*,

*A*/

_{x}*d*,

*V*, θ) combinations are randomly selected in the parametric space

_{r}*N*= 2400 and standard Morris sensitivity analysis (

*41*) is performed on χ, with 100 discrete levels along each dimension of the parametric space and 1000 elementary effects per parameters, resulting in

*j*= 1, 2, … 5000.

Figure 6 plots the comparison of *C _{lv}* versus

*V*and θ/π for rigid cylinder in-line and cross-flow combined forced vibration in uniform flow at

_{r}*Re*= 5715 between single- and double-frequency experiments at fixed

*A*/

_{x}*d*= 0.15 and

*A*/

_{y}*d*= 0.75, as well as the sensitivity analysis on χ

*. The results reveal that (i) with the existence of the second frequency component,*

^{j}*C*associated with the first frequency was found to be dependent on θ. More specifically, positive

_{lv}*C*was found to be mainly associated with θ ∈ [0, π] of counterclockwise in-line and cross-flow trajectory, similar to that of the single-frequency vibration (

_{lv}*26*,

*29*) shown in Fig. 6A. (iii) The sensitivity analysis in Fig. 6C indicates that

*V*

_{r2}had a much stronger effect on

*C*associated with the first frequency compared with

_{lv}*A*

_{y2}/

*d*,

*A*

_{x2}/

*d*, and θ

_{2}, where μ

_{χ}on the

*x*axis is the mean of the individual elementary effects (thus, the sensitivity of the parameter tested alone). Also, σ

_{χ}on the

*y*axis represents the SD of the elementary effects (thus, the sensitivity of the parameter tested in interaction with other parameters), which is also revealed by comparisons among panels B.1 to B.5 of Fig. 6.

## DISCUSSION

Our results bear promise for accelerating discovery in experimental science and for a potential paradigm shift in experimental laboratories around the world for new research procedures based on a combination of robots, machine learning, and humans synergistically. The idea is simple to implement as we resort to laboratory robots to perform automatic sequential learning tasks of studying the scientific hypotheses raised by humans or synthesized by both humans and AI technologies. With the newly constructed ITT, we studied one of the canonical fluid-structure interaction problems, VIV of bluff bodies, using rigid cylinder forced vibration experiments. The study serves as a realization of the not-so-new idea of the scientific robotic researcher. It demonstrates that, with a careful calibration of the inherent uncertainty of the experimental facility and a selection of the proper machine learning tools (in the current research, basis and kernel functions of the GPR for QoIs), the ITT is capable of (i) adaptively and intelligently designing and conducting sequential experiments to study targeted QoIs (in the current research, hydrodynamic coefficients of a forced vibrating cylinder in uniform flow); (ii) revealing the complex physics of the nonlinear system with the same level of accuracy but a reduced number of experiments by orders of magnitude compared with the traditional experimental sampling strategy; and (iii) exploring a wider parametric space (in the current research, up to eight parameters) for new physical insights and scalings, which was infeasible in past research and thus accelerates the scientific discovery.

One of the benefits for the GPR learning results is not addressed but should be mentioned here: When the learning process stops, the ITT provided not only a collection of experiment data but also, more importantly, an accurate functional approximation of the targeted QoI. Such a functional representation opens new possibilities to use various optimization tools while incorporating additional physical insights as constraints when applying the acquired data to predict or understand more complicated problems. For example, when predicting the VIV response of a long, slender marine riser in the ocean current using the hydrodynamic coefficients acquired from the rigid cylinder forced vibration experiment, a method of searching data tables contained in databases and using parametric interpolation is currently used (*40*). The use of deep learning methods allows the application of effective optimization methods, expanding the parametric space to represent appropriate riser physical conditions and enabling the study of complex phenomena, such as the recently observed flexible cylinder VIV hysteretic response associated with mode switch (*42*) or extremely large vibrations observed in some experiments (*43*).

In the current active learning framework for adaptive sequential experimentations, shown in the flowchart of Fig. 1, we selected GPR as the main learning tool. As we have shown, the performance of the GPR learning, such as the convergence rate, depends heavily on the choice of the kernel functions. In our current study, the candidates were drawn from several standard kernel functions of the Matern family. Other models, such as the newly developed neural net–induced Gaussian process (*44*), should be tested to improve the GPR learning performance, especially for rare events. The selection of the “best” kernel of each QoI in the current research is merely via brute force by comparing the performance of the converged GPR learning results with different kernels of the same experiment. Several recent studies (*45*, *46*) on learning the kernel from the data could be exploited so that, in a sequential experiment, not only the prediction of the QoIs but also the kernels are updated. With an increasing number of experiments, a “better” form of the kernel will emerge to represent the experimental data, which may accelerate the learning process but, more interestingly, may better reveal or interpret the hidden physics of the data by examining the learned kernel form.

One limitation of using GPR is that the computational expense quickly increases with the number of experiments (*47*). This limits the total experiment number and the dimension for the input parametric space. Deep neural networks (NNs) are known for their high expressivity and the ability to handle large dimensions of input parameters and big data (*48*). Hence, the next generation of the ITT should include a deep NN that can handle problems with hundreds of input parameters, unlike GPR. This will require robust methods of uncertainty quantification of NN (*49*), which is a subject of ongoing work (*50*).

The searching strategy is another key component in the active learning for the sequential experimentation. In the current study, we found the parameters of the next experiment input by merely applying the strategy of finding the maximum of the SD. In the future, prior knowledge of the problem (*51*), such as the well-understood physics and/or the engineering requirements, can be incorporated to form acquisition functions with multiple objectives and/or additional physical-informed constraints (*52*).

Although the robotic part of the apparatus described here is relatively simple, consisting of automatically conducting forced vibration experiments with prescribed motions, our laboratory has pioneered and made available the use of a more complex apparatus for fluid mechanics research involving simultaneous experimental testing of models connected with virtual systems through real-time simulation (*53*, *54*), which we can use within the same scheme. Our methodology has been emulated in other laboratories, denoted as cyber-testing (*55*–*57*), enabling complex system representation in the laboratory. Hence, the same procedure is applicable with complex cyber-physical systems that constitute an elaborate robotic apparatus.

The machine learning methodology in this paper is not limited to fluid mechanics and can be easily transferred to other areas, e.g., in experimental solid mechanics, where a large number of specimens are required to quantify the modulus of elasticity, the yield stress, and the onset of fracture. Hence, combined with advanced manufacturing technologies (*58*–*60*) that are capable of generating versatile prototypes in a short amount of time, we foresee great potential for automatic sequential experimentation to map material and structural properties (*61*) to obtain understanding that may lead to new advances, such as developing the next generation of morphing wings for aviation (*62*). Similarly, this methodology is readily applicable to nondestructive evaluation of materials, where uncertainty quantification and automation will accelerate considerably such testing. Furthermore, in an application outside the well-controlled environment of the laboratory, the methodology can be used with multiple, inexpensive robots to form dynamic swarms (*63*, *64*) that will enable adaptive and swift monitoring and exploration of the environment (*65*).

In conclusion, robotic scientists should be playing a greater role in the automation of science, in particular in engineering, where there are many opportunities to implement machine learning methodologies analogous to the one presented here.

## MATERIALS AND METHODS

In this section, we first review GPR, the learning algorithm selected in the current research. Then, we define the hydrodynamic coefficients of the rigid cylinder in the forced vibration experiments conducted, followed by the review of Morris method for global sensitivity analysis.

### Gaussian process regression

GPR is a nonparametric method of modeling unknown functions from a finite set of training points to make predictions, and it has been successfully applied in various fields to explore hidden physics from data (*65*–*69*). Furthermore, GPR provides quantification of the uncertainty based on the selected kernel function and the estimated measurement uncertainty, which can guide the sequential experimentation adaptively to explore the parametric space and predict QoIs. We briefly review GPR because it is an essential component of ITT’s algorithm, but we refer readers to the book by Rasmussen and Williams (*34*) for a detailed exposition.

The experiment dataset *n* observations of the output *y* with the input **x** ∈ ℝ* ^{s}* of

*s*dimensions. Let us consider the following map between the inputs and output

**h**(

**x**) are a set of fixed basis functions that transform the input

**x**∈ ℝ

*into new vectors of*

^{s}**h**(

**x**) ∈ ℝ

*and*

^{q}**β**is a

*q*-by-1 vector of basis function coefficients.

*f*(

**x**) ∼ GP(0,

*k*(

**x**,

**x**′)) is the bias of a zero mean GP with the covariance (kernel) function

*k*(

**x**,

**x**′) parameterized by a set of hyperparameters

**ϑ**, and thus

*k*(

**x**,

**x**′∣

**ϑ**), and the measurement noise ε ∼

*N*(0, σ

^{2}) is assumed independent between every two outputs. Therefore, the probability distribution of outputs

**y**of

*n*observations given

*X*(matrix of size

*r*×

*n*) and

**f**(vector of size

*n*× 1) can be modeled as follows

The joint distribution of **f** = (*f* (**x**_{1}), *f* (**x**_{2}), …, *f* (**x*** _{n}*))

*given*

^{T}*X*is

*K*(

*X*,

*X*) is the covariance matrix of the following form

To estimate the parameters **β**, **ϑ**, and σ^{2}, given the dataset 𝔻, we maximize the marginal likelihood *P*(**y**∣*X*) = *P*(**y**∣** f**,

*X*)

*P*(

**∣**

*f**X*) ∼

*N*(

*H*

**β**, σ

^{2}

*I*+

*K*(

*X*,

*X*∣

**ϑ**)) as a function of

**β**,

**ϑ**, and σ

^{2}. Therefore, the best estimate

With the best estimate of *y** with the input ** x*** as follows

Therefore, the parameters of the next experiment input **x***) as a function of **x*** as follows

### Using the rigid cylinder forced oscillation to study VIV

In this section, we describe the three types of rigid cylinder forced vibration experiment performed in the current study and define the corresponding hydrodynamic coefficients.

*Cross-flow only forced vibration*

A cylinder of length *l* and diameter *d* is forced to follow a sinusoidal trajectory in the cross-flow direction to the uniform inflow with velocity *U* as follows*Y*_{0} and *f*_{0} are the cross-flow oscillation amplitude and frequency, respectively. The lift and drag forces on the cylinder can be modeled as follows*D _{m}* is the magnitude of the mean drag force;

*L*

_{0}and

*D*

_{0}are the magnitudes of the oscillating lift and drag forces at frequencies

*f*

_{0}and 2

*f*

_{0}, respectively; ϕ

_{0}is the phase difference measured between the cross-flow motion and the oscillating lift force; and φ

_{0}is the phase difference measured between the cross-flow motion and the oscillating drag force. Therefore, the hydrodynamic coefficients—i.e., mean drag coefficient

*C*, lift coefficient in phase with velocity

_{d}*C*, and added mass coefficient in the cross-flow direction

_{lv}*C*—are functions of the nondimensional cross-flow amplitude

_{my}*A*/

_{y}*d*=

*Y*/

_{o}*d*and the reduced frequency

*f*=

_{r}*f*/

_{o}d*U*as follows

Therefore, from the experiments, we can measure the three hydrodynamic coefficients as follows

*In-line and cross-flow combined forced vibration*

When the rigid cylinder is forced to oscillate harmonically in uniform inflow, its trajectory can be described as follows*X*_{0} is the oscillation amplitude in the in-line direction and θ is the phase difference imposed between the in-line and cross-flow motions, and we define the trajectory as counterclockwise when θ ∈ [0, π] and clockwise when θ ∈ [π, 2π]. The lift and drag forces on the cylinder can be again modeled as in Eq. 2. The hydrodynamic coefficients are hence functions of four parameters as follows*A _{x}*/

*d*=

*X*/

_{o}*d*is the nondimensional in-line amplitude and

*V*= 1/

_{r}*f*is the reduced velocity, inverse of the reduced frequency. The targeted QoI in the current paper,

_{r}*C*, of rigid cylinder in-line and cross-flow combined forced vibration is the same as in Eq. 4.

_{lv}*In-line and cross-flow combined forced vibration with double frequencies*

Instead of single-frequency vibration, when vibrating simultaneously at two frequencies, the cylinder trajectory is prescribed as follows*X*_{2}, *Y*_{2}, and θ_{2} are the in-line, cross-flow amplitude, and phase difference of the second vibration frequency *f*_{2}. Therefore, the lift and drag forces on the cylinder can be modeled as*L*_{2} and *D*_{2} are the magnitudes of the lift and drag forces associated with second frequency *f*_{2} and ϕ_{2} and φ_{2} are the phases between forces and motions of second frequency *f*_{2}. The targeted QoI in the current paper, (*C _{lv}*)

_{double}, is associated with the first vibration frequency

*f*

_{0}, and it is a function of eight parameters, i.e.,

### Morris method for global sensitivity analysis

This method is widely used to screen the important input parameters for a given model or problem (*41*). Given a normalized input space *S* = [0,1]* ^{s}* with an

*s*-dimensional,

*p*-level full factorial grid that is

*x*∈ {0,1/(

_{i}*p*− 1),2/(

*p*− 1), …,1} for

*i*= 1, …,

*s*, for a given value of

**x**∈

*S*, the elementary effect of

*x*can be calculated as follows

_{i}*p*− 1) and therefore

*x*≤ 1 − Δ. Given the output

_{i}*y*with a screening plan

*X*, the sample mean and SD of a set of δ

*(*

_{i}**x**) values can be estimated for each input parameter.

The screening plan *X* is built from the sampling matrix *B*, where *B* is a (*s* + 1) × *s* matrix of 0’s and 1’s with the key property that, for every column *i* = 1, …, *s*, there are two rows of *B* that differ only in their *i*th entries. We denote *B*, and it can be expressed as follows*J*_{l, m} is *l*-by-*m* matrix of 1’s; *s*-dimensional diagonal matrix where each element on the diagonal is either +1 or −1 with equal probability; *s*-dimensional, *p*-level discretized input space *S*; *s*-by-*s* random permutation matrix, where each column contains only one element of 1 and all other equal to 0 and no two columns have 1’s in the same position. Therefore, to evaluate *r* elementary effects for each parameter, the screening plan *X* is constructed from *r* random orientations as follows

## SUPPLEMENTARY MATERIALS

robotics.sciencemag.org/cgi/content/full/4/36/eaay5063/DC1

Table S1. Statistics of hydrodynamic coefficients of a stationary rigid cylinder in uniform flow at *Re* = 12,000.

Fig. S1. Histogram of *C _{d}* of a stationary rigid cylinder in uniform flow at

*Re*= 12,000.

Fig. S2. Evolution of GPR learning sequence for *C _{lv}* of a rigid cylinder forced vibration in uniform flow at

*Re*= 12,000.

Fig. S3. *C _{lv}* of a rigid cylinder forced vibration from GPR learning at various

*Re*number values.

Fig. S4. *C _{d}* of a rigid cylinder forced vibration from GPR learning at various

*Re*number values.

Fig. S5. *C _{my}* of a rigid cylinder forced vibration from GPR learning at various

*Re*number values.

Data file S1. Sequential experimental data for hydrodynamic coefficients of a cross-flow only vibrating rigid cylinder at *Re* = 12,000.

Data file S2. Sequential experimental data for hydrodynamic coefficients of a cross-flow only vibrating rigid cylinder at various *Re* from 1200 to 19,000.

Data file S3. Sequential experimental data for hydrodynamic coefficients of a cross-flow and in-line combined single-frequency vibrating rigid cylinder at *Re* = 5715.

Data file S4. Sequential experimental data for hydrodynamic coefficients of a cross-flow and in-line combined double-frequency vibrating rigid cylinder at *Re* = 5715.

Movie S1. Experimental process of ITT sequential learning on *C _{lv}* of a cross-flow only vibrating rigid cylinder at

*Re*= 12,000.

Movie S2. ITT sequential experiment of *Re* effect on a cross-flow only vibrating rigid cylinder.

This is an article distributed under the terms of the Science Journals Default License.

## REFERENCES AND NOTES

**Acknowledgments:**We thank S. Mowlavi and T. Sapsis for helpful discussions.

**Funding:**This work has been possible thanks to DARPA support and F. Farhoo and J. Vanderbrande, under EQUiPS program grant HR0011517798. We also acknowledge support from Shell, Subsea 7, and MIT Sea Grant College Program.

**Author contributions:**D.F., G.E.K., and M.S.T. conceived the project. D.F., G.J., and T.R.C. designed and constructed the facility. D.F. performed the experiment. D.F. and Y.M. performed the analysis. D.F., G.J., T.R.C., L.B., Y.M., L.R.K., G.E.K., and M.S.T. contributed to the manuscript.

**Competing interests:**The authors declare that they have no competing financial interests.

**Data and materials availability:**All data needed to evaluate the conclusions in the paper are present in the paper or the Supplementary Materials.

- Copyright © 2019 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works