UTILIZING BAYESIAN NEURAL NETWORKS TO MODEL THE OCEAN-ATMOSPHERE INTERFACE.
Uncertainties in tropical cyclone (TC) intensification forecasting persist in modern forecasting models. Recent storms such as TC Matthew (2016) and TC Irma (2017) demonstrated our limitations in predicting intensification and rapid intensification events in a changing thermodynamic environment. One component of improving forecasting capability is representing the Ocean-Atmosphere Interface (OAI), also referred to as Air-Sea Interaction (ASI), within a model. Difficulties arise in resolving fluxes between atmosphere and ocean models to compute energetic and physical parameters within the interface. Research is underway to determine the model complexity necessary to simulate the OAI (1). Ongoing research suggests climate change as a primary driver behind the change in environmental behavior. Abdullah et al. identified several atmospheric and oceanic physical parameters that comprise the OAI (2). We extend the principal goal of establishing a technique to resolve the OAI by implementing a Bayesian Neural Network (BNN, also called a Bayesian Belief Network) statistical model. We treat the OAI holistically; our approach therefore linearizes the physical parameters to simplify the model. We tabulate decadal data for several physical parameters within the Gulf of Mexico (GOM) domain to build the BNN and make a probabilistic prediction of the OAI magnitude.
Ocean-Atmosphere Interface (OAI). In general, large-scale modeling of the ocean and atmosphere involves many complex physical equations, multi-dimensional regression and other techniques (3). Such techniques are inherently nonlinear in order to resolve physical, energy and chemical fluxes across various boundary solutions. A model ensemble is required to model a tropical cyclone, generally comprising atmospheric, oceanic and vortex models (3). Generally, solutions from one model are passed to the next, given established boundary conditions, to generate a tropical storm. The OAI exchange of fluxes occurs at the interface of the ocean-atmosphere boundary and through compensatory dynamical circulations that maintain the observed climate of the planet (3). However, the representation of the energy flux interface between the ocean and atmosphere is either poor or nonexistent (1,3). This interface is important for understanding both cyclogenesis and intensification.
OAI Current Implementation. Current schemes to represent the OAI include the Message Passing Interface Princeton Ocean Model for Tropical Cyclones (MPIPOM-TC) (3) within an ensemble, such as NOAA's Hurricane Weather Research and Forecast (HWRF) model. Energy exchanges occur at the flux boundary of a statistically accurate sea surface temperature (SST) field that is supplied as input to the model. Issues arise for both coupled and uncoupled schemes. An uncoupled hurricane model with a static SST field is restricted by its inability to account for SST changes during model integration, which can contribute to a high intensity bias (4). A hurricane model coupled to an ocean model that does not account for fully three-dimensional ocean dynamics may only account for some of the hurricane-induced SST changes during model integration (5).
OAI Linearization. As expressed by Pond, the OAI is a highly coupled, non-linear system that should be treated as a single entity (6). We attempt to simplify the OAI by treating it as a holistic, linear system, primarily to address uncertainties. To this end, this paper hypothesizes that the OAI can be treated effectively as a linear system in which ocean and atmosphere physical parameters represent scalar magnitudes of the OAI. Additionally, we investigate whether a linearized OAI model can assist a TC ensemble to improve TC intensification prediction (Figure 1).
Description. To understand the causal impacts of the varying atmospheric and ocean parameters that comprise the OAI, a dataset was constructed specifically for statistical models to draw inferences about causal impacts and their potential relationships. We assumed no time or space dependency and considered only the magnitude of each parameter. We concentrated on measurements from the GOM basin, primarily because of its high TC activity. Each value is taken as an area-averaged measurement (except for Convective Available Potential Energy, or CAPE, at the time this study was conducted) for August, September and October of each year, resulting in three data points per year over twelve years. The geospatial range of study within the GOM basin spans approximately 23° to 31° N latitude and approximately 83° to 97° W longitude.
Sources. We collected data from multiple sources, including the NOAA National Centers for Environmental Prediction (NCEP), the NOAA Atlantic Oceanographic and Meteorological Laboratory (AOML), the NOAA Earth System Research Laboratory (ESRL), the National Hurricane Center (NHC), the Cooperative Institute for Meteorological Satellite Studies (CIMSS), a joint project with the University of Wisconsin-Madison, and the University of Wyoming Department of Atmospheric Science.
Parameter Description. The dataset contains the following parameters.
Atmosphere Temperature Anomaly (ATA) is the mean temperature anomaly (°C), averaged monthly per year relative to the 1951-1980 base period (7), and is represented as a double-precision number.
Atmospheric Temperature (AirTemp) is the mean temperature (°C), averaged monthly, via the NCEP/NCAR reanalysis forecast system performing data assimilation using data from 1948 to present (8), and is represented as a double-precision number.
Atmospheric Carbon Dioxide (CO2) is the monthly average of atmospheric carbon dioxide (ppm) via the NOAA ESRL Global Monitoring Division at Mauna Loa (10) and is represented as a double-precision number.
Convective Available Potential Energy (CAPE) is the "area averaged" CAPE (J kg⁻¹) via the University of Wyoming (13) and is represented as a double-precision number.
Tropical Cyclone Heat Potential (TCHP) is the area/monthly averaged TCHP (kJ cm⁻²) via NOAA AOML (11) and is represented as a double-precision number.
Sea Surface Temperature (SST) is the monthly mean SST (°C) via the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) (12) and is represented as a double-precision number.
Precipitable Water Content (PWC) is the mean water content precipitated from a column of air (kg m⁻²) via the NCEP/NCAR reanalysis forecast system performing data assimilation using data from 1948 to present (8) and is represented as a double-precision number.
Mid-layer Atmospheric Wind Shear (WindShear) is the mean mid-level atmospheric wind shear (the change in wind speed and direction with height) via the University of Wisconsin and CIMSS (9) and is stored in numeric format.
Vertical Motions (VerticalMotion) is the vertical motion of updrafts (m s⁻¹) via the University of Wyoming (13) and is represented as a double-precision number.
OAI is the three-category representation of the state of the OAI, given the probability of all other parameters, and is represented as a string.
Uncertainties. Data collection devices (sensors, satellites, buoys, etc.) are subject to the elements and other factors, and data retrievals are not always consistent. Missing data is therefore an obvious limitation. Statistical techniques for data interpolation and extrapolation are necessary to overcome this limitation. The extent to which these techniques are applied depends on several constraints: the length of the record, its sparseness, whether the data are sufficient to interpolate from, and whether previous data are available to allow for extrapolation.
Bayesian Neural Network (BNN). Bayesian Belief Networks, or Bayesian Neural Networks, are easily implemented statistical models that capture reasoning under uncertainty in data by utilizing evidence from other data, domain expertise or both (14). For this reason, we identified Bayesian statistical modeling as a novel approach to stochastic prediction within a complex system, as identified by Berliner, Royle, Wikle and Milliff (1998) (15). Knowledge of the modeled domain is contained within directed acyclic graphs (DAGs), where each node contains a conditional probability table (CPT). BNNs use inference to derive insights between nodes. For example, if we can infer node C from node A with certainty (x), and we can infer node C from node B with certainty (y), what can we conclude about the certainty of node C? The certainty of node C will be a probabilistic calculation.
BNN requirements. A BNN requires that the network contain a set of nodes (or variables) and a set of directed edges between nodes. Further, such networks are restricted from containing cycles. Nodes and their connecting edges form a DAG (Figure 2), whereby each node has a finite set of mutually exclusive states. Each node A with parents B1, ... Bn has an attached CPT given by P(A|B1, ... Bn).
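The requirements above can be illustrated with a minimal Python sketch. It is not the authors' R code: the three-node network (A and B as parents of C), the state names and the probabilities are hypothetical values chosen only to show an acyclicity check and a CPT whose rows each sum to one.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical network: each key maps a node to the list of its parents.
parents = {
    "A": [],
    "B": [],
    "C": ["A", "B"],
}

def is_dag(parent_map):
    """Return True when the parent map contains no directed cycle."""
    try:
        # static_order() raises CycleError if no topological order exists.
        list(TopologicalSorter(parent_map).static_order())
        return True
    except CycleError:
        return False

# Hypothetical CPT for P(C | A, B): one row per parent configuration,
# each row a distribution over the mutually exclusive states of C.
cpt_c = {
    ("low", "low"): {"low": 0.9, "high": 0.1},
    ("low", "high"): {"low": 0.6, "high": 0.4},
    ("high", "low"): {"low": 0.5, "high": 0.5},
    ("high", "high"): {"low": 0.2, "high": 0.8},
}

def cpt_is_valid(cpt, tol=1e-9):
    """Check that every row of the CPT is a probability distribution."""
    return all(abs(sum(row.values()) - 1.0) < tol for row in cpt.values())

print(is_dag(parents), cpt_is_valid(cpt_c))
```

Adding an edge from C back to A in this sketch would make `is_dag` return False, which is the situation the acyclicity requirement rules out.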
Bayes' Theorem. Each node within a DAG contains a CPT built upon Bayes' theorem, which gives the probability of an event based upon prior knowledge of conditions that might be related to the event (14). The basic property for conditional probability, known as the posterior distribution, is given as,
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(A \cap B)}{P(B)} \qquad \text{(eq. 1)}$$
where the joint probability distribution, used to build joint probability tables (JPTs) for A and B, is the product of the prior distribution P(A) and the sampling distribution P(B|A), given as,
$$P(A \cap B) = P(A)\,P(B \mid A) \qquad \text{(eq. 2)}$$
The property for marginalization of a parameter within a DAG is,
$$P(A) = \sum_{B} P(A \cap B) \qquad \text{(eq. 3)}$$
A joint probability is updated by dividing the initial joint probability by the prior distribution of the "evidenced" parameter and multiplying the result by the distribution of the evidence. The property is given as,
$$P^{*}(A \cap B) = \frac{P(A \cap B)}{P(A)}\,D^{*}(A) \qquad \text{(eq. 4)}$$
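The four properties can be checked numerically on a small joint probability table. The following Python sketch is illustrative only: the two binary variables, their state names and every probability are made up, and the evidence distribution passed to the update step is an assumption chosen for the example.

```python
# Made-up joint probability table for two binary variables A and B.
joint = {
    ("a0", "b0"): 0.30,
    ("a0", "b1"): 0.10,
    ("a1", "b0"): 0.20,
    ("a1", "b1"): 0.40,
}

def marginal(table, axis):
    """Marginalize the joint table onto one variable (eq. 3)."""
    out = {}
    for key, p in table.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def posterior_a_given_b(table, b):
    """Posterior P(A | B=b): joint entry divided by P(B=b) (eq. 1)."""
    p_b = marginal(table, 1)[b]
    return {a: p / p_b for (a, bb), p in table.items() if bb == b}

def update_joint(table, evidence_a):
    """Updated joint: initial joint / prior P(A) * evidence on A (eq. 4)."""
    p_a = marginal(table, 0)
    return {(a, b): p / p_a[a] * evidence_a[a] for (a, b), p in table.items()}

p_a = marginal(joint, 0)                      # P(a0) = 0.4, P(a1) = 0.6
p_b1_given_a0 = joint[("a0", "b1")] / p_a["a0"]
# eq. 2: the prior times the sampling distribution recovers the joint entry.
assert abs(p_a["a0"] * p_b1_given_a0 - joint[("a0", "b1")]) < 1e-12

post = posterior_a_given_b(joint, "b1")       # P(a1 | b1) = 0.40 / 0.50
updated = update_joint(joint, {"a0": 0.7, "a1": 0.3})
print(post, marginal(updated, 1))
```

After the update, marginalizing onto B shows how evidence entered on A propagates to the belief in B, which is the mechanism that inference between nodes relies on.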
Implementation. We use the statistical programming language R and supporting packages to perform data preprocessing and construct the BNN.
Data Preprocessing. We tabulated the data into a dataframe of 37 rows and 10 columns and normalized each parameter to the interval (0, 1]. We set zero values equal to 0.01 because of interpreter formatting errors when splitting the data into categorical values. The values were shuffled into random order to prevent the model from fitting to the structure of the data. A correlation matrix was built to evaluate the strength of the relationships between the parameters (Figure 3). The parameter AirTemp shows weak correlation with atmospheric CO2, contrary to domain knowledge. AirTemp also correlates weakly with SST, again contrary to common domain knowledge. The parameter WindShear shows weak correlation more broadly, against all physical parameters, which is also contrary to domain knowledge. We removed the AirTemp parameter and constructed another matrix, given in Figure 4.
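A minimal Python sketch of this preprocessing step follows. The text does not name the normalization scheme, so min-max scaling is an assumption here, as is the use of Pearson correlation for the matrix entries; the sample values are invented and are not the study's data.

```python
import math

def min_max_normalize(values, floor=0.01):
    """Scale values to [0, 1], then lift exact zeros to `floor` so the
    result lies in (0, 1]. Min-max scaling is an assumed scheme."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant column cannot be min-max scaled")
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [floor if s == 0.0 else s for s in scaled]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Invented sample columns, for illustration only.
sst = [28.1, 29.4, 30.2, 29.0]
tchp = [55.0, 70.0, 82.0, 66.0]

sst_norm = min_max_normalize(sst)
tchp_norm = min_max_normalize(tchp)
print(sst_norm, pearson(sst_norm, tchp_norm))
```

Computing `pearson` for every pair of columns would fill a correlation matrix of the kind shown in Figures 3 and 4.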
The second matrix shows noticeable improvements, although the parameter WindShear continues to show very weak correlation with TCHP, SST and PWC. We decided to leave WindShear in the dataframe used to build the BNN, under the assumption that the model, given linear dependency from Bayes' theorem, would exclude it.
Constructing the BNN. We used the R package "bnlearn" to construct the BNN. The package provides DAG structure-learning methods such as Incremental Association Markov Blanket (IAMB), hill-climbing and others. We implemented the hill-climbing algorithm. To build CPTs, we call bnlearn's fitting function (bn.fit). JPTs are built and inference between nodes is performed by calling the "cpquery" function. We cut the dataframe values at the numeric breakpoints (0, 0.3, 0.7, 1.1), which gives three intervals, and label the resulting categories 1 to 3. The lower and upper breakpoints, 0 and 1.1 respectively, were chosen to abide by formatting rules within the R interpreter so that the data split appropriately. Because the minimum of the data was programmatically set to 0.01, we could place the lowest breakpoint at zero. Additionally, because the data maximum is 1, we could place the highest breakpoint at 1.1.
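The discretization step can be sketched in Python as follows. The breakpoints and labels come from the text; the choice of right-closed bins (lo, hi], which mirrors the default behavior of R's cut function, is an assumption, since the text does not say which R function was used. Under that assumption a value of exactly zero falls outside every bin, which is consistent with the 0.01 floor described earlier.

```python
from bisect import bisect_left

BREAKS = (0.0, 0.3, 0.7, 1.1)   # breakpoints quoted in the text
LABELS = (1, 2, 3)              # categorical labels quoted in the text

def discretize(value, breaks=BREAKS, labels=LABELS):
    """Map a normalized value to a category using right-closed bins.

    Assumed bin layout: (0, 0.3] -> 1, (0.3, 0.7] -> 2, (0.7, 1.1] -> 3.
    """
    if not (breaks[0] < value <= breaks[-1]):
        raise ValueError(f"{value} falls outside ({breaks[0]}, {breaks[-1]}]")
    # bisect_left over the interior breakpoints keeps a value that sits
    # exactly on a breakpoint in the lower bin.
    return labels[bisect_left(breaks[1:-1], value)]

column = [0.01, 0.3, 0.45, 0.7, 1.0]
print([discretize(v) for v in column])  # → [1, 1, 2, 2, 3]
```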
Initial BNN. The initial BNN (Figure 5) contains two separate DAGs. The left DAG has the parameter OAI as its base node, and the other DAG has the parameter SST as its base node. The BNN did not consider either ATA or WindShear to be statistically significant, which agrees with the results of the second correlation matrix. Further, neither ATA nor WindShear is linked to any node in either DAG.
Second BNN. We removed ATA and WindShear and rebuilt the BNN (Figure 6). The previous DAGs maintain their structure; however, the CAPE and PWC parameters, which are related to each other, incorrectly remain separated. The BNN is therefore incomplete, as we will not be able to generate inferences between all nodes.
Completed BNN. CAPE is physically related to PWC. As PWC increases, latent heat of evaporation also increases and contributes to the available potential energy of an air parcel (Abdullah et al.) (2). BNNs collect information from uncertainty within data, expert (or domain) knowledge, or both. In this case, we know CAPE and PWC are related; we therefore use this information to construct our completed BNN (Figure 7) by adding an edge between CAPE and PWC.
BNN Cpquery First Output. We now draw inferences from our BNN by probabilistic query. Our initial cpquery call in R asks, "what is the likelihood that low CAPE, mid TCHP, mid PWC, mid Vertical Motion, and low atmospheric CO2 can predict a moderately favorable OAI?" (Table 1). We computed a result of 0.6612903, or approximately a 66 percent probability that a moderately favorable environment to support TC development or intensification can occur given the physical parameter conditions.
BNN Cpquery Second Output. To check the stability of the model output, we ran a second cpquery call in R: "what is the likelihood that low CAPE, mid TCHP, mid PWC, mid Vertical Motion, low atmospheric CO2 and low SST can predict a moderately favorable OAI?" (Table 1). We computed a result of 0.85, or approximately an 85 percent probability that a moderately favorable environment to support TC development or intensification can occur given the physical parameter conditions.
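The idea behind such a conditional probability query can be sketched in Python with simple rejection (logic) sampling; bnlearn's documentation describes cpquery as a sampling-based estimate, so its output can vary between runs. Everything below is hypothetical: the three-node chain PWC → CAPE → OAI, the state names and all CPT values are invented for illustration and are not the fitted network from this study.

```python
import random

# Hypothetical CPTs, invented for illustration only.
P_PWC = {"low": 0.5, "mid": 0.5}
P_CAPE_GIVEN_PWC = {
    "low": {"low": 0.7, "mid": 0.3},
    "mid": {"low": 0.4, "mid": 0.6},
}
P_OAI_GIVEN_CAPE = {
    "low": {"weak": 0.4, "moderate": 0.6},
    "mid": {"weak": 0.2, "moderate": 0.8},
}

def draw(dist, rng):
    """Draw one state from a {state: probability} table."""
    states = list(dist)
    return rng.choices(states, weights=[dist[s] for s in states], k=1)[0]

def sample_network(rng):
    """Sample every node in topological order (parents before children)."""
    pwc = draw(P_PWC, rng)
    cape = draw(P_CAPE_GIVEN_PWC[pwc], rng)
    oai = draw(P_OAI_GIVEN_CAPE[cape], rng)
    return {"PWC": pwc, "CAPE": cape, "OAI": oai}

def cp_query(event, evidence, n=20000, seed=0):
    """Estimate P(event | evidence): keep samples that match the evidence,
    then return the fraction of kept samples that also match the event."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(n):
        s = sample_network(rng)
        if all(s[k] == v for k, v in evidence.items()):
            kept += 1
            hits += all(s[k] == v for k, v in event.items())
    if kept == 0:
        raise ValueError("no samples matched the evidence")
    return hits / kept

estimate = cp_query({"OAI": "moderate"}, {"CAPE": "low", "PWC": "mid"})
print(estimate)
```

In this toy chain the exact answer is the CPT entry P(OAI = moderate | CAPE = low) = 0.6, so the printed estimate should be close to that value. Adding a further node to the evidence, as the second query does with SST, narrows the set of matching samples and can shift the estimate.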
Our approach to linearizing the OAI with a BNN provided insight into what we understand theoretically about OAI behavior. The BNN demonstrated sensitivity to its parameters, verifying that small changes in the system can produce considerable changes in the output. Further, it is feasible to implement a large-scale BNN; however, problems may arise in data integration (i.e., model-generated data versus raw observations) given the assumptions we used. Acquisition of a larger, higher-resolution data set is in progress to continue testing and verifying our BNN. Because our data set was relatively small, the parameters ATA, WindShear and AirTemp were too coarse in variance and therefore were not considered statistically significant. This narrow variance is a product of the coarse resolution of the NOAA/NCEP reanalysis maps, which are global in scale, whereas our domain is the GOM. Data for the WindShear parameter was retrieved for mid-layer atmosphere winds (850-500 mb), and we will consider capturing features near sea level (1000-850 mb) in our continued research. Although the parameters have been linearized for the BNN, the model must still represent the complexity of the OAI in terms of the parameters that exist. In future study, we will add further physical conditions to the BNN. Finally, the OAI category must represent this order of complexity. We are developing a heuristic over our expanding dataset using machine learning algorithms to classify various magnitudes of the OAI.
1. Soloviev, A.V., Lukas, R., Donelan, M.A., Haus, B.K., Ginis, I., 2014: The air-sea interface and surface stress under tropical cyclones. Scientific Reports, Vol. 4, 5306.
2. Abdullah, W.S., Reddy, R., Heydari, E., Walters, W., 2016: A Study of Large-Scale Surface Fluxes, Processes and Heavy Precipitation Associated with Land Falling Tropical Storm Lee over Gulf of Mexico using Remote Sensing and Satellite Data. Mississippi Academy of Sciences, Vol. 62.
3. Tallapragada, V., et al 2014: Hurricane Weather Research and Forecast (HWRF) Model 2014 Scientific Documentation. NOAA/NWS/NCEP.
4. Bender, M. A. and I. Ginis, 2000: Real case simulation of hurricane-ocean interaction using a high-resolution coupled model: Effects on hurricane intensity. Mon. Wea. Rev., 128, 917-946.
5. Yablonsky, R. M. and I. Ginis, 2009: Limitation of one-dimensional ocean models for coupled hurricane-ocean model forecasts. Mon. Wea. Rev., 137, 4410-4419.
6. Pond, S., 1972: The Exchanges of Momentum, Heat and Moisture at the Ocean-Atmosphere Interface. Numerical Models of Ocean Circulation, National Academy of Sciences Proceedings. Oct. 17-20, 26-38.
7. GISTEMP Team, 2016: GISS Surface Temperature Analysis (GISTEMP). NASA Goddard Institute for Space Studies. Dataset accessed 2016-11-15 at http://data.giss.nasa.gov/gistemp/.
8. The NCEP/NCAR 40-Year Reanalysis Project: March, 1996 BAMS
9. University of Wisconsin--CIMSS. Data accessed 2016-11-15 at http://tropic.ssec.wisc.edu/tropic.php
10. NOAA ESRL Global Monitoring Division at Mauna Loa, Hawaii. Data accessed 2016-11-15 at http://www.esrl.noaa.gov/gmd/ccgg/trends/graph.html
11. NOAA AOML. Data accessed 2016-11-15 at http://www.aoml.noaa.gov/phod/cyclone/data/fulllist.html
12. NOAA ESRL ICOADS Dataset. Data accessed 2016-11-14 at http://www.esrl.noaa.gov/coads/coads_cdc netcdf.shtml
13. University of Wyoming Department of Atmospheric Science soundings. Data accessed 2016-11-14 and 2016-11-15 at http://weather.uwyo.edu/upperair/sounding.html
14. Finn V. Jensen. An Introduction to Bayesian Networks. UCL Press, 1996. Chap. 2.
15. Berliner, M.L., Royle, A.J., Wikle, C.K., Milliff, R.F., 1998: Bayesian Methods in the Atmospheric Sciences. Bayesian Statistics, Vol. 6, 66-100.
Warith Abdullah, Remata Reddy, Cary Butler and Wilbur Walters
Department of Physics, Atmospheric Sciences & Geosciences
Jackson State University at Jackson
1400 J.R Lynch Street
Jackson, Mississippi 39217 USA
Corresponding Author: Warith Stone Abdullah, email: firstname.lastname@example.org
Caption: Figure 1. Generalized Model Ensemble set-up with OAI Model as an uncertainty resolver.
Caption: Figure 2. General DAG diagram.
Caption: Figure 3. Initial correlation matrix for dataframe.
Caption: Figure 4. Second correlation matrix, parameter "AirTemp" removed.
Caption: Figure 5. Initial BNN. Two DAGs are constructed, two parameters, ATA and WindShear, are not considered.
Caption: Figure 6. Second BNN configuration. Two DAGs remain as originally built, ATA and WindShear are removed.
Caption: Figure 7. Expert inferred completed BNN. An edge is added between CAPE and PWC parameters.
Author: Abdullah, Warith; Reddy, Remata; Butler, Cary; Walters, Wilbur
Publication: Journal of the Mississippi Academy of Sciences
Date: Jul 1, 2018