# Independent Subspace Analysis of the Sea Surface Temperature Variability: Non-Gaussian Sources and Sensitivity to Sampling and Dimensionality.

1. IntroductionThe climate system is a highly complex dynamical system with many nonlinearly interacting modes of variability. A characteristic feature of the system is its non-Gaussianity. The nonnormality of climate is an important determinant of extreme events and may provide insights into understanding the system dynamics [1-3]. For example, it is generally accepted that understanding non-Gaussian statistics of weather and climate has important consequences in atmospheric research not least because weather/climate risk assessment depends on understanding the tail of the system probability density function (PDF). Because of the complex interactions involved in the system, climate signals tend to be mixed. An important problem in climate research is to be able to disentangle main (or independent) signals from the mixture.

Empirical orthogonal functions (EOFs) and closely related methods (e.g., [4]) find signals that explain maximum variance but do not address the problem of mixing. An appropriate method that addresses the latter problem is independent component analysis (ICA). ICA is a method for separating mixed signals into their sources. It is based on the assumption that different sources, stemming from different physical and dynamical processes, can be considered to be statistically independent and can be separated by ICA.

ICA can be relevant to climate research to identify and separate possible anomaly patterns that may have different forcing [5]. ICA has been used in climate research only lately. For example, [6] applied ICA to the study of the West African vegetation on intraseasonal and interannual time scales. Reference [7] analyzed sea level pressure via ICA to identify the main contributors to the Arctic Oscillation signal.

Other authors applied ICA to the sea surface temperature (SST) and surface temperature [8-10].

Conventional ICA is essentially a one-dimensional algorithm in that the sources are unidimensional. In reality, however, and given the high nonlinear character of climate one cannot expect all source components (i.e., unidimensional) to be statistically independent. Rather, this can be relaxed in favor of finding groups of hidden (independent) sources. This has led to the development of an extension of ICA to the multidimensional ICA also known as Independent Subspace Analysis or ISA [11-14], already used in the analysis of the atmospheric variability [15]. Most common algorithms used to obtain independent sources are based on maximizing various information measures such as negentropy (NE).

The SST field is an important source of information to weather and climate variability as it constitutes the bottom boundary forcing of the overlying atmosphere and controls air-sea interaction through sensible and latent heat fluxes. SST variability is characterized by high nonlinearity and non-Gaussianity [1, 16-19]. In this manuscript, we analyze the SST using ISA to identify groups of independent sources. We use high-order cumulant tensors of coskewness and cokurtosis combined with high-order singular value decomposition (HOSVD) and minimize the generalized mutual information (MI) to obtain the independent sources. However, the performance of the methods may be hindered by sample sizes, serial correlation, or high dimensionality, well known characteristics of weather/climate. Therefore, we present statistical tests examining the biases of NE and MI and the effects of the sampling and dimensionality, which helps in finding robust non-Gaussian subspaces.

The manuscript is organized as follows. Section 2 presents the separation method and corresponding statistical tests. Section 3 presents application to the SST anomaly data and to a synthetic non-Gaussian surrogate consistent with the same data. A summary and conclusion are provided in the last section.

2. The Method

2.1. Cumulant-Based Independent Subspace Analysis. We designate by [mathematical expression not reproducible] space-time data matrix anomaly (of rank [N.sub.r]), representing [N.sub.t] realizations of a random vector [mathematical expression not reproducible]. Principal Component Analysis (PCA) or Singular Value Decomposition (SVD) [20] are used to construct the PCs: [mathematical expression not reproducible]. Te method described below is applied to any whitened or isotropic random vector y of dimension N [less than or equal to] [N.sub.r] obtained from [x.sub.PC](t) or to a subsequent orthogonal rotation z = R'y (RR' = [1.sub.N]) (note that since y is isotropic and standard multivariate Gaussian, then the same holds for z). The source separation, or the factorization of the pdf of y, [[rho].sub.y] into partial pdfs is not a trivial problem when y is non-Gaussian. In fact, this is the case when the joint negentropy (NE), [21], which is invariant under any linear homeomorphism, is strictly positive, that is, J(z) [equivalent to] E[log([[rho].sub.z]/[[phi].sub.z])] = J(y) [equivalent to] [J.sub.rot] > 0 where [[phi].sub.z] is the multinormal probability density function (pdf) with the same first two moments as z.

The pdf separation relies on the maximization of the sum of negentropies of a set of candidate sources. Precisely, let us consider the partition of the rotated vector z into a number [N.sub.s] of subvectors or candidate sources (e.g., scalars, dyads, or tuples) as [mathematical expression not reproducible]. Thanks to the separation theorem [13] or Negentropy Lemma [15], [J.sub.rot] decomposes as a sum of partial Ns negentropies plus the nonnegative generalized mutual information (MI) [mathematical expression not reproducible] [22] as

[mathematical expression not reproducible]. (1)

The goal of ISA is to find a rotated vector z = R'y which is split into the largest number [N.sub.s] of less statistically dependent subvectors minimizing MI. ICA is a restriction of ISA to the case of scalar sources, that is, dim([z.sub.(i)])) = 1 for all i. Gaussian distributed sources, (if they exist) are necessarily unidimensional and span the so-called Gaussian subspace [W.sub.g](z) of dimension [N.sub.g]. Its orthogonal complement (with respect to the covariance inner product), of dimension [N.sub.ng] = N-[N.sub.g], is [W.sub.ng](z) = [W.sub.g][(z).sup.[perpendicular to]], that is, [W.sub.g](z) [cross product] [W.sub.ng](z) = [R.sup.N].

The explicit MI dependence on R (with components [R.sub.ij], ..., i, j = 1, ..., N) is quite difficult to implement in general. Here, we use a cumulant-based contrast function, relying on the Edgeworth expansion approximation of the pdf of y [23],namely: [[rho].sub.y] [approximately equal to] [phi](y) [1 + [P.sub.Ed](y)], where [P.sub.Ed] is a multivariate linear combination of Hermite polynomials depending on cumulants up to a given p = [p.sub.max] maximum order, set here to [p.sub.max] = 4. The third and fourth-order cumulants of y (components of the coskewness and cokurtosis tensors) are

[mathematical expression not reproducible] (2)

for i, j, k, l in {1, ..., N}. They are invariant under permutation of indices (hypersymmetry). After applying a tensor change of basis, the resulting cumulants are

[mathematical expression not reproducible], (3)

written in terms of R entries for i, j, k, and l in {1, ..., N}. Cumulant tensors of any order p [greater than or equal to] 3 aggregate measures of the joint non-Gaussianity of all possible subvectors w of y. They vanish for all w when y is multinormal or for those w intersecting independent subvectors. To approximate NE, we introduce a positive defined matrix by contracting S and K as

[mathematical expression not reproducible], (4)

and similarly, for z, that is, M[(z).sub.ij] = [[summation].sup.N.sub.p,q=1] [R.sub.pi][R.sub.qj]M[(y).sub.pq]. The Edgeworth-based NE approximation is given by

[mathematical expression not reproducible], (5)

where, for example, [[parallel]S(z)[parallel].sup.2.sub.F] stands for the Frobenius' squared norm (sum of all squared components) of the tensor S. Note that there are examples of non-Gaussian pdfs verifying [J.sub.Ed](y) = 0, including isotropic pdfs, for which nonlinear methods must be used [24, 25], such as the nonlinear-PCA (NL-PCA) [26] which has been applied to climatic data [27-31].

The approximation (5) is scaled as [J.sub.rot]-[J.sub.Ed](z) = O([N.sup.-1.sub.eq]), if z behaves like an average of an equivalent number [N.sub.eq] of independent and identically distributed (iid) random variables. [J.sub.Ed](z) is appropriate to be used in ISA since it satisfies to an equivalent separation theorem (1):

[mathematical expression not reproducible], (6)

where [I.sub.Ed] collects all the coskewness and cokurtosis mixing components of the different candidate sources. Therefore, ISA holds when [I.sub.Ed] is minimum over the manifold SO(N) of orthogonal rotations in [R.sup.N]. This is related to the Joint Approximated Diagonalization Eigen-matrices' (JADE) method [32].

The rotation matrix R = ([r.sub.1], [r.sub.2], ..., [r.sub.N]) composed of the Singular vectors (SVecs) of M(y) (in columns) provides the Tucker decomposition [33] or high-order singular value decomposition (HoSVD) [34]. The HOSVD applied to cumulant tensors lead to the Principal Cumulant Component Analysis (PCCA) [35] and to the Joint Moment Component Analysis (JMCA), [36]. Those techniques find an optimal reduced basis accounting for most of the variance and negentropy explained by a set of cumulants, making thus a generalization of PCA.

The singular values (SVs) of M(y): [[eta].sub.1,N] [greater than or equal to] [[eta].sub.2,N] [greater than or equal to] ... [greater than or equal to] [[eta].sub.N,N] can be expressed as

[mathematical expression not reproducible], (7)

which merge all the negentropy terms where the projection of z on the ith SVec: [Z.sub.i] = [r'.sub.i]y takes part. It provides the interpretation for the leading SVecs as being the main axes where non-Gaussianity is sought. Note also that there are other scalar tensor invariants characterizing the joint non-Gaussianity, relying on linear combinations of Tr[[M.sup.k](z)], k [less than or equal to] N (e.g., determinant of M).

The vectors corresponding to nonnull SVs span the estimated non-Gaussian subspace [W.sub.ng-Ed](z) whereas the kernel Ker(M) spanned by the trailing SVecs associated with zero SVs spans the estimated Gaussian subspace [W.sub.ng-Ed](z) as far as third- and fourth-order cumulants are concerned. Note, however, that the method provides an overestimation of the Gaussian manifold [W.sub.ng-Ed](z) [contains not equal to] [W.sub.g](z) and an underestimation of the non-Gaussian manifold [W.sub.ng-Ed](z) [subset or equal to] [W.sub.ng](z).

To summarize, the first step of source separation is to discard the Gaussian subspace and restrict the analysis to vectors in [W.sub.ng-Ed](z) = Ker[[M(z)].sup.[perpendicular to]] of dimension [N.sub.ng-Ed]. Let us denote the non-Gaussian part of z by [mathematical expression not reproducible].

A given partition [mathematical expression not reproducible] that leads to separated true non-Gaussian sources, that is, [mathematical expression not reproducible], makes the invertible matrix M([z.sub.ng]) block-diagonal with [N.sub.s] blocks. The SVD of M([z.sub.ng]) yields blocks of SVecs: [w.sub.(i)]; i = 1, ..., [N.sub.s], spanning the same source's subspaces (i.e., Span[[w.sub.(i)]] = Span[[z.sub.(i)]]; i = 1, ..., [N.sub.s]) (up to rotation, permutation, or inversion of axes within each source). This, however, does not provide information on how to group the SVecs onto sources (source configuration indeterminacy), which is a complex problem.

In the presence of nonseparable multivariate sources, one will still be interested in finding, within each source, axes maximizing the single NEs. In other words, one solves the ICA problem in [W.sub.ng-Ed] by maximizing the NE sum [mathematical expression not reproducible] in (6), that is, solving an ICA problem [37, 38] providing independent components (ICs) [Z.sub.i].

For a large class of pdfs, [13] has showed and conjectured that the ISA solution results in general from the grouping of ICs into independent subspaces.

The general problem of grouping ICs coming from HOSVD followed by ICA benefits from a special decomposition of [J.sub.Ed]([z.sub.ng]) into positive terms, hereafter called interactivities (ITIs). The decomposition runs through all possible nonnull [mathematical expression not reproducible] subsets u [subset or equal to] [z.sub.ng] with dimension 1 [less than or equal to] [absolute value of u] [less than or equal to] min(m, [p.sub.max] = 4), where (m = [N.sub.ng-Ed]). Therefore,

[mathematical expression not reproducible]. (8)

As an example, if u = ([U.sub.1], [U.sub.2])' and keeping in mind the hypersymmetry, we get

[mathematical expression not reproducible]. (9)

The ITIs, for 1D, that is, dim(u) = 1 (self-interactivities), coincide with single NEs whereas for dim(u) [greater than or equal to] 2 they coincide with joint-interactivities, which provide a measure of the non-linear statistical relationships within u contributing to [J.sub.Ed]([z.sub.ng]). Using cumulant properties [39], it is straight forward to show that [J.sub.Ed-IT](u) = 0 when u has non-null intersections with different independent sources. Therefore, after computing all ITIs, the source identification consists of determining a partition [mathematical expression not reproducible] of z composed of subvectors of minimal dimension such that [J.sub.Ed-IT](v) = 0 for all v intersecting at least two subvectors [u.sub.([alpha])] and [u.sub.([beta])], [alpha] [not equal to] [beta]. We note that the NE decomposition is not unique. In fact, the concept of interaction information [40-42] provides a general frame for the decomposition of J([z.sub.ng]) in terms of interaction information, defined for every u [subset or equal to] [z.sub.ng], which are given by linear combinations of mutual information of subvectors v [subset or equal to] u. See [43] for a proper application of ITIs on nonlinear fluid dynamics for the assessment of nonlinear wave resonances.

2.2. Independent Subspace Analysis from Finite Samples

2.2.1. Statistical Tests of Multivariate Non-Gaussianity. The above source separation method depends crucially on the approximation of the expectation of any function /(z) of z-function: E[/(z)] (e.g., the cumulants (2), the HOSVD matrix M(z) (4), the negentropy (5), and its interactivities (8)), through its estimation obtained by sample averages: [bar.f(z)]. The sample high-order cumulant estimators may be quite biased due to the eventual presence of single and joint outliers on short samples. This is particularly relevant to weather/climate with relatively short datasets with nonnegligible serial correlations yielding a smaller effective number of degrees of freedom: [N.sub.dof] [less than or equal to] [N.sub.t]. The bias becomes even larger when we deal with high-order cumulants. In fact, the pth order (hypersymmetric) cumulant tensor in N dimensions has [C.sup.N+p-1.sub.p] = (N + p-1)!/[(N-1)!p!] independent entries, which scales as a power law of N. This can lead to a high bias of the cumulants of the leading SVs of M(z) and to a possible fictitious non-Gaussianity of the pdf, resulting from the "curse of dimensionality." Briefly speaking, the "false" non-Gaussianity maybe due to either small sample size, high serial correlation, high dimension N, or high-order statistics p. To avoid non-Gaussianity overestimation, some procedures are usually applied including (i) data projection onto low-dimensional subspaces of high relevance (projection pursuit (PP) rationale [44, 45]); (ii) outlier's removal; and (iii) use of resistant (or order) statistics considering reasonable values of the order p.

Another strategy that we will also follow here is to evaluate the distribution, moments, and quantiles of the distribution of the estimators [[bar.f(z)].sub.G], necessary to perform ISA, under the null hypothesis ([H.sub.G]) of multivariate Gaussianity of z. All quantities under [H.sub.G] will hereafter be noted with the subscript "G." For iid realizations, all the above statistics are scaled as functions of [N.sub.t] and N, for which one has obtained analytic asymptotic formulas. In the presence of serial correlation, the statistics are derived from a Monte-Carlo procedure by taking an ensemble of surrogate Gaussian time-series with the same [N.sub.dof], taken here as simple red noise AR(1) processes fitted to each z component as g(t + 1) = [[??].sub.1]p(t) + [(1-[[??].sup.2.sub.1]).sup.1/2] w(t); t = 0, ..., [N.sub.t], where w(t) is a standard Gaussian white noise process and [[??].sub.1] is the lag-one autocorrelation. For the significance tests of non-Gaussianity, the upper quantiles (e.g., 90%-99%) statistics are taken from a large ensemble of surrogates.

The first step of source decomposition is to test the null hypothesis [H.sub.G]. The finite-sample estimator of [J.sub.Ed], hereafter denoted as [[??].sub.Ed] (all quantities based on sample statistics are overlined with a tilde) and others based on truncated NE approximations are conservative tests of [H.sub.G] (or less powerful tests of its negation: ~[H.sub.G]), because they reject non-Gaussianity (Type II error) resulting from discarded cumulants of order p > [p.sub.max] = 4.

Now, we discuss the behavior of the estimator [[??].sub.Ed:G] of [J.sub.Ed:G] (under [H.sub.G]) computed from [N.sub.t]-sized samples of a vector of N independent scalars. The pdf of [[??].sub.Ed:G] is positively skewed, depending on the joint distribution of [mathematical expression not reproducible], which are in general not independent since they satisfy a number of inequality constraints (e.g., [K.sub.1111]-[S.sup.2.sub.111] + 2 [greater than or equal to] 0). The bias of [[??].sub.Ed:G] is given by

[mathematical expression not reproducible], (10)

where the first and second square-bracket terms of (10) represent, respectively, scaled versions of [mathematical expression not reproducible]. The bias (10) tends asymptotically to [mathematical expression not reproducible]. It also provides a lower-bound of the positive [[??].sub.Ed] bias for non-Gaussian pdfs, as it occurs for any negentropy estimator [46].

The biases from the Frobenius norms were obtained by multiplying the biases of the different squared cumulant estimators by their multiplicities [36]. For instance, there are 3N(N-1) cumulant biases of type [mathematical expression not reproducible] based on the moments of the standard Gaussian: E([U.sup.2k]) = (2k)!/([2.sup.k]k!). Figure 1 shows the dependence on [N.sub.t] and [[??].sub.1] for N = 4, of the ratio

[mathematical expression not reproducible] (11)

between the actual bias and the asymptotic bias of [mathematical expression not reproducible], obtained from 1000 Monte-Carlo replicates. Moreover, the dependence of [f.sub.REd] on N is quite negligible with relative deviations less than 5% (not shown).

From the analysis of Figure 1(a) and as expected, the ratio [f.sub.REd] remains close to 1 when [[??].sub.1] is small and [N.sub.t] is large. For [N.sub.t] [greater than or equal to] 500, the ratio has reached the asymptotic value, where it becomes dependent only on [[??].sub.1] with [N.sub.dof] = [N.sub.t]/ [f.sub.REd]([infinity], [[??].sub.1], N) < [N.sub.t], where [f.sub.REd] is interpreted as the "average interval between independent realizations." For small sample sizes, [f.sub.REd] < [f.sub.REd] ([infinity], [[??].sub.1], N), due to a negative higher-order term in (10).

The variance of [[??].sub.Ed:G] involves covariances between generic cumulants, for example, cm([u.sub.1]) and cm([u.sub.2]), of subvectors [u.sub.1], [u.sub.2] [subset or equal to] z. Those covariances satisfy cov[cm([u.sub.1]), cm([u.sub.2])] = k[N.sup.-d.sub.t]; k [greater than or equal to] 0 with d = 2 if [u.sub.1] = [u.sub.2] and d = 3 if [u.sub.1] [contains] [u.sub.1] [intersection] [u.sub.2] [not equal to] [empty set] and d = [infinity] otherwise.

As for the bias, the quantiles [mathematical expression not reproducible] scale asymptotically as [N.sup.-1.sub.t]. Figure 1(b) shows the dependence on N and [[??].sub.1] for [N.sub.t] = 1000, of the ratio [mathematical expression not reproducible]. This figure provides a kind of a rule of thumb thresholds of [[??].sub.Ed] for rejecting [H.sub.G] in the asymptotic regime.

2.2.2. Estimation of the Gaussian Subspace by HOSVD. Following a rejection of [H.sub.G], the next step is to estimate the Gaussian subspace [W.sub.g] and its dimension [N.sub.g]. With finite samples, one computes the sample-based matrix [??](z) of M(z) (4). However, Wg can no longer be estimated by Ker([??]) (see Section 2.1) but through the subspace spanned by SVecs associated with the trailing (not significant) SVs.

Under [H.sub.G], the (estimated) SVs [[??].sub.i,N:G], (see (7)), with quantiles [mathematical expression not reproducible], are asymptotically O([N.sub.dof.sup.-1]) and the SVs become undistinguishable for large [N.sub.dof] [36]. However, the SVs are not independent and consequently their marginal quantiles cannot be estimated independently for each i. To overcome this difficulty, a sequence of statistical tests is applied, as discussed in the next subsection.

In summary, we first consider the vectors z = ([Z.sub.1], ..., [Z.sub.N])' = R'y and [z.sub.s] = ([Z.sub.s,1], ..., [Z.sub.s,N])' = [R'.sub.s][y.sub.s] where [y.sub.s] is a generic surrogate of y and R and Rs are matrices filled by the ranked SVecs of [??](y) and [??]([y.sub.s]), respectively. For each i in {1, ..., N], we construct the vector [b.sub.i] = ([Z.sub.1], ..., [Z.sub.i-1], [Z.sub.s,i], ..., [Z.sub.s,N]), where the trailing i, ..., N components are swapped by the associated ranked surrogates. Then we compute their NE contributions [[??].sub.ii]([b.sub.i]) (5) and finally the quantile [mathematical expression not reproducible] evaluated from a large ensemble of surrogates. The test then proceeds backward, starting from the smallest SV (i = N). Therefore, if [mathematical expression not reproducible], then the null hypothesis [H.sub.G,i] ([Z.sub.k] undistinguishable from Gaussian surrogates [Z.sub.s,k] for all k [greater than or equal to] i) cannot be rejected. The smallest i for which [H.sub.G,i] is not rejected determines the estimated Gaussian subspace [W.sub.g-Ed] with dimension [N.sub.g-Ed] (spanned by the trailing [N.sub.g-Ed] SVecs of R), and [W.sub.ng-Ed] = [W.sub.g-Ed.sup.[perpendicular to]] (spanned by the leading [N.sub.ng-Ed] = N-[N.sub.g-Ed] SVecs of R).

2.2.3. Estimation of Multivariate Sources. The next step is to identify the non-Gaussian sources within the vector of non-Gaussian components [mathematical expression not reproducible] spanning [W.sub.ng-Ed]. Since [z.sub.ng] is a subvector of z, we easily get [[??].sub.Ed]([z.sub.ng]) [less than or equal to] [[??].sub.Ed](z).

We test the separability of [W.sub.ng-Ed] into scalar and/or subvector sources by looking for a new orthogonal rotation [mathematical expression not reproducible] composed of a maximal number [N.sub.s] [greater than or equal to] 2 of subvector candidate sources [z.sub.(i),ISA]], i = 1, ..., [N.sub.s], such that the joint pdf satisfies

[mathematical expression not reproducible], (12)

where the norm [absolute value of [[epsilon].sub.ISA] of the pdf separation error [[epsilon].sub.ISA] is below the threshold of rejection of the null hypothesis of nonseparability ([N.sub.s] = 1). There is an a priori source configuration indeterminacy. For instance, for [N.sub.ng-Ed] = 4, there are five possible ways, namely: 4 scalars (ICA case); 2 scalars and a dyad; 2 dyads; a triad plus a scalar and the global quartet.

For a chosen source configuration, an appropriate contrast function to maximize is

[mathematical expression not reproducible], (13)

where [[??].sub.Ed] must be below some significance threshold, passing the separation statistical test. However, an undecidable situation may occur when multiple configurations pass the test. To avoid the case of multiple configuration indeterminacy, we solve instead the ICA problem

[mathematical expression not reproducible], (14)

providing the vector of best scalar sources: [mathematical expression not reproducible]. The algorithm to get Rica is explained in Appendix A. Finally, we build multivariate sources forming appropriate groups of [z.sub.ICA] components, according to the conjecture of [13], and as below.

We start with the NE decomposition into [N.sub.int] interactivities (8) as

[mathematical expression not reproducible]. (15)

To compare and account for the statistical significance of u-interactivities in (15) with different dimensions dim(u), we introduce scaled interactivities [[??].sub.Ed-ITsig](u) through its comparison with the quantiles of the NE [[??].sub.Ed-IT]([u.sub.G]) of the ranked (by HOSVD) Gaussian surrogates [u.sub.G] of u. It is is defined as

[mathematical expression not reproducible]. (16)

When dim(u) >1 in (16), w is the rotation of [u.sub.Gs] towards isotropy (e.g., by Gram Schmidt orthogonalization) with [u.sub.Gs] being composed of the standard Gaussianized components of u. Since [u.sub.G] has Gaussian marginals, this procedure leads to a fairer and unbiased test when u has non-Gaussian pdf marginals. This is also supported by the fact that the mutual information of u is I(u) = I([u.sub.Gs]) = I([u.sub.Gsi]) + [-(1/2) log(det [C.sub.Gs])] + [[summation].sup.N.sub.k=1] J([u.sub.GSi,k]) [approximately equal to] I([u.sub.Gsi]) where the second term is the Gaussian MI component [47, 48], which depends on the correlation matrix [C.sub.Gs] of [u.sub.Gs] and the third term is the sum of the NEs of the components of [u.sub.Gsi]. The above measure is therefore resistant to single outliers of the marginal pdfs.

The different [N.sub.int] scaled interactivities (16) are then sorted by decreasing order, retaining uniquely the leading ones verifying [[??].sub.Ed-ITsig](u) [greater than or equal to] [J.sub.sig] where [J.sub.sig] is a fixed level of acceptance. The emerging large interactivities normally yield ICs, which contribute to the multivariate sources. The NE interactivities are then grouped along the relevant interactivities until the stopping criteria are reached, for example, [[summation].sub.u] [[??].sub.Ed-ITsig](u) < [[??].sub.Ed]([z.sub.ng])], which will be adopted in practice.

3. Analysis of Sea Surface Temperature Data

3.1. Data Description and Processing. We consider the monthly SST data from the "Extended Reconstruction Sea Surface Temperature" (ERSST) version v3b product (https:// www.ncdc.noaa.gov/oa/climate/research/sst/ersstv3.php), from the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) release 2.4. The SST data are on 2[degrees] x 2[degrees] latitude-longitude resolution and span the period 1875 to present [49]. We only consider the period 1910-2011 (102 years), restricted to the region equatorward of 65[degrees]. At each grid point, the climatology and the (12-month moving average) linear trend in addition to the mean seasonal cycle are removed from the data [50], thus yielding [N.sub.t] = 1224 monthly SST anomalies (SSTAs), on [N.sub.p] = 8714 grid points. A EOF analysis is then applied [51] and we only use the space spanned by the leading [N.sub.max] = 12 mode for the application of ICA/ISA. Its output is independent from PC variances, even when the used PCs are quasi-degenerated. The total variance is [[sigma].sup.2.sub.Total] = 1709.4 [([degrees]C).sup.2], the local variance at point P is [[sigma].sup.2.sub.P], with an average [[sigma].sup.2.sub.Local] = [[sigma].sup.2.sub.Total]/[[N.sub.p] [c.sub.lat] ~ [(0.50).sup.2] [([degrees]C).sup.2] ([c.sub.lat] = average of cos(latitude)) and we let the fractional local variance [[??].sup.2.sub.P] = [[sigma].sup.2.sub.P]/[[sigma].sup.2.sub.Local]. Figure 2(a) and Table 2 show the % of explained variance of the 12 leading EOFs with their confidence intervals [52] and Figure 2(b) shows the fitted lag-one autocorrelation [[??].sub.1] based on an AR(1) model.

Figure 3 shows the leading 12 EOFs which are described, with references in Table 1. Based on q, an estimate of the "average interval between independent realizations" is [[DELTA].sub.ind] = [N.sub.t]/[N.sub.dof] = [f.sub.REd]([N.sub.t], [[??].sub.1], N), (Figure 1(a)) which is not too different from (-1.)/ln([??].sub.1]) (e.g., [53]).

Figure 2(b) shows that [[DELTA].sub.ind] varies from about 3 months (PC10) to more than 12 months (PC2). For our simulation later, we will use the geometric average of the 12 [[??].sub.1] values, that is, 0.86 (see Section 3.3).

3.2. Negentropy Distribution and Non-Gaussian Subspace Estimation. Non-Gaussianity and negentropy of a field [??] have counterparts both on a local and on a PC basis. In fact, local cumulants are obtained from cumulants of the whitened PCs composing vector y, through linear operators. Let us consider, for example, the local-basis coskewness tensor S([??]). Its Frobenius norm is the area integral of the local squared skewness. By using the EOF orthogonality property, we may express this as a weighted sum over all grid point triads (P, Q, R):

[mathematical expression not reproducible], (17)

where [[??].sub.P] stand for the relative area at grid point P and [[??].sup.2.sub.P] is the fractional local variance. The contributions to (17) of a triad of whitened PCs are proportional to their variance fractions. The Frobenius norm (17) is dominated by contributions primarily from regions of high variance and absolute skewness, namely, the positively skewed El Nino zone [19, 54], in addition to to the positive skewness southwards of 50[degrees]S [17].

Now, the NE [[??].sub.Ed](y) can be decomposed using [??](y) (5). For each N = 1, ..., [N.sub.max] = 12, we denote by [y.sub.(N)] = ([Y.sub.1], ..., [Y.sub.N]) the set of N leading whitened PCs. Figure 4 shows the diagonal terms ([[??].sub.kk][[y.sub.(N)]]), Figure 4(a) and the total NE [[??].sub.Ed][[y.sub.(N)]] = Tr([??][[y.sub.(N)]]), Figure 4(b). Note that, for a given N, the smaller values of [[??].sub.kk] [[y.sub.(N)]] are either the least non-Gaussian PCs or the less interactive PCs with the remaining PCs (i [not equal to] k), providing therefore good candidates for the Gaussian subspace of [y.sub.(N)]. For example, with N = 12, we have k = 2, 7, 8, 9, and 12 (Figure 4(a)). Statistical significance of NE components and totals are assessed by using quantiles computed from Gaussian surrogates (Figure 4). Notice, for instance, the confidence level <70% of [[??].sub.22][[y.sub.(N)]] of the PC2 due to its small degrees of freedom.

Biases E{[[??].sub.Ed][[y.sub.(N)]]}-[J.sub.Ed][[y.sub.(N)]] may be approximated as [mathematical expression not reproducible], leading to [f.sub.REd]([N.sub.t] = 1224, [[??].sub.1] = 0.86) ~ 3.3 (see Figure 1(a)). For instance, for N = 12, [[??].sub.Ed] = 3.1, the bias is 1.7 (55%) with the remaining value (1.4) being attributed to true non-Gaussianity. Therefore, one expects that biases shall be drastically reduced after discarding the Gaussian subspace.

Figure 4(a) shows that, for all N, the largest NE component is [[??].sub.11][[y.sub.(N)]], resulting mostly from the nonlinear correlations of PC1 with other PCs and from its self-NE, due to its positive skewness (Table 2).

Using the nonnegative consecutive jumps: [[??].sub.kk][[y.sub.(N)]]-[[??].sub.kk] [[y.sub.(N-1)]] (Figure 4(a)), one may detect which interactivities mostly contribute to [[??].sub.Ed][[y.sub.(N)]]. The largest differences are for (k, N) pairs: {(1, 3), (1, 5), (1, 10), (1, 11), (3, 11), (4, 10), (4, 11), (5, 10), (5, 11), (6, 11)} in consistency with the high confidence level of [[??].sub.kk][[y.sub.(11)]] for k = 1, 3-6, 10, 11. Those jumps are often due to particular joint cumulants in the 2D subspaces spanned by (PCfc, PCV), responsible for particular types of joint non-Gaussianity, for example, scatter-plots resulting for instance from the occurrence of joint extremes. For instance, [bar.[Y.sup.2.sub.k][Y.sub.i]] > 0, is favored for the sign combinations (-, +) and (+, +) of ([Y.sub.k], [Y.sub.i]). As an illustration, Figures 5(a) and 5(b) show, respectively, scatter-plots of ([Y.sub.1g], [Y.sub.3g]) and ([Y.sub.1g], [Y.sub.11g]) (the marginals have been Gaussianized (subscript g) to emphasize the 2D pdf difference with respect to 2D isotropic Gaussianity). Figure 5 shows that most of the non-Gaussianity is due to the occurrence of joint coherent 2D positive extremes, for example, the 1983 and 1997 extreme El Nino events (see Figure 3 of [50]) during which a positive PC11 and a negative PDO phase have occurred.

Since [[??].sub.12,12][[y.sub.(12)]] is quite low and not highly significant (Figure 4(a)), we have decided to apply ICA and ISA in the 11D subspace (see Section 3.4) in which the total NE reach a confidence level > 95%. Besides the [H.sub.g] test to limit N [55] presents an alternative limit. The dimensionality reduction, however, is performed below via the statistical detection of the non-Gaussian subspace as detailed in Section 2.2.2.

Therefore, we now consider the state vector [y.sub.(11)] and the associated vector z = ([Z.sub.1], ..., [Z.sub.11])', yielding [[??].sub.Ed][z] = 2.55, and a bias of 1.73, leaving ~0.82 for "true non-Gaussianity."

Figure 6 shows the NE components (or SVs) [[??].sub.kk](z) (k = 1, ..., N), along with the threshold quantiles 90% and 95% of [mathematical expression not reproducible] obtained from an ensemble of 1000 Gaussian surrogates. For comparison, we also show the SVs [[??].sub.kk]([z.sub.s]) (open squares) of surrogate realizations [z.sub.s] of z. The SVs of [z.sub.s] all lie below the above thresholds, supporting the consistency of the Gaussianity test. Figure 6 also suggests that the estimated Gaussian subspace [W.sub.g-Ed] of the SSTA is spanned by the last 6 SVecs (with a confidence level [p.sub.sig] = 95%) and so [N.sub.g-Ed] = 6 and, [N.sub.ng-Ed] = 5 for the non-Gaussian subspace [W.sub.ng-Ed]. The total NE restricted to [z.sub.ng] = ([Z.sub.1], ..., [Z.sub.5])' is [[??].sub.Ed][[z.sub.ng]] = 1.11 with a bias of 0.14. The "useful" true non-Gaussianity is 0.97 which is of the order of the above estimated value (0.82). The NE components [[[??].sub.kk]([z.sub.ng]) (open black circles), with their low values compared to [[??].sub.kk](z), k = 1, ..., 5, are also shown in Figure 6.

The squared projections of [W.sub.ng-Ed] onto each of the 11 whitened PCs are given by the inner product (Appendix B) [W.sub.ng-Ed] x [e.sub.i] = [[summation].sup.5.sub.k=1] [R.sup.2.sub.i,k], for i = 1, ..., N = 11, where [e.sub.i] is the (unitary) ith vector. By definition, those values add up to [N.sub.ng-Ed] = 5. Table 2 shows that some PCs mostly project onto the non-Gaussian subspace such as PC1 (El Nino), PC4 (NPGO), and PC10 while others mostly project onto the Gaussian subspace, for example, PC7, PC8, and PC3 (negative PDO). The 48% of explained variance is distributed, respectively, as [[summation].sup.11.sub.i=1] [[lambda].sub.i] [[summation].sup.5.sub.k=1] [R.sup.2.sub.i,k] = 24.7% and 23.3%, for the non-Gaussian and Gaussian subspaces.

3.3. Application to Synthetic Non-Gaussian Data. We generate [N.sub.max] = 12 AR(1) independent series [G.sub.k](t), t = 1, ..., [N.sub.t], k = 1, ..., [N.sub.max]. The whitened and uncorrelated components of the non-Gaussian vector y' = ([Y.sub.1], ..., [Y.sub.N]) are computed following

[mathematical expression not reproducible] (18)

with [[sigma].sup.2] [member of] [0, 1] measuring the "strength" of non-Gaussianity. So y is composed of [N.sub.g] = N -5 Gaussian scalar sources spanning [W.sub.g] and [N.sub.ng] = 5 non-Gaussian variables spanning [W.sub.ng]. The latter is composed of one dyad [y'.sub.(1)] = ([Y.sub.1], [Y.sub.2]) and a non-Gaussian triad [y'.sub.(2)] = ([Y.sub.3], [Y.sub.4], [Y.sub.5]).

The next step consists of testing the null hypothesis [H.sub.G] of global Gaussianity (i.e., [[sigma].sub.2] = 0) and evaluating [[??].sub.Ed](y) by varying [[sigma].sup.2] in [0, 1] and [N.sub.g] in {0, ..., 7}.

Figure 7(a) shows the Monte-Carlo average E[[[[??].sub.Ed](y)] obtained from 1000 surrogates. As expected, the mean NE increases with [[sigma].sup.2] (nonlinearity effect) and Ng (dimensionality effect), though at a much smaller rate. The bias E[[[??].sub.Ed:G](y)] is the value reached at [[sigma].sub.2] = 0 in agreement with (11).

3.3.1. Gaussian Subspace Recovery. We investigate the strength of [[??].sub.Ed](y) as a function of [[sigma].sup.2] and [N.sub.g]. The null hypothesis [H.sub.G] is rejected at the confidence level [p.sub.sig] if [mathematical expression not reproducible] (Figure 1(b)). The Monte-Carlo fraction F of HG rejection is shown in Figure 7(b) for [p.sub.sig] = 99%. The contour line at F = 0.99 provides the graph [[sigma].sup.2.sub.min]([N.sub.g]) of [[sigma].sup.2] for the recovery of the non-Gaussian sources. As expected, [[sigma].sup.2.sub.min] increases with Ng, starting at [[sigma].sup.2.sub.min] = 20% and at [N.sub.g] = 0 and reaching [[sigma].sup.2.sub.min] ~ 45% at [N.sub.g] = 6. This means that the larger N, the stronger (through [[sigma].sup.2]) the non-Gaussian embedded sources. Moreover, [[sigma].sup.2.sub.min] diminishes under a less conservative test (lower [p.sub.sig]) or when the effective number of temporal degrees of freedom increases. This puts a limit to what is recoverable from the available (often short) weather/climate data, and, therefore, less conservative tests of Gaussianity or data dimension reduction must be considered (e.g., [p.sub.sig] = 90%) [47].

3.3.2. ICA and ISA. Here, we apply ICA and ISA for N =11 ([N.sub.g] = 6) and compute the skill of source separation in two cases: (a) [[sigma].sup.2] = 20%, high noise, and (b) [[sigma].sup.2] = 40%, intermediate noise, yielding, respectively, total NE of 1.79 and 2.57.

Figures 8(a) and 8(b) are similar to Figure 6 but for the cases (a) and (b), respectively. The analysis yields the correct dimension in both cases: [N.sub.ng-Ed] = 5 with [p.sub.sig] = 90% despite the fact that the SVs in the "high noise" case (a) are smaller. The score of the non-Gaussian subspace recovery is assessed using the inner products [W.sub.ng-Ed] x [W.sub.ng] (compared to 5 in the ideal recovery). Those products are, respectively, 3.99 and 4.49 for cases (a) and (b). Lower values of [[sigma].sup.2] lead to a subestimation of [N.sub.ng] and to a worse estimation of [W.sub.ng].

Now, by applying ICA (see Section 2.2.3 and Appendix A) we get ICs making the vector [mathematical expression not reproducible]. Tables 3 and 4 give, respectively, for cases (a) and (b), a summary of ICA with the sorted maximized self-NEs [[??].sub.Ed]([Z.sub.i,ICA]), the components [[??].sub.ii]([z.sub.ICA]), and the IC vector loadings in [RR.sub.ICA]. A few conclusions can be drawn from these tables; (i) the sum of self-NEs of ICs (2nd column) cannot explain the total NE (3rd column). This corroborates the existence of NE originating from IC interactivities and (ii) Up to a certain statistical significance, a given IC projects uniquely onto a certain source as seen through the dominant projections of ICs (in bold font). From these tables, the estimated dyad [z.sub.Ed-dyad] is spanned by IC1 and IC4 whereas the estimated triad [z.sub.Ed-triad] is spanned by IC2, IC3, and IC5. That supports the conjecture of [13] about the source emerging by IC grouping. Recovered sources may be rotated or inverted with respect to initial data. The distribution of ICs among sources may differ for smaller [[sigma].sup.2]. The squared projections onto the true sources are, respectively, 1.35 (on a maximum of 2) and 2.42 (on a maximum of 3) in case (a) and 1.68 and 2.75 in case (b). Like the non-Gaussian subspace, the recovery of individual sources is more skillful in case (b).

The final step consists of merging the ICs as outlined in Section 2.2.3. Figure 9 presents the results for both cases, where the self-interactivities (16) are the most relevant ITIs due to their maximization by ICA (except IC5 in case (a)). The cumulated NE reaches the unbiased NE value (dashed line) near the threshold of the ITI acceptance (scaled-ITI > 1). In case (a) (Figure 9(a)), the accepted ITIs are for ICs 1-5, and dyadic links (IC3, IC2), (IC5, IC3) yielding the emergence of the true triad (IC2, IC3, and IC5) (see Table 3). The marginally accepted ITIs holds for (IC4, IC3, and IC2) (false triad) and (IC1, IC4) (true dyad). Excluded ITIs remain at the level of the NE bias. For smaller values of [[sigma].sup.2] (not shown), the recovered sources maybe mixed, partly restored, or even not detected.

For case (b) (Figure 9(b)) the source recovery improves. In fact, the significant accepted ITIs are for ICs 1-5, the links for (IC5, IC3), (IC5, IC2), (IC3, IC2) with the true emerging triad (IC2, IC3, and IC5), and for the true dyad (IC4, IC1). The sum of ITIs associated with ICA and ISA is, respectively, 0.84 and 1.13, showing clearly the advantage of ISA in this case.

3.4. Application of ICA and ISA to SSTA Data. Using the estimated non-Gaussian subspace (Section 3.2), we apply here ICA to the leading [N.sub.ICA] = 5 SVecs of the SSTA dataset (see Figure 6). Table 5 gives a summary of the application with the characteristics of the ranked ICs [Z.sub.i,ICA], i = 1, ..., 5, by decreasing self-NEs. They explain, respectively, 7.1%, 5.5%, 4.3%, 4.1%, and 3.7% of the total variance.

Most of the NE components come from the self-NEs (an effect of ICA), through their high skewness and positive kurtosis or leptokurtosis (Table 5), leading to fat long pdf tails. Extreme values of about 3-5 standard deviations are often reached as seen by the IC time-series (Figure 10).

Under those conditions, ICs are basically aggregative and cooperative combinations of single and joint PC extremes. Quasi-independence and noncorrelation between ICs means that occurrence of extremes among different ICs tends to be asynchronous (see Figure 10) and therefore extreme phenomena are independent.

Figure 11 shows the correlation maps between the ICs and SSTAs. An alternative way to interpret ICs is through the patterns displayed in the corresponding normalized loading maps (Figure 12) obtained from combinations of EOFs (Figure 3). Correlation and loading map values at the grid point P for the ith IC are weighted by the PCs standard deviation and its inverse, respectively, [[summation].sup.N=11.sub.k=1] [D.sub.k,P][e.sub.k](P)[([RR.sub.ICA]).sub.k,i] where [D.sub.k,P] is [mathematical expression not reproducible] (Figure 11) and [[??].sup.-1/2.sub.k] (Figure 12), respectively, and [e.sub.k](P) is the kth EOF at P (Figure 3). By definition, ICs are spatial inner products between correspondent normalized loading maps (Figure 12) and the SSTA field. Consequently, maximal positive (negative) ICs on a given month are favored by highly positive (negative) pattern correlations between their loading maps and the SSTA field. Note, however, that the patterns shown in Figures 11 and 12 are not orthogonal. On loading maps, the weighting gives rise to the impact on ICs of PCs of small explained variance. Because of this weighting, the correlation maps (Figure 11) often display much broader and large-scale patterns than loading maps (Figure 12) [15], though both share the same main features when the range of PC variances contributing to a given IC is not too wide, which is the case here.

The leading IC1 explains about 2/3 of the total NE with the remaining 4 being almost evenly distributed (Table 5). Further experiments show that IC1 is quite robust with the use of [N.sub.ICA] < 5 leading SVecs with the remaining ICs being slight mixtures of those obtained with [N.sub.ICA] = 5. The above mixture of PCs suggests the existence of a global nonseparable source thanks to the nonlinear interaction across scales. This strongly suggests that ICs, obtained as blind source separation, are convenient tools for identifying nonlinear variability. Dynamical constraints maybe included explicitly, for example, via nonlinear dynamical modes [71, 72].

The leading IC1 reaches values of +5 standard deviations in the years 1983 and 1997 of strong El Ninos, during which PC1, PC3, PC4, and PC11 are synchronised (see also Figures 5(a) and 5(b) and Table 5). The coherence between those four PCs may sometimes be partial.

The correlation pattern for IC1 (Figure 11) has an El Nino pattern shape due to the high PC1 amplitude (Table 5) and PC1 variance whereas the loading map (Figure 12) is much correlated with EOF11, showing more local features. Therefore, from Figure 12 and Table 5, IC1 is maximal under El Nino conditions, positive NPGO (colder central North Pacific), slightly negative PDO, warmer (colder) mid-latitude (subtropical) Atlantic region, a dipolar structure in the South Pacific, and cold pools southward of 30[degrees]S in the Indian and Atlantic oceans. That is consistent with the oceanic influence of ENSO on decadal timescales [73]. Moreover, that agrees with the dynamical interactions between ENSO, PDO, and NPGO which have been discussed by several authors [74-76]. The quite relevant EOF11 pattern in IC1 is mostly correlated with the El Nino SST composite (asymmetric with respect to La Nina) at mid-latitudes in all oceans, during the December-February season [68, 77], which explains the above referred coherence of PC1 and PC11 during strong El Ninos.

The IC2 loading map (Figure 12) mainly coincides with the symmetrical EOF6, that is, with the positive Atlantic Nino phase, while the remaining dominant EOF contributors to IC2, that is, EOF1, EOF7, and EOF9 (Table 5), practically cancel each other, especially in tropical Pacific El Nino region, despite the presence of non-null correlation in that region (Figure 11). Negative IC2 extremes (e.g., in 1912 and 1951) coincide with the strongest Atlantic Ninas in agreement with the positive skewness of PC6 (Table 2).

The IC3 loading map pattern (Figure 12) is quite similar to EOF5 (Central Pacific (CP) El Nino), combined with the NPGO (colder North Pacific for positive IC3). That is also visible in the correlation map (Figure 11). The negative skewness of IC3 (Table 5) is compatible with the primarily negative SST skewness in Central Pacific [78]. It supports the existence of strong Central Pacific La Ninas (strong negative values of IC3), as in 1943-1944 (see Figure 10), especially during the positive phase of AMO (see the negative loading factor of PC2 for IC3 in Table 5). The loading factors of PC4, PC5, and PC2 contributing to IC3 (Table 5) also reflect the tendency for CP-El Nino (CP-La Nina) during negative (positive) phases of AMO [79], which is visible through the symmetrical correlation signals of IC3 with SSTA (Figure 11), respectively, in CP and North Atlantic areas. The imprint of the characteristic South Pacific dipole shown in EOF10 is also visible in the loading map (Figure 12).

The IC4 loading map (Figure 12) is a mix of several EOFs. The inverse proportionality of IC4 to the PCs standard deviation makes EOFs 4, 5, 9, 10, and 11 the strongest contributors (Table 5 and Figure 2(a)) to the loading map of IC4 (Figure 12), showing that IC4 is highly positive when the Indian and Atlantic oceans are warm, while the eastern (western) North and South Pacific are warm (cold), including the tropical El Nino region. That is also reflected in the correlation map (Figure 11).

The last IC5 loading map pattern (Figure 12) is nearly symmetrical to EOF10 (Table 5), especially over the Austral oceans with a characteristic sequence of three dipoles, located over the Indian, Pacific, and Atlantic subtropical oceans. Negative extremes of IC5 (e.g., 1947, 1992/93) are also associated with a strong negative AMO and NPGO. The NE of IC5 result essentially from the highly positive kurtosis and long pdf tails of PC10 and PC2 (Table 2). Both the loading and correlation maps are quite similar.

The results of ISA are summarized in Figure 13. They suggest 5 ICs with the emergence of one dyadic link (IC3, IC5) with ITI = 0.015 and a triadic link (IC1, IC4, and IC5) with ITI = 0.024 and finally the strongest dyad (IC1, IC5) with an ITI = 0.045. This interaction is even larger than the self-NEs of ICs 3, 4, and 5. This interaction is mainly due to the nonlinear correlation: cor(IC1, [IC5.sup.2]) = 0.23, that is, extremes of IC5 occurring preferentially under PC1 > 0 (e.g., strong El Ninos). A possible explanation is that PC4 (simulating NPGO) influence both IC1 and IC5. From the ensemble of interactions, only IC2 appears to be significantly independent of the remaining ICs. All the remaining ITIs remain at the noise level (Figure 13).

The above analysis extracts highly negentropic ICs resulting primarily from independent combinations and coherency between PC extremes. An alternative ICA and ISA analysis is obtained with Gaussianized (followed by whitening) PCs. Further experiments (not shown) indicate that the total NE comes from cross cumulants only and ICs emerge exclusively from combinations of joint PC extremes. The two leading ICs are practically unchanged as compared to previous ICs; the third and fourth ICs are slightly mixed while the fifth IC is not significant, that is, remaining in the Gaussian manifold. In summary, a prior Gaussianization of PCs leads in general to a reduction, both of the dimension and of the strength of the non-Gaussian subspace.

3.5. Sensitivity to the PC Spanning Space. Here, we study the behavior of the non-Gaussian subspace and ICA/ISA outputs when we sequentially increase the metric space dimension [N.sub.max] using the leading whitened PCs. In the discussion below, it is worth mentioning that, given the direct sum: [S.sub.+] = [S.sub.-] [cross product] [S.sub.r] between two orthogonal subspaces [S.sub.-], [S.sub.r], the relationships between the associated non-Gaussian subspaces and their dimensions hold:

[W.sub.ng] ([S.sub.+]) [contains or equal to] [W.sub.ng] ([S.sub.-]) [cross product] [W.sub.ng] ([S.sub.r]); [N.sub.ng] ([S.sub.+]) [greater than or equal to] [N.sub.ng] ([S.sub.-]) + [N.sub.ng] ([S.sub.-]), (19)

which leads to the identity

[W.sub.ng] ([S.sub.+]) x [W.sub.ng] ([S.sub.-]) = [N.sub.ng] ([S.sub.-]). (20)

Hereafter, we focus on the case where [S.sub.-], [S.sub.r] are spanned, respectively, by the leading [N.sub.ax]-N' PCs and by the remaining N'[greater than or equal to]1 PCs up to rank [N.sub.max]. In (19), the equality (19) holds when [S.sub.-], [S.sub.r] are statistically independent or equivalent when spaces [S.sub.-], [S.sub.r] have separable pdfs. Moreover, under statistical independence, the ISA/ICA may be done independently. However, the pdf separability is hardly verified on multiscale nonlinear natural systems as studied here which means that [W.sub.ng]([S.sub.+]) intersects the Gaussian subspaces [W.sub.g]([S.sub.-]), [W.sub.g]([S.sub.r]) of the parcel spaces while ICA/ISA is also changing when passing from [S.sub.-] to [S.sub.+].

An equivalent relationship to (19) for [W.sub.ng-Ed], obtained with finite samples, is much harder to obtain since [W.sub.ng-Ed] may be undersized (oversized) by errors of type I (II) when considering the Gaussianity null hypothesis [H.sub.G]. This means that [W.sub.g-Ed]([S.sub.+]) x [W.sub.ng-Ed]([S.sub.-]) [greater than or equal to] 0; that is, there may be some "leakage" of the "prior" non-Gaussian subspace ([W.sub.ng-Ed]([S.sub.-])), especially through the less non-Gaussian ICs, to the "posterior" Gaussian subspace [W.sub.g-Ed]([S.sub.+]).

Next, we analyze how the ICA/ISA applied to SSTAs varies with [N.sub.max]. To this end, we have considered the sequence [N.sub.max] = 3, 9, 11, 15, 17 in which the number of ICs [N.sub.ng-Ed], obtained by the method of Section 2.2.2 using a confidence level [p.sub.sig] = 90% (see, e.g., Figure 6), is [N.sub.ng-Ed] = 1, 4, 5, 6, 7, respectively. The above [N.sub.max] are those for which [N.sub.ng-Ed] have shown a jump (of one or more dimensions) with respect to that obtained for [N.sub.max]-1, where more substantial changes are expected to occur in the ICA/ISA sources. For example, [N.sub.ng-Ed] = 5 for 11 [less than or equal to] [N.sub.max] [less than or equal to] 14. Let us consider the corresponding sequence of embedding spaces: [S.sub.-], [S.sub.+], denoting by [mathematical expression not reproducible] the IC weighting vectors (sorted by decreasing self non-Gaussianity), spanning [W.sub.ng-Ed]([S.sub.-]) and [W.sub.ng-Ed]([S.sup.+]), respectively.

To assess the impact on ICA/ISA, we have computed some inner-product-based diagnostics, plotted in Figure 14. First, the measure [mathematical expression not reproducible] while the difference [P.sub.-+]-[N.sub.-] [greater than or equal to] 0 yields how the non-Gaussian subspaces are preserved throughout the space sequence, that is, how closely (20) is preserved. Then, we identify which terms 0 [less than or equal to] [r.sup.-.sub.i] x [r.sup.+.sub.j] [less than or equal to] 1 are mostly contributing to that, typically above a threshold (e.g., 0.08). For dimensions within the range of the selected [N.sub.max] where [N.sub.ng-Ed] is kept constant, every [r.sup.-.sub.i] is very well projected onto a certain [r.sup.+.sub.j(i)], leading to well tracked and robust ICs independently of the space dimension. For instance, in the sequence [N.sub.max] = 11, 12, 13, 14 ([N.sub.ng-Ed] = 5), we get [P.sub.-+] = 4.7,4.6,4.8, respectively, (quite close to the ideal value of 5) with very low leakage of non-Gaussianity. We also get the tracks of the five ICs: [IC1 [right arrow] [sup.0.96]IC1 [right arrow] [sup.0.91]IC1 [right arrow] [sup.0.94]IC1]; [IC2 [right arrow] [sup.0.99]IC3 [right arrow] [sup.0.98]IC3 [right arrow] [sup.0.95]IC3]; [IC3 [right arrow] [sup.0.83]IC2 [right arrow] [sup.0.77]IC2 [right arrow] [sup.0.99]IC2]; [IC4 [right arrow] [sup.0.97]IC4 [right arrow] [sup.0.98]IC4 [right arrow] [sup.0.97]IC4]; and [IC5 [right arrow] [sup.0.90]IC5 [right arrow] [sup.0.88]IC5 [right arrow] [sup.0.91]IC5], each one within brackets, where the subscript values stand for [r.sup.-.sub.i] x [r.sup.+.sub.j(i)]. We conclude that ICA and ISA are quite robust in the range [N.sub.max] = 11-14. Only the ranks of IC2 and IC3 were swapped in the step [N.sub.max] = 11 [right arrow] 12.

Now, in Figure 14, we analyze what happens when a highly non-Gaussian or highly nonlinearly interacting PC comes into the analysis space, corresponding, for instance, to PC3, PC9, PC11, PC15, and PC17. EOF15 and EOF17 are both dominated by subtropical modes in the South Pacific and Indian oceans (not shown), nonlinearly interacting with precedent modes.

In every sequence [S.sub.-] [right arrow] [S.sub.+], the ICs may experience three situations: (1) the emergence of new ICs; (2) the "leakage" of non-Gaussianity to the forthcoming Gaussian subspace; and (3) the decoupling of sets of ICs (one or more) into sets of ICs of larger dimensions. In our case, the emergence of 3 new ICs appears when [N.sub.max] changes from 3 to 9 (see Figure 14). The "leakage" is more relevant with weaker ICs (low negentropy). The decoupling of ICs becomes frequent, mainly at higher dimensions, being the main mechanism of IC generation. For instance IC3 at [N.sub.max] = 9 decouples into (IC4, IC5) with [N.sub.max] = 11. Multiple interactivities, like the dyadic ones (expressed by red segments in Figure 14), may emerge on incoming ICs or transfer to the tracked ICs (e.g., the dyad (IC1, IC6) with [N.sub.max] = 15 transfers to the dyad (IC1, IC7) with [N.sub.max] = 17). There are some ICs that remain robust without any decoupling like the case of IC1 and IC3 with [N.sub.max] = 11. In general, as [N.sub.max] increases, ICs project progressively onto low-ranked PCs dominated by smaller spatial scales and complex sharper features. In this situation, a hybrid criterion for the choice of the truncation [N.sub.max] (not explored here) maybe favored by four factors: the negentropy and explained variance by ICs, the low dispersion of IC weighting loadings (either on PC or original basis), and a low [N.sub.ng-Ed]. That justifies the choice [N.sub.max] = 11 as being the lowest dimension for which [N.sub.ng-Ed] = 4.

4. Summary and Conclusion

We present in this paper an approximation of the multivariate negentropy (NE) motivated by the Edgeworth expansion of the joint probability distribution, using a linear combination of Frobenius norms of the joint cumulant tensors of order p [greater than or equal to] 3. This form is especially sensitive to the extent and frequency of extremes. The NE approximation is based on the trace of a positive definite matrix M resulting from the high-order singular value decomposition of an appropriate contracted form of the coskewness and cokurtosis joint tensors in consistency with the exact negentropy. First, it is a tensor invariant, that is, invariant under orthogonal rotations. Second, when variables are whitened, a version of the "Separation Source Theorem" holds; that is, the approximated NE maybe explicitly decomposed into positive terms measuring self-NEs (due to skewness/kurtosis of marginals) and NEs resulting from nonlinear associations between variables. This permits the extraction of statistically independent components (ICs) from observations through the maximization of the sum of marginal NEs. The decomposition is further generalized to multivariate sources under the Independent Subspace Analysis (ISA), through the formation of groups of disjoint ICs, associated with positive NE contributors (or interactivities). The jointly Gaussian ICs span the Gaussian subspace, being simply approximated by the kernel subspace of M.

When cumulants are estimated from finite or serially correlated samples, the NE estimator bias is positive and grows with the dimension (Curse of Dimensionality). Therefore, the estimated NE of a Gaussian distributed finite multivariate sample is prone to produce "false non-Gaussianity," hence the importance of Gaussian subspace identification. This subspace is estimated from the trailing singular vectors of M for which the null hypothesis of Gaussianity cannot be rejected. The search of non-Gaussian ICs or multiple sources is sought in the non-Gaussian subspace.

The method is then applied to real data consisting of 11 leading PCs obtained from 102 years of the SST anomaly (SSTA) field over the global ocean, equatorward of 65[degrees]. To test the effectiveness of the method, prior to using the SSTA PCs, we have used synthetic data with a prescribed amount non-Gaussianity and sharing the same dimensions, sample size, lagged-one autocorrelation, and the level of negentropy as the SSTA PCs. In order to assure the recovery of the "true non-Gaussianity," the experiment suggests a minimum of variance of the true non-Gaussian sources, which increases as the sample gets smaller and the dimension of the embedding space gets larger.

Application of the method to SSTA data based on the leading 12 PCs (the last one going practically to the Gaussian subspace), explaining more than 50% of the monthly variance, yields 5 ICs. The most prominent or most negentropic IC (or IC1) represents El Nino conditions combined with negative PDO, positive NPGO, and composite impacts of El Nino (asymmetric with respect to La Nina) on oceanic mid-latitudes, poleward of 30[degrees]. The second component, IC2, reveals a loading map mostly projecting onto the Atlantic Nino and the AMO pattern. The third component, IC3, reflects the tendency of Central Pacific El Ninos during the negative phase of AMO. The fourth component, IC4, combines several EOFs and the last component, IC5, is dominated by EOF10, being positive when the SSTA field in the South Pacific exhibits a zonally oriented dipole. All ICs are basically independent combinations of PC extremes and suggest independent responses of the SST field to different forcings. Only IC2 appears to be significantly isolated from the other components. The remaining components have some statistical links dominated primarily by the dyad (IC1, IC5), possibly associated with the sharing of the NPGO effect, via PC4, among those ICs.

The identification of possible independent physical forcings, responsible for the independent sources may bring some benefits, concerning, namely, the forecasting and modelling of the SSTA variability and its impacts and teleconnections on relevant surface variables (e.g., precipitation, temperature, moisture, and wind). That opens the possibility of building phenomenological, dynamical, and stochastic models for the different ICs and independent subspaces.

Appendix

A. Method of Estimating the ICA Solution

The ICA looks for the maximum of a positive defined objective function [mathematical expression not reproducible]. In general, for a continuous and differentiable objective function [F.sub.ICA](R), the absolute maximum [R.sub.ICA] = arg [max.sub.R] {[F.sub.ICA](R)} holds at a null gradient [partial derivative][F.sub.ICA](R)/[partial derivative]R, subjected to the orthogonality constraint [mathematical expression not reproducible]. If [LAMBDA] is the matrix of the Lagrange multipliers, then we easily get

[mathematical expression not reproducible], (A.1)

where [dot encircle] is the component-wise multiplication operator. Eq. (A.1) means that the gradient matrix coincides with its matrix polar decomposition, that is, the product of an orthogonal matrix by a symmetric matrix [80]. For an invertible matrix [LAMBDA] there may exist a finite set of solutions R of (A.1), beyond those obtained by permutations. They correspond to suboptimal ICA solutions. However, for a singular matrix [LAMBDAA, there is a continuous manifold of solutions spanned along Ker([LAMBDA]). In the maximization ICA problem, the gradient matrix depends nonlinearly on R, hence the linear eigentechniques do not work. A usual approach is to write R as a product of Givens matrices in terms of rotation Euler angles [81] and find the absolute maximum by a gradient-based Newton method [15].

A different approach is used here. The solutions of (A.1) are obtained from an iterative algorithm. We start from a random orthogonal matrix [R.sub.1] and compute its gradient matrix and then its SVD, necessary to polar decomposition. Then we have

[mathematical expression not reproducible], (A.2)

where [R.sub.k+1] = [W.sub.k][V'.sub.k] and [[LAMBDA].sub.s,k+1] = [V.sub.k][[summation].sub.k][V'.sub.k]. The iteration stops when the relative Frobenius squared norm of the difference between consecutive rotation matrices (1/[N.sub.ng]) [[parallel][R.sub.k+1]-[R.sub.k][parallel].sup.2.sub.F] is less than a threshold, [delta], say [10.sup.-5] (note that [[parallel]R[parallel].sub.2.sub.F] = [N.sub.ng]). The random initial matrix is the product of Givens matrices by random Euler angles. From the trials (~1000), only the convergent sequences are retained and finally their limit matrices. The matrix producing the largest objective function [F.sub.ICA] is set to [R.sub.ICA]. A finite nonredundant number of different local minima are normally found in the ICA optimization.

The gradient [partial derivative][F.sub.ICA](R)/[partial derivative]R is written through the chain rule; [F.sub.ICA] is derived with respect firstly to cumulants, secondly to rotated variables, and thirdly to rotation matrix entries. More generally [F.sub.ICA] belongs to the of objective function F([cm.sub.1], ..., [cm.sub.i], ...), depending on cumulants (here a sum of weighted squares): cm;. Each cumulant is a linear combination of sampling averages: [cm.sub.i] = [[summation].sub.j] [a.sub.ij] [[bar.[f.sub.ij(z)]]. Therefore:

[mathematical expression not reproducible]. (A.3)

B. Similarity Score between Sources

Let us consider two sources [mathematical expression not reproducible], of dimensions [N.sub.1], [N.sub.2] not necessarily equal (e.g., the proposed one by ISA and the true one), where [f.sub.i], [t.sub.j] are orthogonal basis vectors spanning each source. A tensorial invariant measure of similarity between sources is the inner product between subspaces, hereafter denoted as f x t. That is given by the sum of squared SVs of the Gram matrix G = f't, that is,

f x t = Tr [GG'] = [[parallel]G[parallel].sup.2.sub.F] [member of] [0, min {[N.sub.1], [N.sub.2]}]. (B.1)

Symbols

[[??].sub.1]: Lag-one autocorrelation of the fitted AR(1) model

[C.sup.a.sub.b]: Number of combinations without repetition of b among a elements

cm(u): Generic cumulant in terms of the vector u

E(): Expectation operator

[e.sub.k](P): Value of the normalized fcth EOF at grid point P

[F.sub.ICA] (R), [F.sub.ISA] (R): Contrast function for ICA and ISA as a function of R

[f.sub.REd]: Ratio between actual and asymptotic bias of [F.sub.Ed:G]

[mathematical expression not reproducible]: Quantile [p.sub.sig] of [[??].sub.Ed:G] in units of [mathematical expression not reproducible]

[mathematical expression not reproducible]: Global and partial Gaussianity hypothesis

I(u, v, ...): Mutual information among vectors u, v, ...

[I.sub.Ed](u, v, ...): Edgeworth approximation of I(u, v, ...)

J(): Negentropy

[J.sub.Ed](): Estimated negentropy by the Edgeworth cumulant-based approximation

[J.sub.Ed:G]: Value of [J.sub.Ed] under the Gaussianity hypothesis

[J.sub.Ed-IT](u): Part (interactivity parcel) of [J.sub.Ed] associated with the subvector u

[J.sub.Ed-ITsig](u): Scaled value of [J.sub.Ed-IT] by quantiles of [J.sub.Ed-IT:G](u)

[J.sub.rot]: Invariant negentropy of a subspace

[J.sub.sig]: Threshold for the acceptance of a scaled interactivity [J.sub.Ed-ITSig](u)

K(y): Cokurtosis tensor of y

M(y): Matrix obtained by contraction products of S(y) and K(y)

N, [N.sub.max]: Dimension of y and its maximum value under study

[N.sub.dof]: Number of temporal degrees of freedom

[N.sub.eq]: Number of equivalent temporal degrees of freedom

[N.sub.g], [N.sub.ng]: Dimension of the Gaussian and non-Gaussian subspaces

[N.sub.g-Ed], [N.sub.ng-Ed]: Dimension of the Gaussian and non-Gaussian estimated subspaces

[N.sub.int]: Number of interactivities forming [J.sub.Ed]

[N.sub.P]: Number of domain points

[N.sub.r]: Rank of the dataset matrix

[N.sub.s]: Number of subvectors or candidate sources

[N.sub.t]: Time-series size

p, [p.sub.max]: Cumulant order and maximum cumulant order

[p.sub.Ed](y): Polynomial correction of the y pdf by the Edgeworth approximation

[mathematical expression not reproducible]: Polynomial dependence on N of the [J.sub.Ed:G] bias

[p.sub.sig]: Confidence level

[mathematical expression not reproducible]: Quantile at the [p.sub.sig] confidence level, used to establish thresholds

R: Orthogonal rotation matrix filled by singular vectors of M(y) (SVecs)

[R.sub.ICA]: Orthogonal matrix relating the SVecs and ICA vectors

[R.sub.ISA]: Orthogonal matrix relating the SVecs and ISA sources

[r.sub.1], [r.sub.2], ...: Normalized vectors filling R (Singular Vectors SVecs)

S(y): Coskewness tensor of y

Span(W): Subspace spanned by orthonormal column vectors in W

G(t): Single standard Gaussian white noise at time t

[W.sub.g], [W.sub.ng]: Gaussian and non-Gaussian subspaces

[W.sub.g-Ed], [W.sub.ng-Ed]: Estimated Gaussian and non-Gaussian subspaces by HOSVD of M

X: Space-time dataset matrix

x(t): Vector of local field values at time t

[??]: Vector of local standardized field values

[x.sub.PC]: Vector of PCs at time f, sorted by decreasing variance

y: Vector of whitened variables (e.g., whitened PCs)

z: Vector of rotated whitened PCs through SVD

[z.sub.ng]: z restricted to the non-Gaussian components

[z.sub.(i)]: Subvector of components spanning the ith candidate source

[z.sub.ICA], [z.sub.ISA]: Vector of rotated whitened PCs giving the ICA and ISA solutions

[[eta].sub.i,N]: ith singular value of M(y) of dimension N

[[lambda].sub.k]: kth PC variance sorted by decreasing order

[[rho].sub.z]: Joint pdf of the vector z

[[phi](z): Standard isotropic Gaussian PDF as a

function of joint pdf of z

[[sigma].sup.2]: Variance of the non-Gaussian signal part of a scalar

[[sigma].sup.2.sub.Local]: Average local variance

[[??].sup.2.sub.P]: Ratio between local variance at point P and total variance

[[sigma].sup.2.sub.Total]: Total variance

[??]: Quantity estimated from sample averages

[bar.()]: Sample or time average

dim: Dimension or cardinality of a vector

[[parallel][paralle].sup.2.sub.F]: Frobenius squared norm (sum of the entries squares)

[().sub.g]: Quantity evaluated under the Gaussianity hypothesis

()': Transpose

() [dot encircle] (): Component-wise product.

https://doi.org/10.1155/2017/3076810

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was developed at IDL with the support of the Portuguese Foundation for Science and Technology through the FCT, project UID/GEO/50019/2013, Instituto Dom Luiz (IDL). The authors acknowledge the International Meteorological Institute (IMI) at Stockholm University for hosting Carlos Pires and the omnipresent support of the family.

References

[1] P. Sura and A. Hannachi, "Perspectives of non-gaussianity in atmospheric synoptic and low-frequency variability," Journal of Climate, vol. 28, no. 13, pp. 5091-5114, 2015.

[2] C. Proistosescu, A. Rhines, and P. Huybers, "Identification and interpretation of nonnormality in atmospheric time series," Geophysical Research Letters, vol. 43, no. 10, pp. 5425-5434, 2016.

[3] A. Hannachi, D. M. Straus, C. L. Franzke, S. Corti, and T. Woollings, "Low-frequency nonlinearity and regime behavior in the Northern Hemisphere extra-tropical atmosphere," Reviews of Geophysics, vol. 55, no. 1, pp. 199-234, 2017

[4] A. Hannachi, I. T. Jolliffe, and D. B. Stephenson, "Empirical orthogonal functions and related techniques in atmospheric science: a review," International Journal of Climatology, vol. 27, no. 9, pp. 1119-1152, 2007.

[5] A. Hannachi, S. Unkel, N. T. Trendafilov, and I. T. Jolliffe, "Independent component analysis of climate data: a new look at EOF rotation," Journal of Climate, vol. 22, no. 11, pp. 2797-2812, 2009.

[6] N. Philippon, L. Jarlan, N. Martiny, P. Camberlin, and E. Mougin, "Characterization of the interannual and intraseasonal variability of West African vegetation between 1982 and 2002 by means of NOAA AVHRR NDVI data," Journal of Climate, vol. 20, no. 7, pp. 1202-1218, 2007

[7] A. Mori, N. Kawasaki, K. Yamazaki, M. Honda, and H. Nakamura, "A reexamination of the northern hemisphere sea level pressure variability by the independent component analysis," Scientific Online Letters on the Atmosphere, vol. 2, pp. 5-8,2006.

[8] F. Aires, A. Chedin, and J.-P Nadal, "Independent component analysis of multivariate time series: application to the tropical SST variability," Journal of Geophysical Research Atmospheres, vol. 105, no. 13, Article ID 2000JD900152, pp. 17437-17455, 2000.

[9] I. Fodor and I. C. Kamath, "On the use of independent component analysis to separate meaningful sources in global temperature series," Tech. Rep., Lawrence Livermore National Laboratory, 2003, https://computation.llnl.gov/ casc/sapphire/sep_climate/index.html.

[10] S. Westra, C. Brown, U. Lall, I. Koch, and A. Sharma, "Interpreting variability in global SST data using independent component analysis and principal component analysis," International Journal of Climatology, vol. 30, no. 3, pp. 333-346, 2010.

[11] J. Cardoso, "Multidimensional independent component analysis," in Proceedings of the 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 1941-1944, Seattle, WA, USA.

[12] A. Sharma and K. K. Paliwal, "Subspace independent component analysis using vector kurtosis," Pattern Recognition, vol. 39, no. 11, pp. 2227-2232, 2006.

[13] Z. Szabo, B. Poczos, and A. Lrincz, "Separation theorem for independent subspace analysis and its consequences," Pattern Recognition, vol. 45, no. 4, pp. 1782-1791, 2012.

[14] F. J. Theis, "Towards a general independent subspace analysis," in Proceedings of Neural Information Processing Systems (NIPS '06), 2006.

[15] C. A. L. Pires and A. F. S. Ribeiro, "Separation of the atmospheric variability into non-Gaussian multidimensional sources by projection pursuit techniques," Climate Dynamics, vol. 48, no. 3-4, pp. 821-850, 2017.

[16] P. Sura, M. Newman, and M. A. Alexander, "Daily to decadal sea surface temperature variability driven by state-dependent stochastic heat fluxes," Journal of Physical Oceanography, vol. 36, no. 10, pp. 1940-1958, 2006.

[17] P. D. Sardeshmukh and P. Sura, "Reconciling non-Gaussian climate statistics with linear dynamics," Journal of Climate, vol. 22, no. 5, pp. 1193-1207, 2009.

[18] M. Perron and P. Sura, "Climatology of non-gaussian atmospheric statistics," Journal of Climate, vol. 26, no. 3, pp. 1063-1083, 2013.

[19] A. Hannachi, D. B. Stephenson, and K. R. Sperber, "Probability-based methods for quantifying nonlinearity in the ENSO," Climate Dynamics, vol. 20, no. 2-3, pp. 241-256, 2003.

[20] G. H. Golub and C. F. van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, MD, USA, 3rd edition, 1996.

[21] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, New York, NY, USA, 2nd edition, 2012.

[22] E. Schneidman, S. Still, M. J. Berry, and W. Bialek, "Network information and connected correlations," Physical Review Letters, vol. 91, no. 23, pp. 238701-1-238701-4, 2003.

[23] P. Comon, "Independent component analysis, a new concept?" Signal Processing, vol. 36, no. 3, pp. 287-314, 1994.

[24] A. Hyvarinen and P. Pajunen, "Nonlinear independent component analysis: existence and uniqueness results," Neural Networks, vol. 12, no. 3, pp. 429-439, 1999.

[25] L. Almeida, "MISEP--linear and nonlinear ICA based on mutual information," Journal of Machine Learning Research, vol. 4, no. 7-8, pp. 1297-1318, 2004.

[26] M. Scholz, "Validation on nonlinear PCA," Neural Processing Letters, vol. 36, no. 1, pp. 21-30, 2012.

[27] A. H. Monahan, "Nonlinear principal component analysis: tropical Indo-Pacific sea surface temperature and sea level pressure," Journal of Climate, vol. 14, no. 2, pp. 219-233, 2001.

[28] W. W. Hsieh, "Nonlinear canonical correlation analysis of the tropical pacific climate," Journal of Climate, vol. 14, no. 12, pp. 2528-2539, 2001.

[29] W. W. Hsieh and A. A. Wu, "Nonlinear multichannel singular spectrum analysis of the tropical Pacific climate variability using a neural network approach," Journal of Geophysical Research, vol. 107, no. C7, pp. 13-1-13-15, 2002.

[30] A. Wu, W. W. Hsieh, A. Shabbar, G. J. Boer, and F. W. Zwiers, "The nonlinear association between the Arctic Oscillation and North American winter climate," Climate Dynamics, vol. 26, no. 7-8, pp. 865-879, 2006.

[31] Q. Teng, J. C. Fyfe, and A. H. Monahan, "Northern Hemisphere circulation regimes: observed, simulated and predicted," Climate Dynamics, vol. 28, no. 7-8, pp. 867-879, 2007

[32] J. F. Cardoso and A. Souloumiac, "Blind beamforming for non-Gaussian signals," IEE Proceedings, Part F: Radar and Signal Processing, vol. 140, no. 6, pp. 362-370, 1993.

[33] L. R. Tucker, "Some mathematical notes on three-mode factor analysis," Psychometrika, vol. 31, no. 3, pp. 279-311, 1966.

[34] L. de Lathauwer, B. de Moor, and J. Vandewalle, "A multilinear singular value decomposition," SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1253-1278, 2000.

[35] L. H. Lim and J. Morton, "Cumulant component analysis: a simultaneous generalization of PCA and ICA," Computational Algebraic Statistics, Theories and Applications, 2008.

[36] E. Jondeau, E. Jurczenko, and M. Rockinger, "Moment component analysis: an illustration with international stock markets," Business & Economic Statistics, pp. 1-23, 2016.

[37] T. Blaschke and L. Wiskott, "An improved cumulant based method for independent component analysis," in Proceedings of the Artificial Neural Networks Proceedings (ICANN '02), vol. 2415 of Lecture Notes in Computer Science, Conference Paper, pp. 1087-1093, Madrid, Spain, August 2002.

[38] X. Geng, L. Ji, and K. Sun, "Principal skewness analysis: algorithm and its application for multispectral/hyperspectral images indexing," IEEE Geoscience and Remote Sensing Letters, vol. 11, no. 10, pp. 1821-1825, 2014.

[39] C. Ahlbach, J. Usatine, and N. Pippenger, "A Combinatorial Interpretation of the Joint Cumulant," 2012, https://arxiv.org/ abs/1211.0652v1.

[40] W. J. McGill, "Multivariate information transmission," Psychometrika, vol. 19, no. 2, pp. 97-116, 1954.

[41] T. Tsujishita, "On triple mutual information," Advances in Applied Mathematics, vol. 16, no. 3, pp. 269-274, 1995.

[42] A. Jakulin and A. I. Bratko, "Quantifying and visualizing attribute interactions: an approach based on entropy," 2004, https://arxiv.org/abs/cs/0308002.

[43] C. A. L. Pires and R. A. P. Perdigao, "Non-gaussian interaction information: estimation, optimization and diagnostic application of triadic wave resonance," Nonlinear Processes in Geophysics, vol. 22, no. 1, pp. 87-108, 2015.

[44] J. H. Friedman and J. W. Tukey, "A projection pursuit algorithm for exploratory data analysis," IEEE Transactions on Computers, vol. 23, pp. 881-890, 1974.

[45] P. J. Huber, "Projection pursuit," The Annals of Statistics, vol. 13, no. 2, pp. 435-525, 1985.

[46] L. Paninski, "Estimation of entropy and mutual information," Neural Computation, vol. 15, no. 6, pp. 1191-1253, 2003.

[47] C. A. Pires and R. A. P. Perdigao, "Non-Gaussianity and asymmetry of the winter monthly precipitation estimation from the NAO," Monthly Weather Review, vol. 135, no. 2, pp. 430-448, 2007.

[48] C. A. L. Pires and R. A. P. Perdigaao, "Minimum mutual information and non-gaussianity through the maximum entropy method: theory and properties," Entropy, vol. 14, no. 6, pp. 1103-1126, 2012.

[49] T. M. Smith, R. W. Reynolds, T. C. Peterson, and J. Lawrimore, "Improvements to NOAA's historical merged land-ocean surface temperature analysis (1880-2006)," Journal of Climate, vol. 21, no. 10, pp. 2283-2296, 2008.

[50] M. Messie and F. Chavez, "Global modes of sea surface temperature variability in relation to regional climate indices," Journal of Climate, vol. 24, no. 16, pp. 4314-4331, 2011.

[51] M. B. Richman, "Rotation of principal components.," Journal of Climatology, vol. 6, no. 3, pp. 293-335, 1986.

[52] G. R. North, T. L. Bell, R. F. Cahalan, and F. J. Moeng, "Sampling errors in the estimation of empirical orthogonal functions," Monthly Weather Review, vol. 110, no. 7, pp. 699-706, 1982.

[53] H. von Storch and F. W. Zwiers, Statistical Analysis in Climate Research, Cambridge University Press, 1999.

[54] Y. M. Okumura and C. Deser, "Asymmetry in the duration of El Nino and la Nina," Journal of Climate, vol. 23, no. 21, pp. 5826-5843, 2010.

[55] I. Koch and K. Naito, "Dimension selection for feature selection and dimension reduction with principal and independent component analysis," Neural Computation, vol. 19, no. 2, pp. 513-545, 2007

[56] K. Wolter and M. S. Timlin, "Monitoring ENSO in COADS with a seasonally adjusted principal component index," in Proceedings of the 17th Climate Diagnostics Workshop, Norman, OK, NOAA/NMC/CAC and Cosponsors, vol. 5257, pp. 52-57, 1993.

[57] R. A. Kerr, "A North Atlantic climate pacemaker for the centuries," Science, vol. 288, no. 5473, pp. 1984-1986, 2000.

[58] N. J. Mantua, S. R. Hare, Y. Zhang, J. M. Wallace, and R. C. Francis, "A Pacific interdecadal climate oscillation with impacts on salmon production," Bulletin of the American Meteorological Society, vol. 78, no. 6, pp. 1069-1079, 1997

[59] N. J. Mantua and S. R. Hare, "The pacific decadal oscillation," Journal of Oceanography, vol. 58, no. 1, pp. 35-44, 2002.

[60] E. Di Lorenzo, N. Schneider, K. M. Cobb et al., "North Pacific Gyre Oscillation links ocean climate and ecosystem change," Geophysical Research Letters, vol. 35, no. 8, Article ID L08607, 2008.

[61] K. Ashok, S. K. Behera, S. A. Rao, H. Weng, and T. Yamagata, "El Nino Modoki and its possible teleconnection," Journal of Geophysical Research C: Oceans, vol. 112, no. 11, Article ID C11007, 2007.

[62] Y. Morioka, T. Tozuka, S. Masson, P. Terray, J.-J. Luo, and T. Yamagata, "Subtropical dipole modes simulated in a coupled general circulation model," Journal of Climate, vol. 25, no. 12, pp. 4029-4047, 2012.

[63] R. R. Rodrigues, E. J. D. Campos, and R. Haarsma, "The impact of ENSO on the south Atlantic subtropical dipole mode," Journal of Climate, vol. 28, no. 7, pp. 2691-2705, 2015.

[64] S. E. Zebiak, "Air-sea interaction in the equatorial Atlantic region," Journal of Climate, vol. 6, no. 8, pp. 1567-1586, 1993.

[65] P. Chang, T. Yamagata, P. Schopf et al., "Climate fluctuations of tropical coupled systems: the role of ocean dynamics," Journal of Climate, vol. 19, no. 20, pp. 5122-5174, 2006.

[66] S. A. Venegas, L. A. Mysak, and D. N. Straub, "Atmosphere-ocean coupled variability in the South Atlantic," Journal of Climate, vol. 10, no. 11, pp. 2904-2920, 1997

[67] R. I. Saurral, F. J. Doblas-Reyes, and J. Garcia-Serrano, "Observed modes of sea surface temperature variability in the South Pacific region," Climate Dynamics, pp. 1-15, 2017

[68] C. Deser, M. A. Alexander, S.-P. Xie, and A. S. Phillips, "Sea surface temperature variability: patterns and mechanisms," Annual Review of Marine Science, vol. 2, no. 1, pp. 115-143, 2010.

[69] Y. Guan, J. Zhu, B. Huang, Z.-Z. Hu, and J. L. Kinter, "South Pacific Ocean dipole: a predictable mode on multiseasonal time scales," Journal of Climate, vol. 27, no. 4, pp. 1648-1658, 2014.

[70] C. Deser and M. S. Timlin, "Atmosphere-ocean interaction on weekly timescales in the North Atlantic and Pacific," Journal of Climate, vol. 10, no. 3, pp. 393-408, 1997

[71] D. Mukhin, A. Gavrilov, A. Feigin, E. Loskutov, and J. Kurths, "Principal nonlinear dynamical modes of climate variability," Scientific Reports, vol. 5, Article ID 15510, 2015.

[72] A. Gavrilov, D. Mukhin, E. Loskutov, E. Volodin, A. Feigin, and J. Kurths, "Method for reconstructing nonlinear modes with adaptive structure from multidimensional data," Chaos, vol. 26, no. 12, p. 123101, 2016.

[73] S.-I. An, J.-S. Kug, A. Timmermann, I.-S. Kang, and O. Timm, "The influence of ENSO on the generation of decadal variability in the North Pacific," Journal of Climate, vol. 20, no. 4, pp. 667-680, 2007.

[74] N. Schneider and B. D. Cornuelle, "The forcing of the Pacific Decadal Oscillation," Journal of Climate, vol. 18, no. 21, pp. 4355-4373, 2005.

[75] M. Newman, M. A. Alexander, T. R. Ault et al., "The pacific decadal oscillation, revisited," Journal of Climate, vol. 29, no. 12, pp. 4399-4427, 2016.

[76] E. Di Lorenzo, G. Liguori, N. Schneider, J. C. Furtado, B. T. Anderson, and M. A. Alexander, "ENSO and meridional modes: A null hypothesis for Pacific climate variability," Geophysical Research Letters, vol. 42, no. 21, pp. 9440-9448, 2015.

[77] B. Rodriguez-Fonseca, R. Suarez-Moreno, B. Ayarzaguena et al., "A review of ENSO influence on the North Atlantic. A nonstationary signal," Atmosphere, vol. 7, no. 7, article 87, 2016.

[78] J. Su, R. Zhang, T. Li, and X. Rong, "Skewness of subsurface ocean temperature in the equatorial Pacific based on assimilated data," Chinese Journal of Oceanology and Limnology, vol. 27, no. 3, pp. 600-606, 2009.

[79] J.-Y. Yu, P.-K. Kao, H. Paek et al., "Linking emergence of the central Pacific El Nino to the Atlantic multidecadal oscillation," Journal of Climate, vol. 28, no. 2, pp. 651-662, 2015.

[80] R. I. Jennrich, "A simple general procedure for orthogonal rotation," Psychometrika, vol. 66, no. 2, pp. 289-306, 2001.

[81] R. C. Raffenetti and K. Ruedenberg, "Parametrization of an orthogonal matrix in terms of generalized eulerian angles," International Journal of Quantum Chemistry, vol. 4, Proceedings of the International Symposium on Quantum Biology and Quantum Pharmacology, no. S3B, pp. 625-634, 1969.

Carlos A. L. Pires (1) and Abdel Hannachi (2)

(1) Instituto Dom Luiz (IDL), Faculdade de Ciencias, Universidade de Lisboa, 1749-016 Lisbon, Portugal

(2) Department of Meteorology, Stockholm University, Stockholm, Sweden

Correspondence should be addressed to Carlos A. L. Pires; clpires@fc.ul.pt

Received 23 May 2017; Accepted 10 July 2017; Published 22 August 2017

Academic Editor: Davide Faranda

Caption: Figure 1: (a) Contour plot of [f.sub.REd] as a function of [N.sub.t], [[??].sub.1] for a fixed dimension N = 4. (b) Ratio [mathematical expression not reproducible] for the confidence level [p.sub.sig] = 0.95, as a function of the dimension N and the lag-one autocorrelation [[??].sub.1] for a sample size [N.sub.t] = 1000. The ratio increases with decreasing N or increasing [[??].sub.1].

Caption: Figure 2: (a) % of explained variance confidence interval and (b) lag-one autocorrelation of the AR(1) fitting model of the leading 12 PCs.

Caption: Figure 3: Maps of the 12 variance-leading, spatially normalized EOFs of the monthly SSTAs field, computed in the period 1910-2011. Ranked EOFs are read from left to right and top to bottom.

Caption: Figure 4: (a) NE components parcel [[??].sub.kk][[y.sub.(N)]] (k in abscissas) for each N = 1, ... 12 (marked in grey shaded). (b) Total NE [[??].sub.Ed][[y.sub.(N)]] for N = 1, ..., 12. Values rejecting the null hypothesis of multi-Gaussianity at confidence levels <70%, 70%-90%, 90%-95%, and >95% are marked, respectively, by "+," open circle, small solid circle, and large solid circle.

Caption: Figure 5: Scatter-plots of PC1 versus PC3 (a) and PC1 versus PC11 (b). The marginal pdfs have been Gaussianized.

Caption: Figure 6: Ranked SVs [[??].sub.kk](z), k = 1, ..., N = 11, contributing to NE, both from real SSTA data (solid squares) and from Gaussian surrogate realizations (open squares) and 90% (red), 95% (blue) quantiles of [mathematical expression not reproducible] obtained from Gaussian surrogates after rank k, to be compared with [[[??].sub.kk](z). The trailing 6 SVs is not significant at [p.sub.sig] = 95%, leading to [N.sub.ng-Ed] = 5. The NE components restricted to the non-Gaussian subspace are marked as open circles.

Caption: Figure 7: (a) Monte-Carlo average of the sample negentropy estimator [[??].sub.Ed](y) as a function of [[sigma].sup.2] and [N.sub.g]. (b) Contour plot of the fraction F of times for which the null hypothesis [H.sub.G] is rejected at [p.sub.sig] = 99% (1% significance level) as a function of [[sigma].sub.2] and [N.sub.g]. The contour F = 0.99 can mark [[sigma].sup.2.sub.min] at the 99% confidence level.

Caption: Figure 8: Same as Figure 6 but for the for synthetic data with [[sigma].sup.2] = 20% (a) and [[sigma].sup.2] = 40% (b).

Caption: Figure 9: Sorted scaled-interactivities by decreasing order (solid circles), ITIs (solid squares) with cumulated ITIs (open crossed circles) and unbiased total NE (horizontal dashed line). The sorted IC groups are labeled (e.g., 32 means group of IC2 and IC3). The 95% confidence levels of the selected ITIs are those for which the cumulated ITIs are below the dashed line. (a) Case (a): high noise and (b) case (b): intermediate noise.

Caption: Figure 10: IC time-series for the period 1910-2011.

Caption: Figure 11: Correlation maps between ICs and SSTA field.

Caption: Figure 12: Spatially normalized loading maps of the 5 sorted ICs.

Caption: Figure 13: As in Figure 9 but for the ISA summary of SSTA data.

Caption: Figure 14: Schematic of the ICs (rank and negentropy shown in rectangles), aligned vertically for each [N.sub.max] = 3, 9, 11, 15, 17 (circles) with values of [P.sub.-+] (bulk arrows) between the sequential PC spanning spaces. Values with the dominant [r.sup.-.sub.i] x [r.sup.+.sub.j] are written next to the corresponding thin arrows. The red straight vertical lines indicate the significant dyadic links with corresponding NE interactivities.

Table 1: EOF description with the main SSTA modes, along with corresponding available indices from literature and references. EOF rank Description with the Available PC/Index Refs. main SSTA modes index cor. 1 El Nino (ENSO) MEI 0.98 [56] 2 Atlantic AMO 0.80 [57] Multi-Decadal Oscillation (AMO) 3 Pacific Decadal PDO -0.56 [58,59] Oscillation (PDO): negative phase 4 Northern Pacific NPGO 0.53 [60] Gyre Oscillation (NPGO) El Nino Modoki (EM): EMI -0.42 [61] negative phase 5 El Nino Modoki (EM): EMI 0.57 [61] positive phase Subtropical Indian SIOD [62, 63] Ocean Dipole: negative phase 6 Atlantic Nino: ATL3 -0.49 [64, 65] negative phase South Atlantic SASD [66] Subtropical Dipole Subtropical Indian SIOD [62, 63] Ocean Dipole: positive phase Zonally oriented [67, 68] dipole in South Pacific 7 Meridional sequence SPOD [67, 69] of dipoles crossing the North-South Pacific South Pacific Ocean Dipole (SPOD) 8 North Atlantic NAT [70] tripole (NAT) Tropical Atlantic [68] SST dipole 9 South Pacific Ocean SPOD [67, 69] Dipole (SPOD) 10 Zonally oriented [68] dipole in South Atlantic Zonally oriented [67, 68] dipole in South Pacific 11 Meridional Pacific SIOD [62, 63] wavelike pattern Meridional Atlantic wavelike pattern Subtropical Indian Ocean Dipole (positive phase) 12 Tripolar pattern in NAT [70] North Pacific North Atlantic tripole (NAT) EOF rank Description with the Notes main SSTA modes 1 El Nino (ENSO) 2 Atlantic Multi-Decadal Oscillation (AMO) 3 Pacific Decadal Oscillation (PDO): negative phase 4 Northern Pacific Also known as Gyre Oscillation Central Pacific or (NPGO) noncanonical El Nino El Nino Modoki (EM): negative phase 5 El Nino Modoki (EM): First EOF mode of positive phase the SSTA in the Indian Ocean ranging Subtropical Indian in the 15[degrees]S- Ocean Dipole: 45[degrees]S band negative phase 6 Atlantic Nino: First EOF coupled negative phase mode of the SSTA- wind in the South South Atlantic Atlantic First EOF Subtropical Dipole mode of the SSTA in the Indian Ocean Subtropical Indian ranging in the Ocean Dipole: 15[degrees]S- positive phase 45[degrees]S band Second EOF of the Zonally oriented South Pacific SSTA dipole in South Pacific 7 Meridional sequence First EOF of the of dipoles crossing South Pacific SSTA the North-South Pacific South Pacific Ocean Dipole (SPOD) 8 North Atlantic tripole (NAT) Tropical Atlantic SST dipole 9 South Pacific Ocean First EOF of the Dipole (SPOD) South Pacific SSTA 10 Zonally oriented Second EOF of the dipole in South South Pacific SSTA Atlantic Zonally oriented dipole in South Pacific 11 Meridional Pacific First EOF mode of wavelike pattern the SSTA in the Meridional Atlantic Indian Ocean ranging wavelike pattern in the 15[degree]sS- Subtropical Indian 45[degrees]S band Ocean Dipole (positive phase) 12 Tripolar pattern in North Pacific North Atlantic tripole (NAT) Table 2: % of explained variance, skewness, and kurtosis of the 12 leading PCs and the squared projections of Wng-Ed onto the individual whitened PCs for the analysis over N =11 PCs (see the end of Section 3.2). PC1 PC2 PC3 PC4 PC5 PC6 %Var 17.6 5.3 4.6 3.8 3.3 2.7 Skew. 0.24 0.03 0.01 0.28 -0.16 0.25 Kurt. -0.01 -0.64 -0.33 0.19 0.02 -0.02 SqPro 0.65 0.53 0.16 0.68 0.50 0.56 PC7 PC8 PC9 PC10 PC11 PC12 %Var 2.5 2.2 2.2 2.0 1.9 1.8 Skew. -0.08 0.23 0.14 0.16 0.29 0.13 Kurt. -0.06 0.17 -0.04 0.55 0.14 -0.16 SqPro 0.29 0.05 0.37 0.81 0.39 Table 3: Contributions to the ICA contrast function (2nd column). NE components (3rd column) and IC vector loadings on the basis of original variables (18) for the high noise case. Components in which ICs are most projected are in bold font. IC rank [[??].sub.Ed] [[??].sub.ii] [Y.sub.1] ([Z.sub.i,ICA]) ([Z.sub.i,ICA]) (dyad) IC1: i = 1 0.127 0.162 0.76# IC2: i = 2 0.066 0.125 -0.07 IC3: i = 3 0.042 0.102 -0.01 IC4: i = 4 0.014 0.053 -0.41# IC5: i = 5 0.011 0.063 -0.03 Total 0.260 0.506 IC rank [Y.sub.2] [Y.sub.3] [Y.sub.4] [Y.sub.5] (dyad) (triad) (triad) (triad) IC1: i = 1 -0.58# -0.02 0.01 0.16 IC2: i = 2 0.07 0.47# -0.56# 0.56# IC3: i = 3 -0.08 -0.38# -0.68# -0.42# IC4: i = 4 -0.52# 0.21 0.29 0.01 IC5: i = 5 0.02 -0.68# 0.07# 0.57 Total Note: Components in which ICs are most projected are indicated with #. Table 4: Table 3 but for the case of intermediate noise (b). IC rank [[??].sub.Ed] [[??].sub.ii] [Y.sub.1] ([Z.sub.i,ICA]) ([Z.sub.i,ICA]) (dyad) IC1: i = 1 0.410 0.454 0.70# IC2: i = 2 0.256 0.372 0.00 IC3: i = 3 0.109 0.257 0.03 IC4: i = 4 0.033 0.078 0.59# IC5: i = 5 0.031 0.114 -0.04 Total 0.840 1.277 IC rank [Y.sub.2] [Y.sub.3] [Y.sub.4] [Y.sub.5] (dyad) (triad) (triad) (triad) IC1: i = 1 -0.70# -0.03 0.00 0.08 IC2: i = 2 0.05 0.45# 0.65# 0.58# IC3: i = 3 -0.03 -0.30# 0.72# -0.55# IC4: i = 4 0.59# -0.17 -0.07 0.00 IC5: i = 5 0.04 -0.76# 0.11# 0.54# Total Note: Components in which ICs are most projected are indicated with #. Table 5: Sorted ICs i = 1, ..., 5, for SSTA data showing self-NEs contributing to the ICA contrast function (2nd column), NE components (3rd column), and loading factors [([RR.sub.ICA]).sub.k,i] (i = 1, ..., 5), of the leading ICs on the PC basis. Loading factors, sorted by absolute value, explaining up to 85% of the unitary norm are in bold font. The last two columns give the skewness and kurtosis of ICs. IC rank [[??].sub.Ed] [[??].sub.Ed] PC1 PC2 ([Z.sub.i,ICA]) ([Z.sub.ICA]) IC1: i = 1 0.694 0.747 0.53# 0.05 IC2: i = 2 0.099 0.137 0.44# 0.13 IC3: i = 3 0.042 0.091 0.24 -0.44# IC4: i = 4 0.042 0.074 -0.26 0.35# IC5: i = 5 0.016 0.064 -0.21 0.44# Total 0.893 1.113 IC rank PC3 PC4 PC5 PC6 PC7 PC8 IC1: i = 1 0.34# 0.40# -0.17 0.24# 0.06 0.12 IC2: i = 2 -0.14 0.02 0.03 -0.63# -0.35# -0.10 IC3: i = 3 0.11 0.35# 0.55# -0.27# 0.24# 0.03 IC4: i = 4 -0.07 0.47# 0.36# 0.12 0.28# 0.10 IC5: i = 5 0.06 0.42# -0.18 -0.12 -0.14 0.10 Total IC rank PC9 PC10 PC11 Sk. Kurt. IC1: i = 1 -0.10 0.18 0.52# 1.48 4.95 IC2: i = 2 -0.44# 0.13 -0.15 -0.62 1.79 IC3: i = 3 0.28# -0.29# -0.12 -0.44 1.12 IC4: i = 4 -0.29# 0.42# -0.27# -0.24 1.32 IC5: i = 5 -0.04 -0.70# 0.03 -0.32 0.58 Total Note: Loading factors, sorted by absolute value, explaining up to 85% of the unitary norm are indicated with #.

Printer friendly Cite/link Email Feedback | |

Title Annotation: | Research Article |
---|---|

Author: | Pires, Carlos A.L.; Hannachi, Abdel |

Publication: | Complexity |

Article Type: | Report |

Geographic Code: | 1USA |

Date: | Jan 1, 2017 |

Words: | 15906 |

Previous Article: | A New Ranking Principle For Ordering Trapezoidal Intuitionistic Fuzzy Numbers. |

Next Article: | Methods in Ranking Fuzzy Numbers: A Unified Index and Comparative Reviews. |

Topics: |