
FCM-Type Fuzzy Coclustering for Three-Mode Cooccurrence Data: 3FCCM and 3Fuzzy CoDoK.

1. Introduction

In many web data analyses, we often have cooccurrence information among objects and items instead of multidimensional observations on objects. For example, web document summarization and web market purchase summarization are reduced to document-keyword cooccurrence analysis and customer-product basket analysis, respectively. FCM-type fuzzy coclustering is an extension of fuzzy c-Means (FCM) [1], where the degree of belongingness to clusters is represented by fuzzy memberships under the fuzzy partition concept [2]. Fuzzy clustering for categorical multivariate data (FCCM) [3] replaced the FCM clustering criterion with the aggregation degree of objects and items in coclusters by adopting entropy-based fuzzification [4, 5]. In fuzzy coclustering of documents and keywords (fuzzy CoDoK) [6], the FCCM criterion was maximized with quadratic regularization-based fuzzification [7], so that it can be applied to large data sets.

Despite their usefulness in many applications, the conventional fuzzy coclustering models may not work well when the cooccurrence data are strongly influenced by other intrinsic features. For example, in food preference analysis, users' preferences for foods cannot be revealed by considering only user-food cooccurrences; they should be found by also considering the implicit relation between users and the cooking ingredients that compose the foods. Thus, when we have not only cooccurrence information among objects and items but also intrinsic relations among items and other ingredients, we can expect to find more useful cocluster structures in the three-mode cooccurrence information data.

In this paper, two FCM-type fuzzy coclustering models are extended for analyzing three-mode cooccurrence information data, in which FCM-like alternating optimization schemes are performed considering the cooccurrence relations among objects, items, and other ingredients. First, the FCCM algorithm is extended to the three-mode FCCM (3FCCM) algorithm by utilizing three types of fuzzy memberships for objects, items, and ingredients, where the aggregation degree of the three features in each cocluster is maximized through iterative updating of memberships supported by entropy-based fuzzification. Second, the 3FCCM algorithm is further extended to the three-mode Fuzzy CoDoK (3Fuzzy CoDoK) by introducing quadratic regularization-based fuzzification. The characteristic features of the proposed methods are demonstrated through a numerical experiment.

The remainder of this paper is organized as follows: Section 2 gives a brief review on the conventional FCM-type fuzzy coclustering models and Section 3 proposes the novel extensions of the FCM-type coclustering models for three-mode cooccurrence information data. The experimental result is shown in Section 4 and a summary conclusion is presented in Section 5.

2. FCM-Type Fuzzy Coclustering

Fuzzy c-Means (FCM) [1, 5] is a fuzzy extension of the conventional crisp k-Means [8] by introducing fuzzy partition concept [2]. When we have multidimensional observations on n objects [x.sub.i], i = 1, ..., n, they are partitioned into C fuzzy clusters by estimating fuzzy memberships [u.sub.ci] for each object, where [u.sub.ci] represents the degree of belongingness of object i to cluster c and is generally calculated under the probabilistic constraint of [[summation].sup.C.sub.c=1] [u.sub.ci] = 1. In FCM, each cluster is represented by prototypical centroids and objects are partitioned so that the membership-weighted within-cluster errors from prototypes are minimized in the multidimensional data space. On the other hand, in the coclustering context, we have only relational information among elements but do not use any cluster prototypes in multidimensional space. In this paper, two variants of FCM-type fuzzy coclustering are considered.

2.1. FCCM. Assume that we have n x m cooccurrence information R = {[r.sub.ij]} among n objects and m items; for example, in document-keyword analysis, [r.sub.ij] can be the frequency of keyword (item) j in document (object) i. The goal is to extract coclusters composed of mutually familiar pairs of objects and items by simultaneously estimating fuzzy memberships of objects [u.sub.ci] and items [w.sub.cj] such that mutually familiar objects and items with large [r.sub.ij] tend to have large memberships in the same cluster considering the aggregation degree of each cocluster. The sum of aggregation degrees to be maximized is defined as [3]

$$L = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij}. \tag{1}$$

This objective function is based on a concept similar to that of relational matrix decomposition methods such as correspondence analysis (CA) [9] and nonnegative matrix factorization (NMF) [10], where a relational matrix R = {[r.sub.ij]} is decomposed into two component matrices having orthogonal columns. Whereas the matrix decomposition methods force both objects and items to be equally exclusive, FCM-type coclustering models adopt different kinds of partition constraints for the two [11]. Here, object memberships [u.sub.ci] play a role similar to those of FCM under the same condition, [[summation].sup.C.sub.c=1] [u.sub.ci] = 1. If item memberships [w.sub.cj] also obeyed the analogous condition [[summation].sup.C.sub.c=1] [w.sub.cj] = 1, the aggregation criterion would have a trivial maximum at [u.sub.ci] = [w.sub.cj] = 1, [for all]i, j, in one particular cluster c. To avoid this trivial solution, [w.sub.cj] are instead forced to be exclusive within each cluster, such that [[summation].sup.m.sub.j=1] [w.sub.cj] = 1, so that [w.sub.cj] represent the relative typicalities of items in each cluster. As a result, FCM-type coclustering mainly targets object partitioning, while CA and NMF impose an exclusive nature equally on the partitions of both objects and items.

Because of the linear nature with respect to [u.sub.ci] and [w.sub.cj], (1) is maximized with crisp memberships of [u.sub.ci] [member of] {0,1} and [w.sub.cj] [member of] {0,1} in a similar manner to k-Means. In order to find fuzzy partition, some fuzzification mechanism must be introduced like FCM.
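To see why the maximizer of (1) is crisp, it helps to fix the item memberships and look at one object at a time. The following is only a short sketch; the shorthand a_{ci} is introduced here for illustration and does not appear in the original formulation.

```latex
% With w_{cj} fixed, (1) is linear in the object memberships:
L = \sum_{i=1}^{n} \sum_{c=1}^{C} u_{ci}\, a_{ci},
\qquad a_{ci} = \sum_{j=1}^{m} w_{cj} r_{ij}.
% Under \sum_{c} u_{ci} = 1 and u_{ci} \ge 0, each object's term
% is maximized at a vertex of the simplex (ties aside):
u_{ci} =
\begin{cases}
1, & c = \arg\max_{l} a_{li}, \\
0, & \text{otherwise}.
\end{cases}
```

The same argument applies to the item memberships when the object memberships are fixed, which is why a penalty term that is nonlinear in the memberships is needed to obtain a fuzzy partition.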

In [3], the linear aggregation criterion of (1) was nonlinearized with respect to [u.sub.ci] and [w.sub.cj] by entropy-based penalties [4, 5] for fuzzification of two-types of memberships and the objective function for Fuzzy Clustering for Categorical Multivariate data (FCCM) was proposed as

$$L_{fccm} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci} \log u_{ci} - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj} \log w_{cj}, \tag{2}$$

where [[lambda].sub.u] and [[lambda].sub.w] are the fuzzification weights for object memberships and item memberships, respectively. Larger [[lambda].sub.u] and [[lambda].sub.w] bring fuzzier partitions of objects and items.

Based on the alternating optimization principle, [u.sub.ci] and [w.sub.cj] are iteratively updated until convergence using the following updating rules:

$$
\begin{aligned}
u_{ci} &= \frac{\exp\left(\lambda_u^{-1} \sum_{j=1}^{m} w_{cj} r_{ij}\right)}{\sum_{l=1}^{C} \exp\left(\lambda_u^{-1} \sum_{j=1}^{m} w_{lj} r_{ij}\right)}, \\
w_{cj} &= \frac{\exp\left(\lambda_w^{-1} \sum_{i=1}^{n} u_{ci} r_{ij}\right)}{\sum_{l=1}^{m} \exp\left(\lambda_w^{-1} \sum_{i=1}^{n} u_{ci} r_{il}\right)}.
\end{aligned}
\tag{3}
$$

Although both updating rules always satisfy the sum-to-one constraints, they can be numerically unstable: the arguments of the exp(*) function can become extremely large for very large n or m, which causes overflows.
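The two rules in (3) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the toy matrix, and the fuzzification weights are all hypothetical. The sketch also subtracts the largest exponent before calling `exp`, a standard softmax trick that is not part of the original formulation; it leaves the ratios in (3) unchanged while avoiding the overflow mentioned above.

```python
import math

def softmax(scores):
    """Normalized exponentials; subtracting the max keeps exp() from overflowing."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fccm_update_u(R, w, lam_u):
    """Object memberships of (3): for each object i, normalize over clusters c."""
    C, n, m = len(w), len(R), len(R[0])
    u = [[0.0] * n for _ in range(C)]
    for i in range(n):
        scores = [sum(w[c][j] * R[i][j] for j in range(m)) / lam_u
                  for c in range(C)]
        for c, value in enumerate(softmax(scores)):
            u[c][i] = value
    return u

def fccm_update_w(R, u, lam_w):
    """Item memberships of (3): for each cluster c, normalize over items j."""
    C, n, m = len(u), len(R), len(R[0])
    w = []
    for c in range(C):
        scores = [sum(u[c][i] * R[i][j] for i in range(n)) / lam_w
                  for j in range(m)]
        w.append(softmax(scores))
    return w

# Hypothetical toy data: objects 0-1 cooccur with items 0-1, objects 2-3 with items 2-3.
R = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
# Hand-picked (not random) initial item memberships; each row sums to one.
w = [[0.4, 0.3, 0.2, 0.1],
     [0.1, 0.2, 0.3, 0.4]]
for _ in range(20):
    u = fccm_update_u(R, w, lam_u=0.1)
    w = fccm_update_w(R, u, lam_w=1.0)
```

On this toy matrix the first two objects end up almost entirely in the first cluster and the last two in the second, while each row of `w` remains a distribution over items.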

2.2. Fuzzy CoDoK. As an alternative approach, Kummamuru et al. [6] extended FCCM by introducing the quadratic term-based fuzzification mechanism [7] instead of the entropy-based fuzzification, so that it can handle larger data sets. The objective function of fuzzy coclustering of documents and keywords (Fuzzy CoDoK) was proposed as

$$L_{fcdk} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci}^{2} - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj}^{2}, \tag{4}$$

where [[lambda].sub.u] and [[lambda].sub.w] play roles similar to those in FCCM.

Based on the Lagrangian multiplier method, the updating rules are obtained as

$$
\begin{aligned}
u_{ci} &= \frac{1}{C} + \frac{1}{2\lambda_u} \left\{ \sum_{j=1}^{m} w_{cj} r_{ij} - \frac{1}{C} \left( \sum_{l=1}^{C} \sum_{j=1}^{m} w_{lj} r_{ij} \right) \right\}, \\
w_{cj} &= \frac{1}{m} + \frac{1}{2\lambda_w} \left\{ \sum_{i=1}^{n} u_{ci} r_{ij} - \frac{1}{m} \left( \sum_{l=1}^{m} \sum_{i=1}^{n} u_{ci} r_{il} \right) \right\}.
\end{aligned}
\tag{5}
$$

These updating rules are more numerically stable than those of FCCM because the computed values grow only linearly with n and m. However, [u.sub.ci] and [w.sub.cj] can become negative, in which case they violate the nonnegativity required of memberships. In practice, negative memberships are therefore set to zero and the remaining positive memberships are renormalized so that they sum to one.
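The clipping and renormalization step is easy to get subtly wrong, so a small sketch may help. The code below is an illustration under stated assumptions rather than the reference implementation: the function names, the toy matrix, and the weights are hypothetical. It applies (5) and then the zero-clipping described above.

```python
def clip_and_renormalize(values):
    """Set negative memberships to zero and rescale the rest to sum to one.

    The raw values of (5) always sum to one, so at least one of them is
    positive and the divisor below cannot be zero.
    """
    clipped = [max(v, 0.0) for v in values]
    total = sum(clipped)
    return [v / total for v in clipped]

def codok_update_u(R, w, lam_u):
    """Object memberships of (5), clipped and renormalized over clusters."""
    C, n, m = len(w), len(R), len(R[0])
    u = [[0.0] * n for _ in range(C)]
    for i in range(n):
        a = [sum(w[c][j] * R[i][j] for j in range(m)) for c in range(C)]
        mean_a = sum(a) / C
        raw = [1.0 / C + (a[c] - mean_a) / (2.0 * lam_u) for c in range(C)]
        for c, value in enumerate(clip_and_renormalize(raw)):
            u[c][i] = value
    return u

def codok_update_w(R, u, lam_w):
    """Item memberships of (5), clipped and renormalized over items."""
    C, n, m = len(u), len(R), len(R[0])
    w = []
    for c in range(C):
        b = [sum(u[c][i] * R[i][j] for i in range(n)) for j in range(m)]
        mean_b = sum(b) / m
        raw = [1.0 / m + (b[j] - mean_b) / (2.0 * lam_w) for j in range(m)]
        w.append(clip_and_renormalize(raw))
    return w

# Hypothetical toy data and hand-picked initial item memberships.
R = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
w0 = [[0.4, 0.3, 0.2, 0.1],
      [0.1, 0.2, 0.3, 0.4]]
u_sharp = codok_update_u(R, w0, lam_u=0.05)   # small weight: raw values leave [0, 1]
u_soft = codok_update_u(R, w0, lam_u=10.0)    # large weight: close to 1/C
w_new = codok_update_w(R, u_sharp, lam_w=1.0)
```

With the small weight the raw value for the first object is 2.5 in one cluster and -1.5 in the other, so clipping yields a crisp assignment; with the large weight both raw values stay close to 1/C and no clipping occurs.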

Despite the usefulness of these fuzzy coclustering models in handling two-mode cooccurrence information, their cocluster structures may be influenced by other, third elements. Specifically, if each item is related to some other ingredients, the partition quality is expected to improve when the intrinsic relation among the three-mode elements is considered. In the following section, the FCM-type coclustering algorithms are extended for analyzing such three-mode cooccurrence information data.

3. Extension of FCM-Type Coclustering for Three-Mode Cooccurrence Data Analysis

Assume that we have n x m cooccurrence information R = {[r.sub.ij]} among n objects and m items, and that the items are characterized by other ingredients, where the cooccurrence information among the m items and p other ingredients is summarized in the m x p matrix S = {[s.sub.jk]}, with [s.sub.jk] representing the cooccurrence degree of item j and ingredient k. For example, in food preference analysis, R can be an evaluation matrix by n users on m foods and S may indicate the presence/absence of p cooking ingredients in m foods. The goal of three-mode cocluster analysis is to reveal the cocluster structures among the objects, items, and ingredients by considering R and S together with the intrinsic relation among objects and ingredients.

In order to extend the conventional FCCM and Fuzzy CoDoK algorithms to three-mode cocluster analysis, additional memberships [z.sub.ck] are introduced for representing the membership degree of ingredient k to cocluster c. Not only should familiar pairs of objects and items occur together in the same cluster, but the typical ingredients of those items should also belong to that cluster. The aggregation degree to be maximized in three-mode coclustering can then be defined as

$$L = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} u_{ci} w_{cj} z_{ck} r_{ij} s_{jk}, \tag{6}$$

where each cluster should be composed of a familiar group of objects, items, and ingredients: object i, item j, and ingredient k are assigned to the same cluster when object i cooccurs with item j and item j contains ingredient k, which implies an intrinsic connection between object i and ingredient k.
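As a rough illustration of (6), the sketch below evaluates the aggregation degree in two ways: directly with four nested sums per cluster, and in a factored form that groups the object side and the ingredient side around each item. The factoring is a plain algebraic rearrangement and is not taken from the paper; the function names and toy data are hypothetical.

```python
def aggregation_naive(R, S, u, w, z):
    """Direct evaluation of (6): every (c, i, j, k) combination is visited."""
    C, n, m, p = len(u), len(R), len(S), len(S[0])
    return sum(u[c][i] * w[c][j] * z[c][k] * R[i][j] * S[j][k]
               for c in range(C) for i in range(n)
               for j in range(m) for k in range(p))

def aggregation_factored(R, S, u, w, z):
    """Same value, using sum_j w_cj * (sum_i u_ci r_ij) * (sum_k z_ck s_jk)."""
    C, n, m, p = len(u), len(R), len(S), len(S[0])
    total = 0.0
    for c in range(C):
        for j in range(m):
            object_side = sum(u[c][i] * R[i][j] for i in range(n))
            ingredient_side = sum(z[c][k] * S[j][k] for k in range(p))
            total += w[c][j] * object_side * ingredient_side
    return total

# Hypothetical toy data: object 0 uses item 0, which contains ingredient 0,
# and object 1 uses item 1, which contains ingredient 1.
R = [[1, 0], [0, 1]]
S = [[1, 0], [0, 1]]
u = [[1.0, 0.0], [0.0, 1.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
z_aligned = [[1.0, 0.0], [0.0, 1.0]]   # ingredients grouped with their items
z_swapped = [[0.0, 1.0], [1.0, 0.0]]   # ingredients grouped with the wrong items

aligned = aggregation_factored(R, S, u, w, z_aligned)
swapped = aggregation_factored(R, S, u, w, z_swapped)
```

The aligned assignment scores 2.0 and the swapped one 0.0, which matches the intent of (6): the criterion rewards clusters whose ingredients are actually contained in the cluster's items.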

In the following parts of this section, the conventional FCCM and Fuzzy CoDoK algorithms are extended to their three-mode versions utilizing the above aggregation criterion.

3.1. Three-Mode Extension of FCCM. First, the FCCM algorithm is extended by using the modified aggregation criterion of (6) supported by the entropy-based fuzzification scheme. The objective function for three-mode FCCM (3FCCM) is constructed by modifying the FCCM objective function of (2) as

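$$L_{3fccm} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} u_{ci} w_{cj} z_{ck} r_{ij} s_{jk} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci} \log u_{ci} - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj} \log w_{cj} - \lambda_z \sum_{c=1}^{C} \sum_{k=1}^{p} z_{ck} \log z_{ck}, \tag{7}$$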

where [[lambda].sub.z] is the additional penalty weight for fuzzification of ingredient memberships [z.sub.ck]. The larger the value of [[lambda].sub.z] is, the fuzzier the ingredient memberships are.

Here, it should be noted that we can adopt two different types of constraints to ingredient memberships [z.sub.ck], such that object-type probabilistic constraint [[summation].sup.C.sub.c=1] [z.sub.ck] = 1, [for all]k or item type typicality constraint [[summation].sup.p.sub.k=1] [z.sub.ck] = 1, [for all]c. In such cases as food preference analysis, some common ingredients may be widely used in many foods while other rare ingredients can be negligible in all clusters. Then, from the view point of typical ingredient selection for characterizing cocluster features, item-type typicality constraint is adopted in this paper, such that [[summation].sup.p.sub.k=1] [z.sub.ck] = 1, [for all]c.

The clustering algorithm is an iterative process of updating [u.sub.ci], [w.sub.cj], and [z.sub.ck] under the alternating optimization principle. Considering the necessary conditions for optimality, [partial derivative][L.sub.3fccm]/[partial derivative][u.sub.ci] = 0, [partial derivative][L.sub.3fccm]/[partial derivative][w.sub.cj] = 0, and [partial derivative][L.sub.3fccm]/[partial derivative][z.sub.ck] = 0, under the sum-to-one constraints, the updating rules for the three memberships are given as

$$u_{ci} = \frac{\exp\left(\frac{1}{\lambda_u} \sum_{j=1}^{m} \sum_{k=1}^{p} w_{cj} z_{ck} r_{ij} s_{jk}\right)}{\sum_{l=1}^{C} \exp\left(\frac{1}{\lambda_u} \sum_{j=1}^{m} \sum_{k=1}^{p} w_{lj} z_{lk} r_{ij} s_{jk}\right)}, \tag{8}$$

$$w_{cj} = \frac{\exp\left(\frac{1}{\lambda_w} \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{ij} s_{jk}\right)}{\sum_{l=1}^{m} \exp\left(\frac{1}{\lambda_w} \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{il} s_{lk}\right)}, \tag{9}$$

$$z_{ck} = \frac{\exp\left(\frac{1}{\lambda_z} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jk}\right)}{\sum_{l=1}^{p} \exp\left(\frac{1}{\lambda_z} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jl}\right)}. \tag{10}$$
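Rule (8) can be recovered with a Lagrange multiplier, assuming that the 3FCCM objective carries the same entropy penalty on the object memberships as (2). The following is only a sketch of that step; the shorthand a_{ci} and the multiplier \gamma_i are introduced here for illustration.

```latex
a_{ci} = \sum_{j=1}^{m} \sum_{k=1}^{p} w_{cj} z_{ck} r_{ij} s_{jk}
% Terms of the objective that involve object i, with multiplier \gamma_i:
\mathcal{L}_i = \sum_{c=1}^{C} u_{ci} a_{ci}
  - \lambda_u \sum_{c=1}^{C} u_{ci} \log u_{ci}
  - \gamma_i \Bigl( \sum_{c=1}^{C} u_{ci} - 1 \Bigr)
% Stationarity with respect to u_{ci}:
\frac{\partial \mathcal{L}_i}{\partial u_{ci}}
  = a_{ci} - \lambda_u \left( \log u_{ci} + 1 \right) - \gamma_i = 0
\;\Longrightarrow\;
u_{ci} = \exp\!\left( \frac{a_{ci}}{\lambda_u} \right)
         \exp\!\left( -1 - \frac{\gamma_i}{\lambda_u} \right)
% Eliminating \gamma_i through \sum_{c} u_{ci} = 1:
u_{ci} = \frac{\exp\left( a_{ci} / \lambda_u \right)}
              {\sum_{l=1}^{C} \exp\left( a_{li} / \lambda_u \right)}
```

The rules for the item and ingredient memberships follow in the same way, with the normalization running over items or ingredients within each cluster instead of over clusters.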

3.2. Three-Mode Extension of Fuzzy CoDoK. Next, Fuzzy CoDoK is extended to the three-mode coclustering model named three-mode Fuzzy CoDoK (3Fuzzy CoDoK). The objective function of (4) is modified as

$$L_{3fcdk} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} u_{ci} w_{cj} z_{ck} r_{ij} s_{jk} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci}^{2} - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj}^{2} - \lambda_z \sum_{c=1}^{C} \sum_{k=1}^{p} z_{ck}^{2}, \tag{11}$$

where [[lambda].sub.z] plays a role similar to that in 3FCCM and the three types of fuzzy memberships follow the same constraints as in 3FCCM.

The updating rules are given in a similar manner to the previous section as follows:

$$u_{ci} = \frac{1}{C} + \frac{1}{2\lambda_u} \left\{ \sum_{j=1}^{m} \sum_{k=1}^{p} w_{cj} z_{ck} r_{ij} s_{jk} - \frac{1}{C} \left( \sum_{l=1}^{C} \sum_{j=1}^{m} \sum_{k=1}^{p} w_{lj} z_{lk} r_{ij} s_{jk} \right) \right\}, \tag{12}$$

$$w_{cj} = \frac{1}{m} + \frac{1}{2\lambda_w} \left\{ \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{ij} s_{jk} - \frac{1}{m} \left( \sum_{l=1}^{m} \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{il} s_{lk} \right) \right\}, \tag{13}$$

$$z_{ck} = \frac{1}{p} + \frac{1}{2\lambda_z} \left\{ \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jk} - \frac{1}{p} \left( \sum_{l=1}^{p} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jl} \right) \right\}. \tag{14}$$

In a similar manner to Fuzzy CoDoK, the above updating rules are computationally more stable than 3FCCM because of the lack of exp(*) function. However, [u.sub.ci], [w.sub.cj], and [z.sub.ck] can be negative. Then, in practice, the negative memberships should be set to zero, and the remaining positive memberships can be renormalized so that their sum is one.
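To make the effect of clipping on the ingredient memberships concrete, the following sketch applies (14) followed by the zeroing and renormalization described above. It is a minimal illustration only: the function names, the toy matrices, and the two values of the fuzzification weight are hypothetical.

```python
def clip_and_renormalize(values):
    """Zero out negative memberships, then rescale the rest to sum to one.

    The raw values of (14) sum to one, so the divisor is always positive.
    """
    clipped = [max(v, 0.0) for v in values]
    total = sum(clipped)
    return [v / total for v in clipped]

def codok3_update_z(R, S, u, w, lam_z):
    """Ingredient memberships of (14) for every cluster, with clipping."""
    C, n, m, p = len(u), len(R), len(S), len(S[0])
    z = []
    for c in range(C):
        # a[j] = sum_i u_ci r_ij, the object-side support of item j
        a = [sum(u[c][i] * R[i][j] for i in range(n)) for j in range(m)]
        # g[k] = sum_i sum_j u_ci w_cj r_ij s_jk, the score of ingredient k
        g = [sum(w[c][j] * a[j] * S[j][k] for j in range(m)) for k in range(p)]
        mean_g = sum(g) / p
        raw = [1.0 / p + (g[k] - mean_g) / (2.0 * lam_z) for k in range(p)]
        z.append(clip_and_renormalize(raw))
    return z

# Hypothetical toy data: objects 0-1 use items 0-1, which contain ingredients 0-1;
# objects 2-3 use items 2-3, which contain ingredients 2-3.
R = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
S = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
u = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
w = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]

z_sparse = codok3_update_z(R, S, u, w, lam_z=0.5)   # raw values include -0.75
z_dense = codok3_update_z(R, S, u, w, lam_z=4.0)    # all raw values positive
```

With the smaller weight, the two unrelated ingredients in each cluster receive negative raw values and are clipped to exactly zero; with the larger weight, every ingredient keeps a positive membership.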

3.3. A Sample Algorithm for FCM-Type Fuzzy Coclustering for Three-Mode Cooccurrence Data. Following the above derivation, a sample algorithm is represented as follows:

[FCM-Type Fuzzy Coclustering for Three-Mode Cooccurrence Data: 3FCCM and 3Fuzzy CoDoK]

(1) Given n x m cooccurrence matrix R and m x p cooccurrence matrix S, let C be the number of clusters. Choose the fuzzification weights [[lambda].sub.u], [[lambda].sub.w], and [[lambda].sub.z].

(2) [Initialization] Randomly initialize [u.sub.ci], [w.sub.cj], and [z.sub.ck], such that [[summation].sup.C.sub.c=1] [u.sub.ci] = 1, [[summation].sup.m.sub.j=1] [w.sub.cj] = 1, and [[summation].sup.p.sub.k=1] [z.sub.ck] = 1.

(3) [Iterative process] Iterate the following process until convergence of all [u.sub.ci].

(a) Update [u.sub.ci] with (8) for 3FCCM or (12) for 3Fuzzy CoDoK.

(b) Update [w.sub.cj] with (9) for 3FCCM or (13) for 3Fuzzy CoDoK.

(c) Update [z.sub.ck] with (10) for 3FCCM or (14) for 3Fuzzy CoDoK.
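Steps (1) to (3) can be sketched as a single loop. The version below covers only the 3FCCM branch, uses plain Python lists, and makes several assumptions that are not in the text: the function name `run_3fccm`, the tolerance and iteration cap, the seeded random initialization, and a uniform starting value for the object memberships (they are recomputed first, so the starting value matters only for the convergence test). It also subtracts the largest exponent before `exp`, which does not change the ratios in (8) to (10).

```python
import math
import random

def softmax(scores):
    """Normalized exponentials, shifted by the max to avoid overflow."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def run_3fccm(R, S, C, lam_u, lam_w, lam_z, max_iter=100, tol=1e-6, seed=0):
    rng = random.Random(seed)
    n, m, p = len(R), len(S), len(S[0])

    def random_rows(rows, cols):
        out = []
        for _ in range(rows):
            raw = [rng.random() + 1e-3 for _ in range(cols)]
            total = sum(raw)
            out.append([v / total for v in raw])
        return out

    # Step (2): each row of w and z sums to one within its cluster.
    w = random_rows(C, m)
    z = random_rows(C, p)
    u = [[1.0 / C] * n for _ in range(C)]

    for _ in range(max_iter):
        # b[c][j] = sum_k z_ck s_jk, reused by the u and w updates
        b = [[sum(z[c][k] * S[j][k] for k in range(p)) for j in range(m)]
             for c in range(C)]
        # (a) object memberships, rule (8): normalize over clusters
        new_u = [[0.0] * n for _ in range(C)]
        for i in range(n):
            scores = [sum(w[c][j] * R[i][j] * b[c][j] for j in range(m)) / lam_u
                      for c in range(C)]
            for c, value in enumerate(softmax(scores)):
                new_u[c][i] = value
        # a[c][j] = sum_i u_ci r_ij, reused by the w and z updates
        a = [[sum(new_u[c][i] * R[i][j] for i in range(n)) for j in range(m)]
             for c in range(C)]
        # (b) item memberships, rule (9): normalize over items
        w = [softmax([a[c][j] * b[c][j] / lam_w for j in range(m)])
             for c in range(C)]
        # (c) ingredient memberships, rule (10): normalize over ingredients
        z = [softmax([sum(w[c][j] * a[c][j] * S[j][k] for j in range(m)) / lam_z
                      for k in range(p)])
             for c in range(C)]
        # Step (3): stop when no object membership moved by more than tol.
        change = max(abs(new_u[c][i] - u[c][i])
                     for c in range(C) for i in range(n))
        u = new_u
        if change < tol:
            break
    return u, w, z

# Hypothetical toy data with two blocks of objects, items, and ingredients.
R = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
S = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
u, w, z = run_3fccm(R, S, C=2, lam_u=0.1, lam_w=1.0, lam_z=1.0)
```

Swapping the three `softmax` calls for the linear rules (12) to (14) followed by clipping and renormalization would give the 3Fuzzy CoDoK branch of the same loop.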

4. Experimental Results

4.1. Experimental Design. In order to demonstrate the characteristics of the proposed algorithms, a numerical experiment was performed with an artificially generated three-mode data set, in which 40 objects (n = 40) have relational connection with 50 items (m = 50) and the items are related to 30 ingredients (p = 30). The artificial three-mode cooccurrence matrices were generated under the assumption that objects and ingredients have intrinsic (unknown) connections, as shown in the 40 x 30 matrix X = {[x.sub.ik]} of Figure 1(a), where black and white cells represent full-connection ([x.sub.ik] = 1) and no-connection ([x.sub.ik] = 0), respectively. (Note that all the following gray-scale figures depict visual images of matrices, where black and white cells represent maximum and minimum values.)

The 50 x 30 cooccurrence matrix S = {[s.sub.jk]} among items and ingredients was constructed as shown in Figure 1(b), where each ingredient k randomly occurred ([s.sub.jk] = 1) in each item with 10% probability, whereas the others remained as [s.sub.jk] = 0. Then, the 40 x 50 cooccurrence matrix R = {[r.sub.ij]} among objects and items was generated such that [r.sub.ij] = 1 if item j cooccurs with several ingredients that are connected with object i in X, and [r.sub.ij] = 0 otherwise. Figure 1(c) shows the cooccurrence matrix R. For example, in food preference analysis, 50 foods are made from 30 cooking ingredients (matrix S) and each of 40 users chooses some foods (matrix R) considering their intrinsic preferences for cooking ingredients (matrix X).
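A data set of this kind could be generated along the following lines. This is a sketch under explicit assumptions rather than the authors' generator: the exact pattern of X in Figure 1(a) is not reproduced in the text, so a three-block structure is assumed here, and the threshold `min_shared`, which stands in for "several ingredients", is a hypothetical parameter, as are the function name and the seed.

```python
import random

def make_three_mode_data(n=40, m=50, p=30, density=0.10, min_shared=2, seed=0):
    """Generate (X, S, R) in the spirit of the experimental design."""
    rng = random.Random(seed)
    # Assumed block structure for X: object group g is connected to ingredient group g.
    X = [[1 if (3 * i) // n == (3 * k) // p else 0 for k in range(p)]
         for i in range(n)]
    # Each ingredient occurs in each item independently with the given probability.
    S = [[1 if rng.random() < density else 0 for _ in range(p)]
         for _ in range(m)]
    # r_ij = 1 when item j contains at least min_shared ingredients tied to object i.
    R = [[1 if sum(S[j][k] * X[i][k] for k in range(p)) >= min_shared else 0
          for j in range(m)]
         for i in range(n)]
    return X, S, R

X, S, R = make_three_mode_data()
```

Only R and S would be passed to the coclustering algorithms; X is kept aside for judging whether the extracted ingredient memberships match the hidden connections.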

The goal of this experiment is to extract the intrinsic cocluster structure among objects, items, and ingredients from cooccurrence matrices R and S without utilizing the intrinsic (unknown) connection X among objects and ingredients; that is, X is withheld in the following experiments.

4.2. Cocluster Extraction by 3FCCM and 3Fuzzy CoDoK. First, the proposed 3FCCM and 3Fuzzy CoDoK algorithms were applied to R and S with C = 3 and their results are compared. Fuzziness penalties were [[lambda].sub.u] = 0.1, [[lambda].sub.w] = 0.2 and [[lambda].sub.z] = 0.3 for 3FCCM, and [[lambda].sub.u] = 0.05, [[lambda].sub.w] = 10.0, and [[lambda].sub.z] = 15.0 for 3Fuzzy CoDoK. The derived three types of memberships, which were the most frequent solutions in 100 trials with different random initializations, are depicted in the gray-scale figures of Figures 2 and 3, where each row represents the membership degree of objects, items, or ingredients for a cluster.

Figures 2(a) and 2(b) indicate that the 40 objects were successfully partitioned into three clusters by the 3FCCM algorithm, in which some meaningful ingredients, that is, cluster-wise typical ingredients in Figure 1(a), have large memberships for characterizing each cocluster, even though the intrinsic information X was withheld in the experiment. Additionally, some typical items of each cluster were also indicated by large [w.sub.cj], as shown in Figure 2(c); for example, items 4, 5, 9, 14, 25, and 37 are typical in cluster 1.

Meanwhile, Figure 3(a) indicates that the 3Fuzzy CoDoK algorithm extracted almost the same object clusters as in Figure 2(a), but the ingredient memberships shown in Figure 3(b) have slightly different features from those in Figure 2(b). Only a few ingredients have very large memberships, while many others have exactly zero memberships because the values given by (14) were negative and were set to zero. Additionally, some meaningless ingredients had nonzero memberships, in contrast to the result of 3FCCM. A similar feature can also be seen in Figure 3(c).

These results imply that the 3FCCM algorithm is more suitable for clearly capturing the intrinsic connections although 3Fuzzy CoDoK has an advantage in computational stability.

4.3. Comparison with Conventional Two-Mode Fuzzy Coclustering. Second, the above clustering results are compared with those of the conventional FCCM and Fuzzy CoDoK, which are designed only for two-mode cooccurrence information. Although the intrinsic connection X is withheld in this experiment, similar intrinsic information can be reconstructed by multiplying the two cooccurrence matrices R and S, such that R x S gives an n x p relational matrix on objects and ingredients. Figure 4 shows the estimated 40 x 30 intrinsic connection matrix [??] = R x S.
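The product R x S simply counts, for each object and ingredient, how many of the object's items contain that ingredient. A minimal sketch with a hypothetical function name and toy matrices:

```python
def estimate_connection(R, S):
    """Matrix product R x S: entry (i, k) is sum_j r_ij * s_jk."""
    n, m, p = len(R), len(S), len(S[0])
    return [[sum(R[i][j] * S[j][k] for j in range(m)) for k in range(p)]
            for i in range(n)]

# Hypothetical toy data: 2 objects, 3 items, 2 ingredients.
R = [[1, 1, 0],
     [0, 0, 1]]
S = [[1, 0],
     [1, 1],
     [0, 1]]
estimate = estimate_connection(R, S)
```

Here the first object reaches ingredient 0 through two of its items and ingredient 1 through one, so its row is [2, 1]. Every item contributes with equal weight to these counts, including items that are not typical of any cluster.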

The conventional FCCM and Fuzzy CoDoK were applied to [??]. Fuzziness penalties were [[lambda].sub.u] = 0.05 and [[lambda].sub.w] = 50.0 for FCCM and [[lambda].sub.u] = 0.1 and [[lambda].sub.w] = 100.0 for Fuzzy CoDoK. Here, item memberships [w.sub.ck] are identified with ingredient memberships [z.sub.ck] in the algorithms. Figures 5 and 6 show the derived memberships that appeared most frequently in 100 trials with different random initializations. The figures imply that the 40 objects were partitioned into three clusters similar to those of 3FCCM and 3Fuzzy CoDoK. However, ingredient memberships [z.sub.ck] were slightly contaminated, and it is hard to select meaningful ingredients intuitively compared with the result of 3FCCM. This may be because all items are embedded into [??] with equal responsibilities, so that the estimated [??] = R x S shown in Figure 4 is contaminated by noise in comparison with X of Figure 1(a). In contrast, 3FCCM can extract the typical ingredients by selecting only the meaningful items in each cluster.

Next, the robustness of the algorithms against random initialization is studied by comparing the frequencies of the plausible solutions. Table 1 compares the frequencies of the above results in 100 trials with different random initializations and indicates that, by utilizing three-mode cooccurrence information, the proposed three-mode coclustering models are more robust to random initialization than the conventional two-mode ones. That is, the optimal selection of both items and ingredients helps reduce the influence of randomness.

Therefore, the proposed algorithms are useful in analyzing three-mode cooccurrence information because they simultaneously consider the typicality of the three elements.

4.4. Comparison with Multiple Correspondence Analysis. Finally, the partition characteristics of the proposed coclustering models are compared with those of a relational matrix decomposition method. Multiple correspondence analysis (MCA) [9] is a technique for revealing the structural information of categorical data, where mutual relations among objects and multiple categories are summarized in low-dimensional plots. In this experiment, an enlarged cross-tabulation was constructed by combining the two cooccurrence matrices R and S into the m x (n + p) matrix [[R.sup.T], S] so that the three elements are summarized in a single plot. Figure 7 shows the 2D plot given by MCA. Although MCA does not necessarily aim at an object-targeting partition, the n objects were clearly separated into three subgroups in a similar manner to the proposed coclustering models because the objects had almost crisp boundaries. However, many items and ingredients were distributed in the middle area and their contribution to the clusters was not emphasized, as in the case of the two-mode fuzzy coclustering of the previous subsection.

The proposed algorithms have advantages in handling three-mode elements by emphasizing their contributions to each cocluster. Additionally, while the implicit fuzziness degree of MCA is fixed (unchangeable), the proposed coclustering models can improve the interpretability of the cluster partition by tuning the fuzziness degrees.

5. Conclusion

In this paper, novel coclustering models were proposed for analyzing three-mode cooccurrence information with the goal being to improve the partition quality of the conventional two-modes analysis. The proposed 3FCCM and 3Fuzzy CoDoK algorithms extended the conventional FCCM and Fuzzy CoDoK algorithms by introducing an additional membership for ingredients into the aggregation degree of three elements: objects, items, and ingredients. A numerical experiment with an artificial data set demonstrated that 3FCCM is more useful in capturing the intrinsic connection among objects and ingredients while 3Fuzzy CoDoK is suitable for handling large data sets with its computational stability.

Despite the simplicity of FCM-type coclustering, FCCM and Fuzzy CoDoK are sometimes difficult to tune with respect to their fuzziness degrees. In conventional two-mode coclustering, a model induced by multinomial mixture models (MMMs) [12] showed better utility than FCCM and Fuzzy CoDoK. A potential future work is to improve the proposed FCM-type three-mode coclustering by introducing a statistical concept for easy tuning of fuzziness degrees. Another direction of future work is to develop a validity measure [13] for selecting the optimal cluster partitions.

https://doi.org/10.1155/2017/9842127

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by Tateisi Science and Technology Foundation, Japan, under Research Grant 2017.

References

[1] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, NY, USA, 1981.

[2] E. H. Ruspini, "A new approach to clustering," Information and Control, vol. 15, no. 1, pp. 22-32, 1969.

[3] C.-H. Oh, K. Honda, and H. Ichihashi, "Fuzzy clustering for categorical multivariate data," in Proceedings of the Joint 9th IFSA World Congress and 20th NAFIPS International Conference, vol. 4, pp. 2154-2159, July 2001.

[4] S. Miyamoto and M. Mukaidono, "Fuzzy c-means as a regularization and maximum entropy approach," in Proceedings of the 7th International Fuzzy Systems Association World Congress, vol. 2, pp. 86-92, 1997.

[5] S. Miyamoto, H. Ichihashi, and K. Honda, Algorithms for Fuzzy Clustering: Methods in C-Means Clustering with Applications, vol. 229 of Studies in Fuzziness and Soft Computing, Springer, Berlin, Germany, 2008.

[6] K. Kummamuru, A. Dhawale, and R. Krishnapuram, "Fuzzy coclustering of documents and keywords," in Proceedings of the IEEE International Conference on Fuzzy Systems, vol. 2, pp. 772-777, May 2003.

[7] S. Miyamoto and K. Umayahara, "Fuzzy clustering by quadratic regularization," in Proceedings of the 1998 IEEE International Conference on Fuzzy Systems, IEEE World Congress on Computational Intelligence, vol. 2, pp. 1394-1399, Anchorage, AK, USA, 1998.

[8] J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281-297, University of California Press, Berkeley, Calif, USA, 1967.

[9] M. Tenenhaus and F. W. Young, "An analysis and synthesis of multiple correspondence analysis, optimal scaling, dual scaling, homogeneity analysis and other methods for quantifying categorical multivariate data," Psychometrika, vol. 50, no. 1, pp. 91-119, 1985.

[10] D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, no. 6755, pp. 788-791, 1999.

[11] K. Honda, "Fuzzy co-clustering and application to collaborative filtering," in Integrated Uncertainty in Knowledge Modelling and Decision Making, V. N. Huynh, M. Inuiguchi, B. Le, and T. Denoeux, Eds., vol. 9978 of Lecture Notes in Computer Science, pp. 16-23, Springer International Publishing, Cham, Switzerland, 2016.

[12] K. Honda, S. Oshio, and A. Notsu, "Fuzzy co-clustering induced by multinomial mixture models," Journal of Advanced Computational Intelligence and Intelligent Informatics, vol. 19, no. 6, pp. 717-726, 2015.

[13] W. Wang and Y. Zhang, "On fuzzy cluster validity indices," Fuzzy Sets and Systems, vol. 158, no. 19, pp. 2095-2117, 2007.

Katsuhiro Honda, (1) Yurina Suzuki, (1) Seiki Ubukata, (1) and Akira Notsu (2)

(1) Graduate School of Engineering, Osaka Prefecture University, Sakai, Osaka 599-8531, Japan

(2) Graduate School of Humanities and Sustainable System Sciences, Osaka Prefecture University, Sakai, Osaka 599-8531, Japan

Correspondence should be addressed to Katsuhiro Honda; honda@cs.osakafu-u.ac.jp

Received 25 August 2017; Accepted 27 November 2017; Published 18 December 2017

Academic Editor: Ferdinando Di Martino

Caption: Figure 1: Artificial three-mode cooccurrence information data.

Caption: Figure 2: Derived memberships by proposed 3FCCM.

Caption: Figure 3: Derived memberships by proposed 3Fuzzy CoDoK.

Caption: Figure 4: Estimated intrinsic connection matrix [??] = R x S.

Caption: Figure 5: Derived memberships by conventional FCCM.

Caption: Figure 6: Derived memberships by conventional Fuzzy CoDoK.

Caption: Figure 7: 2D plots given by multiple correspondence analysis.

Table 1: Comparison of frequencies of plausible solutions in 100 trials.

Model        Algorithm       Frequency
Two-mode     FCCM            78
Two-mode     Fuzzy CoDoK     47
Three-mode   3FCCM           90
Three-mode   3Fuzzy CoDoK    100

COPYRIGHT 2018 Hindawi Limited
Title Annotation:Research Article
Author:Honda, Katsuhiro; Suzuki, Yurina; Ubukata, Seiki; Notsu, Akira
Publication:Advances in Fuzzy Systems
Article Type:Report
Date:Jan 1, 2018