Research Trend Analysis of Geospatial Information in South Korea Using Text-Mining Technology.
The term geospatial information (GI) was initially coined in Europe and is now used internationally. GI is information that expresses geographical location and attributes in a form that computers can recognize [2-4]. GI can apply to all information on the Earth's surface, underground, and in the ocean and atmosphere [5,6]. Representative examples of GI include satellite images and numerical maps. GI has maintained qualitative and quantitative growth with the development of Geographic Information Systems (GIS). Recently, GI has emerged as the core of the 4th industrial revolution and is expected to create more value by fusing with other technologies. An understanding of GI and its application fields is necessary for utilizing integrated GI; therefore, it is important to analyze research trends that can confirm this evolutionary transition and changes in academic interest in GI.
As text-mining techniques have developed, it has become possible to analyze large amounts of text data accumulated over long periods, and a technical environment suitable for analyzing academic trends has been created. Text mining is a way of processing unstructured data and analyzing patterns that are latent in a text-based database [9, 10]. It provides a means to automatically extract natural language (character information) using a mechanical algorithm. Recently, text-mining techniques have been increasingly used in various fields such as computer science, statistics, nutrition, and construction [12-14]. Hung and Zhang analyzed the abstracts of Science Citation Index (SCI)/Social Science Citation Index (SSCI) theses published between 2003 and 2008 for research trends in the field of mobile learning. He et al. analyzed Facebook and Twitter posts to examine buyer preferences for pizza chains. Lim et al. analyzed trends in GI by analyzing the frequency and time series of keywords extracted from papers and reports published in South Korea. However, prior studies have been limited in terms of analytical methods. Because their analysis is based on basic statistics of extracted keywords, only parts of the research trends can be detected. These statistical methods do not show which topics are centrally located within the research flow or how the connection structure of each theme changes over time. To address this problem, network analysis methodologies such as cocitation analysis or coword analysis have been introduced [18,19]. Cocitation analysis determines the characteristics of a document through an analysis of citation relationships between documents. However, cocitation analysis has limitations in analyzing research trends using literature units. Coword analysis is a method of detecting the relationships of keywords extracted from the literature and has advantages in research trend analysis.
In particular, coword analysis is useful in that it enables the determination of the nature and strength of relationships between keywords. Recently, network analysis has been applied in text-mining research in various fields. Kajikawa et al. performed network analysis in the field of energy research to create a road map to sustainable energy. Additionally, Kajikawa and Takeda derived promising results through an analysis of the network structure of studies of organic light-emitting diodes. Choi et al. conducted a temporal analysis of the patent network to detect changing trends in technology. Through this type of keyword network analysis, it is possible to grasp large-scale trends in Research Fields.
In this research, GI research trends were examined using basic statistical methods and network analysis. The main target keywords of this trend analysis were GI and Research Field (GI-based). Papers relating to GI published over the past 20 years (1996-2015) were screened in the Korea Citation Index (KCI). Additionally, a set of domains (GI, Research Field) was extracted from the keywords presented in each paper. Basic statistical analyses and network analysis were conducted based on these extracted sets of domains. The results of these analyses allowed us to detect large-scale trends for GI, Research Field, and their interrelationships and to present new research themes that combine GI and related research. Managers and policy makers in the field of GI need to know researchers' interests and research priorities to allocate limited resources to GI fields appropriately. Thus, our research mainly aimed to demonstrate the status of GI research trends in South Korea and determine new directions for development.
In this study, basic statistical analysis and network analysis were performed using keywords from GI-related papers. As shown in Figure 1, the research procedure was divided into three stages: data collection, preprocessing, and analysis. First, GI-related papers were collected during the data collection stage and the keywords presented in the paper were extracted. Next, during the preprocessing stage, the keywords were categorized based on the classification scheme. Finally, in the analysis stage, basic statistical analysis and network analysis were performed.
2.1. Data Collection and Preprocessing. This study was limited to papers published in South Korea. GI-related papers were collected via the KCI database (DB), which can search and collect all papers published in South Korea. We entered Geospatial Information as the search term in the KCI DB, with the collection period limited to the past 20 years (1996-2015). We excluded papers whose research objectives were system construction and services. A total of 869 papers were selected as research data. Preprocessing of the keywords was performed in two steps. In the first step, keywords relating to GI and Research Field were selected from the keywords listed in the collected papers. Keywords with similar meanings were changed to preselected keywords. For example, GIS and Geographic Information System were mapped to the same keyword. In the second step, we classified the collected keywords according to a modified keyword classification system based on criteria presented by Lee et al. and the Korea National Spatial Data Infrastructure (NSDI) Portal. In particular, NSDI provides a certified classification system for GI produced in South Korea. These classification systems resulted in 13 GI domains and 13 Research Field domains (Tables 1 and 2).
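The two preprocessing steps can be illustrated with a minimal sketch. The synonym table and the keyword-to-domain tables below are hypothetical placeholders, not the classification scheme of Tables 1 and 2, and the function names `normalize` and `classify` are assumptions introduced only for this example.

```python
# Hypothetical synonym table: variant spellings -> one preselected keyword.
SYNONYMS = {
    "geographic information system": "gis",
    "geographic information systems": "gis",
    "digital elevation model": "dem",
}

# Hypothetical keyword -> domain tables (illustrative, not Tables 1 and 2).
GI_DOMAINS = {
    "landsat": "Satellite Image",
    "dem": "General-Purpose Map",
}
RESEARCH_FIELD_DOMAINS = {
    "drought": "Climate",
    "flood": "Natural Disaster",
}


def normalize(keyword):
    """Step 1: lowercase, collapse whitespace, and merge similar keywords."""
    cleaned = " ".join(keyword.lower().split())
    return SYNONYMS.get(cleaned, cleaned)


def classify(keywords):
    """Step 2: return one (GI domain, Research Field domain) pair per paper.

    Returns None when the paper lacks a keyword on either side.
    """
    gi = field = None
    for kw in map(normalize, keywords):
        gi = gi or GI_DOMAINS.get(kw)
        field = field or RESEARCH_FIELD_DOMAINS.get(kw)
    return (gi, field) if gi and field else None


if __name__ == "__main__":
    print(normalize("Geographic  Information System"))
    print(classify(["Digital Elevation Model", "Flood"]))
```

Taking the first matching keyword on each side is a simplification chosen for the sketch; the source states only that one set of domains was extracted per paper.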
2.2. Analysis. We performed an analysis of research trends over the entire period (1996-2015) and also as divided into four periods: 1st term (1996-2000), 2nd term (2001-2005), 3rd term (2006-2010), and 4th term (2011-2015). We divided the data into four periods because the number of papers published in any single year is small. We performed basic statistical analysis, including occurrence frequency and time series development of the GI and Research Field domains, and a network analysis focusing on the frequency of simultaneous domain appearances. Basic statistical analysis yielded the schematic flow of the GI and Research Field domains. Network analysis compared the relative importance of domains and visualized the connective structure of domains. Our network analysis included the calculation of four indices: frequency, degree, closeness centrality, and betweenness centrality. Frequency indicates the number of times a domain was extracted from the papers. Degree is an index indicating how connected a particular node is to the surrounding nodes. Closeness centrality reflects the average number of links on the shortest paths from one node to every other node in the network; the fewer links needed, the higher the value. Finally, betweenness centrality is an index that measures the extent to which a specific node plays an intermediary role when constructing a network with other nodes. The scope of analysis was divided into major classification and subdivision classification analyses. Major classification analysis targeted the domains presented in Tables 1 and 2, and subdivision classification analysis targeted the keywords contained in each domain. We conducted all analyses using R and NodeXL, which are publicly available software.
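As a rough illustration of the degree, closeness centrality, and betweenness centrality indices, the following standard-library sketch computes them for a small undirected, unweighted network. It is not the R or NodeXL implementation used in the study; the closeness normalization (reachable nodes divided by total shortest-path length) and the unnormalized betweenness are assumptions, and the example links are hypothetical.

```python
from collections import deque


def build_graph(links):
    """Build an undirected adjacency map from (GI, Research Field) links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph):
    """Number of distinct neighbors of each node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def _bfs(graph, source):
    """Return distances, shortest-path counts, predecessors, and visit order."""
    dist = {source: 0}
    sigma = {node: 0 for node in graph}
    sigma[source] = 1
    preds = {node: [] for node in graph}
    order = []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    return dist, sigma, preds, order


def closeness(graph):
    """Closeness = reachable nodes / sum of shortest-path lengths to them."""
    result = {}
    for node in graph:
        dist, _, _, _ = _bfs(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Brandes' algorithm, unnormalized, for an undirected unweighted graph."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        _, sigma, preds, order = _bfs(graph, source)
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted twice.
    return {node: value / 2 for node, value in score.items()}


if __name__ == "__main__":
    # Hypothetical links, one per paper.
    links = [
        ("Satellite Image", "Climate"),
        ("Satellite Image", "Natural Disaster"),
        ("General-Purpose Map", "Climate"),
    ]
    g = build_graph(links)
    print(degree(g), closeness(g), betweenness(g), sep="\n")
```

In this toy network the two interior nodes (Satellite Image and Climate) receive the highest values on all three indices, which matches the intuition that intermediary domains score highly on betweenness.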
3.1. Major Classification Analysis
3.1.1. The GI Domain. GI-related keywords extracted from the papers were divided into 13 GI domains, which are listed in Table 1. Figure 2 shows the frequency percentage and time series frequency of the GI domains over the entire study period. GI domains with a frequency percentage equal to or greater than 10% were Satellite Image (24.1%), Natural Disaster Thematic Map (13.6%), and General-Purpose Map (11.7%) (Figure 2(a)). Among GI domains, the frequency percentage of Satellite Image was the highest, because Natural Disaster Thematic Map, General-Purpose Map, and so forth are created by processing satellite images. The rate of occurrence of Natural Disaster Thematic Map was high because interest in natural disasters such as volcanic eruptions and earthquakes has increased in South Korea since the 1990s. In particular, there has been great concern about a volcanic eruption of Mt. Baekdu, located on the Korean Peninsula. Additionally, the frequency percentage of the General-Purpose Map domain was high because it contains base maps for GIS-based spatial analysis tools such as digital maps and digital elevation models (DEMs), each of which is a separate domain (Table 1). The frequency percentage of the Water Resource Thematic Map, Biodiversity Thematic Map, and Forest Thematic Map domains gradually increased in periods 1-3; however, they sharply decreased in periods 3-4.
3.1.2. The Research Field Domain. Keywords related to Research Field were divided into 13 Research Field domains, as listed in Table 2. Figure 3 shows the frequency percentage and time series frequency of the Research Field domains for the entire period. Research Field domains with a frequency percentage equal to or greater than 10% were Climate (27%), Natural Disaster (18.6%), Urban (12%), and Water Resource (10.8%) (Figure 3(a)). The high frequency percentage of the Climate domain was due to increases in damage caused by worldwide climate change. In the time series analysis, the rate of change of the frequency of domains, excluding Climate and Natural Disaster, was nearly constant (Figure 3(b)). The frequency of the Climate domain showed the sharpest rise in the 2nd and 3rd periods and showed a tendency to decrease in the 3rd period. In the Natural Disaster domain, the rate of change of frequency increased almost constantly over periods 1-4.
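The frequency percentages and per-period frequencies reported for the GI and Research Field domains follow from simple counting. The sketch below shows one way to compute them; the period boundaries come from Section 2.2, while the function names and the rounding to one decimal place are assumptions made for illustration.

```python
from collections import Counter

# Period boundaries as described in Section 2.2.
PERIODS = {1: (1996, 2000), 2: (2001, 2005), 3: (2006, 2010), 4: (2011, 2015)}


def period_of(year):
    """Map a publication year to its period number, or None if out of range."""
    for period, (start, end) in PERIODS.items():
        if start <= year <= end:
            return period
    return None


def frequency_percentage(domains):
    """Share of each domain among all extracted domains, in percent."""
    counts = Counter(domains)
    total = sum(counts.values())
    return {d: round(100 * n / total, 1) for d, n in counts.items()}


def frequency_by_period(records):
    """Count domains per period from (year, domain) records."""
    table = {p: Counter() for p in PERIODS}
    for year, domain in records:
        p = period_of(year)
        if p is not None:
            table[p][domain] += 1
    return table


if __name__ == "__main__":
    # Hypothetical records, not the study data.
    records = [(1998, "Climate"), (2003, "Climate"), (2003, "Urban"), (2012, "Climate")]
    print(frequency_percentage(d for _, d in records))
    print(frequency_by_period(records))
```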
3.1.3. The GI-Research Field Network. Basic statistical analysis is limited to the quantitative aspect of domain frequency. Through a network analysis of simultaneous domains (GI, Research Field), it is possible to compare temporal structural changes in the network and the relative importance of each domain in the network. Table 3 shows the number of links and nodes in the network over time. An increase in the number of links and nodes means that the structure of the network is becoming more complicated. The nodes indicate the domains classified based on the lists in Tables 1 and 2. Links indicate the relationships between the GI and Research Field domains; one link (GI-Research Field) was extracted per paper. In Table 3, the number of links is the count obtained after excluding redundant (duplicate) links in the network. The value displayed in parentheses is the total number of links counted without excluding duplicates, which equals the number of papers collected during each period. The number of links increased during periods 1-3 and decreased during the 4th period. However, since the total number of links including duplicates increased continuously over periods 1-4, we cannot infer that the network scale was reduced during the 4th period.
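The distinction between links with duplicates removed and the total number of links (one per paper) can be made concrete with a short sketch. The data and the name `link_summary` are hypothetical; the counts are meant only to mirror the kind of values reported in Table 3.

```python
def link_summary(links_by_period):
    """For each period, count nodes, distinct links, and total links.

    links_by_period maps a period to a list of (GI domain, Research Field
    domain) tuples, one tuple per paper.
    """
    summary = {}
    for period, links in links_by_period.items():
        nodes = {domain for link in links for domain in link}
        summary[period] = {
            "nodes": len(nodes),
            "unique_links": len(set(links)),  # duplicates removed
            "total_links": len(links),        # equals the number of papers
        }
    return summary


if __name__ == "__main__":
    # Hypothetical example: three papers, two of them sharing the same link.
    example = {
        1: [
            ("Satellite Image", "Climate"),
            ("Satellite Image", "Climate"),
            ("General-Purpose Map", "Urban"),
        ]
    }
    print(link_summary(example))
```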
Table 4 shows the results of network analysis by time period; for this analysis, we did not distinguish between the GI and Research Field domains, to observe their integrated importance in the GI-Research Field network. The maximum number of nodes (domains) was 26 (13 for each domain) (Tables 1 and 2). We display domains with the top 10 network index (frequency, degree, closeness centrality, and betweenness centrality) values in Table 4. Italic cells in Table 4 represent the domains in which all network indices are within the top 10 in each period. These domains are relatively important within the GI-Research Field network. Therefore, it is reasonable to analyze the time series of the network around these domains. Two domains were in the top 10 for the entire period (periods 1-4): Satellite Image and Climate. Other domains repeatedly rose and fell in rank according to each period. We conclude that, irrespective of the periods, the research themes that receive steady attention are Satellite Image in the GI domain and Climate in the Research Field domain. These features are consistent with the results of our frequency analysis by period (Figures 2(b) and 3(b)). A high frequency indicates that the domain was frequently used as a research theme. The Satellite Image domain had the smallest variation in rank over time and was top ranking (1st to 3rd place) during periods 1-4. The Climate and Natural Disaster domains ranked higher than the Satellite Image domain during periods 3-4. Thus, the center of the GI-Research Field network moved from GI to Research Field. Such features were also observed in the analysis of degree, closeness centrality, and betweenness centrality indices in periods 1-4. Degree indicates how connected each node is to the surrounding nodes. At least three degree values ranked within the top 10 over the whole period. The domain with the highest degree was Climate, during the 4th period. 
In particular, the degree value of Climate rose over the four periods (1st period: 5; 2nd and 3rd period: 10; 4th period: 11). This means that an increasingly varied set of GI domains was used in Climate research. The Environmental Impact Assessment Map domain ranked in 4th place (degree: 6) in the 1st period; however, it fell well outside the top 10 during periods 2-3 and ranked 7th (degree: 8) in the 4th period. That is, the Environmental Impact Assessment Map domain may be GI that has recently regained attention. Closeness centrality is a measure of centrality in a network, calculated as the reciprocal of the sum of the lengths of the shortest paths between the node and all other nodes in the network. Therefore, in order to assess the overall flow of a network, it is necessary to investigate those nodes with high closeness centrality values. The GI domains whose closeness centrality ranked within 10th place during the entire period were the Satellite Image and General-Purpose Map domains. For this reason, we conclude that these GI domains can be used universally, across all Research Fields. In the Research Field domain, Urban and Soil recently showed a tendency to move away from the center of the network. Urban and Soil ranked in 4th to 8th place during periods 1-3 but did not rank within 10th place in the 4th period. Water Resource was the domain with the largest fluctuation in ranking by period and showed a change of 2-10 places by period. Therefore, studies on water resources based on GI show repeated increases and decreases with no clear trend. Betweenness centrality is a measure of centrality in a network based on the shortest paths. The betweenness centrality for each node is the number of these shortest paths that pass through the node. Therefore, it is reasonable to expect nodes with high betweenness centrality to mediate interdisciplinary research between nodes (Research Field domains) with different characteristics.
In particular, the Satellite Image, Climate, and General-Purpose Map domains had betweenness centrality values within the top 10 during the entire period. Therefore, it is effective to try to combine Satellite Image and GI within the Climate Research Field. The Urban Thematic Map domain ranked 1st during the 3rd period but fell to 10th place in the 4th period. Figures 4-7 show the network structure diagrams for periods 1-4. In Figures 4-7, red nodes are GI domains, and black nodes are Research Field domains. The size of each circle expresses the relative frequency of the node (domain). Dotted lines enclose nodes ranked within the top 10 in all network indices. In the 1st period, the network had a structurally simple form and the nodes were relatively small. Over periods 2-4, the structure of the network changed to a more complicated form. In particular, the network structure during the 3rd period was the most complicated because it had the largest number of links throughout the entire period (Table 3). In Figures 6 and 7, the GI domains that were outside the dotted line during the 3rd period moved inside the dotted line during the 4th period. This means that the influence of some GI domains on the network increased and indicates that interdisciplinary research based on GI will be feasible.
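The selection of domains that rank within the top 10 on every network index (the italic cells of Table 4 and the dotted outlines in Figures 4-7) amounts to intersecting the top-ranked sets of each index. The following sketch assumes each index is available as a mapping from domain to value; the alphabetical tie-breaking is an assumption, since the source does not describe how ties were handled, and the example values are hypothetical.

```python
def top_k(scores, k=10):
    """Return the k highest-scoring domains (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return set(ranked[:k])


def core_domains(indices, k=10):
    """Domains that fall within the top k of every index."""
    tops = [top_k(scores, k) for scores in indices.values()]
    return set.intersection(*tops) if tops else set()


if __name__ == "__main__":
    # Hypothetical index values for three domains, using k=2 for brevity.
    indices = {
        "frequency": {"Climate": 5, "Satellite Image": 3, "Soil": 1},
        "degree": {"Climate": 2, "Satellite Image": 1, "Soil": 3},
    }
    print(core_domains(indices, k=2))
```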
3.2. Subdivision Classification Analysis. This section presents detailed analysis results obtained by subdividing the GI domain with the highest frequency (Satellite Image). The Satellite Image domain was subdivided according to satellite type. When the number of simultaneous occurrences of a subdivided Satellite Image domain and a Research Field domain was less than 2, that set of domains was excluded from analysis. A basic statistical analysis was carried out separately for the entire period and for periods 1-4, and network analysis was conducted for the entire period.
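The exclusion rule for rarely co-occurring pairs can be sketched as a simple count-and-filter step. The function name and the example pairs are hypothetical; only the threshold of 2 comes from the text.

```python
from collections import Counter


def filter_pairs(pairs, min_count=2):
    """Keep (satellite, Research Field) pairs that co-occur at least min_count times."""
    counts = Counter(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Hypothetical pairs, one per paper.
    pairs = [("LANDSAT", "Climate"), ("LANDSAT", "Climate"), ("SPOT", "Urban")]
    print(filter_pairs(pairs))
```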
3.2.1. The Satellite Image Domain. The Satellite Image domain was divided into 12 subspecialized domains based on satellite type. Figure 8 shows the frequency over the entire study period and during each period. The frequency percentage was 10% or greater over the entire period for the satellites LANDSAT (50%), KOMPSAT (20.1%), and MODIS (10.9%) (Figure 8(a)). The high share of LANDSAT (50%) is closely related to how early LANDSAT was launched. A total of eight LANDSAT satellites were launched by 2017. LANDSAT-1 was launched in 1972. The KOMPSAT and MODIS satellites were ranked in 2nd and 3rd place and were first launched in 1999 [32, 33]. In the time series analysis, the frequency of LANDSAT was found to be high overall (Figure 8(b)). The frequency of LANDSAT sharply increased during periods 1-2, and the rate of increase decreased during periods 2-3. During the 3rd and 4th periods, its frequency decreased sharply, possibly due to an increase in the use of KOMPSAT, which was developed in South Korea, because the papers that we collected for this study were published in South Korea. A total of five KOMPSAT satellites were launched by 2017. KOMPSAT comprises four optical satellites (K-1, K-2, K-3, and K-3A) and one Synthetic Aperture Radar (SAR) satellite (K-5). KOMPSAT satellites were launched between 1999 and 2015. The frequency of KOMPSAT increased steadily over periods 1-4 and was highest within the 4th period. When considering the collection period (1996-2015) of the papers and the launch years of K-3 (2012) and K-3A (2015), we expect that the frequency of KOMPSAT will continue to increase in the future. The frequency of MODIS was lower than that of LANDSAT and KOMPSAT but continued to increase over periods 1-4. The next highest frequency was that of GEOKOMPSAT, which was launched in 2010 as the first Korean multifunction geostationary satellite. GEOKOMPSAT-2A and GEOKOMPSAT-2B will be launched from 2018 to 2019.
Therefore, we expect the frequency of GEOKOMPSAT to continue to increase. Eight satellites, including SPOT, were used as Satellite Image domains, but not at high rates, due to the difficulty and cost of obtaining their data. The dominance of LANDSAT as a keyword has been gradually decreasing for the following reasons. First, the frequency of utilization of high-resolution optical satellites is increasing. The spatial resolution of LANDSAT-8 is 15 m (panchromatic band), whereas the spatial resolution of KOMPSAT-2 is 1 m (panchromatic band). Medium-, low-, and high-resolution satellite images have advantages and disadvantages depending on the purpose of use. Therefore, it is difficult to judge whether high-resolution satellite images (KOMPSAT) are replacing medium- or low-resolution satellite images (LANDSAT). However, it is reasonable to infer that Research Fields requiring high-resolution satellite imagery are increasing in number. Second, the demand for satellite images captured by various sensor types is increasing. Earth observation satellites include optical satellites (LANDSAT, K-2), SAR satellites (K-5), and high-spectral resolution satellites (MODIS, GEOKOMPSAT). The frequency of use of optical satellites (LANDSAT) was very high during periods 1-3. However, the frequency of the SAR satellite (K-5) and high-spectral resolution satellites (MODIS, GEOKOMPSAT) gradually increased during the 4th period.
3.2.2. Research Field Based on Satellite Image. Research Field based on Satellite Image was divided into nine domains. Figure 9 shows the frequency percentage for the entire study period and for each period. The domains whose frequency percentage was over 10% for the entire period were Climate (29.3%), Natural Disaster (16.7%), Forest (10.9%), and Water Resource (10.3%) (Figure 9(a)). Overall, Research Field domains with a wide spatial extent such as Climate and Natural Disaster ranked highly, whereas Research Field domains with a narrow spatial range, such as Urban and Agriculture, were of low ranking. Figure 9(b) shows the time series frequency for periods 1-4. The frequency of the Climate domain was the highest over the entire period and rose continuously during periods 1-4. The frequency of Natural Disaster was also high and increased continuously during periods 1-4. Overall, the frequencies of the Climate, Natural Disaster, and Forest domains were high and continually increased. Conversely, the Urban, Water Resource, and Soil domains showed a tendency to decrease. These results may be due to differences in spatial extent and indicate the importance of time series change analysis, because Satellite Image is advantageous for analysis over a wide area and can detect time series changes in the region of interest.
3.2.3. The Satellite Image-Research Field Network. Table 5 and Figure 10 show the results of the Satellite Image-Research Field network analysis. We performed this analysis over the entire period because the structure of the entire network was relatively simple, such that time series analysis would hold no significant meaning. In the Satellite Image-Research Field network, 12 Satellite Image domains and nine Research Field domains were connected. The number of links connected by the Satellite Image-Research Field network was 34 in total. The top 10 domains for each index are shown in Table 5.
The results of a detailed analysis of each network index follow. The frequency of the Urban and Ocean domains was high, whereas their other network indices ranked below 10th place. These domains have relatively high frequencies but very little interaction with other domains. The frequencies of the Natural Disaster and Climate domains were 29 and 51, respectively, and the degrees were 8 and 6. In other words, in the field of Natural Disaster, we infer that many pilot studies are attempted, using various satellite images. To confirm the latest research trend using Satellite Image, an analysis of the Natural Disaster domain would be appropriate. Generally, closeness centrality will be high if the degree is high; the top 10 domains on these two indices coincided by approximately 80% in the present study. The domains with high closeness centrality were in the center of the network, such that they easily connect with all other domains. Scholars studying satellite images for the first time should search for papers in domains with high closeness centrality values and investigate the overall flow of the Satellite Image application field. A domain with a high betweenness centrality value has connections with other research themes. Therefore, when trying to fuse different Research Fields, it is reasonable to choose a domain with high betweenness centrality. Figure 10 shows the Satellite Image-Research Field network. When examining the nodes inside the dotted circle, all Research Field domains are included; however, only 41.6% of the Satellite Image domains are included, indicating that the concentration of use on a few satellites such as LANDSAT and KOMPSAT had a substantial effect. Satellite Image domains outside the dotted circle indicate pilot studies in some Research Fields.
4. Discussion and Conclusion
Recently, the GI field has grown in both quantity and quality. To increase the value of GI and apply it in various Research Fields, it is important to establish research trends. In this study, we analyzed GI research trends using various network indices. We extracted domain pairs (GI, Research Field) from GI-related papers and performed frequency analysis, time series analysis, and centrality analysis on these pairs. We also conducted major classification and subdivision classification analyses. Subdivision classification analysis was performed by subdividing the representative GI domain, calculated from the major classification.
A total of 869 papers were collected from the KCI DB, and one set of domains (GI, Research Field) was extracted from each paper. In the frequency analysis of periods 1-4, only a few domains, such as Climate and Satellite Image, showed a sustained increase. In the GI-Research Field network analysis, the Climate, Satellite Image, Natural Disaster, General-Purpose Map, and Natural Disaster Thematic Map domains had high-ranking values among all network indices during the entire study period. An analysis of detailed research trends in the Satellite Image domain showed that the LANDSAT, KOMPSAT, MODIS, Natural Disaster, Climate, and Forest domains had high-ranking values among all network indices.
In the major classification analysis, we found that the Climate domain moved to the middle of the GI-Research Field network over time. The network indices of the Climate domain continued to increase throughout periods 1-4 and, in the 4th period, it had the highest ranking among all network indices. The high values of all the network indices indicate that its accessibility to other domains is at a peak. In other words, it is effective to focus on Climate in the interdisciplinary research mediated by GI. GI domains consistently occupying the top ranks over the entire period were Satellite Image and General-Purpose Map. The Satellite Image and General-Purpose Map domains are the most common types of data used in spatial analysis; therefore, this result was expected. However, to maintain continued growth in the value of the GI field, it is necessary to further strengthen the versatility of GI. Thus, we were encouraged to see that the Environmental Impact Assessment Map and Atmosphere Thematic Map domains moved from low to high rankings during periods 3-4.
Our subdivision classification analysis of Satellite Image showed that the dominance of a few satellites in the network had a significant effect. In particular, LANDSAT displayed much higher values in all network indices. However, this phenomenon has been decreasing over time. In the time series frequency analysis, LANDSAT showed a sharp decline during periods 3-4, and KOMPSAT, MODIS, and GEOKOMPSAT showed a sustained increase over periods 1-4 (Figure 8(b)). To expand interdisciplinary research, it seems reasonable to center on the Satellite Image domain, which has a high betweenness centrality value. Linking Climate and Natural Disaster research through LANDSAT and KOMPSAT, which have high betweenness centrality, would be an efficient method. It is also necessary to derive new research topics through the combination of Satellite Image domains. Such combinations are most promising when they target Research Field domains with high betweenness centrality and closeness centrality values. For example, if researchers study natural disasters by combining KOMPSAT and ALOS, new research results may be derived.
In this research, we analyzed GI research trends using the text-mining method, with the following limitations. First, since this study examined only papers indexed by KCI, there is a limit to generalization of results to larger GI research trends. Second, we analyzed GI research trends using network indices (frequency, degree, closeness centrality, and betweenness centrality). However, for a more diverse analysis, more indicators should be analyzed. Third, we classified the keywords extracted from the papers into 26 domains to perform major classification analysis, resulting in a rather simplified analysis. Subsequent studies should overcome these limits to produce results that can be more widely generalized among Research Field and GI domains.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was conducted at Korea Environment Institute (KEI) with support from the Technology Advancement Research Program (TARP) funded by the Ministry of Land, Infrastructure and Transport of the Korean government via Grant 16CTAP-C114629-01.
 P. Folger, Geospatial Information and Geographic Information Systems (GIS): Current Issues and Future Challenges, DIANE, 2010.
 S. Fotheringham and P. Rogerson, Eds., Spatial Analysis and GIS, CRC Press, 2013.
 K.-Y. Oh, H.-S. Jung, and K.-J. Lee, "Comparison of image fusion methods to merge KOMPSAT-2 panchromatic and multispectral images," Korean Journal of Remote Sensing, vol. 28, no. 1, pp. 39-54, 2012.
 S.-W. Lee, A.-R. Song, and N.-W. Park, "Environmental impact assessment of nuclear power plant accident using spatial information modeling: a case study of chernobyl," Korean Journal of Remote Sensing, vol. 28, no. 1, pp. 129-143, 2012.
 S.-Y. Cha, U.-H. Pi, and C.-H. Park, "Mapping and estimating forest carbon absorption using time-series MODIS imagery in South Korea," Korean Journal of Remote Sensing, vol. 29, no. 5, pp. 517-525, 2013.
 N.-W. Park, "Geostatistical downscaling of coarse scale remote sensing data and integration with precise observation data for generation of fine scale thematic information," Korean Journal of Remote Sensing, vol. 29, no. 1, pp. 69-79, 2013.
 F. G. Bonham-Carter, Geographic Information Systems for Geoscientists: Modelling with GIS, vol. 13, Elsevier, 2014.
 B. Jaap, V. D. Menno, D. Sander, E. David, M. Rene, and V. O. Erik, "The fourth industrial revolution," VINT Research Report 3, SOGETI, 2014.
 R. Feldman and J. Sanger, The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data, Cambridge University Press, Cambridge, UK, 2006.
 M. W. Berry, "Survey of text mining," Computing Reviews, vol. 45, no. 9, pp. 548-554, 2004.
 M. A. van Driel, J. Bruggeman, G. Vriend, H. G. Brunner, and J. A. M. Leunissen, "A text-mining analysis of the human phenome," European Journal of Human Genetics, vol. 14, no. 5, pp. 535-542, 2006.
 J. Y. Lee, H. Kim, and P. J. Kim, "Domain analysis with text mining: analysis of digital library research trends using profiling methods," Journal of Information Science, vol. 36, no. 2, pp. 144-161, 2010.
 M. Krallinger, F. Leitner, and A. Valencia, "Analysis of biological processes and diseases using text mining approaches," Bioinformatics Methods in Clinical Research, vol. 593, pp. 341-382, 2010.
 D. Delen and M. D. Crossland, "Seeding the survey and analysis of research literature with text mining," Expert Systems with Applications, vol. 34, no. 3, pp. 1707-1720, 2008.
 J.-L. Hung and K. Zhang, "Examining mobile learning trends 2003-2008: a categorical meta-trend analysis using text mining techniques," Journal of Computing in Higher Education, vol. 24, no. 1, pp. 1-17, 2012.
 W. He, S. Zha, and L. Li, "Social media competitive analysis and text mining: a case study in the pizza industry," International Journal of Information Management, vol. 33, no. 3, pp. 464-472, 2013.
 S. Y. Lim, M. S. Yi, G. H. Jin, and D. B. Shin, "A study on the research trends in the area of geospatial-information using text-mining technique focused on national R&D reports and theses," Journal of Korea Spatial Information Society, vol. 22, no. 4, pp. 11-20, 2014.
 M. Callon, J. P. Courtial, and F. Laville, "Co-word analysis as a tool for describing the network of interactions between basic and technological research: the case of polymer chemistry," Scientometrics, vol. 22, no. 1, pp. 155-205, 1991.
 Q. He, "Knowledge discovery through co-word analysis," Library Trends, vol. 48, no. 1, p. 133, 1999.
 B. Yoon and Y. Park, "A text-mining-based patent network: analytical tool for high-technology trend," Journal of High Technology Management Research, vol. 15, no. 1, pp. 37-50, 2004.
 M. Rokaya, E. Atlam, M. Fuketa, T. C. Dorji, and J.-I. Aoe, "Ranking of field association terms using co-word analysis," Information Processing and Management, vol. 44, no. 2, pp. 738-755, 2008.
 G. A. Ronda-Pupo and L. A. Guerras-Martin, "Dynamics of the evolution of the strategy concept 1962-2008: a co-word analysis," Strategic Management Journal, vol. 33, no. 2, pp. 162-188, 2012.
 Y. Kajikawa, J. Yoshikawa, Y. Takeda, and K. Matsushima, "Tracking emerging technologies in energy research: toward a roadmap for sustainable energy," Technological Forecasting and Social Change, vol. 75, no. 6, pp. 771-782, 2008.
 Y. Kajikawa and Y. Takeda, "Citation network analysis of organic LEDs," Technological Forecasting and Social Change, vol. 76, no. 8, pp. 1115-1123, 2009.
 S. Choi, J. Yoon, K. Kim, J. Y. Lee, and C.-H. Kim, "SAO network analysis of patents for technology trends identification: a case study of polymer electrolyte membrane technology in proton exchange membrane fuel cells," Scientometrics, vol. 88, no. 3, pp. 863-883, 2011.
 M. S. Lee, C. H. Lee, and J. Y. Kim, Big Data Analysis on Demands for Environmental Policies, Korea Environmental Institute, 2014.
 Korea National Spatial Data Infrastructure Portal, http://www.nsdi.go.kr.
 NodeXL, https://www.microsoft.com/en-us/research/project/nodexl-network-overview-discovery-and-exploration-in-excel/.
 A. Paone and S. H. Yun, "Pyroclastic density current hazards at the Baekdusan volcano, Korea: analyses of several scenarios from a small-case to the worst-case colossal eruption," in Updates in Volcanology-From Volcano Modelling to Volcano Geology, InTech, 2016.
 Intergovernmental Panel on Climate Change (IPCC), Climate Change 2014: Impacts, Adaptation, and Vulnerability, Cambridge University Press, Cambridge, UK, 2014.
 D. P. Roy, M. A. Wulder, T. R. Loveland et al., "Landsat-8: science and product vision for terrestrial global change research," Remote Sensing of Environment, vol. 145, pp. 154-172, 2014.
 L. Hoonyol and H. Hyangsun, "Evaluation of SSM/I and AMSR-E sea ice concentrations in the Antarctic spring using KOMPSAT-1 EOC images," IEEE Transactions on Geoscience and Remote Sensing, vol. 46, no. 7, pp. 1905-1912, 2008.
 Terra (MODIS), https://terra.nasa.gov/about/terra-instruments/modis.
 ESA Earth Online (KOMPSAT), https://earth.esa.int/web/guest/missions/3rd-party-missions/current-missions/kompsat-2.
 J. Kim, M. Kim, M. Choi et al., "Monitoring atmospheric composition by GEO-KOMPSAT-1 and 2: GOCI, MI and GEMS," in Proceedings of the 36th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2016, pp. 4084-4086, China, July 2016.
Kwan-Young Oh (1) and Moung-Jin Lee (2)
(1) Korea Aerospace Research Institute (KARI), Daejeon, Republic of Korea
(2) Korea Environment Institute (KEI), Sejong, Republic of Korea
Correspondence should be addressed to Moung-Jin Lee; firstname.lastname@example.org
Received 26 April 2017; Accepted 2 July 2017; Published 17 August 2017
Academic Editor: Lei Zhang
Caption: FIGURE 1: Research procedure.
Caption: FIGURE 2: Frequency percentage (a) and time-dependent frequency change (b) of the GI domain.
Caption: FIGURE 3: Frequency percentage (a) and time-dependent frequency change (b) of the Research Field domain.
Caption: FIGURE 4: GI-Research Field network, 1st period (1996-2000). The red nodes are GI domains, and black nodes are Research Field domains.
Caption: FIGURE 5: GI-Research Field network, 2nd period (2001-2005). The red nodes are GI domains, and black nodes are Research Field domains.
Caption: FIGURE 6: GI-Research Field network, 3rd period (2006-2010). The red nodes are GI domains, and black nodes are Research Field domains.
Caption: FIGURE 7: GI-Research Field network, 4th period (2011-2015). The red nodes are GI domains, and black nodes are Research Field domains.
Caption: FIGURE 8: Frequency percentage (a) and time-dependent frequency change (b) of the Satellite Image domain.
Caption: FIGURE 9: Frequency percentage (a) and time-dependent frequency change (b) of the Research Field domain, based on Satellite Image.
Caption: FIGURE 10: Satellite Image-Research Field network (1996-2015). The red nodes are Satellite Image domains, and black nodes are Research Field domains.
TABLE 1: Geospatial Information (GI) domains.

| Domain | Keyword examples |
|---|---|
| Satellite Image | KOMPSAT, GEOKOMPSAT, LANDSAT, MODIS, SPOT, Cosmo-SkyMed, ALOS, IRS |
| Aerial Image | Aircraft, Unmanned Aerial Vehicle (UAV), Drone, Lidar |
| General-Purpose Map | World map, Digital Elevation Model (DEM), Digital map, Electronic map |
| Environmental Impact Assessment | Land cover map, Land use map, Biotope map, Soil map |
| Biodiversity Thematic Map | Biodiversity census data, Ecological zoning map |
| Agriculture Thematic Map | Agricultural census data, Farm Program Atlas, Agricultural land use |
| Atmosphere Thematic Map | Climatic map, Air temperature, Amount of rainfall |
| Natural Disaster Thematic Map | Disaster census data, Landslide, Earthquake, Flood, Volcano |
| Forest Thematic Map | Forest classification, Wood age, Forest location |
| Ocean Thematic Map | Cadastral map, Water level (Tidal), Ocean-uses atlas, Sea surface temperature |
| Urban Thematic Map | Population density, Built-up area, Cultural facilities |
| Transportation Thematic Map | Rail, Road, Airport, Street |
| Water Resource Thematic Map | Water census data, Water supply facilities, Groundwater, Water resource |

TABLE 2: Research Field domains.

| Domain | Keyword examples |
|---|---|
| Natural Disaster | Landslide, Earthquake, Flood, Volcano, Forest fire |
| Agriculture | Agriculture ecosystem, Food science, Agricultural policy, Pest control |
| Forest | Forest cover loss, Forest ecology, Forest biomass waste, Forest canopy |
| Urban | Heat island, Urban growth, Surface temperature, Development density |
| Air Quality | Air pollution, Yellow dust (Asian Dust), Exhaust gas, Smog |
| Water Resource | Water pollution, Water quality, Base flow, Stream flow |
| Biodiversity | Species Diversity, Terrestrial species, Oceanic species, Habitat destruction |
| Climate | Greenhouse gas, Global warming, Carbon dioxide, Drought, Typhoon |
| Health | Carcinogen, Pathogen, Bacteria, Heat stroke, Respiratory disease |
| Noise | Noise pollution, Soundproofing, Hearing loss, Sleep disturbance |
| Ocean | Mudflat, Coast, Sea level, Red tide, Marine litter |
| Soil | Soil pollution, Land degradation, Pesticide, Herbicide |
| Waste | Hazardous waste, Wastewater, Bodily wastes, Recycling |

TABLE 3: Number of links and nodes (1996-2015).

| Division | 1st (1996-2000) | 2nd (2001-2005) | 3rd (2006-2010) | 4th (2011-2015) |
|---|---|---|---|---|
| Number of GI nodes (domain) | 10 | 13 | 12 | 13 |
| Number of Research Field nodes (domain) | 9 | 11 | 13 | 11 |
| Number of links (total) | 33 (58) | 57 (182) | 74 (306) | 71 (323) |

TABLE 4: Results of GI-Research Field network analysis (periods 1-4). Italic cells represent the domains in which all network indices are within the top 10 in each period. Each cell gives the domain followed by its index value in parentheses.

Frequency

| Rank | Stage 1 (1996-2000) | Stage 2 (2001-2005) | Stage 3 (2006-2010) | Stage 4 (2011-2015) |
|---|---|---|---|---|
| 1 | Satellite Image (14) | Satellite Image (47) | Climate (93) | Climate (90) |
| 2 | Natural Disaster (11) | Climate (42) | Satellite Image (73) | Natural Disaster (77) |
| 3 | Climate (10) | Water Resource (27) | Natural Disaster (52) | Satellite Image (75) |
| 4 | Urban (10) | Urban (26) | Water Resource Thematic Map (39) | Natural Disaster Thematic Map (50) |
| 5 | Water Resource (9) | Natural Disaster Thematic Map (23) | Natural Disaster Thematic Map (37) | General-Purpose Map (45) |
| 6 | Urban Thematic Map (9) | Natural Disaster (22) | Water Resource (36) | Atmosphere Thematic Map (42) |
| 7 | Natural Disaster Thematic Map (8) | Soil (21) | Urban (35) | Urban (33) |
| 8 | Environmental Impact Assessment Map (6) | Water Resource Thematic Map (20) | General-Purpose Map (32) | Urban Thematic Map (28) |
| 9 | General-Purpose Map (6) | General-Purpose Map (19) | Soil (25) | Environmental Impact Assessment Map (25) |
| 10 | Forest (5) | Environmental Impact Assessment Map (18) | Atmosphere Thematic Map (25) | Biodiversity (25) |

Degree

| Rank | Stage 1 (1996-2000) | Stage 2 (2001-2005) | Stage 3 (2006-2010) | Stage 4 (2011-2015) |
|---|---|---|---|---|
| 1 | Satellite Image (7) | Satellite Image (10) | Climate (10) | Climate (11) |
| 2 | Natural Disaster (6) | Climate (10) | Satellite Image (9) | Satellite Image (10) |
| 3 | Water Resource (6) | General-Purpose Map (8) | Natural Disaster (9) | Natural Disaster (8) |
| 4 | Environmental Impact Assessment Map (6) | Soil (8) | Water Resource (9) | Natural Disaster Thematic Map (8) |
| 5 | Climate (5) | Natural Disaster (7) | General-Purpose Map (9) | General-Purpose Map (8) |
| 6 | Urban (5) | Urban (7) | Urban (8) | Atmosphere Thematic Map (8) |
| 7 | General-Purpose Map (4) | Natural Disaster Thematic Map (6) | Soil (8) | Environmental Impact Assessment Map (8) |
| 8 | Natural Disaster Thematic Map (4) | Atmosphere Thematic Map (6) | Urban Thematic Map (8) | Water Resource (8) |
| 9 | Biodiversity (3) | Water Resource (5) | Water Resource Thematic Map (7) | Forest (8) |
| 10 | Air Quality (3) | Forest (5) | Natural Disaster Thematic Map (7) | Biodiversity (7) |

Closeness centrality

| Rank | Stage 1 (1996-2000) | Stage 2 (2001-2005) | Stage 3 (2006-2010) | Stage 4 (2011-2015) |
|---|---|---|---|---|
| 1 | Satellite Image (0.032) | Satellite Image (0.027) | Climate (0.025) | Climate (0.027) |
| 2 | Natural Disaster (0.029) | Climate (0.026) | Natural Disaster (0.024) | Satellite Image (0.027) |
| 3 | Water Resource (0.029) | General-Purpose Map (0.023) | Water Resource (0.024) | Natural Disaster Thematic Map (0.024) |
| 4 | Environmental Impact Assessment Map (0.029) | Soil (0.022) | Satellite Image (0.023) | Atmosphere Thematic Map (0.024) |
| 5 | Climate (0.028) | Natural Disaster (0.022) | General-Purpose Map (0.023) | Natural Disaster (0.023) |
| 6 | Urban (0.028) | Natural Disaster Thematic Map (0.022) | Urban (0.023) | General-Purpose Map (0.023) |
| 7 | General-Purpose Map (0.026) | Atmosphere Thematic Map (0.021) | Urban Thematic Map (0.021) | Environmental Impact Assessment Map (0.023) |
| 8 | Natural Disaster Thematic Map (0.024) | Urban (0.021) | Aerial Image (0.021) | Water Resource (0.023) |
| 9 | Biodiversity (0.023) | Agriculture (0.020) | Soil (0.021) | Forest (0.023) |
| 10 | Urban Thematic Map (0.023) | Water Resource (0.020) | Water Resource Thematic Map (0.020) | Biodiversity (0.022) |

Betweenness centrality

| Rank | Stage 1 (1996-2000) | Stage 2 (2001-2005) | Stage 3 (2006-2010) | Stage 4 (2011-2015) |
|---|---|---|---|---|
| 1 | Satellite Image (100.68) | Satellite Image (120.77) | Urban Thematic Map (109.99) | Climate (91.07) |
| 2 | Environmental Impact Assessment Map (72.92) | Climate (105.91) | General-Purpose Map (63.73) | Satellite Image (64.6) |
| 3 | Natural Disaster (51.9) | Atmosphere Thematic Map (69.36) | Climate (63.19) | Forest (59.87) |
| 4 | Climate (48.29) | Soil (65.66) | Satellite Image (54.21) | Agriculture (52.88) |
| 5 | Water Resource (45.28) | General-Purpose Map (50.74) | Natural Disaster (51.15) | Atmosphere Thematic Map (44.9) |
| 6 | Urban (44.4) | Agriculture (50.23) | Water Resource (47.83) | Environmental Impact Assessment Map (38.63) |
| 7 | Biodiversity (35.63) | Natural Disaster (45.67) | Ocean (47.38) | General-Purpose Map (33.8) |
| 8 | General-Purpose Map (23.85) | Natural Disaster Thematic Map (35.93) | Urban (46.38) | Natural Disaster Thematic Map (32.03) |
| 9 | Natural Disaster Thematic Map (15.24) | Urban (33.84) | Aerial Image (30.87) | Biodiversity (26.95) |
| 10 | Air Quality (12.75) | Ocean (17) | Water Resource Thematic Map (26.92) | Urban Thematic Map (26.51) |

TABLE 5: Results of network analysis (1996-2015). Italic cells represent the domains in which all network indices are within the top 10 in each period. Each cell gives the domain followed by its index value in parentheses.

| Rank | Frequency | Degree | Closeness centrality | Betweenness centrality |
|---|---|---|---|---|
| 1 | LANDSAT (87) | LANDSAT (9) | LANDSAT (0.032) | LANDSAT (140.804) |
| 2 | Climate (51) | Natural Disaster (8) | KOMPSAT (0.029) | Natural Disaster (136.670) |
| 3 | KOMPSAT (35) | KOMPSAT (7) | Natural Disaster (0.028) | Climate (86.810) |
| 4 | Natural Disaster (29) | Climate (6) | Climate (0.025) | KOMPSAT (72.887) |
| 5 | Forest (19) | Forest (5) | MODIS (0.024) | Forest (43.886) |
| 6 | MODIS (19) | MODIS (5) | Forest (0.024) | Agriculture (38.758) |
| 7 | Water Resource (18) | Soil (3) | Soil (0.022) | MODIS (30.487) |
| 8 | Urban (17) | Biodiversity (3) | Biodiversity (0.022) | GEOKOMPSAT (15.540) |
| 9 | Ocean (12) | Agriculture (3) | Agriculture (0.022) | Water Resource (8.645) |
| 10 | GEOKOMPSAT (12) | GEOKOMPSAT (3) | GEOKOMPSAT (0.021) | SPOT (4.032) |
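The degree, closeness centrality, and betweenness centrality indices reported in Tables 4 and 5 can be illustrated on a toy co-word network. The following is a minimal Python sketch under stated assumptions, not the authors' workflow: the reference list cites NodeXL, which suggests the published values came from that tool, whereas this sketch uses only the standard library. The edge list `EDGES` is hypothetical, closeness is assumed to be the reciprocal of the summed shortest-path distances, and betweenness is assumed to be the unnormalized count over unordered node pairs; neither definition is stated in the tables themselves.

```python
from collections import deque

# Hypothetical GI domain / Research Field links, for illustration only.
EDGES = [
    ("Satellite Image", "Climate"),
    ("Satellite Image", "Urban"),
    ("Satellite Image", "Forest"),
    ("General-Purpose Map", "Urban"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph):
    """Number of distinct neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness(graph):
    """Assumed definition: 1 / sum of shortest-path distances to reachable nodes."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = 1.0 / total if total else 0.0
    return result


def betweenness(graph):
    """Unnormalized betweenness for an undirected graph (Brandes' algorithm)."""
    score = {node: 0.0 for node in graph}
    for s in graph:
        stack = []
        preds = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in graph}
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {node: value / 2 for node, value in score.items()}


if __name__ == "__main__":
    g = build_graph(EDGES)
    print(degree(g))
    print(closeness(g))
    print(betweenness(g))
```

On this toy network, Satellite Image has degree 3, closeness 0.2, and betweenness 5.0, because every shortest path between the other four nodes except those ending at Urban's only other neighbour passes through it. The closeness function ignores unreachable nodes, so it is only meaningful for a connected network.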
Title Annotation: Research Article
Author: Oh, Kwan-Young; Lee, Moung-Jin
Publication: Journal of Sensors
Date: Jan 1, 2017