Disrupting the data landscape again with linked open data.
According to David Price, managing director of TopQuadrant (topquadrant.com): "Big data shattered the paradigm where organizations would only use methods designed for a single relational database. They suddenly needed unstructured data analysis for big data, along with relational methods to find the proverbial needle in the haystack and a linked data approach to connect things and see how they relate."
Subsequently, linked data--manifested as both linked open data and its private sector counterpart, linked enterprise data--has enjoyed a resurgence so utilitarian that it could very well disrupt the data landscape again. Machine-readable public data sources have proliferated across verticals, were prioritized by the federal government during the previous presidential administration and are regularly aggregated with proprietary sources for a holistic view of organizational interests.
Such activity, however, only hints at the core functionality for which linked data is prized after the onset of the big data epoch and does not begin to enumerate the true value of its interoperability across the data ecosystem. Specifically, the linked data approach is rapidly gaining credence throughout the public and private sectors for its ability to:
* share data--Linked data is readily exchanged across any assortment of hardware systems, applications and use cases in a singular fashion, greatly simplifying and expediting integration while largely automating analytics prerequisites such as transformation. Semantic web founder Tim Berners-Lee stated, "With linked data, when you have some of it, you can find other related data."
* improve governance--Predicated on uniform modeling conventions known as ontologies, linked data technologies clarify data's meaning according to uniform standards that naturally evolve at the pace of business. The resulting heightened governance capabilities are ideal for ensuring compliance in heavily regulated industries such as finance or healthcare.
* implement machine intelligence--Linked data's inherent machine-readable nature substantially impacts the enterprise's ability to scale at speeds commensurate with real-time big data ingestion and is timely for the rejuvenated interest in artificial intelligence.
* unify silos--The ultimate boon of the linked data approach is its ability to end the silo-based culture rampant throughout the data sphere, which fundamentally delays time to insight and action, increases costs and renders data's meaning abstruse.
Those reasons, and others, are why many organizations in both commercial and public spaces are realizing that "it's worth an investment to take the data out of the individual applications and make it available in a more mutual, standard format," Price says. "Applications are going to come and go but if they've got this standard format, it helps them manage these long lifecycles in a better way."
Linked open data's sustainability
The most visible incarnation of linked data's increasingly conspicuous presence, and the one that is most profoundly reshaping the data sphere, is the public sector's linked open data effort. Largely fueled by the need to swiftly integrate and share data across different entities in a sustainable manner, linked open data is the most viable means for effecting such exchanges among and within different countries, languages, sectors and organizations. The crux of the linked data approach is the set of uniform standards from the World Wide Web Consortium (W3C, w3.org), which harmonize data regardless of structure, source or type. Those all-encompassing, evolving data models and standardized vocabularies impose consistency on disparate data, fostering the trust that data-driven applications in the public sector require. The growing list of linked open data sources includes data from the U.S. Congress (congress.gov), World Bank (worldbank.org), British Geological Survey (bgs.ac.uk), U.S. Securities and Exchange Commission (sec.gov) and linked sensor data sources.
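The harmonizing effect described above can be illustrated with a minimal sketch. It assumes data is expressed as subject-predicate-object triples and uses plain Python tuples rather than any particular linked data library. Every URI and value below is made up for illustration and does not come from any of the sources named in this article.

```python
# Hypothetical identifiers under example.org; none come from a real data source.
ROAD = "http://example.org/road/R1"
VOCAB = "http://example.org/vocab/"

# Source A: a road owner's records as (subject, predicate, object) triples.
source_a = {
    (ROAD, VOCAB + "lengthKm", "12"),
    (ROAD, VOCAB + "ownedBy", "http://example.org/org/RoadOwner"),
}

# Source B: a contractor's records that use the same identifier for the road.
source_b = {
    (ROAD, VOCAB + "underConstruction", "true"),
    (ROAD, VOCAB + "ownedBy", "http://example.org/org/RoadOwner"),
}


def merge(*graphs):
    """Merge graphs by set union; identical triples collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


combined = merge(source_a, source_b)
for predicate, obj in describe(combined, ROAD):
    print(predicate, "->", obj)
```

Because both sources share one identifier for the road and one vocabulary for its properties, combining them is a set union. No schema mapping or transformation step is needed in this simplified case.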
For the past several months, Price has been assisting both the Swedish and Dutch national road authorities with the European Virtual Construction for Roads (V-Con) project, an intricate exchange of linked data pertaining to road network management, updates and construction in Sweden and the Netherlands. The implementation of uniform, semantic standards within and, in certain cases, between those countries is a multifaceted process targeted at the organizational, national and industrywide levels. The standards-based approach of linked data is the most viable means of sharing data among the construction companies, government entities that own the roads and their national governments. Most importantly, the utilization of semantic technologies ensures the long-term continuity of the underlying data vital to that construction work, regardless of shifting personnel, infrastructure or tools.
Such enduring relevance of data and their technologies in the rapidly shifting world of IT, in which legacy systems quickly become outdated, is a cardinal virtue of linked data--and all but impossible with any other approach. On the continuous value linked data produces in this respect, TopQuadrant CEO Irene Polikoff says, "The long-term sustainability and reuse of linked data processes is one of the reasons it has become more attractive today, because otherwise you would have to do everything over: the schema, the modeling and transformation. Constantly doing that work takes too long and proves much too expensive over time." Ensuing benefits include decreased total cost of ownership, a lasting means of leveraging data assets and a drastic reduction in the instances of legacy silos.
Standards-based, 3-D modeling
The longevity of linked data stems from the implementation of consistent meaning across systems--regardless of type or source--largely because of its standardized models. Those ontologies are not only responsible for modeling all data in a uniform way, but also naturally evolve to include additional requirements or data types in a singular method without the inordinately lengthy recalibration of schema required of relational models. Furthermore, those semantic models can be implemented at varying levels to ensure the consistency vital to expedient data exchange and indefinite reuse at scale. As such, a large part of Price's work with V-Con has involved "helping organizations use our tools to create models with 3-D capabilities to share data between the different organizations."
The totality of the expressiveness of those models is not only ascribed to their 3-D attributes, but also to the various standards to which they require data to conform. Price described the latter as a "layering of multiple ontologies according to the different countries, their subsets and in some cases, for each organization." The merit of the linked data approach in this regard is that the data still preserve their defined meaning at each of the respective levels, yet are quickly exchanged between and understood by the different IT systems found therein. Moreover, the expressiveness of the ontologies is considerably enriched by "the 3-D characteristics, which are extracted from the models," Price explains. "We've added a widget to the browser that users can click on and actually see a bridge or a road."
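The layering Price describes can be sketched in simplified form. The example below assumes each layer contributes subclass statements to a shared hierarchy, so that a term defined by a single organization still resolves to a term every party understands. The class names are hypothetical and are not taken from V-Con, IFC or any TopQuadrant product.

```python
# Each layer maps a class to its parent class. All names are illustrative only.
industry_layer = {"Bridge": "Structure", "Road": "Structure"}
national_layer = {"MotorwayBridge": "Bridge"}
organization_layer = {"InspectedMotorwayBridge": "MotorwayBridge"}


def combine(*layers):
    """Stack layers into one hierarchy; later layers extend earlier ones."""
    hierarchy = {}
    for layer in layers:
        hierarchy.update(layer)
    return hierarchy


def ancestors(cls, hierarchy):
    """Walk upward from a class and return its parents, nearest first."""
    chain = []
    seen = {cls}
    while cls in hierarchy:
        cls = hierarchy[cls]
        if cls in seen:  # guard against accidental cycles in the data
            break
        seen.add(cls)
        chain.append(cls)
    return chain


hierarchy = combine(industry_layer, national_layer, organization_layer)
print(ancestors("InspectedMotorwayBridge", hierarchy))
```

A system that only knows the industry layer can still treat the organization's specialized class as a `Bridge`, while the organization keeps its more precise meaning. Real ontology languages offer far richer constructs than this single-parent hierarchy.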
Other than the immediacy with which it enables data to be exchanged and comprehended, the capital value proposition of linked data is the singular continuity it provides for long-term investment in data-driven processes. The uniformity of its standards all but makes obsolete--although it can certainly work in conjunction with--legacy systems and the habitual need to overhaul them with more modern infrastructure that alienates time-honored data. In addition to rapidly becoming the de facto means of publishing data for public consumption, linked data is garnering even more traction within organizations in the private sector for those long-term benefits. A primary driver for that movement is the increasing prevalence of vertical industry standards--a definite means of not only accessing and analyzing data relevant to a specific vertical, but also of exchanging data between organizations (and perhaps even customers) within it.
Industry Foundation Classes (IFC) is the standardized ontology within the construction field that is responsible for the exchange of data between countries in the V-Con project. Additional industry-specific ontologies include the Clinical Data Interchange Standards Consortium (CDISC, cdisc.org) standards for clinical trials and the Financial Industry Business Ontology (FIBO) in financial services. "We're definitely seeing a growing interest in industry standards and in FIBO in particular," Polikoff says. "In financial services, many companies have one silo for a business glossary, another for security purposes and others for everything else. There is an emerging need to connect them and one of the drivers is regulatory compliance."
Linked enterprise data
The realities of regulatory compliance, coupled with the increasingly stringent penalties for non-compliance, serve as the primary impetus for the implementation of linked enterprise data within the private sector. Quickly realized gains in that regard include improved provenance and data lineage, as well as holistic management of information assets and their use according to industrywide regulations. Furthermore, adherence to governance practices is readily discernible for more effective data stewardship that provides additional oversight for compliance issues. "A lot of large banks are now using a linked data approach for data governance and data management, mostly because they are accountable for so many regulatory requirements," Polikoff says.
The expressivity of linked enterprise data delivers even greater yield to organizations with its innate understanding of the relationships between data, especially when that data represents the totality of an organization's information assets. The standardized modeling of all enterprise data on a semantic RDF graph delivers insight into the most minute relationships between data elements, enabling organizations to contextualize data assets that might otherwise seem unrelated. The result is a greater understanding of one's data, realized through a revamped data discovery process that draws on all enterprise data for deeper analysis and, in turn, improves ROI.
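As a rough illustration of how a graph surfaces relationships between assets that seem unrelated, the sketch below runs a breadth-first search over a handful of triples. It assumes a toy data set held in plain Python rather than a real RDF store, and every asset name in it is hypothetical.

```python
from collections import deque

# Hypothetical enterprise assets linked as (subject, predicate, object) triples.
triples = [
    ("GlossaryTerm:Counterparty", "definedIn", "BusinessGlossary"),
    ("Column:CPTY_ID", "represents", "GlossaryTerm:Counterparty"),
    ("Column:CPTY_ID", "partOf", "Table:TRADES"),
    ("Report:RiskExposure", "readsFrom", "Table:TRADES"),
]


def find_path(triples, start, goal):
    """Return the shortest chain of linked nodes from start to goal, or None."""
    # Treat every triple as an undirected edge between subject and object.
    adjacency = {}
    for subject, _predicate, obj in triples:
        adjacency.setdefault(subject, set()).add(obj)
        adjacency.setdefault(obj, set()).add(subject)

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(adjacency.get(path[-1], ())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = find_path(triples, "Report:RiskExposure", "GlossaryTerm:Counterparty")
print(" -> ".join(path))
```

Here a report and a glossary term, which might live in separate silos, turn out to be connected through a table and one of its columns.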
Another fundamental trait of linked data that provides considerable utility within both private and public sector applications is its intrinsically machine-readable nature. On the one hand, that quality directly translates to an ability to scale--at rapid velocities--on sets of big data that might otherwise prove too costly to manage. "The means to scale is one of the foremost advantages of this approach because it accelerates processes that would otherwise take too many financial and temporal resources to do," Polikoff says. On the other hand, the underlying semantic technology that makes linked data machine-readable is also a core component of machine intelligence and artificial intelligence, enabling organizations to merge linked data seamlessly into the applications of their choice.
Of all the capabilities for linked open data and linked enterprise data to impact the future of the data sphere, the most irrefutable may well be the penchant for permanently abolishing silos. Linking data is far from synonymous with granting all systems access to each and every coveted node; governance and even security protocols can be implemented according to semantic triples to limit who can view what in accordance with organizational dictates. Still, the greater utility comes from the means of data interoperability among diverse systems at a pace commensurate with that of the modern business climate, whether in the public or private sector. The allocation of resources for maintaining, modeling, integrating, transforming and preparing data for individual legacy systems or specific data marts is much too costly when compared to linked data's efficient alternative of seamlessly harmonizing data between them. The additional benefits of scale, speed, governance and regulatory compliance, longstanding sustainability and machine identifiers certainly make this methodology primed to conquer the copious amounts of unstructured big data with which the world at large is contending.
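The point about limiting who can view what at the level of triples can be sketched as follows. This is only one possible approach, assuming a policy that grants each role a set of permitted predicates. The roles, predicates and values are hypothetical, and real products may enforce access quite differently.

```python
# Hypothetical data and policy; nothing here reflects a specific product.
triples = [
    ("customer:42", "hasName", "A. Example"),
    ("customer:42", "hasAccount", "account:7"),
    ("account:7", "hasBalance", "1000"),
]

# Each role may see only triples whose predicate is listed for it.
policy = {
    "analyst": {"hasAccount"},
    "auditor": {"hasName", "hasAccount", "hasBalance"},
}


def visible(triples, role, policy):
    """Return the triples a role may see; unknown roles see nothing."""
    allowed = policy.get(role, set())
    return [triple for triple in triples if triple[1] in allowed]


for role in ("analyst", "auditor", "guest"):
    print(role, "sees", len(visible(triples, role, policy)), "triple(s)")
```

The data stays linked in one graph, yet each role receives only the slice its policy permits, which is the distinction the paragraph above draws between linking data and exposing all of it.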
The future of data management may not bring ubiquitous or pervasive computing in which all IT systems are linked, but there is growing confidence that it will entail linked data.
By Jelani Harper
Jelani Harper is an editorial consultant servicing the information technology market, specializing in data-driven applications focused on semantic technologies, data governance and analytics. E-mail: firstname.lastname@example.org.
Date: Mar 1, 2017