We got you (un)covered.

Sometimes, what you don't know can hurt, or at least hinder, your organization. A column from EContent's June issue looks at dark data--information that lurks underneath the surface and can become a boon or a bust. An article from the May issue of The Information Advisor's Guide to Internet Research provides researchers with a helpful resource for discovering the "how" of a range of topics.

The Dark Side

The dark side doesn't just refer to the Dark Side of the Force. As Erik J. Martin explains in his Trending Topics article for EContent ("Dark Data: Analyzing Unused and Ignored Information"), "Dark data is information, collected as a function of an organization's normal operations, that is rarely or never analyzed or used to make intelligent business decisions." Dark data, aka "data exhaust," is often overlooked, which is unfortunate, as what it contains may be valuable. And what's not of value may be draining company resources, such as money, digital storage space, and security.

Martin found that more than 50% of the world's stored data is dark. The report he quotes predicts that by 2020, if left unchecked, this vast entity of wasted data will collectively cost organizations $5.2 trillion to manage. Wow! No wonder Martin thinks the time has come to shed some light on this data black hole.

So why does so much data go dark to begin with? As Martin discovered, it's because organizations are failing to keep up with the amount of data being generated due to a lack of tools to integrate, manage, and analyze it. However, Martin notes that companies willing to make the effort to "overcome their dark data malaise often outpace their competition in top-line revenue, growth, and efficiency." Here's how they do it.

Organizations need to determine how their dark data can be used to their advantage. By capturing clickstream and scrollstream data, a digital publishing company can get user feedback about how the audience views contributors' work. Do most users stop at a certain paragraph and then leave the page? Are certain titles getting more clicks? Analyzing this data can help an organization make its content platform "stickier."
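To make the "stop at a certain paragraph" question concrete, here is a minimal sketch of how scrollstream data might be tallied. It assumes each reading session has already been reduced to the deepest paragraph number the reader reached. That input format, the function name, and the sample numbers are all hypothetical and do not come from Martin's article.

```python
from collections import Counter

def biggest_drop_off(last_paragraph_seen):
    """Return (paragraph, sessions) for the paragraph where most sessions ended.

    `last_paragraph_seen` is an assumed input: one integer per session,
    giving the deepest paragraph that session scrolled to.
    """
    if not last_paragraph_seen:
        return None  # no sessions recorded, nothing to report
    counts = Counter(last_paragraph_seen)
    # most_common(1) yields a one-item list holding the top (value, count) pair
    return counts.most_common(1)[0]

# Hypothetical sample: eight sessions on one article
sample = [3, 3, 7, 3, 12, 3, 7, 1]
print(biggest_drop_off(sample))  # (3, 4): four of eight sessions ended at paragraph 3
```

A result like this would point an editor toward the paragraph worth reworking, though real clickstream tooling would also need to account for page length and session quality.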

The flip side of dark data is that without ongoing oversight, it can weaken a company's security. Dark data provides hackers with more ways in, and leaked, breached, or stolen data can damage a company's reputation and competitive strength. Gaining visibility into dark content--verifying what you have, where it resides, who can access it, and when it was last viewed--is a must. This enables an organization to classify legacy data to give it meaning going forward. You can divvy up data so its value can be assessed; anything not worthwhile can be deleted. You can also set up retention policies to automate how data will be handled henceforth.
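The visibility checklist above (what you have, where it resides, who can access it, and when it was last viewed) can be roughed out for an ordinary file share with a few lines of Python. This is only a sketch under stated assumptions: it treats "dark" as "not accessed in N days," it reports POSIX-style permission strings rather than named users, and it relies on the filesystem's last-access timestamp, which some systems are configured not to update. The function names and the one-year threshold are illustrative.

```python
import os
import stat
import time

def inventory(root):
    """Walk `root` and record path, size, permissions, and last access time."""
    records = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            records.append({
                "path": path,                          # where it resides
                "bytes": info.st_size,                 # how much there is
                "mode": stat.filemode(info.st_mode),   # who can access it
                "last_access": info.st_atime,          # when it was last viewed
            })
    return records

def stale(records, max_age_days, now=None):
    """Return the records not accessed within `max_age_days` (assumed policy)."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return [r for r in records if r["last_access"] < cutoff]

if __name__ == "__main__":
    for record in stale(inventory("."), max_age_days=365):
        print(record["path"], record["bytes"], record["mode"])
```

The list that `stale` returns is a starting point for review, not a deletion list; a retention policy would still need a person to decide what has value.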

Martin recommends tools for battling dark data: Hadoop can break data down into chunks that can be more easily managed and studied. Using metadata enables you to cross-reference, link, and curate information so its usefulness can be revealed. Another aid in sifting through dark data is a file analysis tool, as putting data in context leads to enhanced decision making and knowing what data to archive and what to get rid of.

Covering the Basics

Sometimes, writers just need a little rudimentary information to move a story along. Ran Hock's article, "HowStuffWorks: Just What the Title Says" (The Information Advisor's Guide to Internet Research), highlights HowStuffWorks, which may become the new favorite go-to site for "anyone who needs clearly written, very informative, but moderately brief articles for a quick, basic understanding of, indeed, 'how things work.'..."

Hock is quick to point out that the site's rather simple name does not mean it is elementary. Instead, it is a font of useful information that was once found in encyclopedias. Since it began in 1998, HowStuffWorks has won more than 80 awards, which is a reflection of the high value it places on editorial quality. Its articles provide "verifiable data" from expert sources. As a result, nearly every article is bylined, and those written within the last 10 years include a list of all the sources used.

The HowStuffWorks homepage contains a lead article and a list of headlines from the Now section, as well as headlines from several of the site's 11 "channels" (topic areas). Other options from the homepage are Great Quizzes, Great Lists, HowStuffWorks Classics, Audio Podcasts, and Video. My guess is that it would be hard not to find a needed topic within the 11 broad channels: Adventure, Animals, Auto, Culture, Entertainment, Health, Home & Garden, Lifestyle, Money, Science, and Tech. Clicking on any of these channels takes you to its main page. There you find several articles, including one feature article, and a directory of categories. You may also find additional links, such as More to Explore and Most Popular. Hock notes that categories range from three to a dozen depending on the channel selected. They generally have several subcategories, and some even have sub-subcategories.

Hock says that while the search feature is an easy way to find information, channel surfing may be the best approach, as this will give searchers the full scope of the "type and variety of content" available. While labeling the search syntax "quite basic," Hock says it is more than adequate. All terms entered are automatically ANDed, although there are no other Boolean or phrase search options. This is not a problem according to Hock, and he suggests doing separate searches for alternate terms. Returned pages display matching results; each record contains the article's title and URL as well as a one-to-two-sentence abstract. If the entry is a video, quiz, list, or recipe, an icon will appear in front of the title. Although the bulk of the "stuff" on HowStuffWorks is text and illustrations, it also includes "processes, events, and even issues."
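For readers unfamiliar with what "automatically ANDed" means in practice, the following sketch imitates that behavior on a tiny, made-up set of articles, and shows Hock's workaround of running separate searches for alternate terms and combining the results. It is not HowStuffWorks' actual search code; the function names, the word-matching rule, and the sample titles are assumptions for illustration only.

```python
def and_search(articles, query):
    """Return titles whose text contains every word in `query` (implicit AND)."""
    terms = query.lower().split()
    hits = []
    for title, text in articles.items():
        words = set(text.lower().split())
        if all(term in words for term in terms):
            hits.append(title)
    return hits

def search_alternates(articles, queries):
    """Run one AND search per alternate query and merge results without repeats."""
    merged = []
    for query in queries:
        for title in and_search(articles, query):
            if title not in merged:
                merged.append(title)
    return merged

# Hypothetical mini-index: title -> article text
articles = {
    "A": "how wind turbines work",
    "B": "how windmills work",
    "C": "wind power basics",
}

print(and_search(articles, "wind turbines"))                       # ['A']
print(search_alternates(articles, ["wind turbines", "windmills"]))  # ['A', 'B']
```

Merging the two result lists by hand stands in for the OR operator the site does not offer.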

The article takes a quick look at each channel, but I'm just going to give you a basic idea of what a typical search will turn up. Hock's search on wind turbines returned 72 results, consisting of articles, lists, photo montages, quizzes (suitable for classroom use), and videos. Some results will also include podcasts.

HowStuffWorks has Facebook, Twitter, and Pinterest pages; a YouTube channel; and mobile apps. You can also subscribe to a variety of newsletters. The only negative Hock found with the site is its ad placement, which often comes in the middle of an article, making it seem as if it has ended when it is just being interrupted.

Data Exhausted

As I write this, I have just hit the 30-year mark at Information Today, Inc. I shudder to think about how much dark data lies within the confines of my office. Most of my out-of-date info is not digital (or a security risk), but can be found in my Rolodex (which is probably older than some of my Medford, N.J., co-workers), desk drawers, and file cabinets. Is there a Hadoop solution for this?

Lauree Padgett is Information Today, Inc.'s editorial services manager. Her email address is Send your comments about this column to
COPYRIGHT 2016 Information Today, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2016 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation: IN OTHER WORDS; Dark Data: Analyzing Unused and Ignored Information
Author: Padgett, Lauree
Publication: Information Today
Article Type: Editorial
Geographic Code: 1USA
Date: Jul 1, 2016