Your new role as a data scientist.

It would be an understatement to say that we have increased our focus on data science. Many info pros know that data applications will change not only what we do now, but also what we will need to do in the future. We write about data a lot more too, and I'm no exception. Indeed, since this issue of CIL is devoted to data, I was tempted to write about a different topic. But like everyone else, I have more to say about data this month.

I'm being intentionally provocative by using the title to suggest that you may find yourself immersed in data science. Whether or not that comes to pass, it never hurts to keep an eye on data trends, both at the local and industry levels.

The world of data is changing quickly, but two trends underway in my own user community are driving my thinking about data. The first trend is the merger of data analysis into a wide range of scholarly disciplines. Times have changed, and data applications are not being resisted as much as they used to be. The second is more personal, but no less fascinating. Whether we know it or not, data analysis, management, and research have become mainstream competencies in our field. This changes what we need to be thinking about--and what we need to be doing.

The evidence of change is everywhere. The organization I am affiliated with has become so data-driven that all roads to innovation involve data expertise. My team was (at least to some degree) ready for this development, and that has steered some new projects in our direction. Most recently, I took on oversight of secure data licensing as well as coordinating its future growth as a library service, and this may be only the beginning of new library-based data services.

These two local trends are surely playing out elsewhere, and so I'll assess data's new reach into communities of practice and what this means for us. Next, I will identify the three data-intensive roles we can fill that go beyond our traditional strategies for knowledge management.

Data's Two-Way Dialogue

Data's new reach into our lives is headline news these days. But what is perhaps somewhat less evident is that whenever data science is applied in new disciplines (including the humanities), a two-way process takes hold. The data influence scholarship. But just as important, scholars are reframing the meaning of data. This strikes me as a new paradigm, because there is a fair amount of consensus that data applications must evolve.

In a perfect world, data are objective and contribute dispassionately to shared understandings. In actual practice, data are constantly changing and are deeply influenced by new inputs. This is important to keep in mind, because popular media tends to focus more on the monolithic "truth" we find in quantifiable data--and less on the ways students and scholars are revolutionizing data use. I agree that the mainstream focus on data as a commercial tool is interesting, but I don't see many authors paying attention to how scholars and citizens alike are reframing data analysis and creating new applications.

This is noteworthy because the initial, reflexive pushback against data has given way to greater curiosity, driven in no small part by new web-based forms of data visualization. The temptation to create compelling data visualization is quickly changing how scholars work. Nowadays, English professors may use textual analysis or collaborate with someone who can write programs to parse language in many forms. Social scientists employ vast datasets to study topics such as migration and educational outcomes.

The "new" story is one of discovery and new horizons, oftentimes resulting from mundane data sources. Researchers are bolder and much less hesitant to work with primary sources or to build their own value-added datasets. This transformation can happen so fast that we can miss the moment when it took place among our users. My view is that the recent acceptance of data--as well as the scrutiny it receives from diverse scholars--is an "untold story" of innovation that is waiting to explode across popular news outlets.

The reason is clear: Scholarship now depends on data as a foundational tool. This is true both for traditional data users and new entrants. For example, business administration researchers are building databases from privileged or confidential sources and testing hypotheses against the data. Long-term strategies such as case studies, interviews, and fieldwork are ongoing, but the focus has moved to data analysis by a wide margin. It is not just another tool: It's the tool of choice.

Moreover, I have found that, time and again, whenever researchers start working intensively with data, they are pulled further into analysis. I recently conducted a telephone survey of faculty across four distinct disciplines. It revealed that every respondent was working with primary data. I also found that they are looking for help in obtaining those data.

Do I sense an opportunity here? Absolutely. Riding the datastream is currently the best pathway into academic user communities. And we show up armed with many skills--not all of which have to do with data.

Three Strategies

If I am correct that data science is now a core competency for us, then we need to focus on how to use it to advance our own profession. I can think of three strategies that require some thought and energy, but they also align with our existing skills. None of them is a complete surprise, but it is worth remembering that incremental change is change nonetheless.

Become data overseers. It is easy to forget that data users such as economists are real people and not just talking heads. They care a great deal about the final product, but perhaps less so about how tidy their offices are. Research team members may use different approaches to study the same data sources, creating extra work for everyone involved. The library profession emphasizes shared oversight and common agreements on the structure of knowledge, whether in online catalogs as MARC records, in data management, or in systematic literature searching. This basis for collaboration is powerful, and we can find a place within research teams as custodians of data, if we view this skill as a strategic tool.

Become an expert in data visualization. It may not be necessary to become a full-fledged data scientist in order to be indispensable in the not-so-distant, data-driven future. What do our users need the most? I have a locally sourced answer: They need effective forms of data visualization. Web-based, annotated, and replicable data--yes, our users need all those things as well. Information display that is accessible to the layperson is essential.

Examples are everywhere. One of my favorites is the University of California-Berkeley (UC-Berkeley) professor Karen Chapple's interactive map of gentrifying neighborhoods in the San Francisco Bay Area. If we can act as experts in this type of data visualization, we can help researchers tell their story in ways that many people can understand.

You are the discovery expert. One of the hazards of being an expert in a narrow field is paying less attention to how others view similar intellectual challenges. Info pros are spared this dilemma, because we are charged with knowing how to find anything, synthesize our findings, and explain them to others who don't have time to do this vital work. Doesn't everyone possess that know-how? The short answer is, "Maybe."

But in an amped-up, data-driven world, there is less time for general research skills. Therefore, the ability to search the literature of many disciplines and discover patterns across diverse fields of study has grown more powerful than ever. Automating pattern recognition is a key goal of Big Data thinkers, and more power to them. But guess what? We have been doing it all along, and we're good at it. There will always be a role for human pattern recognition in the digital future. I suggest we seize the role, instead of waiting for it to be offered to us.

Data Science Everywhere

When I entered this profession as my second career, the term "information scientist" was so new that it raised eyebrows. Special libraries in Fortune 100 corporations frequently used this term (or variations) well before the turn of the century. Times have changed, and we might hesitate to style ourselves as information scientists--unless we have learned data science skills as a first step. My current perspective is that we are being offered a new opportunity to become the information scientists that the world needs. After all, if an English literature professor can be a data scientist, then so can I. And as I practice data science, I'll be just as likely to use the skills I learned as an English major as those of an engineer. I'll be in good company, as I join the ranks of humanists and social scientists who are jumping into the world of data science.

By Terence K. Huwe

Director of Library and Information Resources, Institute for Research on Labor and Employment, University of California-Berkeley

Terence K. Huwe is director of library and information resources at the Institute for Research on Labor and Employment at the University of California-Berkeley. His responsibilities include library administration, reference, and overseeing web services.
COPYRIGHT 2016 Information Today, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2016 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Building Digital Libraries
Author:Huwe, Terence K.
Publication:Computers in Libraries
Geographic Code:1USA
Date:Apr 1, 2016