
Personalization, privacy, and the problem of oversharing

By now, we've all seen headlines warning us about search engines that know what we want before we do, social networks that are more familiar with our habits than even our closest friends, and operating systems that "phone home" with our personal data. It seems that no matter what we do, online or off, everyone wants a piece of our data.

While the idea of data overreach and the growth of personalization is enough to raise the eyebrows of even the most casual user, information professionals must be even more conscientious about how personalization affects our search results. We also need to be cognizant of the impact of "oversharing" and the inadvertent leaking of the proprietary and sensitive data we're entrusted with by our clients (and their NDAs!).


The early days of the internet were much simpler times. Personalization did not yet play a role. Yahoo was king, "Big Data" wasn't a buzz phrase, and advertisements came in a one-size-fits-all package. The results of a search were affected by only two things--the query itself and the placement of a website in the search engine's ranking system. While advertisers and search engines often used cookies to create profiles, they were primitive, and the results were usually obvious. Like queries generated like results. Thus, if Alice, a native of Cleveland, and Bob, a denizen of Dallas, searched for the best restaurant in Cleveland, both would receive the same results.

This changed in 2004, when Google brought us search personalization. Cookies took on a new life and expanded from simple identifiers to a system that directly and profoundly affected the results of a user's search experience. Today, search engines, social networks, operating systems, and even voice-enabled searching from Apple's Siri, Microsoft's Cortana, and Google Now track a vast array of data. This includes our queries, location, and behavioral data, as well as the very devices we use, enabling search engines to harvest hundreds of individual data points to build out precise profiles on each of us.

Thus, while Alice and Bob received the same results a decade ago, today they are likely to see radically different options in their search results, depending on their customer profile, the location they're searching from, what they've searched for in the past, and, in some cases, even the preferences of their friends and family. Alice might see a list of local favorites, while Bob might see the restaurants ranked highest by visitors. And both could be missing out.
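The Alice-and-Bob divergence can be sketched in a few lines of code. This is a toy model only: real engines combine hundreds of signals, and the weights, profile fields, and result pool below are invented for illustration.

```python
# Toy model of personalized ranking: a base relevance score is
# adjusted by profile signals (location, browsing history).
# All weights and data here are hypothetical.

def personalized_score(result, profile):
    """Combine base relevance with profile-based boosts."""
    score = result["base_relevance"]
    if result["city"] == profile["home_city"]:
        score += 0.3                      # location boost
    if result["site"] in profile["frequent_sites"]:
        score += 0.2                      # behavioral-history boost
    return score

def rank(results, profile):
    return sorted(results, key=lambda r: personalized_score(r, profile),
                  reverse=True)

results = [
    {"site": "yelp.com",  "city": "Cleveland", "base_relevance": 0.6},
    {"site": "localeats", "city": "Cleveland", "base_relevance": 0.5},
    {"site": "yelp.com",  "city": "Dallas",    "base_relevance": 0.6},
]

alice = {"home_city": "Cleveland", "frequent_sites": {"localeats"}}
bob   = {"home_city": "Dallas",    "frequent_sites": {"yelp.com"}}

# Same query, same result pool -- different orderings per profile.
print(rank(results, alice)[0]["site"])   # localeats
print(rank(results, bob)[0]["city"])     # Dallas
```

Even with identical queries and an identical result pool, the two profiles surface different first-page winners, which is the whole story of personalized search in miniature.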


Customization leads to filter bubbles. While search personalization can be beneficial for the casual searcher, bringing user-relevant results to the top, personalized search can be problematic for search professionals, especially when our private and professional "personas" intersect. For example, Google's search algorithm factors in your past searches to determine which results should be on the all-important first page. If you're a frequent Amazon shopper, Amazon results are presumed to be more relevant and will likely show up first when you're looking for a new product. Likewise, advertisement relevance also follows this model--so that research conducted as part of a competitive intelligence review for your client might trigger an increase in advertisements from rival firms.

As search engines develop better algorithms to divine what we like, this may translate to what author Eli Pariser called a "filter bubble." In The Filter Bubble: How the New Personalized Web is Changing What We Read and How We Think (Penguin Books, 2012), Pariser warns that a huge downside to personalized search is that it closes individuals off to new ideas, subjects, and important information by favoring results that match what we like, or at least what computer models believe we like. He argues that this "invisible algorithmic editing of the web" limits our exposure to new and/or differing points of view. While we should always be diligent and "go beyond the first page," even the most diligent among us rarely traverse to the 50th or 60th page of results.
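Pariser's feedback loop is easy to simulate. In this invented sketch, each "click" on a favored topic raises that topic's affinity score, which makes the engine favor it even more next round; the topics and weights are hypothetical.

```python
# Minimal simulation of a "filter bubble" feedback loop:
# the engine shows the user's best-matching topic, and each click
# reinforces that preference. Topics and weights are invented.

def top_topic(affinity):
    """The topic the engine will favor on the next search."""
    return max(affinity, key=affinity.get)

affinity = {"sports": 0.4, "politics": 0.3, "science": 0.3}

history = []
for _ in range(10):
    choice = top_topic(affinity)     # engine serves the favored topic
    history.append(choice)
    affinity[choice] += 0.1          # the click reinforces the preference

print(set(history))   # {'sports'} -- ten rounds, one topic
```

A small initial preference becomes total after a few iterations: the user never sees politics or science again, which is exactly the "invisible algorithmic editing" Pariser describes.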


In addition to affecting the relevancy of our results, the ubiquitous and near-permanent storage of our search histories, results, and sensitive client materials raises additional concerns about data breach and accidental leakage.

Whether it's the stored record of our web history, hackers gaining a window into our search traffic, or the theft of sensitive client files, researchers need to start taking data security seriously. This is particularly true for those of us who enjoy doing our work in the company of others, at cafes, libraries, or co-working locations, where data connections are shared and unencrypted information can be easily monitored. Doing a U.S. Patent and Trademark Office (USPTO) search on behalf of a client? It's probably good to know that the USPTO doesn't use a secure (HTTPS) connection, meaning that your data is being transmitted in the clear.
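A quick sanity check before sending a sensitive query is simply to look at the URL scheme: anything not served over HTTPS travels in the clear. A minimal check in Python (the URLs below are illustrative):

```python
# Is this URL transmitted over TLS? Only the "https" scheme is
# encrypted in transit; plain "http" can be read by anyone on the path.
from urllib.parse import urlparse

def is_encrypted(url):
    """Return True only if the URL will be sent over an encrypted channel."""
    return urlparse(url).scheme == "https"

print(is_encrypted("http://patft.uspto.gov/netacgi/nph-Parser"))  # False
print(is_encrypted("https://duckduckgo.com/?q=patents"))          # True
```

Browsers make the same distinction with the padlock icon; the point is to form the habit of checking before typing anything client-related into a search box.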

Similarly, many of us use cloud services to store sensitive client files, findings, and work products. However, if we're not using strong passwords and practicing good security hygiene (such as two-factor authentication), we might be leaving our livelihoods exposed. More troubling are the cases when we may be unaware of exactly where our data resides. For example, Microsoft recently came under fire when it released Windows 10 with an expansive and broad privacy policy that initially allowed the company to "access, disclose and preserve personal data," including documents, emails, and files stored locally, anytime it was "necessary." In addition to causing substantial outrage, the episode alerted users to the fact that our data is often being stored in the cloud, even when we may not necessarily intend it to be.


While there's no single solution to mitigate these threats, there are a number of best practices that we as professional searchers should adopt. Although this article won't touch in depth on any single approach, it will outline a few relatively simple steps that anyone can follow to ensure both that our personalized searches aren't too personal and that we're not inadvertently leaking the data we're entrusted to protect.

Most critically, we need to limit what we're sharing with the world. We can accomplish this by taking a variety of proactive approaches such as these:

* Browsing privately. Almost all browsers, including mobile browsers, offer a "private session" option. While each browser calls it something different (Incognito, Private Browsing, InPrivate), most offer a few shared features. Search history isn't stored; downloaded metadata (the name and size of a file) is erased; autofill information does not populate the search bar; and cookies are either restricted to the session or blocked entirely, which can affect how some services (especially those requiring a login) are displayed or behave.

Note that private browsing doesn't help if, for example, you log in to check your email or search Facebook from an Incognito window. While the session won't be saved locally, any information stored on third-party systems will still be tied to your unique user ID and can affect the results you receive. Private browsing may also be undermined if extensions or apps are allowed to run across both your private and regular sessions.

An excellent guide on how to set up private browsing for each of the major browsers (Chrome, Microsoft Edge, Firefox, Internet Explorer, and Opera) can be found at the Digital Citizen website.
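For searchers who script their workflows, private sessions can also be launched from the command line. The sketch below builds (but doesn't run) such a command; the `--incognito` and `-private-window` flags are the documented ones for Chrome and Firefox, while the executable names are assumptions that vary by platform.

```python
# Build a command line that opens a URL in a private browser window.
# Flag names are Chrome's and Firefox's documented private-mode flags;
# the executable names are platform-dependent assumptions.
import subprocess

PRIVATE_FLAGS = {
    "google-chrome": "--incognito",
    "firefox": "-private-window",
}

def private_command(browser, url):
    """Return the argv list for a private-window session (not executed)."""
    return [browser, PRIVATE_FLAGS[browser], url]

cmd = private_command("firefox", "https://duckduckgo.com")
print(cmd)
# subprocess.run(cmd)   # uncomment to actually launch the browser
```

Wrapping client research in a command like this makes it harder to forget the private window and accidentally pollute (or expose) your regular profile.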

* Being unique. Creating separate and unique "user profiles" can also limit the effect of personalized search by keeping personal and professional profiles separate. Most browsers offer the ability to create multiple profiles, with varying degrees of difficulty. When private browsing is inconvenient, setting up a unique profile can be a next-best solution. How-To Geek has a helpful guide on how to create user profiles.

* Periodically deleting your data. Google, Bing, and Yahoo provide ways to delete your search history or limit the search history collected. You can even delete specific results so the search on "communicable diseases of the bedroom" you conducted as part of a medical literature review won't show ads for STD tests when you're doing a personal search in front of your spouse! Molly Wood wrote about "Sweeping Away a Search History" in the April 3, 2014, issue of The New York Times ( tech/sweeping-away-a-search-history.html?_r=1), which has several helpful hints.

However, while we can influence what cookies are kept locally and as part of our search history, we can't stop everything. In 2011, computer security researchers discovered the existence of so-called "supercookies" and "zombie" cookies--cookies embedded in Adobe Flash or in client-side scripts that can be "respawned" after they're deleted. In 2014, it was discovered that some internet service providers, such as AT&T and Verizon, were using supercookies of their own, injecting nearly undeletable identifiers into every web request. While the ISPs initially claimed to be using the cookies to ensure "quality of service," in truth, they're mostly being used to support advertisers.
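Because ISP supercookies are injected as HTTP headers (Verizon's was named X-UIDH; AT&T's was reported as X-ACR), a rough way to detect them is to run your traffic past a test server and look for those headers. The sketch below checks a set of request headers against a short, necessarily incomplete list; the sample header value is made up.

```python
# ISP "supercookies" arrive as injected HTTP request headers.
# X-UIDH (Verizon) is the best-documented example; X-ACR (AT&T)
# was also reported. This list is partial and illustrative.
TRACKING_HEADERS = {"x-uidh", "x-acr"}

def injected_supercookies(request_headers):
    """Return any known ISP tracking headers present in a request."""
    return {h for h in request_headers if h.lower() in TRACKING_HEADERS}

seen = injected_supercookies({"Host": "example.com",
                              "X-UIDH": "OTgxNTk..."})  # fake sample value
print(seen)   # {'X-UIDH'}
```

Note that injection only works on unencrypted traffic--the ISP can't rewrite what it can't read--which is one more argument for HTTPS, VPNs, and Tor, discussed below.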

The second step is to focus on keeping sensitive and private client data secure. This can be accomplished by doing the following:

* Using strong and unique passwords. Remember, your data is only as secure as your weakest link, and for many of us, that means our passwords. Using simple, easily guessed passwords, or reusing the same password across multiple accounts, can leave us open to compromise, exposing both client data and our own personal information. At a minimum, passwords should be strong, unique, and changed periodically. Microsoft's suggestions, although aimed at Vista users, are generally applicable. Fortunately, there are many wonderful tools that make good password management relatively easy, including password managers such as LastPass, KeePass, and 1Password. Lifehacker has a good rundown and review of many of the most popular password managers on its website.
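If you'd rather generate passwords yourself than trust a manager's generator, Python's standard library can do it with a cryptographically secure random source. A minimal sketch using the `secrets` module (Python 3.6+):

```python
# Generate a strong random password using a cryptographically secure
# RNG (the `secrets` module), drawing from letters, digits, and symbols.
import secrets
import string

def make_password(length=16):
    """Return a random password of the given length."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(make_password())   # different every run
```

A 16-character password over this 94-symbol alphabet has on the order of 10^31 possibilities--far beyond any guessing attack--and a password manager removes the only real drawback, which is having to remember it.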

* Embracing two-factor authentication. To ensure even greater security of your data, particularly data stored in the cloud, information professionals should also incorporate a second factor, whether it's something you have (such as a one-time passcode, a dongle, or even an SMS text message) or something you are (such as a fingerprint or iris scan). Once the province of intelligence professionals in secure locations, two-factor authentication (sometimes referred to as 2FA) has become far more commonplace, as sites such as Dropbox, Google Drive, Facebook, LinkedIn, and even many banks have pushed to improve security on their sites. Check out TeleSign's "Ultimate Guide to Two-Factor Authentication" for more information.
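The one-time passcodes those authenticator apps display aren't magic: they follow two open standards, RFC 4226 (HOTP) and RFC 6238 (TOTP), whose core is just an HMAC plus truncation. A compact implementation, verified against the published RFC 4226 test vectors:

```python
# HOTP (RFC 4226) and TOTP (RFC 6238): the algorithms behind most
# 2FA authenticator apps. The code is standard; the test secret is
# the one published in RFC 4226's test vectors.
import hashlib
import hmac
import struct
import time

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password (RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                       # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, step: int = 30) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    return hotp(secret, int(time.time()) // step)

# RFC 4226 test vector: secret "12345678901234567890", counter 0
print(hotp(b"12345678901234567890", 0))   # 755224
```

Because both ends compute the same HMAC over a shared secret and a moving counter, an attacker who steals your password alone still can't produce a valid code.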

* Using a virtual private network. While there are huge benefits to being mobile and able to work from anywhere, there are also risks, especially when it comes to protecting our data. Fortunately, virtual private networks (VPNs) can act as a barrier between our sensitive work and the prying eyes of the public, encrypting our traffic and masking our location and the sites we access.

There are dozens of VPN providers, and most offer reasonably priced, secure connections from multiple locations around the world. Torrent Freak lists a number of resources for assessing VPN quality, coverage, and price. Points to consider when signing up for a VPN include whether logs are kept, the type of encryption used by the provider, and whether the VPN is self-contained. (Does it use its own DNS servers and have physical control over its own hardware?)

Information professionals should also look for a VPN provider with many access points, especially those scattered around the world. For example, Europe's data protection laws, and particularly its ruling on the Right to be Forgotten (RTBF), sometimes limit the results European citizens are able to obtain, especially when it comes to investigations about an individual. By using a VPN connected to a server in the United States, however, researchers can avoid these limitations and see all results, even those that were "hidden" in the EU. Finally, VPNs or services such as Tor can provide one of the few means to bypass ISP supercookies.

* Try peeling the onion. For the truly privacy-conscious, it may be advisable to consider software such as Tor (The Onion Router). The free, open source Tor Browser can be used to mask your location and block unwanted cookies and browser tracking. While Tor has received a bit of negative press, the browser itself can be a boon to searchers, creating a secure connection to the internet. And pairing Tor with a search engine that does not save search results remotely, such as DuckDuckGo, can provide an even greater degree of privacy.
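Scripted research can go through Tor as well. A running Tor client exposes a SOCKS proxy (by default on 127.0.0.1:9050), and Python's popular `requests` library can route through it. This sketch assumes a local Tor client and `requests` installed with SOCKS support (`pip install requests[socks]`); it only builds the proxy configuration.

```python
# Proxy configuration for routing Python `requests` traffic through a
# local Tor client. Assumes Tor's default SOCKS listener at
# 127.0.0.1:9050; "socks5h" means DNS lookups also go through Tor,
# so your resolver doesn't leak the sites you visit.
TOR_SOCKS = "socks5h://127.0.0.1:9050"

def tor_proxies(socks_url=TOR_SOCKS):
    """Proxy mapping to pass as requests.get(..., proxies=...)."""
    return {"http": socks_url, "https": socks_url}

print(tor_proxies())

# Usage (with Tor running and requests[socks] installed):
#   import requests
#   r = requests.get("https://check.torproject.org", proxies=tor_proxies())
```

The `socks5h` scheme (rather than plain `socks5`) is the important detail: it pushes DNS resolution inside the tunnel, closing one of the more common leak paths.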


Ultimately, no matter what tool, service, or technique we use, it's my hope that we as searchers start to understand how search personalization and data privacy affect us, both personally and professionally. Taking a mindful approach to personalization, confidentiality, and the risks surrounding what we do can go a long way toward protecting our own personal information while also ensuring we provide the best results and due care for our clients.

Carey Lening is a competitive intelligence and legal research analyst for Knowligence, LLC.

Comments? Contact the editor-in-chief.
COPYRIGHT 2016 Information Today, Inc.

Publication: Online Searcher, Jan. 1, 2016