
Building and running a collaborative internet filter is akin to a Kansas barn raising: the filtering committee would apply CIPA standards and Kansas law in making decisions, and I would modify the server to block or unblock as they instructed.

Our filtering project started out as a response to the passage of CIPA, the Children's Internet Protection Act, in January of 2001. Originally called "onGuard," it was a service that the Northeast Kansas Library System (NEKLS), where I work, created for its members. Although there had been mixed responses to CIPA in the library community, we knew that many NEKLS libraries would be unable to afford high-speed Internet access without their E-rate funding. So even while the challenge from the ALA and the ACLU cast the constitutionality of CIPA in doubt, we knew we should be exploring our options in case CIPA was upheld.

Then when the Supreme Court ruling did uphold the constitutionality of CIPA, the Kansas State Library became interested in expanding our filtering system to all public libraries in Kansas. Today, "Kanguard" is a statewide Internet content filter that's provided as a service of the Kansas State Library. I'm here to tell you how such a collaborative, far-reaching filter works.

From the very beginning, this filter has been a community-based effort, just as barn raisings have always been. Its building materials have been donated and assembled by many people and organizations, so it really does support local standards.

Before I get into the technical details, I want to briefly explain a little bit about my organization and whom we serve. The Northeast Kansas Library System is one of the several regional library systems established in 1965 by the Kansas State Legislature in order to improve and extend library service within its designated borders. NEKLS serves libraries of all types in the 14 counties of Northeast Kansas. This region is diverse, ranging from large urban centers to small towns and rural areas. Among the many services NEKLS provides to its member libraries is a membership in the Kansas Research and Education Network. KanREN is a member-driven organization providing Internet access, high-end networking support, and related services to schools, libraries, and higher educational institutions. KanREN and NEKLS cooperate on a systemwide intranet, which services 22 NEKLS member libraries, many of which have no other available option for Internet access.

As the automation coordinator for NEKLS, I consult with member libraries, researching technological issues such as filtering, and striving to come up with solutions.

Planning the Project

When the management of NEKLS decided to explore filtering options for its members, I first checked into commercial options. Finding no affordable solutions, I was at a loss.

As I so often do when I am faced with problems of a highly technical nature, I turned to the extremely capable staff at KanREN. In October of 2001, I called Cort Buffington, senior network engineer. "Cort, we are looking for a filter that can reside on a centrally managed server, potentially filter many libraries, be easily disabled and enabled, allow us to fully modify blocklists, and won't unnecessarily invade privacy." After some discussion, Cort finally said, "squidGuard." Thinking that maybe I hadn't heard correctly, I asked him to elaborate. "Google it, and then call me with any questions."

So I searched for squidGuard and found the Web site. Since I was what can most kindly be described as a Linux newbie at the time, I perused the site and was not really enlightened. I called Cort back to say that I was a bit lost. He offered to meet me in our offices and do a test installation; I readily agreed. To prepare, I brought in an old PC from home that I had used in library school for writing papers, surfing the Web, and watching television. It was not a powerful machine, with its AMD K6-2/350, 384 megabytes of RAM, a 3-gigabyte hard drive, CD-ROM and floppy drives, network card, and Red Hat Linux 7.0.

A couple of weeks later, I got the visit from Cort and Jake Chambers (KanREN's network engineer). By the time they left that October day, we had a functioning Squid server to test and play with. Within a month, we had hooked up test libraries and were successfully filtering on a trial basis. That test configuration from Cort was the basis for all future work on the project.

Building the Frame

In January of 2002, NEKLS purchased a Dell server with a Pentium III 900-MHz processor and 1 gigabyte of RAM to use as the onGuard filter server. During January and February, I hounded Cort and Jake with dozens of questions as I learned to implement our own filter installation. I surfed Web sites, read books, and gave myself a crash course in proxies, Linux, and other related issues. I learned that to build a Squid-based proxy filter, I had to have several different software packages working together. Two have similar names: Squid is the open source Web proxy cache, and squidGuard is a filter, redirector, and access controller for Squid.

First, I selected an operating system. Squid will run on GNU/Linux or BSD (Berkeley Software Distribution). Squid will also run on a Windows system, provided that you either purchase a private version developed for Windows, or install the Cygwin and MinGW packages. I chose to use Red Hat Linux 7.0 because Squid is better supported on the Linux platform than on a Windows platform.

Linux has become so easy to start up that you could just about do a default installation and be ready to install the other software involved in this project. In fact, many distributions of Linux now include Squid as an option during installation. Although the first two installations I set up involved compiling Squid manually, in later upgrades I just installed the default Red Hat Package Manager (RPM) during the Linux installation process.

Next I downloaded Squid from its Web site, and installed and configured it. The site had all the installation instructions and lots of other good information. (If you have had much experience compiling and installing programs on a *nix-based system, you will not find Squid to be particularly daunting.)

The next piece of software I installed was Berkeley Database. Although in theory squidGuard could be used with any database, the developers recommend Berkeley DB version 2.x.x. I went over to Sleepycat Software and pulled version 2.7.7 from its Web site. This database houses blocklists that contain the bulk of the sites that the filter blocks. Since the time when I first obtained this file, Sleepycat seems to have obfuscated the location of older versions of the database. (If you are using RPM-compliant Linux on Sparc or Intel architectures, you can download the software from the site listed in the sidebar.)

Next, I downloaded squidGuard and installed it. The installation was fairly simple and straightforward. Various precompiled packages are available on the Web, but I did fine by following the instructions provided on the Web site and compiling from source. Precompiled binaries often make installation easier, yet I have not found a squidGuard RPM that works as well as compiling the code myself.

After installing squidGuard, I obtained a blocklist from squidGuard's Web site and installed it on the server. (I learned that when you unpack the blocklist, you should note where you put it so that you can specify its location in the squidGuard configuration file later.)

There were, however, a few important points that were not well-documented. Let me share what I learned the hard way:

* Be sure to have a squidGuard.log file located in the directory specified in the squidGuard.conf file. I used the "touch" command to create the file, and then used "chgrp" and "chown" to set its group and owner appropriately. The program will not automatically create this log file for you, and if it is not there, squidGuard will go into "panic mode" and allow all Web pages past the filter.

* I used the "chown" and "chgrp" commands to give Squid permissions on all of its so-called "blacklist" files and directories. I did all of my downloading, unpacking, and installations as root, so Squid needed permissions set for the blacklists and for squidGuard.log.

* I am using logrotate to manage Squid's three big log files, so they don't get really large and eventually fill up the partition that they are on, causing Squid to fail and effectively block access to anything.

* I learned that two files require editing--squidGuard.conf and squid.conf. squid.conf is a long file that enables Squid to do a lot. I recommend setting a line like this:

redirect_program /usr/local/bin/squidGuard -c /etc/squid/squidGuard.conf

That tells Squid to use squidGuard as a redirector and tells squidGuard where squidGuard.conf is located.
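The first two points above can be sketched in shell roughly as follows. This is only an illustration: the function name, the example paths, and the "squid" user and group are assumptions rather than details of our server, so substitute whatever your own squidGuard.conf and squid.conf specify.

```shell
#!/bin/sh
# Sketch: create the squidGuard log file and hand it, along with the
# blocklist directory, to the account Squid runs as.
# All paths and account names here are illustrative assumptions.

prepare_squidguard_files() {
    log_file="$1"        # must match the log location set in squidGuard.conf
    blocklist_dir="$2"   # wherever you unpacked the blocklist
    owner="$3"           # user that Squid runs as
    group="$4"           # group that Squid runs as

    # squidGuard will not create its own log file, so create it by hand.
    touch "$log_file" || return 1

    # Give Squid ownership of the log file and of every blocklist file.
    chown "$owner" "$log_file" || return 1
    chgrp "$group" "$log_file" || return 1
    chown -R "$owner" "$blocklist_dir" || return 1
    chgrp -R "$group" "$blocklist_dir" || return 1
}

# Example call, run as root (paths and names are placeholders):
#   prepare_squidguard_files /usr/local/squidGuard/log/squidGuard.log \
#       /usr/local/squidGuard/db squid squid
```

Wrapping the steps in one function makes it easy to rerun them after unpacking a fresh blocklist, which is when ownership tends to slip back to root.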

Will the Barn Be Secure?

Once the server was set up and running properly, all I needed to do was configure my browser's HTTP proxy settings to use the IP address of the machine I set everything up on, and the appropriate port number (3128 by default). This causes the browser to use the server as a proxy, which will do all of the filtering work. I learned I had to do this for every browser installed on the machine, or else the machine would be filtered only when certain browsers were in use.
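Before changing every browser, it can help to confirm from a command line that the proxy answers. The sketch below assumes curl is installed; the address 192.0.2.10 is a documentation placeholder, not our server, and the function names are made up for this example.

```shell
#!/bin/sh
# Sketch: build the proxy URL a browser would be pointed at, and
# optionally fetch a page through it with curl.
# PROXY_HOST is a placeholder address, not a real filter server.

PROXY_HOST="${PROXY_HOST:-192.0.2.10}"
PROXY_PORT="${PROXY_PORT:-3128}"    # Squid's default port

# Print the proxy URL in the form browsers and curl expect.
proxy_url() {
    echo "http://${PROXY_HOST}:${PROXY_PORT}"
}

# Fetch a URL through the proxy and print only the HTTP status code.
fetch_status_via_proxy() {
    curl -s -o /dev/null -w '%{http_code}\n' -x "$(proxy_url)" "$1"
}

# Example (needs curl and a reachable proxy):
#   fetch_status_via_proxy http://www.example.com/
```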

At this point, I took some time to evaluate the security of the server. I have been able to reliably shut down unnecessary services to the point that only SSH (Secure Shell) and Squid are showing up on a port scan. I like to use Nmap, a free port scanner, to check the security of servers. Although Squid can be secured very well, proxy servers seem to attract crack attempts, and it is better to be safe than to be sorry.
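A check like this can be partly automated. The sketch below filters Nmap's normal output, which lists open ports on lines such as "22/tcp open ssh", and prints any open TCP port that is not on an allowed list. The helper name and the host name in the example are hypothetical, and the allowed list assumes SSH and Squid are on their default ports.

```shell
#!/bin/sh
# Sketch: flag open ports other than the ones we expect.
# ALLOWED_PORTS assumes the defaults: SSH on 22 and Squid on 3128.

ALLOWED_PORTS="22 3128"

# Read nmap's normal output on stdin and print every open TCP port
# number that is not in ALLOWED_PORTS, one per line.
unexpected_ports() {
    awk -v allowed="$ALLOWED_PORTS" '
        BEGIN {
            n = split(allowed, a, " ")
            for (i = 1; i <= n; i++) ok[a[i]] = 1
        }
        $2 == "open" && $1 ~ /\/tcp$/ {
            split($1, p, "/")
            if (!(p[1] in ok)) print p[1]
        }'
}

# Example (needs nmap and permission to scan the host):
#   nmap -p- filter.example.org | unexpected_ports
```

If the function prints nothing, only the expected services were found listening.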

Putting Up the Walls

Once the filter was set up and working, it was time to attend to policy issues. Many decisions had to be made upfront to avoid potential quagmires of conflict: What content would be blocked and unblocked? How would those decisions be made? What criteria should prevail in making block and unblock decisions?

When NEKLS went live with onGuard in February 2002, I had already recruited some volunteers for a filter maintenance committee--three library directors who were going to use onGuard in their libraries. I felt that such a committee would be the best way to handle requests for additions to and subtractions from the blocklist.

Here's how it worked for us: onGuard had a built-in filter maintenance form that any user could fill out. All requests were brought to the attention of the committee, which treated them like any other challenged content. I asked the committee to apply CIPA standards and those of Kansas law in making their decisions. The committee vote would be final, and I would modify the server to block or unblock as they instructed.

When the Kansas State Library took over and expanded the service to Kanguard, it adopted a similar method for handling requests for blocklist modification. The State Library selected a committee of five professional librarians who were well-versed in CIPA and filtering issues. Although we are using the blocklist from the squidGuard Web site as our foundation, we have also configured the server with permit and deny lists that contain entries we specifically want to block or unblock. These entries are located separately from the main database, and are not altered no matter what changes are made to the database of blocked sites.

The filter maintenance form was slightly modified for Kanguard. This form explains the filter, requests only a URL and whether the site should be blocked or unblocked, and thanks the library customer for his or her submission. The completed forms are faxed or e-mailed by local library staff to the state committee, which takes a vote and then instructs me whether to make a change in the filter.
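Carrying out a committee decision then comes down to adding a line to one of the local lists. The following is a minimal sketch under stated assumptions: the function name, the LIST_DIR location, and the "localok" and "localblock" list names are hypothetical, and your lists may live elsewhere.

```shell
#!/bin/sh
# Sketch: record a committee decision in a local permit or deny list.
# LIST_DIR and the list names are hypothetical; use the locations
# named in your own squidGuard.conf.

LIST_DIR="${LIST_DIR:-/usr/local/squidGuard/db}"

apply_decision() {
    action="$1"   # "block" or "unblock"
    domain="$2"   # for example, example.org

    case "$action" in
        block)   list="$LIST_DIR/localblock/domains" ;;
        unblock) list="$LIST_DIR/localok/domains" ;;
        *) echo "usage: apply_decision block|unblock DOMAIN" >&2
           return 1 ;;
    esac

    mkdir -p "$(dirname "$list")" || return 1
    touch "$list" || return 1

    # Add the domain only if it is not already listed (exact line match).
    if ! grep -qxF -- "$domain" "$list"; then
        echo "$domain" >> "$list"
    fi
}

# After editing the lists, Squid has to restart its squidGuard helpers
# before the change takes effect, for example with:
#   squid -k reconfigure
```

Keeping the change to a single function call makes it harder to put a site on the wrong list by accident.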

So what happens when a patron requests a page that we've blocked? I have created a Web page, fondly called "the oops! page," that squidGuard goes to. It explains the filter and informs the customer that there is a filter maintenance form that he or she can fill out to question a blocking decision.
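The separate permit and deny lists and the redirect to the block page might be expressed in squidGuard.conf roughly as shown below. This is a sketch, not our production file: the directory paths, the "localok" and "localblock" list names, and the choice of a single "porn" category from the downloaded blocklist are all assumptions. The shell wrapper simply writes the example to a file of your choosing.

```shell
#!/bin/sh
# Sketch: write a minimal squidGuard.conf with local permit and deny
# lists that are kept apart from the downloaded blocklist.
# Paths, list names, and the single category are assumptions.

write_squidguard_conf() {
    conf="$1"
    cat > "$conf" <<'EOF'
dbhome /usr/local/squidGuard/db
logdir /usr/local/squidGuard/log

# Sites the committee has voted to unblock
dest localok {
    domainlist localok/domains
}

# Sites the committee has voted to block
dest localblock {
    domainlist localblock/domains
}

# One category from the downloaded blocklist
dest porn {
    domainlist porn/domains
    urllist    porn/urls
}

acl {
    default {
        # Checked left to right, so the local permit list comes first.
        pass localok !localblock !porn all
        redirect http://www.nekls.org/blocked.html
    }
}
EOF
}

# Example: write_squidguard_conf /etc/squid/squidGuard.conf
```

Because the local lists sit in their own directories, replacing the downloaded blocklist with a newer copy leaves the committee's decisions untouched.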

Another concern that we needed to address was the filter-disabling mechanism referred to in the Supreme Court's CIPA ruling. We didn't want librarians to have to dig around and reconfigure the browser proxy settings whenever there was a request for unfiltered access. The Johnson County (Kan.) Library System generously gave us a program that its staff had written for use with its own proxy server. This program allows a librarian to click an icon to either disable or enable the filter in Internet Explorer. We have placed this icon on floppies and other removable discs and distributed them so that staff members in any participating library are the only ones who can temporarily disable the filter.

Help on the Finishing Work

As the number of libraries using Kanguard has grown, the number of challenges I face has grown as well. One of the first issues to arise was about databases that authenticate using IP address recognition. When a browser is set up to use the filter correctly, database vendors no longer recognize the computer as having the library's IP address, but instead see the computer as having the IP address of the proxy server. This required that I learn to configure the server to use subinterfaces, or multiple IP addresses assigned to the network card of the server. Then, each library that uses IP address recognition for database authentication gets its own unique IP address that it registers with its database vendors.
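One way to set this up is sketched below; it is an assumption about how such a mapping could be done, not a record of our configuration. The script prints the commands that add alias addresses to the network card, along with squid.conf lines that use Squid's tcp_outgoing_address directive to send each library's traffic out through its own address. The library names, the networks, the addresses (taken from documentation ranges), and the /24 prefix are all placeholders.

```shell
#!/bin/sh
# Sketch: give each library its own outgoing IP address on the proxy.
# Every name and address below is a placeholder.
# Columns: library name, library's client network, outgoing IP.
LIBRARIES='
lib_a 198.51.100.0/28 192.0.2.11
lib_b 203.0.113.16/28 192.0.2.12
'

# Print the commands that would add each outgoing address as an
# alias (subinterface) on the given network interface.
print_ip_commands() {
    iface="$1"
    n=0
    printf '%s\n' "$LIBRARIES" | while read -r name net ip; do
        [ -n "$name" ] || continue
        n=$((n + 1))
        echo "ip addr add ${ip}/24 dev ${iface} label ${iface}:${n}"
    done
}

# Print squid.conf lines that match each library by source network
# and bind its outgoing requests to that library's address.
print_squid_lines() {
    printf '%s\n' "$LIBRARIES" | while read -r name net ip; do
        [ -n "$name" ] || continue
        echo "acl ${name} src ${net}"
        echo "tcp_outgoing_address ${ip} ${name}"
    done
}

# Example: review the output before applying any of it.
#   print_ip_commands eth0
#   print_squid_lines
```

The script only prints what it would do, so the output can be checked against the addresses each library has registered with its database vendors before anything is changed.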

The Kansas State Library purchased a Xeon 3.12-GHz server with 4 gigabytes of RAM running Red Hat Linux 9.0 for us to host the service on. The State Library has also paid to have us add connectivity to our office in order to ensure smooth operation of the server. In order to address concerns about the long-term future of squidGuard, I contacted the developers, who promised that squidGuard is around to stay.

Back when this was a system service of NEKLS, I was able to provide full tech support to libraries using the filter. Since it has gone statewide, however, my colleagues in the other regional library systems have been doing installations, registrations, and even some troubleshooting for the libraries in their respective systems. To facilitate those activities, we techies have put a registration Web form on the Kanguard project page, along with a test page that should be filtered if everything is working properly.

True Barn-Raising Fashion

I've been told that, in Kansas, the barn-raising tradition is an important part of the state's history. In a sense, Kanguard is very much like a barn that was raised by a big group of different people all pulling together. The software was all written by the free and open source software community. The content review committees were composed of volunteer librarians. The initial setup and a great deal of ongoing consulting have been provided by KanREN's Cort Buffington, Jake Chambers, and Josh Peck (who has been quite helpful in KanREN's new systems manager role). Those elements, as well as the work of my colleagues in other systems to deploy Kanguard, the disabling software donated by the Johnson County Library System, and the hard work and new server funding contributed by the Kansas State Library, have all made this a truly community-based filter.

Web Resources

Kanguard Project Page

http://skyways.lib.ks.us/KSL/libtech/kanguard

Northeast Kansas Library System (NEKLS)

http://www.nekls.org

Kansas Research and Education Network (KanREN)

http://www.kanren.net

squidGuard

http://www.squidguard.org

Red Hat, Inc.

http://www.redhat.com

Squid

http://www.squid-cache.org

Sleepycat Software (Maker of the Berkeley Database)

http://www.sleepycat.com

Alternate Download Site for Berkeley Database 2.7.7

http://rpm.pbone.net/index.php3/stat/4/idpl/171385/com/Berkeley DB-2.7.7-5mdk.i586.rpm.html

squidGuard Installation Instructions

http://www.squidguard.org/install

squidGuard Blacklist

http://www.squidguard.org/blacklist

Insecure.Org (Maker of Nmap)

http://www.insecure.org

Kanguard "Oops! Page"

http://www.nekls.org/blocked.html

Thomas M. Reddick is the automation coordinator at the Northeast Kansas Library System in Lawrence. He holds an M.L.I.S. from the University of South Carolina-Columbia, where he took technology classes under Dr. Robert E. Molyneux. His e-mail address is treddick@nekls.org.
COPYRIGHT 2004 Information Today, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2004 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Children's Internet Protection Act
Author:Reddick, Thomas M.
Publication:Computers in Libraries
Date:Apr 1, 2004
Words:2743