SquidGuard, My Favorite Content Filter
Introduction
squidGuard is a combined filter, redirector and access controller plugin for Squid. It is
- free (GPLv2)
- very flexible
- extremely fast *)
- easily installed
- portable
squidGuard can be used to
Neither squidGuard nor Squid can be used to
- filter/censor/edit text inside documents
- filter/censor/edit embedded scripting languages like JavaScript or VBScript inside HTML
*) 100,500 requests in 2 seconds on an AMD Athlon 3700+ with lists of 5643 domains, 7442 URLs, 13780 total
**) squidGuard is not a porn or banner filter/blocker, but it is very well suited for these purposes too.
Capabilities
squidGuard has many powerful configuration options that let you:
1) define different time spaces based on any reasonable combination of
- time of day (00:00-08:00 17:00-24:00)
- day of the week (sa)
- date (1999-05-13)
- date range (1999-04-01-1999-04-05)
- date wildcards (*-01-01 *-05-17 *-12-25)
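To make the idea of a time space concrete, here is a minimal Python sketch of the matching logic. It is not squidGuard's code or its configuration syntax. The names `TimeSpace`, `minutes` and `matches` are hypothetical, date ranges are left out for brevity, and treating the weekday/time-of-day part and the date part as alternatives is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime
from fnmatch import fnmatchcase


def minutes(hhmm: str) -> int:
    """Convert "hh:mm" to minutes since midnight ("24:00" becomes 1440)."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


@dataclass
class TimeSpace:
    """Hypothetical model of a time space (illustration only)."""
    # (start, end) pairs in minutes since midnight; end is exclusive.
    day_ranges: list = field(default_factory=list)
    # Python weekday numbers (Monday=0 ... Sunday=6); empty means every day.
    weekdays: set = field(default_factory=set)
    # Exact dates or wildcards such as "1999-05-13" or "*-12-25".
    dates: list = field(default_factory=list)

    def matches(self, when: datetime) -> bool:
        minute = when.hour * 60 + when.minute
        day_ok = not self.weekdays or when.weekday() in self.weekdays
        in_hours = day_ok and any(s <= minute < e for s, e in self.day_ranges)
        in_dates = any(fnmatchcase(when.date().isoformat(), p) for p in self.dates)
        # Assumption: either part is enough to be "inside" the time space.
        return in_hours or in_dates


# Sample values taken from the examples in the list above.
leisure = TimeSpace(
    day_ranges=[(minutes("00:00"), minutes("08:00")),
                (minutes("17:00"), minutes("24:00"))],
    dates=["*-01-01", "*-05-17", "*-12-25"],
)
saturdays = TimeSpace(day_ranges=[(0, 1440)], weekdays={5})
```

In this model, linking something "within" a time space corresponds to `matches(...)` being true, and "outside" to it being false.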
2) group sources (users/clients) into distinct categories like “managers”, “employees”, “teachers”, “students”, “customers”, “guests” etc. based on any reasonable combination of
- IP address ranges with
- prefix notation (172.16.0.0/12)
- netmask notation (172.16.0.0/255.240.0.0)
- first-last notation (172.16.0.11-172.16.0.35)
- address lists (172.16.134.54 172.16.156.23 …)
- domain lists (foo.bar.com …) *)
- user id lists (weho sdgh dfhj asef …) **)
- and optionally link the group to a given time space
- positively (within business-hours)
- negatively (outside leisure-time)
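The three IP range notations describe the same kind of thing: a contiguous block of addresses. The following Python sketch, using only the standard `ipaddress` module, shows one way to reduce each notation to a first/last pair and test a client against a source group. The function names `parse_range` and `in_source_group` are hypothetical, and this is not how squidGuard itself is implemented.

```python
import ipaddress


def parse_range(spec: str):
    """Return (first, last) addresses for one range specification."""
    if "/" in spec:
        # Handles both prefix (172.16.0.0/12) and
        # netmask (172.16.0.0/255.240.0.0) notation.
        net = ipaddress.ip_network(spec, strict=False)
        return net.network_address, net.broadcast_address
    if "-" in spec:
        # First-last notation (172.16.0.11-172.16.0.35).
        first, last = spec.split("-", 1)
        return ipaddress.ip_address(first.strip()), ipaddress.ip_address(last.strip())
    # A single address from an address list.
    addr = ipaddress.ip_address(spec)
    return addr, addr


def in_source_group(client: str, specs) -> bool:
    """True if the client address falls inside any of the given ranges."""
    ip = ipaddress.ip_address(client)
    return any(first <= ip <= last for first, last in map(parse_range, specs))


# Sample group built from the notations listed above.
employees = ["172.16.0.11-172.16.0.35", "172.16.134.54", "172.16.156.23"]
```

Under this sketch the prefix and netmask forms of 172.16.0.0/12 yield the same pair of addresses, which is why they can be used interchangeably.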
3) group destinations (URLs/servers) into distinct categories like “local”, “customers”, “vendors”, “banners”, “banned” etc. based on
- an unlimited number of lists, each of unlimited length, of
- domains, including subdomains (foo.bar.com)
- hosts (host.foo.bar.com)
- directory URLs, including subdirectories (foo.bar.com/some/dir)
- file URLs (foo.bar.com/somewhere/file.html)
- regular expressions ((expr1|expr2|…))
and optionally link the group to a given time space:
- positively (within business-hours)
- negatively (outside leisure-time)
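The difference between a domain entry (which also covers subdomains) and a directory URL entry (which also covers subdirectories) can be sketched in Python as follows. The helper names `domain_matches` and `url_matches` are hypothetical, and squidGuard's real normalization of hosts and URLs may differ from what is assumed here.

```python
from urllib.parse import urlsplit


def domain_matches(host: str, domains) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    labels = host.lower().rstrip(".").split(".")
    # Try the host itself, then each parent: a.b.c, b.c, c
    return any(".".join(labels[i:]) in domains for i in range(len(labels)))


def url_matches(url: str, url_entries) -> bool:
    """True if the URL equals a listed entry or lies below a listed directory."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    candidate = (parts.hostname or "").lower() + parts.path.rstrip("/")
    while True:
        if candidate in url_entries:
            return True
        if "/" not in candidate:
            return False
        # Drop the last path component and try the parent directory.
        candidate = candidate.rsplit("/", 1)[0]


# Sample lists using the placeholder names from the text above.
banned_domains = {"foo.bar.com"}
banned_urls = {"foo.bar.com/some/dir", "foo.bar.com/somewhere/file.html"}
```

Matching whole labels and whole path components is a deliberate choice in this sketch: it keeps `notfoo.bar.com` and `/some/directory` from matching entries that merely share a prefix.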
4) rewrite/redirect URLs based on any reasonable combination of
string/regular expression editing à la sed with
- silent squid redirecting rewrite (s@from@to@[i])
- visible client redirecting rewrite (s@from@to@[i]r) ***)
URL replacement with
- silent squid redirect to a common URL (redirect "new_url")
- visible client redirect to a common URL (redirect "302:new_url") ***)
activated by
- 1-1 URL redirection
- destination group match
- a fallback/default for blocked URLs
- a fallback/default for blocked/unknown clients
and optionally with
- runtime string substitution à la strftime or printf
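The sed-style rules can be illustrated with a short Python sketch that parses a rule of the form `s@from@to@[i][r]` and applies it to a URL. The functions `parse_rule` and `rewrite` are hypothetical. Two assumptions are made: the first matching rule wins, and the replacement is taken literally (no back-references). Python's `re` syntax also differs in places from the regular expressions squidGuard uses.

```python
import re


def parse_rule(rule: str):
    """Split a rule like s@from@to@ir into (regex, replacement, visible)."""
    if len(rule) < 2 or rule[0] != "s":
        raise ValueError(f"not a substitution rule: {rule!r}")
    delim = rule[1]
    parts = rule[2:].split(delim)
    if len(parts) != 3:
        raise ValueError(f"expected s{delim}from{delim}to{delim}flags: {rule!r}")
    pattern, replacement, flags = parts
    regex = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
    # The "r" flag marks a visible (client-side) redirect.
    return regex, replacement, "r" in flags


def rewrite(url: str, rules):
    """Apply the first matching rule; return (new_url, visible_redirect)."""
    for rule in rules:
        regex, replacement, visible = parse_rule(rule)
        if regex.search(url):
            # A function replacement keeps the text literal.
            return regex.sub(lambda m: replacement, url, count=1), visible
    return url, False


# Hypothetical rule set; the host names are placeholders.
rules = [
    "s@://admin/@://admin.foo.bar.com/@i",
    "s@://old.foo.bar.com/@://foo.bar.com/@ir",
]
```

The second element of the result tells the caller whether the new URL would be substituted silently by Squid or sent back to the client as a redirect.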
5) define access control lists (acl) based on any reasonable combination of the definitions above by
giving each source (user/client) group
a pass list with any reasonable combination of
- acceptable destination groups (good-dests …)
- unacceptable destination groups (!bad-dests …)
- block IP address URLs (enforce the use of domain names) (!in-addr)
- wildcards/nothing (any|all|none)
optionally a common rewrite rule set for the source group
optionally a default replacement URL for blocked destinations for the source group
and optionally:
link the acl to a given time space
- positively (within business-hours)
- negatively (outside leisure-time)
defining a fallback/default ruleset
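A pass list reads naturally as an ordered series of tests. The Python sketch below evaluates one under the assumption that tokens are checked left to right, that the first token that applies decides the outcome, and that an exhausted list denies the request. The function names and the sample group names are hypothetical, and the destination groups a URL belongs to are assumed to have been worked out beforehand.

```python
import ipaddress


def host_is_ip(host: str) -> bool:
    """True if the URL's host part is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def is_allowed(pass_list, matched_groups, host: str) -> bool:
    """Evaluate a pass list such as ["good-dests", "!in-addr", "any"]."""
    for token in pass_list:
        if token in ("any", "all"):
            return True
        if token == "none":
            return False
        negated = token.startswith("!")
        name = token[1:] if negated else token
        # in-addr is a built-in test; other names are destination groups.
        hit = host_is_ip(host) if name == "in-addr" else name in matched_groups
        if hit:
            return not negated
    # Assumption: nothing matched, so deny.
    return False


# Sample pass list built from the tokens shown in the list above.
employees_pass = ["good-dests", "!in-addr", "!bad-dests", "any"]
```

With this list, a URL in good-dests passes immediately, a URL addressed by raw IP or found in bad-dests is blocked, and everything else falls through to `any`.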
6) have selective logging by optional log statements in the: ****)
- source/client group declarations to log all translations for the group (log "file")
- destination group declarations. Typically used to log blacklist matches. (log "file")
- rewrite rule group declarations to log all translations for the rule set (log "file")
and optionally anonymized to protect the individuals (log anonymous "file")
*) Client access control based on domain name requires enabling reverse lookups (log_fqdn on) in squid.conf.
**) Client access control based on user id requires enabling RFC931/ident in squid.conf.
***) Note: Visible redirects (302:new_url) are not supported by some interim versions of Squid (presumably 1.2-2.0).
****) Note: squidGuard is smart enough to open only one file descriptor per logfile (i.e. not necessarily one per log statement), per spawned process of course. Even so, logging to too many different files may exceed your system's limit on concurrently open file descriptors.
Portability
squidGuard should compile right out of the box on any modern brand of UNIX with a development environment and a recent version of the Berkeley DB library (4.x; older versions are supported too). squidGuard was initially developed on Sun Solaris-2.8 with gcc-2.95.3, bison-1.25 and flex-2.5.4. It is now maintained and developed under Gentoo Linux with the latest versions of gcc, flex and bison.
In the past, users have reported success on platforms including, but not limited to:
- AIX: 4.1.3, 4.3.2.0/egcs-2.91.66
- Dec-Unix: OSF1-4.0/gcc-2.7.2.3, 3.2C/gcc-2.7.2.3
- FreeBSD: 4.x-STABLE/gcc-2.95.3
- Linux: RedHat-5.2/gcc-2.8.1, RedHat-7.x/gcc-2.8.1, Gentoo 1.12.6/gcc-3.3.6
- Solaris: 2.6/gcc-2.7.2.3, 2.6/gcc-2.95.3, 2.8/gcc-2.95.3
- CentOS: 4.4