Thursday, January 8, 2009

Content filter

Many workplaces, schools, and colleges restrict the web sites and online services that are available in their buildings. This is done either with a specialized proxy called a content filter (both commercial and free products are available) or with a cache-extension protocol such as ICAP, which allows plug-in extensions to an open caching architecture.

Requests made to the open internet must first pass through an outbound proxy filter. The web-filtering company provides a database of URL patterns (regular expressions) with associated content attributes. This database is updated weekly by site-wide subscription, much like a virus filter subscription. The administrator instructs the web filter to ban broad classes of content (such as sports, pornography, online shopping, gambling, or social networking). Requests that match a banned URL pattern are rejected immediately.
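The outbound check described above can be sketched roughly as follows. This is only an illustration, not any vendor's actual format: the pattern list, the `.example` host names, and the function names are all made up for the example.

```python
import re

# Hypothetical vendor database: (URL regular expression, content category).
URL_PATTERNS = [
    (re.compile(r"^https?://([^/]+\.)?sportsnews\.example(/|$)"), "sports"),
    (re.compile(r"^https?://([^/]+\.)?casino\.example(/|$)"), "gambling"),
    (re.compile(r"^https?://([^/]+\.)?shop\.example(/|$)"), "online shopping"),
]

# Broad classes of content the administrator has chosen to ban (assumed).
BANNED_CATEGORIES = {"gambling", "online shopping"}


def categories_for(url):
    """Return every category whose URL pattern matches the request."""
    return {category for pattern, category in URL_PATTERNS if pattern.search(url)}


def is_request_allowed(url):
    """Allow the request only if none of its categories is banned."""
    return categories_for(url).isdisjoint(BANNED_CATEGORIES)
```

With these assumed settings, a request for `http://casino.example/` would be rejected immediately, while a URL that matches no pattern, or only the "sports" pattern, would be passed on to be fetched.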

Assuming the requested URL is acceptable, the proxy then fetches the content. At this point a dynamic filter may be applied on the return path. For example, JPEG files could be blocked based on flesh-tone matches, or language filters could dynamically detect unacceptable language. If the content is rejected, an HTTP fetch error is returned and nothing is cached.
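A minimal sketch of the return path, assuming a very crude word-list language filter. The word list, the `FetchError` class, and the `fetch` callable are placeholders invented for the illustration; a real filter would be far more elaborate.

```python
# Hypothetical word list standing in for a real language filter.
UNACCEPTABLE_WORDS = {"badword", "worseword"}


class FetchError(Exception):
    """Stands in for the HTTP fetch error returned to the client."""


def passes_language_filter(body):
    """Accept the body only if it contains none of the listed words."""
    words = {w.strip(".,!?;:\"'()").lower() for w in body.split()}
    return words.isdisjoint(UNACCEPTABLE_WORDS)


def fetch_through_filter(url, fetch, cache):
    """Fetch url with `fetch`, filter the response, cache it only if accepted."""
    if url in cache:
        return cache[url]
    body = fetch(url)
    if not passes_language_filter(body):
        # Rejected content is reported as a fetch error and is never cached.
        raise FetchError(f"content rejected for {url}")
    cache[url] = body
    return body
```

Here `fetch` is any function that takes a URL and returns the page text, and `cache` is a plain dictionary. Accepted pages end up in the cache; rejected ones raise `FetchError` and leave the cache untouched.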

Most web filtering companies use an internet-wide crawling robot that assesses the likelihood that a piece of content is of a certain type (e.g., "This content has a 70% chance of being porn, a 40% chance of being sports, and a 30% chance of being news" could be the outcome for one web page). The resulting database is then corrected manually, based on complaints or known flaws in the content-matching algorithms.
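To make the idea concrete, here is a small sketch of how crawler scores and manual corrections might be combined. The 0.5 threshold, the data layout, and the names are assumptions for the example only. Each score is treated as an independent likelihood, which is why the figures for one page need not add up to 100%.

```python
THRESHOLD = 0.5  # assumed cut-off for assigning a category to a page


def classify(scores, threshold=THRESHOLD):
    """Turn per-category likelihoods into a set of assigned categories."""
    return {name for name, likelihood in scores.items() if likelihood >= threshold}


def resolve(url, crawler_scores, manual_overrides):
    """Prefer a manual correction for the URL; otherwise use the crawler scores."""
    if url in manual_overrides:
        return set(manual_overrides[url])
    return classify(crawler_scores.get(url, {}))


# Hypothetical crawler output for the page used in the example above.
crawler_scores = {
    "http://page.example/": {"porn": 0.70, "sports": 0.40, "news": 0.30},
}

# Hypothetical manual correction entered after a complaint.
manual_overrides = {"http://page.example/": {"news"}}
```

With no override, the page above would be filed under "porn" only. Once the manual correction is in place, `resolve` returns "news" instead.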

Unfortunately, web filtering proxies cannot peer inside HTTPS (HTTP over secure sockets) transactions. As a result, users wanting to bypass web filtering will typically search the internet for an open, anonymous HTTPS proxy. They then configure their browser to send all requests through the web filter to this anonymous proxy. Those requests are encrypted with HTTPS, so the web filter cannot distinguish them from, say, legitimate access to a financial website. Thus, content filters are only effective against unsophisticated users.
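The difference can be illustrated by what a filter is able to read from the first line of a proxied request. A browser asks a proxy to open an HTTPS tunnel with the HTTP `CONNECT` method, which names only a host and port. The function name and the `.example` hosts below are made up for the sketch.

```python
from urllib.parse import urlsplit


def visible_to_filter(request_line):
    """Return what a proxy filter can read from the first line of a request."""
    method, target, _version = request_line.split()
    if method == "CONNECT":
        # HTTPS tunnel: the target is just "host:port". The path, headers and
        # body travel encrypted inside the tunnel and are not visible here.
        host, _, port = target.rpartition(":")
        return {"host": host, "port": int(port), "path": None}
    # Plain HTTP through a proxy: the full URL appears in the request line.
    parts = urlsplit(target)
    return {"host": parts.hostname, "port": parts.port or 80, "path": parts.path or "/"}
```

For a plain request such as `GET http://news.example/sports/scores HTTP/1.1` the filter sees the whole URL. For `CONNECT bank.example:443 HTTP/1.1` and `CONNECT anonproxy.example:443 HTTP/1.1` it sees the same shape of request, a host and a port, and nothing about what is fetched through the tunnel.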

A special case of web proxy is the "CGI proxy". These are web sites that let a user access another site through them, and they generally use PHP or CGI to implement the proxy functionality. They are frequently used to reach web sites blocked by corporate or school proxies, a use often labeled "Proxy Avoidance". Since they also hide the user's own IP address from the web sites accessed through them, they are sometimes also used to gain a degree of anonymity.
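As a rough sketch of how such a site might carry the real destination, assume the target URL is passed to the proxy script as a query parameter. The script address `http://cgiproxy.example/browse.php` and the parameter name `u` are hypothetical, and real CGI proxies differ in how they encode the target.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical address of the CGI proxy script.
PROXY_SCRIPT = "http://cgiproxy.example/browse.php"


def proxied_url(target_url):
    """Build the URL the browser requests: the proxy script plus the target."""
    return PROXY_SCRIPT + "?" + urlencode({"u": target_url})


def target_from_request(requested_url):
    """On the proxy side, recover the target URL from the query string."""
    return parse_qs(urlsplit(requested_url).query)["u"][0]
```

From the point of view of a filter that matches on host names, a request built this way goes to `cgiproxy.example`, not to the blocked site. The blocked site in turn sees a request coming from the proxy rather than from the user's own IP address.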


Add by FiQ@Xtechnology®Network.Server
YOUR SERVER STATION
