[wp-testers] .htaccess question

dayaparan ponnambalam daya at meyshan.com
Sat Nov 10 16:10:46 GMT 2007


You can usually get a quick answer in the support forums; I have had many
myself.

Here is what I found by googling; maybe it will be helpful to you.

The Referer request-header field allows the client to specify the address
(URI) of the resource from which the Request-URI was obtained. (HTTP itself
spells the header name "Referer".) It is a way for an HTTP client to send,
in the headers, the URI of the page that sent the visitor there. This is
especially handy for a site administrator, because it gives insight into
where the traffic on the web server is coming from. The most popular web
server log analyzers also depend on it to provide statistics on the most
common referrers.

The HTTP Referer header is very useful, but it is also completely
arbitrary. Any web browser or HTTP client is free to send a forged Referer
header with any request to a web server. Spammers have taken advantage of
the fact that HTTP makes no provision for authenticating it, and they use
this openness to craft requests with their own website in the Referer
header.

Most people find it difficult to understand why someone would bother
spamming something that only the site administrator will see in the logs.
One probable motivation is boosting search engine ranking. Another is
simply to show up in any stats published by the site. If the site being
spammed runs web server log analysis software and publishes the results,
the spammer's URL appears in the top referrers section.

A serious consequence of referrer spam is that it is often performed via
an HTTP GET or POST request, which retrieves the entire body of the
document being spammed. For a 30k document, for example, all 30k is
transferred across one's Internet pipe. This adds up to a considerable
amount of web server traffic, which can be very costly since bandwidth is
not cheap.

Referrer spam wastes CPU and disk space and can be a source of endless
annoyance to server operators. Search engine developers are actively
fighting it, so its initial effectiveness in boosting a site's ranking has
been considerably lessened. However, the problem persists and much remains
to be done to conquer it.

Some recommended practices for countering referral spam include not
publishing referrers on the blog, listing the page in robots.txt when
referrers have to be published, using the rel="nofollow" attribute, and
gathering a cleaner list of referrers using JavaScript and beacon images.
Some bloggers have begun fighting referrer spammers at the .htaccess
level. Others have even taken steps to automate this.

*Blocking Users by Referrer Notes*

A very useful feature of .htaccess is the ability to block visitors who
arrive from a particular domain. When there are tons of referrals from a
particular site and no visible link to one's own site anywhere on it, the
referrals probably aren't legitimate. The other site is most likely
hotlinking to certain files such as images, CSS files or other files.
Blocking access by referrer in .htaccess requires the Apache module
mod_rewrite, which inspects the referrer first. There is a concern that
spam will still come in even as the .htaccess file continues to grow.
Blacklisting certain referrers in .htaccess is another option, but its
effectiveness has been greatly diminished by the ease with which spammers
can register thousands of domains and rotate them as quickly as they are
blacklisted.
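
For the question asked in this thread, a minimal sketch of such a rule
might look like the following. It assumes mod_rewrite is enabled and that
the server allows rewrite rules in .htaccess; forum.example.com is only a
placeholder for the unwanted forum.

```apache
# Assumes mod_rewrite is loaded and AllowOverride permits it in .htaccess.
RewriteEngine On

# Match requests whose Referer header points at the unwanted site or any
# of its subdomains. "forum.example.com" is a placeholder host name.
# [NC] makes the comparison case-insensitive.
RewriteCond %{HTTP_REFERER} ^https?://([^/]+\.)?forum\.example\.com(/|$) [NC]

# Answer matching requests with 403 Forbidden instead of the page.
RewriteRule .* - [F]
```

This only stops requests that actually carry that Referer header. Someone
who types the blog address directly, or whose browser sends no referrer,
still gets in.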

An .htaccess generator can be used to prevent people from certain IP
addresses, domains or even countries from gaining access to a site or to
specific folders. To block a specific IP, type the full IP address. To
block a range of IPs, type a partial IP address. To block a particular
domain, type the domain without the www. To block a country, type its tail
extension (the country-code top-level domain).
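
As an illustration, the code such a generator produces would typically
resemble the directives below. This is a sketch in Apache 2.2 syntax, not
the output of any particular tool, and every address and name in it is a
placeholder.

```apache
# Apache 2.2 access control (mod_authz_host). On Apache 2.4 these
# directives are only available through mod_access_compat.
Order Allow,Deny
Allow from all

# Full IP address: blocks exactly one client (placeholder address).
Deny from 192.0.2.10

# Partial IP address: blocks every address starting with 198.51.100.
Deny from 198.51.100

# Domain without the www: blocks clients whose host name ends in it.
Deny from example.com

# Tail extension: ".xx" stands in for a country-code top-level domain.
Deny from .xx
```

The last two forms rely on reverse DNS lookups of the visitor's address,
so they are slower and only catch clients whose host names resolve into
that domain. Note also that these rules act on the visitor's own address,
not on the referring site.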

There is no limit to the number of entries, which are added one at a time.
Select "add" after each entry, then copy the generated code and paste it
into a plain text file. Name this file .htaccess. Note the "." before the
file name and the absence of any tail extension.

If there is already an .htaccess file in the document root or in the
folder where the rules are to apply, add the generated code to the end of
the existing .htaccess file, taking extra care not to disturb the existing
code. Then upload the file in ASCII mode.

*The rel="nofollow" solution*

A coalition of blogging and search engine companies has joined together to
support an HTML attribute designed primarily to combat comment spam, but
which has high potential for use against referral spam as well. This
attribute, rel="nofollow", is being praised by many bloggers as the
ultimate solution to the prevailing problem. The idea is simple enough;
the hardest part is convincing the major players such as Google, Yahoo!
and MSN to agree on it.

Tagging a link with the rel="nofollow" attribute prevents it from
contributing to the linked site's PageRank. This means that comment and
referral spammers will not be rewarded for their illegitimate activities
on websites that implement the attribute. This solves the problem
partially, but it cannot end it.

The reason is that 100% adoption is impossible to reach, so there will
always be an incentive to spam. Spammers essentially do not care whether
their techniques work on any specific site, as long as they work in
general. They need no particular reason to hit any one site, because their
main target is the blogosphere as a whole. It is also quite unfortunate
that the resources required to fight spam, particularly referral spam, are
far greater than the resources needed to create it.

Referral spam is an HTTP request. The client doesn't even need to
acknowledge the response. All it may need is a simple packet with formatted
text.

Spammers take pains to make a request look legitimate. The User-Agent
string will look very much like that of MSIE. It used to be that spam came
from a single IP, but things have definitely gotten more complex since
then.

Referrer IPs can also be filtered against spam blacklists. If the IP is
blacklisted, avoid listing the referring URL in any section of the site's
web stats. Once a given site is identified as a referral spam host name,
do not pursue the query any further.
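
One way to keep such hits out of the stats, sketched here under the
assumption that the stats are generated from the Apache access log and
that you can edit the server or virtual-host configuration (CustomLog is
not allowed in .htaccess). The address, the host name spam.example, the
log path and the variable name spam_ref are all placeholders.

```apache
# Flag requests from a blacklisted client address (placeholder IP).
SetEnvIf Remote_Addr "^192\.0\.2\.10$" spam_ref

# Flag requests whose Referer names a known spam host (placeholder name).
SetEnvIfNoCase Referer "^https?://([^/]+\.)?spam\.example(/|$)" spam_ref

# Write the access log only for requests that were not flagged, so the
# log analyzer never sees the spam referrers. Assumes the "combined"
# log format is already defined by a LogFormat directive.
CustomLog logs/access_log combined env=!spam_ref
```

This does not block the requests themselves; it only stops them from
being recorded and later published.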


On 11/10/07, cpa31335 <tpblogeditor at gmail.com> wrote:
>
> what would I add to my .htaccess file to prevent a referral from a
> website?
>
> I have some people who like to come to my blog, from a web forum that I'm
> not too fond off... and I'd like to prevent referrals from there.
>
> can you all help. I didn't use the support forms, because no one ever
> answers in there.
>
> thanks,
>
> -Chuck
>
>
>
> --
> Chuck Adkins
> http://thepopulistblog.com
> _______________________________________________
> wp-testers mailing list
> wp-testers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-testers
>



-- 
Dayaparan
Executive Director
NHIT
Trichy
India
*Mobile:* 919965370000
daya at 19hourit.com
http://www.19hourit.com

