How to put pirates (and bad aggregators) in their place

Aarrr, matey. Here be a tale of a blogging practice that makes ye look like a bilge rat pirate.

As a rule, you should check your referrer logs periodically. It's usually good practice because you find out who's linking to your work. But once in a while you'll also find a site that's either copying your content outright without permission or embedding links to your media (images, MP3s, etc.) in its pages and essentially pirating your bandwidth. This morning I found a site that was embedding links to my images in their page. Avast! The image on the right shows a bit of their page and how I'm replacing the images (see the "I take me revenge" section below for how this works).

My site publishes complete entries in its RSS feed. Because of that, other people's web-based aggregators are able to republish my content in its entirety. In the best case, a blogger uses a web-based aggregator to watch feeds and post the ones they like, excerpting each entry. In the worst case, they republish your entire entry without attribution. I don't know what this site owner was doing, but I noticed that their blog was basically aggregating other people's posts. There doesn't seem to be any original content. In my case, they didn't excerpt; they copied my entire blog entry verbatim. What pisses me off is that it looks like they wrote the article.

I suppose it's partly my fault for putting full entries in my RSS feed rather than excerpts, but I do this so that people can read my blog in their aggregators without having to actually go to my site. This is the downside, I suppose. Web-based aggregators will republish whatever they get.

I take me revenge

To play with them a little, I now replace images referenced from another site with a STOP image. I hate to have to do this, because it messes up the images for legitimate aggregators. I suppose you could be really malicious and post a hardcore porn picture in there instead to make things look even worse. I'm not that malicious.

You can do the same thing if you find that someone is pirating your media. Using altlab's examples for dealing with bandwidth theft, I modified the .htaccess file on my server to include these lines.

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?urlgreyhot\.com/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule \.(jpe?g|gif|bmp|png)$ img/pirate.png [L]

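To use this code on your site, replace the domain in the first RewriteCond with your own domain and change the RewriteRule to use the path to your stop image. Each RewriteCond has to sit on a single line. The second RewriteCond lets requests with an empty referrer through, so people who open an image directly still see it.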
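If the blanket rule is too blunt because it also swaps images in legitimate web-based aggregators, one option is to add an extra condition for each site you want to let through. This is only a sketch: aggregator.example is a made-up placeholder, not a real service, and the rest mirrors the rule above.

```apache
RewriteEngine On
# Your own domain, as in the rule above.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?urlgreyhot\.com/ [NC]
# Hypothetical trusted aggregator; substitute one you actually want to allow.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?aggregator\.example/ [NC]
# Requests with no referrer at all pass through untouched.
RewriteCond %{HTTP_REFERER} !^$
RewriteRule \.(jpe?g|gif|bmp|png)$ img/pirate.png [L]
```

Conditions are ANDed by default, so the image is swapped only when the referrer is none of the allowed sites and is not empty.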

I thought about this for a few minutes. Because I don't want to do this to everyone, I can use the code below to block requests from that one domain only:

RewriteCond %{HTTP_REFERER} ^https?://(www\.)?badsite\.net/ [NC]
RewriteCond %{HTTP_REFERER} !^$
RewriteRule \.(jpe?g|gif|bmp|png)$ img/pirate.png [L]

To use this code on your site, replace the domain in the first RewriteCond with the domain of the bad site and change the RewriteRule to use the path to your stop image. Take that, ye scurvy lubber!
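If more than one site turns up in your referrer logs, the same idea should extend with the [OR] flag, which joins a condition to the next one with OR instead of the default AND. A minimal sketch, assuming two offenders: badsite.net is the stand-in from the example above, and otherbadsite.example is a made-up placeholder.

```apache
RewriteEngine On
# Match if the referrer is either offender; [OR] joins this condition to the next one.
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?badsite\.net/ [NC,OR]
# Hypothetical second offender; replace with a domain from your own logs.
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?otherbadsite\.example/ [NC]
RewriteRule \.(jpe?g|gif|bmp|png)$ img/pirate.png [L]
```

Everyone else, including legitimate aggregators, keeps getting the real images.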

Yo ho! Here be Mr. Krabs' bit of advice to ye
This is why you should always look at your referrers! Be smart about RSS aggregation and blogging. If you are going to use an RSS aggregator to feed your blog, be sure to excerpt and ALWAYS link to the original article and attribute the author.

Update
Moments after I did this, they must have seen the replaced image, because they removed the copied entry from their site.

Comments

01 Nebojsa
02/04/06 @ 11:44

Hi.
I am sorry that I published your content on my site in an illegal way. Please accept my apology; I will not do that anymore.

However, my site is not as nicely designed as yours, but I am trying to build a site with content I am interested in. I carefully collected all the feeds I found in my area of interest, and I always try to give authors the credit that the CC licence asks of us... It's my mistake. It could be that I misread the licence.

However, your post about grid design was great and I will try to redesign my site so I hope that you will post about me as a good and not bad example.

One more time, sorry.
N.

02 jibbajabba
02/04/06 @ 12:50

No worries, Nebojsa. I figured your aggregator just snarfed the entire entry. Anyway, it gave me a chance to try out a new redirect rule and to point out a good practice for webmasters.

Thanks for acknowledging my entry and correcting the oversight.

-m

03 greggles
02/21/06 @ 11:42

Before I saw your image idea, I thought you were going to put in a paragraph with bold explanations of feed stealing that was included in a {div class="hideme"} that would correspond to a "hideme" class with a hidden directive in the CSS on your site (but not on any pirate's sites). Your solution works just as well.

The only drawback to that hidden text element is that you wouldn't want to include any links in the block because at least some search engines seem to dislike hidden links as a form of cloaking/malicious SEO that can negatively impact your site rank.

04 jibbajabba
02/21/06 @ 11:52

Clever idea, but I don't want to add any text kruft to my database that would have to be removed later. Takes up space and is untidy.

05 Anonymous
12/08/06 @ 20:57

I have found one occurrence where somebody has put my name on a bogus testimonial. How does this affect my site's PR?
What are the drawbacks for this?

Jeffrey a. Solochek
http://www.nosugarcoating.info
