Monday, 2 April 2012
Attack Of The Clones: Bonkers Sees Doubly Red In A Battle With Content Scrapers
Now my blog is already listed on a number of “aggregator” or directory sites like Tree Hugger or Day Life, and I have no problem with any of those. On such sites just the title and a bitesize taster of your post will typically be shown. Readers have to click on the link below - which will say something like: “Full article on (Site Name)” – if they wish to carry on reading, whereupon they are routed back to the original blog post. Thanks to this “titbit”-quoting approach, RSS feed sites are no threat at all to blog owners, and may well serve as additional sources of referred traffic.
“A scraper site is a spam website that copies all of its content from other websites using web scraping.”
One more sentence if the Wikipedia authors don’t mind, and then I am done.
“Some scraper sites are created to make money by using advertising programs.”
Too right they are! Yes, this one plonked my posts up on its site, hedged them all around with Google ads, and then sat back and watched as the revenue poured in...or maybe just trickled in, one small monetised drop at a time - who knows? Hey, it is the principle I object to, the piggybacking on someone else's time and effort in the hope of financial gain.
So I fired off a pretty stern email to the site owner, in which I came over all UPPER CASE, WHICH IS NOT LIKE ME AT ALL. I will, however, reproduce my complaint in normal text, to preempt mass eyestrain on the part of readers:
“Please take down all content from my blog, Bonkers about Perfume - you have basically stolen my blog in its entirety for your own purposes and I will pursue the matter further if you do not remove it from your site
“And tobe clear we are only rss index directory (no scraping).”
And to be clear, if the lifting wholesale of material, including text and images, does not constitute scraping, I sincerely hope never to encounter a bona fide content scraper in a dark alley.
So anyway, by the end of the afternoon my blog content had been excised from the rogue site – scraped off, no less - though not without the site owner pointing out how I was in fact cutting off my SEO nose to spite my face:
“FYI to be in out site it is good for you SEO & traffice wise but it is no longer relevant for you…”
Note the suspenseful dots pointing to my ill-judged decision….
Well, as it happens, having checked them out, the content scraper’s site doesn’t appear to register on Google’s page rank scale of 0-10, whereas Bonkers is a 3, i.e. "off the starting blocks", you could say. (For anyone not familiar with this techie blog rating malarkey, Google page ranking is one key measure of a blog’s importance in cyberspace - for more on this see point 4 of my recent post on blogging here.) So I fail to see why my ranking would be improved by an association with an unranked site, or for that matter why I would get any referred traffic to speak of, given that entire blog posts of mine were available for people to read over on the scraper site!
Scents of Self, our go-to “in-perfume-community” Hebrew speaker, so I sent her the text and asked if she could confirm it was in fact Hebrew, and if so, whether she could kindly find a moment to tell me the gist. Ari was keen to help, but finding herself unfamiliar with some of the “Internet-y” terminology, she had the bright idea of running the text through Google Translate, and came up with the following:
"There is not even one character of copied content on this site. Display pages are a type of display window that broadcasts RSS within the online conversion of standard RSS to a display state that a web surfer is able to browse. And that every change that occurred in the original broadcast (feed) varies respectively in our website online. The donor site publishers with the broadcast channels are promoting Google and directing visitors. If you still want us to remove the display page of your transmission source, please contact us through the button "contact" and send us the download link."
So they would be the donor site publisher, I take it? Funny that I should feel as though I am the (unwitting) donor in all this...like those bodies harvested for organs without the prior permission of the deceased or their next of kin. And as for this business of "displaying" as opposed to "scraping", well, that is a nice point of semantics. By that logic, pubs which screen Sky TV football matches using foreign satellite decoders are merely "displaying" the games, rather than filching them in any more reprehensible manner.
Well, what have I learnt from this unsettling incident? Firstly, that it really does pay to make a direct approach in the first instance to the site which has copied your material, as this avoids having to seek out alternative avenues of complaint that may lie deep within the bowels of Google or Blogger. I also found out that fellow blogger My Perfume Life is being scammed in exactly the same way, and have dropped her a line to this effect. And I have proved once more that I can call upon friends in Perfume Land – even on a "sudden death" basis - for help on all manner of random topics. : - )
Lastly, I learnt (for the umpteenth time!) never to write a post on Blogspot software, even though it supposedly saves drafts as you go along. Beastly Blogger contrived to eat this post completely - or, you could say, to scratch it off the "compose" window with a single gouge of its heartless fingernails - just as I was doing a final proof. Which rather raises the question:
“Where’s the blinkin' duplicate post when you really need one?!”
And I would be interested to know if anyone else out there is aware of having been cloned, scammed or scraped in a similar fashion?
If so, did you take any action against the scrapers, or did you decide to go with the outflow? : - )
Since writing this account of my own experience, Tarleisio of The Alembicated Genie has written a powerful and moving post on the subject of content theft - see link below:
Phantoms in the Fumosphere
Photo of print scraper from drsmith7383 via Flickr CC, photo of boot scraper from sywlch via Flickr CC, other photos my own.