Scraping Off the Blog Scrapers

Money from advertising income motivates blog scrapers to steal content. It’s only a matter of time before you discover your copyright has been violated and your content is now duplicated on a site you don’t want to be associated with. So let’s consider the impact of that duplication and how to counter it.

All Rights Reserved – Any content reblogged from one cool site must adhere to the terms of © Copyright

When full copies of your content appear anywhere other than on the site it was originally published on:

  • your brand has been diluted;
  • your traffic has been siphoned away from your site;
  • backlinks from the duplicate lack value, as they usually come from sites with little or no pagerank, and sadly your article is now associated with those sites or even with banned sites;
  • too many backlinks from low quality sites can compromise your blog’s positioning in the SERPs;
  • the duplicate content can outrank the original content in the SERPs (search engine results pages).

Splog off!

These days convenience rules. Instead of composing our own new posts and including backlinks to related articles in them, we click reblog buttons. We are forgetting that backlinks to our articles transfer pagerank, thereby contributing to the authority our sites have within our niche.

Reblogs do pass pagerank, when there’s any to be passed. However, we’ve all witnessed the kind of blog that wouldn’t exist if it weren’t for throwaway comments and like button clicks from similar sites where content is primarily prompted posts, awards, product reviews, competitions, contests, other memes and reblogs.

What is the value of a reblog backlink from a site with no pagerank or very low pagerank that’s light on original content, but rich in prompted posts, awards, product reviews, competitions, contests, other memes and reblogs?

Not much is my polite answer. Blogs with content of that nature are so close to being splogs that I cringe when my posts are reblogged on those sites, as I know that too many backlinks from low quality sites can compromise a blog’s positioning in the SERPs.

“Help! Facebook thinks my blog is spam” can be the plaintive cry of a blogger posting to the support forums. I always provide the correct answer, despite the fact that I have visited the blog in question and usually concur with the robotic analysis and the error message it gave rise to.

Have you analyzed your site to make sure you don’t link to spam sites? Frequently spam sites that are not obvious will offer widgets, “awards” and images for people to use in their blogs and there is adware/malware attached. If you’ve hotlinked from any of those sites, then you’ll have to remove it and flush your Facebook cache before Facebook will reclassify you.

1. Try flushing your FB cache but disconnect from FB first.

2. Disconnect the Publicize app and reconnect it to the correct account. Try the full reconnect procedure detailed in the support documentation again, including step 3 (removing the FB app), which is probably the most important step.

3. See also publicize troubleshooting.

4. Then you can report it as a false-positive from within the error message Facebook provides.

No Google juice and no page view stat

When we click reblog, share and like buttons off-site without reading the original article where it was published, the result is that the button clicking doesn’t create a page view stat. It seems we are moving towards a model where we place a higher value on likes and  shares resulting from autoposting to social media sites thereby replacing backlinks and actual blog visits.  Well, that doesn’t make me happy because off site button clicking doesn’t result in visitors who locate the rest of my blog content.

Countering blog scrapers

Below are 12 steps I take to ensure that full copies of my content appear only on the site it was originally published on.

  1. I use a theme that is designed to display an author byline on each article that’s linked to all my other posts published on the same site.
  2. I have set my RSS Feeds to “Summary” rather than “Full” on this page > Settings > Reading.
  3. I either insert “the more tag” following an introductory paragraph into each article prior to publication, or I use a theme that automatically provides excerpts on the Front page, Archives, Categories and Tags pages.
  4. I insert a copyright notice with an embedded link to my copyright page into each article.
  5. In each new article I backlink to my earlier related articles using appropriate anchor text.
  6. A few hours after publication I copy and paste a sentence from the body of my article into Google search and do a duplicate content search.
  7. In addition I have set up daily Google Alerts for my domain names and my pseudonym.
  8. From time to time I also use Copyscape to search for duplicates of my content.
  9. I also use Plagium occasionally to track plagiarism.
  10. When my content has been stolen I file a DMCA take down notice.
  11. I report scraped content ranking above or instead of the original to Google.
  12. I recommend watermarking your images.
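The duplicate content search in step 6 can be sketched in a few lines of Python. This is only a minimal illustration, not a tool the steps above rely on: the function names `exact_phrase_search_url`, `shingles` and `overlap` are hypothetical, and the five-word shingle size is an arbitrary assumption. The first function builds an exact-phrase (quoted) Google search URL from a sentence of your article; the other two estimate how much of your original text reappears in a page you suspect is a copy.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(sentence: str) -> str:
    """Build a Google search URL for an exact-phrase (quoted) query."""
    phrase = " ".join(sentence.split())  # collapse runs of whitespace
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's word shingles also found in the suspect text."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0  # original is shorter than one shingle
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)
```

Open the URL in a browser to run the search. If you paste the text of a suspect page into `overlap` alongside your own article, a value near 1.0 suggests a full copy, while a low value suggests a quotation or excerpt at most.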

Search engines like Google aim to provide the most relevant and fresh results to users. Blog scrapers aim to make advertising income from the content they steal. The duplicate content blog scrapers create pollutes search results causing frustration and stealing time from search engine users.  So it’s not surprising that Google is scraping blog scraped content from their search results.

 How can I make sure that Google knows my content is original?

Can you benefit from content scraped from your site?

Discussion

Google Authorship gives you the ability to tell Google that you are the author of the content. And connecting  your Google+ Profile to your WordPress.com blog creates an official connection between your posts and your Google+ account.

Is your author name embedded in every article*?
Is your copyright notice embedded in every article*?
Do you backlink to your internal related content in every article*?

*The same applies to all digital products such as PDF files, audio files, videos, etc.
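The three questions above can be turned into a rough self-check on the HTML of a published post. The sketch below is an assumption-laden illustration only: `audit_post`, `my_domain`, `author_path` and `copyright_path` are hypothetical names, and it assumes your author archive and copyright page live under recognizable URL paths on your own domain. It uses Python's standard `html.parser` to collect links and reports which of the three kinds are present.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def audit_post(html: str, my_domain: str, author_path: str, copyright_path: str) -> dict:
    """Report whether a post links to its author archive, copyright page and other own posts."""
    collector = LinkCollector()
    collector.feed(html)
    # Keep only absolute links that point at the blogger's own domain.
    own_paths = [u.path for u in map(urlparse, collector.hrefs) if u.netloc == my_domain]
    special = (author_path, copyright_path)
    return {
        "author_link": any(p.startswith(author_path) for p in own_paths),
        "copyright_link": any(p.startswith(copyright_path) for p in own_paths),
        "internal_backlinks": sum(1 for p in own_paths if not p.startswith(special)),
    }
```

For example, `audit_post(post_html, "example.com", "/author/", "/copyright")` returns a small dictionary you can scan for a missing author link, a missing copyright link, or a count of zero internal backlinks. Relative links are ignored in this sketch, so a theme that emits them would need a small adjustment.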

Hat tip to aivlys who inspired this article.

Hat tip to ♡eM who inspired the Splog Off! section by prompting me to reflect on “the good old days”.

Related posts:
Reblogging Questions Remain Unanswered
Guest blogging for SEO gets the boot
Plagiarism, Attribution, Citation, Quotation

65 thoughts on “Scraping Off the Blog Scrapers”

  1. This past few weeks I’ve gotten in a hurry or a bit lazy and stopped adding my own links to related posts at the bottom of each post. Must start that again – even if linked back within post. I like the idea of inserting that cr notice in each post with a link back

    It’s so annoying to have to be so protective, but it is my writing and stealing it is wrong and illegal. Do people even understand what “illegal” means anymore?

    Good post – will probably reread again in future.

      1. The enable feature is clicked on. But have found it selects the oddest post to link to – so I usually add my own at the end.

        Been meaning to ask someone – does WP always put any video you put in a post as the teaser in Reader. Recently it seems they jump over a couple of strong images/pictures at the first and use the video instead. The way I craft my post, that defeats any surprise feature /O’Henry type ending with the video as the closing explanation/discovery. So I’ve just stopped putting in videos and just post a link to it. A bit annoying

        Thanks for all your help in your posts with information and ideas. Appreciate it all.

        1. I know what you mean by odd related post choices. This Expound theme has the related posts function designed into the theme itself. The posts selected do not seem to be either category or tag based choices.

          Setting that aside, I have added my own “Related posts found in this blog” links for years to the bottom of all my posts. Now we have the ability to select them in the link box selections but that didn’t use to exist at all.

          About image display in the Reader go to this post http://onecoolsitebloggingtips.com/2014/03/09/worrisome-or-not-wordpress-com-reader-developments/ > and scroll down to the Reblogging and Media section

          Thanks for the thanks. I sincerely appreciate the laughs I get from reading your unique and thought provoking posts. :)

    1. Cute isn’t it? It’s a free image we can use without any attribution so help yourself. I love all animals but I’m primarily a horse and dog person.

  2. Thank you for taking time to answer my questions so thoroughly. I’ll be back tomorrow to read it again, and go through it like a checklist to see what changes I need to make. Lots of food for thought here!

  3. Great advice, timethief. A few questions: If I have my name on every post, is that the same as embedding it? I also include the copyright info on every post and it is posted on my main page. If this is not embedding, then how do I do that?

    Thanks for helping us keep control of what we create.

    1. No it isn’t but it’s a good inclusion. You type your name into each post but embedding is different. Look at the top of my blog posts and note the software provides my name and it’s a live link: timethief. It’s an embedded author link and when you click it you find all the posts I am the author of on this blog. If anyone duplicates my content that author link is embedded and helps identify me as the author as well as identifying the original site it was published on.

      To get the software to do this on your blog, simply use this workaround for creating a single author byline display. http://onecoolsitebloggingtips.com/2012/01/03/wordpress-com-changes-bylines-for-authors/

      Your copyright notice on the front page at the end of each post is well done. What “embedded” is getting at is what Matt Cutts referred to in the second video. If your content is stolen and it contains links in the text back to your site then those links identify your site as the original source.

  4. As a creative writer I worry about my content, but thanks to articles like this one I can fight a good fight for my work. Thank you for the tips.

  5. I have nothing else to add except you have tenacity. I like the featured cat photo a lot. Stealthy, smooth….scraping along the fence. :)

  6. [ Smiles ] Apparently, they disregarded the term, “Thou shall not steal”.

    Besides, if someone steals your work, you can take legal action because your work is copyrighted.

  7. This is excellent advice TimeThief, I’m going to incorporate your suggestions w/n how I run my blog. Thank you as always.

    1. Dear Mary,
      I’m so glad to read that response. Your skill in oil pastels is so impressive and your artwork is so lovely that I quail at the thought of it being duplicated anywhere.

      1. Thank you so much Timethief for your generous comments. I have some of your suggestions in place, but I think this weekend I’ll work on getting the others. The thing that we all can’t escape is that whether your image is on a blog, website, gallery website, dailypainters, etc. it’s out there and I’m not too sure what you can do to stop theft of property except not post. But that then takes away our options – attorneys can send cease and desist letters, but they are only as good as the sheet of paper they are written on. Your advice is well-taken and is one step to perhaps stop the majority of folks. Thanks again – your articles and posts are always valuable.

  8. I wish I understood all of this better, but quite frankly this is where I need to improve my skills as a blogger. For example, I don’t understand how I would insert a tag at a specific place in a post–“the more tag,” for example. I’m a good writer. I’m horrible at this sort of thing and actually find it confusing–which is why I find your blog so helpful. Thank you!!!

    Hugs from Ecuador,
    Kathy

    1. Hi Kathy,
      You use the Untitled theme now and it creates excerpts automatically on the front page of your blog. Know that “the more tag” can only be used once in any post and what it achieves is what your theme does automatically. If you change themes in the future you can find “the more tag” button in the Visual Editor, Row 1, icon number 12 (looks like a broken page).
      http://en.support.wordpress.com/visual-editor/
      http://en.support.wordpress.com/splitting-content/more-tag/
      In the text editor the icon looks like this [more].

      As you are both an author and an artist my advice is: get on top of this Kathy. The last thing you need is having your unique content duplicated.

      Hugs from me in the Gulf Islands in the Salish Sea.

    1. Hi there,
      You are most welcome.

      May I please make you aware of a section of the support docs that you need to read and act on?
      All reasons for non-appearance of posts on the WordPress.com Topics (categories and Tags) pages and in the Reader can be found here > http://en.support.wordpress.com/topics/#missing-posts

      If you have assigned a combined total that exceeds 15 Categories/Tags to your posts, you need to edit and delete. http://en.support.wordpress.com/posts/edit-posts-screen/ After removal of the excess categories and/or tags note it may take several days for your posts to begin displaying there.

      You are inadvertently spamdexing AKA tag spamming. And, you ought to be more concerned about your posts not appearing and being well ranked on Google and Bing SERPS (search results pages) than not appearing on the WordPress.com Topics pages because it’s search engines that send a significant flow of traffic to blogs and they can choose to bury your content where the sun don’t shine for tag spamming.

      The rule of thumb is to assign the least, not the most, combined number of only relevant categories and tags that accurately describe the post content. For tips on tagging see > http://onecoolsitebloggingtips.com/2013/03/15/quick-blog-post-tagging-tips/

    1. Hi there,
      It’s good to know you think my list is valuable. I do have a 12th point but I’m thinking it may be better made in an upcoming post.

  9. This is great. I had no idea about any of this. I operate a very small blog and thought I needn’t worry about this kind of thing. You’ve made me think again. The idea of people stealing my stuff is infuriating! Thanks for sharing and explaining it simply for novices like me. :0)

    1. Hi there,

      Your collection of published posts may be small now but it will grow. Your writing is engaging and your content is worthy of protection, Olivia.

      The idea of people stealing my stuff is infuriating!

      I feel the same way and roll my eyes skyward while biting my lips, lest I respond to those who claim to be flattered by being ripped-off in a way that flattens them. Though I’m introverted I am not shy. I’m highly opinionated and I’m quite capable of flattening fools, who swoon over becoming the tools of content thieves, but I try not to do it very often, if at all.

      1. Thank you very much for your reply and for taking the time to read my blog :0) I googled a few of my articles and nothing has been ripped off so far. I will take some of the measures you have outlined and try to become more familiar with all of this. Thanks again!

  10. This is wonderfully informative, thank you! I understand everything except #6 on the list? Only a few hours after something goes Live? Isn’t it unlikely that anyone would scrape when it’s so freshly published? I think I’m not understanding correctly. This posting was MUCH appreciated.
    Stephanie

    1. Hi there,
      We are all in different time zones, and although our content may be scraped off the RSS feed immediately after publication, in some cases splogs won’t re-publish (i.e. duplicate) it until a few hours after it’s published. Also note that it takes time for the Google bots to ascertain which is the original site of publication. If you rely on a Google search done only immediately after publishing, you may be lulled into a false sense that everything is fine because the duplication has yet to happen. A quick Google search for a unique string of words from your own content, done more than once daily, may turn up some unpleasant revelations such as full article duplication and instances of plagiarism too.

        1. I’m glad your focus is clear now. You write so well and put so much into your posts that taking basic precautions to ensure no duplicates exist online makes sense.

  11. My current theme (I just switched to a new one) does not have an author name listed. I wonder if WP should make that an automatic feature of all their themes.

    1. Staff’s position is that the point of displaying an author name is to differentiate the posts on a multi-author site, so they made a change a while back. Regardless of the theme, and provided the theme is designed to display a byline, bylines will only display now if there are at least two authors who both have at least one published post in the blog. There’s a single author workaround here. http://onecoolsitebloggingtips.com/2012/01/03/wordpress-com-changes-bylines-for-authors/

  12. Very informative! :)
    For an ordinary blogger like myself, all the safety measures seem too tricky to accomplish. Scrapers, as you mentioned, will always find a way to circumvent the systems. I just hope the money they earned will end up paying for their eventual illness that has no cure. That’s justice, I believe.

    1. Hi there,
      Karma is always in operation and like you I know that content thieves making an income from what they steal from others are dis-eased people. I’m so happy that you commented here as I discovered your blog and found you are a skilled storyteller so I’m now following it. Pleased to meet you. :)

      1. Well said. I agree. :)
        I have visited the Forums a lot and you have been most helpful. I do admire your patience to assist us, who often get lost in the technological maze. :)
        I am glad you found my writing enjoyable. Thank you for the kind words. It means a lot to me, :)
        Blessings,
        belsbror

  13. This has already happened to me. I’ve found my posts in surprising places. Most of the time, people contact me to ask if they can republish a post. And most of the time, I give them permission. I like to share.

    Recently, though, another blogger republished an original guest post I’d written for another blog. The blogger linked to my blog, but posted it without permission from me. I wonder if the blogger received permission from the blog at which my guest post had been published. I suppose I could find out.

    But since the other blog “owns” the rights to the guest post, there isn’t much I can say or do about it now. I imagine bloggers feel that if they link to the writer’s blog or credit the writer, it’s okay to republish posts.

    What do you think about that? Is it an etiquette matter or more?

    Thanks, Time Thief, for always sharing your blogging know-how and helping me to make sense of the blogosphere.

    I appreciate and admire you.

    Smiles!
    ♡eM

    1. When I read this:
      “another blogger republished an original guest post I’d written for another blog” I snarled. I advise contacting the blogger whose blog your guest post was published on and requesting that she/he takes the required action to get it taken down.

      “What do you think about that? Is it an etiquette matter or more? ”

      Etiquette never trumps the law! What you point to is a copyright violation and what comes with the territory of being a blogger AKA a publisher is a knowledge of and compliance with the relevant law.

      1. I’m certainly not pleased that this caused you to “snarl”, but it’s just one more bit of putting our writing out there. I am always happy to share my writing, but do so appreciate when people ask instead of take. I will contact the editor of the blog and let her know what I discovered. Then, I suppose, it is up to her on how to approach the blogger.

        You’re right about etiquette not trumping copyright law.

    2. same here about finding my posts or some of my original texts elsewhere… can’t stop them! :D I refuse and I avoid, I’m simply unable to fight with ghost thieves/virtual plagiarists!
      * * *
      stay “cool” and serene! cheers, Mélanie

      1. Dear Melanie,
        I try to stay cool in a dry ice kind of way … lol :D Plagiarists and blog scrapers really get my goat. We may not be able to stop all of them but we can take steps to reduce the duplicate content that pollutes search results and impacts negatively on our blogs by employing sensible counter measures and I hope you take those steps too.

      2. Yes, ghost thieves are invisible, aren’t they? If we are willing to post our writing, we must be willing to let go of our ownership of it.

        Thanks for sharing a positive way of seeing right through this situation. :)

        1. I disagree because everyone online leaves digital breadcrumb trails and can be tracked down. I do not give up ownership of my words. I do not surrender copyright. But I do accept your thanks. :)

          1. Honestly, it really comes down to whether I’m willing to take the time to go ghost hunting. ;) I’m glad you do. You are a force, a protector of our words. Thank you.

            I actually had a work taken after sending in a hard copy manuscript many years ago. It was a tough lesson to learn, trusting in copyright law instead of other people. I wrote to the publisher, but never received any response. Since then, I’ve protected all of my submitted writing. The only piece ever published, though, was the one that was taken. It was good.

            I suppose if and when my writing becomes more than a squeeze it in in my spare time (actually, I take the time) passion, I will become a word warrior as well. :)

  14. Do the 3 related posts at the bottom of each post (created by WordPress) count as back linked internal content, or should I be more proactive?

    Thanks for the tips, additional to what I already do.

    1. To put this in perspective I have been doing this manually for years. I don’t find that the 3 automatically provided related posts are necessarily the related posts I would choose to link to, so I include my own in the body text and if required at the end of my posts.
