A Call to Arms Against Unscrupulous Autobloggers

For those who don’t already know, autoblogging is much like it sounds. It is a term that was coined around 2006 to mean that a computer can do the job of a human blogger automatically, by using a robot to go out on the web and acquire content to be published on the owner’s site.

It sounds OK in principle, and it could be argued that a product such as Google News is, in a way, an autoblog, because all it does is listen to RSS feeds from around the planet and then re-link them on its own page.

However this is usually where the comparison stops. Most autobloggers (be that the application or the blog owner) do not repeat the information verbatim but slightly change it to make it look like their own content, and then pack the rest of the page with advertising.
For example, I have made a plugin called Share and Follow that is available through the WordPress Plugin Repository and from my own website. Yet if I start looking a little harder for reviews and things like that, I find a massive number of sites that have taken parts of the readme.txt and placed them on their own site, where they offer nothing more than a link back to WordPress and a bunch of advertising. This is useless to anybody but the autoblog owner, who can charge per page impression via advertising.

There is even an autoblog that has chosen the niche of ‘errors’, so if someone is looking for help on an error, they can end up at a site that has no help but just a bunch of links to keywords about errors. This one I personally find annoying as they are suggesting to the world that there are issues or problems where there are none, and are also making money from advertising on it.

Others throughout the web are equally angry, as their full articles or blog posts are being harvested and sometimes repeated verbatim on a secondary site with yet another set of advertising around them. I can safely say that popular sites like this one, SpeckyBoy, will have all of their articles ‘stolen’ and used elsewhere.

Why do people Autoblog?

There seem to be only a handful of reasons to autoblog:

• To be able to have many blogs running at the same time
• To get a higher search ranking, as Google, Yahoo et al. believe that a site with regularly updated content is more valuable than others.
• To get more advertising space.
• They are too lazy or uncreative to make their own content
• They don’t want to employ real people and cut into their advertising profits
• To get backlinks to their site from an unsuspecting blog owner, in the hope of increasing their PageRank
• To make their company or group look more important to their customers as they have all these great bloggers “working” for them

Is Autoblogging legal?

One could easily argue that copying any content placed on a site with a copyright notice on it is a breach of copyright. Yet we are often all guilty of opening a legal backdoor to anybody who wants to copy our content. That backdoor is, of course, our RSS (Really Simple Syndication) feed.

By offering an RSS feed we are allowing anybody the legal right to copy our content.

There are some additional rules that are supposed to cover RSS, like:

• The content must not be changed from its original format
• All links must stay the same
• If you choose to put a copyright message on your RSS feed it should be adhered to
• Full attribution of the content must remain with the original Author

As we know, rules are meant to be broken, and these rules are often completely ignored. We must also remember that rules differ throughout the web world. The web itself does not worry about boundaries, but the rule of law does.

So, for example, in Holland it is currently 100% legal to download any content available on the web (even pirated movies on torrents, if you don’t watch them after download), but totally illegal to upload or re-transmit that content without prior agreement from the content owner. In countries like America, however, it’s a Federal Offense to do any of that without explicit agreement from the content owner.

The major point is that different countries have different laws, and a clever autoblogger could host in a country that has loose copyright laws. That gives them more flexibility in what they publish and makes it harder for anyone to actually take them to court.

Would you try to take a Chinese or North Korean autoblogger to court if you’re based in the US?

As an example of RSS without copyrights, you can do what you like with Gov. Arnold Schwarzenegger’s XML RSS feed from Posterous, including taking the images, as they are all there in the feed. Ironically, there is no author listed, so you could not credit them, but you would have to link back to the site to stay on the right side of the law.
I am sure he would come and terminate your ass if you took a thing from the Office of the Governor site where there is no RSS feed.

When is Autoblogging not legal?

OK, forgetting for a moment that anyone can host in a country with loose laws, I’m going to focus on the general laws that exist in most countries, such as the USA or any of the European countries.

Autoblogging is not legal at all when any of the following conditions are true:

• Ignoring explicit copyright condition on the RSS feed
• Not recognizing the original author
• Copying the images from the original post and putting them on the new site, when the images are not part of RSS
• Not linking back to the originator of the content
• Taking any content from a site that does not have an RSS feed, and contains a copyright or implied copyrights
• Transferring the RSS feed content into another language
• Taking the full content from a site where the RSS feed is a summary content
• Taking content or adapting content from a site without a Creative Commons, or Copy Left policy

So for example it is legal to take (and re-write) from Wikipedia, but not if you don’t link back to them or at least provide the URL of the original source.

Is there such a thing as a good Autoblogger?

As you will see for yourself from looking around the web, there are many blogs that are just vehicles for gaining advertising revenue, but there are also many sites that are run with a White Hat attitude.

Take the example from the beginning of this article, Google News: it gets news from around the world in different languages and republishes it on its news pages. This helps us all.

Here is another scenario of where autoblogging could be productive for the general public:
Let’s imagine that we’re running a site that deals with car insurance. It is quite a dull subject, but one needed by everyone. I feel it would be great if the owner of an independent brokerage republished news via RSS from all of the insurance companies, or from newspapers writing about car insurance, alongside the standard content on their site and their own comparison engine for comparing or selling insurance. In WordPress terminology, they would autoblog other people’s news to the sidebar while creating their own content for the main blog.

Using this example of the sidebar for autoblogging, it would satisfy Google’s desire for regularly updated content and higher SEO, but as a concept would definitely cut into that advertising space that so many autobloggers want. After all, who wants traffic with no income? Certainly not the autobloggers.

Autoblogging’s biggest drawback

Quite simply, autobloggers never create their own content. If everyone was going to autoblog then there would be an ever decreasing circle of content that we all see over and over again. The web would become a very boring place.

Protecting yourself from autobloggers

There are a few key ways to protect your content from autobloggers, or at least to make it clear to the reader that the autoblog is not the origin of the content. You could:

• Set up your RSS feeds to be summary only, not the full content of posts (this can have an adverse effect on your readership, as people may not click through).
• Add a copyright notice to your RSS feed. Something that allows your readers unhindered access to your feed, but does not allow for re-publishing without prior consent. Do make sure that you have targeted both your own country and international copyrights.
• Add more images to your content with watermarks on the images, so that you suggest to the reader that this content all came from elsewhere. It helps if the images are an essential part of the reading process, i.e. references to graphs or other visual tools.
• Add more video with buffers, headers and footers that advertise your site.
• Use a “pro” account for your video footage that gives you the possibility to say where things can be embedded or not; i.e. on your site only.
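The first two suggestions above can be sketched in code. This is only a minimal illustration, not how WordPress or any particular feed plugin does it: the function names, the post dictionary keys (title, url, body) and the wording of the notice are all made up for the example. It relies on the channel-level copyright element that RSS 2.0 provides, trims each post to a short summary, and repeats the notice inside every item so that it travels with the content.

```python
import xml.etree.ElementTree as ET


def summarize(text, max_words=40):
    """Return the first max_words words of text, marking any cut with '...'."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " ..."


def build_feed(site_title, site_url, notice, posts, max_words=40):
    """Build a summary-only RSS 2.0 feed and return it as a string.

    posts is a list of dicts with the hypothetical keys title, url and body.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Summary feed for " + site_title
    # RSS 2.0 defines an optional channel-level copyright element.
    ET.SubElement(channel, "copyright").text = notice

    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        summary = summarize(post["body"], max_words)
        # Repeat the source link and the notice in each item, so that a
        # verbatim copy of the item still points back to the original.
        ET.SubElement(item, "description").text = (
            summary + " Read the full post at " + post["url"] + ". " + notice
        )
    return ET.tostring(rss, encoding="unicode")
```

An autoblogger can of course strip the notice out again, so this makes the origin obvious to readers rather than making copying impossible.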

Here are some useful copyrights resources:
10 Big Myths About Copyright Explained →
Copyright Term and the Public Domain in the United States →
Intellectual Property Laws by Country →

Finding out who is copying your posts

There are a number of ways to find out who is copying your content. My personal favorite at the moment is CopyScape.

As an example I wrote a nice article about why in my opinion Adobe Fireworks is better than the Photoshop/Illustrator combo as a web design product. As you can see from this link (my article via CopyScape), there are at the time of writing 5 places on the web where people have taken this content and made it their own, in some cases they have taken it verbatim and snapped up the images too.
Don’t forget that these people care so little for you that they will often also link back to your images in their posts so that they don’t have to pay anything to host it. This can cost you a whole lot more than just a day or two angry at them.
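If you would rather check a suspicious page against your own post yourself, a rough comparison is easy to script. The sketch below is not how CopyScape works (I have no idea what they do internally); it is a generic technique that splits both texts into overlapping runs of words ("shingles") and measures how many runs they share. The function names and the run length of 5 are assumptions chosen for the example.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word runs of length size in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(original, suspect, size=5):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a = shingles(original, size)
    b = shingles(suspect, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

A score close to 1.0 suggests a verbatim copy, while a lightly reworded copy still scores well above an unrelated page. Where to draw the line is a judgment call, so treat the number as a reason to look more closely and not as proof.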

Dealing with duplicate content (the slow way)

Contact the site owner and demand in a polite way that they remove your content. You are within your rights to do that as the site owner or as the author of the article, so as the author you don’t have to go through the blog you posted on to get action taken if you are already fuming mad. Please note: most autobloggers don’t care.
You can also submit a spam request or duplicate page request to Google via the webmaster tools (I’m yet to find the equivalent for Yahoo or Bing; maybe somebody will comment with that info).
Read this page on the DMCA (Digital Millennium Copyright Act) and report the site to Google via a FAX or LETTER, not an email.

Do not expect quick action from these people or companies, and as most autobloggers work on their own servers and domains, there is not usually a governing body to help you.

At the other end of the scale, Blogger.com spends many hours removing what it considers to be copyright-infringing items, as it does not want to be caught with its pants down in a position where it holds the legal responsibility. So if you complain to them when you see duplicate content on a Blogger site, they will work damn quickly to remove it, often without the knowledge of the owner of the blog.

But there are other ways to get back at these people much quicker.

Hitting back at the autobloggers where it hurts… in their pocket

The best way to get back at the autobloggers is to hit them in their pocket. Most autoblogger websites use AdSense, handing the advertising space to Google to populate, as no autoblogger wants the hassle of managing their own advertising space. After all, they are lazy people in the first place.
Google AdSense has Terms of Service that it must uphold, whereas it bears very little responsibility under the DMCA, so it is much quicker to get help through this route.
All you need to do to get Google’s ear is click on the “ads by Google” link. It will take you to a page owned by Google with a title at the bottom of the page saying “Report a policy violation regarding the site or ads you just saw”. From the drop-down list, choose websites and select from the possible complaints. You can tell them exactly what is going on: that this is duplicate content and that it is only there for the purpose of selling advertising.

One of the many Google AdSense terms and conditions is “Google ads, search boxes or search results may not be: Placed on pages published specifically for the purpose of showing ads.”

See here for the rest of the policies: Google AdSense Program Policies.

Search engine responsibilities, do they do enough to combat this?

Any good SEO person worth their salt will tell you that Google recognizes duplicate content and will mark down those sites in the search engine rankings. Remember the old adage “Content is King”.

Well they will also tell you that regularly updated sites will place higher in the search engine rankings as they are considered to be more active than others.
It seems that the balance of power there is in the hands of the sites that regularly update their content, be it copied or not.
In my mind this is a shameful state of affairs, as any search engine should be able to work out who came first and link to that one, or put that one at the top of its listing, not the autoblogger’s site.

In my opinion, regular offenders (say, more than 2 articles or blog posts with legitimate complaints) should be removed from the search engine rankings completely, except for the contact page of that site. Sites that do not have a contact page should be removed from the registers altogether.

If the autobloggers cannot benefit from this activity, they will move on to another get rich quick scheme.

Author: (13 Posts)

Andy Killen has been working exclusively with the web since 1994, when he was Intel's internet engineer for Europe. Today he runs his own company, phat-reaction.com, is the CTO of Speckyboy and was also part of the Adobe Fireworks CS6 beta test team. When Andy is not busy making websites, themes and plugins or giving training on website performance, WordPress or Node.js, he can be found enjoying life in Amsterdam.

  • http://www.bojandjordjevic.com Bojan

    When you publish content on the internet, if it has any quality it will be copy pasted and republished elsewhere.

    Current DMCA laws are flawed and should be restructured, since they don’t bring good to humanity in general, rather to individuals.

    Ideas can’t be stolen, because you can’t put property tag on them.

    • http://phat-reaction.com andy killen

      Bojan,

      yes it is possible for people to copy and paste things, but using the tools outlined in this post you will see that you can

      a) ask for it to be removed
      b) get their advertising turned off
      c) make it obvious to whoever is reading it that it belongs to you and not the place where it is posted.

      Ideas are easy, I get thousands every day, doing them is the hard part.

      If you want to keep an idea as yours you also have to patent it in every possible region and therefore have it on paper/the web.

      Yes the DMCA is flawed as it was defined and created essentially by the film and music industry to protect their pocket, using government as the tool of implementation.

    • http://www.speckygeek.com Pritam @ Specky Geek

      I guess the author is not talking about copying of ideas, but blatant stealing of entire content. I guess at least 50% of the Web is spam or copy-pasted stuff.

  • http://www.web2panorama.wordpress.com Krisztian

    Hi Andy,

    I fully agree with you. But if this post is a call to arms, what’s next? Bloggers unite? Setting up a Facebook page to publicly name and shame autobloggers? Or what?

    Best,

    Krisz

    • http://phat-reaction.com andy killen

      Hey Krisz

      a Facebook group would be less than useless in these current social networking times. Nobody cares there, but you have prompted me to think about making 1 or 2 things: a website to name and shame, and some sort of plugin for CMSs that finds out who is doing what.

      But on a personal level you can still complain to Google. Maybe if the AdSense people get enough complaints the search people will take note and put these people much lower down. After all, it’s search engine rankings and the possibility of being clicked on that the autobloggers want.

      regards
      Andy

  • Patty

    This is extremely interesting and useful. I can see how it can be used in all kinds of copyright violation cases, not just for autobloggers (the autobloggers’ case is more relevant when it comes to violations of the Google AdSense terms).

    Thank you so much for writing about this, it should greatly help the blogging community.

    Definitely putting this article in my best read of 2010 ;)

    - Patty

    • http://phat-reaction.com andy killen

      Thanks Patty, you’ve made me a very happy man.

  • John

    What about protecting your content from being stolen?

    You can disable copy-pasting functionality, which is quite lame, but it is possible to find better solutions with PHP or plugins.

    I know that some people will claim that content protection is pure evil, but is it better to have your website systematically scraped?

    • http://phat-reaction.com andy killen

      John,

      Maybe you can do that at a JavaScript level in a browser, but it does not stop a server or RSS client from pulling apart your XML RSS feed.

      Unless you know of some good RSS tools to deal with this?

  • Anarca

    Copying is not stealing. Even if it’s made so easy, thanks to PCs, it isn’t. That’s what the lobbyist propaganda tries to make the most naive believe.

    • http://phat-reaction.com andy killen

      I guess you did not read any of the copyright resources, or the part about being a legal copy. If you copy and make a reference to the original work, author and site, then you’re cool. But if you don’t, then you are stealing. It’s why newspapers have to put “source” for anything that is not theirs (they normally get it from ANP or Reuters, who would be mighty pissed if they did not reference them).

      So in essence paying homage is OK, but taking without reference is not cool at all.

      If you read the external links about copyrights you will see that taking only a subsection is still breaking copyright.

      So yes, copying is stealing. And no, it’s not the DMCA that has set the trend here but the Berne Convention, which has been around for over 100 years.

      • Anarca

        Copyright is just laws passed with lobby pressure to protect the corporate interests. It’s called Corporatism.

        I’m not saying there aren’t laws, I’m saying they’re wrong and immoral if not unethical.

        You still hold the original content. It’s like saying you have a bike, and I build a bike that looks the same as yours, maybe even works as yours, or better, and you say I’m stealing. How is that stealing? “Oh it’s the law”… Right?

      • http://www.gonzotimes.com PunkJohnnyCash

        Stealing? No, stealing means you take something of someone’s: they no longer have it, you have it. Copying is not theft. It may be ethical to link back or credit, but copying is not stealing. They are two different things.

        • Andy killen

          Please explain the term identity theft?

          Where I read in dictionaries about theft it is about “stealing a product or service” not just taking something away from the original owner so that they no longer have the use of it.

    • Menno

      Copying maybe not, but publishing it is.

      • http://phat-reaction.com andy killen

        Menno

        So what you are saying is….

        “you can copy but if you share it (publish it) then you are out of order.”

        Well no, you are incorrect. If you copy it at all you are breaking copyright, be that in your thesis, blog, laptop or wherever else.

        It’s like saying “Microsoft won’t mind if I copy and use Word on my machine as long as I don’t send a document anywhere, as I have not bought it”

        With a name like Menno I’m guessing you’re Dutch and are arguing semantics from the wrong side of the fence. As clearly stated in this blog post, Holland has different rules to almost all the rest of the world. But as you should know, that is changing thanks to BREIN. http://www.anti-piracy.nl/home/home.asp (BREIN’s homepage)

        But let’s look at an example for you, guessing that you are based in Holland: you find the latest Jamie Oliver cook book on the web as an ebook. You download it and use it in any way whatsoever (share it or read it yourself)… that is definitely breaking copyright.

        It’s no different for any other item. You can’t claim “fair usage” or any other supposed get-out clause here to say that it’s OK.

      • http://www.gonzotimes.com PunkJohnnyCash

        Nope

        • http://share-and-follow.com/ Andy Killen

          PunkJohnnyCash,

          Just because you have an anti-authoritarian attitude does not mean that there are no laws. And yes, on your site you have chosen a copyleft policy.

          But let’s imagine that one of your essays is used by somebody else and passed off as theirs by digitally pre-dating your publication, with no credit to yourself whatsoever. How would you feel if they then got success in their career based on that?

          Whether you like it or not, copyrights were set up to protect the common man, artists and authors (see the Berne Convention). They have been around for over 100 years and were predominantly meant to stop people selling authors’ works in countries other than the one where they were created, i.e. created in the UK and then sold in France by another party without any money or credit returned to the original author.

          Why shouldn’t people feel the benefit of their hard work?

  • jahmaicherry

    Unscrupulous autoblogging is much like littering for profit. As a web content consumer I detest what the tactics of RSS usage have become.

    I wonder what the television world would be like if it used the same tactics.

    I support any decisions and tools that can help bloggers to keep their content free of abuse.

    Where can I, as a consumer, report any abuses I find?

    • http://phat-reaction.com andy killen

      Same thing, you can use the AdSense route, but as you are not the originating content maker it will be harder for you to prove your case.

  • http://www.webdesign2day.com raybak

    Copy-pasting the entire blog is a serious offence. But most of the autobloggers don’t do that. They read the RSS and post it in blogs, so one thing you can do is reduce the number of words in your RSS so that it only publishes the title and first few words, plus a link to read the full post.

    • http://phat-reaction.com andy killen

      Making derivative works without referencing the original author and site, is also a breach of copyrights.

      If you don’t copyright your RSS and have full posts in it, then they are in their rights to copy and paste it directly into their blog, as long as they don’t change a thing, and reference you as the source/author.

      If you put partial posts in your RSS and have full posts on your site, and people copy your full post, then no matter what it is a breach of copyright.

      If they do partial posts with advertising all around it they break the T&C of Google AdSense.

  • http://iamautocomplete.com angelee

    I think not only blog articles are copy-pasted, but even logos and some premium / privately used graphics. I’ve read a few complaints before about the “copyright issue”, and as for autobloggers, it’s quite tedious to stop them, especially now that the internet has become wider than the universe.

    • http://phat-reaction.com andy killen

      Yes, every graphic artist I know has used copyrighted images as a basis for their designs at some point or another.

      Again, if they advertise using that logo, and you can prove it, and it’s eating into your corporate identity, you can get their revenue attacked through AdSense.

      BUT you are very right, this is annoying to say the least, and time consuming. So I think it is wise to choose your battles carefully, as Pyrrhic victories are not the best.

      Also you must note that there are other trademarks and copyrighted images that people/originating companies want you to copy and re-use to hype them (Facebook, Twitter, YouTube etc…).

  • http://bizdharma.com Himanshu Chanda

    I agree with your points, and to some extent I am also concerned about my content getting copied.

    But at the same time I have 2 thoughts. One is: is a complete full stop possible? The other is: does it really matter?

    I mean, let them copy and go to hell, they really don’t bother our rankings!

    I say so because there is cool article-spinning software that will ensure your content is copied and yet you can’t tell.

    Lastly, I believe that if Google just starts reducing the importance of exactly similar articles on the basis of the time they were posted, it will still solve much of our problem.

    Won’t it?

  • http://share-and-follow.com/ Andy Killen

    Hey Himanshu

    Yes I totally agree, if Google/Yahoo/Bing could work out what came first and put that at the top it would all be fine. But it is not as simple as that. Let’s say that you’re a newspaper and you have bought the same news from ANP (Associated News Press) as has been bought by 50 other newspapers across the globe. Who comes first then?

    Or to put it a different way: you create a new website with a new blog post (the first one). It will not rank as high as others, as it has no incoming links (see PageRank). Also, Google will place a site that has more content (regular postings) higher than a site that has just one post, if they show the same content.

    It’s a bit of a minefield, based on the current set of SEO rules that dictate where things are placed on Google.

    I think it needs some sort of human interaction here to weed out the “bad” sites, as a straight algorithm in my mind won’t be able to deal with this. I still think that, to be effective, the AdSense people need to share more with the Search people about who the complaints are lodged against.

  • 4rjnd

    Hi,
    I am not supporting your views,
    let’s take this blog: did you create all the resources that you post on this blog? Look how many similar websites there are. The same icon resources get posted here and elsewhere. With autoblogging it’s just automated.