Plagiarism on the Web: How To Detect It and Prevent It

Imagine putting your heart and soul into a piece of work, only to see it passed off by somebody else as their own. Many people know what this feels like because it happens all the time on the Internet. The whole cut-and-paste simplicity of copying on the Web makes it irresistible to people who want to build content without having to actually write anything.

Plagiarism, as this is known, is sometimes described as copying or borrowing somebody else’s work or ideas. This description plays down the seriousness. It is stealing, plain and simple. The fact that the content is online in digital format doesn’t lessen the offence. If you take a look at The Learning Center, a website that specialises in teaching all about plagiarism, you will see the various definitions:

  • To steal and pass off (the ideas or words of another) as one’s own
  • To use (another’s production) without crediting the source
  • To commit literary theft
  • To present as new and original an idea or product derived from an existing source

Plagiarism doesn’t just mean word-for-word copying of somebody’s work. Passing off somebody else’s ideas as your own is plagiarism too.

Apart from the fact that somebody has stolen something that you have written, plagiarism can also damage the website that hosted the original content. Site owners increasingly rely on search engines to direct visitors to their content. If Google becomes aware of two copies of an article, how does it know which one to direct people to? Even more damaging for the original author is the case where the search engine finds the copied version before the original. Asserting yourself as the original author in this situation can be very tricky indeed.

We will discuss how to deal with plagiarism later on in the article. The first problem we must address is finding which content has been copied in the first place.

Finding Plagiarised Content on the Web

Using Search Engines

The first place people usually go to find out what has been copied from their website is Google. Using a search engine can be quite time-consuming, as you must search for each page that you want to check. It can be quite useful, however, as you can spot minor changes to titles and content.

Fortunately, most content scrapers (automatic bots that harvest the internet for content to copy) don’t bother to change the title of the article they are copying. This means that a simple search for the article title will usually return any copies. This is not always the case and some more devious cheats will change certain words. Bear in mind, however, that some words in the title cannot be changed as the title will no longer make sense. Use this to your advantage. Let’s look at an example.

Let’s take this popular article from Articles Base entitled Hairstyles for Round Face Shapes – What Hair Styles Will Suit your Round Face Best. Let’s search Google to try and find if it has been used anywhere. Please note that in this example the article is designed to be reprinted on many different websites. Anybody using the article is free to do so and is doing nothing wrong. We will just use it as an example.

Put the article title in quotation marks and search for it on Google. The quotation marks will ensure all the words are kept in the right order. You can see the results for the search below.

Google search results for our article name

We can then look down the list of results to find who has copied our article.
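If you have a lot of pages to check, the quoted-title search described above can be scripted. The following is only a minimal sketch: the function names are made up for illustration, and the search URL format is an assumption based on how Google search links commonly look, so it may change. The second helper covers the case of reworded titles by quoting only the phrases a copier is unlikely to alter.

```python
from typing import Iterable
from urllib.parse import quote_plus

# Assumption: the commonly seen Google search URL format.
SEARCH_URL = "https://www.google.com/search?q="


def exact_title_query(title: str) -> str:
    """Wrap a title in quotation marks so the words must appear in order."""
    # Drop stray quotation marks and collapse repeated whitespace.
    cleaned = " ".join(title.replace('"', " ").split())
    return f'"{cleaned}"'


def key_phrase_query(phrases: Iterable[str]) -> str:
    """Quote each phrase separately, for titles that may have been reworded."""
    return " ".join(exact_title_query(phrase) for phrase in phrases)


def search_url(query: str) -> str:
    """Build a search URL that can be opened in a browser."""
    return SEARCH_URL + quote_plus(query)


if __name__ == "__main__":
    print(search_url(exact_title_query("Hairstyles for Round Face Shapes")))
    print(search_url(key_phrase_query(["Round Face", "Hair Styles"])))
```

The script only builds the links. You still need to open each one and look down the results yourself, exactly as in the manual approach.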

Using Copyscape

Copyscape (copyscape.com) is a search engine designed specifically to detect copied content on the web. You give it a URL and it will try to find any copies of your page. The advantage of Copyscape is that it doesn’t rely on the title being the same. Bear in mind, however, that the results aren’t always plagiarised copies. Extended quotes will also be flagged up by Copyscape. As long as these quotes are properly attributed to your site, they are acceptable. The image below shows the results when searching for the example URL that we discussed above.

Copyscape screenshot for our article URL
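To get a feel for how a page can be matched on its body text rather than its title, here is a rough sketch that compares two pieces of text by the runs of words they share. This is not how Copyscape works internally (that is not described here); it is a generic illustration, and the function names and the default run length of five words are assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word runs ("shingles") found in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one run: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's shingles that also appear in the suspect."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)


if __name__ == "__main__":
    mine = "The quick brown fox jumps over the lazy dog near the river bank"
    theirs = "Intro text. " + mine + ". Some extra text added by the copier."
    print(containment(mine, theirs))
```

A score close to 1 means most of your text appears in the other page. As with Copyscape, a high score can also come from a long, properly attributed quote, so check the page before accusing anybody.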

What to do When You Find Copied Content

Contact the Website Owner

The first thing worth trying is simply asking the owner of the website to remove the content (known as sending a “cease and desist” letter). Knowing that they have been caught may be enough to persuade a site owner to remove the stolen content. Make sure you email them or send them a letter rather than simply leaving a comment on their blog or website. Stick to the facts and make the letter very formal. Don’t get personal. Just let them know that they have reproduced your original work without permission, and request that they remove it by a deadline that you set. This will work for a small number of sites. If it doesn’t, move on to the next step. Note that some people disagree with contacting somebody who has stolen your content. They say you shouldn’t negotiate with thieves. Personally, I disagree and think that any action that may get the content removed is worth pursuing.

Contact the Website Host

The next step is to contact the web host. You can locate a website’s host by performing a search on Domain Tools. If the site uses subdomain hosting such as WordPress.com or Blogger, you will need to contact those services directly. Hosting companies are usually very keen to avoid any possibility of legal action, so most will take your complaint seriously.

International issues mean you will have varied success with contacting a host. In the USA, sites are bound by the Digital Millennium Copyright Act, but with international hosts the rules vary. Threatening a host with being prosecuted or sued is pointless as you will never be able to follow through with the threat. As we learned when looking at libel and slander on the Internet, international laws for this type of situation simply don’t exist.

As with contacting the website owner directly, simply prepare an email, letter or fax to the host setting out the facts. Again, be formal, polite and stick to the facts. There is a link to sample letters below.

Ask Search Engines to Remove the Content

The goal of most online plagiarism is to attract visitors. It is possible to have stolen content removed from the major search engines by contacting them directly. While this doesn’t remove the offending content from the Web, it will probably reduce the number of visitors to the site and make it much more likely the thief will stop. A lot of the major search engines make it quick and easy to notify them of stolen content. You can look at an extensive list at Plagiarism Today.

As you can see, plagiarism is quite an extensive topic that has caused headaches and heartbreak for many. Dealing with it can be time-consuming and frustrating, but persisting will eventually result in having the stolen content removed.

To read more about plagiarism you should definitely take a look at Plagiarism Today. They also have some good cease and desist letters you can edit and send out for yourself. It is definitely worth a look if you are interested in learning more.

Discussion

Comments on “Plagiarism on the Web: How To Detect It and Prevent It”

  1. Thanks for the mentions and the links! I just want to make myself available to anyone who needs help with these issues.

    Also, one thing to consider is that, in addition to Copyscape, you might want to look at Bitscan.com and their new service copyalerts.com

    Bitscan works much like Copyscape, but producing different (though not always better) results. Copyalerts works like Google Alerts, emailing you about suspicious pages.

    Just wanted to give you a heads up! Thank you for drawing attention to this issue.

    Posted by Jonathan Bailey | August 21, 2008, 5:17 pm
  2. really helpful tips thanks heaps!! :)

    Posted by poowhare | September 24, 2008, 5:53 am
  3. I wrote songs when I was young. All were stolen and plagiarized. Some when I was a child. I wrote Puff the Magic Dragon, Horse with no Name, Heart of Gold, Old Man (take a look at my song(life)) etc. My life and family was threatened and the government did nothing. Plagiarism is more than a civil crime especially when threats are made.

    Posted by Michael Holland Shepard | October 30, 2008, 2:10 pm