The Moz Q&A Forum

    How to know bad bots or spiders scraping my website?

    • Trigun

      This post is deleted!

      • stefanok

        Sorry to hear about this; it can really cause a big headache. I came across these three articles, which I'm sure you will find helpful.

        1. http://perishablepress.com/press/2010/09/24/content-scrapers-suck-ass/

        Basically, it uses .htaccess to block these domains. Here is an excerpt:

        "These are the tools I use when dealing with content scrapers. For bigger sites like DigWP.com, I agree with Chris that no action is really required. As long as you are actively including plenty of internal links in your posts, scraped content equals links back to your pages. For example, getting a link in a Smashing Magazine article instantly provides hundreds of linkbacks thanks to all of thieves and leeches stealing Smashing Mag’s content. Sprinkling a few internal links throughout your posts benefits you in some fantastic ways:

        • Provides links back to your site from stolen/scraped content
        • Helps your readers find new and related pages/content on your site
        • Makes it easy for search engines to crawl deeply into your site"
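Before blocking anything in .htaccess, it helps to know which clients are actually hitting the site hardest. One way to spot likely scrapers is to count requests per IP address and user agent in the server's access log. The sketch below is only an illustration: it assumes the common Apache "combined" log format, and the sample lines, addresses, and the `top_clients` helper are made up for the example.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format. Adjust the pattern if your
# server logs a different format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def top_clients(lines, limit=10):
    """Count requests per (IP, user agent) pair and return the busiest ones."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            counts[(match["ip"], match["agent"])] += 1
    return counts.most_common(limit)


# Made-up sample lines; in practice, pass an open log file instead.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2011:13:55:36 +0000] "GET /post-1 HTTP/1.1" 200 512 "-" "ExampleScraper/1.0"',
    '203.0.113.5 - - [10/Oct/2011:13:55:37 +0000] "GET /post-2 HTTP/1.1" 200 640 "-" "ExampleScraper/1.0"',
    '203.0.113.5 - - [10/Oct/2011:13:55:38 +0000] "GET /post-3 HTTP/1.1" 200 480 "-" "ExampleScraper/1.0"',
    '198.51.100.7 - - [10/Oct/2011:13:56:01 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    "this line is not a log entry",
]

if __name__ == "__main__":
    for (ip, agent), hits in top_clients(SAMPLE):
        print(f"{hits:5d}  {ip:15s}  {agent}")
```

A client that requests far more pages than any human reader would, or that announces an unfamiliar user agent, is a candidate for a closer look before it goes on a block list.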

        2. http://www.famousbloggers.net/content-scrapers-thieves.html

        Basically, identify them, then contact the site owners, hosting services, etc.

        "What Can You Do When You Catch Someone Stealing Your Content? In addition to several basic steps that you can immediately take, there are also a few extra tricks you can use to protect your content:

        • Contact the blog or website’s owner and politely ask them to remove the stolen content. 95% of the time, this has been the only step I’ve needed to take. You can use the Whois Lookup from Domain Tools to help you find the blog or website’s owner contact information. On the rare occasions when this isn’t successful, move on to the next steps.
        • Contact Google and file a Digital Millennium Copyright Act (DMCA) complaint. In addition to Google giving your site credit for the original content, filing a DMCA complaint may result in Google completely removing a blog or website that is full of stolen content from their index. You can also file a Spam Report with Google to help fight back against content thieves.
        • Contact the blog or website’s hosting company and file a Digital Millennium Copyright Act (DMCA) complaint. Hosting companies are required by law to shut down the blog or website until the stolen content is removed. Most reputable hosting companies already have procedures in place for lodging your DMCA complaints with their security or abuse departments. The key to successfully using this technique is that you will need to prove to the hosting company that you were the first one to publish the content. A simple and effective way to do this is by using the free Wayback Machine from Archive.org. This technique has worked for me on several occasions when a blog or website owner refused to remove the stolen content on their own."

        3. Great 5 minute solution: http://blog.effortlessebookwriting.com/my-blog-content-was-stolen-here-is-an-effective-5-minute-solution-to-that/
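The Wayback Machine tip in the second article can be partly automated. The sketch below assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and its documented JSON shape; the helper names are made up for illustration. Note that the endpoint returns the snapshot closest to the requested timestamp, so asking for a very early date is only a rough way of finding the oldest capture, not a guarantee.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive availability API.
API = "https://archive.org/wayback/available"


def build_query(page_url, timestamp="19960101"):
    """Build the lookup URL; an early timestamp asks for the oldest nearby capture."""
    return API + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def parse_snapshot(payload):
    """Return (capture time, archive URL) from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return captured, closest["url"]


def earliest_known_snapshot(page_url):
    """Query the API over the network and parse the result."""
    with urlopen(build_query(page_url), timeout=10) as response:
        return parse_snapshot(json.load(response))


# Made-up response in the assumed shape, so the parser can be tried offline.
SAMPLE_RESPONSE = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20110304050607",
            "url": "http://web.archive.org/web/20110304050607/http://example.com/post",
        }
    }
}

if __name__ == "__main__":
    print(parse_snapshot(SAMPLE_RESPONSE))
```

Running the same lookup for your page and for the copy gives two capture dates to compare, which may help when a hosting company asks who published first.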

        I hope this helps. Let me know how it goes. Regards, Stef

          • Distil

          I agree with Stefano, this can be a huge headache. Since Google implemented the latest iteration of its search algorithm (Panda), there have been serious implications for duplicate content on the web. You can now be penalized for duplicated content and lose PageRank.

          We always recommend that our clients take immediate action against offending sites. The first step is contacting the site. If that yields no results, you should absolutely file DMCA complaints with Google, the site's hosting provider, and any advertisers on the site.

          You should also take steps to prevent the continued scraping of your site.  Another good article for reference:
          http://www.blueglass.com/blog/content-scraping-prevention-benefits/

          .htaccess is a great start, but scraping is a business, and the scrapers have gotten very good at disguising their theft. That is why there are services, such as ours, that can help protect you against site scraping.

          Rami Founder, CEO
          www.distil.it

            • Trigun @stefanok

            Hi Stefano,

            First of all, I would like to thank you for the help.  I really appreciate it.

            I have tried implementing the 4G Ultimate User-Agent Blacklist (http://perishablepress.com/press/2009/03/29/4g-ultimate-user-agent-blacklist/), and it seems to have reduced the number of scrapers visiting my site. I paired it with blocking specific domains and IPs.

            Once these are applied, an error occurs every time I add a picture to my posts. What do I need to remove from the list of blocked bots to allow adding pictures?

            By the way, I am using WordPress as my CMS.

            Thanks in advance...
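One way to narrow down which rule breaks the image upload is to take the user agent of the failing request from the access log (requests rejected by such rules typically show a 403 status) and check it against each blocked pattern. The sketch below is an assumption-heavy illustration: the pattern list is a made-up excerpt, not the actual 4G blacklist, the sample user agent is hypothetical, and matching is treated as case-insensitive, which may not match your rules.

```python
import re

# Hypothetical excerpt of a user-agent blacklist. Replace it with the
# patterns from your own .htaccess rules.
BLOCKED_PATTERNS = ["libwww", "curl", "flash", "HTTrack"]


def matching_rules(user_agent, patterns=BLOCKED_PATTERNS):
    """Return every blacklist pattern that matches the given user agent."""
    return [p for p in patterns if re.search(p, user_agent, re.IGNORECASE)]


if __name__ == "__main__":
    # Hypothetical user agent copied from a rejected upload request.
    print(matching_rules("Shockwave Flash"))
```

Any pattern the function returns for the uploader's user agent is a candidate to remove or narrow; retest the upload after changing one rule at a time.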
