The Moz Q&A Forum

    Competitor 'scraped' entire site - pretty much - what to do?

    Intermediate & Advanced SEO
    • DanielFreedman
      DanielFreedman @Distil last edited by

      Thanks, Rami:

      Your solution and offer are fascinating. And no worries about the shameless plug pitfall.

      The issue for me is clients who may not quite fit into the category of being victims of the scraping/complete sleaze bag racket.

      Rather, they are industry leaders who are often victimized by leading content farms (and you know who I mean!). Some poor schmuck gets 15 bucks for spending 15 minutes lifting our content, without attribution or links, by paraphrasing it.

      Ironically, said content farms claim to have turned over a new leaf, hired reputable journalists as so-called "editors-in-chief" and now want to "partner" with our leading SMEs.

      As they used to say in 19th-century Russian novels, "What is to be done?"

      • RyanKent
        RyanKent @Distil last edited by

        Hi Rami.

        Sharing information about a relevant and useful service isn't advertising, it's educational and informative. You could have used a random name and mentioned the service, but you shared the information in a transparent, quality manner and I for one appreciate it.

        I believe your signature is missing a character and you meant to use www.distil.it.

        After reading about your product, I have some follow-up questions. I can send the questions to you privately, but I think others would benefit from the responses so I will ask here if it is ok. I would humbly suggest adding this information to your site where appropriate or possibly in a FAQ section. If the information is already on your site and I missed it, I apologize.

        • It sounds like your solution offers cloud hosting. Is that correct? If so, is your hosting complete? In other words, do I maintain my regular web host or is your service in addition to my regular host?

        • It sounds like your Cloud Acceleration service is a CDN. Is that correct? Is this service an extra cost on top of the costs listed on your pricing page?

        • The Enterprise solution offers "Custom Security Algorithms". Can you share more details about what is involved?

        • Would it be fair to say your service handles 100% of security settings?

        • You mentioned caching, compression and minification. Would it be fair to say your service handles 100% of optimization settings? Along these lines, is your solution offered in such a way that your results are recognized by PageSpeed and YSlow? I always value results over any tool, but some clients latch onto certain tools and it would offer additional value if the tools recognized the results.

        • While your site ccTLD is .it, your contact number listed on your home page appears in the San Francisco area. Are you a US-based company?

        • You mention "the best support in the industry". For your regular (i.e. non-premium) users, if a non-technical client requested basic changes, such as redirecting URLs which do not end in a slash to the equivalent URL which does end in a slash throughout their site, do you make these changes for them? How far are you able to assist customers? (I know it's a dangerous question to answer on some levels for you, but inquiring minds would like to know).

        • I did not notice any pricing related to space on disk. I have a client who provides many self-hosted videos and the site is 30 GB. Are there any pricing or other issues related to the physical size of a site?

        Your solution intrigues me because it addresses a wide array of hosting issues ranging from site speed to security to content scraping. I am anxious to learn more.

        • Distil
          Distil @RyanKent last edited by

          Hi Ryan,

          Thanks for catching my typo and for your interest. I am happy to answer your questions publicly and will definitely add your questions to the FAQ section we are currently working on.

          The company is at distil.it and yes, we are an American company located in San Francisco despite the Italian TLD.

          We do not host your files permanently on our servers; instead, our service is layered on top of a standard host. We do, however, cache your content on our edge nodes exactly like a CDN to accelerate your site. This feature is already included in the pricing model.

          With the enterprise plan we will work with clients to respond to specific threats that an organization may face. This could mean blocking certain countries from accessing your site, blocking certain IP ranges, or dealing with DoS attacks.
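As an illustration of the IP-range blocking mentioned above, here is a minimal sketch in Python. It is not Distil's implementation, which is not described in this thread; the function name `is_blocked` and the listed ranges are hypothetical (the ranges are reserved documentation networks, not real offenders).

```python
import ipaddress

# Hypothetical block list. These are reserved documentation ranges,
# used here only as placeholders for ranges you actually want to block.
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for ip in ("203.0.113.42", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

In practice a check like this would sit in front of the application (firewall, proxy, or middleware) so blocked requests never reach the origin.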

          Although we can respond to most security concerns, there are still some security threats outside our scope.

          Our page optimization and acceleration techniques are recognized by PageSpeed and YSlow and the results are measurable. In one case study we improved our customer's page load time by 55%. There are still other optimization tricks that we do not handle, such as combining images into CSS sprites or setting browser caching.

          We try to accommodate our customers the best we can. Basic redirects like the one you mention would not be hard and we would happily do this for regular customers within reason.
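For readers wondering what the trailing-slash redirect discussed above involves, the sketch below computes the redirect target for a URL. It only shows the rule itself, under the assumption that file-like paths (for example `/logo.png`) should be left alone; the function name `trailing_slash_target` is made up for this example, and how the redirect is actually issued depends on your server or host.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def trailing_slash_target(url: str) -> Optional[str]:
    """Return the URL to redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    path = parts.path
    last_segment = path.rsplit("/", 1)[-1]
    # No redirect for an empty path, a path that already ends in a slash,
    # or a file-like path whose last segment contains a dot.
    if not path or path.endswith("/") or "." in last_segment:
        return None
    # Keep the scheme, host, query string and fragment; only the path changes.
    return urlunsplit(parts._replace(path=path + "/"))


if __name__ == "__main__":
    for u in ("https://example.com/blog?page=2", "https://example.com/blog/"):
        print(u, "->", trailing_slash_target(u))
```

A server would typically answer with a 301 (permanent) redirect to the returned URL so that search engines consolidate on the slash version.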

          Pricing for the service is based on bandwidth used and there is no extra cost for storage. For your specific scenario, though, we may not be a complete solution since our service is not currently optimized for video delivery.

          Please feel free to ask any additional questions, we are happy to answer and help!

          Rami

          • RyanKent
            RyanKent @RyanKent last edited by

            Thank you for the additional details, Rami. If you are willing to share further information, I do have a few follow-up questions.

            • Do you serve 100% of the content to users? Or do users still visit the site? I am interested to understand how dynamic content would be affected. Will location-based content, where information changes based on a user's IP, still function properly or are there likely to be issues? Will "fresh" content still function properly, such as a new blog article which is receiving many comments, or a forum discussion?

            • Since you are caching the target site, how much of a role does the target site's speed optimization still play? If a client's site is on a shared server vs a dedicated server, would it still be a concern for speed?

            • You mentioned dealing with security concerns. Are your actions taken proactively? Or does a client need to recognize there is an issue and contact your company?

            • Specific to the original question asked in this Q&A, can some bots get past your system? Or do you believe it to be bot-proof? I am specifically referring to bad bots, not those of major search engines.

            • How would Google Analytics and other tools which monitor site traffic be impacted by your service? I am trying to determine if your service is a "normal" cloud service or if there are differences.

            • What differences are there between the services you offer and the regular Amazon cloud service?

            Thanks again for your time.

            • kameeleon
              kameeleon last edited by

              Greg,

              There is only one thing that helps you to move forward with your client. Rewrite your texts and upgrade or tweak your site for better UX. That way the scraped site will look like a cheap copy. I have done that in the past. I know it's not fair, but that's how you can put this behind you.

              PS. Rapid linkbuilding to forums and blogs will get one banned 🙂

              • Distil
                Distil @RyanKent last edited by

                Hi Ryan,

                As long as others are benefiting and not bothered I am happy to answer your questions.

                When setting up Distil you are able to allocate a specific record (subdomain) or the entire zone (domain) to be delivered through our cloud. This allows you to segregate what traffic you would like us to serve and what content you would like to handle through other delivery mechanisms. Distil honors all no-cache and cache-control directives, allowing you to easily customize what content we cache even if we are serving your entire site. Additionally, we do not cache any dynamic file types, ensuring that fresh content always functions properly. Location-based content will continue to function correctly because our service continues to pass the end user's IP through the host headers.
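To make the two mechanisms above concrete (honoring cache directives and passing the end user's IP to the origin), here is a rough sketch of how a caching layer and an origin application might handle them. This is an illustration only, not Distil's code: the post does not name the header used, so `X-Forwarded-For` is an assumption based on common proxy practice, and both function names are hypothetical.

```python
def is_cacheable(cache_control: str) -> bool:
    """Return False when the origin's Cache-Control value opts out of shared caching.

    Simplification: "no-cache" is treated as "do not cache", although strictly
    it allows storing a response as long as it is revalidated before reuse.
    """
    directives = {
        d.strip().lower().split("=", 1)[0] for d in cache_control.split(",")
    }
    return not directives & {"no-cache", "no-store", "private"}


def client_ip(headers: dict, peer_ip: str) -> str:
    """Return the end user's IP as seen by an origin behind a proxy.

    Assumes the proxy puts the visitor's address first in X-Forwarded-For.
    Falls back to the socket peer address when the header is absent.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or peer_ip


if __name__ == "__main__":
    print(is_cacheable("public, max-age=3600"))
    print(is_cacheable("private, no-store"))
    print(client_ip({"X-Forwarded-For": "198.51.100.7, 10.0.0.1"}, "10.0.0.1"))
```

An origin should only trust a forwarded-IP header on requests that really arrive from the proxy, since visitors can otherwise set the header themselves.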

                Clients are able to reduce their infrastructure after migrating onto our platform; however, it is important to note that you cannot downgrade to a $5 shared hosting plan and expect the same results. Distil is able to reduce your server load by 50%-70%, but the remaining 30%-50% will still be handled by your backend, so you need to ensure any hosting you use can still handle that.
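The sizing point above is simple arithmetic, sketched here for a made-up site. The 200 requests per second figure and the function name `origin_load` are assumptions for illustration; only the 50%-70% offload range comes from the post.

```python
def origin_load(total_rps: float, offload_fraction: float) -> float:
    """Requests per second that still reach the origin after offloading."""
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be between 0 and 1")
    return total_rps * (1.0 - offload_fraction)


# Hypothetical site receiving 200 requests per second at peak.
PEAK_RPS = 200
best_case = origin_load(PEAK_RPS, 0.70)   # 70% of load absorbed upstream
worst_case = origin_load(PEAK_RPS, 0.50)  # 50% of load absorbed upstream

if __name__ == "__main__":
    print(f"Origin must handle between {best_case:.0f} and {worst_case:.0f} rps")
```

For this example the backend still has to cope with roughly 60 to 100 requests per second, so hosting should be sized for the worst case rather than the best.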

                Our specialty is dealing with bots and all of our security measures surrounding that protection are automated.  Any security concerns outside of that scope will be handled reactively with each individual client.

                Our service is constantly adapting to ensure that we provide a holistic solution and we go far beyond the suggestions mentioned above. Distil is set up to adapt intelligently on its own as it uncovers new bots, and we are also always adding new algorithms to catch bots. I do not want to say we are bot-proof, but we will catch well over 95% of bots and will quickly adapt to catch and stop any new derivatives.

                Similar to most other cloud or CDN-type services, Google Analytics will not be impacted at all.

                Amazon offers cloud computing, whereas Distil offers a managed security solution in the cloud. We utilize several cloud providers, including Amazon, for our infrastructure, but what makes Distil unique is the software running on that infrastructure. Amazon simply provides the computing power; we provide the intelligence to catch and stop malicious bots from scraping your website and ensure your content is protected.

                Rami Essaid
                www.distil.it

                • RyanKent
                  RyanKent @RyanKent last edited by

                  Thanks for all the details Rami.

                  • ShaMenz
                    ShaMenz last edited by

                    Hi again Greg,

                    Just one more option that is available to you if you happen to have a WordPress blog on the site (or have the option of rebuilding the entire site using WordPress).

                    You could install the Bad Behavior plugin for WordPress. The plugin is part of Project Honeypot, which tracks millions of bad IP addresses and gathers information from the plugin and feeds it back to the honeypot. Bad Behavior also works against link spam, email and content harvesters and other malicious sites.

                    Sha

                    • HMCOE
                      HMCOE last edited by

                      5 Steps:

                      1. Take screenshots of ALL webpages
                      2. Get a report on exactly how many pages were scraped and have evidence (usually Googling the site titles is very effective)
                      3. Take screenshots of the meta data: Right click, click on view source, and take screenshots
                      4. Once all is recorded send the website owner a Cease and Desist letter informing them to take everything offline and manually take off the pages from search indexes
                      5. If they don't comply at that point, any IP lawyer will help if you have all the documentation. Some will take the work pro bono because there's huge money to be won, especially if you did all the work for them already.

                      Do NOT issue Cease and Desist letters without the screenshots. Usually what these guys will do is change the appearance and add content to the meta tags, and at that point they will claim it was not plagiarized while still hurting you. Without the screenshots, your claim will not stand up in court.

                      However, if you documented the scraping, the only option the website owner will have is to take the plagiarized content offline completely. Any edits they make at that point are still considered scraping/plagiarism because you documented the offense.
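Alongside screenshots, it can help to keep a machine-readable log of what was captured and when. The sketch below is one possible way to do that, assuming you have already saved the offending page's HTML; the function name `evidence_record` and the record fields are hypothetical, and a log like this supplements the screenshots rather than replacing them.

```python
import hashlib
import json
from datetime import datetime, timezone


def evidence_record(url: str, html: str) -> dict:
    """Build a timestamped fingerprint of a saved page for your records."""
    return {
        "url": url,
        # UTC timestamp of when the record was created.
        "captured_at": datetime.now(timezone.utc).isoformat(),
        # SHA-256 of the saved HTML; it changes if the saved copy is altered.
        "sha256": hashlib.sha256(html.encode("utf-8")).hexdigest(),
        "length": len(html),
    }


if __name__ == "__main__":
    # Placeholder URL and content; substitute the page you saved.
    record = evidence_record("https://example.com/copied-page", "<html>hi</html>")
    print(json.dumps(record, indent=2))
```

Keeping one record per scraped page also gives you the page count asked for in step 2.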

                      We've been able to prosecute 13 companies already. One company we publicly called out on Twitter during a popular chat leading to the company's downfall in 4 weeks.

                      FIGHT FOR YOUR CONTENT!
