The Moz Q&A Forum

    Massive Amount of Pages Deindexed

    • ThompsonPaul

      First thing to confirm - did you recently migrate to HTTPS?

      • D.J.Hanchett

        Not recently. It migrated well over a year ago to HTTPS.

        • TimHolmes

          Not sure if this helps, and it depends on how many pages you expect to be indexed, but according to John Mueller (John Mu) at Google, Google does not necessarily index every page.

          https://www.seroundtable.com/google-index-all-pages-20780.html

          • BlueprintMarketing

            Right now I cannot get that site to load in my browser, and https://tools.pingdom.com was unable to load it as well. You could be having some serious server problems, and that could be causing the issue. Surprisingly, I was still able to crawl the site with Screaming Frog.

            This is a zip file of your Screaming Frog results. It would show any noindex pages, and I found none, so it looks to me like you have a server issue. Zip file: http://bseo.io/BXYpZh

            I checked your site for malware using https://sitecheck.sucuri.net/results/www.hkqlaw.com/ (please understand this only checks the homepage and a handful of other pages) and found none. However, when I checked your IP address I noticed a lot of ransomware information tied directly to your IP:

            https://ransomwaretracker.abuse.ch/ip/205.178.189.131/

            Here is a large screenshot of when I tried to browse your website: https://i.imgur.com/OzcLhbx.png

            Here is the Pingdom result (remember to test from somewhere outside your local computer, because caching and other local factors can give you incorrect results):

            https://tools.pingdom.com/#!/bd6d52/https://www.hkqlaw.com/

            In my experience, Network Solutions hosting is terrible. I would strongly suggest doing two things.

            1. Get a better hosting company for your site.

            Good managed hosts that are not too expensive include Liquid Web, Cloudways, Rackspace, and pairNIC. You can also build out your own system on unmanaged hosting like Linode, DigitalOcean, AWS, Google Cloud, or Microsoft Azure. If you want a high-quality, inexpensive managed host that offers more than one backend, https://www.cloudways.com/en/ will host anything and manage it, and you can use the backends listed above. If you want what I think is the best, and price is not a big deal, then considering you're not running WordPress, https://armor.com is my preferred hosting company. Otherwise, Cloudways or Liquid Web is where I would host your site.

            2. Add a web application firewall/reverse proxy.

            You already have an IP address tied to ransomware, and you're using a hosting company that will not help you in security terms. You can add a firewall with https://sucuri.net/website-firewall/, https://incapsula.com, or https://fastly.com. If you want the most basic and least secure option that is still better than what you have, use https://cloudflare.com.

            At the very least, put Cloudflare on there. But what I'm seeing is a severe problem coming from your web host, and knowing that hosting company, I would strongly advise you to move to a better host.

            I hope this was of help,

            Thomas

            • D.J.Hanchett

              Thanks for the great feedback! The hkqlaw.com URL simply forwards (301) to hkqpc.com. The IP address you have is for hkqlaw.com, which is registered through Network Solutions, but hkqpc.com is hosted on 1and1.com. Also, the timeout error you're getting is because there is no SSL certificate for hkqlaw.com; again, it just forwards to hkqpc.com (which does have an SSL certificate attached to it). As far as Search Console (SC) goes, everything is set up to index hkqpc.com.

              • BlueprintMarketing

                https://cryptoreport.websecurity.symantec.com/checker/

                This server cannot be scanned for these vulnerabilities:

                • Heartbleed: Server scan unsuccessful.
                • Poodle (TLS): Server scan unsuccessful.

                BEAST: This server is vulnerable to a BEAST attack.

                I am sorry I said your IP was Network Solutions when it was 1&1. I still strongly recommend changing hosting companies, even though I am German and so is 1&1.

                DNS resolves www.hkqpc.com to 74.208.236.66

                The SSL certificate used to load resources from https://www.hkqpc.com will be distrusted in M70. Once distrusted, users will be prevented from loading these resources. See https://g.co/chrome/symantecpkicerts for more information.

                Look: https://cl.ly/pCY5

                Look: https://cl.ly/pAKa

                Symantec SSL certificates are now owned by DigiCert:

                https://www.digicert.com/help/

                https://www.dareboost.com/en/report/5a70b33e0cf28f017576367f

                The Set-Cookie HTTP header can be configured with your Apache server. Make sure that the mod_headers module is enabled. Then, you can specify the header (in your .htaccess file, for example). Here is an example:

                <IfModule mod_headers.c>
                    # only for Apache > 2.2.4:
                    Header edit Set-Cookie ^(.*)$ $1;HttpOnly;Secure
                    # lower versions (use instead of the line above):
                    # Header set Set-Cookie HttpOnly;Secure
                </IfModule>
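To confirm the flags actually reach the browser after a change like the one above, one option is to inspect the raw Set-Cookie value a page returns. The following is only a minimal sketch: the function name `missing_cookie_flags` and the `PHPSESSID` cookie are hypothetical and not taken from the site, and the header string is assumed to have been copied from a response (for example from the browser's network tab).

```python
def missing_cookie_flags(set_cookie_header: str) -> list[str]:
    """Return the security flags (HttpOnly, Secure) absent from one Set-Cookie value."""
    # The first ';'-separated part is name=value; everything after it is an attribute.
    attributes = {
        part.strip().split("=", 1)[0].lower()
        for part in set_cookie_header.split(";")[1:]
    }
    return [flag for flag in ("HttpOnly", "Secure") if flag.lower() not in attributes]


if __name__ == "__main__":
    # Hypothetical header values, for illustration only.
    print(missing_cookie_flags("PHPSESSID=abc123; path=/"))
    print(missing_cookie_flags("PHPSESSID=abc123; path=/; HttpOnly; Secure"))
```

An empty list means both flags are present. Attribute names are compared case-insensitively, since servers vary in how they capitalize them.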

                1. Your robots.txt file is showing up in the SERPs. Large screenshot: https://i.imgur.com/cJeDR9t.png
                2. Your XML sitemap is showing up in the SERPs and should be noindexed. Large screenshot: https://i.imgur.com/tlx5jc7.png

                The URL with double forward slashes after "verdicts" serves the same page as the URL without them. You need to add rel=canonical tags; there are zero canonicals on any page whatsoever.

                • https://www.hkqpc.com/news/verdicts//hkq-attorneys-win-carbon-county-real-estate-case/
                • https://www.hkqpc.com/news/verdicts/hkq-attorneys-win-carbon-county-real-estate-case/

                The URLs above need a rel=canonical tag; I have created an example below for you. The example is for the page without the double forward slashes. The tag tells Google which version you'd prefer to have indexed, and it also keeps query-string pages and junk pages out of Google's index. Please see the resources below and add canonical tags to your website. Because I do not know what type of CMS you're using, I cannot recommend a plug-in to do it, but if you were using something like WordPress it would be done automatically by something like Yoast WordPress SEO. For the site that you are running, it may be wise to move to something like WordPress: it is a solid platform for a site that size and makes it a lot easier to implement changes across the entire site quickly.

                • https://moz.com/blog/complete-guide-to-rel-canonical-how-to-and-why-not
                • https://yoast.com/rel-canonical/
                • https://moz.com/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps
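For a site without a CMS plug-in, the canonical tag discussed above could be generated by a small helper. This is a rough sketch, not the site's actual code: the function names are hypothetical, and dropping the query string entirely is an assumption that only holds if no page relies on query parameters for distinct content.

```python
import re
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Collapse repeated slashes in the path and drop the query string and fragment."""
    parts = urlsplit(url)
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element to place in the page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}" />'


if __name__ == "__main__":
    print(canonical_link_tag(
        "https://www.hkqpc.com/news/verdicts//hkq-attorneys-win-carbon-county-real-estate-case/"
    ))
```

Both the double-slash URL and the clean URL produce the same tag, so either version of the page would point Google at the single-slash address.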

                You need to add a canonical. This page is also throwing PHP warnings:

                • Bigger photo of the problem: https://i.imgur.com/1qMMPSM.png
                • This page: https://www.hkqpc.com/attorney/David-Saba.html/
                • Warning: Creating default object from empty value in /homepages/43/d238880598/htdocs/classes/class.attorneys.php on line 38
                • Warning: Invalid argument supplied for foreach() in /homepages/43/d238880598/htdocs/headers/attorney.php on line 15
                • Fix for this:
                • https://stackoverflow.com/questions/14806959/how-to-fix-creating-default-object-from-empty-value-warning-in-php
                • http://thisinterestsme.com/invalid-argument-supplied-for-foreach/

                Heartbleed vulnerability scan: An unknown error occurred while scanning for the Heartbleed Bug.

                • D.J.Hanchett

                  OK, a canonical is set for each page (and I fixed the // issue). I used the X-Robots-Tag header to noindex the robots.txt and sitemap.xml files, along with a few other extensions while I was at it.

                  I'll get the secured cookie header set after this is resolved.  We don't store any sensitive data via cookies for this site so it's not of immediate concern but still one I'll address.

                  EDIT:  The https://www.hkqpc.com/attorney/David-Saba.html/ page no longer exists which was the cause of the errors.  I've redirected that to the appropriate page.

                  • BlueprintMarketing

                    Wow, I got it.

                    You're 301 redirecting a ton of URLs back to the homepage.

                    • Redirect chains https://bseo.io/cZW0w0
                    • internal URLs https://bseo.io/4sFqUk
                    • insecure content https://bseo.io/YDDKGD
                    • no canonical https://bseo.io/fWey1Q
                    • crawl overview https://bseo.io/Zg6bpM
                    • canonical errors https://bseo.io/YtTh7W

                    • D.J.Hanchett

                      Now I'm really baffled. I just ran Screaming Frog and don't see any of the redirects or other stats. Which software are you using that is showing this information? I'm trying to replicate it and figure out if there's something, somewhere else doing this.

                      • D.J.Hanchett

                        Looking at the first report, "Redirect Chains": as I understand the table, these are correct.

                        Column A is the page (source) with the redirecting link
                        Column B is the link that is redirecting (http://www.hkqlaw.com)
                        Column C shows 2 redirects happening
                        Column I shows the first redirect (http://www.hkqlaw.com -> http://www.hkqpc.com) (non ssl version)
                        Column N shows the second redirect (http://www.hkqpc.com -> https://www.hkqpc.com) (ssl version)

                        The original link (hkqlaw.com) is a link in the footer of our news section so is common on those pages which is why it shows so often.  So, like I said, this appears to be correct.
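The chain read off the report above can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: `redirect_chain` and the `fetch` callable are hypothetical names, the fetcher is injected so the logic can be exercised without network access, `Location` values are assumed to be absolute URLs, and the 301 status on the second hop is an assumption (the post only confirms it for the first).

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns (status_code, Location header or None).
Fetcher = Callable[[str], tuple[int, Optional[str]]]


def redirect_chain(url: str, fetch: Fetcher, max_hops: int = 10) -> list[str]:
    """Follow 3xx responses starting at `url` and return every URL visited, in order."""
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(chain[-1])
        if status not in (301, 302, 303, 307, 308) or not location:
            break
        if location in chain:  # redirect loop: record it and stop
            chain.append(location)
            break
        chain.append(location)
    return chain


if __name__ == "__main__":
    # Hypothetical responses mirroring the two hops described in the post.
    responses = {
        "http://www.hkqlaw.com": (301, "http://www.hkqpc.com"),
        "http://www.hkqpc.com": (301, "https://www.hkqpc.com"),
        "https://www.hkqpc.com": (200, None),
    }
    chain = redirect_chain("http://www.hkqlaw.com", responses.__getitem__)
    print(" -> ".join(chain))
    print(f"{len(chain) - 1} redirects")
```

Swapping the dictionary lookup for a real HTTP client that does not auto-follow redirects would let the same function check the live footer link.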

                        I added the canonical directives to the pages earlier so perhaps that report was run prior to me doing that?

                        Again, thanks so much for your effort in helping me!

                        • BlueprintMarketing

                          Yes, the report was run prior to you adding the canonical directives.

                          Anytime. Remember to noindex your robots.txt:

                          https://yoast.com/x-robots-tag-play/

                          There are cases in which the robots.txt file itself might show up in search results. By using an alteration of the previous method, you can prevent this from happening to your website:

                          <FilesMatch "robots.txt">
                              Header set X-Robots-Tag "noindex"
                          </FilesMatch>

                          And in Nginx:

                          location = /robots.txt {
                              add_header  X-Robots-Tag "noindex";
                          }
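Once a rule like the ones above is deployed, the response headers for robots.txt can be checked for the directive. The sketch below assumes the headers have already been fetched into a plain dictionary; `is_noindexed` is a hypothetical helper name, and user-agent-scoped values such as `googlebot: noindex` are deliberately not handled.

```python
def is_noindexed(headers: dict[str, str]) -> bool:
    """True if an X-Robots-Tag response header carries a noindex (or none) directive."""
    for name, value in headers.items():
        if name.lower() != "x-robots-tag":
            continue
        # Directives are comma-separated, e.g. "noindex, nofollow".
        directives = {d.strip().lower() for d in value.split(",")}
        if "noindex" in directives or "none" in directives:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical response headers, for illustration only.
    print(is_noindexed({"Content-Type": "text/plain", "X-Robots-Tag": "noindex"}))
    print(is_noindexed({"Content-Type": "text/plain"}))
```

The `none` directive is treated as a match because it is shorthand for `noindex, nofollow`.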
                          