The Moz Q&A Forum

    • Forum
    • Questions
    • My Q&A
    • Users
    • Ask the Community

    Welcome to the Q&A Forum

    Browse the forum for helpful insights and fresh discussions about all things SEO.

    1. SEO and Digital Marketing Q&A Forum
    2. Categories
    3. Technical SEO Issues
    4. Site Audit Tools Not Picking Up Content Nor Does Google Cache

    Site Audit Tools Not Picking Up Content Nor Does Google Cache

    Technical SEO Issues
    7 2 86
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as question
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • nezona
      nezona last edited by

      Hi Guys,

      Got a site I am working with on the Wix platform. However site audit tools such as Screaming Frog, Ryte and even Moz's onpage crawler show the pages having no content, despite them having 200 words+. Fetching the site as Google clearly shows the rendered page with content, however when I look at the Google cached pages, they also show just blank pages.

      I have had issues with nofollow, noindex on here, but it shows the meta tags correct, just 0 content.

      What would you look to diagnose? I am guessing some rogue JS but why wasn't this picked up on the "fetch as Google".

      1 Reply Last reply Reply Quote 0
      • nezona
        nezona last edited by

        Just an update on this one

        Looks like it may be a problem with Wix

        https://moz.com/community/q/wix-problem-with-on-page-optimization-picking-up-seo

        I have another client who also uses Wix and they also show now content in screaming frog but worryingly their pages show in a cached version of the site. I know the "cache" isn't the best way to see what content is indexed and the fetch as Google is fine.

        I just get the feeling something isn't right.

        1 Reply Last reply Reply Quote 0
        • effectdigital
          effectdigital last edited by

          If you can share a link to the site I can probably diagnose it. It's probably that the content is within the modified (client-side rendered) source code, rather than the 'base' (non-modified) source code. Google fetches pages in multiple different ways, so using fetch as Google artificially makes it seem as if they always use exactly the same crawling technology. They don't.

          Google 'can' crawl modified content. But they don't always do it, and they don't do it for everyone. Rendered crawling takes like... 10x longer than basic source scraping. Their mission is to index the web!

          The fetch tool shows you their best-case scenario crawling methodology. Don't assume their indexation bots, which have a mountain to climb - will always be so favourable

          nezona 1 Reply Last reply Reply Quote 0
          • nezona
            nezona @effectdigital last edited by

            Hi There,

            This is the URL:-

            https://www.nubalustrades.co.uk/

            Be great if you could give me your opinion. I am thinking that this content isn't being indexed.

            Regards

            Neil

            effectdigital 1 Reply Last reply Reply Quote 0
            • effectdigital
              effectdigital @nezona last edited by

              Ok let's have a look here.

              So this is the URL of the page you want me to look at:

              • https://www.nubalustrades.co.uk/

              I can immediately tell you that, from my end it doesn't look like Google has even cached this page at all:

              • http://webcache.googleusercontent.com/search?q=cache:https%3A%2F%2Fwww.nubalustrades.co.uk%2F (live)
              • https://d.pr/i/DhmPEr.png (screenshot)

              As you know I can't fetch someone else's web page as Google, but I do know Screaming Frog pretty well so let's give that a blast

              First let's try a quick crawl with no client-side rendering enabled, see what that comes back with:

              • https://d.pr/f/u3bifA.seospider (SF crawl file)
              • https://d.pr/f/9TfNR5.xlsx (Excel spreadsheet output)

              Seems as if, even without rendered crawling the words are being picked up:

              • https://d.pr/i/426Ck9.png

              Only the rows highlighted in green (the 'core' site URLs) should have a word count anyway. The other URLs are fragments and resources. They're scripts, stylesheets, images etc (none of which need copy).

              Let's try a rendered crawl, see what we get:

              • https://d.pr/f/ijprbx.seospider (SF crawl file)
              • https://d.pr/f/c8ljoF.xlsx (Excel spreadsheet output)

              Again - it seems as if the words are picked up, though oddly fewer are picked up with rendered crawling than with a simple AJAX source scrape:

              • https://d.pr/i/y3Jv51.png

              That could easily be something to do with my time-out or render-wait settings though (that being said I did give a pretty generous 23 seconds so...)

              In any case, it seems to me that the content is search readable in either event.

              Let's look at the homepage specifically in more detail. Basically if content appears in "inspect element" but not in "view source", **that's **when you know you have a real problem

              • view-source:https://www.nubalustrades.co.uk/ - (you can only open this link with Chrome browser, it's free to download from Google)

              As you can see, lots of the content does indeed appear in the 'base' source code:

              • https://d.pr/i/8BzTjd.png

              That's a good thing.

              That being said, each piece of content seems to be replicated twice in the source code which is really weird and may be creating some content duplication issues, if Google's more simple crawl-bots aren't taking the time to analyse the source code correctly.

              Go back here:

              • view-source:https://www.nubalustrades.co.uk/ - (this link only works in Chrome!)

              Ctrl+F to find the string of text: "issued by the British Standards Institution". Hit enter a few times. You'll see the page jump about.

              On the one hand you have this, further up the page which looks alright:

              https://d.pr/i/8AlJJ1.png

              On the other hand you have this further down which looks like a complete mess, embedded within some kind of script or something?

              https://d.pr/i/mJJaqa.png

              Line 6,212 of the source code is some gigantic JavaScript thing which has been in-lined (and don't get me started on how this site is over-using inline code in general, for CSS, JS - everything). No idea what it's for or does, might be deferred stuff to boost page speed without breaking the visuals or whatever (there are many clever tricks like that, but they make the source code a virtually unreadable mess for a human - let alone a programmed bot!)

              What really concerns me is why such a simple page needs to have 6,250 lines of source code. That's mental!

              What we all forget is that, whilst the crawl and fetch bots pull information quickly - Google's algorithms have to be run over the top of that source code and data (which is a much more complex affair)

              Usually people think that  normalizing the code-to-text ratio is a pointless SEO maneuver and in most cases, yes the return is vastly outweighed by the time taken to do it. But in your case it's actually very extreme:

              • https://www.prepostseo.com/code-to-text-ratio

              Put your URL in and you'll get this:

              • https://d.pr/i/nXBu0S.png

              I tried like 5-8 different tools and this was the most favorable result :')

              It is clear that, even were the page successfully downloaded by Google, their algorithms may have trouble hunting out the nuggets of content within the vast, sprawling and unnecessary coding structure. My older colleagues had always warned me away from Wix... now I can see why, with my own two eyes

              Ok. So we know that Google isn't bothering to cache the page, and that - despite the fact your content can 'technically' be crawled, it may be a marathon to do that and dig it out (especially for non-intelligent robots)

              But is the content being indexed? Let's check:

              • https://www.google.co.uk/search?q=site%3Anubalustrades.co.uk+%22issued+by+the+British+Standards+Institution%22
              • https://www.google.co.uk/search?num=100&ei=q_MYXMj3EM_srgSNh6LYCQ&q=site%3Anubalustrades.co.uk+%22product+and+your+happy+with%22
              • https://www.google.co.uk/search?num=100&ei=6vMYXPuLC4yYsAXAoKfAAg&q=site%3Anubalustrades.co.uk+%22Some+customers+like+to+have+more+than+one+balustrade%22
              • https://www.google.co.uk/search?num=100&ei=CPQYXOmJFYu6tQXi8arwBA&q=site%3Anubalustrades.co.uk+%22installations+which+will+help+you+visualise+your+future+project%22
              • https://www.google.co.uk/search?num=100&ei=KvQYXMyhC4LStAWopbqACg&q=site%3Anubalustrades.co.uk+%22Cleanly-designed%2C+high-quality+handrail+systems+combined+with+attention%22

              Those are all special Google search queries, designed to specifically search for strings of content on your website from all the different, primary content boxes

              Good news fella, it's all being found:

              • https://d.pr/i/Zlb926.png

              Let's make up an invalid text string and see what Google returns when text can't be found, to validate our findings thus-far:

              • https://www.google.co.uk/search?num=100&ei=SfQYXJitEomwtQWRk43QDg&q=site%3Anubalustrades.co.uk+%22what+the+heck+I+don%27t+even+I+never+wrote+this+on+my+site+how+dare+you%22

              If nothing is found you get this:

              • https://d.pr/i/FGvT49.png

              So I guess Google can find your content and is indexing your content

              Phew, crisis over! Onto the next one...

              nezona 1 Reply Last reply Reply Quote 2
              • nezona
                nezona @effectdigital last edited by

                My Friend,

                That is some analysis you have done there!! and I am eternally greatful. It's people like you, who are clearly so passionate about SEO, that make our industry amazing!!

                I am going to private message you a longer reply, later but i just wanted to publicly say thank you!!

                Regards

                Neil

                1 Reply Last reply Reply Quote 1
                • effectdigital
                  effectdigital last edited by

                  Hey Neil 🙂

                  Wow, we are really chuffed here at Effect Digital! I guess... we have a lot of combined experience - and we also try to give something back to the community (as well as making profit, obviously)

                  We didn't actually know how many people used the Moz Q&A forum until recently. It seemed like a good hub to demonstrate that, not all agency accounts have to exist to give shallow 1-liner replies from a position of complete ignorance (usually just so they can link spam the comments). Groups of people, **can **be insightful and 'to the point'

                  Again we're just really thrilled that you found our analysis to be useful. It also shows what goes into what we do. Most of the responses on here which are under-detailed have the potential to lead people down rabbit holes. Sometimes you just have to get into the thick of it right?

                  I think our email address is publicly listed on our profile page. Feel free to hit us up

                  1 Reply Last reply Reply Quote 0
                  • 1 / 1
                  • First post
                    Last post
                  • Google how deal with licensed content when this placed on vendor & client's website too. Will Google penalize the client's site for this ?
                    katemorris
                    katemorris
                    1
                    4
                    94

                  • Google Webmaster Tools - content keywords containing spam?
                    Snoogle
                    Snoogle
                    0
                    8
                    783

                  • 301'd site, but new site is not getting picked up in google.
                    evolvingSEO
                    evolvingSEO
                    1
                    28
                    406

                  • Can I have an http AND a https site on Google Webmaster tools
                    evolvingSEO
                    evolvingSEO
                    0
                    5
                    179

                  • Want to Target Mobile site for Google Mobile Version and Desktop Site for Google Desktop Version
                    Cyrus-Shepard
                    Cyrus-Shepard
                    0
                    4
                    1.6k

                  • Have I Set my joomla site map correctly as google not picking up articles
                    JaspalX
                    JaspalX
                    0
                    4
                    935

                  • Google Panda and ticketing sites: quality of content
                    0
                    2
                    620

                  • Google caching meta tags from another site?
                    ozgeekmum
                    ozgeekmum
                    0
                    4
                    903

                  Get started with Moz Pro!

                  Unlock the power of advanced SEO tools and data-driven insights.

                  Start my free trial
                  Products
                  • Moz Pro
                  • Moz Local
                  • Moz API
                  • Moz Data
                  • STAT
                  • Product Updates
                  Moz Solutions
                  • SMB Solutions
                  • Agency Solutions
                  • Enterprise Solutions
                  • Digital Marketers
                  Free SEO Tools
                  • Domain Authority Checker
                  • Link Explorer
                  • Keyword Explorer
                  • Competitive Research
                  • Brand Authority Checker
                  • Local Citation Checker
                  • MozBar Extension
                  • MozCast
                  Resources
                  • Blog
                  • SEO Learning Center
                  • Help Hub
                  • Beginner's Guide to SEO
                  • How-to Guides
                  • Moz Academy
                  • API Docs
                  About Moz
                  • About
                  • Team
                  • Careers
                  • Contact
                  Why Moz
                  • Case Studies
                  • Testimonials
                  Get Involved
                  • Become an Affiliate
                  • MozCon
                  • Webinars
                  • Practical Marketer Series
                  • MozPod
                  Connect with us

                  Contact the Help team

                  Join our newsletter
                  Moz logo
                  © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.
                  • Accessibility
                  • Terms of Use
                  • Privacy