The Moz Q&A Forum

    How Does Google's "index" find the location of pages in the "page directory" to return?

    Technical SEO Issues
    • reidsteven75
      reidsteven75 last edited by

      This is my understanding of how Google's search works, and I am unsure about one thing in particular:

      1. Google continuously crawls websites and stores each page it finds (let's call it "page directory")
      2. Google's "page directory" is a cache so it isn't the "live" version of the page
      3. Google has separate storage called "the index" which contains all the keywords searched.  These keywords in "the index" point to the pages in the "page directory" that contain the same keywords.
      4. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory"
      5. These returned pages are given ranks based on the algorithm
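As a rough illustration of the five steps above, here is a toy sketch in Python. It is not how Google works internally; `page_directory`, `build_index`, `search`, and the term-count "rank" are all hypothetical names and simplifications chosen for the example.

```python
from collections import defaultdict

# Hypothetical stand-in for the "page directory": cached page text keyed by URL.
page_directory = {
    "www.website.com/page1": "blue widgets for sale",
    "www.website.com/page2": "red widgets and blue gadgets",
}

def build_index(pages):
    """Map each keyword to the set of URLs whose cached text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, pages, keyword):
    """Look the keyword up in the index, then order matches by a toy score."""
    keyword = keyword.lower()
    matches = index.get(keyword, set())
    # Toy ranking: more occurrences first, ties broken alphabetically by URL.
    return sorted(
        matches,
        key=lambda url: (-pages[url].lower().split().count(keyword), url),
    )

index = build_index(page_directory)
print(search(index, page_directory, "widgets"))
```

In this sketch the index entries hold the URL itself, which is exactly the design the question below asks about.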

      The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory".  The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a url in the "page directory", and the entries in the "index" contain these urls.   Since Google's "page directory" is a cache, would the urls be the same as the live website (and would the keywords in the "index" point to these urls)?

      For example, if a webpage is found at www.website.com/page1, would the "page directory" store this page under that url in Google's cache?

      The reason I want to discuss this is to understand the search process better, so I know the effects of changing a page's url.

      • TakeshiYoung
        TakeshiYoung last edited by

        This is a pretty confusing question, and the terminology you use is different from the industry standard. Check out these links for a quick overview of how Google works:

        • http://www.google.com/insidesearch/howsearchworks/thestory/
        • http://www.googleguide.com/google_works.html

        If you are just worried about changing a page's url, just be sure to put in a 301 redirect from the old page to the new page. That way, even if Google has an older version of the page indexed, it will automatically redirect the user to the new page as well as help Google discover the new location of the page.

        • cbielich
          cbielich last edited by

          Wow you just asked questions that would require about 10,000,000,000 answers 😉

          Let's start here:

          1. Video from the man himself, Mr. Matt Cutts - (Matt Cutts, works for Google)
          2. Great Web 2.0 page created by Google themselves - (Google themselves)
          3. Older but still relevant description of how "backlinks" affect PR - (Google themselves)
          • reidsteven75
            reidsteven75 @TakeshiYoung last edited by

            Thanks for the response and links, Takeshi.  Maybe I can rephrase the question to be more clear. Let's say a piece of content (or page) is at the url "www.oldurl.com/page".  During a migration, this same piece of content is moved to the url "www.newurl.com/page".   The "www.oldurl.com" site doesn't exist anymore, so there isn't duplicate content on the live web.

            Would Google create a new entry in its "page directory" (what is the industry standard name for this directory?) and give it the url "www.newurl.com/page"?

            If it does create a new entry, would Google keep the old entry "www.oldurl.com/page" although the old url doesn't exist in the "live" web anymore?

            • TakeshiYoung
              TakeshiYoung @reidsteven75 last edited by

              Just because you create a new page and delete the old one, Google won't know immediately about it. So if Google crawls the new page before it's had a chance to crawl the old one, then it will indeed consider the new page to be duplicate content. Then when it tries to crawl the old page, it will discover that it no longer exists. However, as long as links to the old page exist, it will continue to try to crawl that page. Eventually it may de-index the old page if it keeps returning an error.

              Bottom line, if you are moving content to a new URL, be sure to include a 301 redirect on the old page so that Google (and other search engines) know that the piece of content has moved. You can also do this with canonical tags, but 301s are more effective.
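To make the 301 advice concrete, here is a minimal sketch using Python's standard `http.server` module. The `REDIRECTS` table, the `respond` helper, and the `https://www.newurl.com/page` target are assumptions made for illustration; a real site would normally configure redirects in its web server or CMS instead.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical table of moved pages: old path -> new absolute URL.
REDIRECTS = {
    "/page": "https://www.newurl.com/page",
}

def respond(path):
    """Return (status, headers) for a request that reaches the old site."""
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 Moved Permanently: the Location header names the new URL.
        return 301, {"Location": target}
    return 404, {}

class RedirectHandler(BaseHTTPRequestHandler):
    """Answers GET requests on the old domain using the table above."""

    def do_GET(self):
        status, headers = respond(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()

print(respond("/page"))
```

To try it locally, the handler could be served with `HTTPServer(("", 8000), RedirectHandler).serve_forever()`, importing `HTTPServer` from `http.server`.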

              • reidsteven75
                reidsteven75 last edited by

                Hey Cesar,

                Thanks for the links!  Really useful info there.

                Unfortunately I couldn't find the answer I was looking for, so I'll be more specific in what I'm asking.

                From what I understand, Google uses two database systems.   One contains keywords and the other contains cached pages.  How does a keyword entry point to a page entry?  Does it use a unique id number, or does it use the url that the page is using in the "live" version on the web?

                • reidsteven75
                  reidsteven75 @TakeshiYoung last edited by

                  That makes sense, thanks for getting back to me so fast!

                  Perhaps you can help answer my next question.  I have a client who used to host his website at "www.oldurl.com", and has migrated it to "www.newurl.com".  He wants to keep using his old domain "www.oldurl.com", so he set up forwarding/masking so that when someone tries to access "www.oldurl.com" they are forwarded to "www.newurl.com" but the url shown to the user is "www.oldurl.com".

                  My client wants his old url "www.oldurl.com" to be ranked in Google, but from what I understand his new url will be ranked.  I know masking is really bad for SEO, and I want to educate my client as to why on the technical side.  I have read that Google sees all the content as duplicate with masking.  Do you know the details as to why?

                  • cbielich
                    cbielich @reidsteven75 last edited by

                    That is a question that no one here can answer. We can't speak for how Google does things internally.

                    But... as a web / database programmer for 14+ years, let me tell you how it's "generally" done.

                    Usually, when you have to link two separate sets of data together (i.e. databases or tables), a unique_id is created to link them, and it usually never changes. So when a new record is created, that record will live with that ID for its whole life. This ID is also known as a unique identifier, and it tends to be an auto-incremented number that is dynamically generated and cannot be repeated.

                    Since records tend to be linked this way, any other fields that exist in the record (firstName, lastName, Url, blah blah) can then be changed without the original ID being disturbed.

                    So to answer your question from my experience I would assume Google links from a unique identifier of some sort and not the URL directly.
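Here is a small Python sketch of that general pattern, with a keyword table that points at pages through an auto-incremented ID. Every name in it (`pages`, `keywords`, `add_page`, `lookup`) is hypothetical, and it only illustrates the database convention described above, not anything known about Google's systems.

```python
import itertools

_next_id = itertools.count(1)  # auto-incrementing surrogate key

pages = {}     # page_id -> {"url": ..., "content": ...}
keywords = {}  # keyword -> set of page_ids

def add_page(url, content):
    """Store a page under a fresh ID and link its keywords to that ID."""
    page_id = next(_next_id)
    pages[page_id] = {"url": url, "content": content}
    for word in content.lower().split():
        keywords.setdefault(word, set()).add(page_id)
    return page_id

def lookup(keyword):
    """Resolve keyword -> page IDs -> current URLs."""
    ids = sorted(keywords.get(keyword.lower(), set()))
    return [pages[page_id]["url"] for page_id in ids]

pid = add_page("www.oldurl.com/page", "handmade oak tables")
pages[pid]["url"] = "www.newurl.com/page"  # the URL field changes, the ID does not
print(lookup("oak"))
```

Because the keyword entries store only the ID, changing the URL field needs no update on the keyword side.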

                    Hope I didn't lose you, it's my favorite subject... but no one here speaks that language too much 🙂

                    • reidsteven75
                      reidsteven75 @cbielich last edited by

                      Yeah that makes sense.  I also have a lot of experience with databases and the back ends of websites so I know your language.

                      I'm wondering how Google correlates the url with the page entries then. Maybe each page entry would have a url field so Google knows the location of the live version to constantly update that entry in the "page directory" database?
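The idea of a url field on each page entry could look something like the following Python sketch. This is purely a guess at a plausible design: the entry layout, the `recrawl` function, and the `live_web` dictionary standing in for real HTTP fetches are all assumptions.

```python
def recrawl(entries, fetch):
    """Refresh each cached copy by fetching the live page at its stored URL."""
    for entry in entries:
        entry["cached_content"] = fetch(entry["url"])

# Hypothetical stand-in for the live web, so the sketch runs offline.
live_web = {"www.newurl.com/page": "fresh content"}

entries = [
    {"page_id": 1, "url": "www.newurl.com/page", "cached_content": "stale content"},
]

# The fetch callable is injected; here it just reads from the dictionary above.
recrawl(entries, lambda url: live_web.get(url, ""))
print(entries[0]["cached_content"])
```

The ID stays fixed while the url field tells the crawler where to look for the live version.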
