The Moz Q&A Forum

    XML Sitemap Questions For Big Site

    Intermediate & Advanced SEO
    • keywordwizzard

      Hey Guys,

      I have a few questions about XML sitemaps.

      • For a social site that is going to have personal accounts created, what is the best way to get them indexed? When it comes to profiles, I found out that Twitter (https://twitter.com/i/directory/profiles) and Facebook (https://www.facebook.com/find-friends?ref=pf) have directory pages, but Google Plus has XML index pages (http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml).

      • If we go the XML route, how would we automatically add new profiles to the sitemap? Or is the only option to keep updating the XML profile sitemaps using third-party software (sitemapwriter)?

      • If a user chooses not to have their profile indexed (by default it will be indexable), how do we go about deindexing that profile? Is there an automatic way of doing this?

      • Lastly, has anyone dabbled with Google Sitemap Generator (https://code.google.com/p/googlesitemapgenerator/)? If so, do you recommend it?

      Thank you!

      • LesleyPaone

        If it were me and someone were asking me to design a system like that, I would design it in a few parts.

        First, I would create an application that handles the sitemap for everything except profiles: just your TOS, sign-up pages, terms, and whatever other pages like that.

        Then I would design a system that handles the actual profiles. It would get pretty complex and resource intensive as the site grew, but the main idea flows like this:

        Start generation, grab the user record with ID 1 in the database, check whether it is indexable (move on to the next record if not), see what pages are connected to it, write them to the XML file, then loop back and start on record #2.

        There are a few constraints to account for. You need to keep track of the number of records in a file and start another file when you reach the limit: you can only have 50k records in one file.
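The loop described above can be sketched in Python as follows. This is only an illustration, not the PHP script the post refers to: the record keys `username` and `indexable`, the `base_url` argument, and the function names are all hypothetical. It skips profiles that opted out and starts a new file every 50,000 URLs.

```python
from itertools import islice
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit mentioned in the post


def chunked(iterable, size):
    """Yield lists of at most `size` items until the iterable is exhausted."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def render_sitemap(urls):
    """Return one sitemap document (as a string) for the given URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def build_profile_sitemaps(users, base_url, max_urls=MAX_URLS_PER_FILE):
    """Build sitemap documents from user records.

    `users` is any iterable of dicts with the hypothetical keys
    'username' and 'indexable'; non-indexable profiles are skipped.
    """
    urls = (f"{base_url}/{u['username']}" for u in users if u["indexable"])
    return [render_sitemap(batch) for batch in chunked(urls, max_urls)]
```

Because `users` can be a generator or a database cursor, the records never have to be loaded into memory all at once, although this sketch does keep every rendered file in a list; a real job would more likely write each file to disk as it is produced.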

        The way I would handle the whole process for a large site is this: sync the required tables to another instance (server) via a weekly or daily cron job. Call the PHP script (because that is what I use) that creates the first sitemap for the normal site-wide pages. At the end of that sitemap, put the location of the user profile sitemap; then, at the end of the script, execute the script that generates the user profile sitemap. At the end of each sitemap, put the location of the next sitemap file, because as you grow it might take anywhere from 2 to 10,000 sitemap files.
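The post chains the files by putting the next file's location at the end of each one. The sitemaps.org protocol also defines a sitemap index file, which lists every sitemap file in one place and is another way to tie a growing set of files together. A minimal Python sketch, assuming hypothetical file URLs and a hypothetical `render_sitemap_index` helper:

```python
from xml.sax.saxutils import escape


def render_sitemap_index(sitemap_urls, lastmod=None):
    """Return a sitemap index document listing the given sitemap files.

    `lastmod` is an optional date string (for example "2024-01-01")
    applied to every entry; a real job would track it per file.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in sitemap_urls:
        entry = f"<loc>{escape(url)}</loc>"
        if lastmod:
            entry += f"<lastmod>{escape(lastmod)}</lastmod>"
        lines.append(f"  <sitemap>{entry}</sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"


# Hypothetical file names: one sitemap for site-wide pages, two for profiles.
index_xml = render_sitemap_index(
    [
        "https://example.com/sitemap-pages.xml",
        "https://example.com/sitemap-profiles-1.xml",
        "https://example.com/sitemap-profiles-2.xml",
    ],
    lastmod="2024-01-01",
)
```

With an index, only the index URL has to be submitted to the search engines, and the cron job can regenerate it whenever a new profile file is added.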

        One thing I would be sure to do is get a list of crawler IP addresses and add an allow/deny rule to your .htaccess. That way you can make the sitemaps visible only to the search engines.
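The post does this with an .htaccess rule. For anyone serving the sitemaps from application code instead, the same allow/deny idea might look like the Python sketch below. The network ranges are reserved documentation addresses standing in for real crawler ranges, which would have to come from each search engine's own published list, and `is_allowed_crawler` is a hypothetical name.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT real crawler addresses.
ALLOWED_NETWORKS = ["192.0.2.0/24", "2001:db8::/32"]


def is_allowed_crawler(client_ip, allowed_networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any allowed network."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(net) for net in allowed_networks)
```

A request handler would call `is_allowed_crawler(remote_addr)` before returning a sitemap file and respond with a 403 otherwise.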

        • spencerhjustice @LesleyPaone

          I'm not a web developer, so this may be wrong, but I feel like it might be easier to just add every user to the XML sitemap and then add a noindex robots meta tag to the pages of users who don't want their profiles to be indexed.
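As a rough illustration of that suggestion, the profile template could pick the robots meta tag from the user's privacy setting. This is a Python sketch under assumptions: `robots_meta_tag` and its `allow_indexing` flag are hypothetical names, not part of any particular framework.

```python
def robots_meta_tag(allow_indexing: bool) -> str:
    """Return the robots meta tag for a profile page's <head>."""
    content = "index, follow" if allow_indexing else "noindex"
    return f'<meta name="robots" content="{content}">'


# A user who opted out of indexing gets the noindex tag.
opted_out_tag = robots_meta_tag(allow_indexing=False)
```

The tag only takes effect once a crawler fetches the page again, so a profile that was already indexed would drop out after its next crawl rather than immediately.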

          • LesleyPaone @spencerhjustice

            I guess the way I was explaining it was for scalability on a large site. Keep in mind that a site like FB or Twitter, with hundreds of millions of users, still has the limitation of only 50k records per sitemap. So if they are running sitemaps, they have thousands of them.
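To put a number on that, the file count is the profile count divided by 50,000, rounded up. A quick Python check, using a made-up figure of 200 million profiles:

```python
MAX_URLS_PER_FILE = 50_000  # per-file limit mentioned above


def sitemap_files_needed(profile_count, max_urls=MAX_URLS_PER_FILE):
    """Ceiling division: how many sitemap files hold profile_count URLs."""
    return -(-profile_count // max_urls)


# 200 million profiles is an illustrative figure, not a real user count.
print(sitemap_files_needed(200_000_000))  # 4000
```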

            • keywordwizzard @LesleyPaone

              Thanks for the input, guys!

              I believe Twitter and Facebook don't run sitemaps for their profiles. What they have is a directory of all their profiles (Twitter: https://twitter.com/i/directory/profiles Facebook: https://www.facebook.com/find-friends?ref=pf), and they use that to get their profiles crawled. However, I feel the best approach is XML sitemaps, and Google Plus actually does this with its profiles (http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml). Quite frankly, I would rather follow Google than FB or Twitter... I'm just now wondering how the hell they maintain that monster! Does it create a new sitemap every time one hits 50k? When do they update their sitemap (daily, weekly, or monthly), and how?

              One other question I have is whether there are any penalties for getting a lot of pages crawled at once. Meaning one day we have 10 pages and the next we have 10,000 or 50,000 pages...

              Thanks again guys!
