The Moz Q&A Forum

    Question about construction of our sitemap URL in robots.txt file

    Technical SEO Issues
    • danatanseo
      danatanseo last edited by

      Hi all,

      This is a Webmaster/SEO question. This is the sitemap URL currently in our robots.txt file:

      http://www.ccisolutions.com/sitemap.xml

      As you can see it leads to a page with two URLs on it. Is this a problem? Wouldn't it be better to list both of those XML files as separate line items in the robots.txt file?

      Thanks!

      Dana

      • JarnoNijzing
        JarnoNijzing last edited by

        Dana,

        The structure of your sitemap.xml looks unusual to me. I use an external program to build the sitemap.xml for my entire website.

        You now have a link in your robots.txt file pointing to a sitemap that contains 2 files (both .xml), each with a map of the site?

        Why not use a program (free or paid, like Microsys A1, the one I use) to build 1 sitemap.xml and point to this file from your robots.txt?

        Hope this helps.

        If you have any questions, please let me know.

        Kind regards,

        Jarno

        • Prospector-Plastics
          Prospector-Plastics last edited by

          Another tool to help generate a sitemap and even check broken links is called Xenu (weird logo, but good free product).

          • ChristopherGlaeser
            ChristopherGlaeser last edited by

            There is a limit on the size of a sitemap, and to allow large sitemaps to be split into smaller ones, the sitemap protocol includes a sitemapindex. See "Using Sitemap index files (to group multiple sitemap files)" at http://www.sitemaps.org/protocol.html. It's also possible to list the multiple sitemaps directly in the robots.txt file, but automated sitemap generators will likely use the sitemapindex feature so that the robots.txt file does not have to be modified as the size of the site changes.

            Best,
            Christopher
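
To make the two options concrete, here is a minimal Python sketch (not from the thread) that pulls the `Sitemap:` lines out of a robots.txt file. It assumes the field name is matched case-insensitively and that `#` starts a comment. The two file names in the second example are hypothetical; only http://www.ccisolutions.com/sitemap.xml appears in the question.

```python
from __future__ import annotations


def sitemap_urls_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs named by Sitemap: lines, in file order."""
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split on the first colon only, because the URL itself contains colons.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Option 1: a single line pointing at a sitemap index (the setup in the question).
ROBOTS_WITH_INDEX = """\
User-agent: *
Disallow:
Sitemap: http://www.ccisolutions.com/sitemap.xml
"""

# Option 2: one line per sitemap file (these two file names are hypothetical).
ROBOTS_WITH_TWO_FILES = """\
User-agent: *
Disallow:
Sitemap: http://www.ccisolutions.com/sitemap-pages.xml
Sitemap: http://www.ccisolutions.com/sitemap-images.xml
"""

if __name__ == "__main__":
    print(sitemap_urls_from_robots(ROBOTS_WITH_INDEX))
    print(sitemap_urls_from_robots(ROBOTS_WITH_TWO_FILES))
```

With option 1 the robots.txt file stays the same when sitemap files are added or removed, because only the index changes.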

            • danatanseo
              danatanseo @JarnoNijzing last edited by

              Thanks Jarno,

              I have downloaded and am trying the 30-day free trial of the A1 Sitemap Generator right now. Thanks for the tip. Can you comment on Christopher's remark below concerning sitemap indexes for larger sitemaps?

              Can either you or Christopher give me more clarification on that? Is this what our IT director has attempted to do with the sitemap in our robots.txt file? If so, has it been done correctly?

              Thanks!

              • danatanseo
                danatanseo @ChristopherGlaeser last edited by

                Thanks Christopher,

                Your answer took a moment to sink in, but I think I get it (I think I am coffee deprived this morning).

                So, if I am using the A1 Sitemap generator that Jarno suggested, this sitemap index should automatically be generated based on the size of my generated sitemap. Is that correct?

                • ChristopherGlaeser
                  ChristopherGlaeser @danatanseo last edited by

                  I'm not familiar with the A1 Sitemap generator, but regarding the sitemap protocol, there is a limit on the size of a single sitemap.xml file, so for large sites, the sitemap must be split into multiple sitemap.xml files.  And, the protocol has a method for indexing these multiple sitemap.xml files.  It's sort of like an index to an index.  None of my sites exceed the sitemap file limit, so I don't know which sitemap generators use this approach, but I would guess many of them do.

                  Sitemap generators I have used include DMXZone which is a Dreamweaver plugin, and xml-sitemaps.com which includes a video sitemap generator.

                  Best,
                  Christopher

                  EDIT: PS: Your current sitemap looks fine to me.
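
For anyone who wants to check what a sitemap URL actually serves, here is a rough Python sketch that tells a sitemap index apart from an ordinary sitemap and lists the entries. The sample XML and the example.com URLs are made up for illustration; the namespace is the one from sitemaps.org.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol, in ElementTree's {uri}tag form.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Made-up sample of a sitemap index that points at two sitemap files.
SAMPLE_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap-2.xml</loc></sitemap>
</sitemapindex>
"""


def describe_sitemap(xml_text):
    """Return (kind, locs) where kind is 'index', 'urlset', or 'unknown'."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    if root.tag == NS + "sitemapindex":
        # An index lists other sitemap files.
        kind, path = "index", f"{NS}sitemap/{NS}loc"
    elif root.tag == NS + "urlset":
        # An ordinary sitemap lists page URLs.
        kind, path = "urlset", f"{NS}url/{NS}loc"
    else:
        return "unknown", []
    return kind, [el.text.strip() for el in root.findall(path) if el.text]


if __name__ == "__main__":
    kind, locs = describe_sitemap(SAMPLE_INDEX)
    print(kind)
    for loc in locs:
        print(loc)
```

A page that shows two XML URLs, as described in the question, would come back as kind `index` with two entries if it follows the protocol.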

                  • danatanseo
                    danatanseo @JarnoNijzing last edited by

                    Hi again Jarno,

                    Is it normal for A1's sitemap generator's "Scan website"  function for images to take over two hours? Our site is about 3,500 URLs. So far it has under "Internal 'sitemap' URLs"  Listed found: 82076 (and climbing every few seconds).

                    I am wondering if there isn't something wrong? (I don't have any frame of reference since I've never used it before). Thanks!

                    Dana

                    • JarnoNijzing
                      JarnoNijzing @danatanseo last edited by

                      A1 Sitemap does 2 things:

                      1. It builds a file named sitemap.xml which contains all files on the website (this does not conform to the Google requirements).

                      2. It builds a number of files listed in sitemap-index.xml, with one sitemap for every 100 pages. So if your website contains 2800 pages you'll get loads of files: 28 numbered sitemaps (sitemap-1.xml etc.) and 1 sitemap-index.xml file. This does meet the Google standards.

                      Afterwards you can do 2 things in Google Webmasters:

                      1. Enter the sitemap-index.xml file as a sitemap -> Google will follow everything and come to the grand total of 2800 pages.

                      2. Enter each sitemap separately -> same result, but you can better pinpoint where you have 100 pages and Google indexes fewer (this can happen).

                      Hope this helps
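
The split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how A1 Sitemap Generator works internally. The function name `build_sitemaps`, the `per_file` parameter, and the example.com URLs are assumptions; the file names follow the sitemap-1.xml / sitemap-index.xml pattern mentioned in this post.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemaps(page_urls, base_url, per_file=100):
    """Return {file name: XML text} for the numbered sitemaps plus the index."""
    files = {}
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for start in range(0, len(page_urls), per_file):
        name = f"sitemap-{start // per_file + 1}.xml"
        # One <urlset> file per chunk of page URLs.
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for page in page_urls[start:start + per_file]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = page
        files[name] = ET.tostring(urlset, encoding="unicode")
        # The index gets one <sitemap> entry pointing at that file.
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = base_url + name
    files["sitemap-index.xml"] = ET.tostring(index, encoding="unicode")
    return files


if __name__ == "__main__":
    pages = [f"http://www.example.com/page-{i}" for i in range(2800)]
    files = build_sitemaps(pages, "http://www.example.com/")
    print(len(files))  # 28 numbered sitemaps + 1 index
```

The output here has no XML declaration line; a real generator would normally add one when writing the files to disk.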

                      • JarnoNijzing
                        JarnoNijzing @danatanseo last edited by

                        Dana,

                        Sometimes that happens. Are you scanning for images or are you scanning the site?

                        I will check your site tomorrow with my full version and see what it does.

                        With some websites you'll get things like this, and it can have loads of causes. 3500 pages should not take 2 hours, only a couple of minutes. I'll check it first thing tomorrow; A1 is not installed on my laptop.

                        I'll let you know tomorrow.

                        Kind regards

                        Jarno

                        • danatanseo
                          danatanseo @danatanseo last edited by

                          Thanks Jarno. I really appreciate that. Yes, I had it selected to just scan for images (as prompted when I attempted to create an image sitemap). Let me know what you see? I am wondering if it is going around in circles? 🙂

                          Dana

                          • JarnoNijzing
                            JarnoNijzing @danatanseo last edited by

                            I started the scan and it's still busy:

                            2500 analyzed references so far.

                            I'll let you know how it turns out.

                            Jarno

                            • JarnoNijzing
                              JarnoNijzing @danatanseo last edited by

                              Dana,

                              It just finished scanning. Here are the results:

                              Internal Sitemap URLs:

                              • Listed found: 5248
                              • Listed deduced: 5301
                              • Analyzed content: 3110
                              • Analyzed references: 3176

                              External URLs:

                              • Listed found: 700

                              When I look at the overview of the results I see a number of 301 redirects and canonical redirects (when tested again they return code 200 OK). But I see a lot of pages.

                              When I build the sitemap it generates one file (no idea why not more than one) with all the links in the document. Google's sitemap protocol states it should follow the schema at sitemaps.org, which it does. The protocol at sitemaps.org states that a sitemap cannot hold more than 50,000 links and should be smaller than 10 MB in file size.

                              The one I just built for you is only 1 MB and contains fewer than 50,000 URLs, and thus it is allowed by Google.

                              http://www.sitemaps.org/protocol.html

                              I can send you the entire version of the sitemap if you'd like in a personal message or through e-mail?

                              Hope this helps you further.

                              kind regards

                              Jarno
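
The limit check described in this post is simple arithmetic, sketched here in Python. The 50,000 URL and 10 MB figures are the ones quoted in the thread; the byte limit is a parameter because the current figure should be checked against the protocol text at sitemaps.org. The function names are hypothetical.

```python
MAX_URLS = 50_000
MAX_BYTES = 10 * 1024 * 1024  # 10 MB, the figure quoted in the thread


def within_limits(url_count, size_bytes, max_urls=MAX_URLS, max_bytes=MAX_BYTES):
    """True if a single sitemap file stays inside both limits."""
    return url_count <= max_urls and size_bytes <= max_bytes


def files_needed(url_count, max_urls=MAX_URLS):
    """Smallest number of sitemap files for url_count URLs (ceiling division)."""
    return max(1, -(-url_count // max_urls))


if __name__ == "__main__":
    # Figures from the scan above: 5248 URLs found, a file of about 1 MB.
    print(within_limits(5248, 1 * 1024 * 1024))
    print(files_needed(5248))
```

A site only needs a sitemap index once `files_needed` goes above 1, or once a single file would exceed the size limit.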

                              • danatanseo
                                danatanseo @danatanseo last edited by

                                Hi Christopher,

                                Thanks for the update. Yes, I looked at it too, and other than it not being "pretty" XML, the data seemed to be okay. The one thing the A1 generator did that we couldn't do was assign values for the importance of specific pages and for how frequently they are modified. If that data is accurate, that's pretty cool. I'm just not sure, although it does seem to have correctly identified the pages that are modified more frequently. I have 30 days to play with the free trial, but so far I think I like it a lot.

                                Dana
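
In the sitemap protocol those two values are the optional `<priority>` and `<changefreq>` elements of a `<url>` entry. Here is a small Python sketch of what one entry looks like. The `url_entry` helper and the example values are made up for illustration and say nothing about how A1 chooses its values.

```python
import xml.etree.ElementTree as ET

# The change-frequency values the sitemap protocol allows.
ALLOWED_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def url_entry(loc, changefreq, priority):
    """Build one <url> entry with a change frequency and a priority."""
    if changefreq not in ALLOWED_CHANGEFREQ:
        raise ValueError(f"unsupported changefreq: {changefreq!r}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "changefreq").text = changefreq
    ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return ET.tostring(url, encoding="unicode")


if __name__ == "__main__":
    print(url_entry("http://www.example.com/", "daily", 0.8))
```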

                                • danatanseo
                                  danatanseo @Prospector-Plastics last edited by

                                  Yes, we definitely use XENU, but I think I like Screaming Frog a bit better (although our IT Director swears it's broken).

                                  • danatanseo
                                    danatanseo @danatanseo last edited by

                                    Hi Jarno,

                                    Thanks so very much! I have to say I am really liking the A1 generator. How awesome of you to follow up. I really appreciate that. Yes, if you want to send me the complete sitemap via PM that would be awesome. I certainly hope I can return the favor 🙂 Happy Holidays!

                                    Dana
