The Moz Q&A Forum

    Canonical OR redirect

    Intermediate & Advanced SEO
    • stassaf
      stassaf @Inductive_Automation last edited by

      Thanks Jesse!

      1. The content is different: according to a comparison tool the two pages are 64% similar, and considering that the menus, site header, and other elements appear on every page, you could say they're unique, couldn't you? Even so, Google hasn't indexed the second page, and it has been up for 5 days. The sitemap indexing rate is 90% according to Google Webmaster Tools. So what's wrong here?

      2. Including the date seems like a good idea! But two questions about it:

      - Won't the URL look messy with these numeric inputs?

      - The same match can be repeated in the future. Isn't it a good idea that the page is already indexed? I mean, the URL will stay the same, just the content will be different.

      Thanks,

      Assaf.

      • Dr-Pete
        Dr-Pete last edited by

        This is a pretty common problem with event-oriented sites, and there's no easy solution. It's a trade-off: if you keep creating new URLs every time a new event is listed, you risk producing a lot of near duplicates and eventually diluting your index. At best, you could have dozens or hundreds of pages competing for the same keywords.

        You could canonical or 301-redirect to the most recent event, but that has trade-offs, too. For one, a huge number of either can look odd to Google. Also, the latest event may not always be an appropriate target page, especially if more than just the data is changing. Unfortunately, without seeing the content, it's really tough to tell.

        The other option is to create a static URL for every pairing and update the content on that page (maybe creating archival URLs for the old content, that are lower priority in the site architecture). That way, the most current URL never changes. Again, this depends a lot on the site and the scope.
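As a rough illustration of the static-URL idea, here is a minimal Python sketch. The `/sport/match/` path and the `www.domain.com` placeholder come from the URL pattern quoted elsewhere in this thread; the function names, the slug format, and the date-stamped `/archive/` layout are assumptions made up for this example, not the site's actual scheme.

```python
from datetime import date

# Placeholder domain and folder, borrowed from the example URL in the thread.
BASE = "https://www.domain.com/sport/match"


def slug(team: str) -> str:
    """Lowercase a team name and join its words with hyphens."""
    return "-".join(team.lower().split())


def current_url(team_a: str, team_b: str) -> str:
    """Permanent URL for a pairing; only the page content changes per match."""
    return f"{BASE}/{slug(team_a)}-v-{slug(team_b)}"


def archive_url(team_a: str, team_b: str, played_on: date) -> str:
    """Hypothetical date-stamped URL for an old match, kept lower in the site."""
    return f"{BASE}/archive/{played_on.isoformat()}/{slug(team_a)}-v-{slug(team_b)}"


print(current_url("Team One", "Team Two"))
print(archive_url("Team One", "Team Two", date(2013, 5, 4)))
```

With this layout the URL that ranks never changes, and only the archive copies pick up a date.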

        If you're just talking about a couple of URLs for a handful of events, I wouldn't worry too much about it. I probably wouldn't reverse the URL ("A vs. B" --> "B vs. A"), as it doesn't gain you much, but I also wouldn't lose sleep over it. If each pairing can generate dozens of URLs, though, I think you may want to consider a change in your site architecture.

        • stassaf
          stassaf @Dr-Pete last edited by

          Hi Dr. Meyers,

          Thanks for your detailed response.

          I just wanted to refine my scenario:

          1. The case of pairs (a repeat match after a short time) is rare, but I have encountered it.

          2. There are no links or sitemap entries for a match that has already finished, but Google keeps it in the index. The page is reachable ONLY by direct URL or from the SERP.

          3. I don't think I can force Google to automatically remove the old match from the index, and doing it manually for thousands of matches is not an option.

          4. I thought Google looks at the content of each page to determine whether it's a duplicate, and not only at the URL/title. According to the tool, the content is only 66% similar.

          5. Currently I have this problem twice, so for one case I've added rel=canonical and for the other I'm letting Google decide. When Google encounters a rel=canonical, does it go to the canonical URL?

          Thanks,

          Assaf.

          • stassaf
            stassaf last edited by

            I've got some good responses, but I'm not sure what to do.

            Any other opinions would be highly appreciated.

            Thanks!

            • Dr-Pete
              Dr-Pete @stassaf last edited by

              The problem with (2) is that, if you cut the crawl path, Google can't process any on-page directives, like 301s, canonicals, etc. Now, eventually, they might try to re-crawl from the index (knowing the URL used to exist), but that can take a long time. So, while canonical is probably appropriate here, you may have to leave the old event/URL active long enough for Google to process the tag.

              If these are really isolated cases, I wouldn't worry too much. Maybe rel=canonical them, and eventually Google will flush out the old URL. If this starts happening a lot, I'd really consider some kind of permanent URL for certain match-ups and events.
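For the isolated cases described above, the rel=canonical tag itself is just a link element in the old page's head. The sketch below shows one way to generate it in Python; the function name and both URLs are hypothetical, and the old page has to stay live and crawlable for the tag to be seen at all.

```python
from html import escape


def canonical_tag(target_url: str) -> str:
    """Return a rel=canonical link element pointing at target_url."""
    return f'<link rel="canonical" href="{escape(target_url, quote=True)}">'


# Hypothetical old and current URLs for a repeated match-up.
old_url = "https://www.domain.com/sport/match/T1vT2-old"
new_url = "https://www.domain.com/sport/match/T1vT2"

# This tag would be placed in the <head> of the page served at old_url.
print(canonical_tag(new_url))
```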

              There's no easy answer. This stuff is very site-specific and can be tricky.

              • stassaf
                stassaf @Dr-Pete last edited by

                Dear Dr. Meyers,

                I'm starting to understand that I have a much bigger problem.

                All finished matches are no longer relevant, and although you can reach their pages from the SERP or by direct URL, they don't appear in the site's links or in the sitemap. So the best idea is to remove all these old pages from Google's index: they don't contribute anything, and they make my index status show 120k pages while only 2,000 are currently relevant.

                This wastes Google's crawling on irrelevant pages, and there's a risk that Google may see some of them as dupes, because in some cases most of the page is relatively similar.

                One suggestion I got is to programmatically add a noindex robots meta tag to the page after a match finishes, so that Google will remove it from its index. Will Google remove it if there are no links or sitemap entries pointing to this page?
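The suggestion above could be sketched in Python roughly as follows. The tag text is the one quoted in this post; the `Match` record, its fields, and the `robots_meta` helper are assumptions invented for the example, since the thread does not show how the site renders its pages.

```python
from dataclasses import dataclass

# The robots meta tag quoted in this thread.
NOINDEX_TAG = '<meta name="robots" content="noindex,follow">'


@dataclass
class Match:
    """Hypothetical record for a match page."""
    url: str
    finished: bool


def robots_meta(match: Match) -> str:
    """Return the noindex tag for a finished match, or '' to stay indexable."""
    return NOINDEX_TAG if match.finished else ""


live = Match("http://www.domain.com/sport/match/T1vT2", finished=False)
done = Match("http://www.domain.com/sport/match/T3vT4", finished=True)
print(repr(robots_meta(live)))
print(robots_meta(done))
```

The page template would emit the returned string inside the head element, so a page switches to noindex the next time it is rendered after the match ends.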

                But I also have to handle the problem of the huge index. The above approach may or may not handle pages from now on, but what about all the older pages with finished matches? How can I remove them all from the index?

                • Adding <meta name="robots" content="noindex,follow"> to all of them could take months or more to clean the index, because they're probably rarely crawled.

                • A more aggressive approach would be to change the site architecture and use robots.txt to restrict the folder that holds all the past irrelevant pages.

                So if today a match URL looks like this: www.domain.com/sport/match/T1vT2

                restrict www.domain.com/sport/match/ in robots.txt

                and from now on create all new matches in a different folder, like: www.domain.com/sport/new-match/T1vT2

                • Is this a good solution?

                • Wouldn't Google penalize me for removing a directory with 100k pages?

                • If it's a good approach, how much time will it take for Google to clear all those pages from its index?
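To make the folder idea concrete, here is a small Python check using the standard library's `urllib.robotparser`. It assumes a robots.txt with a single `Disallow: /sport/match/` rule (the rule text is an assumption based on the folder named above) and only shows which URLs that rule would block from crawling. It says nothing about whether already-indexed pages get removed.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for the proposed setup.
rules = """\
User-agent: *
Disallow: /sport/match/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

old_url = "http://www.domain.com/sport/match/T1vT2"
new_url = "http://www.domain.com/sport/new-match/T1vT2"

# The rule matches by path prefix, so only the old folder is blocked.
old_allowed = parser.can_fetch("Googlebot", old_url)
new_allowed = parser.can_fetch("Googlebot", new_url)
print(old_allowed, new_allowed)
```

Note that a blocked page can no longer be crawled, so any noindex or canonical tag placed on it would not be read.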

                I know it's a long one and i'll really appreciate your response.

                Thanks a lot,

                Assaf.

                • Dr-Pete
                  Dr-Pete @stassaf last edited by

                  Oh, wow - yeah if only 2K are current and 120K are indexed, you definitely should be proactive about this. Unfortunately de-indexing content that's already been indexed is tough. Robots.txt isn't terribly effective after-the-fact, and the folder-based approach you've described won't work. You can move the pages and remove the folder (either with Robots.txt or in Webmaster Tools), but you haven't tied the old URLs to the new URLs. To remove them, first you have to tell Google they've moved.

                  First, pick your method. If these old events have any links/traffic/etc., then you may want to rel=canonical or 301-redirect. Otherwise, you could META NOINDEX or even 404. It depends a bit on their value. Then, a couple of options:

                  (1) You can wait and see. Let Google clear out the old events over time. If you're not at any risk, this may be fine. Monitor and see what happens.

                  (2) Encourage Google to re-crawl the old pages by creating a new, stand-alone sitemap. Then, monitor that sitemap in GWT for indexation. You don't have to do all 120K at once, but you could start with a few hundred (hopefully, you can build the XML with code, not by hand) and see how it progresses.
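Option (2) might look something like the Python sketch below, which builds a stand-alone sitemap from the first batch of old URLs. The URL list is generated placeholder data, and the batch size of 300 and the helper names are assumptions; only the sitemap namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Serialize a list of URLs as a sitemap XML string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def batches(urls, size):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Placeholder data standing in for the real list of finished-match URLs.
old_urls = [f"http://www.domain.com/sport/match/T{i}vT{i + 1}" for i in range(1, 1001)]

first_batch = next(batches(old_urls, 300))
sitemap_xml = build_sitemap(first_batch)
print(len(first_batch))
print(sitemap_xml[:100])
```

Each later batch can be written to its own file, which makes it easier to watch indexation per sitemap.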

                  • stassaf
                    stassaf @Dr-Pete last edited by

                    Dear Dr. Meyers,

                    Very insightful!

                    I must clear all the irrelevant pages, and the sooner the better.

                    (1) could take months or years.

                    (2) sounds like a very good approach. I'm building my sitemap with code, so that's not a problem. The only problem is that with a few hundred at a time it could also take a long time. And wouldn't Google spend a lot of time crawling those pages and index fewer of the fresh new ones?

                    (3) What about Google's removal tool? This is connected to the point in my last post about setting up a new site architecture:

                    • for all new matches (pages), create a new directory (without the irrelevant pages)
                    • ask the WMT removal tool to remove the old directory and with it all the irrelevant pages (of course, according to the guidelines for this tool)

                    What do you think about this approach?

                    Thanks again for all your help, i really appreciate it!

                    Assaf.

                    • Dr-Pete
                      Dr-Pete @stassaf last edited by

                      (2) It could take a while, yes. There is no speedy way to de-index a lot of content that is no longer crawlable, I'm afraid, unless it's currently in a directory that can be removed in Google Webmaster Tools.

                      (3) So, basically, let's say all the pages live under "/events" - you'd create "/events2", put all the new events in that going forward, and then remove "/events" in GWT?

                      It could work for removal, but changing your site architecture that way carries a significant amount of risk. You'll also have to make sure that you have a plan going forward for de-indexing new content that becomes outdated, because this is not something you want to do every couple of months. Honestly, unless you know the old content is harming your rankings, I probably wouldn't do this. I'd stick to the slower method.

                      • stassaf
                        stassaf @Dr-Pete last edited by

                        Thanks for everything.

                        I'll stick to the slower method and see what happens in the index.

                        © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.