What to do with an extremely high number of URLs on your site?
-
Here is the situation:
The site has tons of business and personal profiles. To keep the URL structure clean, the information was categorized into directories - so for example:
www.abc.com/product/um/name-here/city-name/state/lastname:3458765
Each profile has a unique ID#, and for some reason there needed to be a category for the user; in this case /um/ stands for user name.
The Webmaster Tools steps to resolve this say to use a rel=canonical, which can be done for that /um/ directory, but I am concerned about the bot not being able to find the other pages beyond that directory - the profile name, city, and state associated with it. So I guess my ultimate question is: if I use rel=canonical, will the rest of the content not get crawled or indexed either?
-
Does everything need to be indexed? If not, perhaps the personal profiles could be noindexed. Let the search engines crawl all of your content, but only have them index pages that provide value to the SERPs.
Only use rel=canonical if the content on the different URLs is identical. Using it incorrectly will cause content not to be indexed.
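For reference, a canonical hint is just a link element in the head of the duplicate page pointing at the URL you want to consolidate on - something like this, reusing the example URL from the question:

```html
<!-- Placed in the <head> of a duplicate page; the href should be the
     one URL you want search engines to treat as authoritative -->
<link rel="canonical" href="http://www.abc.com/product/um/name-here/city-name/state/lastname:3458765" />
```

Note that a canonicalized page can still be crawled; it just tells the engines which URL to credit in the index.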
-
This is not what the canonical tag is intended for.
The personal profiles will most likely be very low-content duplicates of each other - the kind of pages that get indexed but should not be.
If the pages deeper in that folder are good content worthy of being indexed, then:
a) add noindex, follow to the profile pages
b) add index, follow to the deeper pages
That will keep the bots crawling through the profile pages to the deeper folders with the content you want indexed.
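The two meta robots tags described above would look like this (each goes in the `<head>` of the relevant page):

```html
<!-- On the thin /um/ profile pages: keep them out of the index,
     but let the bots follow links through to the deeper pages -->
<meta name="robots" content="noindex, follow" />

<!-- On the deeper, content-rich pages: indexing is the default,
     so this tag is optional but makes the intent explicit -->
<meta name="robots" content="index, follow" />
```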
You can also disallow the /um/ (user name) folder and allow the deeper folders with robots.txt directives. We were just discussing this:
http://www.seomoz.org/q/allow-or-disallow-first-in-robots-txt
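A rough sketch of that robots.txt approach, assuming the /product/um/ path structure from the question (the wildcard pattern is illustrative - you would match it to your actual URL layout):

```
User-agent: *
# Block the thin user-name directory...
Disallow: /product/um/
# ...but allow the deeper, content-rich paths; Googlebot honors
# the more specific (longer) matching rule
Allow: /product/um/*/city-name/
```

One caveat with this route: a URL blocked in robots.txt cannot be crawled at all, so any noindex or canonical tag on that page will never be seen. Pick either the meta tag approach or the robots.txt approach for a given directory, not both.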