Hi there,
I've been thinking a lot about this lately. I work on a lot of webshops that are made by the same company. I don't like to say this, but not all of their shops perform great SEO-wise.
They use a filtering system that can generate hundreds to thousands of category pages. Basically, what happens is this: a client that sells fashion has a site (www.client.com) with 'main categories' like 'Men', 'Women', 'Kids' and 'Sale'.
So when you click on 'Men' in the main navigation, you get www.client.com/men/. Then you can filter on brand, subcategory or color, so you get www.client.com/men/brand. Basically, the URL follows the order in which you filter, so you can also get to 'brand' via 'category': www.client.com/shoes/brand
Obviously, this page has the same content as www.client.com/brand/shoes, or even /shoes/brand/black and /men/shoes/brand/black if all of the brand's shoes happen to be black men's shoes.
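To give a feel for how quickly this adds up, here is a rough Python sketch. It assumes every click order of the same filters is reachable as its own URL (which is how I understand the system); the function name and the filter values are just made up for illustration.

```python
from itertools import permutations


def url_variants(filters):
    """Return every path you get by applying the same filters in a different order."""
    return ["/" + "/".join(order) + "/" for order in permutations(filters)]


# Four filters applied in any order: 4! = 24 URLs that all show the same products.
variants = url_variants(["men", "shoes", "brand", "black"])
print(len(variants))
print(variants[:3])
```

So a single combination of four filters can already account for 24 URLs, before counting the cases where different filter sets happen to show the same products.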
Currently this is fixed by a dynamic canonical system that canonicalizes the brand/category combinations. So there can be 8000 URLs on the site, which canonicalize to about 4000 URLs.
I have a gut feeling that this is still not a good situation for SEO. I believe it would be a lot better to have the filtering system default to a defined order, like /gender/category/brand/color, so you don't need this excessive amount of canonicalization. You can canonicalize the whole bunch, but you'd still be offering thousands of useless pages for Google to waste its crawl budget on.
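What I have in mind is roughly the following. This is only a minimal sketch, not how their platform actually works: FACET_ORDER, FACET_VALUES and canonical_path are hypothetical names, and the filter values are invented. The idea is that the site only ever links to (or redirects to) the path in the fixed order, whatever order the visitor clicked the filters in.

```python
# Hypothetical fixed order of facets in the URL.
FACET_ORDER = ["gender", "category", "brand", "color"]

# Hypothetical lookup of which values belong to which facet.
FACET_VALUES = {
    "gender": {"men", "women", "kids"},
    "category": {"shoes", "jackets"},
    "brand": {"brand"},
    "color": {"black", "white"},
}


def canonical_path(path):
    """Rebuild a filter path in the fixed /gender/category/brand/color order."""
    segments = [s for s in path.strip("/").lower().split("/") if s]
    selected = {}
    for segment in segments:
        for facet in FACET_ORDER:
            if segment in FACET_VALUES[facet]:
                selected[facet] = segment
                break
        else:
            # No facet recognises this segment, so refuse to guess.
            raise ValueError(f"unknown filter value: {segment!r}")
    ordered = [selected[f] for f in FACET_ORDER if f in selected]
    return "/" + "/".join(ordered) + "/" if ordered else "/"


print(canonical_path("/brand/shoes/"))
print(canonical_path("/black/brand/shoes/men"))
```

With something like this, /brand/shoes/ and /shoes/brand/ both end up at /shoes/brand/, so the duplicate orderings would never need to exist as crawlable pages in the first place.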
Not to mention the time saved when crawling and analysing using Screaming Frog or other audit tools.
Any opinions on this matter?