Should I noindex?
-
I'm currently importing about 15000 posts to my site that are essentially businesses that have posted their programs for people to sign up. To sign up the user clicks the button that links to the program provider's website. I also have the option to link directly back to the original directory site that the program was posted on.
I've scraped this information from a different website and I want to make sure I don't get penalized. I don't want to spin the content because it produces poor quality content. I don't necessarily need to rank for the pages because I have relevant, direct traffic coming to the site via purchased domains.
I plan on at least contacting all of the larger program providers to let them know that I've listed their programs on my site and to give them the option to manage them.
My Questions
- In order to avoid any penalty, should I just apply a noindex robots meta tag to all of the posts/programs? My domain is still new so I'd imagine that's probably what I should do, but I definitely need some reassurance.
Thanks for your time!!
-
There is no penalty for duplicate content (that is, content you take from another site and post on yours). Google has gone on record several times on this subject. But you won't rank with this content anyway.
Now if your site is 10-100 pages and you add 15k, you will change the topicality of the site a lot with thin content that doesn't rank, and that dilutes everything you have, which is not good. If your site has 200k pages indexed already and you add 15k, I wouldn't really worry about it.
To play it "safe" you could indeed noindex them. Those will still be crawled if they are linked though. If you want to cut them out and also save some crawl budget (if the site is new and with not that many pages) I would push all of those 15k pages in a separate folder in the structure and add that folder in robots.txt as disallow.
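As a rough sketch of the robots.txt approach, assuming a hypothetical folder named /programs/ and the placeholder domain example.com (neither comes from the question), you could check which URLs a well-behaved crawler is allowed to fetch with Python's standard urllib.robotparser before relying on the rule:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The /programs/ folder name is an
# assumption for illustration; use whatever folder holds the imported posts.
ROBOTS_TXT = """\
User-agent: *
Disallow: /programs/
"""


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and both paths are placeholders, not real pages.
    for url in (
        "https://example.com/programs/sample-program",
        "https://example.com/about",
    ):
        print(url, "crawlable" if is_crawlable(url) else "blocked")
```

Keep in mind that a crawler has to fetch a page to see a noindex tag on it, so disallowing the folder and noindexing the pages are alternatives rather than something to stack.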
Take into account that linking to those pages and having those pages indexed is also a signal for the pages that are linking to them (good or bad). If those are on the same topic it will help the pages that are linking to them.
If you can find the time and energy to improve this content, even if it's the same as on other sites, find a way to add some value (in the way you present it, with resources, stats, add-on content, etc.). Those pages can turn into a good set of landing pages.
my 2c. Hope it helps.
-
Thanks so much for your reply. I actually disallowed crawling via the robots.txt file and will keep it that way until my site gets larger/older more naturally.
Because I'm reaching out to each of the major businesses posting these programs in an effort to get them to update their programs with unique content, I hope to be able to index them within a couple of years, while removing any programs that don't get attention from their providers.
I hope I don't get in trouble for photos posted on my site from the other site. They're user-driven photos so maybe those are treated differently?