Questions
-
How to stop URLs that include query strings from being indexed by Google
Without a specific example, there are a couple of options here. I am going to assume you have an ecommerce site where parameters are used for sort functions on search results or for different options on a given product.

I know you may not be able to do this, but using parameters in this case is just a bad idea to start with. If you can (and I know this can be difficult), find a way to rework this so that your site functions without parameters. You could use canonicals, but then Google would still crawl all those pages and then use the canonical link to work out which page is canonical. That is a big waste of Google's time. Why waste Googlebot's time crawling a bunch of pages you do not want crawled anyway? I would rather Googlebot focus on crawling your most important pages.

You can use the robots.txt file to stop Google from crawling sections of your site. The only issue is that if some of your pages with parameters are ranking, then once you tell Google to stop crawling them, you lose that traffic. It is not that Google does not "like" robots.txt blocks, or that it does not "like" the canonical tag; it is just that these are directives Google follows in a specific way, and implementing them incorrectly or in the wrong sequence can cause negative results, because you have essentially told Google to do something without fully understanding what will happen.

Here is what I would do. The long version, for long-term success:

1. Look at Google Analytics (or your other analytics) and the Moz tools and see which pages are ranking and sending you traffic. Make note of your results.
2. Think of the simplest way you could organize your site that would be logical to your users and would allow Google to crawl every page you deem important. Creating a hierarchical sitemap is a good way to do this. How does this relate to what you found in #1?
3. Rework your URL structure to reflect what you found in #2, without using parameters. If you have to use parameters, make sure Google can crawl your basic sitemap without using any of them, then use robots.txt to block the crawling of any parameters on your site. You have now ensured that Google can crawl and rank pages without parameters, and you are not hiding any important pages or page information behind a URL that uses parameters. There are other reasons not to use parameters (e.g. parameter-free URLs are easier for users to remember, tend to be shorter, etc.), so think about whether you want to get rid of them altogether.
4. 301 redirect all your main traffic pages from the old URL structure to the new URL structure. Serve 404s for all the old pages, including the ones with parameters. That way all the good pages move to the new URL structure and the bad ones go away.

Now, if you are stuck using parameters, I would do a variant of the above. Still see whether there are any important or well-ranked pages that use parameters, and consider whether a canonical on those pages would point Google to the page that should rank. For all the other pages, use the noindex directive to get them out of the Google index, then later use robots.txt to block Google from crawling them. You must do this in that sequence: if you block Google first, it will never see the noindex directive.

Everything I said above is generally "correct", but depending on your situation, things may need to be tweaked. I hope this information helps you work out the best options for your site and your customers. Good luck!
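To illustrate the parameter-blocking step, here is a minimal robots.txt sketch. The parameter names are hypothetical examples, not anything from the question:

```text
# Block any URL containing a query string for all crawlers.
User-agent: *
Disallow: /*?

# Or, block only specific parameters instead (sort/color are example names):
# Disallow: /*?sort=
# Disallow: /*?color=
```

And for the noindex-first sequence, this is the tag that Googlebot has to be able to see before any robots.txt block is added:

```html
<!-- Place in the <head> of each parameterized page you want deindexed.
     Only add the robots.txt Disallow after these pages have dropped out
     of the index; Googlebot never sees a noindex on a blocked page. -->
<meta name="robots" content="noindex">
```

Note that `*` and `$` wildcards in robots.txt are supported by Google but are not part of the original robots.txt convention, so test the rules in Search Console before relying on them.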
Intermediate & Advanced SEO | | CleverPhD0 -
Should a business request nofollow links from businesses it has commercial relationships with?
It is indeed. On the one hand, err on the side of caution - on the other, links are required, but with that many affiliate links, I would have thought Google was ignoring them. -Andy
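For reference, marking a link nofollow is just an attribute on the anchor tag (the URL below is a placeholder):

```html
<!-- rel="nofollow" asks Google not to pass link equity through this link.
     Google has since introduced rel="sponsored" specifically for paid and
     affiliate links, which is worth using where supported. -->
<a href="https://example.com/partner" rel="nofollow">Partner site</a>
```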
White Hat / Black Hat SEO | | Andy.Drinkwater0 -
Location of body text on page - at top or bottom - does it matter for SEO?
Thanks v much Marie - that's really useful - this sounds like it'd be good to test so will try to do that and see what happens.
Web Design | | McTaggart1 -
Do I want backlinks from companies my site has a business relationship with
Hi Luke, Yes, you do want those links. I'd rather they be dofollow and only on a single page, such as a blog post. Remember to be cautious enough not to be obvious to Google. For example, do not make the links reciprocal (they give you a link and you give them a link). And of course these links should not be your only links; I believe you already know that. Best of luck. GR.
Intermediate & Advanced SEO | | GastonRiera0 -
Page speed - what do you aim for?
IMHO, if somebody is paying us for SEO, then our GOAL is to get the homepage to load in a second or less... especially if most of the users are on mobile. If it's in the mid-1-second range, we can grudgingly live with that. I'm glad you asked about server response times... for most sites, after the content is optimized (smaller images, cleaned-up code, etc.), the initial server response time is usually the culprit for getting over a second... as long as the rest of the home page is "light". Light to us is under 1MB. Depending on your CMS, there are a variety of ways to get the response time to 200ms or less. Google PageSpeed, as David said, is a good measurement, but it's not the holy grail of measurements. We use it only to identify areas that need improvement. Waterfalls tell us what's taking so long and what's heavy. You didn't ask about plugins - which are a major culprit in caching problems, minify errors, conflicts, speed and weight. We limit all active plugins to TEN (including caching, SEO, security). For some sites, plugin cleanup is the easiest way to speed up a site. At the end of the day, nothing beats clean code, light images and a lightning-fast server.
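As one illustration of the kind of server-side tuning involved, here is a compression-and-caching fragment for an Apache .htaccess file. This is a sketch that assumes mod_deflate and mod_expires are available; the file types and lifetimes are example values to adjust for your CMS:

```apache
# Compress text assets before sending them to the browser.
<IfModule mod_deflate.c>
  AddOutputFilterByType DEFLATE text/html text/css application/javascript
</IfModule>

# Let browsers cache static files so repeat views skip the server entirely.
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType image/jpeg "access plus 1 month"
  ExpiresByType text/css   "access plus 1 week"
</IfModule>
```

Measure the before/after effect in a waterfall tool rather than trusting any single score.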
Intermediate & Advanced SEO | | lcallander0 -
Search engine keyword rank - easiest way to check the keywords that rank across website
Thanks for your helpful feedback Gaston - much appreciated.
Intermediate & Advanced SEO | | McTaggart1 -
Duplicate content hidden behind tabs
Thanks for your feedback Beau and Jordan - very helpful Luke
Intermediate & Advanced SEO | | McTaggart0 -
Why do people put xml sitemaps in subfolders? Why not just the root? What's the best solution?
Thanks Angular Marketing, and Everett... very helpful feedback and much appreciated. Luke
Intermediate & Advanced SEO | | McTaggart0 -
Photo filenames
Thanks everyone for your helpful feedback - much appreciated.
Intermediate & Advanced SEO | | McTaggart0 -
Http resolving to https - why isn't it doing that?
Though the ideal plan of action would be to move all pages on the site over to HTTPS, the redirect from HTTP to HTTPS is configured on a page-by-page (or rule-by-rule) basis even when the certificate covers the whole hostname—so there are a few things that could be going on here. First, the website could have chosen to serve only certain webpages over HTTPS and neglected the others, which is why it's only showing up on some. Second, as mentioned below, they could be in the process of transitioning all pages to HTTPS, moving them in batches, and not all batches are complete yet. Third, the redirects could simply have been done incorrectly for certain pages! One of these three options should provide the answer you're looking for.
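If it turns out to be the third case, a site-wide 301 to HTTPS in Apache's .htaccess would look roughly like this. This is a sketch assuming mod_rewrite on Apache; nginx and IIS use different syntax:

```apache
<IfModule mod_rewrite.c>
  RewriteEngine On
  # Send any request that arrived over plain HTTP to its HTTPS twin,
  # preserving the host and path, with a permanent (301) redirect.
  RewriteCond %{HTTPS} off
  RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
</IfModule>
```

A rule like this catches every page at once, which avoids the partial-coverage symptoms described above.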
Intermediate & Advanced SEO | | BlueCorona0 -
Sitemap with homepage URL repeated several times - is it a problem?
Many thanks Eric - much appreciated - that clarifies everything perfectly
Intermediate & Advanced SEO | | McTaggart0 -
Local SEO - two businesses at same address - best course of action?
Thanks Miriam - that makes good sense - many thanks for your feedback Luke
Intermediate & Advanced SEO | | McTaggart0 -
Website copying in Tweets from Twitter
I know - nearly fell off my chair in shock ha! Thanks for the links Luke
Intermediate & Advanced SEO | | McTaggart0 -
Onsite calendar throwing out thousands of pages
Hi Luke, Matt has the right idea. If the pages are going to "exist", you should block search engines from crawling them with the robots.txt file. I would get your dev to help, but basically you'd find the folder or path at which you want the crawler to stop. Maybe it's /month/ or something, and you'd block that in robots.txt. Ian covers this in his recent article about "spider traps", and you can also read about robots.txt on Moz or on Google.
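A sketch of that robots.txt rule, assuming the calendar pages live under a /month/ path as in the example above:

```text
User-agent: *
# Stop crawlers descending into the endless calendar archive.
Disallow: /month/

# If the calendar generates date query strings instead, a wildcard
# rule (supported by Google) can catch those too:
# Disallow: /*?date=
```

The file goes at the site root (e.g. example.com/robots.txt), and the path has to match how your calendar actually builds its URLs, so check a few real URLs first.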
Intermediate & Advanced SEO | | evolvingSEO0 -
Multilingual SEO - site using Google translate within existing URL structure
Luke, This technically wouldn't have any ill effects on your SEO efforts, since your URLs aren't changing. However, according to our good friend Mr. Cutts, auto-translation via Google Translate isn't recommended and can be seen as spamming: https://www.youtube.com/watch?feature=player_embedded&v=UDg2AGRGjLQ
Intermediate & Advanced SEO | | LoganRay0 -
Very strange HTML docs - what should I do with them through site migration?
"Or should I fix the issue first via an htaccess rule before attempting the migration?" I quite honestly think that the problem is WITH htaccess, not that you have to fix something else with htaccess. And to answer your question: you can always migrate with the issues and hope nothing breaks during the process, or patch it up so it seems to be working fine and, again, hope it doesn't break on you, OR you can fix it at the root of the problem and not have to worry about it in the future.
Intermediate & Advanced SEO | | DmitriiK0