Posts made by brettmandoes
-
RE: Google crawling 200 page site thousands of times/day. Why?
Update: after about two weeks the crawl rate returned to normal. We haven't been able to identify a cause yet.
-
RE: Spammy Website Framed My Site & Stole Rankings
We're dealing with a similar issue here. Someone from Russia scraped our content and knocked us out of the SERPs. The scraping site serves Googlebot a 200 status code, but everyone else sees a 404.
We sent in a DMCA takedown request via https://www.google.com/webmasters/tools/dmca-dashboard and it was responded to in less than a week (4 days). One caveat: make sure you hold the copyright on that material. If you are an agency representing a client, you may not, and you may need to have the client submit the request.
When you submit the request to Google and they verify it's legitimate, they will remove the offending site from their search results. This is the only way to begin restoring your position.
You can contact their hosting provider and submit a takedown request with them as well. Provide evidence, and threaten legal action if they do not respond. Of course, Google has still indexed the other website, so you still need to submit a DMCA request with them to get your rankings back.
May the odds be ever in your favor.
-
RE: Is it OK to put a Blog Post and a Page within the same folder on a Wordpress hosted website?
I think if you try to do what you're suggesting, you're going to end up with a headache for both yourself and your users. A simpler, more elegant solution would be to rewrite (or copy/paste) some of the blog posts as pages and set a canonical URL so Google knows which version the content should be attributed to.
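For example, if the page version becomes the preferred one, the blog post's <head> can point to it with a canonical tag (the URL here is just a placeholder):

```html
<!-- In the <head> of the blog post that duplicates the page content -->
<link rel="canonical" href="https://www.example.com/services/widget-repair/" />
```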
If you'd rather keep the structure you have, you could also use a sidebar widget that recommends related blog posts, so you still call out related content and give users a way to navigate to it.
-
RE: When trying to sculpt an internal link structure, is there any point in placing text links to top level pages that are already in the main menu?
There is value. With every link you place throughout your website, you are indicating to Google that you consider Page A or Page B more or less valuable to your users than the other.
We know that Google places different values on links based on their placement. For example, a link in the main body content will pass along PageRank and so will a link in the footer, but the value is modified as it passes through, on the assumption that a footer link is worth a different amount than a link placed in content.
I'm not sure where you heard that Google only counts the first link it finds. That's incorrect, and you can read any of the older articles on Matt Cutts' blog about how PageRank works to see why.
I believe the best approach for you is to create links that go to content you value highly for your business and your users, and to place those links in logical places.
-
RE: How can I filter reviews that use profanity while using schema markup?
I received a response from Barry Hunter who said pretty much what I suspected: that the devil is in the details.
"Critic reviews must allow for customers to express both positive and negative sentiments. They may not be vetted by the business or restricted by the content provider based on the positive/negative sentiment of the review before submission to Google."
The key distinction he made is that the guideline only prohibits vetting based on positive/negative sentiment, which means it's acceptable to vet reviews with a profanity filter.
What he acknowledged but did not address is that some confusion may still exist, since the reviews most laden with profanity are likely to be angry, negative reviews. While I'm not 100% satisfied with this answer, I think it's likely to be the only one I'll get.
For those interested in the discussion: https://productforums.google.com/forum/#!msg/webmasters/k24p4fPf404/3e7D7hjxEwAJ
I'm tempted to <nofollow> that link until I get a satisfactory response.
-
RE: How can I filter reviews that use profanity while using schema markup?
Thanks, Miriam. I've posted the question in Google's product support forums as well to try to find a resolution. If anyone nibbles, I'll update the Q&A here as well.
There is a caveat I've noticed in the wording, where it states: "Critic reviews must allow for customers to express both positive and negative sentiments. They may not be vetted by the business or restricted by the content provider based on the positive/negative sentiment of the review before submission to Google."
This may give us wiggle room to vet reviews for profanity, though I don't know how Google would be able to make the distinction, since any review using profanity is by its nature more likely to have a lower rating, and therefore likely to trip Google's alarms.
-
How can I filter reviews that use profanity while using schema markup?
Google released new guidelines last year governing how schema markup is to be deployed on a website. One of those guidelines states that reviews on your site must not be filtered or altered if you want to receive the benefit of schema markup. After my client was slapped on the wrist by Google for ignoring the Webmaster guidelines (and our advice, ahem), they removed all filtering from their websites.
However, because they are a family-friendly company, it is a requirement that no profanity be displayed on the website. Google's guidelines are not entirely clear about what to do. They state:
"Profanity and vulgar language are discouraged. Reviews should be appropriate for a broad and diverse audience. Consequently, reviews containing vulgar or profane language may be ineligible for use."
and...
"Critic reviews must allow for customers to express both positive and negative sentiments. They may not be vetted by the business or restricted by the content provider based on the positive/negative sentiment of the review before submission to Google."
The issue is that we need to vet the reviews to remove profanity, yet that vetting may itself look like a violation to Google. Any thoughts?
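For context, each displayed review carries structured data roughly along these lines (a hypothetical JSON-LD example; the business, author, and review text are made up):

```json
{
  "@context": "https://schema.org",
  "@type": "Review",
  "itemReviewed": { "@type": "LocalBusiness", "name": "Example Family Services" },
  "author": { "@type": "Person", "name": "Jane Doe" },
  "reviewRating": { "@type": "Rating", "ratingValue": "2", "bestRating": "5" },
  "reviewBody": "Service was slow and the staff seemed overwhelmed."
}
```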
Source: https://developers.google.com/search/docs/data-types/reviews
-
RE: What do you think of SearchMetrics' claim that there are no longer universal ranking factors?
Were they referencing RankBrain in their article? The statement sounds similar to explanations of what RankBrain is and how it impacts search. It does seem like a bit of hyperbole, but I see their point and I agree with it to a certain extent. I believe the purpose of a machine learning system is to continuously improve without human intervention, so that improvements are made while you sleep. It's my understanding that RankBrain does this based on feedback from users. It's the perfect solution for handling the complexity of search, and it would result in a continuously changing algorithm.
I do see a lot of websites ranking without backlinks. Try any local home services query - they're mostly propped up by citations, which are a little different from your standard backlink.
-
RE: Portfolio Image Landing Page Question/Issue
Is there some reason you need to have individual landing pages for each image? That's a lot of bloat. Hundreds (or thousands) of unique landing pages with nothing but an image isn't going to convey either relevance or authority to search engines, nor does it sound particularly useful to users.
Since Googlebot can't "see" the images, I would follow Steve's advice and add the images to your sitemap, then noindex and nofollow the pages that lack content. Taking it a step further, I would also make sure the filenames are readable by search engines and that alt text is entered for each image. Doing so will help Google understand what the images are and may help them display for relevant queries.
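To illustrate, a single entry in an image sitemap looks roughly like this (URLs and filenames here are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.example.com/portfolio/kitchens/</loc>
    <image:image>
      <image:loc>https://www.example.com/images/modern-kitchen-remodel.jpg</image:loc>
    </image:image>
  </url>
</urlset>
```

The thin pages themselves can carry a meta robots tag such as <meta name="robots" content="noindex, nofollow">.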
If you're just trying to let users see a larger image, you may want to consider gallery-style pages for your portfolios, with the option to click a photo to enlarge it to full resolution. That will probably help with UX on top of SEO.
-
RE: Google crawling 200 page site thousands of times/day. Why?
Yes, I have, and yes, there are pages that aren't listed in the sitemap and aren't supposed to be there. That's being corrected (we're considering experimenting with priority tags in the sitemap to see if that has any impact, versus just blocking them immediately with robots.txt or meta robots). But even if you factor in those pages, it still only amounts to 303 pages.
Weird, right?
-
RE: Identifying Duplicate Content
I'm going to recommend Screaming Frog here. Run a scan of your site and then filter it by duplicate title tags, duplicate meta descriptions, and (my favorite) word count. Usually I don't need to go any further than duplicate title tags.
There's also www.siteliner.com. I've used that regularly and it has been tremendously helpful for pages that have duplicate content in the body but not in the meta tags.
Finally, Google Search Console. Go to Search Appearance and click on HTML Improvements. You can also find all your duplicate title tags there, which should help you identify duplicate content easily.
-
RE: No images in Google index
I just checked your XML sitemap and (as Steve mentioned) you don't have any images in your sitemap (or videos). I strongly recommend using a dynamic XML sitemap as well, if you're not already. Glancing through your site, it looks like you should have a much larger sitemap than what's currently displayed.
Also, your robots.txt file looks a little funny. Normally I see a line under "User-agent: *" that says "Disallow:". I'm not sure it's completely necessary, but I don't play around with that file. Too many clients have done weird things by varying from the standard. Here's THE resource on robots.txt:
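To show what I mean, a minimal robots.txt that blocks nothing looks like this (the empty Disallow line is the one I'd expect to see; the sitemap URL is a placeholder):

```
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
```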
-
RE: Desktop in http and mobile in https
This can create some real headaches. If you're going to secure part of the site, you may as well secure the whole thing. Leaving most of the site unsecured and securing only a few pages that are transactional or collect customer data like physical addresses is something other sites have done, but it should be considered a temporary measure while the rest of the site is secured.
While I'm not sure this implementation would create dark traffic in your Google Analytics reports, a partial implementation still leaves you open to man-in-the-middle (MITM) attacks and to SEO issues such as duplicate content. I'm dealing with this right now with a couple of clients, and I can share one of the headaches we're experiencing.
Mixed sitemap URLs! Some URLs are in https and others are in http, because they've managed to confuse the CMS (don't ask, I'm not sure what they did yet). On top of that, duplicate content is created with every new page, because the CMS now creates a page in http and a page in https. The dynamic XML sitemap then picks one and adds it. It gets worse, but I'll end it there.
You can avoid all of this by securing everything, and you'll also have the option of upgrading the site to HTTP/2 once the whole thing is secured.
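If it helps, the usual first step once everything is on HTTPS is a site-wide 301 redirect from http to https. On an Apache server with mod_rewrite, a common .htaccess sketch looks like this (assuming that's your stack):

```apache
RewriteEngine On
# Send any request arriving over plain HTTP to the HTTPS version of the same URL
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [L,R=301]
```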
-
RE: Will there be problems in the future with a mobile dedicated site?
While Google has been reluctant to come right out and say "Responsive is the way to go", they've dropped enough hints to make it fairly obvious that's what they want you to do. It makes sense - provide the same content to mobile users as you do to desktop users so it's accessible from anywhere. And the simplicity of it allows Google to crawl the site efficiently.
You're likely to realize greater efficiency by moving to a responsive design, which in turn will allow you to improve rankings more expediently. If SEO is a concern, go responsive and ditch the worry.
-
RE: Community Discussion: Miriam's 2017 Local SEO Predictions ... And Yours?
I see Google focusing on mobile more and more. The launch of Pixel was specifically done to maintain search share (people rarely change default browsers on mobile devices, which Microsoft used to their advantage in Europe by releasing phones preloaded with Bing).
With that in mind, mobile-friendliness will be ever more important. This means the mobile-first index will be fine-tuned, and I expect to see more ad revenue focused on mobile this year. Site speed will continue to play a factor, and further integration of video into the SERPs and paid ads is where I'll put my money.
HTTPS is already a ranking factor, but I expect it to become more important once a certain adoption threshold is crossed. Once Google can force the issue without impacting its own search quality, I'm sure it will happen. Whether that's a 2017 development or further down the line is unknown.
And of course Google will continue to try to make G+ relevant.
-
Google crawling 200 page site thousands of times/day. Why?
Hello all, I'm looking at something a bit wonky on one of the websites I manage. It's similar enough to other websites I manage (built on a template) that I'm surprised to see this issue occurring. The XML sitemap we submitted shows Google there are 229 pages on the site. Starting at the beginning of December, Google really ramped up the intensity of its crawling. At its high point, Google crawled 13,359 pages in a single day.
I mentioned I manage other similar sites - this is a very unusual spike. There are no features like infinite scroll that auto-generate content and would cause Google some grief.
So the follow-up questions to my "why?" are "how is this affecting my SEO efforts?" and "what do I do about it?". I've never encountered this before, but I think limiting my crawl budget would be treating the symptom instead of finding the cure. Any advice is appreciated. Thanks!
*edited for grammar.
-
RE: How to determine which pages are not indexed
I'm running into this same issue, where about a quarter of a client's site isn't indexing. Using the site:domain.com trick shows me 336 results, which I somehow need to get into a CSV file, compare against the URLs crawled by Screaming Frog, and then use VLOOKUP to find the unique values.
So how can I get those 300+ results exported to a CSV file for analysis?
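For the comparison step (once the site: results are in one CSV, however you capture them, and the Screaming Frog crawl is exported to another), a rough Python sketch like this can stand in for the VLOOKUP; the filenames and column headers here are hypothetical:

```python
import csv

def load_urls(path, column):
    """Read a CSV and return a set of normalized URLs from the given column."""
    with open(path, newline="", encoding="utf-8") as f:
        return {
            row[column].strip().rstrip("/")
            for row in csv.DictReader(f)
            if row.get(column)
        }

# URLs Google shows for the site: query (however you captured them)
indexed = load_urls("indexed_urls.csv", column="URL")
# URLs from the Screaming Frog internal HTML export
crawled = load_urls("screaming_frog_internal_html.csv", column="Address")

# Pages the crawler found that never show up in the site: results
not_indexed = sorted(crawled - indexed)

with open("not_indexed.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["URL"])
    writer.writerows([url] for url in not_indexed)

print(f"{len(not_indexed)} crawled URLs not found in the indexed list")
```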