Why is my site getting soft 404 errors in GWT from a search term on the 404 page?
-
Hi,
I recently switched to WordPress and I'm suddenly getting a couple of soft 404 errors in Google Webmaster Tools.
I think they all come from a search URL that Google is somehow crawling, possibly generated by the code of our WordPress site on the NOT FOUND page, since that's the only search page I have found. They all have the term {search_term} as the search query, and the page gives me a "Nothing Found" result.
I would love to hear some advice on how to resolve this issue.
This is what I see in Webmaster Tools (also check the screenshot link here):
Soft 404
URL: abc.com/search/{search_term}/
Linked from:
abc.com/search/{search_term}/
abc.com/?s={search_term}
Thank you,
Ram Babu
-
Ram,
The problem could be in your .htaccess file. Have you checked that? I recently discovered that my sites were also generating soft 404s because of an incorrect declaration for 404 pages in my .htaccess. Maybe you should look into that.
Or post your code and we can have a look for you.
regards
Jarno
-
Hi Jarno, thanks for the reply.
Below is my website's exact .htaccess code. Please check it and let me know whether there is really any problem in our .htaccess that generates the soft 404 errors:
# BEGIN W3TC Browser Cache
<IfModule mod_deflate.c>
<IfModule mod_headers.c>
Header append Vary User-Agent env=!dont-vary
</IfModule>
AddOutputFilterByType DEFLATE text/css text/x-component application/x-javascript application/javascript text/javascript text/x-js text/html text/richtext image/svg+xml text/plain text/xsd text/xsl text/xml image/x-icon application/json
<IfModule mod_mime.c>
# DEFLATE by extension
AddOutputFilter DEFLATE js css htm html xml
</IfModule>
</IfModule>
# END W3TC Browser Cache
# BEGIN W3TC Page Cache core
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule .* - [E=W3TC_ENC:_gzip]
RewriteCond %{HTTP_COOKIE} w3tc_preview [NC]
RewriteRule .* - [E=W3TC_PREVIEW:_preview]
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} =""
RewriteCond %{REQUEST_URI} /$
RewriteCond %{HTTP_COOKIE} !(comment_author|wp-postpass|w3tc_logged_out|wordpress_logged_in|wptouch_switch_toggle) [NC]
RewriteCond "%{DOCUMENT_ROOT}/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html%{ENV:W3TC_ENC}" -f
RewriteRule .* "/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html%{ENV:W3TC_ENC}" [L]
</IfModule>
# END W3TC Page Cache core
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
# EXPIRES CACHING
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/jpg "access 1 year"
ExpiresByType image/jpeg "access 1 year"
ExpiresByType image/gif "access 1 year"
ExpiresByType image/png "access 1 year"
ExpiresByType text/css "access 1 month"
ExpiresByType text/html "access 1 month"
ExpiresByType application/pdf "access 1 month"
ExpiresByType text/x-javascript "access 1 month"
ExpiresByType application/x-shockwave-flash "access 1 month"
ExpiresByType image/x-icon "access 1 year"
ExpiresDefault "access 1 month"
</IfModule>
# EXPIRES CACHING
# Aggressive Russian Search Engine
SetEnvIfNoCase User-Agent "Yandex" bad_bot
<Limit GET POST HEAD>
Order Allow,Deny
Allow from all
# Cyveillance
deny from 38.100.19.8/29
deny from 38.100.21.0/24
deny from 38.100.41.64/26
deny from 38.105.71.0/25
deny from 38.105.83.0/27
deny from 38.112.21.140/30
deny from 38.118.42.32/29
deny from 65.213.208.128/27
deny from 65.222.176.96/27
deny from 65.222.185.72/29
Deny from env=bad_bot
</Limit>
Awaiting your reply, @Jarno
Aman
-
Hi Ram
Is this you? I found it searching Google: https://wordpress.org/support/topic/why-my-site-getting-soft-404-errors-from-search-term-on-404-page-showing-in-gwt - and thought it would be helpful to know the actual site (there are also some more details in your thread there).
Anyhow, this is the URL getting a soft 404: https://akclinics.org/search/{search_term}/ (because it returns a 200 OK, but there really is no content there, so it looks like a 404 to the user).
You mention it's linked from here: https://akclinics.org/search/{search_term}/
But the question is - where is that linked from? I've crawled the entire site and can't find a reference to that URL anywhere.
Your normal search URL looks like this (even if the query is empty): https://akclinics.org/?s=
It shows the proper ?s= parameter.
So somehow, it looks like Google is finding and crawling the wrong search URL. It's using /search/%7Bsearch_term%7D/ instead of /?s= - even for an empty query. Maybe this was the old search URL for your old site?
The fix, I think, is to:
- find out where / why Google can access /search/%7Bsearch_term%7D/
- remove any links or references to /search/%7Bsearch_term%7D/ and/or redirect it to /?s=
- OR just return a real 404 code for /search/%7Bsearch_term%7D/
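The last two options could be sketched in .htaccess roughly as follows. This is a minimal, untested sketch, assuming Apache with mod_rewrite enabled and that the rule is placed above the "# BEGIN WordPress" block so it runs before WordPress takes over the request. The pattern only targets the literal placeholder path from the report, not real searches.

```apache
# Sketch only (assumption: placed above the "# BEGIN WordPress" block).
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

# Option A: return a real 404 for the literal placeholder URL.
# mod_rewrite matches the URL-decoded path, so a request for
# /search/%7Bsearch_term%7D/ is seen here as search/{search_term}/.
RewriteRule ^search/\{search_term\}/?$ - [R=404,L]

# Option B (use instead of Option A): permanently redirect the
# placeholder URL to the normal search URL.
# RewriteRule ^search/\{search_term\}/?$ /?s= [R=301,L]
</IfModule>
```

With Option A, Apache serves its own error page unless an ErrorDocument is configured. Either way, the result can be checked with `curl -I https://akclinics.org/search/%7Bsearch_term%7D/`, which should show a 404 (or 301) status line instead of 200.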
Let me know if that makes sense.