Google continues to make tremendous strides geared towards improving user experience. Google Search Console, previously known as Google Webmaster Tools, now offers more data to Singapore search engine optimization experts. The two top features in Search Console that will impact your optimization efforts are Links to Your Site and Search Analytics.
One of the maintenance tasks you need to perform on your site to ensure that it continues to rank high on search engines and generate revenue is fixing crawl errors. Today, we will look at the specific steps you can take to fix these errors using Google Search Console.
All the errors on your site are listed in the Site Errors section of Google Search Console. It is imperative that you review these errors, as skipping them could have adverse effects on your site's functionality. Note that these errors cover the last three months (90 days).
How Frequently Should I Check These Errors?
It is recommended that you log in to your website's Google Search Console on a daily basis to ensure that no errors have been picked up by this analytical tool. Checking this section daily will also ensure that all errors are fixed quickly.
DNS (Domain Name System) Errors
The effects of DNS errors are severe and should not be ignored. If Googlebot experiences these errors, it cannot establish a connection with your brand's domain because of a DNS lookup or timeout issue.
According to a statement from Google, most DNS issues do not stop search engine bots from connecting with your website, but severe issues can make this impossible, so they should be rectified promptly. For instance, Google can still crawl a site despite high latency, but that latency can still result in poor user experience on the other end.
How to Fix DNS errors
- Google advises webmasters to use its Fetch tool to understand how the search engine bots interpret and crawl your website pages. You can run this check without rendering if you are only evaluating the DNS connection status.
- Secondly, check with your DNS provider to see if the issue is emanating from their side.
- Thirdly, make sure that the server your website is hosted on returns a 500 or 404 error code when it cannot establish a connection. Most SEO experts agree that these codes are more informative than the DNS errors themselves.
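The DNS lookup described in the first step can also be approximated with a short script. Below is a minimal sketch using Python's standard library; `example.com` is a placeholder for your own domain.

```python
import socket

def resolves(hostname: str) -> bool:
    """Return True if the hostname can be resolved via a DNS lookup."""
    try:
        # getaddrinfo performs the same resolution step a crawler must complete
        socket.getaddrinfo(hostname, 443)
        return True
    except socket.gaierror:
        # gaierror covers both "no such domain" and DNS resolution failures
        return False

# Placeholder domain - substitute your own:
print(resolves("example.com"))
```

If this returns False for your domain while your browser loads the site fine, the problem may be intermittent or specific to certain resolvers, which is exactly the kind of issue worth raising with your DNS provider.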
Server Errors
A server error occurs when the server takes too long to respond or the request times out. The error is displayed because there is a limit on the amount of time Googlebot will wait for a site to load.
Server errors can also result from an unexpected traffic load. You can avoid this by choosing a hosting company that can accommodate sudden increases in website traffic. Every website owner wants their site to go viral, but very few are actually ready to handle the traffic bursts.
Just like DNS errors, it is imperative that you fix these errors before they wreak havoc on your website. If they are not fixed, Googlebot will not be able to find anything on your site to crawl, even if it can connect via DNS.
How to Fix Server Errors
If your website is functioning as expected when you come across the error, the server errors occurred in the past and have since been resolved. However, you still need to make some changes to prevent them from recurring.
Google recommends using Fetch to check whether Googlebot can crawl the website. If Fetch is able to access the content on your homepage, Google can access the site without any problems.
Before you embark on fixing server errors, it is important to know the specific error that you are dealing with. Here are some of the common types of server errors:
- Truncated Headers
- Truncated Response
- Connect Timeout
- Connect Failed
- Connect Refused
- Connection Reset
The Google Search Console help page gives clear details on how to resolve each of these errors.
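Several of the error types above map directly onto the exceptions a plain HTTP client raises. The sketch below, using only Python's standard library, is one rough way to see which category a URL falls into; exact exception wrapping varies between Python versions, so treat it as a diagnostic aid rather than a definitive test.

```python
import socket
import urllib.error
import urllib.request

def classify_fetch(url: str, timeout: float = 10.0) -> str:
    """Fetch a URL and report which class of server error, if any, occurred."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"OK ({resp.status})"
    except urllib.error.HTTPError as exc:
        return f"HTTP error ({exc.code})"     # e.g. a 500 from the server
    except socket.timeout:
        return "Connect/read timeout"         # server too slow to respond
    except urllib.error.URLError as exc:
        # Connection refused, connection reset, and DNS failures surface here
        return f"Connect failed ({exc.reason})"

print(classify_fetch("http://localhost:1"))   # assumes nothing listens on port 1
```

Running this against the affected URL at the moment the error appears tells you whether you are looking at a timeout, a refused connection, or a genuine 5xx response from the server.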
Robots Failure
A robots failure means that Googlebot was not able to retrieve your website's robots.txt file, which is located at [yourdomain.com]/robots.txt.
The robots.txt file is only important if you do not want Google to crawl specific pages of your website. According to a statement by Google, you only need this file if there is certain content that you do not want indexed by the search engines. Therefore, if you want your entire site to be indexed, you do not need to have one at all, not even an empty one. If the file is unavailable, a 404 error will be returned when Googlebot requests it, and Googlebot will proceed to crawl and index the entire site.
For sites whose content is updated on a daily basis, it is important to ensure that this file is in place and correct so that all new pages and content you want indexed actually get indexed.
How to Fix Robots Failure
The first thing you need to do is make sure the robots.txt file is configured properly. Check the specific pages that the file directs Googlebot not to crawl, because all the others will be crawled by default. Make sure the sweeping directive "Disallow: /" does not exist unless you do not want your website to appear on SERPs at all.
If the file looks OK and you are still getting errors, use a server header check tool to see whether the file is returning a 200 or 404 status code. Interestingly, it is much safer to have no robots.txt file at all than to have one that is improperly configured.
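Before deploying a robots.txt file, you can sanity-check how crawlers will interpret it with Python's standard `urllib.robotparser` module. A minimal sketch, using a hypothetical two-line robots.txt:

```python
import urllib.robotparser

# Hypothetical robots.txt content; in practice, point the parser at the
# live file with parser.set_url(...) followed by parser.read()
rules = """\
User-agent: *
Disallow: /private/
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# Blocked by the Disallow rule:
print(parser.can_fetch("Googlebot", "https://example.com/private/page"))
# Everything else is crawlable by default:
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))
```

If a page you expect to rank comes back as not fetchable here, the robots.txt file is the culprit, not the server.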
URL Errors
The main difference between site errors and URL errors is that the latter only affect specific pages and not the entire site.
Google Search Console lists the top URL errors in three categories: desktop, smartphone, and feature phone. This information may not be enough for large sites, but for most sites it covers everything.
The large number of URL errors displayed can freak you out. What you need to note is that Google ranks the errors by priority, and some of the errors on the list may already be resolved.
If you recently made major changes to your site, you can mark all the errors as fixed and then check the list again after a few days.
This clears all the errors from the dashboard, but Google will list them again the next time it crawls your site. If you really did fix them, they will not reappear.
Soft 404 Error
A soft 404 error occurs when a webpage returns 200 (OK) when it should return 404 (Not Found). Note that the user-visible aspect of a 404 page is the content published on the page; the message displayed should inform visitors that the page no longer exists.
On the other hand, the crawler-visible response is the HTTP header, and the response code there should be 404 (Not Found) or 410 (Gone).
The distinction comes down to how HTTP requests and responses work: the page body is what users see, while the status code in the response header is what crawlers read.
If a page is labeled as a soft 404, it means that the HTTP response header does not transmit the 404 (Not Found) response code. According to Google, a server should always return a 404 or 410 code when a user visits a non-existent page.
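Checking the header response code yourself is straightforward. Below is a minimal sketch using Python's standard library: if a page whose body says "not found" still comes back with a 200, it is a soft 404.

```python
import urllib.error
import urllib.request

def status_code(url: str) -> int:
    """Return the HTTP status code sent in the response header for a URL."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status    # 200 here on a "page not found" body = soft 404
    except urllib.error.HTTPError as exc:
        return exc.code           # a real 404 or 410 surfaces as an HTTPError
```

Run it against a deliberately bogus URL on your own domain; anything other than 404 or 410 deserves a closer look.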
Soft 404 errors can also result from pages that are 301 redirected to unrelated pages such as the home page. In my experience, redirecting many pages to the home page prompts Google to interpret the redirected URLs as 404s instead of 301 redirects. If, on the other hand, you redirect an old page to a related page, the chances of a soft 404 warning are minimal.
Pages that are of paramount importance to your business, such as product description pages and lead generation pages, should not trigger soft 404s if they are live. Ideally, you should pay close attention to the pages that are essential to your site's ability to generate revenue.
How to Fix Soft 404s
For website pages that are no longer available:
- Let the page return a 410 or 404 if it no longer exists and receives insignificant links or traffic. Make sure that the server's header response is 410 or 404, not 200
- Consider 301 redirecting each old page on your site to a relevant, related page
- Desist from redirecting a large number of non-existent pages to the home page. Instead, redirect them to similar or related pages
As mentioned earlier, live pages should not trigger soft 404s:
- Make sure there is enough valuable content on the page, as thin content can trigger this error
- Make sure that the content on the page does not appear to represent a 404 page while serving a 200 response code
As you can see, soft 404s are confusing hybrids of normal pages and 404s. Just make sure none of your critical pages trigger them.
404 Errors
A 404 error means that Googlebot tried to crawl and index a page that no longer exists on your site. The error can also result from other pages or sites linking to a non-existent page. According to a statement by Google, this kind of error has no serious impact on site ranking. In fact, Google recommends ignoring these errors, but you need to be vigilant about which errors you can let slide and which require immediate attention.
My personal opinion is that a 404 error deserves immediate attention if the page:
- Gets important links from external sources
- Receives a substantial amount of visitor traffic
- Has a straightforward URL that visitors clearly intend to view
If none of these apply, it is OK to ignore the error.
However, deciding what qualifies as a substantial amount of traffic or an important link for a particular URL is a daunting task. Check the backlinks to ensure that you do not lose a valuable link.
On the flipside, 404s should be urgently resolved if they are displayed on important pages.
How to fix 404 Errors
- Check that the page is published from your CMS (content management system) and is not in draft mode or deleted
- Make sure that the 404 error URL is indeed the correct page and not a variation
- Check whether the error appears on the www vs. non-www version of your website, as well as the http vs. https version of the site
- If you have no plans to bring the page back to life but want to redirect it, make sure you 301 redirect it to a related page
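For the www/non-www and http/https check above, it helps to enumerate all four variants of a URL and test each one. A small sketch using Python's standard library:

```python
from urllib.parse import urlsplit, urlunsplit

def url_variations(url: str) -> list[str]:
    """Return the http/https and www/non-www variants of a URL."""
    parts = urlsplit(url)
    host = parts.netloc
    # Pair each hostname form (with and without www.) with each scheme
    hosts = {host, host[4:] if host.startswith("www.") else "www." + host}
    return sorted(
        urlunsplit((scheme, h, parts.path, parts.query, parts.fragment))
        for scheme in ("http", "https")
        for h in hosts
    )

for variant in url_variations("https://example.com/blog/post"):
    print(variant)  # four variants to check for the 404
```

Fetching each variant and comparing status codes quickly shows whether the 404 only exists on one combination of scheme and hostname.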
How to prevent old 404s from being listed in the crawl error report
First, if a 404 error URL is not meant to resolve, you can ignore it, as Google recommends. However, if you do not want it to appear in the crawl error report, you need to take a few more steps. In another sign of the power of links, Google only shows these errors if an external website or a page on your own site links to the 404 page.
You can find the specific links to the 404 page by clicking Crawl Errors and then the URL Errors section:
Click on any of the URLs that are displayed to fix them:
Use the source code of the linking page to find the relevant links quickly. To prevent the errors from showing on your dashboard, you will need to remove every link to the dead page, including those from external websites.
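Scanning a page's source for links to a dead URL can be automated with the standard-library HTML parser. A minimal sketch, using a hypothetical snippet of page source:

```python
from html.parser import HTMLParser

class LinkFinder(HTMLParser):
    """Collect every anchor href on a page that points at a given dead URL."""
    def __init__(self, target: str):
        super().__init__()
        self.target = target
        self.hits: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href", "")
            if self.target in href:
                self.hits.append(href)

# Hypothetical page source containing one link to the dead page:
source = '<a href="/old-page">old</a> <a href="/blog">blog</a>'
finder = LinkFinder("/old-page")
finder.feed(source)
print(finder.hits)  # the links that need removing or updating
```

Running this over each linking page listed in Search Console gives you a checklist of links to remove or update.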
Access Denied
Access Denied basically means that Googlebot is unable to crawl the page. Unlike a 404 error, here Googlebot is actively prevented from crawling the page by other underlying factors.
Here are some of the common reasons why Googlebot is blocked:
- There is a restriction that requires all users to log in to be able to view a URL on your site
- The site’s robots.txt file is blocking Googlebot from accessing the URLs, the entire site, or the folders
- The server requires all users to authenticate by proxy, or the hosting provider is blocking Googlebot from accessing the website
You can ignore the error if you do not want Google to crawl the page, but if you need it crawled, you should resolve the error urgently.
How to fix Access Denied Errors
These errors can be fixed by removing the specific elements or underlying settings that are blocking Googlebot from accessing the site:
- Remove the login requirement (such as a popup login prompt) from the pages you want crawled
- Cross-check the robots.txt file to be sure that the pages listed there are really not meant to be crawled or indexed by search engines
- Use the robots.txt tester to test individual URLs and to see warnings for your robots.txt file
- Make use of a user-agent switcher plugin for your browser
- Use Screaming Frog to scan your site. The scan will prompt you to log in to pages that require it
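The user-agent switching idea above can also be scripted: request the page while identifying as Googlebot and see what status code comes back. Below is a sketch using Python's standard library; the user-agent string is the one Googlebot publishes.

```python
import urllib.request

GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

def fetch_as_googlebot(url: str) -> int:
    """Request a URL with Googlebot's User-Agent and return the status code."""
    req = urllib.request.Request(url, headers={"User-Agent": GOOGLEBOT_UA})
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A 403 here, when the same request with a normal browser user-agent succeeds, suggests the server or host is blocking Googlebot specifically.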
Note that Access Denied errors can harm your site's ranking.
Not Followed Errors
If this error results from old URLs that no longer exist, you have nothing to worry about. However, if it stems from high-priority URLs, you need to act fast and resolve it.
How to fix not followed errors
Google provides a list of the specific features that Googlebot may have trouble crawling:
- Session IDs
The User-Agent Switcher Chrome add-on, Fetch, or the Lynx text browser can help you view your site as Google does. If you cannot see pages loading or important page content when you use these tools, you have found your issue. Without visible links and content on the page to crawl, some URLs will not be followed. Do further research to diagnose and fix the identified issues.
Here are the steps to take to resolve not followed errors caused by redirects:
- Be on the lookout for redirect chains. Google will stop following a redirect chain if it has too many hops
- Update the site architecture so that every page can be accessed from static links instead of previously implemented redirects
- Make sure that redirected URLs are not included in the sitemap; only the destination URL should be included
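The redirect-chain check in the first step can be automated by following each hop manually instead of letting the HTTP client resolve redirects silently. A sketch using Python's standard library, assuming five hops is a reasonable cutoff:

```python
import urllib.error
import urllib.parse
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""
    def redirect_request(self, *args, **kwargs):
        return None

def redirect_chain(url: str, max_hops: int = 5) -> list[str]:
    """Follow redirects one hop at a time; return every URL in the chain."""
    opener = urllib.request.build_opener(NoRedirect)
    chain = [url]
    for _ in range(max_hops):
        try:
            with opener.open(chain[-1]):
                break                        # a 2xx response ends the chain
        except urllib.error.HTTPError as exc:
            if exc.code in (301, 302, 307, 308) and "Location" in exc.headers:
                chain.append(
                    urllib.parse.urljoin(chain[-1], exc.headers["Location"])
                )
            else:
                break                        # a real error, not a redirect
    return chain
```

A chain with more than two entries means an intermediate redirect that should instead point straight at the final destination.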
Other tools that you can use to resolve this error are:
- Raven Tools Site Auditor
- Moz Pro Site Crawl
- The Screaming Frog SEO Spider
DNS & Server Errors
In the URL Errors section, Google Search Console lists DNS errors and server errors again. Google recommends resolving these errors the same way you handled the site-wide DNS and server errors discussed earlier.
The errors in this section differ from those discussed earlier in that they only affect specific URLs, not the entire site. Also, if some URLs are configured differently or have special setups such as minisites, they will be listed in this category.
Here is a simple table to help you quickly recall the meaning of the various errors discussed in this guide:

| Error | Meaning |
| --- | --- |
| DNS error | Googlebot cannot connect with your domain via DNS lookup |
| Server error | The server takes too long to respond or the request times out |
| Robots failure | Googlebot cannot retrieve your robots.txt file |
| Soft 404 | A page returns 200 (OK) when it should return 404 (Not Found) |
| 404 | Googlebot tried to crawl a page that no longer exists |
| Access denied | Googlebot is blocked from crawling the page |
| Not followed | Googlebot could not fully follow a URL, often due to redirect issues |
The bottom line is this: fixing these errors, tedious as the process is, will improve your website's ranking on Google and provide a better experience to your visitors.