You can view details about the crawlability and indexability of your pages and videos in the Index report in Search Console. The reasons why URLs on your site are not crawled or indexed are listed in detail in this report.
In this article, I have compiled the crawlability and indexability warnings, errors, and solutions that can be viewed in the “Pages” tab of the Index report in the Index section of Search Console.
How Does Indexing Work?
Googlebot generally goes through four main stages to discover a URL, crawl it by requesting it, index it, and then serve it to users from Google search results. Googlebot finds a URL through a link from an external website or through the site’s sitemap, which begins Discovery, the first stage of the indexing process.
If no error occurs after the discovery phase, the URLs are placed in the crawl queue and the second stage, Crawling, begins. Some problems can prevent a URL from being crawled (server problems, robots.txt blocks, rendering issues, etc.). If there is no problem, the crawl takes place and the Indexing stage of the page begins.
Indexing occurs as long as nothing prevents the page from being indexed. In the final stage, Serving, the process is completed for the indexed URLs, and they are served to users through Google search results.
What Does A Search Console Index Report Do?
There is various crawlability and indexability data about your pages in the “Pages” tab of the “Indexing” section in the left menu of Search Console. The Indexing section contains the “Pages”, “Video Pages”, “Sitemaps”, and “Removals” tabs, and the “Pages” tab lists warning messages and errors about the discovery, crawling, and indexing status of your pages.
In this section, you can view the URLs that Google has successfully crawled and indexed. At the same time, you can view pages that were not crawled after Googlebot’s request, or that were crawled but not indexed for various reasons, and examine in detail why your pages could not be crawled or indexed.
Warnings, Errors, And Solutions In The Index Report
You can see many different categories of warnings and errors in the index report. The warnings and errors, which are divided into different categories here, vary according to the situations encountered after your pages are discovered or crawled by Googlebot.
In general, this report lists warning messages and errors on discovery, crawling, and indexing. In the rest of the article, I tried to explain the warning messages and errors presented in this report and the solutions for these errors simply and understandably.
The following warnings and errors are generally encountered in the Page Indexing report, which is divided into two main sections titled “Why aren’t the pages indexed?” and “Improving the page appearance”. In both sections, crawlability and indexability issues with your URLs are listed in expandable tabs one after the other.
Blocked By robots.txt
As you know, with the Disallow directives in your robots.txt file, you can prevent a single URL, a directory, or multiple URLs from being crawled by Googlebot. In this case, Google shows you the warning “Blocked by robots.txt” in the Index report in Search Console. When you click on the warning message, you can see which of your URLs could not be crawled because of this block.
Solution: This message is not always an error that needs to be resolved. Google is simply informing you that you have URLs that cannot be crawled because of your robots.txt directives. All you have to do here is check whether any URL that is important to you, and that you did not intend to block, is among the URLs affected by the robots.txt rules.
Incorrectly written Disallow directives in the robots.txt file are a very common problem, and very important URLs may fail to be crawled because of such a mistake. After making your changes, you can send a review request to Google with the “Verify Fix” button so that your URLs are re-evaluated.
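As a quick local check, you can test Disallow logic with Python’s standard-library robots.txt parser before deploying a rule. The rules and URLs below are hypothetical examples, not taken from any real site:

```python
# Minimal sketch: testing whether robots.txt rules would block Googlebot
# from crawling a URL, using the standard-library robots.txt parser.
# The rules and URLs below are hypothetical examples.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: Googlebot
Disallow: /private/
Disallow: /checkout
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "https://example.com/private/report.html"))  # False (blocked)
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))            # True (allowed)
```

If `can_fetch` returns False for a URL that matters to you, the corresponding Disallow rule is the one blocking it.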
Blocked Due To Another 4xx Issue
The URLs where you see this error may have a 4xx issue in a category other than the 4xx errors Google lists separately in Search Console. For example, a URL that returns a 404 response code is shown in its own tab, but URLs with other 4xx errors such as 405, 406, or 408 are not displayed separately; the affected URLs are grouped together under the “Blocked due to other 4xx issue” tab.
Solution: First, run a new test on the affected URLs through the URL Inspection Tool to determine which HTTP response Googlebot receives when it requests the page. Afterward, take the necessary actions so that the relevant URLs return a 200 HTTP response again. Once you have fixed the affected URLs, you can notify Google that the problems have been resolved with the “Verify Fix” button.
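To make the grouping concrete, here is a small illustrative helper that maps an HTTP status code to the report tab it would likely fall under. The bucket names are paraphrases of the report’s tabs, not an official API:

```python
# Hypothetical helper: map an HTTP status code to the Search Console
# tab it would likely appear under, per the grouping described above.
# 403 and 404 have dedicated tabs; other 4xx codes (405, 406, 408, ...)
# are lumped together under "Blocked due to other 4xx issue".
def gsc_4xx_bucket(status: int) -> str:
    if status == 403:
        return "Access forbidden (403)"
    if status == 404:
        return "Not found (404)"
    if 400 <= status <= 499:
        return "Blocked due to other 4xx issue"
    return "not a 4xx error"

for code in (403, 404, 405, 408, 200):
    print(code, "->", gsc_4xx_bucket(code))
```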
Discovered – Currently Not Indexed
Googlebot can discover URLs on your site from different sources. The discovery phase can take place with the help of links that your URLs get from external sites, through links to your site from social media platforms, or your sitemap that you submit to Google from Search Console.
However, not every discovered URL is immediately crawled and indexed. Discovered URLs are crawled by Googlebot later, and a URL must wait its turn in the queue before it is crawled.
Solution: Seeing this warning message does not always mean there is an error. With this message, Google informs you that your URLs have been discovered and that the first step on the path to indexing is complete. Although there is no single action that guarantees your discovered URLs will be crawled and indexed, it is a good step to ensure your pages comply with SEO criteria, keep your content quality high, and serve pages that follow Google’s quality guidelines.
Crawled – Currently Not Indexed
Just as it is not always possible to instantly crawl discovered URLs, it is not always possible to have your crawled URLs indexed instantly. This message you see in the index report means your URLs have been crawled but not yet indexed in Google.
There are many reasons why the URLs listed under this warning have not yet been indexed. The most commonly cited ones are site-wide crawl budget issues, poor content quality at the affected URLs, duplicate content, and above all, Googlebot not finding the page valuable enough to index. If a high number of your URLs are affected by this issue and are not indexed, it is a serious obstacle to your site’s organic traffic.
Solution: Through site-wide crawl budget management, you should direct Googlebot to the valuable pages that matter for your SEO performance rather than to unproductive pages. After these crawl budget actions, any pages that still go uncrawled and unindexed should be pages that do not matter to your SEO anyway.
Secondly, it is also recommended to perform a content audit for your affected URLs. If the content on your pages is weak and insufficient, the content must be strengthened and enriched. In addition, if your pages contain duplicate content or if you have content designed to target only Googlebot, you still need to revise your content for these pages.
Excluded By The “noindex” Tag
This message in the index report does not appear due to an error; it is displayed to inform the site owner. As you know, the noindex tag tells Googlebot not to list the page in search results: it is used to remove a page from the index if it is already indexed, or to keep it from being indexed in the first place.
The important point here is which URLs are listed when you click the “Excluded by ‘noindex’ tag” tab. If a page that is important to your SEO performance is among the noindex-tagged URLs in this tab, that is a serious problem. A noindex tag forgotten on an important page is a common mistake.
Solution: As this is an informational message, it does not always count as an error, and a resolution action may not always be required. The only point that matters is whether there are important URLs among those listed under “Excluded by ‘noindex’ tag”. If an important URL is affected by this warning, you should remove the noindex tag from that URL.
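To audit pages for a stray noindex, you can scan the HTML for a robots meta tag. A minimal sketch using only Python’s standard library (the sample HTML is hypothetical):

```python
# Minimal sketch: detect a robots meta tag containing "noindex" in an
# HTML document, using only the standard library. Sample HTML is hypothetical.
from html.parser import HTMLParser

class NoindexFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name", "") or "").lower() == "robots" and \
           "noindex" in (attrs.get("content", "") or "").lower():
            self.noindex = True

html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
finder = NoindexFinder()
finder.feed(html)
print(finder.noindex)  # True
```

Running this check over your important URLs can catch a forgotten noindex before Search Console reports it.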
Access Denied Due To Denial Of Permission (403)
This error, with HTTP status code 403, occurs when your server refuses access to a URL after a request is made to it. Both real users and Googlebot can encounter the 403 error; it indicates that access to the requested URL is blocked by the server.
Since Googlebot, encountering a 403 error, cannot get access permission for the page it wants to crawl, the crawl does not occur and the relevant URL is displayed in this warning in Search Console.
Solution: You need to make an access arrangement for the relevant URLs through the configurations on your server. After your edit, you can use the URL Inspection Tool in Search Console to check what HTTP response Googlebot receives when it crawls your URL. After you fix the access problem in your URLs, you can use the “ Verify Fix ” button to notify Google that the problem has been resolved.
Not Found (404)
When Googlebot sends a crawl request to your URL and sees a 404 in the HTTP response, it understands that your page no longer exists, and the crawl fails. The URLs you see in this tab of the index report are URLs that Googlebot discovered through an external link, or that previously existed on your site but are no longer available.
Unless you take any additional action, once Googlebot discovers a 404 URL, it periodically recrawls it and checks its status. While the crawl frequency for a URL that keeps returning 404 may decrease over time, there is no way to tell Googlebot that it should permanently stop crawling a URL.
Solution: If there are pages that you have moved to a different URL among the 404 URLs you see in this tab, it is recommended to apply a 301 redirect so that your 404 URLs can be redirected to their new addresses. With a 301 redirect, you can redirect both real users and Googlebot to a new and working URL.
In addition, a 301 redirect ensures that the index entry held by your old URL is replaced with the new URL, and that the authority and backlink value are passed on to the new URL. After completing the necessary redirects for your URLs with 404 errors, you can send a review request to Google with the “Verify Fix” button to have your URLs checked.
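On an Apache server, for example, such a 301 redirect can be sketched in the .htaccess file like this (the paths are hypothetical examples; nginx and other servers have their own equivalents):

```apache
# Permanently redirect a removed page to its new address (hypothetical paths)
Redirect 301 /old-page/ /new-page/

# Or with mod_rewrite, if it is enabled: redirect a whole renamed directory
RewriteEngine On
RewriteRule ^old-blog/(.*)$ /blog/$1 [R=301,L]
```

Whichever mechanism you use, the important part is the 301 status code, which signals to both browsers and Googlebot that the move is permanent.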
Soft 404
A soft 404 is a warning shown when a page returns a 200 response but its content is empty or it looks like a 404 page. If a page with an HTTP 200 response code cannot load due to a technical error, cannot be rendered by Googlebot, loads with empty content, or shows users a 404-style error message by design, it is marked as a soft 404 by Google. Pages marked as soft 404 are not indexed and are not shown to users in search results.
Solution: You can start by checking whether your URLs marked as soft 404 in the index report are working. If these pages do not open, you need to identify why the content of your URLs is not loading and resolve the technical or software problems that cause it. If your pages open but their content is empty, you need to fill these pages with useful content.
Finally, even if your pages merely behave like a 404 page and display an error message such as “Not Found”, you need to identify and resolve the cause. After solving the problems in your URLs, you can send a review request to Google for the soft 404 error with the “Verify Fix” button in Search Console.
Server Error (5xx)
5xx errors are shown as a warning message when a page cannot be loaded or crawled due to server-related problems. A problem with your server may prevent your URLs from working and your pages from being usable.
In this case, the Googlebot crawling your page will receive a 500 or similar HTTP response. Your URLs that cannot be crawled due to server problems are not indexed and are listed in “ Server error (5xx) ” in the Index report.
Solution: The problem that prevents your pages from working on your server should be resolved. Your URLs may not be working due to a faulty configuration on your server, database connection problems, or similar server-related problems.
If this problem is detected and resolved – unless there is a different situation – your pages will be up and running with the HTTP 200 response code. After you have resolved the 5xx problems on your pages, you can use the ” Verify Fix ” button on Search Console to let Google know that you have solved the problem and have it re-examined.
Redirected Page
The redirected page warning can be caused by different redirect problems. After a request is made to a URL, that URL may redirect you to another URL. If many redirects are applied before the final URL is reached, it becomes difficult to reach the URL at the end of the chain, and difficult for Googlebot to complete the crawl properly.
In this case, the crawl does not occur, and the URL with this problem is written to the “Redirected page” tab in the Index report. Apart from this situation, you can also face this problem if a URL constantly redirects to itself and enters an endless redirect loop.
In this case, too, Googlebot cannot perform the crawl, and the relevant URL is likewise listed in the “Redirected page” tab of the Index report.
Solution: The URLs affected by the redirected page issue should be checked, along with each step of their redirects. If a URL redirects through many intermediate hops, this creates a “redirect chain” that is difficult to crawl. Your source URLs should therefore always point to the destination URL with a single redirect.
In addition, if you have URLs that redirect to themselves, the redirect should be removed from these URLs and the pages checked to confirm that they work and are suitable for crawling. After you have resolved the issues, you can click the “Verify Fix” button on the warning page to ask Google to re-evaluate and re-crawl your URLs.
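The chain and loop checks described above can be sketched as a small script over a hypothetical redirect map (in practice, the map would come from your server configuration or a site crawl):

```python
# Minimal sketch: follow a hypothetical redirect map to detect redirect
# chains and infinite loops before Googlebot runs into them.
def trace_redirects(redirects: dict[str, str], start: str, max_hops: int = 10):
    """Return (final_url, hops), or raise ValueError on a loop or long chain."""
    seen = {start}
    url, hops = start, 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen:
            raise ValueError(f"redirect loop at {url}")
        if hops > max_hops:
            raise ValueError("redirect chain too long")
        seen.add(url)
    return url, hops

redirects = {
    "/a": "/b",        # chain: /a -> /b -> /c
    "/b": "/c",
    "/loop": "/loop",  # self-redirect loop
}
print(trace_redirects(redirects, "/a"))  # ('/c', 2)
```

Any URL that resolves in more than one hop is a chain worth flattening into a single redirect; any raised loop error marks a URL that Googlebot cannot crawl at all.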
Redirect Error
Apart from the problem above, if you see a “Redirect error” warning for a URL, it indicates that the target page of a redirect you set up is not working. For example, if you redirected page A to page B and page B returns a 404, page A will be listed under “Redirect error” in the Index report in Search Console.
Solution: If the destination URL of a redirect does not work, you should remove that redirect and point your source URLs to URLs that work and can be crawled without any problems. After this action, you can let Google know that you have solved the problem by clicking the “Verify Fix” button in the Index report.
Alternative Page With Correct Canonical Tag
The warning message “Alternative page with correct canonical tag” that you see in the index report is not an error, and there is no problem to resolve. This message informs you that the listed URLs correctly point to a canonical URL.
Solution: If the URLs in this report do not use an incorrect canonical URL on your live site, there is no problem. But if the canonical tags on these URLs differ on the live site and this is not part of a planned SEO strategy, you will need to update the canonical URLs of your pages so that they reference the correct canonical URL.
Duplicate Without User-Selected Canonical
The URLs listed under this warning are pages that Googlebot crawled but did not index because their content duplicates or closely resembles another page. If your URLs do not contain duplicate or similar content and still appear under this warning, it means that no canonical tag is used on them.
Solution: If the content on your pages is duplicate or very similar to another page, you need to make content edits for those pages. If the problem is not caused by content problems, you should check if your pages have a Canonical URL.
If these URLs do not use a canonical URL, you can resolve the problem by adding the correct canonical URL to them. After your edits, you can start the verification process by clicking the “Verify Fix” button so that Google rechecks your URLs.
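You can verify which canonical a page declares by reading its link tags. A minimal standard-library sketch (the sample HTML and URL are placeholders):

```python
# Minimal sketch: extract the canonical URL declared by a page's
# <link rel="canonical"> tag, using only the standard library.
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        if (attrs.get("rel", "") or "").lower() == "canonical":
            self.canonical = attrs.get("href")

html = ('<html><head>'
        '<link rel="canonical" href="https://example.com/page/">'
        '</head></html>')
finder = CanonicalFinder()
finder.feed(html)
print(finder.canonical)  # https://example.com/page/
```

A page that returns `None` here declares no canonical at all, which matches the second cause of this warning described above.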
Duplicate, Google Chose Different Canonical Than User
Even if you use a valid canonical URL for the URLs on your site, in some cases Googlebot will ignore the canonical URL you specified after crawling and decide that another similar page on your site is more appropriate. In this case, it indexes the URL it prefers instead of the one you declared.
This warning message is encountered especially on sites with similar content. For example, suppose there are two pages named “May 25, 2023 Exchange Rate” and “May 26, 2023 Exchange Rate”. Since these two pages are very similar in content, Googlebot may not index both of them after crawling.
As this warning message indicates, it can instead choose and index the URL it deems more appropriate. In addition, this warning is also encountered when canonical URLs are not selected correctly, which may prevent important pages from being indexed.
Solution: First, check whether the canonical URLs you use are the best canonical choice for your pages. If there is no problem at this point and your canonical URLs are as they should be, do a site-wide review of the affected pages to check for content similarity. You can use the “Verify Fix” button to notify Google that the issue has been resolved after your changes.
Indexed Though Blocked By robots.txt
This warning message, listed under the heading “Improving the page appearance” in the index report, states that a URL was crawled and indexed despite being blocked with a Disallow rule in the robots.txt file.
So how can a URL whose crawling is disallowed via robots.txt still end up indexed? Google explains this situation as follows: even though there is a crawl block for a URL in the robots.txt file, when an external site links to that URL, Google can discover it through that link and index it without crawling its content.
Solution: For the URLs you have blocked via robots.txt not to be indexed, you need to add the noindex tag to the relevant URLs. Keep in mind that Googlebot must be able to crawl a page in order to see its noindex tag, so the robots.txt block on those URLs should be lifted as well. The noindex tag will then prevent your URLs from being indexed.
Page Indexed Without Content
This warning is displayed when the content of a page cannot be rendered or read by Googlebot, especially for pages whose content fails to load due to technical errors. Although Googlebot cannot read the page content, indexing still takes place and the page is served to users through search results.
Solution: The loading problems on the URLs shown in this warning should be detected and resolved. After your edits, you can check whether your page content is readable by Googlebot through the URL Inspection Tool in Search Console. Once the loading problems are solved, you can use the “Verify Fix” button to have your URLs reviewed again.