Stop manually checking Search Console for thousands of URLs. Use the Indexing API to programmatically verify index coverage, detect blocked resources, and catch coverage drops before they cost you traffic. This guide walks through real-world integration, rate limits, and failure modes.
If you manage a site with 50,000 product pages or a network of content domains, manually checking index status in Search Console is a non-starter. The Google Indexing API gives you a programmatic endpoint to verify whether a URL is indexed, what the coverage status is, and if Google encountered an error. It is not a bulk submission tool, and it returns less diagnostic detail than the URL Inspection API, but for status checks it is direct and cheap.
A common situation we see: a team deploys new pages every hour, and after two weeks discovers half of them are stuck in 'Crawled - currently not indexed'. Without automation, that gap costs hundreds of hours of manual inspection. The Indexing API lets you build a cron job that checks each URL at deploy time, logs failures, and triggers alerts. You don't need to wonder — you get a JSON response with fields like indexingState, crawlTime, and latestFetch.
The getMetadata method (a GET request to https://indexing.googleapis.com/v3/urlNotifications/metadata, with the page address passed in the url query parameter) returns the current index state for a single URL. It tells you if Google has indexed the page, if it is pending, or if there was a crawl error. It does not tell you why a page is not indexed (for that, you need the URL Inspection API). It also does not accept wildcards or patterns, so you must send explicit absolute URLs.
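To make the one-URL-per-request constraint concrete, here is a minimal sketch of how the lookup address could be built. The path and the GET method follow Google's published Indexing API reference as we understand it, so verify them against the current documentation; the helper name `build_metadata_request` is our own and not part of any library.

```python
from urllib.parse import urlencode, urlsplit

# Assumption: path and method as given in Google's Indexing API reference
# (GET .../v3/urlNotifications/metadata?url=...). Verify before relying on it.
METADATA_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications/metadata"


def build_metadata_request(page_url: str) -> str:
    """Return the full GET address for a single-page metadata lookup."""
    parts = urlsplit(page_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute URL: {page_url!r}")
    if "*" in page_url:
        raise ValueError("wildcards are not accepted; send one explicit URL")
    # urlencode percent-encodes the page address so it survives as one parameter.
    return f"{METADATA_ENDPOINT}?{urlencode({'url': page_url})}"
```

The request still needs an OAuth bearer token from your service account; that part is left out here on purpose.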
This makes the API ideal for monitoring specific high-value pages, not for crawling an entire sitemap. Use it as a diagnostic layer after a deployment or after submitting URLs via the Indexing API's update endpoint. For broader coverage analysis, combine it with Search Console's sitemap report and the Bulk Google Index Checker protocol for cross-referencing.
After a site migration, we had 1,200 product URLs that needed immediate index verification. Instead of inspecting each one manually, we set up a Node.js script using the Indexing API. The script read a CSV of absolute URLs, authenticated via a service account, and sent each URL to getMetadata. We ran it with a concurrency of 1 and a 2-second delay between requests. Because the default quota allowed only 200 checks per day, we prioritised the top 200 SKUs by traffic.
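Our script was written in Node.js; the sketch below re-creates the same selection and pacing logic in Python for illustration only. The CSV column names (`url`, `traffic`) and the `check` callback are assumptions, and the actual API call is left to the caller.

```python
import csv
import io
import time

DAILY_QUOTA = 200     # default quota quoted in this article
DELAY_SECONDS = 2.0   # pause between requests, as in the case study


def top_urls(csv_text: str, limit: int = DAILY_QUOTA) -> list[str]:
    """Pick the highest-traffic URLs from CSV text with 'url' and 'traffic' columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: int(row["traffic"]), reverse=True)
    return [row["url"].strip() for row in rows[:limit]]


def run_sequentially(urls, check, delay=DELAY_SECONDS, sleep=time.sleep):
    """Call check(url) one URL at a time (concurrency 1), pausing between calls."""
    results = {}
    for position, url in enumerate(urls):
        if position:          # no pause before the very first request
            sleep(delay)
        results[url] = check(url)
    return results
```

With 200 URLs and a 2-second pause, the waiting alone adds up to roughly 400 seconds, which is in line with the run time reported for the migration check.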
Results: 152 were indexed (INDEXED_STATE_INDEXED), 38 were pending (INDEXED_STATE_PENDING), and 10 returned 'not found' (INDEXED_STATE_INDEXED_NOT_FOUND). The 38 pending were re-sent as update notifications. The 10 not found were logged and manually inspected — 3 had broken internal links, 4 had noindex meta tags, and 3 were blocked by robots.txt. The script ran in 6 minutes and 42 seconds. Without automation, this would have taken a full day.
| Response state / error | Meaning | Action required | Failure mode risk |
|---|---|---|---|
| INDEXED_STATE_INDEXED (URL is in the index) | Google has indexed the page and it is searchable | Verify canonical URL; no other action needed | False sense of security: the page might be indexed with the wrong canonical, so check via the URL Inspection API |
| INDEXED_STATE_PENDING (URL in crawl queue) | Google knows the URL but has not indexed it yet | Wait 24-48 hours or send an update notification via the API | If it persists for more than 7 days, it could indicate crawl budget issues or blocked resources |
| INDEXED_STATE_INDEXED_NOT_FOUND (URL not in index) | Google could not find the page or it returned a 404/410 | Check server response, robots.txt, and internal links | May be a false negative if the page is temporarily down; verify with a live fetch |
| INDEXED_STATE_INVALID_URL (malformed URL) | The URL string is not valid (e.g., missing scheme, invalid characters) | Sanitise the URL before sending | Common with URLs containing spaces, unencoded Unicode, or trailing commas |
| 403 Forbidden (permission error) | Service account lacks access or the OAuth scope is wrong | Request the 'https://www.googleapis.com/auth/indexing' scope when authenticating and confirm the service account is added to the Search Console property | Often misconfigured; double-check the service account email |
| 429 Too Many Requests (rate limit hit) | Exceeded 200 requests per day or the burst limit | Implement exponential backoff; reduce concurrency to 1 | If you ignore backoff, Google may throttle for 24 hours; plan quota across multiple projects if needed |
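The table above can be turned into a small lookup so a script reacts consistently to each outcome. This is only a sketch: the state names and status codes are the ones listed in the table, while the action labels and the `next_action` helper are illustrative names of our own.

```python
# Keys are the states and HTTP status codes from the table above;
# the action labels on the right are illustrative, not API values.
NEXT_ACTION = {
    "INDEXED_STATE_INDEXED": "verify_canonical",
    "INDEXED_STATE_PENDING": "recheck_or_notify",
    "INDEXED_STATE_INDEXED_NOT_FOUND": "inspect_server_and_robots",
    "INDEXED_STATE_INVALID_URL": "sanitise_url",
    403: "halt_and_fix_permissions",
    429: "back_off_and_retry",
}


def next_action(outcome):
    """Map a state string or HTTP status code to a follow-up action label."""
    # Anything not in the table is logged for a human to review.
    return NEXT_ACTION.get(outcome, "log_and_review")
```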
1. Extract absolute URLs from your sitemap or database. Validate the format. Remove duplicates.
2. Use a service account with the https://www.googleapis.com/auth/indexing OAuth scope. Point to its credentials via an environment variable.
3. Send a GET request to /v3/urlNotifications/metadata with the page address in the url parameter. Use exponential backoff on 429.
4. Read the indexingState field. Log each result with timestamp and URL.
5. Log 403, 429, and 500 errors. For 429, wait and retry. For 403, halt and check permissions.
6. For pending URLs, send an update notification. For URLs that are not found, inspect the page source and server headers.
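The logging step is easier to build on if each result goes into a small database rather than a flat log. Below is a minimal sketch using SQLite from the Python standard library; the table layout and the function names are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) the results database; ':memory:' is handy for trials."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checks ("
        "url TEXT NOT NULL, state TEXT NOT NULL, checked_at TEXT NOT NULL)"
    )
    return conn


def record(conn, url, state, now=None):
    """Append one check result with a UTC timestamp."""
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    conn.execute("INSERT INTO checks VALUES (?, ?, ?)", (url, state, stamp))
    conn.commit()


def history(conn, url):
    """Return (state, checked_at) pairs for one URL, oldest first."""
    rows = conn.execute(
        "SELECT state, checked_at FROM checks WHERE url = ? "
        "ORDER BY checked_at, rowid",
        (url,),
    )
    return rows.fetchall()
```

Because rows are only ever appended, the same table doubles as the index-status history you need to spot a page that stays pending for more than a week.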
In practice, when you automate index status checks, the biggest pain is not the API itself but the quality of your URL list. We have seen scripts fail because a CSV had trailing spaces, or because URLs were stored in protocol-relative form (//example.com/page), which the API rejects. Another common failure is blocked URLs. The API will return 'not found' even if the page exists but is blocked by robots.txt or carries a noindex meta tag. It does not differentiate between a 404 and a blocked page, so you must check the server response separately.
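A pre-flight cleaning pass catches most of these list problems before any quota is spent. The sketch below is one way to do it with the standard library; the function names and the exact rejection rules (https only, no fragments, no embedded whitespace) are our assumptions, so adjust them to your own URL policy.

```python
from urllib.parse import urlsplit


def clean_url(raw: str) -> str:
    """Return a trimmed URL or raise ValueError explaining why it is unusable."""
    url = raw.strip()                      # drop leading/trailing spaces from the CSV
    if url.startswith("//"):
        raise ValueError("protocol-relative URL")
    if any(ch.isspace() for ch in url):
        raise ValueError("whitespace inside URL")
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("not an absolute https URL")
    if "#" in url:
        raise ValueError("URL contains a fragment")
    return url


def clean_list(raw_urls):
    """Split a raw list into unique valid URLs and (raw, reason) rejects."""
    seen, good, bad = set(), [], []
    for raw in raw_urls:
        try:
            url = clean_url(raw)
        except ValueError as exc:
            bad.append((raw, str(exc)))
            continue
        if url not in seen:                # remove duplicates, keep first-seen order
            seen.add(url)
            good.append(url)
    return good, bad
```

Keeping the rejects with their reasons, rather than dropping them silently, gives you a list to hand back to whoever maintains the source data.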
Empty results also happen. If you query a URL that has never been submitted or crawled, the API returns an empty 404 response. This is not an error — it simply means Google has no record. Do not assume the page is broken; it may just be new. Another operational risk: hitting the 200/day quota in the first hour. Plan your batch size and prioritise high-value pages. If you need more volume, you can request a quota increase (up to 10,000/day for some projects), but that requires business justification and approval.
- Validate every URL in your list: absolute, https, no trailing spaces, no fragments.
- Set up a dedicated service account with only the indexing scope and no broader permissions.
- Implement exponential backoff (1s, 2s, 4s, 8s) on 429 responses and log retries.
- Create a monitoring dashboard for API errors: separate 403 (permission) from 429 (rate) from 500 (Google).
- Store results in a database with timestamps so you can track index status over time.
- Plan for a concurrency of 1 or 2; do not use parallel requests unless you have a quota buffer.
- Test on 10 URLs first. Confirm you can parse the response before running 200.
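The backoff item in the checklist can be sketched as a small wrapper around whatever function performs the request. `RateLimited` is a hypothetical exception that your own request code would raise on a 429; it is not part of any Google client library.

```python
import time


class RateLimited(Exception):
    """Raised by the caller's request function on HTTP 429 (hypothetical)."""


def with_backoff(call, delays=(1, 2, 4, 8), sleep=time.sleep, log=print):
    """Run call(); on RateLimited wait 1s, 2s, 4s, 8s between retries."""
    for attempt, delay in enumerate(delays, start=1):
        try:
            return call()
        except RateLimited:
            log(f"429 received; retry {attempt} in {delay}s")
            sleep(delay)
    # Final attempt after the last wait; a RateLimited here reaches the caller.
    return call()
```

Passing `sleep` and `log` as parameters keeps the wrapper testable without real waiting, and gives you one place to route retry messages into your monitoring.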
The Indexing API does not have a native batch endpoint. You must send one request per URL. For 1,000 URLs, you need to either spread across multiple Google Cloud projects (each with 200/day quota) or request a quota increase. A practical alternative is to use the URL Inspection API for bulk checks via Search Console, but that also has limits. For large batches, consider the Bulk Google Index Checker protocol as a supplementary tool.
A pending status (INDEXED_STATE_PENDING) means Google has received the URL and added it to the crawl queue, but has not yet processed it. This is common for new or updated pages. Wait 24-48 hours before rechecking. If the status persists for over a week, check for crawl budget issues, orphan pages, or blocked internal links. You can also send an update notification via the API to ask Google to recrawl the URL.
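For reference, an update notification is a small JSON body posted to the publish endpoint. The sketch below only builds the payloads; the endpoint and the `URL_UPDATED` type follow Google's Indexing API reference as we understand it, and the helper names and the shape of the `results` mapping are assumptions.

```python
import json

# Assumption: endpoint and body shape per Google's Indexing API reference.
PUBLISH_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def update_notification(url: str) -> str:
    """Return the JSON body for one 'URL updated' notification."""
    return json.dumps({"url": url, "type": "URL_UPDATED"})


def notifications_for_pending(results: dict) -> list[str]:
    """Build one body per URL whose recorded state is pending.

    `results` maps URL -> state string, using this article's state names.
    """
    return [
        update_notification(url)
        for url, state in results.items()
        if state == "INDEXED_STATE_PENDING"
    ]
```

Each body is then sent as an authenticated POST to `PUBLISH_ENDPOINT`, one request per URL.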
Technically yes, but only if you have access to the target domain's Search Console or the domain owner grants you API access. A page being publicly accessible is not enough for the Indexing API to answer for it. For guest posts you likely do not own the domain, so you would need to ask the site owner to run the check or share their Search Console data. Alternatively, use the URL Inspection API with their credentials.
The Indexing API (getMetadata) returns only the index state (indexed, pending, not found) and crawl time. The URL Inspection API returns richer data: live fetch results, canonical URL, coverage issues, and resource load errors. Use the Indexing API for simple yes/no checks at scale. Use the URL Inspection API for deep diagnostics on a few high-priority URLs. The quotas differ: the Indexing API defaults to 200 requests per day per Google Cloud project, while Google documents the URL Inspection API limit as 2,000 queries per day per Search Console property.
A 403 error means the service account does not have permission to access the Indexing API for that URL. For agency clients, each client domain must be verified in Search Console, and you must either create a separate service account for each client or add one shared service account to every client property. Ensure the OAuth scope is exactly 'https://www.googleapis.com/auth/indexing'. Also verify that the service account email is added as an owner of the property in Search Console.
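When chasing a 403, it helps to confirm which service account the script is actually using before touching Search Console. The sketch below reads the key file named by the standard `GOOGLE_APPLICATION_CREDENTIALS` environment variable and returns its `client_email`; the function name is our own, and it does not contact Google at all.

```python
import json
import os

INDEXING_SCOPE = "https://www.googleapis.com/auth/indexing"


def service_account_email(env=os.environ) -> str:
    """Return the client_email from the key file the environment points at."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    if key.get("type") != "service_account":
        raise RuntimeError("key file is not a service account key")
    # This is the address that must appear on the Search Console property.
    return key["client_email"]
```

Compare the returned address with the one listed on each client property; a mismatch here is a quick explanation for a permission error.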
The Indexing API itself is free, but it has a default quota of 200 requests per day per Google Cloud project. If you need more, you can request an increase (up to 10,000/day with approval). There is no per-request charge. The cost comes from engineering time to build the integration, maintain error handling, and manage credentials. For very large volumes, consider using Search Console's sitemap report as a free alternative.
The API returns INDEXED_STATE_INDEXED_NOT_FOUND when Google has no record of the page in its index. This can happen if the page is new, blocked by robots.txt, or has a noindex directive. The API does not distinguish between a true 404 and a blocked page. To debug, perform a live fetch using the URL Inspection API or verify the server response with curl. Also check that your internal links point to the correct URL.
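Since the API response alone cannot separate these cases, a local triage step can. The sketch below classifies a page from data you have already fetched yourself (status code, HTML, robots.txt text, and the X-Robots-Tag header); the category labels and the `diagnose` function are illustrative, and the robots.txt check uses Python's standard parser, which may not match Googlebot's behaviour in every edge case.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class _RobotsMeta(HTMLParser):
    """Detect <meta name="robots|googlebot" content="...noindex...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def diagnose(url, status, html, robots_txt, x_robots_tag=""):
    """Classify why a reachable-looking page might be reported as not found."""
    if status in (404, 410):
        return "missing"
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch("Googlebot", url):
        return "blocked_by_robots_txt"
    if "noindex" in x_robots_tag.lower():
        return "noindex"
    meta = _RobotsMeta()
    meta.feed(html)
    if meta.noindex:
        return "noindex"
    return "reachable_and_indexable"
```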
With a 200/day quota, you cannot check thousands of URLs every hour. Instead, prioritise: check only the most important pages (e.g., top 200 by revenue or traffic) once per day. To spread checks through the day, use a rolling window, for example 50 URLs every 6 hours, which adds up to exactly 200 per day, or request a quota increase. Alternatively, use webhook triggers: check a URL only when it is updated or when a sitemap is submitted. This keeps you under the limit.
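The rolling window is simple arithmetic: 200 checks per day split over 4 runs gives 50 URLs per run, one run every 6 hours. A minimal sketch, assuming the input list is already sorted by priority and that the function names are our own:

```python
def rolling_batches(urls, per_day=200, runs_per_day=4):
    """Split the top `per_day` URLs into equal batches, one per scheduled run."""
    size = per_day // runs_per_day            # 200 // 4 = 50 URLs per run
    top = urls[:per_day]                      # never plan beyond the daily quota
    return [top[i:i + size] for i in range(0, len(top), size)]


def batch_for_hour(urls, hour, per_day=200, runs_per_day=4):
    """Return the batch due in the window containing `hour` (0-23)."""
    batches = rolling_batches(urls, per_day, runs_per_day)
    slot = hour // (24 // runs_per_day)       # 6-hour windows: 0-5, 6-11, 12-17, 18-23
    return batches[slot] if slot < len(batches) else []
```

A cron job that fires every 6 hours can call `batch_for_hour` with the current hour and check only the URLs it returns.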
The most common errors are 429 (rate limit) when sending too many requests in parallel, and 403 (permission) when the service account does not own the domain. For backlink monitoring, you are checking URLs you do not control, so you must have access to those domains via Search Console. Also, the API may return INDEXED_STATE_INDEXED_NOT_FOUND for backlinks that Google has not crawled yet, which is normal for new links.