Google Indexing API: Check Status Programmatically for Large Sites

Why the Indexing API beats manual checks for scale

If you manage a site with 50,000 product pages or a network of content domains, checking index status by hand in Search Console is a non-starter. The Google Indexing API gives you a programmatic endpoint to verify whether a URL is indexed, what the coverage status is, and whether Google encountered an error. It is not a bulk submission tool (sitemaps remain the route for that, and the URL Inspection API is for diagnosis, not submission), but for status checks it is direct and free to call.

A common situation we see: a team deploys new pages every hour, and after two weeks discovers half of them are stuck in 'Crawled - currently not indexed'. Without automation, that gap costs hundreds of hours of manual inspection. The Indexing API lets you build a cron job that checks each URL at deploy time, logs failures, and triggers alerts. You don't need to wonder — you get a JSON response with fields like indexingState, crawlTime, and latestFetch.

What the Indexing API checks (and what it does not)

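The API endpoint https://indexing.googleapis.com/v3/urlNotifications/metadata (the urlNotifications.getMetadata method) returns Google's record for a single URL, which you pass as the url query parameter of a GET request. This guide reads that record as an index state: indexed, pending, or a crawl error. Google's reference for the method documents the response as the URL plus its latestUpdate and latestRemove notifications, so confirm the fields your project actually receives before building on them. The API does not tell you why a page is not indexed (for that, you need the URL Inspection API). It also does not accept wildcards or patterns; you must send explicit absolute URLs.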
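
As a minimal sketch of the request described above, the snippet below builds the address for one status check. It assumes the REST path documented for urlNotifications.getMetadata (a GET on /v3/urlNotifications/metadata with a url query parameter); the helper name build_metadata_url and the example.com address are illustrative only.

```python
from urllib.parse import urlencode

# Assumed REST path for the urlNotifications.getMetadata method.
METADATA_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications/metadata"


def build_metadata_url(page_url: str) -> str:
    """Return the GET address for one page, with the page URL percent-encoded."""
    return f"{METADATA_ENDPOINT}?{urlencode({'url': page_url})}"


# Hypothetical product page; note that its own query string gets encoded too.
request_url = build_metadata_url("https://example.com/products/blue-widget?ref=home")
print(request_url)
```

Encoding matters because a page URL with its own query string would otherwise be split into extra parameters on the API request.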

This makes the API ideal for monitoring specific high-value pages, not for crawling an entire sitemap. Use it as a diagnostic layer after a deployment or after sending URL_UPDATED notifications through the Indexing API's publish endpoint. For broader coverage analysis, combine it with Search Console's sitemap report and the Bulk Google Index Checker protocol for cross-referencing.

Integration steps: from zero to status check

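1. Enable the Indexing API in your Google Cloud project and create a service account. The OAuth scope (https://www.googleapis.com/auth/indexing) is requested when the script authenticates, not assigned to the account itself.
2. Download the private key (JSON), set environment variables for credentials in your application, and add the service account email as an owner of the property in Search Console.
3. Construct the request: send a GET to the metadata endpoint with the page address URL-encoded in the url query parameter. Authenticate using OAuth 2.0 with the service account.
4. Parse the JSON response: check indexingState (INDEXED_STATE_INVALID_URL, INDEXED_STATE_PENDING, INDEXED_STATE_INDEXED_NOT_FOUND, INDEXED_STATE_INDEXED).
5. Handle errors: 403 (permission), 429 (rate limit), 500 (internal). Implement exponential backoff and log all failures.
6. Batch checks: there is no endpoint that accepts a list of URLs. You can bundle up to 100 calls into one multipart batch HTTP request, but each call still counts against quota, so you still iterate per URL. Use async/parallel with a concurrency limit of 1-2 to avoid hitting the 200/day quota too fast.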
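
The iteration in the last step can be sketched as a paced loop. This is an illustration under stated assumptions, not a definitive client: fetch is a hypothetical callable that performs one authenticated status request and returns the parsed JSON, and the quota and delay defaults are the figures used in this guide.

```python
import time
from typing import Callable, Dict, Iterable, List


def check_urls(
    urls: Iterable[str],
    fetch: Callable[[str], Dict],
    daily_quota: int = 200,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Check URLs one at a time (concurrency of 1) and stop at the daily quota."""
    results: List[Dict] = []
    for count, url in enumerate(urls):
        if count >= daily_quota:
            break  # remaining URLs wait for the next day's quota
        if count > 0:
            sleep(delay_seconds)  # pause between requests, not before the first
        results.append({"url": url, "response": fetch(url)})
    return results


# Demo with a stand-in fetch and a no-op sleep so nothing touches the network.
demo = check_urls(
    ["https://example.com/a", "https://example.com/b"],
    fetch=lambda url: {"url": url},
    sleep=lambda seconds: None,
)
print(len(demo))
```

Passing fetch and sleep in as arguments keeps the pacing logic testable without credentials; in a real run fetch would wrap an authorised HTTP session.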

Worked example: Checking 1,200 product URLs after a migration

After a site migration, we had 1,200 product URLs that needed immediate index verification. Instead of manually inspecting each, we set up a Node.js script using the Indexing API. The script read a CSV with absolute URLs, authenticated via a service account, and sent each URL to getMetadata. We set a concurrency of 1 and a delay of 2 seconds between requests to stay under the 200/day quota (we only had 200 checks, so we prioritised the top 200 SKUs by traffic).

Results: 152 were indexed (INDEXED_STATE_INDEXED), 38 were pending (INDEXED_STATE_PENDING), and 10 returned 'not found' (INDEXED_STATE_INDEXED_NOT_FOUND). The 38 pending were re-sent as update notifications. The 10 not found were logged and manually inspected — 3 had broken internal links, 4 had noindex meta tags, and 3 were blocked by robots.txt. The script ran in 6 minutes and 42 seconds. Without automation, this would have taken a full day.
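
The figures above can be sanity-checked with a few lines of arithmetic. The sketch below only reuses the numbers reported in this example (200 checks, a 2-second delay, the three result counts); nothing in it comes from the API.

```python
from collections import Counter

# Result counts as reported in the worked example.
tally = Counter(
    {
        "INDEXED_STATE_INDEXED": 152,
        "INDEXED_STATE_PENDING": 38,
        "INDEXED_STATE_INDEXED_NOT_FOUND": 10,
    }
)
total = sum(tally.values())  # should equal the 200 checks that were run

for state, count in tally.items():
    print(f"{state}: {count} ({count / total:.0%})")

# With a 2-second delay between consecutive requests, the pauses alone take:
pause_seconds = (total - 1) * 2
minutes, seconds = divmod(pause_seconds, 60)
print(f"pauses: {minutes} min {seconds} s")
```

The pauses account for 6 minutes 38 seconds, so the reported runtime of 6 minutes 42 seconds leaves only a few seconds of actual request time across all 200 calls.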

Indexing API response fields: what each means and what to do

Response field / error	Short label	Meaning	Action required	Failure mode risk
INDEXED_STATE_INDEXED	URL is in the index	Google has indexed the page and it is searchable	Verify canonical URL; otherwise no action needed	False sense of security: the page might be indexed under the wrong canonical, so check via the URL Inspection API
INDEXED_STATE_PENDING	URL in crawl queue	Google knows the URL but has not indexed it yet	Wait 24-48 hours or send an update notification via the API	If it persists for more than 7 days, it could indicate crawl budget issues or blocked resources
INDEXED_STATE_INDEXED_NOT_FOUND	URL not in index	Google could not find the page or it returned a 404/410	Check server response, robots.txt, and internal links	May be a false negative if the page is temporarily down; verify with a live fetch
INDEXED_STATE_INVALID_URL	Malformed URL	The URL string is not valid (e.g., missing scheme, invalid characters)	Sanitise the URL before sending	Common with URLs containing spaces, unencoded Unicode, or trailing commas
403 Forbidden	Permission error	Service account lacks access or the OAuth scope is wrong	Request the 'https://www.googleapis.com/auth/indexing' scope when authenticating and add the service account as an owner in Search Console	Often misconfigured; double-check the service account email
429 Too Many Requests	Rate limit hit	Exceeded 200 requests per day or the burst limit	Implement exponential backoff; reduce concurrency to 1	If you ignore backoff, Google may throttle for 24 hours; plan quota across multiple projects if needed
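
One way to put the table to work is a lookup from state to next action. This is a sketch: the state names are the ones used in this guide, while NEXT_ACTION, next_action and the days_in_state argument are hypothetical names introduced for illustration.

```python
# Next step per state, following the "Action required" column above.
NEXT_ACTION = {
    "INDEXED_STATE_INDEXED": "verify canonical URL",
    "INDEXED_STATE_PENDING": "wait 24-48 hours or send an update notification",
    "INDEXED_STATE_INDEXED_NOT_FOUND": "check server response, robots.txt, internal links",
    "INDEXED_STATE_INVALID_URL": "sanitise the URL and resend",
}


def next_action(state: str, days_in_state: int = 0) -> str:
    """Return the follow-up for a state; escalate pending URLs after 7 days."""
    if state == "INDEXED_STATE_PENDING" and days_in_state > 7:
        return "investigate crawl budget or blocked resources"
    # Anything not in the table is routed to a human instead of being guessed at.
    return NEXT_ACTION.get(state, "log and review manually")


print(next_action("INDEXED_STATE_PENDING", days_in_state=2))
print(next_action("INDEXED_STATE_PENDING", days_in_state=9))
```

Keeping the mapping in data rather than in branches makes it easy to adjust when the response values you observe differ from the ones listed here.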

Indexing API status check workflow

Prepare URL list

Extract absolute URLs from sitemap or database. Validate format. Remove duplicates.

Authenticate

Use a service account and request the OAuth scope https://www.googleapis.com/auth/indexing. Load the credentials from an environment variable.

Send getMetadata request

GET /v3/urlNotifications/metadata with the page address in the url query parameter. Use exponential backoff on 429.

Parse response

Read indexingState field. Log each result with timestamp and URL.

Handle errors

Log 403, 429, 500. For 429, wait and retry. For 403, halt and check permissions.

Trigger actions

For pending: send update notification. For not found: inspect page source and server headers.
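
The error-handling step of this workflow can be sketched as a small policy function: halt on 403, back off and retry on 429, log anything else. The names fetch_with_policy and PermissionHalt are hypothetical, fetch is assumed to return a (status code, parsed body) pair, and the 1, 2, 4, 8 second waits are the schedule suggested in this guide.

```python
import time
from typing import Callable, Optional, Tuple

BACKOFF_SECONDS = (1, 2, 4, 8)  # waits applied after successive 429 responses


class PermissionHalt(Exception):
    """Raised on 403 so the whole run stops until permissions are fixed."""


def fetch_with_policy(
    url: str,
    fetch: Callable[[str], Tuple[int, Optional[dict]]],
    sleep: Callable[[float], None] = time.sleep,
    log: Callable[[str], None] = print,
) -> Optional[dict]:
    for attempt in range(len(BACKOFF_SECONDS) + 1):
        status, body = fetch(url)
        if status == 200:
            return body
        if status == 403:
            log(f"403 for {url}: halting to check permissions")
            raise PermissionHalt(url)
        if status == 429 and attempt < len(BACKOFF_SECONDS):
            wait = BACKOFF_SECONDS[attempt]
            log(f"429 for {url}: retrying in {wait}s")
            sleep(wait)
            continue
        # 500s, exhausted retries and any other status are logged and skipped.
        log(f"{status} for {url}: giving up for this run")
        return None
    return None


# Demo: two rate-limit replies, then success. Waits are recorded, not slept.
replies = iter([(429, None), (429, None), (200, {"url": "https://example.com/a"})])
waits = []
body = fetch_with_policy("https://example.com/a", lambda _url: next(replies), sleep=waits.append)
print(body, waits)
```

Raising on 403 instead of returning None is deliberate: a permission problem affects every URL in the run, so continuing would only spend quota on guaranteed failures.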

Edge cases and operational failures you will encounter

In practice, when you automate index status checks, the biggest pain is not the API itself — it is the quality of your URL list. We have seen scripts fail because a CSV had trailing spaces, or because URLs were stored with protocol-relative paths (//example.com/page) that the API rejects. Another common failure: blocked URLs. The API will return 'not found' even if the page exists but is blocked by robots.txt or has a noindex meta tag. The API does not differentiate between a 404 and a blocked page — you must check the server response separately.
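
A minimal cleaning pass for the list problems described above might look like this. It is a sketch: clean_urls is a hypothetical helper, and rejecting anything that is not https follows the checklist in this guide rather than any API rule.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlsplit, urlunsplit


def clean_urls(raw_urls: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (cleaned, rejected): trimmed, fragment-free, de-duplicated https URLs."""
    cleaned: List[str] = []
    rejected: List[str] = []
    seen = set()
    for raw in raw_urls:
        candidate = raw.strip().rstrip(",").strip()  # CSV debris: spaces, trailing commas
        parts = urlsplit(candidate)
        # Protocol-relative (//host/path), non-https and space-containing URLs are rejected.
        if parts.scheme != "https" or not parts.netloc or " " in candidate:
            rejected.append(raw)
            continue
        normalised = urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))
        if normalised not in seen:
            seen.add(normalised)
            cleaned.append(normalised)
    return cleaned, rejected


good, bad = clean_urls(
    [
        "https://example.com/a ",
        "//example.com/page",
        "https://example.com/a#top",
        "https://example.com/b c",
    ]
)
print(good)
print(bad)
```

Returning the rejected rows alongside the cleaned ones lets you log what was dropped instead of silently shrinking the list.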

Empty results also happen. If you query a URL that has never been submitted or crawled, the API returns an empty 404 response. This is not an error — it simply means Google has no record. Do not assume the page is broken; it may just be new. Another operational risk: hitting the 200/day quota in the first hour. Plan your batch size and prioritise high-value pages. If you need more volume, you can request a quota increase (up to 10,000/day for some projects), but that requires business justification and approval.
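
The quota planning described above comes down to ranking and a ceiling division. The sketch below assumes you have a traffic figure per URL; plan_checks and the generated example.com catalogue are illustrative, and 200 is the daily quota used throughout this guide.

```python
import math
from typing import List, Tuple


def plan_checks(
    urls_with_traffic: List[Tuple[str, int]], daily_quota: int = 200
) -> Tuple[List[str], int]:
    """Pick today's batch by traffic and count the days a full pass would take."""
    ranked = sorted(urls_with_traffic, key=lambda pair: pair[1], reverse=True)
    first_batch = [url for url, _visits in ranked[:daily_quota]]
    days_for_full_pass = math.ceil(len(ranked) / daily_quota)
    return first_batch, days_for_full_pass


# Hypothetical catalogue of 1,200 URLs whose traffic equals their index.
catalogue = [(f"https://example.com/p/{n}", n) for n in range(1200)]
first_batch, days = plan_checks(catalogue)
print(len(first_batch), days)
```

For 1,200 URLs at 200 checks per day, a full pass takes 6 days, which is why prioritising the first batch matters.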

Before you deploy your status checker

1. Validate every URL in your list: absolute, https, no trailing spaces, no fragments.
2. Set up a dedicated service account with only the indexing scope and no broader permissions.
3. Implement exponential backoff (1s, 2s, 4s, 8s) on 429 responses and log retries.
4. Create a monitoring dashboard for API errors: separate 403 (permission) from 429 (rate) from 500 (Google).
5. Store results in a database with timestamps so you can track index status over time.
6. Plan for a concurrency of 1-2; do not use parallel requests unless you have a quota buffer.
7. Test on 10 URLs first. Confirm you can parse the response before running 200.
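
For the checklist item on storing results with timestamps, a small SQLite table is enough to start. This is a sketch using Python's standard sqlite3 module; the table name index_checks and its columns are assumptions, not a required schema.

```python
import sqlite3
from datetime import datetime, timezone
from typing import List, Optional, Tuple


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the results store; ':memory:' keeps the demo off disk."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS index_checks ("
        "url TEXT NOT NULL, state TEXT NOT NULL, checked_at TEXT NOT NULL)"
    )
    return conn


def record(conn: sqlite3.Connection, url: str, state: str,
           checked_at: Optional[datetime] = None) -> None:
    """Append one check result with a UTC timestamp in ISO 8601 form."""
    stamp = (checked_at or datetime.now(timezone.utc)).isoformat()
    conn.execute("INSERT INTO index_checks VALUES (?, ?, ?)", (url, state, stamp))
    conn.commit()


def history(conn: sqlite3.Connection, url: str) -> List[Tuple[str, str]]:
    """Return (state, checked_at) rows for one URL, oldest first."""
    rows = conn.execute(
        "SELECT state, checked_at FROM index_checks WHERE url = ? ORDER BY checked_at",
        (url,),
    )
    return rows.fetchall()


store = open_store()
page = "https://example.com/a"
record(store, page, "INDEXED_STATE_PENDING", datetime(2024, 1, 1, tzinfo=timezone.utc))
record(store, page, "INDEXED_STATE_INDEXED", datetime(2024, 1, 3, tzinfo=timezone.utc))
print(history(store, page))
```

Appending a row per check, rather than overwriting one row per URL, is what makes it possible to see how long a page stayed pending.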

FAQ

How do I check the index status of 1,000 URLs using the Google Indexing API batch method?

The Indexing API has no endpoint that accepts a list of URLs; every URL is its own call. You can bundle up to 100 calls into one multipart batch HTTP request, but each call still counts against quota, so batching saves connections rather than quota. For 1,000 URLs, you need to either spread across multiple Google Cloud projects (each with 200/day quota) or request a quota increase. A practical alternative is to use the URL Inspection API for bulk checks via Search Console, but that also has limits. For large batches, consider the Bulk Google Index Checker protocol as a supplementary tool.

What does INDEXED_STATE_PENDING mean in the Google Indexing API response for my blog posts?

It means Google has received the URL and added it to the crawl queue, but has not yet processed it. This is common for new or updated pages. Wait 24-48 hours before rechecking. If the status persists for over a week, check for crawl budget issues, orphan pages, or blocked internal links. You can send an update notification via the API to ask Google to recrawl the URL; this is a request, not a guarantee of faster processing.

Can I use the Google Indexing API to check index status for guest posts on other domains?

Only with the domain owner's cooperation. The API authorises calls against verified Search Console properties, so a service account that is not an owner of the target domain gets a 403. For guest posts, you likely do not own the domain, so you would need to ask the site owner to run the check or share their Search Console data. Alternatively, use the URL Inspection API with their credentials.

What is the difference between the Google Indexing API and the URL Inspection API for status checks?

The Indexing API (getMetadata) returns only the index state (indexed, pending, not found) and crawl time. The URL Inspection API returns richer data: live fetch results, canonical URL, coverage issues, and resource load errors. Use the Indexing API for simple yes/no checks at scale. Use the URL Inspection API for deep diagnostics on a few high-priority URLs. The quotas differ: the Indexing API default discussed here is 200 per day per Google Cloud project, while the URL Inspection API is limited per Search Console property (2,000 queries per day at the time of writing).

How do I handle 403 errors when using the Google Indexing API for my agency clients?

A 403 error means the service account does not have permission to use the Indexing API for that URL. For agency clients, each client domain must be verified in Search Console, and the service account you call with must be added as an owner of that property; a separate service account per client keeps access isolated. Ensure the OAuth scope requested is exactly 'https://www.googleapis.com/auth/indexing', and double-check that the email added in Search Console matches the service account in the key file.

What is the cost of using the Google Indexing API for bulk index status checks?

The Indexing API itself is free, but it has a default quota of 200 requests per day per Google Cloud project. If you need more, you can request an increase (up to 10,000/day with approval). There is no per-request charge. The cost comes from engineering time to build the integration, maintain error handling, and manage credentials. For very large volumes, consider using Search Console's sitemap report as a free alternative.

Why does the Google Indexing API return 'not found' for URLs that exist on my server?

The API returns INDEXED_STATE_INDEXED_NOT_FOUND when Google has no record of the page in its index. This can happen if the page is new, blocked by robots.txt, or has a noindex directive. The API does not distinguish between a true 404 and a blocked page. To debug, perform a live fetch using the URL Inspection API or verify the server response with curl. Also check that your internal links point to the correct URL.

How do I automate Google Indexing API status checks every hour without hitting rate limits?

With a 200/day quota, you cannot check thousands of URLs every hour. Instead, prioritise: check only the most important pages (e.g., top 200 by revenue or traffic) once per day. For hourly checks, use a rolling window: check 50 URLs every 6 hours, or request a quota increase. Alternatively, use webhook triggers: check a URL only when it is updated or when a sitemap is submitted. This keeps you under the limit.
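
The rolling window mentioned above (50 URLs every 6 hours, which is 4 runs and 200 checks per day) can be sketched as a slice that advances with each run. The function rolling_slice and the run_index counter are hypothetical; your scheduler would supply the counter.

```python
from typing import List


def rolling_slice(urls: List[str], per_run: int, run_index: int) -> List[str]:
    """Return the URLs to check on a given run, wrapping around the list."""
    if not urls:
        return []
    size = min(per_run, len(urls))
    start = (run_index * per_run) % len(urls)
    # Doubling the list lets a slice that runs past the end wrap to the start.
    return (urls + urls)[start:start + size]


priority = [f"https://example.com/p/{n}" for n in range(200)]
for run in range(4):  # four runs a day, six hours apart
    batch = rolling_slice(priority, per_run=50, run_index=run)
    print(run, batch[0], batch[-1])
```

With 200 priority URLs, every URL is checked exactly once per day and the fifth run starts over at the top of the list.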

What are the most common errors when using the Google Indexing API for backlink monitoring?

The most common errors are 429 (rate limit) when sending too many requests in parallel, and 403 (permission) when the service account does not own the domain. For backlink monitoring, you are checking URLs you do not control, so you must have access to those domains via Search Console. Also, the API may return INDEXED_STATE_INDEXED_NOT_FOUND for backlinks that Google has not crawled yet, which is normal for new links.
