Published on 2025-08-07T06:18:08Z

Googlebot-News

Googlebot-News is the user-agent token used to control which content on a publisher's site is included in Google News. While the crawling itself is now done by the main Googlebot, the Googlebot-News token in a robots.txt file gives news publishers specific control over their content's inclusion in the news service. For publishers, inclusion in Google News can be a major driver of visibility and traffic for timely articles.

What is Googlebot-News?

Googlebot-News is a specialized user-agent identifier used by Google to discover and index content for its Google News service. Although it was once a distinct crawler, Google has since consolidated its news crawling into the primary Googlebot. However, Google continues to honor the Googlebot-News directive in robots.txt files, giving publishers granular control over which of their content is considered for Google News, separate from regular search results. This allows publishers to manage their presence in the news ecosystem specifically.

Why is my site being crawled for Google News?

If Google is crawling your site for news content (which is now done by the main Googlebot), it is because Google is evaluating your content for inclusion in the Google News service. The crawler is specifically looking for recent news articles, press releases, and other journalistic content that meets Google's News content policies. The frequency of visits is typically high for sites that publish breaking news, as the service aims to be as up-to-date as possible. This is a beneficial crawling activity that can significantly boost a publisher's visibility.

What is the purpose of Googlebot-News?

The purpose of the Googlebot-News directive and the associated crawling is to build and maintain the content index for Google News, a specialized news aggregation service. It focuses exclusively on journalistic content, organizing stories by topic and presenting diverse perspectives on current events. For publishers, inclusion in Google News can drive a substantial amount of traffic, especially for time-sensitive articles, and can significantly broaden a publication's audience reach.

How do I block Googlebot-News?

If you are a news publisher and you wish to prevent your content from being included in Google News, you can use a specific rule in your robots.txt file. This will not affect your site's visibility in regular Google Search results.

To block your content from Google News, add the following lines to your robots.txt file:

User-agent: Googlebot-News
Disallow: /
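
Google reads the Googlebot-News rules as their own group, so they can sit alongside rules for the general crawler. The example below is a sketch assuming a hypothetical /press-releases/ directory that should be kept out of Google News while the rest of the site, and regular Search, remain unaffected:

# Keep one directory out of Google News only (hypothetical path)
User-agent: Googlebot-News
Disallow: /press-releases/

# Leave regular Google Search crawling unrestricted
User-agent: Googlebot
Disallow: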

How do I verify the authenticity of a user-agent claiming to be operated by Google?

Reverse IP lookup technique

To verify a user-agent's authenticity, run the Linux host command twice, starting with the IP address of the requester.
  1. > host IPAddressOfRequest
    This command returns the reverse DNS hostname for the IP (for a genuine Googlebot address, something like crawl-66-249-66-1.googlebot.com).
  2. > host HostnameFromTheOutputOfTheFirstCommand
If the output of the second command resolves back to the original IP address and the hostname belongs to a domain associated with a trusted operator (e.g., googlebot.com or google.com for Google), the user-agent can be considered legitimate.
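
This two-step check (forward-confirmed reverse DNS) can be automated. The following is a minimal Python sketch using only the standard library; the function name and the example IP address are illustrative, and the trusted domain suffixes are the googlebot.com and google.com domains Google documents for its crawlers.

import socket

# Trusted hostname suffixes for Google's crawlers
GOOGLE_DOMAINS = (".googlebot.com", ".google.com")

def is_google_crawler(ip: str) -> bool:
    """Forward-confirmed reverse DNS: PTR lookup, then resolve the hostname back."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # step 1: reverse lookup
    except socket.herror:
        return False
    if not hostname.endswith(GOOGLE_DOMAINS):                # hostname must be on a trusted domain
        return False
    try:
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # step 2: forward lookup
    except socket.gaierror:
        return False
    return ip in forward_ips                                  # must map back to the original IP

# Example with a hypothetical requester IP taken from a server log:
print(is_google_crawler("66.249.66.1"))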

IP list lookup technique

Some operators provide a public list of IP addresses used by their crawlers. This list can be cross-referenced to verify a user-agent's authenticity. However, both operators and website owners may find it challenging to maintain an up-to-date list, so use this method with caution and in conjunction with other verification techniques.
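
As an illustration, the Python sketch below checks a requester's IP against the machine-readable list of Googlebot address ranges that Google publishes. The URL and JSON layout reflect the published format at the time of writing and may change, and the example IP is hypothetical.

import ipaddress
import json
import urllib.request

# Google's published Googlebot IP ranges (URL current at the time of writing)
GOOGLEBOT_RANGES_URL = "https://developers.google.com/search/apis/ipranges/googlebot.json"

def load_googlebot_networks():
    """Download the published list and parse it into network objects."""
    with urllib.request.urlopen(GOOGLEBOT_RANGES_URL) as response:
        data = json.load(response)
    networks = []
    for entry in data.get("prefixes", []):
        prefix = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if prefix:
            networks.append(ipaddress.ip_network(prefix))
    return networks

def ip_in_googlebot_ranges(ip: str, networks) -> bool:
    """True if the IP falls inside any published Googlebot range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)

# Example with a hypothetical requester IP taken from a server log:
networks = load_googlebot_networks()
print(ip_in_googlebot_ranges("66.249.66.1", networks))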