Published on 2025-08-07T06:18:08Z

NewsNow bot

The NewsNow bot is the web crawler for NewsNow, a real-time news aggregation and monitoring service. Its purpose is to scan publisher websites to collect and index news headlines and articles for its various topical newsfeeds. For publishers, being included in NewsNow's service can increase visibility and drive traffic from a global audience interested in specific news categories.

What is the NewsNow bot?

The NewsNow bot is the web crawler for the real-time news aggregation service NewsNow. The service operates several regional editions, making it a global news monitoring platform. The bot functions as a specialized search spider, scanning publisher websites for headlines and articles, which it then automatically categorizes and files into the appropriate newsfeeds. It identifies itself in server logs with the user-agent string NewsNow.
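
To see how often the bot visits, one option is to scan your access logs for its user-agent token. The following is a minimal sketch, assuming the logs use the widely used "combined" format, where the user agent is the last double-quoted field on each line. The function name and the sample log lines are hypothetical.

```python
import re

# Matches the last double-quoted field on a line, which is the
# user agent in the "combined" access log format (an assumption).
_UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def requests_from_agent(log_lines, token="NewsNow"):
    """Return the log lines whose user-agent field contains `token`."""
    matches = []
    for line in log_lines:
        found = _UA_FIELD.search(line)
        if found and token.lower() in found.group(1).lower():
            matches.append(line.rstrip("\n"))
    return matches


# Hypothetical sample lines using documentation IP addresses.
SAMPLE_LOG = [
    '192.0.2.10 - - [07/Aug/2025:06:18:08 +0000] "GET /news/ HTTP/1.1" 200 512 "-" "NewsNow"',
    '198.51.100.7 - - [07/Aug/2025:06:19:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

for hit in requests_from_agent(SAMPLE_LOG):
    print(hit)
```

A user-agent string is easy to spoof, so a match here only shows what the client claimed to be.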

Why is the NewsNow bot crawling my site?

The NewsNow bot is visiting your site to scan for news content to include in its aggregation service. It is specifically looking for headlines and articles to categorize for its topical newsfeeds. The bot visits sites that are part of its publisher network or those it has identified as potential news sources. The frequency of visits depends on how often you publish new content. The crawling is generally considered authorized if a site has joined the NewsNow publisher network or its robots.txt file permits access.

What is the purpose of the NewsNow bot?

The purpose of the NewsNow bot is to support the NewsNow news aggregation platform. The platform collects headlines from numerous sources and organizes them into topical feeds, providing users with a centralized place to access current news. For publishers, NewsNow offers the benefit of increased visibility and traffic. By having your content featured on the platform, you can reach a broader audience interested in your specific topic areas, making the service a valuable distribution channel.

How do I block the NewsNow bot?

To prevent the NewsNow bot from accessing your website, you can add a disallow rule for it in your robots.txt file. This is the standard method for managing access for web crawlers.

Add the following lines to your robots.txt file to block the NewsNow bot:

User-agent: NewsNow
Disallow: /
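
Before deploying the rule, you may want to check that it behaves as intended. The sketch below uses Python's standard urllib.robotparser module to evaluate the two lines above. The helper name is_allowed and the example.com URL are illustrative assumptions, and this only shows how a standards-following parser reads the rule, not how the NewsNow bot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The rule from this section, as it would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: NewsNow
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain.
print(is_allowed(ROBOTS_TXT, "NewsNow", "https://example.com/news/story.html"))       # False
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/news/story.html"))  # True
```

Other crawlers stay unaffected because the rule names only the NewsNow user-agent.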

How do I verify the authenticity of the NewsNow bot's user-agent?

Reverse IP lookup technique

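To verify that a request really comes from the claimed operator, run the Linux host command twice, starting with the IP address of the requester.
  1. > host IPAddressOfRequest
    This command returns the reverse DNS (PTR) hostname for the IP address. For example, for 8.8.4.4 the output is "4.4.8.8.in-addr.arpa domain name pointer dns.google.", so the hostname is dns.google.
  2. > host ReverseDNSFromTheOutputOfFirstRequest
    This command resolves that hostname back to one or more IP addresses.
If the output of the second command includes the original IP address and the hostname's domain belongs to a trusted operator (e.g., NewsNow), the user-agent can be considered legitimate.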
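
The two host lookups can also be automated. The following is a minimal sketch using Python's standard socket module. The function names are hypothetical, and the trusted domain suffixes are a parameter you must supply yourself: this page does not state which hostnames NewsNow's crawler resolves to, so example.com below is only a placeholder.

```python
import socket


def reverse_lookup(ip):
    """Step 1: return the PTR hostname for `ip`, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Step 2: return the set of IP addresses `hostname` resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except OSError:
        return set()
    return {info[4][0] for info in infos}


def is_verified_crawler(ip, trusted_suffixes,
                        reverse=reverse_lookup, forward=forward_lookup):
    """Return True if `ip` passes both lookups for a trusted domain.

    `trusted_suffixes` is a list of domains you trust (an assumption
    supplied by the caller). The resolver arguments exist so the logic
    can be exercised without network access.
    """
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    trusted = any(
        hostname == suffix or hostname.endswith("." + suffix)
        for suffix in trusted_suffixes
    )
    if not trusted:
        return False
    # Plain string comparison: fine for IPv4, while IPv6 addresses may
    # need normalising before they are compared.
    return ip in forward(hostname)


# Usage with a placeholder suffix; replace it with a domain you trust.
# is_verified_crawler("192.0.2.10", ["example.com"])
```

Checking the suffix on a label boundary (".example.com") keeps a lookalike such as evil-example.com from passing.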

IP list lookup technique

Some operators provide a public list of IP addresses used by their crawlers. This list can be cross-referenced to verify a user-agent's authenticity. However, both operators and website owners may find it challenging to maintain an up-to-date list, so use this method with caution and in conjunction with other verification techniques.
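
If an operator does publish its address ranges, the cross-reference can be sketched with Python's standard ipaddress module. The function name is hypothetical and the ranges below are reserved documentation addresses used as placeholders, not real NewsNow ranges.

```python
import ipaddress


def ip_in_published_ranges(ip, cidr_ranges):
    """Return True if `ip` falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses and networks of the same IP version.
        if address.version == network.version and address in network:
            return True
    return False


# Placeholder ranges (documentation address space), not a real crawler list.
PUBLISHED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]

print(ip_in_published_ranges("192.0.2.44", PUBLISHED_RANGES))    # True
print(ip_in_published_ranges("198.51.100.7", PUBLISHED_RANGES))  # False
```

Because such a list can go stale, a negative result is better treated as a prompt for the reverse DNS check than as proof of spoofing.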