Published on 2025-08-07T06:18:08Z

Meta-ExternalFetcher

Meta-ExternalFetcher is a specialized, on-demand web crawler from Meta (formerly Facebook). It is used by Meta's AI products, such as chatbots in WhatsApp and Instagram, to retrieve real-time information from the web. Unlike a traditional crawler, it activates only when a user asks a question that requires current data from a specific URL. This allows Meta's AI to provide up-to-date, verifiable answers, and can increase a website's visibility when it is cited as a source.

What is Meta-ExternalFetcher?

Meta-ExternalFetcher is a web crawler from Meta that performs user-initiated fetches to support real-time functions in its AI products, such as chatbots on WhatsApp, Instagram, and Facebook. It is not a traditional crawler that systematically indexes the web; instead, it operates on-demand. When a Meta AI user asks for information that is not in the model's training data, this bot is sent to fetch the content from a specific URL. It identifies itself in server logs with the user-agent string meta-externalfetcher/1.1. This behavior distinguishes it from Meta's other crawlers that are used for AI training or search indexing.
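To see whether this bot has been visiting your site, you can search your server access logs for that user-agent token. The sketch below assumes the common "combined" log format, where the client IP is the first field and the user-agent is the last quoted field; the sample lines and IPs are illustrative only.

```python
import re

# Case-insensitive match on the user-agent token reported by this bot.
UA_PATTERN = re.compile(r"meta-externalfetcher", re.IGNORECASE)

def find_fetcher_hits(log_lines):
    """Return (client_ip, full_line) pairs for requests made by this bot."""
    hits = []
    for line in log_lines:
        if UA_PATTERN.search(line):
            ip = line.split(" ", 1)[0]  # in combined log format, field 1 is the client IP
            hits.append((ip, line))
    return hits

# Illustrative combined-log-format sample lines (documentation IPs).
sample = [
    '203.0.113.7 - - [01/Aug/2025:12:00:00 +0000] "GET /article HTTP/1.1" 200 512 "-" "meta-externalfetcher/1.1"',
    '198.51.100.9 - - [01/Aug/2025:12:00:01 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]
print(find_fetcher_hits(sample)[0][0])  # prints 203.0.113.7
```

Matching on the token alone is deliberate: the version suffix (such as /1.1) may change over time, so anchoring the pattern to the bare name keeps the check robust.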

Why is Meta-ExternalFetcher crawling my site?

Meta-ExternalFetcher is visiting your website because a user of one of Meta's AI products has specifically requested information from your site. This occurs when a user asks the AI to retrieve information from a URL or to verify a claim by checking a source. The frequency of its visits therefore depends entirely on how often users direct the AI to your content. Because each visit is targeted and user-initiated, the bot favors pages whose content can be parsed for an immediate answer.

What is the purpose of Meta-ExternalFetcher?

The purpose of Meta-ExternalFetcher is to act as a bridge between Meta's AI systems and real-time web data, allowing the AI to provide users with up-to-date information. The data it collects is used to generate an immediate response to a user's query and is not used for training AI models. This allows Meta AI to provide more accurate and verifiable answers. For website owners, this can increase the visibility of your content within Meta's AI ecosystem, as your site may be referenced as a source of information.

How do I block Meta-ExternalFetcher?

If you wish to prevent Meta's AI products from fetching content from your site on-demand, you can add a disallow rule to your robots.txt file. This is the standard method for managing crawler access.

To block this bot, add the following lines to your robots.txt file:

User-agent: meta-externalfetcher
Disallow: /
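Before deploying such a rule, you can sanity-check it with Python's standard urllib.robotparser. The example below feeds the two lines above to the parser and confirms they block this bot while leaving others unaffected; example.com and SomeOtherBot are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The same two lines recommended for robots.txt above.
rules = [
    "User-agent: meta-externalfetcher",
    "Disallow: /",
]

parser = RobotFileParser()
parser.modified()      # record a fetch time so can_fetch() treats the rules as loaded
parser.parse(rules)

# The named bot is blocked; user-agent matching is case-insensitive.
print(parser.can_fetch("meta-externalfetcher", "https://example.com/article"))  # prints False
```

Because the rules name only meta-externalfetcher and there is no wildcard entry, every other user-agent remains allowed by default.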

How do I verify the authenticity of a user-agent operated by Meta?

Reverse DNS lookup technique

To verify a user-agent's authenticity, run the Linux host command twice, starting with the IP address of the requester.
  1. > host IPAddressOfRequest
    This performs a reverse DNS lookup and returns the hostname associated with the IP address (for example, host 8.8.4.4 returns dns.google).
  2. > host HostnameFromTheOutputOfTheFirstCommand
    This performs a forward lookup on that hostname.
If the forward lookup returns the original IP address and the hostname belongs to a domain operated by a trusted party (e.g., Meta), the user-agent can be considered legitimate.
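The two commands above amount to a forward-confirmed reverse DNS check, which is straightforward to script. The sketch below uses Python's socket module; the fbsv.net suffix in the usage example is an illustrative assumption rather than an official list of Meta domains, so substitute whatever domains the operator documents.

```python
import socket

def hostname_is_trusted(hostname, trusted_suffixes):
    """Pure check: does the reverse-DNS hostname fall under a trusted domain?"""
    return any(hostname == s or hostname.endswith("." + s) for s in trusted_suffixes)

def verify_requester(ip, trusted_suffixes):
    """Forward-confirmed reverse DNS: reverse-resolve ip, vet the domain,
    then forward-resolve the hostname and require it to round-trip to ip."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)        # step 1: reverse lookup
        if not hostname_is_trusted(hostname, trusted_suffixes):
            return False
        _, _, addrs = socket.gethostbyname_ex(hostname)  # step 2: forward lookup
        return ip in addrs
    except (socket.herror, socket.gaierror):
        return False                                     # unresolvable either way

# Offline example of the domain check (fbsv.net is an assumed suffix):
print(hostname_is_trusted("crawl.example.fbsv.net", ["fbsv.net"]))  # prints True
```

The suffix test matches whole labels only, so a hostname like fbsv.net.evil.example is rejected; a plain substring check would let an attacker embed a trusted name inside a hostname they control.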

IP list lookup technique

Some operators provide a public list of IP addresses used by their crawlers. This list can be cross-referenced to verify a user-agent's authenticity. However, both operators and website owners may find it challenging to maintain an up-to-date list, so use this method with caution and in conjunction with other verification techniques.
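When an operator does publish its ranges, the cross-reference itself is simple with Python's ipaddress module. The CIDR blocks below are documentation-only placeholders (RFC 5737 and RFC 3849), not real Meta ranges; substitute the operator's published list before relying on the result.

```python
import ipaddress

# Placeholder ranges for illustration; replace with the operator's
# actual published CIDR list.
PUBLISHED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

def ip_in_published_ranges(ip_str, ranges=PUBLISHED_RANGES):
    """Return True if ip_str falls inside any published CIDR range.
    Membership tests across IPv4/IPv6 simply evaluate to False."""
    ip = ipaddress.ip_address(ip_str)
    return any(ip in net for net in ranges)

print(ip_in_published_ranges("203.0.113.55"))  # prints True  (inside the sample /24)
print(ip_in_published_ranges("198.51.100.1"))  # prints False (outside all ranges)
```

Meta's address space is reportedly often derived from its autonomous system (AS32934) via a whois query, but obtaining the list is outside this sketch; however the list is sourced, refresh it regularly, since stale ranges cause both false rejections and false approvals.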