Published on 2025-08-07T06:18:08Z

YandexBot MirrorDetector

YandexBot MirrorDetector is a specialized web crawler operated by the Russian search engine Yandex. Its purpose is to identify duplicate or 'mirror' websites to improve the quality of Yandex's search results. By determining the most authoritative version of a piece of content, it helps ensure that original publishers are prioritized over sites that scrape or copy their content.

What is YandexBot MirrorDetector?

YandexBot MirrorDetector is a web crawler from Yandex that is designed specifically to identify duplicate content across the web. The bot identifies itself in server logs with the user-agent string Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; +http://yandex.com/bots). It focuses on analyzing and comparing the structure and content of websites to identify those that are mirrors of one another, which helps Yandex maintain a cleaner search index.
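
To spot these visits in server logs, the user-agent string quoted above can be matched with a small helper. The following is a minimal sketch in Python; the function name is_mirror_detector_ua and the exact pattern are illustrative assumptions, not something published by Yandex. A match only shows what the client claims to be, since the header can be spoofed.

```python
import re

# Pattern built from the user-agent string quoted above.
# Assumption: the version number after "YandexBot/" may change over time.
_UA_PATTERN = re.compile(r"YandexBot/\d+(?:\.\d+)*;\s*MirrorDetector\b")


def is_mirror_detector_ua(user_agent: str) -> bool:
    """Return True if the header claims to be YandexBot MirrorDetector."""
    return bool(_UA_PATTERN.search(user_agent))


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (compatible; YandexBot/3.0; MirrorDetector; "
          "+http://yandex.com/bots)")
    print(is_mirror_detector_ua(ua))
```

Requests flagged this way still need to be verified by IP address before they are trusted.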

Why is YandexBot MirrorDetector crawling my site?

YandexBot MirrorDetector is visiting your site to determine if its content is original or a duplicate of content found elsewhere. This is part of Yandex's effort to improve search quality by correctly handling mirror sites. The crawler typically visits less frequently than the main Yandex indexing bot. Its activity may increase if your site's content appears similar to other websites or if you have recently launched a new site with content from another domain.

What is the purpose of YandexBot MirrorDetector?

The purpose of YandexBot MirrorDetector is to maintain the integrity of the Yandex search index by identifying duplicate content. This allows Yandex to deliver more diverse and relevant search results. When the bot identifies mirror sites, Yandex can then prioritize which version to show in search results, typically the most authoritative one. For website owners, this is beneficial as it helps prevent content scraping sites from outranking the original source and helps consolidate ranking signals to the canonical version of your content.

How do I block YandexBot MirrorDetector?

To prevent YandexBot MirrorDetector from analyzing your site, you can add a specific disallow rule to your robots.txt file. Note that Yandex recommends using other methods, like 301 redirects, to indicate the main mirror of a site. In robots.txt, this bot is addressed by the token YandexMirrorDetector rather than by its full user-agent string.

To block this bot, add the following lines to your robots.txt file:

User-agent: YandexMirrorDetector
Disallow: /
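
Before deploying the rule, it can be checked locally. The sketch below uses Python's standard urllib.robotparser to confirm that the two lines above block the YandexMirrorDetector token while leaving other crawlers unaffected. The URL and the name SomeOtherBot are placeholders, and Python's matching logic is only an approximation of how Yandex itself interprets robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The rule from the section above.
RULES = """\
User-agent: YandexMirrorDetector
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(RULES)
# Placeholder URL; any path on the site is covered by "Disallow: /".
url = "https://example.com/any/page"
blocked = not parser.can_fetch("YandexMirrorDetector", url)
others_allowed = parser.can_fetch("SomeOtherBot", url)

if __name__ == "__main__":
    print("MirrorDetector blocked:", blocked)
    print("Other bots allowed:", others_allowed)
```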

How do I verify that a request really comes from Yandex?

Reverse IP lookup technique

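To verify that a user-agent is authentic, run the Linux host command twice, starting with the IP address of the requester.
  1. > host IPAddressOfRequest
    This command performs a reverse lookup and returns the hostname that the IP address points to (e.g., for 8.8.4.4 the output is "4.4.8.8.in-addr.arpa domain name pointer dns.google.", so the hostname is dns.google).
  2. > host ReverseDNSFromTheOutputOfFirstRequest
    This command performs a forward lookup and returns the IP addresses assigned to that hostname.
If the output of the second command includes the original IP address and the hostname belongs to a domain of a trusted operator (e.g., Yandex), the user-agent can be considered legitimate.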
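
The two lookups above can also be automated. The following Python sketch performs the reverse lookup, checks the hostname suffix, then confirms it with a forward lookup. The function name is_verified_crawler is hypothetical, and the suffix list (.yandex.ru, .yandex.net, .yandex.com) is an assumption that should be checked against Yandex's current documentation. The resolver functions are passed in as parameters so the logic can be exercised without network access.

```python
import ipaddress
import socket

# Assumed hostname suffixes for Yandex crawlers; verify before relying on them.
YANDEX_SUFFIXES = (".yandex.ru", ".yandex.net", ".yandex.com")


def _reverse(ip):
    """Reverse lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname):
    """Forward lookup: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def is_verified_crawler(ip, suffixes=YANDEX_SUFFIXES,
                        reverse=_reverse, forward=_forward):
    """Return True if ip passes forward-confirmed reverse DNS."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not hostname.endswith(tuple(suffixes)):
        return False
    try:
        resolved = {ipaddress.ip_address(a) for a in forward(hostname)}
    except (OSError, ValueError):
        return False
    return ipaddress.ip_address(ip) in resolved
```

Calling is_verified_crawler with the IP address taken from a log entry uses the system resolver by default. Because DNS lookups are slow, results are usually worth caching per IP address.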

IP list lookup technique

Some operators provide a public list of IP addresses used by their crawlers. This list can be cross-referenced to verify a user-agent's authenticity. However, both operators and website owners may find it challenging to maintain an up-to-date list, so use this method with caution and in conjunction with other verification techniques.
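
As an illustration of the cross-referencing step, the Python sketch below checks whether a requester's IP address falls inside any range from such a list. The ranges shown are reserved documentation networks used purely as placeholders, not real Yandex addresses, and the function name ip_in_ranges is hypothetical. A real list would have to be obtained from the operator and refreshed regularly.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), NOT real crawler addresses.
EXAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def ip_in_ranges(ip, cidr_ranges):
    """Return True if ip belongs to at least one of the CIDR ranges."""
    address = ipaddress.ip_address(ip)
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses and networks of the same IP version.
        if address.version == network.version and address in network:
            return True
    return False


if __name__ == "__main__":
    print(ip_in_ranges("192.0.2.55", EXAMPLE_RANGES))
    print(ip_in_ranges("198.51.100.7", EXAMPLE_RANGES))
```

A list match is best treated as supporting evidence alongside the reverse IP lookup rather than as proof on its own.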