Published on 2025-08-07T06:18:08Z

Hatena bot

The Hatena bot is a collection of web crawlers operated by the Japanese internet services company Hatena. These bots support Hatena's various platforms, including their popular social bookmarking and blogging services. Their activity is triggered by user actions, such as when a user bookmarks a page. For website owners, this can be a source of traffic from a primarily Japanese audience.

What is the Hatena bot?

The Hatena bot is not a single entity but a collection of specialized web crawlers from the Japanese company Hatena. These bots support different functions within Hatena's ecosystem of social and content services. They identify themselves with distinct user-agent strings, such as HatenaBlog-bot, HatenaBookmark, and Hatena-Favicon. Each bot serves a specific purpose, from analyzing content for bookmark previews to retrieving favicons. They operate primarily from Japanese IP addresses and are designed to respect standard web protocols.

Why is the Hatena bot crawling my site?

A Hatena bot is visiting your website because a user of one of Hatena's services has interacted with your content. The most common trigger is a user bookmarking one of your pages on Hatena Bookmark. When this happens, a crawler visits the page to extract metadata and generate a preview. Other triggers include a reference to your site on a Hatena Blog or a need to fetch your site's favicon. How often the bots visit depends on how often Hatena's users engage with your content. These visits are a normal part of how social bookmarking and blogging services work.

What is the purpose of the Hatena bot?

The purpose of the Hatena bots is to support the network of social media and content discovery services that are popular in Japan, such as Hatena Bookmark and Hatena Blog. The data they collect allows Hatena to provide rich previews of bookmarked content, analyze relationships between websites, and enhance user experience through content recommendations. For website owners, having your content bookmarked or featured on a Hatena service can increase your visibility and drive traffic from the Japanese internet ecosystem.

How do I block the Hatena bot?

To block Hatena's crawlers, add rules to your robots.txt file. Because the crawlers use several different user-agent strings, you may need a separate rule group for each one you want to block. The example below covers the crawler for the blog service.

To block the HatenaBlog bot, add the following lines to your robots.txt file:

User-agent: HatenaBlog-bot
Disallow: /
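
One way to sanity-check rules like these before deploying them is Python's standard urllib.robotparser module. The sketch below is only an illustration: it parses a rule set held in memory and asks whether a given user-agent token may fetch a URL. The second rule group for HatenaBookmark, the is_allowed helper, and the example.com URL are assumptions added for the demonstration; whether each Hatena crawler matches on exactly these tokens is not confirmed here.

```python
from urllib.robotparser import RobotFileParser

# The HatenaBlog-bot rule comes from the text above. The HatenaBookmark
# group is an assumption added to show one rule group per user-agent.
ROBOTS_TXT = """\
User-agent: HatenaBlog-bot
Disallow: /

User-agent: HatenaBookmark
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder URL for illustration.
    for agent in ("HatenaBlog-bot", "HatenaBookmark", "SomeOtherBot"):
        print(agent, is_allowed(agent, "https://example.com/page"))
```

Note that robots.txt only affects crawlers that choose to honor it, so this check tells you what the rules say, not what a given client will do.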

How do I verify the authenticity of a user-agent operated by Hatena Corporation?

Reverse IP lookup technique

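To verify a user-agent's authenticity, run the Linux host command twice, starting with the IP address of the requester.
  1. > host IPAddressOfRequest
    This command performs a reverse lookup and returns the hostname that the IP address points to. For example, for 8.8.4.4 the output reads "4.4.8.8.in-addr.arpa domain name pointer dns.google.", so the hostname is dns.google.
  2. > host ReverseDNSFromTheOutputOfFirstRequest
    This command performs a forward lookup and returns the IP addresses registered for that hostname.
If the output of the second command includes the original IP address and the hostname belongs to a domain of a trusted operator (e.g., Hatena Corporation), the user-agent can be considered legitimate.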
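
The two lookups described above can also be scripted. The following Python sketch is a minimal illustration, not an official method: it does a reverse lookup, checks the hostname against a list of trusted suffixes, and then confirms that a forward lookup returns the original IP address. The TRUSTED_SUFFIXES value is a hypothetical placeholder, not a hostname pattern published by Hatena, and the function names are made up for this example. The resolver functions are passed in as parameters so the logic can be exercised without network access.

```python
import socket
from typing import Callable, Iterable, List

# Hypothetical placeholder. Replace it with hostname suffixes you have
# confirmed belong to the operator; this value is NOT taken from Hatena.
TRUSTED_SUFFIXES = (".example.com",)


def reverse_lookup(ip: str) -> str:
    """Step 1: reverse (PTR) lookup, IP address -> hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Step 2: forward lookup, hostname -> IPv4 addresses (IPv4 only)."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified(
    ip: str,
    trusted_suffixes: Iterable[str] = TRUSTED_SUFFIXES,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Return True only if both lookups agree and the hostname is trusted."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        # The suffixes start with a dot, so "evilexample.com" does not match.
        if not hostname.endswith(tuple(trusted_suffixes)):
            return False
        # The forward lookup must lead back to the requesting IP address.
        return ip in forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return False
```

Calling is_verified("192.0.2.10") with the default resolvers performs live DNS queries, so the result depends on the network environment in which it runs.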

IP list lookup technique

Some operators provide a public list of IP addresses used by their crawlers. This list can be cross-referenced to verify a user-agent's authenticity. However, both operators and website owners may find it challenging to maintain an up-to-date list, so use this method with caution and in conjunction with other verification techniques.
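
Cross-referencing an address against a published list can be sketched with Python's standard ipaddress module. The ranges below are placeholders taken from the address blocks reserved for documentation; they are not Hatena's addresses, and the names PUBLISHED_RANGES and ip_in_ranges are hypothetical. In practice the list would come from whatever source the operator publishes, refreshed regularly.

```python
import ipaddress
from typing import Iterable

# Placeholder ranges from the reserved documentation blocks.
# They are NOT Hatena's addresses; substitute the operator's published list.
PUBLISHED_RANGES = ("192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32")


def ip_in_ranges(ip: str, ranges: Iterable[str] = PUBLISHED_RANGES) -> bool:
    """Return True if ip falls inside any of the given CIDR ranges."""
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        # Not a valid IPv4 or IPv6 address.
        return False
    for cidr in ranges:
        network = ipaddress.ip_network(cidr)
        # Compare only within the same IP version (IPv4 vs IPv6).
        if address.version == network.version and address in network:
            return True
    return False
```

A match against the list is best treated as one signal alongside the reverse lookup, since a stale list can produce both false positives and false negatives.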