Bytespider vs. TikTokSpider

Understanding the two ByteDance crawlers and how they differ

Two separate crawlers

ByteDance runs two crawlers that are frequently confused with each other:

TikTokSpider – fetches pages to generate TikTok link previews. It respects robots.txt, crawls on demand, and limits its scope to extracting Open Graph (OG) metadata.

Bytespider – general-purpose crawler for AI training, search indexing, and content analysis. Its crawling patterns are aggressive, and it is commonly blocked.

Why the distinction matters

Many site operators block Bytespider to prevent AI training on their content. If the block is too broad (blocking all ByteDance IPs, or using a wildcard that catches both bots), TikTok link previews break as collateral damage.

How to allow TikTokSpider while blocking Bytespider

User-agent: TikTokSpider
Allow: /

User-agent: Bytespider
Disallow: /

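Parsers that follow the Robots Exclusion Protocol (RFC 9309) select the group whose User-agent token matches the crawler most specifically, so group order should not matter to them. Listing the TikTokSpider group first is still a sensible precaution for simpler, order-sensitive parsers, and it keeps the exception visible above any broader disallow rules, such as a User-agent: * group.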
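
One way to sanity-check the rules above before deploying them is to run them through a robots.txt parser. The sketch below uses Python's standard urllib.robotparser module. The helper name is_allowed and the example.com URL are placeholders invented for this illustration, and Python's parser is only one implementation, so a passing check here does not guarantee that either crawler interprets the file the same way.

```python
from urllib.robotparser import RobotFileParser

# The rules from this section, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: TikTokSpider
Allow: /

User-agent: Bytespider
Disallow: /
"""

# Placeholder URL for illustration; substitute a page from your own site.
PAGE = "https://example.com/articles/launch"


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the bare product token; this parser does not dig the token
    # out of a full "Mozilla/5.0 (...)" user agent string.
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("TikTokSpider", "Bytespider"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, PAGE) else "blocked"
        print(f"{agent}: {verdict}")
```

With the rules as written, this check should report TikTokSpider as allowed and Bytespider as blocked.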

Identifying each bot in server logs

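Search the User-Agent field of each logged request for the crawler's token:

  • TikTokSpider in the user agent string identifies the link preview fetcher
  • Bytespider in the user agent string identifies the general-purpose crawler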
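
The token check in the list above can be sketched in a few lines of Python. The function names and the sample user agent strings are made up for illustration and are not copied from real ByteDance traffic. The sketch also assumes you have already extracted the User-Agent field from each log line, since log formats vary by server.

```python
from collections import Counter

# Tokens named in this section; matching is a case-insensitive substring search.
BOT_TOKENS = ("TikTokSpider", "Bytespider")


def classify_user_agent(user_agent: str) -> str:
    """Return the ByteDance crawler token found in the string, or "other"."""
    lowered = user_agent.lower()
    for token in BOT_TOKENS:
        if token.lower() in lowered:
            return token
    return "other"


def count_bots(user_agents) -> Counter:
    """Tally requests per crawler across an iterable of user agent strings."""
    return Counter(classify_user_agent(ua) for ua in user_agents)


# Hypothetical user agent strings, invented for this example.
SAMPLE_USER_AGENTS = [
    "Mozilla/5.0 (compatible; TikTokSpider; example)",
    "Mozilla/5.0 (compatible; Bytespider; example)",
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (compatible; Bytespider; example)",
]

if __name__ == "__main__":
    for bot, hits in count_bots(SAMPLE_USER_AGENTS).items():
        print(f"{bot}: {hits}")
```

Counting the two tokens separately makes it easy to see whether a spike in traffic comes from link previews or from the general-purpose crawler.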

If your WAF or CDN blocks by bot category, check whether it distinguishes these two or groups them as “ByteDance bots.”

Firewall considerations

TikTokSpider and Bytespider may share ByteDance IP ranges. IP-based blocking can inadvertently kill link previews. Prefer user-agent-based rules for granular control.
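
As a rough illustration of a user-agent-based rule, the sketch below wraps a Python WSGI application and refuses requests whose User-Agent contains the Bytespider token, while leaving TikTokSpider untouched. The names block_by_user_agent, demo_app, and status_for are hypothetical, and a real deployment would more likely express the same rule in the WAF, CDN, or web server configuration.

```python
# Lowercase tokens to refuse; TikTokSpider is deliberately not listed.
BLOCKED_TOKENS = ("bytespider",)


def block_by_user_agent(app):
    """Wrap a WSGI app and answer 403 when the User-Agent has a blocked token."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in BLOCKED_TOKENS):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def demo_app(environ, start_response):
    """Stand-in application that always answers 200 (illustration only)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok\n"]


def status_for(user_agent: str) -> str:
    """Send one fake request through the wrapped app and return its status line."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    wrapped = block_by_user_agent(demo_app)
    wrapped({"HTTP_USER_AGENT": user_agent}, start_response)
    return captured["status"]


if __name__ == "__main__":
    # Hypothetical user agent strings, invented for this example.
    print(status_for("Mozilla/5.0 (compatible; Bytespider; example)"))
    print(status_for("Mozilla/5.0 (compatible; TikTokSpider; example)"))
```

Because the decision looks only at the request header, it does not depend on which ByteDance IP range the request came from, so link preview fetches are not caught by the rule.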