[ANN] Go package for determining crawler User Agents

Nagaev Boris

Apr 5, 2024, 12:52:44 PM
to golang-nuts
Hey folks,

The repo https://github.com/monperrus/crawler-user-agents contains an up-to-date list of patterns for HTTP user agents used by robots, crawlers, and spiders, all in a single JSON file. It was recently updated and is now also a Go package!

The package uses go:embed to load the crawler-user-agents.json file from the repository and exposes its contents as Go types through the variable Crawlers. The functions IsCrawler and MatchingCrawlers make it easier to determine whether a user agent string likely comes from a bot.
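To give a rough idea of the mechanism, here is a standalone sketch that uses only the standard library. It is not the package's actual code: the inline sampleJSON constant stands in for the embedded file, its two entries and their field names ("pattern", "url") are made up for illustration, and the helper names loadCrawlers and matchingIndices are hypothetical. It also assumes that each pattern is treated as a regular expression.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// sampleJSON stands in for the embedded crawler-user-agents.json file.
// The entries and field names are assumptions made for this sketch.
const sampleJSON = `[
  {"pattern": "Discordbot", "url": "https://example.com/discordbot"},
  {"pattern": "ExampleBot", "url": "https://example.com/examplebot"}
]`

// crawler is a hypothetical Go type for one entry of the list.
type crawler struct {
	Pattern string `json:"pattern"`
	URL     string `json:"url"`
}

// loadCrawlers decodes the JSON list and compiles each pattern once.
func loadCrawlers(data string) ([]crawler, []*regexp.Regexp, error) {
	var list []crawler
	if err := json.Unmarshal([]byte(data), &list); err != nil {
		return nil, nil, err
	}
	regexps := make([]*regexp.Regexp, len(list))
	for i, c := range list {
		re, err := regexp.Compile(c.Pattern)
		if err != nil {
			return nil, nil, err
		}
		regexps[i] = re
	}
	return list, regexps, nil
}

// matchingIndices returns the indices of all patterns that match userAgent.
func matchingIndices(regexps []*regexp.Regexp, userAgent string) []int {
	var indices []int
	for i, re := range regexps {
		if re.MatchString(userAgent) {
			indices = append(indices, i)
		}
	}
	return indices
}

func main() {
	list, regexps, err := loadCrawlers(sampleJSON)
	if err != nil {
		panic(err)
	}
	ua := "Mozilla/5.0 (compatible; Discordbot/2.0; +https://discordapp.com)"
	for _, i := range matchingIndices(regexps, ua) {
		fmt.Println("matched pattern:", list[i].Pattern)
	}
}
```

Compiling the patterns once up front keeps the per-request cost down to a loop of regular expression matches.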

Documentation: https://pkg.go.dev/github.com/monperrus/crawler-user-agents

Below is a concise example demonstrating how you can use this package to determine if a user agent is likely to be a bot:

package main

import (
    "fmt"

    // The package name is agents, so an explicit alias makes the calls below easier to follow.
    agents "github.com/monperrus/crawler-user-agents"
)

func main() {
    userAgent := "Mozilla/5.0 (compatible; Discordbot/2.0; +https://discordapp.com)"

    isCrawler := agents.IsCrawler(userAgent)
    fmt.Println("isCrawler:", isCrawler)
}

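Inside an HTTP handler, you can take the user agent string from the request and check it as follows:

// Requires "fmt", "net/http", and the agents package to be imported.
func handle(w http.ResponseWriter, r *http.Request) {
    // r.UserAgent() returns the value of the User-Agent request header.
    userAgent := r.UserAgent()
    if agents.IsCrawler(userAgent) {
        fmt.Fprint(w, "BOT!")
    }
}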
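If you would rather apply the check to several handlers at once, one option is to wrap them in middleware. The following is a minimal sketch under stated assumptions: blockCrawlers, hello, and statusFor are hypothetical names, and the local isCrawler function is a simplified stand-in for agents.IsCrawler so that the example runs with the standard library alone. In a real program you would pass agents.IsCrawler instead.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// isCrawler is a hypothetical stand-in for agents.IsCrawler.
// It only recognizes one bot and exists to keep the sketch self-contained.
func isCrawler(userAgent string) bool {
	return strings.Contains(userAgent, "Discordbot")
}

// blockCrawlers rejects requests whose User-Agent the detector flags
// and passes all other requests on to next.
func blockCrawlers(detect func(string) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if detect(r.UserAgent()) {
			http.Error(w, "crawlers are not served here", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// hello returns a trivial handler to wrap.
func hello() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello, human")
	})
}

// statusFor sends one in-memory request with the given User-Agent
// and returns the response status code.
func statusFor(h http.Handler, userAgent string) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("User-Agent", userAgent)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := blockCrawlers(isCrawler, hello())
	fmt.Println(statusFor(h, "Mozilla/5.0 (compatible; Discordbot/2.0; +https://discordapp.com)"))
	fmt.Println(statusFor(h, "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0"))
}
```

Passing the detector in as a function keeps the middleware easy to test, since a test can supply its own detector without loading the full pattern list.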

Feel free to explore this package and consider integrating it into your projects. Should you have any questions or feedback, please don't hesitate to reach out.

--
Best regards,
Boris Nagaev