Scraping with Colly - Pulling Elements one at a time

Enrique

Jan 28, 2024, 9:21:32 PM
to golang-nuts

Hello, I am new to Go and am working on a small web scraper project that crawls the website of a guild I'm associated with.

Ideally I would like to pull the data from each 'dl' (HTML below) and insert it into the MemberDetails struct. However, every attempt to parse the HTML below returns all of the 'dt' labels concatenated into a single string:

Printout dt - 0: [Full nameRankPrimary position]

Do you have any advice on how I could get one element at a time?

  • Full Name
  • John Doe

Note: the library being used is github.com/gocolly/colly

<div class="block-container">
    <h3 class="block-formSectionHeader">Information</h3>
    <div class="block-body block-row">
        <dl class="pairs pairs--columns rosters-rows">
            <dt>Full name</dt>
            <dd>John Doe</dd>
        </dl>
        <dl class="pairs pairs--columns rosters-rows">
            <dt>Rank</dt>
            <dd>Rank #1</dd>
        </dl>
        <dl class="pairs pairs--columns rosters-rows">
            <dt>Primary position</dt>
            <dd>General Staff</dd>
        </dl>
    </div>
</div>

type MemberDetails struct {
    FullName        string
    Rank            string
    PrimaryPosition string
}

// ...

detailCollector.OnHTML("div.block-container div.block-body:first-of-type", func(h *colly.HTMLElement) {
    selection := h.DOM
    val := selection.Find("dl > dt").Text()
    fmt.Printf("Printout dt - 0: %s \n", val)
})

Tamás Gulácsi

Jan 29, 2024, 2:56:55 AM
to golang-nuts
var data MemberDetails

// Match each dl on its own so the callback runs once per dt/dd pair.
// Calling Find("dl > dt").Text() on the whole block concatenates the text of every dt.
detailCollector.OnHTML("div.block-container div.block-body:first-of-type dl",
func(h *colly.HTMLElement) {
    key := h.ChildText("dt")
    val := h.ChildText("dd")
    switch key {
    case "Full name":
        data.FullName = val
    case "Rank":
        data.Rank = val
    case "Primary position":
        data.PrimaryPosition = val
    }
})

// Print once the collector has finished visiting the page.
fmt.Printf("data: %+v\n", data)
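
To check the pairing logic without hitting the site, here is a minimal offline sketch that reads one dt/dd pair per dl from the sample HTML in the question. It uses the standard library's encoding/xml as a stand-in for Colly, which only works because the sample fragment happens to be well-formed XML; real pages generally are not, so treat this as an illustration of the per-dl loop rather than a replacement for the Colly callback. The names pair, block, sampleHTML and parseMember are made up for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// MemberDetails mirrors the struct from the question.
type MemberDetails struct {
	FullName        string
	Rank            string
	PrimaryPosition string
}

// pair is one <dl>: its <dt> label and <dd> value (hypothetical helper type).
type pair struct {
	Key   string `xml:"dt"`
	Value string `xml:"dd"`
}

// block is the outer <div>; Pairs collects every <dl> inside the inner <div>.
type block struct {
	Pairs []pair `xml:"div>dl"`
}

// sampleHTML is the fragment from the question.
const sampleHTML = `<div class="block-container">
  <h3 class="block-formSectionHeader">Information</h3>
  <div class="block-body block-row">
    <dl class="pairs pairs--columns rosters-rows"><dt>Full name</dt><dd>John Doe</dd></dl>
    <dl class="pairs pairs--columns rosters-rows"><dt>Rank</dt><dd>Rank #1</dd></dl>
    <dl class="pairs pairs--columns rosters-rows"><dt>Primary position</dt><dd>General Staff</dd></dl>
  </div>
</div>`

// parseMember walks the dl pairs one at a time and fills the matching field.
func parseMember(src string) (MemberDetails, error) {
	var b block
	if err := xml.Unmarshal([]byte(src), &b); err != nil {
		return MemberDetails{}, err
	}
	var m MemberDetails
	for _, p := range b.Pairs {
		val := strings.TrimSpace(p.Value)
		switch strings.TrimSpace(p.Key) {
		case "Full name":
			m.FullName = val
		case "Rank":
			m.Rank = val
		case "Primary position":
			m.PrimaryPosition = val
		}
	}
	return m, nil
}

func main() {
	m, err := parseMember(sampleHTML)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%+v\n", m)
}
```

The point is the same as in the Colly version: handle each dl separately, so that each key is a single label such as "Full name" instead of the text of all dt elements joined together.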