(You might be better off parsing it as HTML and traversing the structure.
Regular expressions aren't really the tool for the job because of all the
nested structures.)
When I tried:
> // create regex string
> beginRegStr := "Filmography.*\n.*\n.*Acting"
> beginRegex, err := regexp.Compile(beginRegStr)
> if err != nil {
>     fmt.Println("Error in beginning regex string: ", err)
>     return
> }
>
> pos := beginRegex.FindAllIndex(buf, 1)
> if len(pos) == 0 {
>     fmt.Println("Nothing matched!")
>     return
> }
> fmt.Println(string(buf[pos[0][0]:pos[0][1]]))
> fmt.Println(pos)
> }
It worked: it did what you told it to do, which probably
isn't what you wanted. FindAllIndex will find you those things
that match your pattern, and the result it gave me:
Filmography"><span class="tocnumber">4</span> <span
class="toctext">Filmography</span></a>
<ul>
<li class="toclevel-2 tocsection-11"><a href="#Acting"><span
class="tocnumber">4.1</span> <span class="toctext">Acting
[[13984 14198]]
is indeed a chunk from the page that matches the pattern you
gave. And there are newlines in it: Go's regexp's `.` will happily match
a newline.
Chris
--
Chris "allusive" Dollin
Denis
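(I was not entirely correct, as it turns out: by default Go's `.` does
not match a newline. The newlines in the earlier match come from the
explicit \n in the pattern.)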
> Good to know. Isn't there a multiline mode to enable '\n' matching by '.'
> (as in many other regexp langs)?
(?s) lets `.` match a newline. I think it's scoped to the enclosing
(...) group, but I'd expect to park it at the front of any RE that
shouldn't care about matching across newlines.
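A small sketch to check this behaviour, assuming a made-up two-line sample rather than the real page; `dotMatches` is a hypothetical helper, not part of the regexp package. It compares the default `.`, a leading (?s), an explicit \n in the pattern, and the group-scoped form (?s:...).

```go
package main

import (
	"fmt"
	"regexp"
)

// dotMatches reports whether pattern matches anywhere in s.
func dotMatches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

func main() {
	text := "Filmography\nActing" // invented sample with one newline

	// Default: `.` does not match a newline, so the match fails.
	fmt.Println(dotMatches(`Filmography.*Acting`, text)) // false

	// (?s) at the front: `.` matches a newline everywhere in the pattern.
	fmt.Println(dotMatches(`(?s)Filmography.*Acting`, text)) // true

	// An explicit \n in the pattern matches the newline without any flag.
	fmt.Println(dotMatches(`Filmography.*\n.*Acting`, text)) // true

	// Scoped form: only the `.` inside (?s:...) may match a newline.
	fmt.Println(dotMatches(`(?s:a.b).c`, "a\nbxc"))  // true
	fmt.Println(dotMatches(`(?s:a.b).c`, "a\nb\nc")) // false
}
```

Putting (?s) at the very front is the simplest choice when the whole pattern should ignore line boundaries; the scoped form is only worth it when part of the pattern must stay on one line.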