I've done a bit of Scheme in the past (although not a lot) and am just getting started with Racket. Yesterday I tried installing a package -- 'Boris', a web spider -- and raco crashed with the output shown below.
Two questions:
1) Is there a web-spidering package that people recommend? I could use wget and then parse things from disk, but I'd like to have something that's easily composable into CLI scripts.
2) Given the crash shown below, what happened and how do I keep it from happening again?
Context: OS X 10.6.8; Racket 6.3; running in Terminal inside an Emacs 22.1.1 *shell* buffer
$ raco pkg install github://github.com/emdonahu/boris/master
Querying Git references for boris at github://github.com/emdonahu/boris/master
Downloading repository github://github.com/emdonahu/boris/master
raco setup: version: 6.3
raco setup: platform: x86_64-macosx [3m]
raco setup: installation name: 6.3
raco setup: variants: 3m
raco setup: main collects: /Applications/Racket_v6.3/collects
raco setup: collects paths:
raco setup: /Users/dstorrs/Library/Racket/6.3/collects
raco setup: /Applications/Racket_v6.3/collects
raco setup: main pkgs: /Applications/Racket_v6.3/share/pkgs
raco setup: pkgs paths:
raco setup: /Applications/Racket_v6.3/share/pkgs
raco setup: /Users/dstorrs/Library/Racket/6.3/pkgs
raco setup: links files:
raco setup: /Applications/Racket_v6.3/share/links.rktd
raco setup: /Users/dstorrs/Library/Racket/6.3/links.rktd
raco setup: main docs: /Applications/Racket_v6.3/doc
raco setup: --- updating info-domain tables ---
raco setup: updating: /Users/dstorrs/Library/Racket/6.3/share/info-cache.rktd
raco setup: --- pre-installing collections ---
raco setup: --- installing foreign libraries ---
raco setup: --- installing shared files ---
raco setup: --- compiling collections ---
raco setup: --- parallel build using 8 jobs ---
raco setup: 7 making: <pkgs>/boris/boris
raco setup: 6 making: <pkgs>/boris/echo-server
raco setup: 5 making: <pkgs>/boris/hypertext-browser
raco setup: 4 making: <pkgs>/boris/persistent
raco setup: 3 making: <pkgs>/boris/tests
raco setup: 3 making: <pkgs>/boris/tests/boris
raco setup: 2 making: <pkgs>/boris/utils
raco setup: 2 making: <pkgs>/boris/utils/emd
raco setup: 7 making: <pkgs>/boris/boris/interpreter
Assertion failed: (((((((intptr_t)((Scheme_Object *)(scopes))) & 0x1)?(Scheme_Type)scheme_int\
eger_type:((Scheme_Object *)((Scheme_Object *)(scopes)))->type) >= scheme_hash_tree_type) && \
(((((intptr_t)((Scheme_Object *)(scopes))) & 0x1)?(Scheme_Type)scheme_integer_type:((Scheme_O\
bject *)((Scheme_Object *)(scopes)))->type) <= scheme_hash_tree_indirection_type))), function\
add_conditional_as_reachable, file ../../../racket/gc2/../src/syntax.c, line 5337.
Abort trap
Thank you; I've been rolling through the docs and playing around with these, and they seem really useful. One question though -- I stumbled across a mention of the sxml/html module while I was reading, but had no luck installing it. None of the following worked:
(require sxml/html)
$ raco pkg install sxml/html
$ raco pkg install 'sxml/html' # Maybe the shell was having trouble with '/'?
I don't know that I need it, but I'd like to know how to deal with modules like this in future.
> For HTTP, the client modules in Racket are often
> satisfactory, and other times I've used my own packages that implement
> HTTP in pure Racket or that wrap `curl` or `wget` for special
> requirements. For storing pages and links/metadata, there's the
> filesystem, the core Racket RDBMS database support, and cloud stores
> like AWS S3. The un-AJAX-ing and site-specific scraping behavior you
> might have to do yourself, if you need it. (I have a backlog of related
> tools to release someday.)
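For reference, here is a minimal sketch of the built-in route mentioned above, assuming the stock net/url client is enough for the site in question. The function names fetch-page and extract-links and the file name links.rkt are made up for illustration, and the regexp scan is a crude stand-in for a real HTML parser: it only catches double-quoted href attributes and sees nothing that JavaScript adds later.

```racket
#lang racket/base
;; Sketch only: fetch one page with the built-in net/url client and
;; list the links found in it. All names here are hypothetical.
(require net/url
         racket/port)

;; fetch-page : string -> string
;; Downloads the body at url-string, following up to 5 redirects.
(define (fetch-page url-string)
  (define in (get-pure-port (string->url url-string) #:redirections 5))
  (define body (port->string in))
  (close-input-port in)
  body)

;; extract-links : string -> (listof string)
;; Crude regexp scan for double-quoted href values; not an HTML parser.
(define (extract-links html)
  (regexp-match* #rx"href=\"([^\"]*)\"" html #:match-select cadr))

;; Command-line entry point, e.g.: racket links.rkt http://example.com/
(module+ main
  (define args (current-command-line-arguments))
  (when (= (vector-length args) 1)
    (for-each displayln
              (extract-links (fetch-page (vector-ref args 0))))))
```

Because the entry point lives in a main submodule, the file can be run as a script and piped into other command-line tools, or required from another module without triggering a download.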
Great, thank you. Yeah, I'd really like to be able to automate posting to Patreon. (Every week I publish a chapter of my novel there.) Unfortunately, their whole site is pointlessly AJAX. I spent some time Firebugging their code to see what the relevant calls were and then decided to spend my time on something more useful.
From what you say it sounds like there's no "magically make stupid AJAX / DOM-manipulating sites easy to deal with" module for Racket? Something that processed the site and handed back the final HTML as the browser gets it post-JS would be lovely. It's a bit much to ask for, I realize.
Thanks again for all this -- it's a big help.
Dave
Wow, that was fast! Thank you.
I didn't actually do anything related to Scribble; I just tried to install the module -- although presumably that automatically attempted to set up the Scribble docs.