boostrapping

196 views
Skip to first unread message

Attila Lendvai

unread,
Jul 21, 2021, 9:11:31 AM7/21/21
to Idris Programming Language
hi,

i'm new here, so let me write up a short intro:

i've been programming for 30+ years, learning asm on Amiga as a child on my own behind the iron curtain, and ended up being mostly a Lisper by now. i became somewhat obsessed with the issues around bootstrapping (don't ask me why), and i've spent quite some time recently to experiment with it in the context of Maru, a tiny Lisp that can self-host through x86 or LLVM in just 2-4 kLoC:


i want to learn more about Idris and TDD, and what's a better way than a project that i find interesting? so, i ventured into trying to bootstrap Idris all the way from Haskell, and i reached surprisingly far with relative ease on NixOS!

this worked after some struggles (PR's will follow):
  1. compiled idris1 using GHC
  2. compiled idris2-boot using this idris1
idris2 HEAD didn't compile with idris2-boot. my guess is because idris2 HEAD and idris2-boot have diverged too much to be linguistically compatible:

make[1]: Leaving directory '/home/alendvai/workspace/idris/idris2/support/chez'
echo "-- @""generated" > src/IdrisPaths.idr
echo 'module IdrisPaths' >> src/IdrisPaths.idr
echo 'export idrisVersion : ((Nat,Nat,Nat), String); idrisVersion = ((0,4,0), "5f3480120")' >> src/IdrisPaths.idr
echo 'export yprefix : String; yprefix="/home/alendvai/.idris2"' >> src/IdrisPaths.idr
/home/alendvai/workspace/idris/idris2-boot/dist/idris2 --build idris2.ipkg
src/Libraries/Data/String/Extra.idr:6:1--8:1:Data.String not found


i assume for a complete bootstrap (i.e. without using the cached generated files), there should be multiple stages identified and introduced between idris2-boot and idris2 HEAD.

i'm planning to prepare the same/similar setup that i have done for maru, which is written up here in a few paragraphs:


if any maintainer has strong feelings against anything that's written up there, then i would welcome warnings/guidance prior to investing my time into something that will not end up sparking joy in them! :)

i plan is to identify the shared history of the repos (at the level of git commits), and build one git repo with multiple branches that would hold the necessary developmental stages of the language that can bootstrap each other in sequence... and with as much sharing as possible, so that the development history is retained in a single git repo.

after that a rather vague, longer term calling of mine is to do something like what Kalyn does, but for Idris (https://intuitiveexplanations.com/tech/kalyn). in short: implement a subset of Idris using a trivial syntax (SEXP), and write a core that can self-host in as few lines of code as i can make it. but arrange it so that it can also be loaded into the full Idris2 system and be type checked, etc. i.e. development would happen using the full power of Idris2, but there would be a tiny self-hosting core optimized for bootstrapping purposes, and that would be hopefully only a few thousand LoC.

then introduce a few stage1 branches implemented in various different languages.

more to come, and feedback is welcome,

yours,

 - attila

Attila Lendvai

unread,
Jul 27, 2021, 9:29:03 AM7/27/21
to Idris Programming Language

this worked after some struggles (PR's will follow):
  1. compiled idris1 using GHC
FTR, i've opened a PR inspired by my experience that tries to clarify the README: https://github.com/idris-lang/Idris-dev/pull/4902

i plan is to identify the shared history of the repos (at the level of git commits), and build one git repo with multiple branches that would hold the necessary developmental stages of the language that can bootstrap each other in sequence... and with as much sharing as possible, so that the development history is retained in a single git repo.

unfortunately there seems to be no shared history between idris1, idris2-boot, and idris2 (on the git commit level).

the lack of a shared commit history makes this setup that much less pleasing, but i'll continue with my original idea of setting up one repo with multiple branches, that capture the successive revisions of the language that can build/bootstrap each other in a sequence.

- attila

Attila Lendvai

unread,
Jul 28, 2021, 10:43:13 AM7/28/21
to Idris Programming Language
yay! i've managed to bootstrap idris all the way down from GHC up until Idris2 HEAD! (on NixOS 21.05)


the repo contains branches idris.1 .. idris.6 that can bootstrap each other in a mostly automated way.
  • idris.1 is Idris 1, i.e. the HEAD of the Idris-dev repo
  • idris.2 is Idris2-boot
  • idris.3 is v0.2.1 (from here the rest are the tags of Idris2)
  • idris.4 is v0.2.2
  • idris.5 is v0.3.0
  • idris.6 is v0.4.0
  • attila is main + some patches to the build system that can fire up the recursive chain of bootstrapping
idris.n checks out idris.n-1 under its own build/ dir, and ensures that it's built. idris.1 is built using an external GHC.

it requires nix-shell (and the added default.nix files) to set up the necessary environment for each stage.

if someone is super curious, then this should do the entire chain of bootstrap:

$ git checkout attila
$ nix-shell
$ make -j

(it should work with nix-shell --pure, but it doesn't. maybe because of flakes? the prior stages invoke nix-shell with --pure)

i'm not sure that i have precisely identified the points where new features were added that require a new bootstrap stage. my heuristic was to check out git tags and use bisecting to identify what version can build what, and then maybe refine it later to the level of git commits. but then i saw that my bisecting is honing in on the release tags in sequence, so i assume that they also are the bootstrap stages.

issues, downsides:
  • some of the git branches are divergent (three unrelated lines of git commit history: idris.1, idris.2, and everything else). it's not the end of the world, though.
  • this way the git tag spaces get merged, and the Idris2 repo has tags starting from v0.x.y; i.e. they overlap with the tags from the Idris-dev repo. luckily there's no clash due to the different numbering schemes; i don't know how git would behave if there were any. this is an aesthetic issue that doesn't cause any problems.
benefits:
  • aesthetics -- all of the history of Idris is nicely kept in one repo, and it can automatically bootstrap itself all the way from GHC, and in the future possibly from multiple other languages/implementations.
  • if we use branches for the developmental stages of the language, and tags for the releases of the implementation, then they become decoupled. it becomes possible to introduce multiple new linguistic features without the need for releasing any of it to the public. IOW, the shaping of the language and the releasing of its implementation becomes decoupled.
  • it's much easier to work out some bootstrap issues when you can mold both ends: the host, and the HEAD that is being worked on. using branches nicely facilitates this.
possible TODOs:
  • make it work without the use of make install; i.e. make it so that the binaries can be run in-place after compilation, without installing anything into the ~/.idris* directories.
  • by setting appropriate defaults, possibly eliminate the need for the idris2 startup wrapper script? are there any blockers that i'm not aware of? first try to open the .so file in the directory where executable resides, and then without any prefix (i.e. using the system's default search path). the former for uninstalled execution, the latter for when it's installed.
  • make the build more uniform across the branches/stages (e.g. backport the use of flakes, etc)
  • eliminate the ~/.idris2boot/ anomaly
  • arrange it so that the compiler backend's output files are saved under build/, and some of them can optionally be checked into the repo, and serve as a "bootstrap shortcut". then add support to the makefile to optionally skip these shortcuts if someone wants to run the entire chain of bootstrap.
  • make it more uniform how the build iterations are stored under build/; IOW, get rid of bootstrap-build, and introduce build/iteration1 (stage n-1 building stage n) and build/iteration2 (stage n building itself), and possibly a build/iteration3 (stage n once again building itself; this may be needed if e.g. an optimization is disabled in iteration1, leading to differences in the binary output). suggestion for a better nomenclature is welcome!
  • reproducible builds? if it is feasible/available, then add comparison of bootstrapped binaries as a sanity check into the makefile
there's plenty of work to do to make it slick. at this point i'd really appreciate some feedback on whether this vision is aligned with that of the maintainer's.

- attila

Attila Lendvai

unread,
Sep 19, 2021, 8:48:55 AM9/19/21
to Idris Programming Language
does anyone know what is the expected outcome of running the tests of releases? is/was there an all-tests-must-pass-before-a-release policy? and in particular the tests of v0.2.1?

i'm creating Guix packages for Idris2 that will bootstrap them all the way from GHC, and i get 10 failing tests from `make test` of v0.2.1.

is this expected?

...
idris2/literate008: success
idris2/literate009: Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
success
idris2/literate010: success
...
chez/chez001: success
chez/chez002: FAILURE
Expected:
"[(3, (4, 5)), (6, (8, 10)), (5, (12, 13)), (9, (12, 15)), (8, (15, 17)), (12, (16, 20)), (15, (20, 25)), (7, (24, 25)), (10, (24, 26)), (20, (21, 29)), (18, (24, 30)), (16, (30, 34)), (21, (28, 35)), (12, (35, 37)), (15, (36, 39)), (24, (32, 40)), (9, (40, 41)), (27, (36, 45)), (30, (40, 50)), (14, (48, 50)), (24, (45, 51)), (20, (48, 52)), (28, (45, 53)), (33, (44, 55)), (40, (42, 58)), (36, (48, 60)), (11, (60, 61)), (39, (52, 65)), (33, (56, 65)), (25, (60, 65)), (16, (63, 65)), (32, (60, 68)), (42, (56, 70)), (48, (55, 73)), (24, (70, 74)), (45, (60, 75)), (21, (72, 75)), (30, (72, 78)), (48, (64, 80)), (18, (80, 82)), (51, (68, 85)), (40, (75, 85)), (36, (77, 85)), (13, (84, 85)), (60, (63, 87)), (39, (80, 89)), (54, (72, 90)), (35, (84, 91)), (57, (76, 95)), (65, (72, 97)), (60, (80, 100)), (28, (96, 100)), (20, (99, 101)), (48, (90, 102)), (40, (96, 104)), (63, (84, 105)), (56, (90, 106)), (60, (91, 109)), (66, (88, 110)), (36, (105, 111)), (15, (112, 113)), (69, (92, 115)), (80, (84, 116)), (45, (108, 117)), (56, (105, 119)), (72, (96, 120)), (22, (120, 122)), (27, (120, 123)), (75, (100, 125)), (44, (117, 125)), (35, (120, 125)), (78, (104, 130)), (66, (112, 130)), (50, (120, 130)), (32, (126, 130)), (81, (108, 135)), (64, (120, 136)), (88, (105, 137)), (84, (112, 140)), (55, (1Exception: variable blodwen-run-finalisers is not bound
Exception: variable integer->bits16 is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable cast-int-char is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception in foreign-procedure: no entry for "idrnet_socket"
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
Exception in foreign-procedure: no entry for "idris2_popen"
32, 143)), (100, (105, 145)), (87, (116, 145)), (24, (143, 145)), (17, (144, 145)), (96, (110, 146)), (48, (140, 148)), (51, (140, 149)), (90, (120, 150)), (42, (144, 150)), (72, (135, 153)), (93, (124, 155)), (60, (144, 156)), (85, (132, 157)), (84, (135, 159)), (96, (128, 160)), (36, (160, 164)), (99, (132, 165)), (119, (120, 169)), (65, (156, 169)), (102, (136, 170)), (80, (150, 170)), (72, (154, 170)), (26, (168, 170)), (52, (165, 173)), (120, (126, 174)), (105, (140, 175)), (49, (168, 175)), (78, (160, 178)), (108, (144, 180)), (19, (180, 181)), (70, (168, 182)), (33, (180, 183)), (111, (148, 185)), (104, (153, 185)), (60, (175, 185)), (57, (176, 185)), (88, (165, 187)), (114, (152, 190)), (95, (168, 193)), (130, (144, 194)), (117, (156, 195)), (99, (168, 195)), (75, (180, 195)), (48, (189, 195)), (28, (195, 197)), (120, (160, 200)), (56, (192, 200))]\n1/1: Building Pythag (Pythag.idr)\nMain> Main> Bye for now!\n"
Given:
"1/1: Building Pythag (Pythag.idr)\nMain> Main> Bye for now!\n"
chez/chez003: success
chez/chez004: FAILURE
Expected:
"100\n94\n94.42\n\"Hello\"\n\"there!\"\n65535\n[0, 94, 0, 0, 0, 65, 65, 65, 65, 0, 123, 20, 174, 71, 225, 154, 87, 64, 0, 0, 72, 101, 108, 108, 111, 32, 116, 104, 101, 114, 101, 33, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n[0, 94, 0, 0, 0, 65, 65, 65, 65, 0, 123, 20, 174, 71, 225, 154, 87, 64, 0, 0, 72, 101, 108, 108, 111, 32, 116, 104, 101, 114, 101, 33, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n[0, 94, 0, 0, 0, 65, 65, 65, 65, 0, 123, 20, 174, 71, 225, 154, 87, 64, 0, 0, 72, 101, 108, 108, 111, 32, 116, 104, 101, 114, 101, 33, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 94, 0, 0, 0, 65, 65, 65, 65, 0, 123, 20, 174, 71, 225, 154, 87, 64, 0, 0, 72, 101, 108, 108, 111, 32, 116, 104, 101, 114, 101, 33, 255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n00 5E 00 00 00 41 41 41 41 00 7B 14 AE 47 E1 9A         .^...AAAA.{..G..\n57 40 00 00                                             W@..\n\ntotal size = 20\n48 65 6C 6C 6F 20 74 68 65 72 65 21 FF FF 00 00         Hello there!....\n00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00         ................\n00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00         ................\n00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00         ................\n00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00         ................\n\ntotal size = 80\n1/1: Building Buffer (Buffer.idr)\nMain> Main> Bye for now!\n"
Given:
"100\n94\n94.42\n\"Hello\"\n\"there!\"\n1/1: Building Buffer (Buffer.idr)\nMain> Main> Bye for now!\n"
chez/chez005: success
chez/chez006: success
chez/chez007: success
chez/chez008: success
chez/chez009: success
chez/chez010: FAILURE
Expected:
"9\nCallback coming\nIn callback\n24\nCallback coming\nIn callback with (1, 2)\n3\n9\n'k'\n1/1: Building CB (CB.idr)\nMain> Main> Bye for now!\n"
Given:
"9\nCallback coming\nIn callback\n24\nCallback coming\nIn callback with (1, 2)\n3\n9\n1/1: Building CB (CB.idr)\nMain> Main> Bye for now!\n"
chez/chez011: success
chez/chez012: success
chez/chez013: success
chez/chez014: FAILURE
Expected:
"Received: hello world!\nReceived: echo: hello world!\n1/1: Building Echo (Echo.idr)\nMain> Main> Bye for now!\n"
Given:
"1/1: Building Echo (Echo.idr)\nMain> Main> Bye for now!\n"
chez/chez015: success
chez/chez016: success
chez/chez017: success
chez/chez018: success
chez/chez019: success
chez/chez020: FAILURE
Expected:
"opened\nclosed\nIdris 2\n1/1: Building PopException: variable integer->bits8 is not bound
Exception: variable blodwen-register-object is not bound
Exception in foreign-procedure: no entry for "idris2_readChars"
Exception in foreign-procedure: no entry for "idris2_setenv"
Exception: variable blodwen-run-finalisers is not bound
Exception: attempt to reference unbound identifier blodwen-run-finalisers at line 310, char 48 of /tmp/guix-build-idris2-0.2.1.drv-0/source/tests/chez/chez026/custom_output/check_dir_app/check_dir.ss
./custom_output/check_dir: line 14: /tmp/guix-build-idris2-0.2.1.drv-0/source/tests/chez/chez026/custom_output/check_dir_app/check_dir.so: No such file or directory
Exception: variable blodwen-run-finalisers is not bound
Exception: variable blodwen-run-finalisers is not bound
en (Popen.idr)\nMain> Main> Bye for now!\n"
Given:
"1/1: Building Popen (Popen.idr)\nMain> Main> Bye for now!\n"
chez/chez021: FAILURE
Expected:
"1\n200\nTrue\n200\n254\nFalse\n248\n1\n1234567890\n255\n65535\n4294967295\n18446744073709551615\n1/1: Building Bits (Bits.idr)\nMain> [\"1\", \"200\", \"248\", \"1\", \"255\", \"200\", \"254\"]\nMain> [\"True\", \"False\"]\nMain> [\"255\", \"65535\", \"4294967295\", \"18446744073709551615\"]\nMain> Main> Bye for now!\n"
Given:
"1/1: Building Bits (Bits.idr)\nMain> [\"1\", \"200\", \"248\", \"1\", \"255\", \"200\", \"254\"]\nMain> [\"True\", \"False\"]\nMain> [\"255\", \"65535\", \"4294967295\", \"18446744073709551615\"]\nMain> Main> Bye for now!\n"
chez/chez022: FAILURE
Expected:
"Hello\nHello\nDone\nFree X\nFreeing 0 Hello\nFree Y\nFreeing 1 Hello\n1/1: Building usealloc (usealloc.idr)\nMain> Main> Bye for now!\n"
Given:
"1/1: Building usealloc (usealloc.idr)\nMain> Main> Bye for now!\n"
chez/chez023: FAILURE
Expected:
"Hello \n'I'\ndris!\n\nNo exceptions occurred\n1/1: Building File (File.idr)\nMain> Main> Bye for now!\n"
Given:
"1/1: Building File (File.idr)\nMain> Main> Bye for now!\n"
chez/chez024: FAILURE
Expected:
"True\nHI\nTrue\nHI\nTrue\nEH\nTrue\nNothing there\n1/1: Building Envy (Envy.idr)\nMain> Main> Bye for now!\n"
Given:
"1/1: Building Envy (Envy.idr)\nMain> Main> Bye for now!\n"
chez/chez025: success
chez/chez026: FAILURE
Expected:
"1/1: Building Dummy (Dummy.idr)\nFound build directory\n"
Given:
"1/1: Building Dummy (Dummy.idr)\n"
chez/chez027: success
...
294/304 tests successful

- attila

Edwin Brady

unread,
Sep 20, 2021, 7:39:17 AM9/20/21
to idris...@googlegroups.com
Hi Attila,
This is a little surprising, since we require all tests to pass on all platforms before we even accept a PR, never mind a release. 0.2.1 might be a bit unusual in that this was released just to keep the bootstrapping path possible. I don't really understand the failures you're seeing here, at first glance, I certainly haven't seen them before.

To be honest, if you can get 0.3.0 building and passing the tests, I'm not sure I'd worry about it too much, although it would be nice to know what's going on.

By the way, somehow all the older messages in this thread ended up in my spam folder - sorry I missed them. It's great to see that the bootstrapping path works as expected (well, mostly at least, given these failures...).

Edwin
--
The University of St Andrews is a charity registered in Scotland, No: SC013532


________________________________________
From: idris...@googlegroups.com <idris...@googlegroups.com> on behalf of Attila Lendvai <attila....@gmail.com>
Sent: 19 September 2021 13:48
To: Idris Programming Language
Subject: {Suspected Spam} [Idris] Re: boostrapping

is this expected?

- attila

--
You received this message because you are subscribed to the Google Groups "Idris Programming Language" group.
To unsubscribe from this group and stop receiving emails from it, send an email to idris-lang+...@googlegroups.com<mailto:idris-lang+...@googlegroups.com>.
To view this discussion on the web visit https://groups.google.com/d/msgid/idris-lang/b52d15b8-67c6-46c3-b89b-c545f3701cc9n%40googlegroups.com<https://groups.google.com/d/msgid/idris-lang/b52d15b8-67c6-46c3-b89b-c545f3701cc9n%40googlegroups.com?utm_medium=email&utm_source=footer>.

Attila Lendvai

unread,
Sep 20, 2021, 9:38:50 AM9/20/21
to Idris Programming Language
i have finally understood the issue.
 
benefits:
  • it's much easier to work out some bootstrap issues when you can mold both ends: the host, and the HEAD that is being worked on. using branches nicely facilitates this.

it is a good example for why keeping the bootstrap stages in mutable branches is a good practice (see http://bootstrappable.org/):

the issue described (also points to the commit that fixes it): https://github.com/idris-lang/Idris2/issues/166

this is why i couldn't get a clean test run for v0.2.1.

 - attila

Attila Lendvai

unread,
Jan 23, 2022, 9:13:50 AM1/23/22
to Idris Programming Language
FTR, as part of this exploration, i have refactored the build system of Idris2, which is ready for review:


- attila

Attila Lendvai

unread,
Apr 14, 2022, 9:36:58 AM4/14/22
to Idris Programming Language
FTR, i've filed some patches for Guix that can bootstrap the Idris2 package all the way down from GHC:


- attila

Reply all
Reply to author
Forward
0 new messages