So it seems like something's been broken in the new version of
diagonal adaptation. My question is: what's changed, and how are we
going to fix it?
I've also been very anxious about our master branch being in
a not-ready-to-release-anytime-soon state.
Maybe we could revert master back to 1.3, patch
the bugs we know about, release, and give Michael
all the time he needs to refactor and do speed
tests before refactoring everything. This'd probably
also put Metropolis on hold, not that anyone's waiting
for it.
Ironically, we did use that model this time. The problem
is that we didn't test the refactored branch enough
either for efficiency of sampling or for backward
compatibility before merging it into master.
I think most of the work since 1.3 has gone into
the refactoring, so I don't think it'd be that hard
to revert. I've been trying to keep my fingers out of
the code until it stabilizes.
Alternatively, we could start working on a branch from
1.3 and release from there and hope that we can stabilize
master enough to eventually release it.
Ok, here we go:
2024218701
2024256492
2024318542
are all pretty bad.
git checkout master
git bisect start
git bisect bad
git bisect good v1.2.0
git bisect run ./test.sh
On Saturday, May 4, 2013 6:06:50 PM UTC-4, Matt Hoffman wrote:
Ok, here we go:
2024218701
2024256492
2024318542
are all pretty bad.
Attached is a (possibly unportable) script that reports failure if the average ESS (excluding _stepsize) is less than 50 and success if it is greater than 200, using 7 chains and seed 2024218701.
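For reference, git bisect run reads the script's exit status: 0 marks the commit good, 125 asks git to skip it, and any other status from 1 to 127 marks it bad. Below is a minimal sketch of how a script along these lines could map an average ESS onto those codes. The function name ess_to_exit_code and its argument order are assumptions made for illustration, not the contents of the attached script.

```shell
#!/bin/sh
# Hypothetical helper (not the attached test.sh): map an average ESS onto
# the exit codes that git bisect run understands.
# Usage: ess_to_exit_code AVG_ESS BAD_BELOW GOOD_ABOVE
ess_to_exit_code() {
    # awk does the comparisons because plain sh arithmetic is integer-only.
    awk -v ess="$1" -v bad="$2" -v good="$3" 'BEGIN {
        if (ess < bad)       print 1;    # 1..127 (except 125) marks the commit bad
        else if (ess > good) print 0;    # 0 marks the commit good
        else                 print 125;  # 125 tells git bisect to skip the commit
    }'
}

# Illustrative call: with thresholds 50 and 200, an average ESS of 36
# falls below the failure threshold.
ess_to_exit_code 36 50 200
```

A bisect script would end with something like exit "$(ess_to_exit_code "$avg" 50 200)", where $avg is whatever average the script computed. Returning 125 for the in-between range keeps ambiguous runs from steering the bisection in either direction.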
I'm rerunning some experiments from the NUTS paper, and discovered
that diagonal adaptation is failing quite badly in some cases on a
simple logistic regression. Here are minimum effective sample sizes
(obtained from bin/print) for 16 independent runs of 1K warmup/1K
sampling:
goodrich@CYBERPOWERPC:/opt/stan$ ./bin/print /tmp/SU/MH/samples*.csv
Inference for Stan model:
4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.
        mean se_mean  sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
beta[1]  1.3     0.0 0.2  0.9 1.2 1.3 1.4   1.7 711.1e+00
beta[2]  0.5     0.0 0.1  0.2 0.4 0.5 0.5   0.7 361.1e+00
...
Okay, attached is a modified version of the script to pass to git bisect run that
-- Cleans up more
-- Only averages the ESS of the betas
-- Only does 1 chain to work around the print bug
-- Changes the thresholds for failure and success to 20 and 50 respectively.
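The "only averages the ESS of the betas" step could look roughly like the sketch below. It assumes the ESS values have already been extracted into "name ess" pairs, one per line. That two-column layout, the function name mean_beta_ess, and the numbers in the example are all hypothetical, and none of it reflects the format that bin/print emits.

```shell
#!/bin/sh
# Hypothetical helper: average the ESS over the beta parameters only.
# Reads "name ess" pairs on stdin (an assumed layout, not bin/print output).
mean_beta_ess() {
    awk '$1 ~ /^beta\[/ { sum += $2; n++ }    # keep beta[...] rows, ignore the rest
         END { if (n > 0) printf "%.1f\n", sum / n; else exit 1 }'
}

# Illustrative input: the _stepsize row is ignored, so only the two betas count.
printf '%s\n' 'beta[1] 40' 'beta[2] 60' '_stepsize 900' | mean_beta_ess
```

The result can then be compared against the failure and success thresholds (20 and 50 in the modified script) to choose the exit status.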
30c7e04cd2973c409a4d187ee60f393b40adccdb is the first bad commit
commit 30c7e04cd2973c409a4d187ee60f393b40adccdb
Author: Michael Betancourt <betanalpha@gmail.com>
Date: Sun Apr 28 17:36:42 2013 +0100
Snap final var/covar adaptation window to warmup period
:040000 040000 8a68d5f951ab3702fa3204876af775a0f8af85bc 0ddba1a8e57542dbbf04d350372c30606b5bc1ee M src
bisect run success
goodrich@CYBERPOWERPC:/opt/stan$ git bisect log
git bisect start
# bad: [e3a6abd21814a28b95f74e293bb2255dea4bd1a3] shuffle function_signatures*.stan in test-unit
git bisect bad e3a6abd21814a28b95f74e293bb2255dea4bd1a3
# good: [1cde932d15a348df17e9dafce17a79299b9d8a2d] upgrade rstan verion to 1.2.0
git bisect good 1cde932d15a348df17e9dafce17a79299b9d8a2d
# good: [e8cc58cf487193b21f2235ee9570bc92accc46c1] updating release notes with proper dates: 2013 instead of 2012
git bisect good e8cc58cf487193b21f2235ee9570bc92accc46c1
# good: [fa4764f9c1300e25aabff02cde2082b66a66aaa9] fixed inv_cloglog in fvar with test
git bisect good fa4764f9c1300e25aabff02cde2082b66a66aaa9
# good: [2514d39580b0f88123526636855f7e84802f96d8] added transpose_test for fvar/matrix
git bisect good 2514d39580b0f88123526636855f7e84802f96d8
# good: [1611ac0515dd73d3332755c58ccd861b9c62d357] rstan, add option of delta and gamma
git bisect good 1611ac0515dd73d3332755c58ccd861b9c62d357
# bad: [f524331401dfa8895f63b9fa021f6746d566c8af] More detailed output during BFGS.
git bisect bad f524331401dfa8895f63b9fa021f6746d566c8af
# good: [9ef525b3cbad99d383185ada191235dc1013a7a6] fixed type inference problem for int with multiply; need to do elsewhere
git bisect good 9ef525b3cbad99d383185ada191235dc1013a7a6
# bad: [81803474d0b84d976f3ffb7410bd1bcd817daed1] Revert to quadratic interpolation when selecting initial step-size.
git bisect bad 81803474d0b84d976f3ffb7410bd1bcd817daed1
# bad: [30c7e04cd2973c409a4d187ee60f393b40adccdb] Snap final var/covar adaptation window to warmup period
git bisect bad 30c7e04cd2973c409a4d187ee60f393b40adccdb
# good: [c422874ffff99394953ed479a2fbd3d4517d0fc5] more block operations, patch in block to return row vec
git bisect good c422874ffff99394953ed479a2fbd3d4517d0fc5
# good: [fcdf6b4be181a7e49f334fe903e84c3ae84dd25d] Woops, have to actually implement a destructor
git bisect good fcdf6b4be181a7e49f334fe903e84c3ae84dd25d