On Friday, August 17, 2012 11:26:49 AM UTC-7, Alex Chaffee wrote:
> 1) I still think defaulting to sorted order is THE way to go, having worked
> in a multi-person, multi-branch team enough to have been caught in the
> peculiar hell of bouncing sort orders. Your attempts to make things
> "sedentary" in their positions feels like a stopgap.
The counter argument is that order is meaningful; people expect to see
more recently added columns near the bottom of the list. And can't you
escape that hell by making everyone run rake db:reset?
I don't think I've ever thought in terms of expecting to see more recent columns near the bottom of the list. It's taken me years to ignore the aggravation of Rails not allowing me to control the ordering explicitly. If anything, alphabetical sorting means I can visually do a binary search for the column name I'm looking for, so I've kinda gotten used to that idea.
And no, I really can't force it. Multiple, distributed teams, working across a variety of time zones and branches, with different schema changes happening on different branches. And that's with just 7 or 8 committers right now. The only solution would be having people periodically drop and recreate their local DB, which is pretty onerous.
If Rails exposed clear metadata about column order as it exists in the (explicitly ordinal) migrations, then we could produce a stable result without sorting, but short of reproducing a canonical ordering by running the migrations against a scratch, empty database, having sorting off by default just leads to people suddenly experiencing wacky little results which they may initially dismiss as odd but not significant enough to worry about (columns flip-flopping order from branch to branch, with interesting consequences for merges), until finally someone gets fed up and says WTF?! and starts digging into the root cause.
I.E. it's a problem you grow into subtly, and the solution may not be totally obvious -- so I prefer the solution that very easily guarantees it never becomes an issue.
Also, schema.rb is generated using the same db-order ordering. Is that
causing the same problems?
I suspect Rails does some order-stabilizing (not *moving* columns that are already in schema.rb), rather than using one's local DB order exclusively, but on occasion, yes it has. In one case, even dropping local DBs wasn't enough because production had been migrated out-of-canonical-order, and so people would pull down a production snapshot to test against real data with, and have a wonky ordering that would show up the next time they made schema changes.
The resolution to that issue was a MySQL-specific migration that explicitly re-ordered a few columns in terms of others. It was, frankly, a royal PITA.
> 2) I encountered a LOT of issues around old versions of model code being in
> memory when annotation happens, so I decided to punt and basically make the
> rake tasks set env vars and call the shell script(s) to ensure they've got
> the latest code.
I'm all for improving isolation, and I think the current code is in a
peculiar middle state between "load the whole rails environment" and
"don't load rails, just try to parse model files one at a time". Also
I'd love to get the rake task to be a pure client of the annotate
script (now there's a lot of rake leakage into the core annotation
code).
My changes definitely help move in that direction, although at the cost of "load the whole rails environment". However I now have all sorts of corner-cases working properly, including models-from-engines being properly ignored, models defined in multiple places being annotated in all places, no magic monkey-patching causing issues (see the bug report about auto-loaded FactoryGirl interfering with the ObjectSpace approach), STI, namespacing, etc.
About the only edge-case I'm aware of that I have not explicitly tested yet is custom inflections, and I'll probably have that covered this weekend.
I don't know what you mean by "old versions of model code", though.
Sorry, versions initialized against old versions of the schema. So for whatever reason, model X gets loaded, then migration Y runs (changing the schema underlying X), then annotate runs -- model X won't have the new attributes/indexes/whatever loaded into ActiveRecord's metadata during that run of Rake. So a common problem we were experiencing with annotate was that the annotations would "lag" a migration or two behind reality unless you explicitly ran db:migrate twice.
> 3) I ditched the ObjectSpace approach and with it, all the bad juju with
> FactoryGirl, JRuby, et al. Instead, I forcibly eager-load classes (using
> strategies that work for both Rails 2.3 and Rails 3.x), then look for
> ActiveRecord::Base.subclasses. There's a lot of code simplification that
> could come out of this but I haven't gotten there yet.
I think the bad juju has all been exorcised but I like the sound of this.
> 4) I split the binary into annotate_models and annotate_routes. It neatly
> gets around the issue of annotate being ambiguous, while avoiding the issue
> of annotate_models being misleading. And I think it's safe to say we can
> afford to ditch backwards compatibility on that name.
I disagree. "annotate" is so short and punchy and clear, and there's
now a fair bit of history around it too. The only problem is that
imagemagick has a tool with the same name but I think "bundle exec
annotate" can work around that.
Ugh. I loathe having to do that. As do many developers -- hell, even RVM is shipping with rubygems-bundler pre-installed into the global gemset for all rubies (which I promptly kill, to avoid nasty compatibility issues -- but people are willing to bear some such issues out to avoid the bundle exec solution; my solution is to actually use gemsets properly, keep them tidy, and not have it ever become an issue at all).
Also, the bundle exec option falls apart when you want to automate anything. One thing bundler does is override system() and `` behavior, such that spawned sub-shells have an environment that doesn't require explicit use of "bundle exec", so if you want to automate RMagick from within your Rails app (say, a Rake task to some common transformative steps on static assets), it's now ambiguous as to which tool will be executed when you start your sub-shell.
I do agree that annotating routes was feature creep but I don't think
it's hurting much, relatively speaking.
Eh? Didn't mean to imply it's feature-creep. I think it's a natural fit; I just think the name "annotate_models" is confusing in that context.
I actually think there's more annotations worth adding, like summaries of attribute scoping metadata. I.E. a section showing what attributes are mass-assignable, what are protected, what are read-only, and what named mass-assignment scopes exist. However, that's considerably more work, more scope, so I didn't just go ahead and build it.
> 5) I revamped the integration testing system. RVM is now required, but this
> comes with a slew of benefits:
> a) It's easier to set up complex test scenarios -- for example, I have one
> with no asset pipeline, one with FactoryGirl set to autoload, etc.
> b) You can "tinker" more easily instead of running formal tests as instead
> of simple Gemfile/Gemfile.lock pairs, I'm basically making
> super-stripped-down Rails apps that get copied en masse into place for test
> runs. The .rvmrc, Gemfile, and Gemfile.lock are templated, such that the
> repo version is good for you to just cd in and play around, and at
> rspec-time it'll write ones appropriate for that. The beautiful part is
> that both should inherit your currently active Ruby and use a private
> gemset. (I'm testing that out as I write this...)
> c) I downgraded Rake on the Rails 2.3 test to be Rake 0.8.7. Reduces a
> ton of noise, and is more typical of legacy setups I think.
> d) Incremental runs are much faster since they don't need to reinstall all
> the gems if you want to maintain a clean operating/development environment.
>
> These integration setups are quite small -- ignoring the .gitkeep files --
> the Rails 2.3 one is 3 templates plus 28 files. The Rails 3.2.2 one is 3
> templates plus 34 files. The Rails 3.2 + FactoryGirl one is 28 files + 3
> templates. Etc etc etc.
That all sounds very nice. How hard would it be to roll out a new
template for a new release of Rails?
I haven't automated it yet, but my general approach here has been to copy-pasta one of the existing integrations, and make the relevant mass substitutions, nuke the Gemfile.lock and rebuild it, then tweak that to be a template. Depending on the complexity of the scenario involved, it's a few minutes of work once you know what needs to be changed and where.
I.E. it definitely bears a little automation on the generation-side of things, but it feels very achievable to automate the process of kickstarting a barebones integration test to the point that you can customize it to the desired behavior trivially.
I'm not sure about downgrading rake. I was thinking of monkey-patching
rake and rubygems to get rid of some of the Rails 2 noise.
It's only downgraded in the 2.3 test. And we could easily have a 2.3 integration test that uses a newer build of Rake as well. Point is though that it felt like a good fit WRT regression-testing, in terms of the likely 'real world' usage scenarios for 2.3 users.
> 7) I switched from mg and rdoc to Jeweler and yard. The change to Jeweler
> is a bigger deal to me: I've been fighting some strange tools in JRuby-land
> that like to do things like package up gems separately from their .gemspec
> file, so trying to refer to gem code from the gemspec just does not work *at
> all* (and using git on the fly won't work when all the files are in a jar).
> I also tweaked the existing history and todo list to be proper rdoc syntax
> and added them to the docset so running "rake yard" will include them.
Yard is fine but I've had the opposite experience with Jeweler. A lot
of what Jeweler did is now part of Rubygems (gem) and mg just calls
that. I find Jeweler just adds unnecessary complexity these days.
Take a peek at how I made use of it. I have a pattern for using it that gives me a nice degree of control, and results in a clean, standalone .gemspec that doesn't have any dependencies on git, or the relative position of the .gemspec to the code. It also allows me to keep the Gemfile as the central authority for what the project depends on in what contexts, without doing stupid things like making the project's test-mode dependencies be a part of the test-mode dependencies of the consumer of the project.
What do you mean by trying to refer to gem code from the gemspec? The
only code I ever use from inside a non-generated gemspec is
Foo::Version (or Foo::VERSION) and you can cleanly require just that
file if you isolate it.
The use of Dir.glob / `git ls-files`, and requiring any files cause the interpretation of the gemspec to be only viable if it is located in the root directory of where the gem was expanded to. The Dir.glob / `git ls-files` part also only works when the gem is expanded onto the FS (which isn't the case when your app has been compiled into one mega-jar file in jruby) -- and the `git ls-files` part even requires that git be present on machines where the gem is *deployed*. That's a reasonable assumption for a development environment but not necessarily for a production environment.
I'm not opposed to using something *else* besides Jeweler -- my only objective is to make it so the .gemspec is a generated artifact and not a piece of meta-code. It's just too much of a PITA as soon as you're out of MRI+Bundler land.
> 8) I've overhauled code-loading quite a bit to be cleaner and more
> manageable, but at present there's a strong assumption of Bundler being
> involved. That needs to change. I just didn't have time to code the
> fallback path.
Since Rails requires Bundler now, I wouldn't feel too bad about that
assumption, but if it's possible to do without it, that would be best.
I'm mostly thinking about it in terms of Rails 2.x compatibility. I've been able to preserve 2.3 compatibility with my changes quite readily, with only a few specific little quirks...
I wonder if you can put some of the non-controversial changes into
separate branches and make pull requests for them, then we can keep
bickering about the rest :-)
That... would be a bit messy at this point. I mean, I can flip around the sorting default easily enough, and the yard/jeweler switch could also be surgically avoided, but things like the split to two binaries would require a fair bit more work, as would backing out the loading of the whole Rails env.
The output of:
git diff -M -C ef70622..78a5a3f --shortstat
Is:
200 files changed, 5064 insertions(+), 967 deletions(-)
... which is pretty significant, given that my branch has 205 files under version control, totaling 6,431 lines. Granted most of that is the five separate integration tests -- 178 files, 4,260 lines, but still -- it's a significant changeset and many of the changes necessarily touch the same parts of the code.
Also, since my last mail: I managed to refactor things a bit, and remove a fair chunk of now-dead code (another side-effect of load-the-rails-env-and-proceed-from-there) as well as adding support for multiple model directories, made --require work properly with annotate_routes and a few other fixes.
There's a couple broken specs related to your --trace option and the behavior when say, a model can't be annotated; I just need to look into what is meaningfully correct behavior here. In a couple cases I intentionally don't bother producing output (I.E. when a model file can't be found in the model-dir search path, it must be coming from an engine/plugin/gem of some sort), etc.
-JF