rack-contrib reorg

6 views
Skip to first unread message

Ryan Tomayko

unread,
Jan 12, 2009, 5:59:10 PM1/12/09
to rack-...@googlegroups.com
I've had a TODO for some time now to move the code in rack-contrib
around a bit to prevent potential name clashes with rack core. Right now
all of the source files for rack-contrib are named as 'rack/foo.rb' and
attach themselves to the Rack module as Rack::Foo.

At the very least, we need to move the files from 'rack/' into a
different directory to prevent require clashes and rubygems weirdness. e.g.,

lib/rack/foo.rb -> lib/rack/contrib/foo.rb

What I'm not sure about is whether we also want to move all of the
objects into a separate module once loaded. e.g.,

Rack::Foo -> Rack::Contrib::Foo

For some reason, I'd like to leave the classes under the Rack module but
I'm not sure that this accomplishes our goals. Would you guys mind
weighing in here with what makes sense to you so I can take the
temperature a bit?

Also, let me know if you have any local changes that should go in before
the big move takes place so that we can plan accordingly.

Thanks,
Ryan

Daniel Roethlisberger

unread,
Jan 12, 2009, 6:31:48 PM1/12/09
to rack-...@googlegroups.com
Ryan Tomayko <r...@tomayko.com> 2009-01-12:

> What I'm not sure about is whether we also want to move all of the
> objects into a separate module once loaded. e.g.,
>
> Rack::Foo -> Rack::Contrib::Foo

Personally, I think it would make sense to be able to move stuff
from contrib to rack proper (and back) without breaking existing
code.

--
Daniel Roethlisberger
http://daniel.roe.ch/

Ryan Tomayko

unread,
Jan 12, 2009, 6:46:53 PM1/12/09
to rack-...@googlegroups.com
On 1/12/09 3:31 PM, Daniel Roethlisberger wrote:
> Ryan Tomayko <r...@tomayko.com> 2009-01-12:
>> What I'm not sure about is whether we also want to move all of the
>> objects into a separate module once loaded. e.g.,
>>
>> Rack::Foo -> Rack::Contrib::Foo
>
> Personally, I think it would make sense to be able to move stuff
> from contrib to rack proper (and back) without breaking existing
> code.

Right. I think that's what makes me want to leave it under Rack as well.
rack/contrib.rb adds autoloads under the Rack module for all
rack-contrib components so it's a nice little compatibility story: you
require 'rack/contrib' and use the autoloads to bring in the other
modules. Later, when things get moved from rack-contrib into rack (or
vise-versa), your code should continue to work as long as you don't
require 'rack/contrib/foo' directly.

Is that too much magic? I can't see a lot of benefit in forcing an
explicit project reference here.

Thanks,
Ryan

Nicolás Sanguinetti

unread,
Jan 12, 2009, 7:24:20 PM1/12/09
to rack-...@googlegroups.com

No, I think that's the right approach.

-foca

Yehuda Katz

unread,
Jan 12, 2009, 8:02:31 PM1/12/09
to rack-...@googlegroups.com
What about the autoload threadsafety issues?

-- Yehuda
--
Yehuda Katz
Developer | Engine Yard
(ph) 718.877.1325

Ryan Tomayko

unread,
Jan 12, 2009, 8:13:32 PM1/12/09
to rack-...@googlegroups.com
On 1/12/09 5:02 PM, Yehuda Katz wrote:
> What about the autoload threadsafety issues?

We talked about this on IRC the other day and were unable to come up
with a real scenario where it would cause an actual issue. None of the
rack-contrib classes would be first referenced within the context of a
request. They should all be first referenced from the "main thread"
since they're required to construct the application.

We should probably discuss this more broadly, though. Both rack and
rack-contrib use autoload for everything. I'm not familiar enough with
the threading issues to say strongly whether that should change but my
impression is that 1.) an actual problem here would be extremely rare,
and 2.) it's fairly simple for people that do run into an issue to work
around it by using explicit requires in a main thread or synchronized block.

Thanks,
Ryan

Michael Fellinger

unread,
Jan 12, 2009, 8:41:52 PM1/12/09
to rack-...@googlegroups.com

I haven't actually been able to reproduce the problem while using
Rack, but ran into it in another place.
Maybe doing something like this could fix it?

http://paste.rubyists.com/pastes/view/15346

^ manveru

Joshua Peek

unread,
Jan 12, 2009, 9:27:04 PM1/12/09
to rack-...@googlegroups.com
On Mon, Jan 12, 2009 at 7:02 PM, Yehuda Katz <wyc...@gmail.com> wrote:
> What about the autoload threadsafety issues?

Not a real issue since middleware classes are referenced when your
application is initialized (before any requests threads are spawned).

Though it could be a problem if your application is dynamically
rebuilding itself on the fly ;)

+1 for keeping the autoload approach.


--
Joshua Peek

Ryan Tomayko

unread,
Jan 12, 2009, 9:49:27 PM1/12/09
to rack-...@googlegroups.com

I went ahead and pushed the change that moved the source files:

http://github.com/rack/rack-contrib/commit/c55a655

This will break code that required a rack-contrib module directly.

We can deal with the autoload issue separately. manveru was going to try
to get a test case together that illustrated the problem in Rack (with
Rack::Utils being referenced within the request context). I don't think
rack-contrib will be impacted but we may want to use the same
conventions in both places for parity.

Thanks,
Ryan

James Tucker

unread,
Jan 13, 2009, 8:41:46 AM1/13/09
to rack-...@googlegroups.com
On 13 Jan 2009, at 01:02, Yehuda Katz wrote:

What about the autoload threadsafety issues?


That's good enough, because, there are bigger issues with being threaded at load time.

In fact, there are issues with all kinds of stuff in ruby's core for multithreaded apps.

Example (from the #eventmachine channel, where we're helping user 'paddor' with a heavily threaded (on MRI) run on a net-telnet based system for configuring routers):

16:40 +paddor: raggi: Is it possible that non-hanging read() syscalls also get interrupted by the SIGUSR1? Maybe this is my new problem.
16:41 +raggi: paddor: they will get interrupted, but that's not unusual exactly, the ruby scheduler does this to itself regularly
16:41 +raggi: paddor: you could still read the template into memory once, and run #result multiple times
16:41 +paddor: raggi: this is what i do
16:41 +raggi: yeah, so hmm, check which file descriptor the hang is happening on
16:42 +raggi: and match it up to a file in lsof
16:42 +raggi: is it definately ERB?
16:43 +paddor: Yes, line 643 or something.. The exception is coming from using split() or a method of StringScanner.
16:44 +paddor: How many times does the ruby scheduler do this too? Because some minutes ago (when I was at work) my script was hanging for a long time and suddenly continued 
               with the ensure blocks (like I pressed ctrl+c).
16:51 mode/#eventmachine +v alloy by ChanServ
16:57 +raggi: oh, heh, stringscanner is an extension iirc
16:57 +raggi: interesting
16:58 +Aria: Yep

Essentially, what I'm saying is, that you've gotta be pretty mad-crazy to do multithreading a load time, you've gotta at least have crazy hair to use threading in MRI under heavy / complex IO conditions (hammering select hard). More than that, you're destined to hit race conditions in ruby-core that no one has fixed, and often they haven't even seen yet if you go down that route. On top of that, there's rarely much real gain from doing so, and it certainly rarely provides an actual stable performance profile (on MRI). 

Nonetheless, the solution should be to use something like my preload example, possibly including a Thread.critical or similar, to stop the scheduler during the loads (not that you get this with require, but hey, it's not my code).

It should also be keenly noted that the autoload problem is not unique to autoload. If you're doing require across multiple threads in parallel, and you have a complex dependency graph, you stand a chance of causing the *exact* same class of race conditions regardless of the lack of use of autoload.

Application developers of threaded code in ruby *really* (more than ever/elsewhere) need to be reading through the libraries they are using. Many ruby libraries do things like this, which is not thread safe at load time:

class Foo
  class <<self; attr_accessor :bar; end
  @bar = 10
end

With so much code around that resembles this style, I really can't say that anyone with sane code should be running into issues. More than that, I suspect that threading loading code is likely to contain way more race conditions than runtime threading of specific actor/reactor / IO loops / request - response style constructs.

Ryan also raised an obvious point about this, in that if you ever use Builder or a middleware chain, you'll always hit the constants before you actually run the app (for middleware / appware stuff). There are potentially a few exceptions, such as Rack::Utils.

But...

Rack::Utils does not contain thread-unsafe class definition code (apart from the potential warnings that will be raised as constants are redefined ;-P) - and that is the key to be aware of. The other key that application developers need to be aware of (and as I noted, this exists in require too, and has nothing to do with autoload) is that monkey patching at 'load time' a constant like this, could be damaged by a load in a separate thread. Load time threading - that way madness lies.

One could say that multithreaded loading in an open class environment is quite dangerous, and as always, threaded environment should be carefully checked over by application authors, properly.

Trying to tender to people by saying "we use require, we're thread safe" is a falsehood and detracts from the fact that we should be telling developers "ooh, threads, be methodical, even experts get it wrong, multiple times."

$0.02

On Mon, Jan 12, 2009 at 4:24 PM, Nicolás Sanguinetti <god...@gmail.com> wrote:

On Mon, Jan 12, 2009 at 9:46 PM, Ryan Tomayko <r...@tomayko.com> wrote:

>
> Is that too much magic? I can't see a lot of benefit in forcing an
> explicit project reference here.

I don't think so. I run mixed environments all the time, sometimes runtime threads, sometimes no threads. I do always, regardless of environment, love having autoload there. It keeps my heap clean and shortens my load time, for tests, for apps. Realistically, this thing saves me time during my working and out of work day, every day, and I've never lost time to it through bugs or it's apparent magic properties.

No, I think that's the right approach.

+1

Matt Todd

unread,
Jan 13, 2009, 9:13:44 AM1/13/09
to rack-...@googlegroups.com
Perhaps we could move middleware into Rack::Middleware, but non-middleware or -end-point contributions would remain in Rack?

Yehuda Katz

unread,
Jan 13, 2009, 11:55:40 AM1/13/09
to rack-...@googlegroups.com
The difference between require and autoload is that require is triggered by the "user" on purpose, and is almost always triggered at boot-time. Autoload is triggered by the *interpreter*, and it's impossible to mutex the require that is caused by autoload, even if you would want to.

The important thing is to make SURE that nothing is being autoloaded that will be hit the first time at runtime. I suppose the preload solution works, but we still need to know which autoloads won't be triggered until runtime, and make sure to include them in the preloader.

This situation is more complicated than it looks at first glance, and asking all users of Rack to understand how it works so that they can make the appropriate decision is unfair burden-shifting onto many other users of Rack.

-- Yehuda

James Tucker

unread,
Jan 13, 2009, 1:04:46 PM1/13/09
to rack-...@googlegroups.com

On 13 Jan 2009, at 16:55, Yehuda Katz wrote:

> The difference between require and autoload is that require is
> triggered by the "user" on purpose, and is almost always triggered
> at boot-time.

A quick grep for spaces in front of require in very commonly used ruby
projects can reveal interesting contradictions.

> Autoload is triggered by the *interpreter*, and it's impossible to
> mutex the require that is caused by autoload, even if you would want
> to.

Mutexes are not a solution to thread safety. They are just a tool.
Yes, you can't protect autoload, but if you were that worried about a
mutex, you could re-write it on the ruby side. That wasn't accepted
because of the deadlock potentials. Those exist in require, too.

> The important thing is to make SURE that nothing is being autoloaded
> that will be hit the first time at runtime.

Nope.

The important thing is to make sure that code running in potentially
multiple threads does not raise race conditions.

This is one of those areas where you get common statements not holding
to the true state of the world, such as "survival of the fittest",
which is wrong, and should be in fact "unsurvival of the unfittest".

Yes - this does mean you could autoload twice, but you can require
twice with late requires, too.

This remains ok until assignment is not atomic. When that happens,
we've got a whole other world of crap, and again, require doesn't help
unless you eliminate all of what I mentioned at the top.

> I suppose the preload solution works, but we still need to know
> which autoloads won't be triggered until runtime, and make sure to
> include them in the preloader.

No, you *need* to add unsafe loads to the preloader, and that's it. If
you want to be 'extensively safe', don't use threads. I'm dead serious
about that.

> This situation is more complicated than it looks at first glance,
> and asking all users of Rack to understand how it works so that they
> can make the appropriate decision is unfair burden-shifting onto
> many other users of Rack.

So you're suggesting users run around with burning scissors and their
eyes shut? (wrt Threading without understanding Threading impacts).

Like I explained, this problem is in no way unique to autoload.

The critical thing, is to not write runtime code in classes that are
to be late loaded or late evaluated, or, pre-run that code. That rule
needs to be understood by ruby developers who write in a threaded
environment, and has nothing to do (in reality) with the specific
loading mechanism. As I've already said, hiding issues by taking out
the obvious hits, doesn't make them magically disappear.

Yehuda Katz

unread,
Jan 13, 2009, 1:28:43 PM1/13/09
to rack-...@googlegroups.com
On Tue, Jan 13, 2009 at 10:04 AM, James Tucker <jftu...@gmail.com> wrote:


On 13 Jan 2009, at 16:55, Yehuda Katz wrote:

The difference between require and autoload is that require is triggered by the "user" on purpose, and is almost always triggered at boot-time.

A quick grep for spaces in front of require in very commonly used ruby projects can reveal interesting contradictions.


Autoload is triggered by the *interpreter*, and it's impossible to mutex the require that is caused by autoload, even if you would want to.

Mutexes are not a solution to thread safety. They are just a tool. Yes, you can't protect autoload, but if you were that worried about a mutex, you could re-write it on the ruby side. That wasn't accepted because of the deadlock potentials. Those exist in require, too.


The important thing is to make SURE that nothing is being autoloaded that will be hit the first time at runtime.

Nope.

The important thing is to make sure that code running in potentially multiple threads does not raise race conditions.

This is one of those areas where you get common statements not holding to the true state of the world, such as "survival of the fittest", which is wrong, and should be in fact "unsurvival of the unfittest".

Yes - this does mean you could autoload twice, but you can require twice with late requires, too.

This remains ok until assignment is not atomic. When that happens, we've got a whole other world of crap, and again, require doesn't help unless you eliminate all of what I mentioned at the top.

It's easy for the providing library to do the requires at the right time. When the providing library (in this case rack), relies upon autoload to require the files at some unspecified point in the future, it shifts the burden of knowing about what things are autoloaded and when they get triggered to the end-user, who then has to do his/her own requires to get around this problem.

The underlying problem with autoload is that the current behavior is to remove the "autoload needed" flag from the constant before the file is required, which means that another thread trying to trigger the same constant before the file has been loaded produces a NameError. This is definitely unexpected and is not something that users of Rack are likely to be able to figure out at first glance.

Using autoload for things that will inevitably get triggered during bootup: fine. Using autoload for things that will get triggered the first time at runtime (possibly concurrently): not fine.
 


I suppose the preload solution works, but we still need to know which autoloads won't be triggered until runtime, and make sure to include them in the preloader.

No, you *need* to add unsafe loads to the preloader, and that's it. If you want to be 'extensively safe', don't use threads. I'm dead serious about that.


This situation is more complicated than it looks at first glance, and asking all users of Rack to understand how it works so that they can make the appropriate decision is unfair burden-shifting onto many other users of Rack.

So you're suggesting users run around with burning scissors and their eyes shut? (wrt Threading without understanding Threading impacts).

Like I explained, this problem is in no way unique to autoload.

The critical thing, is to not write runtime code in classes that are to be late loaded or late evaluated, or, pre-run that code. That rule needs to be understood by ruby developers who write in a threaded environment, and has nothing to do (in reality) with the specific loading mechanism. As I've already said, hiding issues by taking out the obvious hits, doesn't make them magically disappear.

See above.

James Tucker

unread,
Jan 13, 2009, 1:42:24 PM1/13/09
to rack-...@googlegroups.com

On 13 Jan 2009, at 18:28, Yehuda Katz wrote:
> The underlying problem with autoload is that the current behavior is
> to remove the "autoload needed" flag from the constant before the
> file is required, which means that another thread trying to trigger
> the same constant before the file has been loaded produces a
> NameError. This is definitely unexpected and is not something that
> users of Rack are likely to be able to figure out at first glance.

Agreed.

Charles Oliver Nutter

unread,
Jan 13, 2009, 1:50:02 PM1/13/09
to Rack Development
On Jan 13, 12:28 pm, "Yehuda Katz" <wyc...@gmail.com> wrote:
> On Tue, Jan 13, 2009 at 10:04 AM, James Tucker <jftuc...@gmail.com> wrote:
> > A quick grep for spaces in front of require in very commonly used ruby
> > projects can reveal interesting contradictions.

That ideally shouldn't be done either, but at least under 1.9
concurrent requires will block. I think there's still some question
about the correct way to handle it, though. All 1.8 versions will
either proceed with execution as if the require were complete or re-
require the file in parallel. Both are issues.

> > Mutexes are not a solution to thread safety. They are just a tool. Yes, you
> > can't protect autoload, but if you were that worried about a mutex, you
> > could re-write it on the ruby side. That wasn't accepted because of the
> > deadlock potentials. Those exist in require, too.

autoload is different from require in that it immediately mutates
state that changes the behavior of other threads. It is not, by any
definition of the word, threadsafe...and it can't easily be made
threadsafe.

> > Yes - this does mean you could autoload twice, but you can require twice
> > with late requires, too.

Incorrect. Autoload may fire twice, or the second thread may see an
undefined constant. The latter case results in either an error (const
missing) or the lookup returns an incorrect result from a higher
scope. Both cases are irreparably broken, which means autoloads must
never be used in a situation where there may be threads concurrently
hitting the autoloaded constant in question.

Yes, you can require users to explicitly load all libraries they will
need if they know they'll be running multi-threaded. Unfortunately
it's not always easy to know that you're running multithreaded, since
some libraries spin up threads or execute tasks in parallel. Even net/
* libraries use timeout.rb which spins up a thread. So in general any
application might use multiple threads since Ruby supports multiple
threads and library authors use them. And because of that, you also
expect users to know about the threading characteristics of all
libraries they're running. That sounds unreasonable, right? Expecting
them to pre-load libraries assumes they know the threading
characteristics of Rack.

There is one way you can get around this...define your own mutexed
service loader that lazily requires in the files. Essentially,
implement your own autoloader that handles threading concerns
correctly. Something like this:

def get_service
CONST ||= load_service 'something_big'
end

def load_service(service)
begin
mutex.lock
require service
ensure
mutex.unlock
end
end

Obviously the trick here is ensuring there aren't a whole bunch of
load_service mutexes, since that can lead to a deadlock. Also, the
method + const is slower than just a const lookup, but not by a lot
(and by a whole lot less in JRuby than in MRI). So it needs to be
fleshed out, but I don't think we're going to see C Ruby fix autoload
any time soon.

- Charlie

Pratik

unread,
Jan 13, 2009, 1:51:53 PM1/13/09
to rack-...@googlegroups.com
Just use https://gist.github.com/c4f441f48458e0f15ef7 and get it over with.

--
Cheers!
- Pratik
http://m.onkey.org

James Tucker

unread,
Jan 13, 2009, 2:01:52 PM1/13/09
to rack-...@googlegroups.com

On 13 Jan 2009, at 18:50, Charles Oliver Nutter wrote:
>
>>> Yes - this does mean you could autoload twice, but you can require
>>> twice
>>> with late requires, too.
>
> Incorrect. Autoload may fire twice, or the second thread may see an
> undefined constant. The latter case results in either an error (const
> missing) or the lookup returns an incorrect result from a higher
> scope. Both cases are irreparably broken, which means autoloads must
> never be used in a situation where there may be threads concurrently
> hitting the autoloaded constant in question.

class A
def foo
end
include MakeABlankSlate
end

Thread.new { require 'a' }
Thread.new { require '../a/a.rb' }

Is A#foo defined or not, for how long, when and why?

Mixing load and run times in a reflective environment can be
dangerous, with or without immediately broken paradigms.

Yehuda Katz

unread,
Jan 13, 2009, 2:29:58 PM1/13/09
to rack-...@googlegroups.com
Agreed. But require is usually done at boot-time, when the library is required. autoload, on the other hand, triggers at an unspecified future time, often at runtime.

-- Yehuda

Charles Oliver Nutter

unread,
Jan 13, 2009, 2:33:49 PM1/13/09
to Rack Development
On Jan 13, 1:01 pm, James Tucker <jftuc...@gmail.com> wrote:
> > Incorrect. Autoload may fire twice, or the second thread may see an
> > undefined constant. The latter case results in either an error (const
> > missing) or the lookup returns an incorrect result from a higher
> > scope. Both cases are irreparably broken, which means autoloads must
> > never be used in a situation where there may be threads concurrently
> > hitting the autoloaded constant in question.
>
> class A
>    def foo
>    end
>    include MakeABlankSlate
> end
>
> Thread.new { require 'a' }
> Thread.new { require '../a/a.rb' }
>
> Is A#foo defined or not, for how long, when and why?
>
> Mixing load and run times in a reflective environment can be  
> dangerous, with or without immediately broken paradigms.

This is a different case, since you're requiring the resource under
two different names in two different requires. An autoload is always
the same name in the same position.

Under 1.9 the above code would never see A#foo because the concurrent
requires would block. I'm not sure if this is specified behavior yet.

Under 1.8, the above could might see A#foo defined or might not, and
there's no way to predict that.

But again, totally different case from autoload.

- Charlie

Joshua Peek

unread,
Jan 13, 2009, 3:18:22 PM1/13/09
to rack-...@googlegroups.com
Reply all
Reply to author
Forward
0 new messages