Changsoo,
Not only was Zap enjoying his weekend, he was asleep.
Now, the test as it is currently set up is pretty unfair.
The Scanline renderer calling a max legacy Bitmap is almost - but not quite - as simple as a function call that looks up a single in-memory piece of data. It's not quite that simple (there's some trilinear-ish sampling going on - what max calls "Pyramid" filtering), but that's roughly it.
Calling the OSL bitmap from Scanline means that a Scanline "ShadeContext" class has to be rearranged into an OSL "ShaderGlobals" class. It's not a giant amount of work, but it's also non-zero.
Then, the OSL bitmap calls OIIO which, as Larry correctly pointed out, gives you - through a clever layer of demand-loading cache magic - a beautiful bicubically interpolated pixel. Which is definitely also more work.
I'm guessing the reason it's closer in Arnold is that Arnold is doing the Arnold OSL thing (not using the OSL in max itself), which is probably fairly optimal (the shading language was originally written with the SPI version of Arnold in mind). It probably has little to nothing to do to re-juggle the Arnold "AtShaderGlobals" into OSL "ShaderGlobals" (as the name alone would suggest). Or maybe Arnold uses different filtering as a default - I actually don't know...
Here's a fairer test I would suggest. Since the overhead I mention above is the cost of "entering the world of OSL", and happens only once for an entire shade tree, you can try something like this:
Take two max Bitmaps and put them through, say, an RGB Multiply. Copy this entire shade tree, and feed the outputs of both into another Multiply, giving you the multiplied value of all four textures.
Right-click the rightmost node, and use "Render Map". Set some high resolution so you can get some good timings. Measure this time.
Now set up the equivalent OSL tree doing the same thing. Compare the timings.
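(For reference, the math that tree computes boils down to something like the minimal hand-written OSL sketch below. The file names are placeholders, and it's not literally what max compiles from the nodes, but it does the same four lookups and three multiplies:)

shader FourTexMultiply
(
    string FileA = "texA.png",
    string FileB = "texB.png",
    string FileC = "texC.png",
    string FileD = "texD.png",
    output color Col = 0
)
{
    // Four texture lookups at the standard UVs...
    color a = texture (FileA, u, v);
    color b = texture (FileB, u, v);
    color c = texture (FileC, u, v);
    color d = texture (FileD, u, v);
    // ...combined with the same structure as the node tree:
    // two Multiplies feeding a third.
    Col = (a * b) * (c * d);
}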
Also, you can edit the OSL texture shader and change the texture lookup line so it reads:
Col = texture(Filename, ulookup, vlookup, "wrap", WrapMode, "alpha", A, "interp", "linear");
This will test without the default bicubic filtering.
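And if you want to see the absolute floor, "closest" gives you raw nearest-neighbor lookups (the other valid "interp" values being "cubic" and "smartcubic" - the latter being the default, as far as I recall, which is why you get bicubic filtering to begin with):

Col = texture(Filename, ulookup, vlookup, "wrap", WrapMode, "alpha", A, "interp", "closest");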
The idea with max's OSL support is that if your renderer supports OSL, use your renderer's OSL (like Arnold is doing), which potentially allows some kind of magical optimizations.
The OSL that any renderer can execute via EvalColor using the classic C++ shader API also works fine - but the small-but-nonzero overhead of turning a ShadeContext into ShaderGlobals will be there. That overhead happens only once for an entire shade tree, though (if the whole tree is OSL). In practice, for the real-world shading of a real-world scene, it's quite negligible.
/Z