Sorry for the slow response. I'm digging myself out of work right now, so my response time might be quite long.
I believe the key to your questions is understanding surfaces. If I'm misunderstanding your issue, please let me know.
You should think of surfaces like arrays. They're basically places where you can store (image) data temporarily, before moving that data elsewhere (e.g. the screen). As you noticed in your other email, there are different kinds of surfaces, which (potentially) live in different places: software surfaces are guaranteed to live in main memory, while hardware surfaces may live on the GPU (though note that they aren't guaranteed to, and in fact the GPU itself may be on the same die as the CPU).
As you have noticed, you don't *need* surfaces in a lot of cases, except the surface which represents the display itself (which is usually bound to *default-surface*). But you may still want to use surfaces anyway, for convenience, or potentially for performance. For example:
1. Some operations like sdl:load-image are very slow, because they need to load the image from disk, decode it, etc. You probably don't want to call sdl:load-image every time through your main loop, and you definitely don't want to call it for every single object at every time step. Instead, you would load the image once, save the result in a variable somewhere (you don't need a draw call because load-image already returns a surface), and use that surface instead of loading the image all over again.
2. For some composite operations, like mirroring, you would prefer not to do the work required to draw the scene multiple times (e.g. twice for a single mirror). So instead, you can make a new surface, draw everything to that surface once, and then blit that surface to the screen as many times as you want, potentially saving a factor of N in the draw time (depending on how expensive the original operation was).
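To make the first point concrete, here is a minimal sketch of the "load once, reuse the surface" idea in plain Common Lisp. The names *image-cache* and cached-image are made up for illustration and are not part of lispbuilder-sdl; the loader is passed in as a function so the sketch doesn't depend on any particular library call.

```lisp
;; Hypothetical helper, not part of lispbuilder-sdl: remember the surface
;; returned for each file name so the slow load happens only once.
(defvar *image-cache* (make-hash-table :test #'equal)
  "Maps a file name to the surface that was loaded from it.")

(defun cached-image (path loader)
  "Return the surface for PATH, calling LOADER only the first time PATH is seen."
  (multiple-value-bind (surface found) (gethash path *image-cache*)
    (if found
        surface
        ;; SETF of GETHASH returns the stored value, i.e. the new surface.
        (setf (gethash path *image-cache*) (funcall loader path)))))
```

In your main loop you would then call something like (cached-image "ship.png" #'sdl:load-image), where "ship.png" is just a placeholder file name, and draw the returned surface as usual. Only the first call for a given file name touches the disk. The same pattern works for the second point if the "loader" is a function that builds and returns your pre-drawn scene surface.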
Transparency should be automatic as long as every intermediate surface has transparency enabled. This means making sure load-image understands the transparency of the source image, and making sure that any other surfaces you draw through along the way have transparency enabled too. The screen itself doesn't need transparency, but it would probably help to make sure it uses the same format (bit depth) as your images, to avoid any conversions during blitting.
If you have any other questions, please let me know (and hopefully I won't be quite so abysmally slow at answering this time).