Just a few observations on reducing the size ...
One fairly obvious thing would be to in-line the subroutines. Each call (JMS) costs one word, each subroutine entry point costs one word, and each return (JMP I) costs one word. For the LEFT and RIGHT subroutines, that could save up to six words on its own.
I would need to look at things more closely, but it also seems that if you sequenced the instructions more carefully, you might be able to avoid several save-retrieve pairs. For example, locations 3635-3640 retrieve the old null AC + Link, only for the LEFT and RIGHT subroutines to save and restore the AC again. If you could do what you need to do in the LEFT and RIGHT subroutines before you restore the null AC + Link, you could save four more words by avoiding those two DCA TEMP and TAD TEMP pairs.
Up to six words from the subroutine calls plus another four from the DCA TEMP + TAD TEMP pairs comes to as many as ten (decimal) locations saved.
One last comment: if you're willing to live with a different pattern, the logic could be even simpler. For example, in RSX-11/M the idle pattern is a group of 3-4 lights that always rotate in a left-going direction. It takes fewer instructions to always go left than to switch back and forth.
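To show why the always-left version is simpler, here is a rough Python model (not PDP-8 code, and not taken from the original program or from RSX-11/M). It assumes the lights are the 12-bit AC plus the Link treated as a 13th light, and that each step is a single RAL. The starting value 0007 (three lights) and the function names are made up for illustration.

```python
MASK13 = 0o17777  # Link (1 bit) followed by the 12-bit AC


def ral(link_ac):
    """Model of PDP-8 RAL: rotate the 13-bit Link:AC value left by one bit.

    The top AC bit moves into the Link, and the old Link wraps around
    into the low-order AC bit.
    """
    return ((link_ac << 1) | (link_ac >> 12)) & MASK13


def rotate_left_frames(start, steps):
    """Return the successive Link:AC values when every step is one RAL.

    `start` is a hypothetical initial light pattern; no direction flag
    and no end-of-travel test is needed, which is the point of the
    always-left pattern.
    """
    frames = []
    value = start & MASK13
    for _ in range(steps):
        frames.append(value)
        value = ral(value)
    return frames


if __name__ == "__main__":
    # Print one full cycle: '*' is a lit lamp, '.' is a dark one.
    for frame in rotate_left_frames(0o0007, 13):
        bits = format(frame, "013b")
        print(bits.replace("0", ".").replace("1", "*"))
```

In this model the pattern repeats every 13 steps with no state beyond the Link and AC themselves. A back-and-forth pattern would also need something to remember the current direction and a test for each end, which is where the extra instructions go.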
-- steve