Solar geoengineering via stratospheric aerosol injection (SAI) poses an optimization problem: how exactly should aerosol be deployed to maximize its benefits while minimizing undesirable side effects, such as shifts in rainfall patterns? Previous work explored this problem using linear feedback-control algorithms. Here we investigate an alternative approach that also naturally incorporates feedback. We let a reinforcement learning (RL) algorithm control the stratospheric aerosol in an idealized global climate model (GCM). Within several dozen GCM simulations, RL learns to produce stable and plausible strategies. RL also learns the “kicking the can down the road” effect identified in recent studies, in which warming can be reversed more rapidly by varying the aerosol mass over time; we further explain this effect using a simple energy-balance model. Our results provide a first proof of concept that RL can identify promising SAI strategies.
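The "kicking the can down the road" effect can be illustrated with a zero-dimensional energy-balance model, $C\,dT/dt = -\lambda T + F(t)$. The sketch below is an illustrative assumption on our part, not the paper's model: all parameter values and forcing schedules are invented for demonstration. It compares a constant-mass strategy (no extra forcing; the anomaly decays on the timescale $C/\lambda$) against a strategy that over-injects aerosol early and then relaxes, which crosses a given temperature threshold sooner.

```python
import numpy as np

# Zero-dimensional energy-balance model (illustrative sketch):
#   C dT/dt = -lam * T + F(t)
# T: temperature anomaly relative to the SAI target (K)
# lam: climate feedback parameter (W m^-2 K^-1)  [assumed value]
# C:   effective heat capacity (W yr m^-2 K^-1)  [assumed value]
# F:   additional time-varying aerosol forcing (W m^-2)
lam, C, dt = 1.0, 8.0, 0.1          # dt in years
steps = int(30 / dt)

def integrate(forcing):
    """Euler-integrate T(t) from an initial 1 K anomaly under forcing(t)."""
    T, traj = 1.0, []
    for k in range(steps):
        T += dt * (-lam * T + forcing(k * dt)) / C
        traj.append(T)
    return np.array(traj)

# Constant-mass strategy: no additional forcing; T relaxes with timescale C/lam.
const = integrate(lambda t: 0.0)

# "Kick the can": extra negative (cooling) forcing for the first 5 years,
# then relax back to the constant-mass level.
overshoot = integrate(lambda t: -0.5 if t < 5.0 else 0.0)

# Time for the anomaly to fall below 0.5 K under each strategy.
t_half_const = dt * np.argmax(const < 0.5)
t_half_over = dt * np.argmax(overshoot < 0.5)
print(t_half_const, t_half_over)   # the overshoot strategy cools faster
```

Analytically, the constant strategy crosses 0.5 K at $t = (C/\lambda)\ln 2 \approx 5.5$ yr, while the overshoot strategy crosses at $t = (C/\lambda)\ln 1.5 \approx 3.2$ yr, consistent with the claim that varying the aerosol mass over time reverses warming more rapidly.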