Here is my first effort. It focuses on the first batter to reach base; the first hit could be found instead by changing some of the query's criteria. It's written in MySQL, so it may not work on other database management systems. As you may note, it uses session variables to pick out the plate appearance in which the current batter reached base but no prior plate appearance in the game had resulted in a batter reaching base. It returns only the aggregate counts, not percentages; I usually pull the data into Excel pivot tables for further calculation.
It could use another set of eyes, because some of the outputs don't make any sense to me. For example, it says that there were 24 games in 1950 in which the fourth batter (or later) was the first to reach base, which seems impossible. I may have missed some conditions needed in the definition of the session variables (I had to add one for lead-off home runs, and there certainly could be others). I've only been working with Retrosheet for six weeks or so, and I admittedly have more to learn.
/* Program code */
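/* @prior and @current hold the running count, within a game, of plate appearances
   in which a batter reached base, before and after the current row.
   Note: this relies on `events` being read in game and event order. */
set @prior = 0;
set @current = 0;
select `season`, `BAT_HOME_ID`, `INN_CT`, `INN_PA_CT`, count(*) as `dist_ct`
from (
    select
        `GAME_ID`, cast(substr(`GAME_ID`, 4, 4) as unsigned) as `season`,
        `BAT_HOME_ID`, `INN_CT`, `INN_PA_CT`, `GAME_PA_CT`,
        `EVENT_CD`, `EVENT_TX`, `START_BASES_CD`, `END_BASES_CD`
    from (
        select
            `GAME_ID`, `BAT_HOME_ID`, `INN_CT`, `INN_PA_CT`, `GAME_PA_CT`,
            `EVENT_CD`, `EVENT_TX`, `START_BASES_CD`, `END_BASES_CD`
            /* count before this plate appearance; reset on the first event of a game */
            , @prior := if(`GAME_NEW_FL` = 'T', 0, @current) as `prior`
            /* count after this plate appearance. The first event of a game gets the same
               test as every other event; previously only a lead-off home run was counted
               there, so a lead-off single or walk was missed and a later batter was
               reported as the first to reach base. H_CD = 4 covers a bases-empty home
               run, which leaves END_BASES_CD at 0. */
            , @current := if(`GAME_NEW_FL` = 'T', 0, @current)
                + if(`START_BASES_CD` = 0 and (`END_BASES_CD` > 0 or `H_CD` = 4), 1, 0) as `current`
        from `events`
        where `BAT_EVENT_FL` = 'T'
    ) as `e`
    /* keep the one plate appearance per game that takes the count from 0 to 1 */
    where `prior` = 0 and `current` = 1
) as `sel`
group by
    `season`, `BAT_HOME_ID`, `INN_CT`, `INN_PA_CT`
;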
/* 100.9459 sec */
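For anyone on MySQL 8.0 or later, the same idea can be sketched with a window function instead of session variables. This is only a sketch under assumptions, not a tested replacement. It assumes the `events` table has an `EVENT_ID` column giving the order of events within a game (that column is not used in the query above), and the CTE names `pa` and `tally` and the columns `reached` and `reached_so_far` are made up for illustration. The appeal is that the row order is stated explicitly in the `ORDER BY` of the window, whereas MySQL does not guarantee the evaluation order of session variables assigned inside a `SELECT`, and assigning them that way is deprecated in MySQL 8.0.

```sql
/* Sketch for MySQL 8.0+. EVENT_ID (event order within a game) is an assumed column. */
WITH `pa` AS (
    SELECT
        `GAME_ID`, `EVENT_ID`, `BAT_HOME_ID`, `INN_CT`, `INN_PA_CT`,
        /* 1 if the batter reached base (or homered) with the bases empty, else 0 */
        IF(`START_BASES_CD` = 0 AND (`END_BASES_CD` > 0 OR `H_CD` = 4), 1, 0) AS `reached`
    FROM `events`
    WHERE `BAT_EVENT_FL` = 'T'
),
`tally` AS (
    SELECT
        `pa`.*,
        /* running count of such plate appearances within the game, in event order */
        SUM(`reached`) OVER (
            PARTITION BY `GAME_ID`
            ORDER BY `EVENT_ID`
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS `reached_so_far`
    FROM `pa`
)
SELECT
    CAST(SUBSTR(`GAME_ID`, 4, 4) AS UNSIGNED) AS `season`,
    `BAT_HOME_ID`, `INN_CT`, `INN_PA_CT`,
    COUNT(*) AS `dist_ct`
FROM `tally`
/* the plate appearance that takes the running count from 0 to 1 */
WHERE `reached` = 1 AND `reached_so_far` = 1
GROUP BY `season`, `BAT_HOME_ID`, `INN_CT`, `INN_PA_CT`;
```

Like the original, this finds the first batter to reach base in the game as a whole, whichever team was batting. Adding `BAT_HOME_ID` to the `PARTITION BY` should give the first batter to reach base for each team instead.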