Hi Zhehao,
I believe the FP3001 auxiliary BNC is intended for TTL triggering of frames or for outputting a sync signal; I'm not aware that you can use it as an analog input port. However, going back to your original idea of synchronising the audio in the computer, latencies might be on the order of 10 ms, which might be good enough for photometry signals.
You can use the AudioCapture source to grab audio data from either a microphone or the line-in input of the sound card, and then combine it with the photometry data using CombineLatest or WithLatestFrom. You would need to decide in which format you would like to see the data synchronised: either mark the audio stimulus data with the photometry data, or do the opposite and label each photometry frame with whether a stimulus onset occurred in that frame.
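To make the two options concrete, here is a minimal offline sketch in Python. It is not a Bonsai workflow, only an illustration of the pairing logic, and it assumes you already have timestamps for the photometry frames and for the detected stimulus onsets on a common clock. The function names and the example timestamps are hypothetical.

```python
from bisect import bisect_right


def label_frames_with_onsets(frame_times, onset_times):
    """Option 1: one boolean per photometry frame.

    A frame is labelled True if at least one stimulus onset falls between
    its own timestamp and the timestamp of the next frame. Onsets that
    occur before the first frame are ignored.
    """
    labels = [False] * len(frame_times)
    for onset in onset_times:
        # Index of the last frame that started at or before this onset.
        i = bisect_right(frame_times, onset) - 1
        if i >= 0:
            labels[i] = True
    return labels


def latest_frame_for_onsets(frame_times, onset_times):
    """Option 2: pair each onset with the most recent photometry frame.

    This roughly mirrors what WithLatestFrom does when the onsets are the
    driving sequence: onsets arriving before any frame are dropped.
    """
    pairs = []
    for onset in onset_times:
        i = bisect_right(frame_times, onset) - 1
        if i >= 0:
            pairs.append((onset, i))
    return pairs


# Hypothetical data: frames every 50 ms, two stimulus onsets (seconds).
frame_times = [0.00, 0.05, 0.10, 0.15, 0.20]
onset_times = [0.07, 0.16]

print(label_frames_with_onsets(frame_times, onset_times))
print(latest_frame_for_onsets(frame_times, onset_times))
```

The first function gives you a table with one row per photometry frame, which is usually easier to analyse alongside the fluorescence trace. The second gives you one row per stimulus, which is handier if you mainly want to align trials.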