Hi Holly,
Your figure doesn't appear to have paths from the SNP to the internalizing (INT) and externalizing (EXT) factors, just to TUD. In such a model, the SNP effect on TUD is not incremental to its effects on INT and EXT (i.e., INT and EXT are not controlled for in the TUD ~ SNP regression). However, your code indicates that these paths are estimated. I always like to check the output to be sure that the parameters I think my code is estimating are indeed being estimated. But assuming that this is the case, and all estimates look sensible given what you know about the data (i.e., the model seems to have converged appropriately), then it does seem that you are doing what you intended.
You mention that you get a hit on chr 15 for TUD, which I assume is the Mr. Big hit for tobacco, and one on chr 4 for PAU, which I assume is alcohol dehydrogenase for alcohol. It's certainly possible that the substance-specific pathways are isolated to a few of these large-effect variants operating through well-understood core pathways, and that the vast majority of the remaining polygenic propensity for different substance use phenotypes operates through a mixture of internalizing and externalizing. It's also possible that after conditioning on EXT and INT, you simply don't have the power to produce a predictive PGI (i.e., there are substance-specific polygenic effects but your PGI is underpowered). Did you check the h2SNP of this direct path to TUD? That would give you a clue as to how much signal you are working with.
The other thing that seems a bit strange to me is that you say that you have done this for other disorders: PAU, CUD, TUD. However, your model with a direct effect on TUD only seems to allow for a cross-loading for TUD, but not for PAU and CUD. That suggests to me that you are changing the measurement model each time you estimate a direct effect on a different substance use phenotype, which redefines the factors each time. I would think that a sensible approach would be to settle on a measurement model first, and then add the SNP to the model and allow for direct effects (in fact, you can allow for multiple direct effects simultaneously; you just can't do it for all indicators at once). If you don't have an a priori sense of which cross-loadings to allow, you can use a "modification indices" style approach in which you fit the model with simple structure first and then inspect the residuals to determine for which substance use phenotypes a cross-loading on INT would be appropriate. Yavor Dragostinov took such an approach for his measurement model of the Big Five for the Schwaba et al. ReGPC preprint (see the supplement: 3.5 Genomic SEM Analyses Stratified by Measurement Instrument).
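To make the residual-inspection idea concrete, here is a rough sketch in Python rather than in the Genomic SEM code itself. Everything in it is an assumption for illustration: the indicator set (MDD and ANX as INT indicators), the loadings, the factor correlation, and the observed genetic correlations are made-up numbers, and the 0.1 threshold is arbitrary. In practice the observed and model-implied matrices would come from your fitted simple-structure model.

```python
# Illustrative only: trait names, loadings, and correlations are made-up values.

def implied_corr(load_a, load_b, factor_corr):
    """Model-implied correlation between two distinct indicators.

    load_a and load_b are (EXT loading, INT loading) pairs from a
    standardized two-factor model; factor_corr is the EXT-INT correlation.
    """
    ext_a, int_a = load_a
    ext_b, int_b = load_b
    return (ext_a * ext_b + int_a * int_b
            + factor_corr * (ext_a * int_b + int_a * ext_b))


def cross_loading_candidates(observed, loadings, factor_corr,
                             int_indicators, threshold=0.1):
    """Flag non-INT indicators whose mean residual with the INT indicators
    (observed minus implied) exceeds the threshold in absolute value."""
    candidates = {}
    for trait, load in loadings.items():
        if trait in int_indicators:
            continue
        residuals = [
            observed[(trait, ind)] - implied_corr(load, loadings[ind], factor_corr)
            for ind in int_indicators
        ]
        mean_resid = sum(residuals) / len(residuals)
        if abs(mean_resid) > threshold:
            candidates[trait] = round(mean_resid, 3)
    return candidates


# Simple structure: substance use traits load on EXT only, MDD/ANX on INT only.
loadings = {
    "PAU": (0.7, 0.0),
    "CUD": (0.8, 0.0),
    "TUD": (0.6, 0.0),
    "MDD": (0.0, 0.9),
    "ANX": (0.0, 0.8),
}
factor_corr = 0.4  # hypothetical EXT-INT factor correlation

# Hypothetical observed genetic correlations with the INT indicators.
observed = {
    ("PAU", "MDD"): 0.27, ("PAU", "ANX"): 0.22,
    ("CUD", "MDD"): 0.30, ("CUD", "ANX"): 0.25,
    ("TUD", "MDD"): 0.45, ("TUD", "ANX"): 0.40,
}

candidates = cross_loading_candidates(
    observed, loadings, factor_corr, int_indicators=("MDD", "ANX")
)
print(candidates)
```

With these invented numbers only TUD is flagged, because its observed correlations with the INT indicators are well above what the EXT-INT factor correlation alone implies. The point is that the cross-loadings get decided once, from the measurement model, before any SNP enters the model.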
All the best,
Elliot