Hello. This is my first time using SKAT, so I'm not sure how mathematically sound my analyses are. However, the analysis ran once I built the SetID file the way shown below. I grouped variants by pathway rather than gene by gene. Have you considered adding CHRs to the variant IDs? I'm including how I generated the SetID file and an example of its contents below; I hope it helps.
Code:

import hail as hl

def write_setid_locus(matrix_table, gene_sets_dict, output_file):
    """
    Write one line per variant for each gene set:
    set name -> chr:pos:ref:alt
    """
    rows = matrix_table.rows()
    with open(output_file, "w") as f:
        for set_name, gene_list in gene_sets_dict.items():
            gene_set = set(gene_list)
            # Keep only the variants annotated with a gene from this set.
            filtered = rows.filter(hl.literal(gene_set).contains(rows.gene_symbol))
            loci_tuples = filtered.aggregate(
                hl.agg.collect((filtered.locus.contig, filtered.locus.position,
                                filtered.alleles[0], filtered.alleles[1]))
            )
            for contig, pos, ref, alt in loci_tuples:
                # Skip variants with any missing field.
                if None not in (contig, pos, ref, alt):
                    f.write(f"{set_name} {contig}:{pos}:{ref}:{alt}\n")

# Each value is a list of gene symbols defined earlier.
gene_sets = {
    "VEGF": vegf,
    "TGF_BETA": tgf_beta,
    "Cytokine": Cytokine,
    "JAK/STAT": jak_stat,
    "T-Cell": t_cell,
    "Literature_Genes": lit_genes,
}

write_setid_locus(kegg_genes_mis, gene_sets, "plink/kegg_genes_mis.SetID")

Example SetID:

VEGF chr20:47651082:AGC:A
VEGF chr20:47651085:AGCAGCAACAGCAGCAG:A
VEGF chr20:47651122:A:AGCAG
TGF_BETA chr3:184388996:G:A
TGF_BETA chr5:80474703:C:G
TGF_BETA chr6:7727136:C:T
TGF_BETA chr6:26091105:G:A
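
Before passing a file like this to SKAT, it may be worth a quick sanity check. The sketch below is plain Python and not part of my original pipeline: `summarize_setid` is a hypothetical helper that assumes each line has two whitespace-separated columns (set name, variant ID) and that variant IDs follow the chr:pos:ref:alt layout shown above. It counts variants per set and reports malformed or duplicated lines.

```python
from collections import Counter


def summarize_setid(lines):
    """Count variants per set and collect problems found in SetID lines.

    Assumes two whitespace-separated columns per line and variant IDs
    of the form chr:pos:ref:alt (an assumption based on the example above).
    """
    counts = Counter()
    seen = set()
    problems = []
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 2:
            problems.append((line_no, "expected 2 columns"))
            continue
        set_name, variant_id = fields
        if len(variant_id.split(":")) != 4:
            problems.append((line_no, "expected chr:pos:ref:alt"))
            continue
        if (set_name, variant_id) in seen:
            problems.append((line_no, "duplicate entry"))
            continue
        seen.add((set_name, variant_id))
        counts[set_name] += 1
    return dict(counts), problems


example = [
    "VEGF chr20:47651082:AGC:A",
    "VEGF chr20:47651085:AGCAGCAACAGCAGCAG:A",
    "VEGF chr20:47651122:A:AGCAG",
    "TGF_BETA chr3:184388996:G:A",
    "TGF_BETA chr5:80474703:C:G",
    "TGF_BETA chr6:7727136:C:T",
    "TGF_BETA chr6:26091105:G:A",
]

counts, problems = summarize_setid(example)
print(counts)    # {'VEGF': 3, 'TGF_BETA': 4}
print(problems)  # []
```

To check a file on disk, the same function can be called on an open file handle, for example `summarize_setid(open("plink/kegg_genes_mis.SetID"))`. The variant IDs also need to match the IDs in the PLINK .bim file, which this check does not verify.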
On Thursday, July 17, 2025 at 17:44:47 UTC+3, wided boukhalfa wrote: