Sleuth workflow for group factors

42 views
Skip to first unread message

Asan Emirsale

unread,
Jan 24, 2024, 6:39:44 AMJan 24
to kallisto and applications
Hello!
The aim of the study to assess the change of expression between several genotypes of the same species under the same experimental conditions. There are two questions regarding this task:
1) how should (a) the matrix be designed and (b) the model fitted?
2) it is clear that in the absent of control, the design matrix should be re-leveled each time the first factor changes to fit model against new "reference" base.  (a) Should the sleuth object be prepared every time base level changes? (b) Would the "reference" role of the genotype affect the test result ("A(reference) vs B", or "B(reference) vs A")?

I tried to implement this in code, following the maximalistic approach in comparing all possible pairs (i.e. "A_vs_B", "B_vs_A" etc.).

```R
metadata <- dplyr::select(metadata, c('lib', 'sample'))
metadata$sample <- as.character(metadata$sample)

head(metadata, n = 18)
```

|  | lib | sample |
| ---- | ---- | ---- |
|  | \<chr\> | \<chr\> |
| 1 | Q5 | A |
| 2 | Q8 | A |
| 3 | Q10 | A |
| 4 | Q1 | B |
| 5 | Q4 | B |
| 6 | Q7 | B |
| 7 | Q12 | C |
| 8 | Q14 | C |
| 9 | Q16 | C |
| 10 | Q18 | D |
| 11 | Q3 | D |
| 12 | Q6 | D |
| 13 | Q9 | E |
| 14 | Q11 | E |
| 15 | Q13 | E |
| 16 | Q15 | F |
| 17 | Q17 | F |
| 18 | Q2 | F |

```R
genotypes <- unique(metadata$sample)



# preparing sleuth object
so <- sleuth_prep(s2c, num_cores = 12, extra_bootstrap_summary = TRUE, read_bootstrap_tpm = TRUE)

for (idx in 1:length(genotypes)) {
  message <- paste("Starting loop", idx, "with sample", genotypes[idx], sep = " ")
  print(message)

  # relevel matrix
  new_s2c <- so$sample_to_covariates
  new_s2c$sample <- as.factor(new_s2c$sample)

  new_s2c$sample <- relevel(as.factor(new_s2c$sample), ref = genotypes[idx])

  md <- model.matrix(~new_s2c$sample, new_s2c)
  colnames(md)[2] <- paste('sample', genotypes[idx], sep = "")

  # fit new model with the reference changed
  new_fit_name <- paste("full", genotypes[idx], sep = "_")
  so <- sleuth_fit(so, ~sample, fit_name = new_fit_name)

  # redo LRT
  so <- sleuth_lrt(so, 'reduced', new_fit_name)

  for (bidx in 1:length(genotypes)) {
    if (idx != bidx) {
    comp <- paste(genotypes[idx], genotypes[bidx], sep = "_vs_")
    print(comp)
    # Perform Wald Test
    new_cond <- paste("sample", genotypes[bidx], sep = "")
    sleuth_wt(so, new_cond, which_model = new_fit_name)
    }
  }

  so_label <- paste("so", "ref", genotypes[idx], sep = "_")
  so_label <- paste(so_label, ".sleuth", sep = "")
  sleuth_save(so, file = so_label)
  models(so)
  tests(so)
}
```
Best regards
Asan

Asan Emirsale

unread,
Jan 24, 2024, 6:45:41 AMJan 24
to kallisto and applications
P.S. sorry for Obsidian markdown artifacts.
Here is more friendly representation

lib sample

<chr> <chr>
1 Q5 A
2 Q8 A
3 Q10 A
4 Q1 B
5 Q4 B
6 Q7 B
7 Q12 C
8 Q14 C
9 Q16 C
10 Q18 D
11 Q3 D
12 Q6 D
13 Q9 E
14 Q11 E
15 Q13 E
16 Q15 F
17 Q17 F
18 Q2 F
среда, 24 января 2024 г. в 18:39:44 UTC+7, Asan Emirsale:
Reply all
Reply to author
Forward
0 new messages