---
title: "scMetaTraj workflow"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{scMetaTraj workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6,
  fig.height = 4.5
)
```

## Overview

`scMetaTraj` models metabolism as a continuous state space derived from pathway-level scores rather than a secondary annotation layered onto transcriptomic clustering.

The package supports:

- pathway/module scoring from a Seurat object or expression matrix
- metabolic state space embedding
- metabolic subclustering
- metabolic pseudotime inference
- trend and switchpoint analysis along metabolic pseudotime

This vignette uses a small simulated example so that it remains portable and does not depend on local files or large external datasets.

## Simulate a small expression matrix

```{r}
library(scMetaTraj)

set.seed(2026)

expr <- matrix(
  rexp(14 * 100, rate = 1),
  nrow = 14,
  ncol = 100,
  dimnames = list(
    c(
      "HK1", "PFKP", "LDHA", "GPI",
      "CS", "ACO2", "IDH3A",
      "NDUFA1", "COX4I1", "ATP5F1A",
      "G6PD", "PGD",
      "ACLY", "FASN"
    ),
    paste0("Cell", seq_len(100))
  )
)

gene_sets <- list(
  Glycolysis = c("HK1", "PFKP", "LDHA", "GPI"),
  TCA = c("CS", "ACO2", "IDH3A"),
  OXPHOS = c("NDUFA1", "COX4I1", "ATP5F1A"),
  PPP = c("G6PD", "PGD"),
  Lipid = c("ACLY", "FASN")
)
```

## Score metabolic modules

```{r}
scores <- scMetaTraj_score(
  x = expr,
  gene_sets = gene_sets,
  method = "mean",
  min_genes = 2,
  scale = FALSE
)

dim(scores)
colnames(scores)
```

## Embed cells in metabolic space

`scMetaTraj_embed()` returns PCA coordinates for analysis or UMAP coordinates for visualization.

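
To make the PCA option more concrete, the block below is a rough base-R sketch of what a PCA embedding of a cells-by-modules score matrix can look like. It is an illustration under stated assumptions, not the `scMetaTraj_embed()` implementation: the toy matrix, the object names (`toy_scores`, `pca_fit`, `toy_embedding`), and the choice to center and scale each module are all hypothetical, and the package may preprocess scores differently.

```r
# Illustrative only: NOT the scMetaTraj implementation.
set.seed(1)

# Hypothetical score matrix: 100 cells (rows) x 5 modules (columns).
toy_scores <- matrix(
  rnorm(100 * 5),
  nrow = 100,
  dimnames = list(
    paste0("Cell", seq_len(100)),
    c("Glycolysis", "TCA", "OXPHOS", "PPP", "Lipid")
  )
)

# Assumption: center and scale each module before PCA.
pca_fit <- stats::prcomp(toy_scores, center = TRUE, scale. = TRUE)

# Keep the first 4 principal components as the embedding (one row per cell).
toy_embedding <- pca_fit$x[, 1:4, drop = FALSE]

dim(toy_embedding)
head(toy_embedding)
```

Because there are only as many principal components as modules, the number of retained components cannot exceed the number of scored modules in a sketch like this.
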
```{r}
emb_pca <- scMetaTraj_embed(scores, method = "PCA", n_pcs = 4)
emb_umap <- scMetaTraj_embed(scores, method = "UMAP", n_pcs = 4)

head(emb_pca)
head(emb_umap)
```

## Identify metabolic subclusters

```{r}
clusters <- scMetaTraj_cluster(
  embedding = emb_pca,
  k = 12,
  method = "louvain"
)

table(clusters)
```

Cluster-level summaries can be generated with `scMetaTraj_cluster_profile()`.

```{r}
profile_df <- scMetaTraj_cluster_profile(scores, clusters, stat = "mean")
head(profile_df)
```

## Infer metabolic pseudotime

```{r}
traj <- scMetaTraj_infer(
  embedding = emb_pca,
  k = 12,
  root_mode = "pc1_min"
)

summary(traj$mPT)
traj$root
```

The mPT distribution helper prepares ordered cluster labels along the trajectory:

```{r}
dist_df <- scMetaTraj_mPT_distribution(traj$mPT, clusters)
head(dist_df)
```

## Track module trends along mPT

```{r}
gly_trend <- scMetaTraj_trend(
  scores = scores[, "Glycolysis"],
  mPT = traj$mPT,
  n_bins = 20
)

head(gly_trend)
```

To compare several modules at once:

```{r}
multi_res <- scMetaTraj_trend_multi(
  score_mat = scores,
  mPT = traj$mPT,
  modules = c("Glycolysis", "TCA", "OXPHOS"),
  n_bins = 20
)

head(multi_res$trend_long)
multi_res$switchpoints
```

```{r fig.cap="Example trend plot for several metabolic modules."}
scMetaTraj_plot_trend_multi(
  multi_res$trend_long,
  multi_res$switchpoints
)
```

## Interpret results

The workflow above illustrates the intended package logic:

1. summarize gene expression into curated metabolic modules
2. analyze cells in module-defined space rather than transcriptome-wide space
3. reconstruct graph-based metabolic pseudotime
4. quantify where module activity changes along the inferred trajectory

In real analyses, the same workflow can be applied to Seurat objects and larger curated metabolic gene set collections, while keeping the vignette itself lightweight and fully reproducible.
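
As a way to picture step 4, the block below sketches a naive binned trend and switchpoint in base R. It is a simplified illustration, not the method used by `scMetaTraj_trend()` or `scMetaTraj_trend_multi()`: the simulated pseudotime and score, the object names (`toy_mPT`, `toy_score`, `toy_trend`, `toy_switch`), and the "largest jump between adjacent bins" rule are all assumptions made for this example.

```r
# Illustrative only: a naive binned trend, NOT the scMetaTraj algorithm.
set.seed(1)

# Hypothetical pseudotime in [0, 1] and a module score that rises around 0.5.
toy_mPT <- runif(200)
toy_score <- plogis(10 * (toy_mPT - 0.5)) + rnorm(200, sd = 0.05)

# Split pseudotime into equal-width bins.
n_bins <- 20
breaks <- seq(min(toy_mPT), max(toy_mPT), length.out = n_bins + 1)
bin_id <- cut(toy_mPT, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Mean pseudotime and mean score within each non-empty bin.
toy_trend <- data.frame(
  bin = sort(unique(bin_id)),
  mPT = as.numeric(tapply(toy_mPT, bin_id, mean)),
  score = as.numeric(tapply(toy_score, bin_id, mean))
)

# Assumed switchpoint rule: midpoint of the two adjacent bins
# whose mean scores differ the most.
jump <- abs(diff(toy_trend$score))
i <- which.max(jump)
toy_switch <- mean(toy_trend$mPT[i:(i + 1)])

head(toy_trend)
toy_switch
```

Binning trades resolution for stability: fewer bins give smoother per-bin means but a coarser estimate of where the change occurs, which is one reason to inspect the trend curve alongside any reported switchpoint.
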