Commit f5999625 authored by Boris Koch

Vignette updated

parent d2c10850
--- ---
title: Introduction to UltraMassExplorer (UME) title: Introduction to UltraMassExplorer (UME)
author: "Boris Koch" author: "Boris Koch"
date: "`r date()`" date: "`r Sys.Date()`"
output: rmarkdown::html_vignette output: rmarkdown::html_vignette
#html_document: #html_document:
#toc: true # table of content true #toc: true # table of content true
#toc_depth: 3 # upto three depths of headings (specified by #, ## and ###) #toc_depth: 3 # upto three depths of headings (specified by #, ## and ###)
#number_sections: true ## if you want number sections at each table header #number_sections: true ## if you want number sections at each table header
#theme: united # many options for theme, this one is my favorite. #theme: united # many options for theme, this one is my favorite.
#highlight: tango # specifies the syntax highlighting style #highlight: tango # specifies the syntax highlighting style
vignette: > vignette: >
%\VignetteIndexEntry{Introduction to UltraMassExplorer (UME)} %\VignetteIndexEntry{Introduction to UltraMassExplorer (UME)}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8} %\VignetteEncoding{UTF-8}
%\VignetteEngine{knitr::rmarkdown}
runtime: shiny
--- ---
```{r echo = FALSE} ```{r echo = FALSE}
options(width = 100L) options(width = 100L)
``` ```
![](images/ume_package_icon.png){width="100"}
[![UltraMassExplorer - UME](images/ume_cover.jpg){width="300"}](https://www.awi.de/en/ume) [![UltraMassExplorer - UME](../man/figures/ume_cover.jpg){width="296"}](https://www.awi.de/en/science/biosciences/ecological-chemistry/tools/ume.html)
*UltraMassExplorer* (`ume`) is a package that uses exact molecular masses (derived from high-resolution mass spectrometry) to assign molecular formulas. UME provides tools to evaluate and visualize results (details described in [Leefmann et al. 2019](https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/abs/10.1002/rcm.8315)). UME is also available as a graphical user interface via [UME R Shiny App](https://www.awi.de/en/ume). *UltraMassExplorer* (`ume`) is a package that uses exact molecular masses (derived from high-resolution mass spectrometry) to assign molecular formulas. UME provides tools to evaluate and visualize results (details described in [Leefmann et al. 2019](https://analyticalsciencejournals.onlinelibrary.wiley.com/doi/abs/10.1002/rcm.8315)). UME is also available as a graphical user interface via [UME R Shiny App](https://www.awi.de/en/ume).
------------------------------------------------------------------------ ---
## 1. Installation ```{r install, eval=TRUE, echo=FALSE, warning=FALSE, message=FALSE}
Install UME gitlab repository: library(ume)
```{r install, eval=FALSE} # library(ume.formulas):
# only demo library ume::lib_demo is used in this vignette
data(ume::lib_demo)
# Check if the DT package is installed
if (!requireNamespace("DT", quietly = TRUE)) {
# Install the DT package
install.packages("DT")
}
library(DT)
# In case ume is currently loaded: ```
detach("package:ume", unload = TRUE)
# Install ume ## 1. Package content and documentation:
devtools::install_gitlab(repo = 'bkoch/ume', host = "https://gitlab.awi.de", build_vignettes = TRUE)
# Which version is installed and loaded?
packageVersion("ume")
# What is new? Which version is installed and loaded?
news(package = "ume")
`packageVersion("ume") ` `r packageVersion("ume")`
# Load and attach ume package
#library(ume)
What is new?
# Molecular formula libraries:
# Check if the ume.formulas package is installed. `news(package = "ume")`
# The package contains pre-built molecular formula libraries
# (CAUTION: large files - installation takes long time!!!):
#
# if (!requireNamespace("ume.formulas", quietly = TRUE)) {
# remotes::install_git(url = "https://gitlab.awi.de/bkoch/ume.formulas.git")
# packageVersion("ume.formulas")
# }
# library(ume.formulas) # only demo library ume::lib_demo is used in this vignette
data(ume::lib_demo)
```
Package content and documentation:
```{r overview, eval = F}
# Browse ume documentation Which functions are available?
vignette("ume") # this vignette
# List all objects and functions in ume package ```{r overview, eval = T, echo=FALSE, warning=FALSE}
ls("package:ume")
# All available ume functions # All available ume functions
objs <- mget(ls("package:ume", all = TRUE), inherits = TRUE) objs <- mget(ls("package:ume", all = TRUE), inherits = TRUE)
...@@ -80,75 +70,62 @@ Package content and documentation: ...@@ -80,75 +70,62 @@ Package content and documentation:
``` ```
## 2. Overview UME workflow ## 2. Overview UME data workflow
1. Analyse the input format of the peak list. 1. Analyse the input format of the peak list.
2. Calculate neutral masses. 2. Calculate neutral masses.
3. Assign molecular formulas (based on a pre-defined formula library). 3. Assign molecular formulas (based on a pre-defined formula library).
4. Calculate evaluation parameters (e.g. DBE, nominal mass, KMD, etc.). 4. Calculate evaluation parameters (e.g. DBE, nominal mass, KMD, etc.).
5. A posteriori formula filtering. 5. A posteriori formula filtering.
6. Perform statistics and data visualization in tables and figures.
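
Steps 2 and 3 of the list above can be illustrated with a minimal base-R sketch. It is not the code used inside `ume_assign_formulas()`: the helper names `neutral_mass()` and `ppm_dev()`, the two-entry library and the three m/z values are made up for illustration, and singly charged `[M-H]-` / `[M+H]+` ions are assumed.

```r
# Illustrative values only; not taken from ume::peaklist_demo or ume::lib_demo.
proton_mass <- 1.007276467  # mass of a proton in Da

# Hypothetical helper: neutral mass of a singly charged ion.
# Negative mode assumes [M-H]-, positive mode assumes [M+H]+.
neutral_mass <- function(mz, pol = c("neg", "pos")) {
  pol <- match.arg(pol)
  if (pol == "neg") mz + proton_mass else mz - proton_mass
}

# Hypothetical helper: mass deviation in ppm relative to the library mass.
ppm_dev <- function(measured, theoretical) {
  (measured - theoretical) / theoretical * 1e6
}

# Tiny stand-in for a formula library (monoisotopic masses in Da).
lib <- data.frame(
  mf   = c("C6H12O6", "C7H6O2"),
  mass = c(180.0633881, 122.0367794),
  stringsAsFactors = FALSE
)

mz  <- c(179.05612, 121.02950, 200.00000)  # measured m/z, negative mode
m   <- neutral_mass(mz, pol = "neg")
dev <- outer(m, lib$mass, ppm_dev)         # rows: peaks, columns: library entries

# Keep every peak/formula pair within +/- 0.5 ppm (same role as ma_dev = 0.5).
hits <- which(abs(dev) <= 0.5, arr.ind = TRUE)
assigned <- data.frame(
  mz  = mz[hits[, "row"]],
  mf  = lib$mf[hits[, "col"]],
  ppm = dev[hits],
  stringsAsFactors = FALSE
)
assigned
```

The third peak has no library entry within the tolerance and therefore stays unassigned. A real library is far larger, so a full `outer()` comparison would not scale; it is used here only to keep the sketch short.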
#### Tasks 1-6 can be executed in three steps (wrapper functions): Tasks 1-5 can be executed in two steps (wrapper functions)
##### (i) Formula assignment and calculation of evaluation parameters
```{r example_short, eval = F, warning=FALSE} ```{r example_short1, eval = F, warning=FALSE}
# (i) Formula assignment and calculation of evaluation parameters
mfd <- ume_assign_formulas(pl = peaklist_demo, formula_library = lib_demo mfd <- ume_assign_formulas(pl = peaklist_demo, formula_library = lib_demo
, pol = "neg", ma_dev = 0.5, msg = T) , pol = "neg", ma_dev = 0.5, msg = T)
# (ii) Formula filtering (subsetting) and normalization
# See help for all available filter arguments:
?ume_filter_formulas
mfd_filt <- ume_filter_formulas(
mfd = mfd
### file_ids in mfd to be selected:
# , select_file_ids = c("Nsea_a", "Nsea_b", "Nsea_c")
, remove_blank_list = c("Blank") # file_id's in mfd that contains blank analyses
### Normalization options are "bp", "sum", "sum_ubiq", "sum_rank", "none"):
, normalization = "bp" # (default = "bp")
# , norm_int_min = 2 # minimum relative intensity in %
# , norm_int_max = 100
# , n_rank = 400 # if rank normalization is used (default = 200)
### Selection / Exclusion of formulas:
### check ume::known_mf[, .N, category] for all category options:
# , select_category = c("marine_dom")
, exclude_category = c("surfactant")
### Isotope checks:
, c_iso_check = TRUE # removes all formulas, for which no 13C isotope exists
, n_iso_check = FALSE
, s_iso_check = FALSE
### Other subsettings
# , ma_dev = 0.2 # mass accuracy threshold in +/- ppm
# , dbe_max = 2
# , dbe_o_min = 0
, dbe_o_max = 10 # Maximum of DBE minus O atoms
# , p_min = 0
, p_max = 0 # Maximum number of P atoms
# , s_min = 0, s_max = 3
# , n_min = 0, n_max = 10
# , oc_min = 0 # minimum of oxygen / carbon ratio
# , oc_max = 2.5
# , hc_min = 0, hc_max = 3
# , nc_min = 0, nc_max = 2
# , mz_min = 200, mz_max = 650
, msg = T # turn on/off messages
)
# (iii) Example figure
uplot.isotope_precision(mfd = mfd_filt, col = "redblue", col_bar = T
, z_var = "nsp_tot", tf = F)
``` ```
The most important arguments for formula filtering: ##### (ii) Formula filtering (subsetting) and normalization
```{r function_arguments, eval = T, warning = F} All available filter arguments: `help(ume_filter_formulas)`
# Arguments for subsetting (filtering):
```{r example_short2, eval = F, warning=FALSE}
mfd_filt <- ume_filter_formulas(
mfd = mfd,
select_file_ids = c("Nsea_a", "Nsea_b", "Nsea_c"), # Choice of files
remove_blank_list = c("Blank"), # file_ids in mfd that contain blank analyses
normalization = "bp", # Normalization options are "bp", "sum", "sum_ubiq", "sum_rank", "none"
norm_int_min = 2, # Minimum relative intensity in %
norm_int_max = 100,
n_rank = 400, # if rank normalization is used (default = 200)
select_category = c("marine_dom"), # Selection / Exclusion of formulas:
exclude_category = c("surfactant"), # check ume::known_mf[, .N, category] for all category options
c_iso_check = TRUE, # Isotope check: removes all formulas, for which no 13C isotope exists
n_iso_check = FALSE,
s_iso_check = FALSE,
ma_dev = 0.2, # mass accuracy threshold in +/- ppm
dbe_max = 30, # Maximum number of DBE
dbe_o_min = 0, dbe_o_max = 10, # Min/Max of DBE minus O atoms
p_min = 0, p_max = 0, # Min/Max number of P atoms
s_min = 0, s_max = 3,
n_min = 0, n_max = 10,
oc_min = 0, oc_max = 2.5,
hc_min = 0, hc_max = 3,
nc_min = 0, nc_max = 2,
mz_min = 200, mz_max = 650,
msg = T # turn on/off messages
)
```
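
To make the normalization arguments more concrete, here is a small base-R sketch of a base-peak normalization followed by a relative-intensity filter. It assumes that `"bp"` stands for base peak, which the vignette does not state explicitly. The column names `file_id` and `intensity` and all values are illustrative and not the package's data structure.

```r
# Made-up peak list with two analyses ("a" and "b").
pl <- data.frame(
  file_id   = c("a", "a", "a", "b", "b"),
  mz        = c(201.1, 305.2, 411.3, 201.1, 305.2),
  intensity = c(200, 50, 2, 1000, 10),
  stringsAsFactors = FALSE
)

# Base-peak normalization: express each intensity as a percentage of the
# most intense peak of the same analysis. ave() applies max() per file_id.
pl$int_rel <- 100 * pl$intensity / ave(pl$intensity, pl$file_id, FUN = max)

# Relative-intensity window, playing the role of norm_int_min / norm_int_max.
norm_int_min <- 2
norm_int_max <- 100
keep <- pl[pl$int_rel >= norm_int_min & pl$int_rel <= norm_int_max, ]
keep
```

Normalizing per analysis rather than over the whole table keeps samples with different absolute signal levels comparable.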
```{r function_arguments, eval = F, warning = F, echo = F}
args(ume::filter_mf_data) args(ume::filter_mf_data)
args(ume::filter_int) args(ume::filter_int)
``` ```
...@@ -193,12 +170,12 @@ The most important arguments for formula filtering: ...@@ -193,12 +170,12 @@ The most important arguments for formula filtering:
``` ```
## 3. Visualization and statistics
## 3. Visualization and statistics
**(documentation to be expanded)** **(documentation to be expanded)**
```{r eval = F, warning=FALSE} ```{r eval = F, warning=FALSE}
# Mass spectrum # Mass spectrum
uplot.ms(pl = ume::peaklist_demo) uplot.ms(pl = ume::peaklist_demo)
...@@ -206,16 +183,21 @@ The most important arguments for formula filtering: ...@@ -206,16 +183,21 @@ The most important arguments for formula filtering:
ume::calc_data_summary(mfd = ume::mf_data_demo) ume::calc_data_summary(mfd = ume::mf_data_demo)
# Mass accuracy # Mass accuracy
#uplot.freq_ma(mfd = ume::mf_data_demo) uplot.freq_ma(mfd = ume::mf_data_demo)
# Element frequency # Element frequency
uplot.freq(mfd = ume::mf_data_demo, var = "n") uplot.freq(mfd = ume::mf_data_demo, var = "n")
# van Krevelen # van Krevelen
uplot.vk(mfd = ume::mf_data_demo) uplot.vk(mfd = ume::mf_data_demo)
# Precision isotope abundance:
uplot.isotope_precision(mfd = ume::mf_data_demo, col = "redblue", col_bar = T
, z_var = "nsp_tot", tf = F)
``` ```
## 4. Re-calibration of peaklists ## 4. Re-calibration of peaklists
Automated calibration can be performed with existing calibration lists stored in ume::known_mf. The function "ume::calc_recalibrate_ms" assigns calibrants to the peak list and analyses the mass accuracy. Three outlier tests are performed and only those assigned calibrants that pass all three tests are used for recalibration. The recalibration is based on a linear model. The function output is a list object that contains a summary on calibrants and figures that compare the calibration status before and after recalibration. For example: Automated calibration can be performed with existing calibration lists stored in ume::known_mf. The function "ume::calc_recalibrate_ms" assigns calibrants to the peak list and analyses the mass accuracy. Three outlier tests are performed and only those assigned calibrants that pass all three tests are used for recalibration. The recalibration is based on a linear model. The function output is a list object that contains a summary on calibrants and figures that compare the calibration status before and after recalibration. For example:
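
Independent of the package's own example, the linear-model idea can be sketched in base R. This is not the code of `ume::calc_recalibrate_ms`: the calibrant masses and the error pattern are invented, the three outlier tests are omitted, and regressing the ppm error on the measured m/z is only one plausible way to set up such a model.

```r
# Invented calibrant m/z values (theoretical) and a systematic, mass-dependent
# error of 0.2 ppm + 0.001 ppm per m/z unit plus a little fixed scatter.
mz_theo <- c(200, 300, 400, 500, 600)
scatter <- c(0.02, -0.01, 0.015, -0.02, 0.005)   # ppm
err_ppm <- 0.2 + 0.001 * mz_theo + scatter
mz_meas <- mz_theo * (1 + err_ppm * 1e-6)

# Mass error before recalibration, in ppm.
ppm_before <- (mz_meas - mz_theo) / mz_theo * 1e6

# Linear model of the error as a function of the measured m/z.
fit <- lm(ppm_before ~ mz_meas)

# Apply the predicted error as a correction to the measured m/z values.
ppm_pred <- predict(fit, newdata = data.frame(mz_meas = mz_meas))
mz_corr  <- mz_meas / (1 + ppm_pred * 1e-6)

# Mass error after recalibration, in ppm.
ppm_after <- (mz_corr - mz_theo) / mz_theo * 1e6

range(ppm_before)
range(ppm_after)
```

In practice the model would be fitted on the calibrants only and then applied to every peak of the analysis via `predict()`.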
...@@ -257,7 +239,7 @@ Automated calibration can be performed with existing calibration lists stored in ...@@ -257,7 +239,7 @@ Automated calibration can be performed with existing calibration lists stored in
### Mass Peak List ### Mass Peak List
The mass calibrated *peak list* is the core of the UME workflow. The peak list (pl) is a table [(as R data.table)](https://www.rdocumentation.org/packages/data.table/) that contains information from one or several mass spectrometric analyses: The mass calibrated *peak list* is the core of the `ume` workflow. The peak list (pl) is a table [(as R data.table)](https://www.rdocumentation.org/packages/data.table/) that contains information from one or several mass spectrometric analyses:
- Analytical data: - Analytical data:
...@@ -271,24 +253,26 @@ The mass calibrated *peak list* is the core of the UME work flow. The peak list ...@@ -271,24 +253,26 @@ The mass calibrated *peak list* is the core of the UME work flow. The peak list
- Unique identifier for the mass spectrometric analysis (file_id) - Unique identifier for the mass spectrometric analysis (file_id)
- Unique identifier for each mass peak (peak_id) - Unique identifier for each mass peak (peak_id)
#### Example peak list (peaklist_demo) The package contains an example peak list:
`ume::peaklist_demo[1:3]`
<div style="overflow-x:auto;"> ```{r example_peaklist, results = "asis", warning=FALSE, echo = FALSE}
```{r example_peaklist, results = "asis", echo = "false", warning=FALSE} pander::pandoc.table(peaklist_demo[1:3], digits = 8)
# Render the table with knitr::kable()
knitr::kable(ume::peaklist_demo[1:3], format = "html")
``` ```
</div>
### Isotopic masses ### Isotopic masses
All calculated molecular masses in UME are based on the [NIST data](https://www.nist.gov/pml/atomic-weights-and-isotopic-compositions-relative-atomic-masses) and available as a data resource in the UME package (masses.rda): All calculated molecular masses in `ume` are based on the [NIST data](https://www.nist.gov/pml/atomic-weights-and-isotopic-compositions-relative-atomic-masses) and available as a data resource in the package (masses.rda).
<div style="overflow-x:auto;"> Isotope information of all elements:
```{r, results = "asis", echo = "false"}
knitr::kable(ume::masses[1:3, 1:11], format = "html") `ume::masses[]`
```{r, results = "asis", echo = FALSE}
cols <- names(masses)[!names(masses) %in% c("last_update", "valence2")]
pander::pandoc.table(masses[1:3, ..cols], digits = 8)
``` ```
</div>
### Molecular formula library ### Molecular formula library
...@@ -299,27 +283,38 @@ Molecular formula assignment in UME is based on a pre-defined molecular formula ...@@ -299,27 +283,38 @@ Molecular formula assignment in UME is based on a pre-defined molecular formula
- The atom number of each isotope contained in a given molecular formula - The atom number of each isotope contained in a given molecular formula
- The exact mass of each formula (*mass*; as taken from *masses*; s. above) - The exact mass of each formula (*mass*; as taken from *masses*; s. above)
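
How an exact mass follows from isotope counts can be shown with a short base-R sketch. The named vector `iso_mass` is a hand-typed stand-in for the relevant values of the NIST table and does not reproduce the layout of `ume::masses`; the helper `exact_mass()` is hypothetical.

```r
# Monoisotopic masses (Da) of the most abundant isotopes, typed in by hand.
iso_mass <- c(
  C = 12,
  H = 1.00782503223,
  N = 14.00307400443,
  O = 15.99491461957
)

# Hypothetical helper: exact mass as the sum of (atom count x isotope mass).
exact_mass <- function(counts) {
  stopifnot(all(names(counts) %in% names(iso_mass)))
  sum(counts * iso_mass[names(counts)])
}

glucose <- c(C = 6, H = 12, O = 6)
print(exact_mass(glucose), digits = 10)
```

A library row stores exactly this kind of information: one count per isotope plus the resulting mass, so that assignment reduces to a lookup by mass.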
<div style="overflow-x:auto;"> Demo formula library:
```{r, results = "asis", echo = "false"}
knitr::kable(ume::lib_demo[1:3], format = "html") `ume::lib_demo`
```{r, results = "asis", echo = FALSE}
pander::pandoc.table(ume::lib_demo[1:3], digits = 10)
``` ```
</div>
It is important to consider that the formula assignment process fundamentally depends on the content of the formula library. Predefined libraries are available on the original [UME gitlab repository](https://gitlab.com/BorisKoch/ultramassexplorer/-/tree/master/lib) but can also be constructed using an R script: ## 5. Create a custom molecular formula library
It is important to consider that the formula assignment process fundamentally depends on the content of the formula library. Predefined libraries are available on the original [UME gitlab repository](https://gitlab.com/BorisKoch/ultramassexplorer/-/tree/master/lib).
Custom libraries can also be constructed:
```{r eval = F} ```{r eval = F}
ultramassmf <- create_ume_formula_library(max_mass = 50)
ume_custom_library <- create_ume_formula_library(max_mass = 50)
``` ```
## 5. UME core functions
**(documentation to be expanded)**
### Double bond equivalent (DBE) ```{r, eval = F, echo = FALSE}
Calculates DBE for a given formula. Uses isotope masses and element valences defined in *ume::masses*. ## 5. UME core functions
#**(documentation to be expanded)**
### Double bond equivalent (DBE)
```{r, eval = T} # Calculates DBE for a given formula. Uses isotope masses and element valences defined in *masses.rda*.
DT::renderDataTable(ume::mf_data_demo[, dbe:=ume::calc_dbe(mfd = ume::mf_data_demo)])
``` ```
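
For orientation, the textbook DBE formula can be written as a small base-R function. This sketch hard-codes the usual valences (C = 4, N and P = 3, H and halogens = 1, O and S = 2), whereas `ume::calc_dbe` is described as reading valences from the package's masses table; the function name `dbe_sketch()` and its arguments are hypothetical.

```r
# DBE = 1 + C - (H + X) / 2 + (N + P) / 2
# Divalent elements such as O and S do not change the DBE.
dbe_sketch <- function(nC = 0, nH = 0, nN = 0, nP = 0, nX = 0) {
  1 + nC - (nH + nX) / 2 + (nN + nP) / 2
}

dbe_sketch(nC = 6, nH = 6)           # benzene, C6H6
dbe_sketch(nC = 6, nH = 12)          # glucose, C6H12O6 (O is ignored)
dbe_sketch(nC = 5, nH = 5, nN = 1)   # pyridine, C5H5N
```

Because the arguments are plain numerics, the function is vectorized and can be applied to whole columns of atom counts at once.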