Title: | Data to Illustrate OOMPA Algorithms |
---|---|
Description: | This is a data-only package that provides example data for other packages in the "Object-Oriented Microarray and Proteomics Analysis" suite. These are described in more detail at the package URL. |
Authors: | Kevin R. Coombes |
Maintainer: | Kevin R. Coombes <[email protected]> |
License: | Apache License (== 2.0) |
Version: | 3.1.4 |
Built: | 2024-12-10 02:55:49 UTC |
Source: | https://github.com/r-forge/oompa |
This data set provides experimental and clinical information about the (partial) prostate cancer data set included for demonstration purposes as part of the tail.rank.test package. The experiments were two-color glass microarrays printed at Stanford.
data(clinical.info)
A data frame with 112 observations on the following 6 variables.
A factor containing the barcode of the microarray on which the experiment was performed. Each of the 112 entries should be distinct.
A factor describing the reference sample used in each experiment. This was a common reference, so the identifiers here are not meaningful.
A factor identifying the test sample in each experiment. These match the codes published in the original paper.
A factor with three levels identifying normal prostate (N), prostate cancer (T), or lymph node metastasis (L).
A factor with five levels: I, II, III, N, O. These correspond to the groups found in the original paper using clustering.
A factor with levels new or old. At least two different print designs of microarrays were used in this experiment; this factor identifies the design.
The data was originally described in the paper by Lapointe et al., and downloaded from the Stanford Microarray Database https://bio.tools/stanfordmicroarraydb.
Lapointe J et al. (2004) Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc Natl Acad Sci U S A, 101, 811–816.
data(clinical.info)
summary(clinical.info)
A subset of the microarray data from a study of prostate cancer at Stanford is supplied as demo data with the tail.rank.test package.
data(expression.data)
A data frame with 2000 observations on 112 variables. Each column represents a different patient sample, as described in the accompanying data frame called clinical.info.
This data set contains normalized microarray expression data on 2000 randomly selected genes from a prostate cancer data set. The study was originally described in a publication by Lapointe et al. The experiments were performed on two-color glass microarrays printed at Stanford and available from the Stanford Microarray Database. We downloaded the raw data and preprocessed it. In particular, after background correction and loess normalization, we computed log ratios between the channels. We then randomly selected 2000 of the 42129 spots to include as demonstration data here.
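The last two preprocessing steps described above (log ratios between channels, then a random selection of spots) can be sketched on simulated intensities. This is only an illustration, not the code used to build the package: the matrices `red` and `green`, the number of arrays, the base-2 logarithm, and the random seed are all assumptions, and background correction and loess normalization are omitted.

```r
# Simulated two-channel intensities (hypothetical values).
# The real experiment had 42129 spots on each of 112 arrays.
set.seed(1)
n_spots  <- 42129
n_arrays <- 4   # kept small for illustration
red   <- matrix(rexp(n_spots * n_arrays, rate = 1 / 1000) + 1, nrow = n_spots)
green <- matrix(rexp(n_spots * n_arrays, rate = 1 / 1000) + 1, nrow = n_spots)

# Log ratio between the channels (base 2 is an assumption here).
log_ratio <- log2(red / green)

# Randomly keep 2000 of the spots, sampling without replacement.
keep <- sort(sample(n_spots, 2000))
demo <- log_ratio[keep, , drop = FALSE]
dim(demo)
```

Sampling row indices once and applying them to every column keeps the same 2000 spots across all arrays, which is what allows a single gene annotation table to describe the rows.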
https://bio.tools/stanfordmicroarraydb
Lapointe J et al. (2004) Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc Natl Acad Sci U S A, 101, 811–816.
data(expression.data)
summary(expression.data)
This data set provides information about the genes included with the (partial) prostate cancer data set as part of the tail.rank.test package.
data(gene.info)
A data frame with 2000 observations on the following 6 variables.
a numeric vector; the position at which this clone is spotted on the old arrays
a numeric vector; the position at which this clone is spotted on the new arrays
a factor; the IMAGE clone identifier
a factor; the official gene symbol
a factor; the UniGene cluster number
a factor; the GenBank accession number
The data was originally described in the paper by Lapointe et al., and downloaded from the Stanford Microarray Database https://bio.tools/stanfordmicroarraydb. We randomly selected 2000 of the 42129 spots to include as demonstration data here.
Lapointe J et al. (2004) Gene expression profiling identifies clinically relevant subtypes of prostate cancer. Proc Natl Acad Sci U S A, 101, 811–816.
clinical.info, expression.data
data(gene.info)
summary(gene.info)
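Because the three prostate data sets describe the same experiment, their dimensions should agree: one column of expression.data per row of clinical.info, and one row of expression.data per row of gene.info. A minimal check might look like the following sketch. The helper `check_alignment` is hypothetical (it is not part of the package), and the package name `oompaData` is an assumption that does not appear in the text above.

```r
# Hypothetical helper: compare the dimensions of the three tables.
check_alignment <- function(expr, clin, genes) {
  c(samples = ncol(expr) == nrow(clin),   # one column per sample
    genes   = nrow(expr) == nrow(genes))  # one row per annotated spot
}

# Toy stand-ins with the same layout (5 genes, 3 samples).
toy_expr  <- as.data.frame(matrix(0, nrow = 5, ncol = 3))
toy_clin  <- data.frame(id = 1:3)
toy_genes <- data.frame(id = 1:5)
toy_check <- check_alignment(toy_expr, toy_clin, toy_genes)

# With the real data, assuming the package is installed as "oompaData".
if (requireNamespace("oompaData", quietly = TRUE)) {
  data(clinical.info, package = "oompaData")
  data(expression.data, package = "oompaData")
  data(gene.info, package = "oompaData")
  print(check_alignment(expression.data, clinical.info, gene.info))
}
```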
This data set contains clinical annotations and the log expression of 150 genes for a set of 444 lung cancer patients. The 150 genes were selected randomly from a larger Affymetrix U133A dataset.
data(lungData)
A data matrix (lung.dataset) containing the log expression of 150 genes (rows) in 444 lung tumor samples (columns), along with a data frame (lung.clinical) containing clinical annotations of the patients.
Supporting data for the Nature Medicine paper by Shedden et al. was downloaded from the (now defunct) caArray web site. The original data used to be available by FTP from the NIH, but can now only be found in the Gene Expression Omnibus at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE68571. The data were log transformed by mapping each expression value x to log2(1+x). Subsets of the genes and of the clinical annotation columns were selected to form this data set.
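The transformation x to log2(1+x) maps a raw value of zero to zero and stays finite for all non-negative inputs, which a plain log2(x) would not. A small sketch on made-up values (the vector `x` is hypothetical, not taken from the data set):

```r
# Hypothetical raw expression values, including a zero.
x <- c(0, 1, 3, 1023)

# The transform described above.
log_expr <- log2(1 + x)
log_expr
# 0 1 2 10

# The inverse transform recovers the original scale.
back <- 2^log_expr - 1
```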
Abrams ZB, Zucker M, Wang M, Asiaee Taheri A, Abruzzo LV, Coombes KR. Thirty biologically interpretable clusters of transcription factors distinguish cancer type. BMC Genomics. 2018 Oct 11;19(1):738. doi: 10.1186/s12864-018-5093-z.
Asiaee A, Abrams ZB, Nakayiza S, Sampath D, Coombes KR. Explaining Gene Expression Using Twenty-One MicroRNAs. J Comput Biol. 2020 Jul;27(7):1157-1170. doi: 10.1089/cmb.2019.0321.