Metadata for GSE8671

Hi!

Does @yvesgaetan or anyone have any tips on where to locate and download the metadata set for GSE8671 and how to read that metadata set in R?

Thanks!

If you go to the GEO Accession Display page for GSE8671 and click on the series matrix .TXT file, the metadata should be everything above the actual data (sample descriptions and so on). @yvesgaetan correct me if I'm wrong :slight_smile:

Hmm I tried that but when I ran it in R it gave me this error:

Any advice on how to move forward?

try this: meta <- read.delim("GSE8671_series_matrix.txt", header = TRUE, sep = "\t")
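
A possible alternative if the plain table read does not parse cleanly: the header lines of a series matrix file start with "!" and do not all carry the same number of fields, so it can help to pull out only the "!Sample_" lines. The sketch below uses base R only. The function name read_sample_meta is made up for illustration, and it assumes the usual tab-separated, double-quoted layout of the file.

```r
# Hypothetical helper (not part of any package): build a per-sample
# metadata table from the "!Sample_" header lines of a series matrix file.
read_sample_meta <- function(path) {
  con <- gzfile(path)                 # gzfile() reads both .gz and plain text files
  on.exit(close(con))
  lines <- readLines(con)
  lines <- grep("^!Sample_", lines, value = TRUE)
  fields <- strsplit(lines, "\t", fixed = TRUE)
  # first field is the attribute name, the remaining fields are one value per sample
  keys <- make.unique(sub("^!", "", vapply(fields, `[`, character(1), 1)))
  vals <- lapply(fields, function(f) gsub('^"|"$', "", f[-1]))
  as.data.frame(setNames(vals, keys), stringsAsFactors = FALSE)
}

# usage, assuming the file sits in the working directory:
# meta <- read_sample_meta("GSE8671_series_matrix.txt")
```

Repeated attribute names (for example several "!Sample_characteristics_ch1" lines) get a numeric suffix from make.unique(), so no column is silently overwritten.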

To add the metadata, the best approach is to download the series_matrix.txt.gz file from GEO (as @annieanand correctly said). Then, WITHOUT unzipping the file, load it into R.

Here is an example of how to do it:

# load library
library("GEOquery")

# load series matrix file
gse <- getGEO(filename = "GSE32323_series_matrix.txt.gz")

# do your annotation
...

# do your filtering
...

# before Limma, create a new ExpressionSet object
gset <- ExpressionSet(assayData = as.matrix(final_data)) # final_data is your normalised expression matrix, with gene symbols as row names

# copy the sample metadata from the series matrix object
# (it must describe the same samples as your clean AffyBatch object; clean = no outliers)
gset@phenoData@data <- gse@phenoData@data

# Use this new ExpressionSet object in limma

Yves

Thank you! I was able to load the unzipped metadata file successfully.

But I’m a little confused as to how to create a clean AffyBatch object / how to remove outliers from an AffyBatch. I had removed the outliers after converting the raw data to a data frame.

Hey V!

To remove the outliers, you can do something like the following example:

# suppose your outliers are samples 1, 11 and 34
# raw is the name of your AffyBatch object
# samples are selected with the second index (raw[i, ] is deprecated)

raw <- raw[, -c(1, 11, 34)]
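
Once samples are dropped, the metadata table has to lose the same rows, otherwise the groups passed to limma will not line up with the expression columns. Here is a small base R sketch of that bookkeeping on toy data. All the names (expr, meta, the GSM ids) are made up, and it assumes the metadata row names are identical to the expression column names, which may need some renaming first with real CEL file names.

```r
# toy expression matrix: genes in rows, samples in columns (made-up values)
set.seed(1)
expr <- matrix(rnorm(12), nrow = 3,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               paste0("GSM", 1:4)))

# toy metadata: one row per sample, row names equal to the sample names
meta <- data.frame(tissue = c("normal", "adenoma", "normal", "adenoma"),
                   row.names = paste0("GSM", 1:4))

# suppose sample 2 is an outlier
outliers <- c(2)
expr_clean <- expr[, -outliers, drop = FALSE]

# keep only the metadata rows for the remaining samples, in the same order
meta_clean <- meta[colnames(expr_clean), , drop = FALSE]

# sanity check before going any further
stopifnot(identical(rownames(meta_clean), colnames(expr_clean)))
```

Selecting metadata rows by name rather than by position also keeps things consistent if the sample order differs between the two objects.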

Let me know if it works,

Yves

Hi Yves,

Yes that makes so much sense. Thank you for your help!