Investigation of the effects of MAS5, RMA and GCRMA preprocessing methods on an Affymetrix zebrafish GeneChip dataset using statistical and network parameters

Date

2010

Advisor

Konu, Özlen

Language

English

Abstract

Microarray data preprocessing is an important determinant of the accuracy and repeatability of expression profiling studies. Recent studies have compared preprocessing methodologies using differential expression analysis of spike-in datasets and qRT-PCR confirmations. Other approaches include comparison of array-wise and probe-wise correlations and of selected gene network parameters. However, zebrafish GeneChip datasets have not been used in such comparisons; furthermore, upregulated and downregulated gene sets have not been well characterized with respect to known network parameters across different preprocessing methodologies. In this study, we re-analyzed a public zebrafish hypoxia microarray dataset (GSE4989; Marques et al. 2008) using the MAS5, RMA, and gcRMA methods. Comparisons were made in terms of differentially expressed gene sets and defined network parameters, namely clustering coefficient, degree distribution, and betweenness centrality. Our findings indicated that gcRMA and RMA were more similar to each other than to MAS5 in terms of differentially expressed genes and network parameters. In addition, the network analysis demonstrated that upregulated and downregulated gene sets had distinct network structures; downregulated probesets had greater clustering coefficients and degree distributions for positively correlated probesets under all three preprocessing methods. However, gcRMA and RMA accentuated this difference further than MAS5 did, suggesting that preprocessing methods differ in how they modulate gene expression network structure. A selected group of probesets that showed invariant network structure parameters across RMA, gcRMA, and MAS5 was identified and analyzed functionally for the zebrafish hypoxia dataset. The results of this thesis suggest that preprocessing methods may alter the network structure of a dataset differently for upregulated and downregulated gene sets. Accordingly, it might be beneficial to filter for differentially expressed genes that are robust to such network topology modulation in order to increase the repeatability of gene sets.
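
As an illustration of two of the network parameters named in the abstract, the following minimal sketch builds a small co-expression network from positively correlated probesets and reports each probeset's degree and clustering coefficient. It is not the thesis's pipeline: the probeset names, the toy expression values, the Pearson correlation measure, and the 0.9 correlation threshold are all assumptions made for the example, and betweenness centrality is left out for brevity.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_network(profiles, threshold=0.9):
    """Link probesets whose correlation is >= threshold (positive only).

    The threshold value is an assumption for this example.
    """
    adj = {name: set() for name in profiles}
    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree(adj, node):
    """Number of probesets directly linked to `node`."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical probesets with made-up expression values across four arrays.
profiles = {
    "p1": [1.0, 2.0, 3.0, 4.0],
    "p2": [2.0, 4.0, 6.0, 8.0],
    "p3": [1.0, 2.1, 2.9, 4.2],
    "p4": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with the others
}

adj = coexpression_network(profiles)
for name in profiles:
    print(name, degree(adj, name), clustering_coefficient(adj, name))
```

Running the same functions on expression matrices produced by different preprocessing methods would give per-probeset values that can be compared across methods, which is the kind of comparison the abstract describes.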

Degree Discipline

Molecular Biology and Genetics

Degree Level

Master's

Degree Name

MS (Master of Science)
