Browsing by Subject "Databases, Factual"
Item Open Access
A comprehensive methodology for determining the most informative mammographic features (2013)
Wu, Y.; Alagoz, O.; Ayvaci, M.U.S.; Munoz Del Rio, A.; Vanness, D.J.; Woods, R.; Burnside, E.S.
This study aims to determine the most informative mammographic features for breast cancer diagnosis using mutual information (MI) analysis. Our Health Insurance Portability and Accountability Act-approved database consists of 44,397 consecutive structured mammography reports for 20,375 patients collected from 2005 to 2008. The reports include demographic risk factors (age, family and personal history of breast cancer, and use of hormone therapy) and mammographic features from the Breast Imaging Reporting and Data System lexicon. We calculated MI using Shannon's entropy measure for each feature with respect to the outcome (benign/malignant, using a cancer registry match as the reference standard). To evaluate the validity of the MI rankings of features, we trained and tested naïve Bayes classifiers on the features with tenfold cross-validation, and measured predictive ability using the area under the ROC curve (AUC). We used a bootstrapping approach to assess the distributional properties of our estimates, and the DeLong method to compare AUCs. Based on MI, we found that mass margins and mass shape were the most informative features for breast cancer diagnosis. Calcification morphology, mass density, and calcification distribution provided predictive information for distinguishing benign and malignant breast findings. Breast composition, associated findings, and special cases provided little information in this task. We also found that the rankings of mammographic features by MI and by AUC were generally consistent. MI analysis provides a framework to determine the value of different mammographic features in the pursuit of optimal (i.e., accurate and efficient) breast cancer diagnosis.
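
The mutual information calculation described in the abstract above can be illustrated with a minimal sketch. It assumes discrete feature values and uses the identity I(X; Y) = H(X) + H(Y) - H(X, Y) with Shannon entropy in bits. The function names and the toy data are hypothetical, chosen only for illustration; they are not the study's data or code.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(feature, outcome):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for two equal-length discrete sequences."""
    if len(feature) != len(outcome):
        raise ValueError("feature and outcome must have the same length")
    joint = list(zip(feature, outcome))
    return entropy(feature) + entropy(outcome) - entropy(joint)


# Toy, made-up records (NOT from the study): one feature that tracks the
# outcome perfectly and one that is independent of it.
margins = ["spiculated", "circumscribed", "spiculated", "circumscribed"]
density = ["high", "high", "low", "low"]
outcome = ["malignant", "benign", "malignant", "benign"]

mi_margins = mutual_information(margins, outcome)
mi_density = mutual_information(density, outcome)
```

In this toy data the first feature determines the outcome and so carries one full bit of information about it, while the second is independent of the outcome and carries none. Ranking features by this quantity is the general idea behind an MI-based feature ranking.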
© 2013 Society for Imaging Informatics in Medicine.

Item Open Access
Content-based retrieval of historical Ottoman documents stored as textual images (IEEE, 2004)
Şaykol, E.; Sinop, A. K.; Güdükbay, Uğur; Ulusoy, Özgür; Çetin, A. Enis
There is an accelerating demand to access the visual content of documents stored in historical and cultural archives. The availability of electronic imaging tools and effective image processing techniques makes it feasible to process the multimedia data in large databases. In this paper, a framework for content-based retrieval of historical documents in the Ottoman Empire archives is presented. The documents are stored as textual images, which are compressed by constructing a library (codebook) of the symbols occurring in a document; the symbols in the original image are then replaced with pointers into the codebook to obtain a compressed representation of the image. Features in the wavelet and spatial domains, based on the angular and distance span of shapes, are used to extract the symbols. To perform content-based retrieval in historical archives, a query is specified as a rectangular region in an input image, and the same symbol-extraction process is applied to the query region. The queries are processed on the codebook of documents, and the query images are identified in the resulting documents using the pointers in the textual images. The querying process does not require decompression of images. The new content-based retrieval framework is also applicable to many other document archives using different scripts.
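
The codebook idea in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: symbols are taken as already extracted and are represented as small binary bitmaps (tuples of tuples), and matching is by exact equality, whereas the paper matches symbols using wavelet- and spatial-domain shape features. The names `compress` and `find_symbol` are hypothetical.

```python
def compress(symbols):
    """Build a codebook of unique symbols and a pointer list into it."""
    codebook = []   # unique symbols, in order of first appearance
    index_of = {}   # symbol -> its index in the codebook
    pointers = []   # one codebook index per symbol position in the document
    for sym in symbols:
        if sym not in index_of:
            index_of[sym] = len(codebook)
            codebook.append(sym)
        pointers.append(index_of[sym])
    return codebook, pointers


def find_symbol(query, codebook, pointers):
    """Return document positions of `query` without decompressing.

    The query is compared against codebook entries only; occurrences are
    then located through the pointer list.
    """
    matches = {i for i, sym in enumerate(codebook) if sym == query}
    return [pos for pos, ptr in enumerate(pointers) if ptr in matches]


# Toy symbols (hypothetical 2x2 bitmaps standing in for extracted glyphs).
A = ((1, 0), (1, 1))
B = ((0, 1), (1, 0))
page = [A, B, A, A]

codebook, pointers = compress(page)
hits = find_symbol(A, codebook, pointers)
```

Because each distinct symbol is stored once, a query only needs to be compared against the codebook entries, and the pointer list then gives every position where a matching symbol occurs.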