Browsing by Subject "Graph mining"

Open Access
Hydra: Detecting fraud in financial transactions via graph based representation and visual analysis
(IEEE, 2020) Canbaz, Yusuf Sait; Doğrusöz, Uğur; Çeliksoy, M.; Güngör, F.; Kurban, K.
In this paper, we describe a web-based tool named Hydra for analyzing financial transaction data, with the aim of detecting or verifying fraudulent activities via visual analysis and graph-based querying. Hydra exclusively uses graph-based query algorithms to mine useful information in the transaction database and presents the results visually, facilitating interactive graphical analysis with state-of-the-art graph visualization technologies. We present the various components of Hydra and their aims. In addition, we provide a number of scenarios that use these components on a network of prepaid card transactions to illustrate how Hydra can detect or verify fraudulent activities.
Open Access
RSTrace+: Reviewer suggestion using software artifact traceability graphs
(Elsevier BV, 2021-02) Sülün, Emre; Tüzün, Eray; Doğrusöz, Uğur
Context: Various types of artifacts (requirements, source code, test cases, documents, etc.) are produced throughout the lifecycle of a software system. These artifacts are connected with each other via traceability links that are stored in modern application lifecycle management repositories. Throughout the lifecycle, various types of changes can arise in any one of these artifacts. It is important to review such changes to minimize their potential negative impacts. To make sure the review is conducted properly, the reviewer(s) should be chosen appropriately. Objective: We previously introduced a novel approach, named RSTrace, to automatically recommend the reviewers that are best suited based on their familiarity with a given artifact. In this study, we introduce an advanced version of RSTrace, named RSTrace+, that accounts for the recency of traceability links and includes practical tool support for GitHub. Methods: We conducted a series of experiments on finding the appropriate code reviewer(s) using RSTrace+ and compared it with other code reviewer recommendation approaches. Results: We initially tested RSTrace+ on an open source project (Qt 3D Studio) and achieved a top-3 accuracy of 0.89 with an MRR (mean reciprocal rank) of 0.81. In a further empirical evaluation of 40 open source projects, we compared RSTrace+ with Naive Bayes, RevFinder, and a profile-based approach, and observed higher accuracies on average. Conclusion: We confirmed that the proposed reviewer recommendation approach yields promising top-k and MRR scores on average compared to the existing reviewer recommendation approaches. Unlike other code reviewer recommendation approaches, RSTrace+ is not limited to recommending reviewers for source code artifacts and can potentially be used for recommending reviewers for other types of artifacts. Our approach can also visualize the affected artifacts and help the developer assess the potential impacts of a change on the reviewed artifact.
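The top-k accuracy and MRR figures quoted in the abstract above are standard ranking metrics. The following is a minimal sketch of how such scores are commonly computed, not the authors' evaluation code. The function names, the input layout (one ranked list of recommended reviewers per review, plus the set of actual reviewers), and the sample data are all assumptions made for illustration.

```python
def top_k_accuracy(ranked_lists, actual_reviewers, k):
    """Fraction of reviews where at least one actual reviewer is in the top k."""
    hits = 0
    for recommended, actual in zip(ranked_lists, actual_reviewers):
        if set(recommended[:k]) & actual:
            hits += 1
    return hits / len(ranked_lists)


def mean_reciprocal_rank(ranked_lists, actual_reviewers):
    """Average of 1/rank of the first correct recommendation (0 if none)."""
    total = 0.0
    for recommended, actual in zip(ranked_lists, actual_reviewers):
        for rank, reviewer in enumerate(recommended, start=1):
            if reviewer in actual:
                total += 1.0 / rank
                break  # only the first correct hit counts
    return total / len(ranked_lists)


# Hypothetical data: three reviews, each with a ranked recommendation list.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
actual = [{"a"}, {"a"}, {"a"}]

top1 = top_k_accuracy(ranked, actual, k=1)
top3 = top_k_accuracy(ranked, actual, k=3)
mrr = mean_reciprocal_rank(ranked, actual)
```

In this toy data the correct reviewer appears at ranks 1, 2, and 3, so top-3 accuracy is 1.0 while the MRR is lower because later hits contribute less.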
Open Access
Structural scene analysis of remotely sensed images using graph mining
(2010) Özdemir, Bahadır
The need for intelligent systems capable of automatic content extraction and classification in remote sensing image datasets has been constantly increasing due to advances in satellite technology and the availability of detailed images with a wide coverage of the Earth. Increasing details in very high spatial resolution images obtained from new generation sensors have enabled new applications but also introduced new challenges for object recognition. Contextual information about the image structures has the potential of improving individual object detection. Therefore, identifying the image regions which are intrinsically heterogeneous is an alternative way for high-level understanding of the image content. These regions, also known as compound structures, are comprised of primitive objects of many diverse types. Popular representations such as the bag-of-words model use primitive object parts extracted using local operators but cannot capture their structure because of the lack of spatial information. Hence, the detection of compound structures necessitates new image representations that involve joint modeling of spectral, spatial and structural information. We propose an image representation that combines the representational power of graphs with the efficiency of the bag-of-words representation. The proposed method has three parts. In the first part, every image in the dataset is transformed into a graph structure using the local image features and their spatial relationships. The transformation method first detects the local patches of interest using maximally stable extremal regions obtained by gray level thresholding. Next, these patches are quantized to form a codebook of local information and a graph is constructed for each image by representing the patches as the graph nodes and connecting them with edges obtained using Voronoi tessellations.
Transforming images to graphs provides an abstraction level and the remaining operations for the classification are made on graphs. The second part of the proposed method is a graph mining algorithm which finds a set of most important subgraphs for the classification of image graphs. The graph mining algorithm we propose first finds the frequent subgraphs for each class, then selects the most discriminative ones by quantifying the correlations between the subgraphs and the classes in terms of the within-class occurrence distributions of the subgraphs, and finally reduces the set size by selecting the most representative ones by considering the redundancy between the subgraphs. After mining the set of subgraphs, each image graph is represented by a histogram vector of this set where each component in the histogram stores the number of occurrences of a particular subgraph in the image. The subgraph histogram representation enables classifying the image graphs using statistical classifiers. The last part of the method involves model learning from labeled data. We use support vector machines (SVM) for classifying images into semantic scene types. In addition, the themes distributed among the images are discovered using the latent Dirichlet allocation (LDA) model trained on the same data. In this way, the images which have heterogeneous content from different scene types can be represented in terms of a theme distribution vector. This representation enables further classification of images by theme analysis. The experiments using an Ikonos image of Antalya show the effectiveness of the proposed representation in classification of complex scene types. The SVM model achieved a promising classification accuracy on the images cut from the Antalya image for the eight high-level semantic classes. Furthermore, the LDA model discovered interesting themes in the whole satellite image.
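The histogram representation described in the abstract above can be illustrated with a deliberately simplified sketch. It assumes the mined patterns are single labeled edges (the smallest possible subgraphs), whereas the thesis mines larger frequent subgraphs. The function name, the integer codeword labels, and the sample graph are hypothetical and not taken from the thesis.

```python
from collections import Counter


def edge_pattern_histogram(node_labels, edges, patterns):
    """Count how often each labeled-edge pattern occurs in one image graph.

    node_labels: dict mapping node id -> codeword id (quantized patch label)
    edges:       iterable of undirected (u, v) node-id pairs
    patterns:    list of sorted (label, label) tuples, one per histogram bin
    """
    counts = Counter(
        tuple(sorted((node_labels[u], node_labels[v]))) for u, v in edges
    )
    # One histogram component per pattern, in the order given.
    return [counts[p] for p in patterns]


# Hypothetical image graph: four patches labeled with codeword ids.
labels = {0: 3, 1: 7, 2: 3, 3: 5}
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
patterns = [(3, 7), (3, 3), (3, 5), (5, 7)]

histogram = edge_pattern_histogram(labels, edges, patterns)
```

Each image graph yields a fixed-length vector of this kind, which is what allows a statistical classifier such as an SVM to be applied to graphs of different sizes.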
Open Access
Use of subgraph mining in histopathology image classification
(2022-09) Berdiyev, Bayram
Breast cancer is the most common cancer in women and has a high mortality rate. Computer vision techniques can help experts better analyze breast cancer biopsy samples. Graph neural networks (GNN) have been widely used for the classification of breast cancer images. Images in this field have varying sizes, and GNNs can be applied to inputs of varying size. Graphs can also store relations between their vertices, which is another reason why GNNs are preferred as a solution. We study the use of subgraph mining in the classification of regions of interest (ROI) on breast histopathology images. We represent ROI samples with graphs by using patches sampled on nuclei-rich regions as the vertices of the graph. Both micro- and macro-level information are essential when classifying histopathology images. The patches are used to model micro-level information. We apply subgraph mining to the resulting graphs to identify frequently occurring subgraphs. Each subgraph is composed of a small number of patches and their relations, which can be used to represent higher level information. We also extract ROI-level features by applying a sliding window mechanism with larger patches. The ROI-level features, subgraph features and a third representation obtained from graph convolutional networks are fused to model macro-level information about the ROIs. We also study embedding the subgraphs in the graph representation as additional vertices. The proposed models are evaluated on a challenging breast pathology dataset that includes four diagnostic categories from the full spectrum. The experiments show that embedding the subgraphs in the graph representation improves the classification accuracy, and an ablation study shows that the fused feature representation performs better than the individual representations.