Browsing by Subject "Social networks"
Item Open Access
An analysis of social networks based on tera-scale telecommunication datasets (IEEE Computer Society, 2019) Aksu, Hidayet; Körpeoğlu, İbrahim; Ulusoy, Özgür
With the popularization of mobile phone usage, telecommunication networks have turned into a socially binding medium. Considering the traces of human communication held inside these networks, telecommunication networks are now able to provide a proxy for human social networks. To study degree characteristics and structural properties in large-scale social networks, we gathered a tera-scale dataset of call detail records that contains ≈ 5 × 10⁷ nodes and ≈ 3.6 × 10¹⁰ links for three GSM (mobile) networks, as well as ≈ 1.4 × 10⁷ nodes and ≈ 1.9 × 10⁹ links for one PSTN (fixed-line) network. In this paper, we first empirically evaluate some statistical models against the degree distribution of the country's call graph and determine that a Pareto log-normal distribution provides the best fit, despite claims in the literature that power-law distribution is the best model. We then question how network operator, size, density, and location affect degree distribution to understand the parameters governing it in social networks. Our empirical analysis indicates that changes in density, operator and location do not show a particular correlation with degree distribution; however, the average degree of social networks is proportional to the logarithm of network size. We also report on the structural properties of the communication network. These novel results are useful for managing and planning communication networks.

Item Open Access
Analyzing developer contributions using artifact traceability graphs (Springer New York LLC, 2022-03-28) Çetin, H. Alperen; Tüzün, Eray
Context: In a software project, properly analyzing the contributions of developers could provide valuable insights for decision-makers.
The contributions of a developer could be in many different forms such as committing and reviewing code, opening and resolving issues. Previous approaches mainly consider the commit-based contributions which provide an incomplete picture of developer contributions. Objective: Different from the traditional commit-based approaches for analyzing developer contributions, we aim to provide a more holistic approach to reflect the rich set of software development activities using artifact traceability graphs. Method: For analyzing the developer contributions, we propose a novel categorization of developers (Jacks, Mavens and Connectors) in a software project. We introduce a set of algorithms on artifact traceability graphs to identify key developers, recommend replacements for leaving developers and evaluate knowledge distribution among developers. Results: We evaluate our proposed algorithms on six open-source projects and demonstrate that the identified key developers match the top commenters up to 98%, recommended replacements are correct up to 91% and identified knowledge distribution labels are compatible 94% on average with the baseline approaches. Conclusions: The proposed algorithms using artifact traceability graphs for analyzing developer contributions could be used by software project decision-makers in several scenarios. (1) Identifying different types of key developers. (2) Finding a replacement developer in large teams. (3) Evaluating the overall knowledge distribution amongst developers to take early precautions.

Item Open Access
Analyzing developer contributions using artifact traceability graphs (2020-12) Çetin, Hamdi Alperen
Software artifacts are the by-products of the development process. Throughout the life cycle of a project, developers produce different artifacts such as source files and bug reports.
To analyze developer contributions, we construct artifact traceability graphs with these artifacts and their relations using the data from software development and collaboration tools. Developers are the main resource to build and maintain software projects. Since they keep the knowledge of the projects, developer turnover is a critical risk for software projects. From different viewpoints, some developers can be valuable and indispensable for the project. They are the key developers of the project, and identifying them is a crucial task for managerial decisions. Regardless of whether they are key developers or not, when developers leave the project, their work should be transferred to other developers. Even though all developers continue to work on the project, the knowledge distribution can be imbalanced among developers. Evaluating knowledge distribution is important since it might be an early warning for future problems. We employ algorithms on artifact traceability graphs to identify key developers, recommend replacements for leaving developers and evaluate knowledge distribution among developers. We conduct experiments on six open source projects: Hadoop, Hive, Pig, HBase, Derby and Zookeeper. Then, we demonstrate that the identified key developers match the top commenters up to 98%, recommended replacements are correct up to 91% and identified knowledge distribution labels are compatible with the baseline approach up to 94%.

Item Open Access
Cascade-aware partitioning of large graph databases (Springer, 2019) Demirci, Gündüz Vehbi; Ferhatosmanoğlu, H.; Aykanat, Cevdet
Graph partitioning is an essential task for scalable data management and analysis. The current partitioning methods utilize the structure of the graph, and the query log if available. Some queries performed on the database may trigger further operations. For example, the query workload of a social network application may contain re-sharing operations in the form of cascades.
It is beneficial to include the potential cascades in the graph partitioning objectives. In this paper, we introduce the problem of cascade-aware graph partitioning that aims to minimize the overall cost of communication among parts/servers during cascade processes. We develop a randomized solution that estimates the underlying cascades, and use it as an input for partitioning of large-scale graphs. Experiments on 17 real social networks demonstrate the effectiveness of the proposed solution in terms of the partitioning objectives.

Item Open Access
The effect of social networks on the quality of political thinking (Wiley-Blackwell Publishing, Inc., 2012) Erisen, E.; Erisen, C.
In this article we investigate the effect of social networks on the quality of political thinking. First, the article introduces new social network concepts into the literature and develops the corresponding measures. Second, the article explores the quality of political thinking as a concept and develops its measures based on the volume and the causality of thoughts, and their integrative complexity. We make use of a survey to collect information on social networks and the experimental manipulation controls for the effect of policy frames. Our findings consistently show the significant negative impact of cohesive social networks on the quality of policy-relevant thinking. We conclude that close-knit social networks could create "social bubbles" that would limit how one communicates with others and reasons about politics. © 2012 International Society of Political Psychology.

Item Open Access
Efficient quantification of profile matching risk in social networks using belief propagation (Springer Science and Business Media Deutschland GmbH, 2020) Halimi, A.; Ayday, Erman; Chen, L.; Li, N.; Liang, K.; Schneider, S.
Many individuals share their opinions (e.g., on political issues) or sensitive information about them (e.g., health status) on the internet in an anonymous way to protect their privacy.
However, anonymous data sharing has been becoming more challenging in today’s interconnected digital world, especially for individuals that have both anonymous and identified online activities. The most prominent example of such data sharing platforms today are online social networks (OSNs). Many individuals have multiple profiles in different OSNs, including anonymous and identified ones (depending on the nature of the OSN). Here, the privacy threat is profile matching: if an attacker links anonymous profiles of individuals to their real identities, it can obtain privacy-sensitive information which may have serious consequences, such as discrimination or blackmailing. Therefore, it is very important to quantify and show to the OSN users the extent of this privacy risk. Existing attempts to model profile matching in OSNs are inadequate and computationally inefficient for real-time risk quantification. Thus, in this work, we develop algorithms to efficiently model and quantify profile matching attacks in OSNs as a step towards real-time privacy risk quantification. For this, we model the profile matching problem using a graph and develop a belief propagation (BP)-based algorithm to solve this problem in a significantly more efficient and accurate way compared to the state-of-the-art. We evaluate the proposed framework on three real-life datasets (including data from four different social networks) and show how users’ profiles in different OSNs can be matched efficiently and with high probability. We show that the proposed model generation has linear complexity in terms of number of user pairs, which is significantly more efficient than the state-of-the-art (which has cubic complexity). Furthermore, it provides comparable accuracy, precision, and recall compared to state-of-the-art. Thanks to the algorithms that are developed in this work, individuals will be more conscious when sharing data on online platforms. 
We anticipate that this work will also drive the technology so that new privacy-centered products can be offered by the OSNs.

Item Open Access
Embracing American culture: structures of social identity and social networks among first-generation biculturals (Sage Publications, 2007) Mok, A.; Morris, M. W.; Benet-Martínez, V.; Karakitapoğlu-Aygün, Z.
This study examines the relationship between bicultural individuals' identity structure and their friendship network. A key dimension of identity structure for first-generation immigrants is the degree to which the secondary, host-culture identity is integrated into the primary, ethnic identity. Among first-generation Chinese Americans, regression analyses controlling for cultural identification strengths show that more integrated identity structures are associated with larger and more richly interconnected circles of non-Chinese friends.

Item Open Access
Estimating network structure via random sampling: cognitive social structures and the adaptive threshold method (Elsevier, 2012-10) Siciliano, M. D.; Yenigun, D.; Ertan, G.
This paper introduces and tests a novel methodology for measuring networks. Rather than collecting data to observe a network or several networks in full, which is typically costly or impossible, we randomly sample a portion of individuals in the network and estimate the network based on the sampled individuals' perceptions on all possible ties. We find the methodology produces accurate estimates of social structure and network level indices in five different datasets. In order to illustrate the performance of our approach we compare its results with the traditional roster and ego network methods of data collection. Across all five datasets, our methodology outperforms these standard social network data collection methods.
We offer ideas on applications of our methodology, and find it especially promising in cross-network settings.

Item Unknown
Identifying key developers using artifact traceability graphs (Association for Computing Machinery, 2020) Çetin, H. Alperen; Tüzün, Eray; Minku, L.; Menzies, T.; Nagappan, M.
Developers are the most important resource to build and maintain software projects. Due to various reasons, some developers take more responsibility, and this type of developers are more valuable and indispensable for the project. Without them, the success of the project would be at risk. We use the term key developers for these essential and valuable developers, and identifying them is a crucial task for managerial decisions such as risk assessment for potential developer resignations. We study key developers under three categories: jacks, mavens and connectors. A typical jack (of all trades) has a broad knowledge of the project, they are familiar with different parts of the source code, whereas mavens represent the developers who are the sole experts in specific parts of the projects. Connectors are the developers who involve different groups of developers or teams. They are like bridges between teams. To identify key developers in a software project, we propose to use traceable links among software artifacts such as the links between change sets and files. First, we build an artifact traceability graph, then we define various metrics to find key developers. We conduct experiments on three open source projects: Hadoop, Hive and Pig. To validate our approach, we use developer comments in issue tracking systems and demonstrate that the identified key developers by our approach match the top commenters up to 92%.

Item Unknown
Identifying the most valuable developers using artifact traceability graphs (Association for Computing Machinery, Inc, 2019) Çetin, H.
Alperen; Apel, S.; Dumas, M.; Russo, A.; Pfahl, D.
Finding the most valuable and indispensable developers is a crucial task in software development. We categorize these valuable developers into two categories: connector and maven. A typical connector represents a developer who connects different groups of developers in a large-scale project. Mavens represent the developers who are the sole experts in specific modules of the project. To identify the connectors and mavens, we propose an approach using graph centrality metrics and connections of traceability graphs. We conducted a preliminary study on this approach by using two open source projects: QT 3D Studio and Android. Initial results show that the approach leads to identify the essential developers.

Item Unknown
Integrating social factors into mobile local search (2015-08) Kahveci, Basri
As availability of internet access on mobile devices develops year after year, users have been able to make use of mobile internet and search services while on the go. Location information on these devices has enabled mobile users to utilize local search applications for discovering places and activities around them. Although mobile local search is a kind of search activity, it is inherently different than general web search. Mobile local search focuses on local businesses and points of interest, instead of web pages as in general web search. Moreover, users' context has a significant effect on their decision process. In previous studies, ranking signals and user context have been investigated on a small set of features. We extend ranking signals and user context in mobile local search with using data of location-based social networks. We developed a mobile local search application, Gezinio, and collected a data set of local search queries. Gezinio helps users to issue local queries and see various kinds of social information about local businesses around them.
We built ranking models and investigated how social features affect decision process of users. We show that social features influence users' click decisions and they can be utilized by ranking models to improve the local search experience. Additionally, we propose different social features for different query categories.

Item Unknown
Learning from failures: Director interlocks and corporate misconduct (Elsevier BV, 2022-10-20) Wang, Z.; Yao, S.; Şensoy, Ahmet; Goodell, J. W.; Cheng, F.
Motivated by social learning and social network theories, we argue that firms learn from failures in their director interlocked firms. Empirical results show that enforcement for violations in errant firms inhibit misconduct commitments in focal firms (i.e., firms interlocked with errant firms). We investigate the role of interlocking directors in facilitating the inhibition of misconduct. Empirical results evidence that information transmission by interlocking directors plays a crucial role in the process of inhibitive learning. Besides information transmission, we also find that interlocking directors react with higher diligence in focal firms. Further, overall diligence of independent directors in focal firms is heightened. Additionally, we test several factors that influence the significance of this inhibition, including characteristics of interlocking directors, firm features, and industry characters. Finally, the enforcement can deter more than one form of misconduct in focal firms. Overall, we thoroughly investigate the reactions of focal firms and their directors.
Our study focuses on inhibitive learning, which has received limited attention in corporate finance literature.

Item Unknown
Mathematical model of causal inference in social networks (IEEE, 2016) Şimsek, Mustafa; Delibalta, İ.; Baruh, L.; Kozat, Süleyman Serdar
In this article, we model the effects of machine learning algorithms on different Social Network users by using a causal inference framework, making estimation about the underlying system and design systems to control underlying latent unobservable system. In this case, the latent internal state of the system can be a wide range of interest of user. For example, it can be a user's preferences for some certain products or affiliation of the user to some political parties. We represent these variables using state space model. In this model, the internal state of the system, e.g. the preferences or affiliations of the user is observed using user's connections with the Social Networks such as Facebook status updates, shares, comments, blogs, tweets etc.

Item Unknown
Misinformation detection by leveraging user communities on social media (2024-05) Özçelik, Oğuzhan
Social media platforms have become a primary source of accessing information. However, the spread of misinformation is inevitable due to the ease of creating and sharing malicious content, including fake news. Social media users on such platforms (e.g., Twitter) often find themselves exposed to similar viewpoints and tend to avoid contrasting opinions, particularly when connected within a community. To investigate this problem, we examine the presence of user communities and leverage them as a tool to detect misinformation on social media. In this thesis, we first collect tweets together with user engagements relevant to recent events between 2020 and 2022. We then construct a human-annotated social media dataset having 5,284 English and 5,064 Turkish tweets with their veracity labels.
After the data construction process, we leverage the presence of user communities for misinformation detection on social media. For this purpose, we propose a text similarity-based method that utilizes user-follower interactions within a social network to identify misinformation content. Our method first extracts important textual features of social media posts using contrastive learning. We then measure the similarity for each social media post, based on its relevance to each user in the community. Next, we train a classifier to assess the truthfulness of social media posts using these similarity scores. We evaluate our approach on three social media datasets and compare our method with the state-of-the-art approaches. The experimental results show that contrastive learning and user communities can effectively enhance the detection of misinformation on social media. Our model can identify misinformation content by achieving a consistently high weighted F1 score of over 90% across all datasets, even employing only a small number of users in communities.

Item Unknown
A noncooperative dynamic game model of opinion dynamics in multilayer social networks (2017-08) Niazi, Muhammad Umar B.
How do people living in a society form their opinions on daily or prevalent topics? A noncooperative differential (dynamic) game model of opinion dynamics, where the agents' motives are shaped by how susceptible they are to others' influence, how stubborn they are, and how quick they are willing to change their opinions on socially prevalent issues is considered here. The agents connected through a multilayer network interact with each other on a set of issues (layers) for a finite time duration. They express their opinions, listen to others' and, hence, mutually influence each other. The tendency of agents to interact with people of similar traits, known as homophily, restricts them in their own localities, which may correspond to ethnicity but may as well be the ideological ones.
This governs their interpersonal influences and is the cause of clustering in the network. As the agents build their biases, they also create conceptions about the correlation between the issues. As a result, antagonistic interactions arise if the agents see each other as holding inconsistent opinions on the issues according to their individual conceptions. This way the interpersonal influence becomes ineffective leading to conflict and disagreement between the agents. The dynamic game formulated here takes these subtle issues into account. The game is proved to admit a unique Nash equilibrium under a mild necessary and sufficient condition. This condition is argued to be fulfilled if there is some harmony of views among the agents in the network. The harmony may be in the form of similarity in pairwise conceptions about the issues but may also be a collective agreement on the status of a leader in the network. Since the agents do not seek any social motive in the game but their own individual motives, the existence of a Nash equilibrium can be interpreted as an emergent collective behavior out of the noncooperative actions of the agents.

Item Open Access
Online contextual influence maximization with costly observations (IEEE, 2019-06) Sarıtaç, Anıl Ömer; Karakurt, Altuğ; Tekin, Cem
In the online contextual influence maximization problem with costly observations, the learner faces a series of epochs in each of which a different influence spread process takes place over a network. At the beginning of each epoch, the learner exogenously influences (activates) a set of seed nodes in the network. Then, the influence spread process takes place over the network, through which other nodes get influenced. The learner has the option to observe the spread of influence by paying an observation cost. The goal of the learner is to maximize its cumulative reward, which is defined as the expected total number of influenced nodes over all epochs minus the observation costs.
We depart from the prior work in three aspects: 1) the learner does not know how the influence spreads over the network, i.e., it is unaware of the influence probabilities; 2) influence probabilities depend on the context; and 3) observing influence is costly. We consider two different influence observation settings: costly edge-level feedback, in which the learner freely observes the set of influenced nodes, but pays to observe the influence outcomes on the edges of the network; and costly node-level feedback, in which the learner pays to observe whether a node is influenced or not. Since the offline influence maximization problem itself is NP-hard, for these settings, we develop online learning algorithms that use an approximation algorithm as a subroutine to obtain the set of seed nodes in each epoch. When the influence probabilities are Hölder continuous functions of the context, we prove that these algorithms achieve sublinear regret (for any sequence of contexts) with respect to an approximation oracle that knows the influence probabilities for all contexts. Our numerical results on several networks illustrate that the proposed algorithms perform on par with the state-of-the-art methods even when the observations are cost free.

Item Open Access
Partitioning models for scaling distributed graph computations (2019-08) Demirci, Gündüz Vehbi
The focus of this thesis is intelligent partitioning models and methods for scaling the performance of parallel graph computations on distributed-memory systems. Distributed databases utilize graph partitioning to provide servers with data-locality and workload-balance. Some queries performed on a database may form cascades due to the queries triggering each other. The current partitioning methods consider the graph structure and logs of query workload. We introduce the cascade-aware graph partitioning problem with the objective of minimizing the overall cost of communication operations between servers during cascade processes.
We propose a randomized algorithm that integrates the graph structure and cascade processes to use as input for large-scale partitioning. Experiments on graphs representing real social networks demonstrate the effectiveness of the proposed solution in terms of the partitioning objectives. Sparse-general-matrix-multiplication (SpGEMM) is a key computational kernel used in scientific computing and high-performance graph computations. We propose an SpGEMM algorithm for Accumulo database which enables high performance distributed parallelism through its iterator framework. The proposed algorithm provides write-locality and avoids scanning input matrices multiple times by utilizing Accumulo's batch scanning capability and node-level parallelism structures. We also propose a matrix partitioning scheme that reduces the total communication volume and provides a workload-balance among servers. Extensive experiments performed on both real-world and synthetic sparse matrices show that the proposed algorithm and matrix partitioning scheme provide significant performance improvements. Scalability of parallel SpGEMM algorithms are heavily communication bound. Multidimensional partitioning of SpGEMM's workload is essential to achieve higher scalability. We propose hypergraph models that utilize the arrangement of processors and also attain a multidimensional partitioning on SpGEMM's workload. Thorough experimentation performed on both realistic as well as synthetically generated SpGEMM instances demonstrates the effectiveness of the proposed partitioning models.

Item Open Access
Profile matching across online social networks (Springer Science and Business Media Deutschland GmbH, 2020) Halimi, A.; Ayday, Erman; Meng, W.; Gollmann, D.; Jensen, C. D.; Zhou, J.
In this work, we study the privacy risk due to profile matching across online social networks (OSNs), in which anonymous profiles of OSN users are matched to their real identities using auxiliary information about them.
We consider different attributes that are publicly shared by users. Such attributes include both strong identifiers such as user name and weak identifiers such as interest or sentiment variation between different posts of a user in different platforms. We study the effect of using different combinations of these attributes to profile matching in order to show the privacy threat in an extensive way. The proposed framework mainly relies on machine learning techniques and optimization algorithms. We evaluate the proposed framework on three datasets (Twitter - Foursquare, Google+ - Twitter, and Flickr) and show how profiles of the users in different OSNs can be matched with high probability by using the publicly shared attributes and/or the underlying graphical structure of the OSNs. We also show that the proposed framework notably provides higher precision values compared to state-of-the-art that relies on machine learning techniques. We believe that this work will be a valuable step to build a tool for the OSN users to understand their privacy risks due to their public sharings.

Item Open Access
SiMiD: similarity-based misinformation detection via communities on social media posts (IEEE, 2024-01-02) Özçelik, Oğuzhan; Toraman, C.; Can, Fazlı
Social media users often find themselves exposed to similar viewpoints and tend to avoid contrasting opinions, particularly when connected within a community. In this study, we leverage the presence of communities in misinformation detection on social media. For this purpose, we propose a similarity-based method that utilizes user-follower interactions within a social network to identify and combat misinformation spread. The method first extracts important textual features of social media posts via contrastive learning and then measures the cosine similarity per social media post based on their relevance to each user in the community.
Next, we train a classifier to assess the truthfulness of social media posts using these similarity scores. We evaluate our approach on three real-world datasets and compare our method with six baselines. The experimental results and statistical tests show that contrastive learning and leveraging communities can effectively enhance the detection of misinformation on social media.

Item Open Access
Social networks and credit access in Indonesia (Pergamon Press, 2004) Okten, C.; Osili, U. O.
In this paper, we investigate how family and community networks affect an individual's access to credit institutions using new data from the Indonesia Family Life Surveys. Our theoretical framework emphasizes the family and community's role in providing information, thus lowering the search costs of the borrower and monitoring and enforcement costs for the lender. From our empirical results, community and family networks are important in knowing a place to borrow, as well as for loan approval. Consistent with an information-based explanation of networks, family and community networks have a larger impact on credit awareness of new credit institutions with a lower impact on awareness of established credit sources. Interestingly, we find that women benefit from participating in community networks more than men. There is no evidence that the rich benefit from community networks more than the poor. Our results on the benefits from participation in the community network are robust to the inclusion of community fixed effects. © 2004 Elsevier Ltd. All rights reserved.