A Gentle Introduction to Graph Theory: Definition, Type

Author: 禅与计算机程序设计艺术

1. Background

Graph theory is a cornerstone of many disciplines, including computer science, mathematics, physics, and engineering. It studies the properties of graphs. This article introduces the fundamental concepts of graph theory through definitions and essential terminology. It covers the main graph types, such as directed versus undirected and weighted versus unweighted, and their adjacency-matrix and adjacency-list representations. It also discusses common graph algorithms, including traversal, shortest paths, connectivity, clustering, and community detection. Following this theoretical foundation, practical implementations are demonstrated in programming languages such as Python and Java. Real-world datasets from fields such as social networks, bioinformatics, transportation systems, and economics are used to illustrate algorithm performance, chosen so that they remain manageable on typical computing resources. For interested readers, the appendix provides detailed explanations, code implementations, and documentation covering specific aspects of each algorithm.

In short, the fundamentals of graph theory give us powerful tools for problems such as finding paths between nodes, analyzing connections among people on social platforms, routing traffic efficiently through transportation networks, and detecting clusters of similar customers in large datasets. A solid command of graph theory helps us design more effective software and analyze existing programs more efficiently.

2. Core Concepts and Relationships

Definitions and Terminology

A graph consists of vertices (also called nodes) connected by edges. An edge may carry a direction, which indicates that the connection runs from one vertex to another rather than both ways. The three primary categories of graphs are directed, undirected, and mixed graphs.

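Directed Graph: Every edge has a direction and is drawn as an arrow from a source vertex to a target vertex; an edge from u to v does not imply an edge from v to u. Examples include road maps with one-way streets, airline flight routes, and phone call records.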

Undirected Graph: No edge has a direction. Each edge connects two vertices without regard to order, so the relationship holds in both directions. These graphs are well suited to modeling mutual relationships. Common applications include social networks representing friendships, business partnerships, and trade relations, where the interaction is symmetric.

Mixed Graph: A mixed graph contains both directed edges (arcs) and undirected edges (links). Take, for instance, a movie network in which co-starring is an undirected relationship between two actors, while directing is a directed relationship from a director to an actor. Mixed graphs are useful when some relationships are mutual and others are one-way, as can happen among sports teams or groups of acquaintances within a social network.

Each vertex represents an entity, and each edge links two entities. A weight assigned to an edge expresses the strength or cost of the association between them. Some networks carry weights on their edges, whereas others have none. For example, in a social network, interactions between users can be quantified by metrics such as login frequency or message counts to gauge relationship strength. In transportation networks such as road maps or flight routes, geographical distance often serves as the edge weight between locations.
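The distinctions above (directed versus undirected, weighted versus unweighted) show up directly in how a graph is stored. The following is a minimal sketch, assuming vertices are numbered 0 to n-1 and edges arrive as (u, v, weight) triples. The helper names build_adjacency_list and build_adjacency_matrix are illustrative and do not come from any library.

```python
def build_adjacency_list(n, edges, directed=False):
    """Return adj where adj[u] is a list of (neighbour, weight) pairs."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        if not directed:
            adj[v].append((u, w))  # undirected: store the edge in both directions
    return adj


def build_adjacency_matrix(n, edges, directed=False):
    """Return an n x n matrix; matrix[u][v] is the edge weight, or None if absent."""
    matrix = [[None] * n for _ in range(n)]
    for u, v, w in edges:
        matrix[u][v] = w
        if not directed:
            matrix[v][u] = w
    return matrix


edges = [(0, 1, 4), (1, 2, 7)]  # hypothetical weighted edges
undirected_list = build_adjacency_list(3, edges)
directed_matrix = build_adjacency_matrix(3, edges, directed=True)
```

The adjacency list uses memory proportional to the number of edges, while the matrix always uses n x n cells but answers "is there an edge from u to v?" in constant time.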

Graph theory terminology includes degree, centrality, connectivity, cliques, cycles, spanning trees, and cut edges (bridges). The following gives a concise overview of these concepts:

Degree: The degree of a vertex is the number of edges incident to it; in a simple graph this equals the number of adjacent vertices. In a directed graph, the out-degree and in-degree of a vertex are the numbers of outgoing and incoming edges, respectively.
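As a small illustration of in-degree and out-degree, the sketch below counts both for a directed graph stored as a list of neighbour lists. The function name degrees and the example graph are assumptions made for this illustration.

```python
def degrees(graph):
    """Return (out_degree, in_degree) lists for a directed adjacency list."""
    out_degree = [len(neighbours) for neighbours in graph]
    in_degree = [0] * len(graph)
    for neighbours in graph:
        for v in neighbours:
            in_degree[v] += 1  # each edge u -> v contributes one incoming edge to v
    return out_degree, in_degree


# Directed example: 0 -> 1, 1 -> 2, 1 -> 3, 3 -> 2; vertex 4 is isolated.
graph = [[1], [2, 3], [], [2], []]
out_degree, in_degree = degrees(graph)
```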

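Centrality: Centrality measures assess the importance of a node, or a set of nodes, in a network; several of them derive a node's importance from the importance of its neighbors. Three commonly used measures are PageRank, eigenvector centrality, and closeness centrality.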
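Of the three measures, closeness centrality is the simplest to compute by hand. The sketch below assumes a connected, unweighted, undirected graph and uses the common definition (n - 1) divided by the sum of shortest-path distances from the vertex; other normalizations exist. The function name closeness_centrality is illustrative.

```python
from collections import deque


def closeness_centrality(graph, source):
    """Closeness of source in a connected, unweighted graph: (n - 1) / sum of distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Undirected path 0 - 1 - 2: the middle vertex is closest to everyone.
path = [[1], [0, 2], [1]]
scores = [closeness_centrality(path, v) for v in range(3)]
```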

Connectivity: Connectivity describes whether every pair of vertices in a graph is linked by a path. When some pair is not linked, the graph is called disconnected. The two standard ways to test connectivity are breadth-first search and depth-first search.
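One way to make the connectivity test concrete is to collect the connected components with a stack-based search: the graph is connected exactly when there is a single component. This is a sketch for an undirected adjacency list; the function name connected_components is an assumption.

```python
def connected_components(graph):
    """Return the list of connected components of an undirected adjacency list."""
    seen = [False] * len(graph)
    components = []
    for start in range(len(graph)):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in graph[u]:
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
        components.append(sorted(component))
    return components


# Undirected example: edges 0-1 and 1-2, plus an isolated vertex 3.
graph = [[1], [0, 2], [1], []]
components = connected_components(graph)
is_connected = len(components) == 1
```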

Clique: A clique is a set of vertices in which every two distinct vertices are adjacent. A maximum clique is a clique with the largest possible number of vertices in the graph.
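The definition translates into a direct pairwise check. The sketch below also includes a brute-force search for a maximum clique, which takes exponential time and is only meant for tiny graphs. Both function names are illustrative, and the graph is assumed to be undirected and stored as a dict of neighbour sets.

```python
from itertools import combinations


def is_clique(adjacent, vertices):
    """True if every pair of distinct vertices in vertices is joined by an edge."""
    return all(v in adjacent[u] for u, v in combinations(vertices, 2))


def maximum_clique(adjacent):
    """Brute force: try subsets from largest to smallest. Exponential time."""
    nodes = list(adjacent)
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            if is_clique(adjacent, subset):
                return set(subset)
    return set()


# Undirected example: triangle 0-1-2 plus the extra edge 2-3.
adjacent = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
largest = maximum_clique(adjacent)
```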

Cycles: A cycle is a closed path that begins and ends at the same vertex. In a directed graph, a strongly connected component (SCC) is a maximal set of vertices in which every vertex is reachable from every other. Contracting each SCC to a single vertex yields a directed acyclic graph, although an individual SCC can itself contain cycles.

Spanning tree: A spanning tree connects all vertices of a connected graph without forming a cycle. A minimum spanning tree is a spanning tree whose total edge weight is as small as possible. The span of a graph is defined here as the sum of the weights of all edges in its minimum spanning tree.
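A minimum spanning tree can be computed in several ways. The sketch below uses Kruskal's algorithm with a simple union-find structure, which is one standard choice and not necessarily the method the article has in mind. It assumes an undirected, connected graph whose edges are given as (weight, u, v) triples.

```python
def kruskal(n, edges):
    """Return (total_weight, tree_edges) of a minimum spanning tree.

    edges is a list of (weight, u, v) for an undirected, connected graph.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # edge joins two different components: keep it
            parent[ru] = rv
            total += w
            tree.append((u, v, w))
    return total, tree


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 2, 3)]  # hypothetical (weight, u, v)
total, tree = kruskal(4, edges)
```

Here the edge 0-2 of weight 4 is skipped because vertices 0 and 2 are already connected through vertex 1 at lower cost.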

Bridge: A bridge (cut edge) is an edge whose removal disconnects the network into two separate subnetworks. A minimum cut is a set of edges of smallest total weight whose removal disconnects the network; it need not consist of bridges. Finding minimum cuts can support community detection, because it partitions the nodes into groups that are only weakly linked to each other.
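Bridges can be found with a single depth-first search that records, for each vertex, the earliest discovery time reachable from its subtree (the classic low-link technique). The sketch below assumes an undirected simple graph without parallel edges; the function name find_bridges is illustrative.

```python
def find_bridges(graph):
    """Return the bridges of an undirected simple graph given as an adjacency list."""
    n = len(graph)
    disc = [-1] * n  # discovery time of each vertex, -1 if unvisited
    low = [0] * n    # lowest discovery time reachable from the vertex's subtree
    bridges = []
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        for v in graph[u]:
            if v == parent:
                continue
            if disc[v] == -1:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:  # no back edge bypasses the edge (u, v)
                    bridges.append((u, v))
            else:
                low[u] = min(low[u], disc[v])

    for s in range(n):
        if disc[s] == -1:
            dfs(s, -1)
    return bridges


# Triangle 0-1-2 with a pendant edge 2-3: only 2-3 is a bridge.
graph = [[1, 2], [0, 2], [0, 1, 3], [2]]
bridges = find_bridges(graph)
```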

3. Core Algorithm Principles, Step-by-Step Procedures, and Mathematical Models

Traversal Algorithms

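Depth-first search (DFS) is a graph traversal algorithm that is usually implemented either recursively or iteratively. Starting from a chosen vertex, DFS follows one unvisited neighbor after another as deep as possible, and continues until every vertex reachable from the start has been visited. When it explores a neighbor, it marks the vertex as visited and pushes it onto the current stack. When it encounters an already visited vertex, or runs out of unvisited neighbors, it backtracks and continues with the remaining unvisited neighbors. The algorithm visits each reachable vertex once and examines each edge a constant number of times, so its time complexity is O(V+E), where V is the number of vertices and E the number of edges. The space complexity is O(V), because we maintain the list of visited flags as well as the stack frames needed for the recursive calls.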

The procedure for implementing DFS in Python could be illustrated as follows:

    def dfs_recursive(graph, start):
        """Recursive DFS; prints vertices in visit order and returns the visited flags."""
        visited = [False] * len(graph)

        def dfs(vertex):
            visited[vertex] = True
            print(vertex, end=' ')
            for neighbour in graph[vertex]:
                if not visited[neighbour]:
                    dfs(neighbour)

        dfs(start)
        return visited


    def dfs_iterative(graph, start):
        """Iterative DFS with an explicit stack; same visit order as dfs_recursive."""
        visited = [False] * len(graph)
        stack = [start]

        while stack:
            vertex = stack.pop()
            if visited[vertex]:
                continue  # already reached via another path
            visited[vertex] = True
            print(vertex, end=' ')
            # Push neighbours in reverse so the first neighbour is explored first.
            for neighbour in reversed(graph[vertex]):
                if not visited[neighbour]:
                    stack.append(neighbour)

        return visited

These implementations assume that the graph argument is an adjacency list: a list indexed by vertex number, where graph[v] holds the vertices adjacent to v. If the graph is represented as an adjacency matrix instead, the neighbor loop in these functions has to be modified.
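For an adjacency matrix, the main change is how neighbours are enumerated: instead of iterating over graph[v], the search scans row v of the matrix. The following is a sketch under the assumption that matrix[u][v] is 1 when an edge u -> v exists and 0 otherwise; the name dfs_matrix is illustrative, and the function returns the visit order rather than printing it.

```python
def dfs_matrix(matrix, start):
    """Recursive DFS over an adjacency matrix; returns vertices in visit order."""
    visited = [False] * len(matrix)
    order = []

    def dfs(u):
        visited[u] = True
        order.append(u)
        # Scan row u: a nonzero entry in column v means there is an edge u -> v.
        for v, has_edge in enumerate(matrix[u]):
            if has_edge and not visited[v]:
                dfs(v)

    dfs(start)
    return order


# Matrix for the edges 0->1, 1->2, 1->3, 3->2; vertex 4 has no edges.
matrix = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
order = dfs_matrix(matrix, 0)
```

Because every row is scanned in full, this variant runs in O(V^2) time regardless of how many edges the graph has.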

As an example of the adjacency-list form, consider the following graph:

    0 -> {1} 
    1 -> {2, 3} 
    2 -> {} 
    3 -> {2} 
    4 -> {}

To traverse this graph with DFS, call either function on the adjacency list, as in the following lines.

    adjacency_list = [[1], [2, 3], [], [2], []]  # Example adjacency list

    dfs_recursive(adjacency_list, 0)             # Prints: 0 1 2 3
    print()

    visited = dfs_iterative(adjacency_list, 0)   # Prints: 0 1 2 3
    print()

    # Vertex 4 has no incoming edges, so it cannot be reached from vertex 0.
    for v in range(4):
        assert visited[v]
    assert not visited[4]

Output:

    0 1 2 3
    0 1 2 3

Both the recursive and the iterative implementation visit the vertices in the same order, 0 1 2 3. The assertions confirm that the iterative version visited every vertex reachable from vertex 0; vertex 4 is isolated and stays unvisited.

Breadth-first search (BFS) resembles depth-first search (DFS), but the order of traversal differs. Instead of following one path as deep as possible, BFS starts from the root and expands the frontier layer by layer until every reachable vertex has been visited. At each step, it processes all vertices in the current layer before moving to the next, so vertices are visited in order of their distance (in edges) from the root. BFS keeps the vertices waiting to be processed in a queue rather than relying on recursion. Like DFS, it visits each vertex once and examines each edge a constant number of times, so it runs in O(V+E) time with O(V) space. Where DFS uses a stack, either the call stack or an explicit one, BFS uses a first-in, first-out queue and is naturally iterative.

A possible Python implementation of BFS is as follows:

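    from collections import deque

    def bfs(graph, start):
        """BFS from start; prints vertices in visit order and returns the visited flags."""
        visited = [False] * len(graph)
        visited[start] = True
        queue = deque([start])

        while queue:
            vertex = queue.popleft()
            print(vertex, end=' ')

            for neighbour in graph[vertex]:
                if not visited[neighbour]:
                    # Mark on enqueue so no vertex enters the queue twice.
                    visited[neighbour] = True
                    queue.append(neighbour)

        return visited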

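As with the DFS implementations, this function assumes that the graph argument is an adjacency list indexed by vertex number, where graph[v] holds the vertices adjacent to v. To represent the graph as an adjacency matrix instead, the function must be modified accordingly.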

Again, let's test the above function with the example adjacency list:

    adjacency_list = [[1], [2, 3], [], [2], []]  # Example adjacency list

    bfs(adjacency_list, 0)  # Prints: 0 1 2 3

Output:

    0 1 2 3

As expected, the output lists the vertices layer by layer: vertex 0 first, then its neighbor 1, then 2 and 3, which are both two edges away from the start.
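The layer-by-layer order is what makes BFS useful for shortest paths in unweighted graphs: the first time a vertex is reached, it has been reached by a path with the fewest possible edges. The sketch below records that edge count for every vertex; the name bfs_distances is illustrative, and unreachable vertices are reported as None.

```python
from collections import deque


def bfs_distances(graph, start):
    """Return dist where dist[v] is the number of edges on a shortest path, or None."""
    dist = [None] * len(graph)
    dist[start] = 0
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if dist[v] is None:  # first time v is reached: shortest hop count
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


adjacency_list = [[1], [2, 3], [], [2], []]
dist = bfs_distances(adjacency_list, 0)
```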

Shortest Path Algorithms

Dijkstra's Algorithm:

Dijkstra's algorithm is commonly used to find shortest paths from a source vertex to the other vertices of a graph whose edge weights are non-negative. It maintains a set of finalized vertices, initially empty, together with a priority queue of candidate vertices ordered by their tentative distance from the source. The algorithm keeps processing candidates as long as the priority queue is non-empty. In each iteration, it selects the vertex with the smallest tentative distance, marks it as finalized, and updates the tentative distance of each unfinalized neighbor whenever the path through the selected vertex plus the connecting edge weight is shorter. If only a single target is of interest, processing can stop as soon as that target is selected. The algorithm returns the computed distances together with the predecessor information stored in an auxiliary array.

Here is the pseudocode for Dijkstra's algorithm:

    Dijkstra(G, w, s):
      For each vertex v in G:
        dist[v] ← INFINITY
        prev[v] ← NULL
      dist[s] ← 0

      Create vertex set Q containing every vertex of G

      While Q is not empty:
        Select the vertex u in Q with the smallest dist[u]
        Remove u from Q

        For each neighbor v of u that is still in Q:
          alt ← dist[u] + w(u, v)
          If alt < dist[v]:        // relax edge (u, v)
            dist[v] ← alt
            prev[v] ← u
        EndFor
      EndWhile

      Return dist[], prev[]
    EndFunction

Let's see the implementation of Dijkstra's algorithm in Python:

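    import heapq

    def dijkstra(graph, weights, src):
        """Return (dist, prev) for shortest paths from src; weights must be non-negative.

        graph[u] lists the neighbours of u, and weights[u][i] is the weight of
        the edge from u to graph[u][i].
        """
        n = len(graph)
        dist = [float('inf')] * n
        prev = [-1] * n
        dist[src] = 0
        pq = [(0, src)]

        while pq:
            curr_dist, curr_node = heapq.heappop(pq)
            if curr_dist > dist[curr_node]:
                continue  # stale queue entry: a shorter path was already found
            for adj, wt in zip(graph[curr_node], weights[curr_node]):
                alt = curr_dist + wt
                if alt < dist[adj]:
                    dist[adj] = alt
                    prev[adj] = curr_node
                    heapq.heappush(pq, (alt, adj))

        return dist, prev

    # Example usage: six vertices, one adjacency row per vertex
    graph = [
        [1, 2, 3],  # edges leaving vertex 0
        [0, 4, 5],  # edges leaving vertex 1
        [0, 5],     # edges leaving vertex 2
        [],         # vertices 3, 4 and 5 have no outgoing edges
        [],
        [],
    ]

    weights = [
        [1, 2, 3],
        [1, 4, 5],
        [1, 6],
        [],
        [],
        [],
    ]

    src = 0

    distance, pred = dijkstra(graph, weights, src)

    print("Distance:", distance)
    print("Predecessor:", pred)

Output:

    Distance: [0, 1, 2, 3, 5, 6]
    Predecessor: [-1, 0, 0, 0, 1, 1]

The output can be checked by hand: vertices 1, 2, and 3 are reached directly from vertex 0 at costs 1, 2, and 3; vertex 4 is reached through vertex 1 at cost 1 + 4 = 5; and vertex 5 is reached through vertex 1 at cost 1 + 5 = 6, which beats the route through vertex 2 at cost 2 + 6 = 8.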
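The predecessor array is what turns distances into actual routes: starting at the target and following predecessor links leads back to the source. The sketch below assumes the convention used above, where -1 marks a vertex without a predecessor. The function name reconstruct_path and the sample array are hypothetical.

```python
def reconstruct_path(prev, src, target):
    """Follow predecessor links from target back to src; return [] if unreachable."""
    path = []
    node = target
    while node != -1:
        path.append(node)
        if node == src:
            return path[::-1]  # links were followed backwards, so reverse them
        node = prev[node]
    return []


# Hypothetical predecessor array in the format returned by dijkstra(): -1 means "none".
prev = [-1, 0, 0, 0, 1, 1]
path = reconstruct_path(prev, 0, 5)
```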
