zhongyunWan
Contributor

Optimize the grouping logic by using a hash table to reduce the complexity of the groupByEndpoints function from O(n²) to O(n).

@zhongyunWan
Contributor Author

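Optimizing the groupByEndpoints function:
The main performance bottleneck in the current code is the O(n²) complexity of groupByEndpoints. Using a hash table for the grouping logic reduces the complexity to O(n), and the larger the dataset, the more noticeable the improvement.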

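Data structures

  • edgeMap: stores the sets of edges grouped by endpoints, keyed in "source-target" format
  • processedEdgesSet: records the IDs of edges that have already been processed, to avoid processing them twice
  • reverses: records reverse-edge information (when source and target are swapped relative to each other)
  • includedEdgesInGroup: tracks the edge IDs already included in each group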
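
For readers who have not opened the diff, the containers listed above could be declared roughly as follows. This is a minimal sketch, not the merged code: the `Edge` type, the `endpointKey` helper, and the exact container types (for example, whether `includedEdgesInGroup` is a `Map` of `Set`s) are assumptions made for illustration. Only the four container names and the "source-target" key format come from the description.

```typescript
type ID = string;

// Simplified edge shape, assumed for this sketch.
interface Edge {
  id: ID;
  source: ID;
  target: ID;
}

// Hypothetical helper: builds the "source-target" grouping key.
// Note: node IDs that themselves contain "-" could collide under this format.
function endpointKey(edge: Edge): string {
  return `${edge.source}-${edge.target}`;
}

// Edges grouped by endpoints, keyed by "source-target".
const edgeMap = new Map<string, Edge[]>();
// IDs of edges that have already been placed in a group.
const processedEdgesSet = new Set<ID>();
// Flags for edges whose direction is opposite to their group.
const reverses: Record<string, boolean> = {};
// Per-group set of edge IDs, giving O(1) "already in this group?" checks.
const includedEdgesInGroup = new Map<string, Set<ID>>();

// Registering a single edge touches each container once.
const edge: Edge = { id: 'e1', source: 'A', target: 'B' };
const key = endpointKey(edge);
edgeMap.set(key, [edge]);
processedEdgesSet.add(edge.id);
includedEdgesInGroup.set(key, new Set([edge.id]));
```

Every lookup here is a hash lookup, which is what lets the grouping avoid comparing each edge against every other edge.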

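Algorithm flow

  1. Initial scan:
    a. Iterate over all edges, creating or finding the corresponding group for each edge
    b. Use "source-target" as the group key
    c. Add the edge to its group and mark it as processed
  2. Parallel edge lookup
    a. For each edge, find all of its parallel edges
    b. Use the isParallelEdges function to determine whether two edges are parallel
    c. Add the parallel edges found to the same group
  3. Reverse edge handling
    a. Explicitly mark reverse edges (e.g. A→B and B→A)
    b. Record them in the reverses object as "source|target|index": true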
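
The steps above can be sketched as a single pass over the edges. This is a simplified illustration rather than the code in this PR: instead of calling isParallelEdges, it looks up both the forward key and the backward key in the hash table, so A→B and B→A end up in one group. The function name `groupByEndpointsSketch`, the `Edge` type, and the reading of `index` as the edge's position within its group are all assumptions.

```typescript
type ID = string;

// Simplified edge shape, assumed for this sketch.
interface Edge {
  id: ID;
  source: ID;
  target: ID;
}

interface GroupResult {
  edgeMap: Map<string, Edge[]>;
  reverses: Record<string, boolean>;
}

// Hypothetical single-pass grouping: one hash lookup per edge instead of
// comparing every pair of edges.
function groupByEndpointsSketch(edges: Edge[]): GroupResult {
  const edgeMap = new Map<string, Edge[]>();
  const reverses: Record<string, boolean> = {};
  const processedEdgesSet = new Set<ID>();

  for (const edge of edges) {
    // Skip edges whose ID has already been handled.
    if (processedEdgesSet.has(edge.id)) continue;

    const forwardKey = `${edge.source}-${edge.target}`;
    const backwardKey = `${edge.target}-${edge.source}`;

    // The edge is a reverse edge if only the opposite direction has a group.
    const isReverse = !edgeMap.has(forwardKey) && edgeMap.has(backwardKey);
    const key = isReverse ? backwardKey : forwardKey;

    let group = edgeMap.get(key);
    if (!group) {
      group = [];
      edgeMap.set(key, group);
    }

    // Record the reverse edge as "source|target|index", where index is
    // assumed to be the edge's position in its group.
    if (isReverse) {
      reverses[`${edge.source}|${edge.target}|${group.length}`] = true;
    }

    group.push(edge);
    processedEdgesSet.add(edge.id);
  }

  return { edgeMap, reverses };
}
```

Each edge costs a constant number of Map and Set operations, so the pass is O(n) in the number of edges, assuming hash operations are O(1) on average.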

@Aarebecca Aarebecca merged commit 50688e0 into v5 Mar 12, 2025
2 checks passed
@Aarebecca Aarebecca deleted the opt-parallel-edge-algo branch March 12, 2025 04:44
@hustcc hustcc changed the title perf(transforms): reduce the complexity of groupByEndpoints for paral… perf(transforms): reduce the complexity of groupByEndpoints Mar 13, 2025