Source: Parallel algorithms for minimum spanning trees

In graph theory a minimum spanning tree (MST) $T$ of a graph $G=(V,E)$ with $|V|=n$ and $|E|=m$ is a tree subgraph of $G$ that contains all of its vertices and is of minimum weight.
MSTs are useful and versatile tools utilised in a wide variety of practical and theoretical fields. For example, a company looking to supply multiple stores with a certain product from a single warehouse might use an MST to find the set of road connections of minimum total length that links the warehouse to every store. In this case the stores and the warehouse are represented as vertices and the road connections between them as edges. Each edge is labelled with the length of the corresponding road connection.
If $G$ is edge-unweighted, every spanning tree possesses the same number of edges and thus the same weight. In the edge-weighted case, a spanning tree whose edge weights sum to the lowest value among all spanning trees of $G$ is called a minimum spanning tree (MST). It is not necessarily unique. More generally, graphs that are not necessarily connected have minimum spanning forests, which consist of a union of MSTs for each connected component.
As finding MSTs is a widespread problem in graph theory, there exist many sequential algorithms for solving it. Among them are Prim's, Kruskal's and Borůvka's algorithms, each utilising different properties of MSTs. They all operate in a similar fashion: a subset of $E$ is iteratively grown until a valid MST has been discovered. However, as practical problems are often quite large (road networks sometimes have billions of edges), performance is a key factor. One option for improving it is to parallelise known MST algorithms.
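As an illustration of how a subset of $E$ is grown edge by edge, the following is a minimal sequential sketch in the style of Kruskal's algorithm. The function name `kruskal`, the `(weight, u, v)` edge format and the vertex labels `0..n-1` are assumptions made for this example only.

```python
def kruskal(n, edges):
    """Return (total_weight, chosen_edges) of a minimum spanning forest.

    n     -- number of vertices, labelled 0..n-1 (assumed labelling)
    edges -- iterable of (weight, u, v) tuples (assumed format)
    """
    parent = list(range(n))  # union-find forest, one set per vertex

    def find(x):
        # Follow parent pointers to the set representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    total = 0
    # Scan edges by increasing weight and keep those joining two components.
    for weight, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            chosen.append((weight, u, v))
            total += weight
    return total, chosen
```

The sorting step dominates the running time of this sketch, and the edge scan after it is inherently sequential.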

Prim's algorithm

Kruskal's algorithm

= Approach 1: Parallelising the sorting step

= Approach 2: Filter-Kruskal

Borůvka's algorithm

parallel

= Parallelisation

$pred[v] \gets e$

Here the issue arises that some vertices are handled by more than one processor. A possible solution is for every processor to keep its own $pred$ array, which is later combined with those of the others using a reduction. Each processor has at most two vertices that are also handled by other processors and each reduction is in $O(\log p)$. Thus the total runtime of this step is in $O(\frac{m}{p} + \log n + \log p)$.
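The per-processor arrays and the final reduction can be sketched as follows. This is a sequential simulation under stated assumptions, not a parallel implementation: the `p` "processors" are iterations of a loop, each receives a contiguous chunk of a vertex-sorted list of directed half-edges, and ties are broken by a total order on edges. The name `lightest_incident_edges` and the `(u, v, weight)` edge format are hypothetical.

```python
def lightest_incident_edges(n, edges, p):
    """Return pred, where pred[v] is the neighbour reached over the lightest
    edge incident to v (None for isolated vertices).

    edges -- list of undirected (u, v, weight) tuples (assumed format)
    p     -- number of simulated processors, p >= 1
    """
    # Build directed half-edges sorted by source vertex, similar to an
    # adjacency array. The key is a total order on edges to break ties.
    half = []
    for u, v, w in edges:
        key = (w, min(u, v), max(u, v))
        half.append((u, key, v))
        half.append((v, key, u))
    half.sort()

    chunk = -(-len(half) // p)  # ceil(len(half) / p)

    # Each simulated processor scans only its own chunk and records a
    # local minimum per source vertex it encounters.
    partial = []
    for i in range(p):
        best = {}
        for src, key, dst in half[i * chunk:(i + 1) * chunk]:
            if src not in best or key < best[src][0]:
                best[src] = (key, dst)
        partial.append(best)

    # Reduction: combine the local results with a minimum per vertex.
    # Only vertices whose half-edges straddle a chunk border appear in
    # more than one local result.
    combined = {}
    for best in partial:
        for src, cand in best.items():
            if src not in combined or cand[0] < combined[src][0]:
                combined[src] = cand

    pred = [None] * n
    for src, (_, dst) in combined.items():
        pred[src] = dst
    return pred
```

The result does not depend on `p`, since the minimum is associative; only the amount of work per simulated processor changes.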

Assigning subgraphs to vertices

Consider the graph that consists solely of the edges collected in the previous step. Each edge is directed away from the vertex for which it is the lightest incident edge. The resulting graph decomposes into multiple weakly connected components. The goal of this step is to assign to each vertex the component of which it is a part. Note that every vertex has exactly one outgoing edge and therefore each component is a pseudotree, that is, a tree with a single extra edge that runs parallel to the lightest edge in the component but in the opposite direction. The following code turns this extra edge into a loop:

parallel forAll $v \in V$
    $w \gets pred[v]$
    if $pred[w] = v \land v < w$
        $pred[v] \gets v$
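A minimal Python sketch of this step is given below. It assumes `pred` is a list in which `pred[v]` is the vertex that `v` points to, and it reads from the original list while writing to a copy so that the loop behaves like the simultaneous parallel update. The function name `make_roots` is hypothetical.

```python
def make_roots(pred):
    """Replace each 2-cycle v <-> w by a self-loop at the smaller endpoint."""
    new_pred = list(pred)
    for v in range(len(pred)):
        w = pred[v]
        # v and w point at each other: the smaller one becomes the root.
        if pred[w] == v and v < w:
            new_pred[v] = v
    return new_pred
```

For example, `make_roots([3, 2, 1, 0])` turns the two 2-cycles 0 <-> 3 and 1 <-> 2 into trees rooted at 0 and 1.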

Now every weakly connected component is a directed tree where the root has a loop. This root is chosen as the representative of each component. The following code uses doubling to assign each vertex its representative:

while $\exists v \in V : pred[v] \neq pred[pred[v]]$
    forAll $v \in V$
        $pred[v] \gets pred[pred[v]]$

Now every subgraph is a star. With some advanced techniques this step needs $O(\frac{n}{p} + \log n)$ time.
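The doubling loop can be sketched in Python as follows. It assumes `pred` already describes rooted trees whose roots carry a self-loop, and it builds a fresh list in every round to imitate the simultaneous update of all vertices. The function name `pointer_doubling` and the returned round counter are additions for illustration.

```python
def pointer_doubling(pred):
    """Return (stars, rounds): every vertex mapped to its root, plus the
    number of doubling rounds that were needed."""
    pred = list(pred)
    n = len(pred)
    rounds = 0
    # Repeat while some vertex does not yet point directly at a root.
    while any(pred[v] != pred[pred[v]] for v in range(n)):
        # All vertices jump to their grandparent at the same time.
        pred = [pred[pred[v]] for v in range(n)]
        rounds += 1
    return pred, rounds
```

On the path `[0, 0, 1, 2, 3]`, whose deepest vertex is four steps from the root, the loop finishes after two rounds, because the distance to the root halves in each round.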

Contracting the subgraphs

In this step each subgraph is contracted to a single vertex.

$k \gets$ number of subgraphs
$V' \gets \{0, \dots, k-1\}$
find a bijective function $f$: star root $\rightarrow \{0, \dots, k-1\}$
$E' \gets \{(f(pred[v]), f(pred[w]), c, e_{old}) : (v,w) \in E \land pred[v] \neq pred[w]\}$

Finding the bijective function is possible in $O(\frac{n}{p} + \log p)$ using a prefix sum. As we now have a new set of vertices and edges, the adjacency array must be rebuilt, which can be done using Integersort on $E'$ in $O(\frac{m}{p} + \log p)$ time.
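A minimal sketch of the contraction, assuming `pred` maps every vertex to its star root and edges are `(v, w, c)` tuples. The bijection $f$ is obtained from an inclusive prefix sum over a root indicator array; here the prefix sum is computed sequentially with `itertools.accumulate`, and rebuilding the adjacency array is left out. The function name `contract` is hypothetical.

```python
from itertools import accumulate


def contract(pred, edges):
    """Return (k, new_edges) for the contracted graph.

    pred  -- pred[v] is the star root of v
    edges -- list of (v, w, c) tuples (assumed format)
    """
    n = len(pred)
    # 1 marks a star root, 0 marks any other vertex.
    is_root = [1 if pred[v] == v else 0 for v in range(n)]
    # Inclusive prefix sum: the i-th root receives the value i + 1.
    prefix = list(accumulate(is_root))
    k = prefix[-1] if n else 0
    # f maps each star root to a distinct label in {0, ..., k-1}.
    f = {v: prefix[v] - 1 for v in range(n) if is_root[v]}
    # Keep only edges between different stars, relabel their endpoints
    # and remember the original edge as e_old.
    new_edges = [
        (f[pred[v]], f[pred[w]], c, (v, w, c))
        for v, w, c in edges
        if pred[v] != pred[w]
    ]
    return k, new_edges
```

Carrying the original edge along in each tuple makes it possible to report MST edges in terms of the input graph after several rounds of contraction.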

= Complexity

Each iteration now needs $O(\frac{m}{p} + \log n)$ time and, just like in the sequential case, there are $\log n$ iterations, resulting in a total runtime of $O(\log n (\frac{m}{p} + \log n))$. If $m \in \Omega(p \log^{2} p)$ the efficiency of the algorithm is in $\Theta(1)$ and it is relatively efficient. If $m \in O(n)$ then it is absolutely efficient.
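To see the iteration count on a small input, the three steps can be combined into a sequential simulation of the Borůvka rounds. This is a sketch under stated assumptions rather than a parallel implementation: edges are `(u, v, c)` tuples over vertices `0..n-1`, ties are broken by comparing the original edge tuples, and the function name `boruvka_mst` is hypothetical.

```python
def boruvka_mst(n, edges):
    """Return (total_weight, mst_edges, rounds) of a minimum spanning forest."""
    # Every working edge remembers the original edge it stems from.
    work = [(u, v, c, (u, v, c)) for u, v, c in edges if u != v]
    mst = []
    rounds = 0
    while work:
        rounds += 1
        # Step 1: lightest incident edge per vertex (total order on edges).
        best = [None] * n
        for u, v, c, old in work:
            key = (c, old)
            for a, b in ((u, v), (v, u)):
                if best[a] is None or key < best[a][0]:
                    best[a] = (key, b, old)
        pred = [best[v][1] if best[v] else v for v in range(n)]
        # Step 2a: turn each 2-cycle into a self-loop at the smaller vertex.
        rooted = list(pred)
        for v in range(n):
            w = pred[v]
            if w != v and pred[w] == v and v < w:
                rooted[v] = v
        # Every non-root vertex contributes its chosen edge to the forest.
        for v in range(n):
            if rooted[v] != v:
                mst.append(best[v][2])
        # Step 2b: doubling until every component is a star.
        pred = rooted
        while any(pred[v] != pred[pred[v]] for v in range(n)):
            pred = [pred[pred[v]] for v in range(n)]
        # Step 3: contract each star to one vertex and drop inner edges.
        roots = sorted(set(pred))
        f = {r: i for i, r in enumerate(roots)}
        work = [(f[pred[u]], f[pred[v]], c, old)
                for u, v, c, old in work if pred[u] != pred[v]]
        n = len(roots)
    return sum(e[2] for e in mst), mst, rounds
```

On the 4-cycle with edges `(0, 1, 5)`, `(1, 2, 1)`, `(2, 3, 4)`, `(0, 3, 2)` the simulation needs two rounds, matching $\log n$ for $n = 4$, and returns a forest of weight 7.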

Further algorithms

There are multiple other parallel algorithms that deal with the issue of finding an MST. With a linear number of processors it is possible to achieve this in $O(\log n)$. Bader and Cong presented an MST algorithm that was five times quicker on eight cores than an optimal sequential algorithm.
Another challenge is the External Memory model: there is a proposed algorithm due to Dementiev et al. that is claimed to be only two to five times slower than an algorithm that only makes use of internal memory.
