Implementing Disjoint Sets (Union Find) in C++

Question

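I am trying to implement Disjoint Sets for use in Kruskal's algorithm, but I am having trouble understanding exactly how it should be done and, in particular, how to manage the forest of trees. After reading the Wikipedia description of Disjoint Sets and the description in Introduction to Algorithms (Cormen et al), I came up with the following: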

    class DisjointSet
    {
    public:
        class Node
        {
        public:
            int data;
            int rank;
            Node* parent;

            Node() : data(0), rank(0), parent(this) { } // the constructor does MakeSet
        };

        Node* find(Node*);
        Node* merge(Node*, Node*); // Union
    };

    DisjointSet::Node* DisjointSet::find(DisjointSet::Node* n)
    {
        if (n != n->parent) {
            n->parent = find(n->parent);
        }
        return n->parent;
    }

    DisjointSet::Node* DisjointSet::merge(DisjointSet::Node* x,
                                          DisjointSet::Node* y)
    {
        x = find(x);
        y = find(y);

        if (x->rank > y->rank) {
            y->parent = x;
        } else {
            x->parent = y;
            if (x->rank == y->rank) {
                ++(y->rank);
            }
        }
    }

I am pretty sure this is incomplete, though, and that I am missing something.

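Introduction to Algorithms mentions that there should be a forest of trees, but it does not explain how to implement this forest in practice. I watched CS 61B Lecture 31: Disjoint Sets ( http://www.youtube.com/watch?v=wSPAjGfDl7Q ), where the lecturer uses only an array to store both the forest and all of its trees and values. There is no explicit "Node" type of class like the one I have above. I have also found many other sources (I cannot post more than one link) that use this technique. I would be happy to do the same, except that it relies on array indices for lookup, and since I want to store values of a type other than int, I need to use something else (std::map comes to mind).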

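Another thing I am unsure about is memory allocation, since I am using C++. I am storing trees of pointers, and my MakeSet operation will be: new DisjointSet::Node;. These Nodes only have pointers to their parents, so I am not sure how to find the bottom of a tree. How can I traverse my trees to deallocate them all?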

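I understand the basic concept of this data structure, but I am a bit confused about the implementation. Any advice or suggestions would be most welcome, thank you.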
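One way to reconcile the array-based technique with non-int values, as the question hints, is to map each value to a dense index and run the usual array-based union-find on those indices. The following is only a minimal sketch under that assumption: the name KeyedDisjointSet and its member functions are hypothetical, and it uses std::unordered_map (so the key type must be hashable) rather than the std::map the question mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: maps arbitrary hashable keys to dense indices and
// runs an array-based union-find (union by rank, path halving) on them.
template <typename Key>
class KeyedDisjointSet {
public:
    // MakeSet: registers key as a singleton set (no-op if already present).
    void make_set(const Key& key) {
        if (index_.count(key)) return;
        const std::size_t i = parent_.size();
        index_.emplace(key, i);
        parent_.push_back(i);   // a new element is its own parent
        rank_.push_back(0);
    }

    // True if both keys are in the same set.
    // Unknown keys make at() throw std::out_of_range.
    bool same_set(const Key& a, const Key& b) {
        return find(index_.at(a)) == find(index_.at(b));
    }

    // Union by rank: the root of lower rank is attached to the other root.
    void merge(const Key& a, const Key& b) {
        std::size_t x = find(index_.at(a));
        std::size_t y = find(index_.at(b));
        if (x == y) return;
        if (rank_[x] < rank_[y]) std::swap(x, y);
        parent_[y] = x;
        if (rank_[x] == rank_[y]) ++rank_[x];
    }

private:
    // Find with path halving: each visited node is re-pointed to its grandparent.
    std::size_t find(std::size_t i) {
        assert(i < parent_.size());
        while (parent_[i] != i) {
            parent_[i] = parent_[parent_[i]];
            i = parent_[i];
        }
        return i;
    }

    std::unordered_map<Key, std::size_t> index_;  // key -> dense index
    std::vector<std::size_t> parent_;
    std::vector<int> rank_;
};
```

With this layout there is nothing to traverse at destruction time: the vectors and the map release their own storage, so the deallocation question does not arise. For example, a KeyedDisjointSet<std::string> can call make_set("a"), make_set("b"), then merge("a", "b"), after which same_set("a", "b") is true.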

Stuart Golodetz · Answer

Not a perfect implementation by any means (I did write it, after all!), but does this help?

    /***
     * millipede: DisjointSetForest.h
     * Copyright Stuart Golodetz, 2009. All rights reserved.
     ***/

    #ifndef H_MILLIPEDE_DISJOINTSETFOREST
    #define H_MILLIPEDE_DISJOINTSETFOREST

    #include <map>

    #include <common/exceptions/Exception.h>
    #include <common/io/util/OSSWrapper.h>
    #include <common/util/NullType.h>

    namespace mp {

    /**
    @brief  A disjoint set forest is a fairly standard data structure used to represent the partition of
            a set of elements into disjoint sets in such a way that common operations such as merging two
            sets together are computationally efficient.

    This implementation uses the well-known union-by-rank and path compression optimizations, which together
    yield an amortised complexity for key operations of O(a(n)), where a is the (extremely slow-growing)
    inverse of the Ackermann function.

    The implementation also allows clients to attach arbitrary data to each element, which can be useful for
    some algorithms.

    @tparam T   The type of data to attach to each element (arbitrary)
    */
    template <typename T = NullType>
    class DisjointSetForest
    {
        //#################### NESTED CLASSES ####################
    private:
        struct Element
        {
            T m_value;
            int m_parent;
            int m_rank;

            Element(const T& value, int parent)
            :   m_value(value), m_parent(parent), m_rank(0)
            {}
        };

        //#################### PRIVATE VARIABLES ####################
    private:
        mutable std::map<int,Element> m_elements;
        int m_setCount;

        //#################### CONSTRUCTORS ####################
    public:
        /**
        @brief  Constructs an empty disjoint set forest.
        */
        DisjointSetForest()
        :   m_setCount(0)
        {}

        /**
        @brief  Constructs a disjoint set forest from an initial set of elements and their associated values.

        @param[in]  initialElements     A map from the initial elements to their associated values
        */
        explicit DisjointSetForest(const std::map<int,T>& initialElements)
        :   m_setCount(0)
        {
            add_elements(initialElements);
        }

        //#################### PUBLIC METHODS ####################
    public:
        /**
        @brief  Adds a single element x (and its associated value) to the disjoint set forest.

        @param[in]  x       The index of the element
        @param[in]  value   The value to initially associate with the element
        @pre
            -   x must not already be in the disjoint set forest
        */
        void add_element(int x, const T& value = T())
        {
            m_elements.insert(std::make_pair(x, Element(value, x)));
            ++m_setCount;
        }

        /**
        @brief  Adds multiple elements (and their associated values) to the disjoint set forest.

        @param[in]  elements    A map from the elements to add to their associated values
        @pre
            -   None of the elements to be added must already be in the disjoint set forest
        */
        void add_elements(const std::map<int,T>& elements)
        {
            for(typename std::map<int,T>::const_iterator it=elements.begin(), iend=elements.end(); it!=iend; ++it)
            {
                m_elements.insert(std::make_pair(it->first, Element(it->second, it->first)));
            }
            m_setCount += elements.size();
        }

        /**
        @brief  Returns the number of elements in the disjoint set forest.

        @return As described
        */
        int element_count() const
        {
            return static_cast<int>(m_elements.size());
        }

        /**
        @brief  Finds the index of the root element of the tree containing x in the disjoint set forest.

        @param[in]  x   The element whose set to determine
        @pre
            -   x must be an element in the disjoint set forest
        @throw Exception
            -   If the precondition is violated
        @return As described
        */
        int find_set(int x) const
        {
            Element& element = get_element(x);
            int& parent = element.m_parent;
            if(parent != x)
            {
                parent = find_set(parent);
            }
            return parent;
        }

        /**
        @brief  Returns the current number of disjoint sets in the forest (i.e. the current number of trees).

        @return As described
        */
        int set_count() const
        {
            return m_setCount;
        }

        /**
        @brief  Merges the disjoint sets containing elements x and y.

        If both elements are already in the same disjoint set, this is a no-op.

        @param[in]  x   The first element
        @param[in]  y   The second element
        @pre
            -   Both x and y must be elements in the disjoint set forest
        @throw Exception
            -   If the precondition is violated
        */
        void union_sets(int x, int y)
        {
            int setX = find_set(x);
            int setY = find_set(y);
            if(setX != setY) link(setX, setY);
        }

        /**
        @brief  Returns the value associated with element x.

        @param[in]  x   The element whose value to return
        @pre
            -   x must be an element in the disjoint set forest
        @throw Exception
            -   If the precondition is violated
        @return As described
        */
        T& value_of(int x)
        {
            return get_element(x).m_value;
        }

        /**
        @brief  Returns the value associated with element x.

        @param[in]  x   The element whose value to return
        @pre
            -   x must be an element in the disjoint set forest
        @throw Exception
            -   If the precondition is violated
        @return As described
        */
        const T& value_of(int x) const
        {
            return get_element(x).m_value;
        }

        //#################### PRIVATE METHODS ####################
    private:
        Element& get_element(int x) const
        {
            typename std::map<int,Element>::iterator it = m_elements.find(x);
            if(it != m_elements.end()) return it->second;
            else throw Exception(OSSWrapper() << "No such element: " << x);
        }

        void link(int x, int y)
        {
            Element& elementX = get_element(x);
            Element& elementY = get_element(y);
            int& rankX = elementX.m_rank;
            int& rankY = elementY.m_rank;
            if(rankX > rankY)
            {
                elementY.m_parent = x;
            }
            else
            {
                elementX.m_parent = y;
                if(rankX == rankY) ++rankY;
            }
            --m_setCount;
        }
    };

    }

    #endif

Murilo Vasconcelos · Answer

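Your implementation looks good (except that the merge function should either be declared as returning void or have a return statement; I prefer to return void). The point is that you need to keep track of the Node*s. You can do that by having a vector<DisjointSet::Node*> in your DisjointSet class, or by keeping this vector somewhere else and declaring the methods of DisjointSet as static.

Here is an example run (note that I changed merge to return void and did not change the methods of DisjointSet to be static):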

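    #include <iostream>
    #include <vector>
    using namespace std;

    // DisjointSet is the class from the question, with merge returning void.
    int main()
    {
        vector<DisjointSet::Node*> v(10);
        DisjointSet ds;

        // Create ten singleton sets labelled 0..9.
        for (int i = 0; i < 10; ++i) {
            v[i] = new DisjointSet::Node();
            v[i]->data = i;
        }

        // Each input pair merges the sets containing x and y.
        int x, y;
        while (cin >> x >> y) {
            ds.merge(v[x], v[y]);
        }

        // Print each element followed by its current parent.
        for (int i = 0; i < 10; ++i) {
            cout << v[i]->data << ' ' << v[i]->parent->data << endl;
        }

        return 0;
    }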

With this input:

    3 1
    1 2
    2 4
    0 7
    8 9

it prints the expected output:

    0 7
    1 1
    2 1
    3 1
    4 1
    5 5
    6 6
    7 7
    8 9
    9 9

Your forest is made up of these trees:

       7      1      5    6    9
      /     / | \              |
     0     2  3  4             8

So your algorithm is good, it has the best complexity for union-find as far as I know, and you keep track of your Nodes in your vector. That means you can simply deallocate them:

    for (int i = 0; i < int(v.size()); ++i) {
        delete v[i];
    }

THK · Answer

Boost has a disjoint set implementation:

http://www.boost.org/doc/libs/1_54_0/libs/disjoint_sets/disjoint_sets.html

Puppy · Answer

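I can't speak to the algorithm, but for memory management you would typically use something called a smart pointer, which frees whatever it points to. You can get shared-ownership and single-ownership smart pointers, as well as non-owning ones. Used correctly, they guarantee that you have no memory issues.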
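As an illustration of the single-ownership idea applied to the question's pointer-based nodes, here is a rough sketch in which one container owns every node through std::unique_ptr while the parent links stay plain non-owning pointers. The names SetNode and NodeForest are hypothetical and not taken from the thread; find and merge follow the same logic as the question's code, with merge returning void.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical node type mirroring the question's Node.
struct SetNode {
    int data;
    int rank;
    SetNode* parent;  // non-owning; a root points to itself

    explicit SetNode(int d) : data(d), rank(0), parent(this) {}
    SetNode(const SetNode&) = delete;             // copying would leave parent
    SetNode& operator=(const SetNode&) = delete;  // pointing at the original
};

// Hypothetical owner of all nodes. Destroying it frees every node,
// so no tree traversal is needed for cleanup.
class NodeForest {
public:
    // MakeSet: allocates a node; the forest keeps ownership of it.
    SetNode* make_set(int data) {
        nodes_.push_back(std::unique_ptr<SetNode>(new SetNode(data)));
        return nodes_.back().get();
    }

    // Find with path compression.
    static SetNode* find(SetNode* n) {
        assert(n != nullptr);
        if (n->parent != n) n->parent = find(n->parent);
        return n->parent;
    }

    // Union by rank.
    static void merge(SetNode* x, SetNode* y) {
        x = find(x);
        y = find(y);
        if (x == y) return;
        if (x->rank > y->rank) {
            y->parent = x;
        } else {
            x->parent = y;
            if (x->rank == y->rank) ++y->rank;
        }
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<std::unique_ptr<SetNode>> nodes_;  // sole owner of the nodes
};
```

The raw pointers returned by make_set stay valid while the NodeForest is alive, because the vector stores the unique_ptr objects rather than the nodes themselves, so reallocation moves only the owning pointers.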

user3960974 · Answer

The following code seems simple to understand as an implementation of union-find disjoint sets with path compression:

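    // parent[] and rank[] are assumed to be arrays declared elsewhere, with
    // parent[i] == i and rank[i] == 0 for every element initially.
    int find(int i)
    {
        if (parent[i] == i)
            return i;
        else
            return parent[i] = find(parent[i]);   // path compression
    }

    // Named unite here because union is a reserved word in C++.
    void unite(int a, int b)
    {
        int x = find(a);
        int y = find(b);
        if (x != y) {
            if (rank[x] > rank[y]) {
                parent[y] = x;
            } else {
                parent[x] = y;
                if (rank[x] == rank[y])
                    rank[y] += 1;
            }
        }
    }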

user1423561 · Answer

Take a look at this code:

    #include <iostream>
    #include <unordered_map>
    using namespace std;

    class Node {
        int id, rank, data;
        Node *parent;
    public:
        Node(int id, int data) {
            this->id = id;
            this->data = data;
            this->rank = 0;
            this->parent = this;
        }
        friend class DisjointSet;
    };

    class DisjointSet {
        unordered_map<int, Node*> forest;   // id -> node; owns the nodes

        // Find with path compression.
        Node *find_set_helper(Node *aNode) {
            if (aNode->parent != aNode)
                aNode->parent = find_set_helper(aNode->parent);
            return aNode->parent;
        }

        // Union by rank; xNode and yNode must be distinct roots.
        void link(Node *xNode, Node *yNode) {
            if (xNode->rank > yNode->rank)
                yNode->parent = xNode;
            else if (xNode->rank < yNode->rank)
                xNode->parent = yNode;
            else {
                xNode->parent = yNode;
                yNode->rank++;
            }
        }

    public:
        DisjointSet() { }

        void make_set(int id, int data) {
            Node *aNode = new Node(id, data);
            this->forest.insert(make_pair(id, aNode));
        }

        void Union(int xId, int yId) {
            Node *xNode = find_set(xId);
            Node *yNode = find_set(yId);
            // Skip the link when both ids already share a root; otherwise
            // the root's rank would be incremented for no reason.
            if (xNode && yNode && xNode != yNode)
                link(xNode, yNode);
        }

        Node* find_set(int id) {
            unordered_map<int, Node*>::iterator itr = this->forest.find(id);
            if (itr == this->forest.end())
                return NULL;
            return this->find_set_helper(itr->second);
        }

        void print() {
            unordered_map<int, Node*>::iterator itr;
            for (itr = forest.begin(); itr != forest.end(); itr++) {
                cout << "\nid : " << itr->second->id
                     << " parent :" << itr->second->parent->id;
            }
        }

        ~DisjointSet() {
            unordered_map<int, Node*>::iterator itr;
            for (itr = forest.begin(); itr != forest.end(); itr++) {
                delete (itr->second);
            }
        }
    };

dionyziz · Answer

Your implementation is fine. All you need to do now is keep an array of disjoint set Nodes so that you can call your union/find methods on them.

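For Kruskal's algorithm, you need an array containing one disjoint-set Node per graph vertex. Then, when you look at the next edge to add to your subgraph, you use the find method to check whether both of its endpoints are already in the same set, that is, already connected in your subgraph. If they are, you can move on to the next edge. Otherwise, you add that edge to your subgraph and perform a union operation between the two vertices it connects.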
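The loop described above can be sketched roughly as follows. This is a minimal illustration, not the asker's code: it assumes vertices are numbered 0 to n-1 and uses a small array-based union-find instead of Node objects, and the names Edge, UnionFind, unite, and kruskal are chosen here for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical edge type: endpoints u and v plus a weight.
struct Edge {
    int u, v;
    int weight;
};

// Minimal array-based union-find over vertices 0..n-1.
struct UnionFind {
    std::vector<int> parent, rank;

    explicit UnionFind(int n) : parent(n), rank(n, 0) {
        std::iota(parent.begin(), parent.end(), 0);  // every vertex is its own root
    }

    int find(int x) {
        assert(x >= 0 && x < static_cast<int>(parent.size()));
        if (parent[x] != x) parent[x] = find(parent[x]);  // path compression
        return parent[x];
    }

    // Returns false if a and b were already in the same set.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (rank[a] < rank[b]) std::swap(a, b);
        parent[b] = a;
        if (rank[a] == rank[b]) ++rank[a];
        return true;
    }
};

// Returns the edges of a minimum spanning forest of a graph with n vertices.
std::vector<Edge> kruskal(int n, std::vector<Edge> edges) {
    // Examine edges in order of increasing weight.
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });

    UnionFind uf(n);
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        // Keep the edge only if it joins two different sets.
        if (uf.unite(e.u, e.v)) tree.push_back(e);
    }
    return tree;
}
```

Here unite folds the find check and the union into one call: it returns false when both endpoints already share a root, which is the "move on to the next edge" case.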

Pei Z · Answer

If you are trying to implement disjoint sets from scratch, I highly recommend reading the book Data Structures and Algorithm Analysis in C++ by Mark A. Weiss.

Chapter 8 starts from basic find/union, then gradually adds union by height/depth/rank and find with path compression. At the end, it provides the Big-O analysis.

Trust me, it has everything you want to know about disjoint sets and their C++ implementation.

user3310464 · Answer

This blog article shows a C++ implementation using path compression: http://toughprogramming.blogspot.com/2013/04/implementing-disjoint-sets-in-c.html

95_96 · Answer

If you are asking which style is better for implementing disjoint sets (a vector or a map (RB tree)), then I may have something to add.

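make_set(int key, node info): this is generally a member function that (1) adds a node and (2) makes the node point to itself (parent = key), which initially creates a disjoint set. Time complexity for creating n sets: O(n) for a vector, O(n * log n) for a map.
find_set(int key): this generally has two parts: (1) finding the node with the given key and (2) path compression. I could not really work out the cost of path compression, but for simply looking up the node, the time complexity is O(1) for a vector and O(log(n)) for a map.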

In conclusion, although the vector implementation looks better from the above, I have read that the time complexity of both is O(M * α(n)) ≈ O(M * 5).
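To make the comparison concrete, here is a small sketch of make_set and find_set on both kinds of storage. It rests on assumptions not stated in the answer: the names VectorSets and MapSets are made up for illustration, the vector version only supports the dense keys 0, 1, 2, ..., and the union operation is left out because the answer discusses only these two functions.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Vector-backed: keys are the dense indices 0, 1, 2, ...
struct VectorSets {
    std::vector<int> parent;

    // Appends the next index as a singleton set; amortised O(1).
    int make_set() {
        const int key = static_cast<int>(parent.size());
        parent.push_back(key);  // parent = key
        return key;
    }

    // Each parent lookup is an O(1) array access.
    int find_set(int key) {
        assert(key >= 0 && key < static_cast<int>(parent.size()));
        int root = key;
        while (parent[root] != root) root = parent[root];
        while (parent[key] != root) {  // path compression, second pass
            const int next = parent[key];
            parent[key] = root;
            key = next;
        }
        return root;
    }
};

// Map-backed: keys may be arbitrary (sparse) ints.
struct MapSets {
    std::map<int, int> parent;

    // Inserts key as a singleton set; O(log n) per insertion.
    void make_set(int key) { parent.emplace(key, key); }

    // Each parent lookup is an O(log n) tree search.
    // at() throws std::out_of_range if key was never added.
    int find_set(int key) {
        int& p = parent.at(key);
        if (p != key) p = find_set(p);  // path compression
        return p;
    }
};
```

The compression logic is the same in both versions; only the cost of each parent lookup differs, which is the O(1) versus O(log(n)) distinction drawn in the list above.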

P.S. I am fairly sure this is correct, but please verify what I have written.