Data Structure in Java&Cpp

2023年7月26日关作者 ScotI_Blog

mainly based on YDJSIR-NJU/NJUSE-20-DataStructure: Collection of Data Structure & Algorithm @ Software Institute, Nanjing University (github.com) and have extensions to improve my data structure skills

Contents

0.1 栈
- 0.1.1 [NOIP2003 普及组] 栈
0.2 优先队列
0.3 二叉树
0.4 初始化
0.5 查询
- 0.5.1 路径压缩
0.6 合并
- 0.6.1 启发式合并
0.7 删除
0.8 移动
0.9 复杂度
- 0.9.1 时间复杂度
- 0.9.2 空间复杂度
0.10 带权并查集
- 0.10.1 题解 P1955 【[NOI2015]程序自动分析】
0.11 并查集
0.12 离散化
0.13 并查集和离散化
0.14 将条件转化为并查集的基本操作
0.15 细节
0.16 代码

1 P2234 [HNOI2002] 营业额统计
2 Insert Operation in B-Tree

栈

栈是 OI 中常用的一种线性数据结构，请注意，本文主要讲的是栈这种数据结构，而非程序运行时的系统栈/栈空间。

栈的修改是按照后进先出的原则进行的，因此栈通常被称为是后进先出（last in first out）表，简称 LIFO 表。

栈是一种常见的数据结构，它可以在O(1)的时间内完成元素的插入和删除操作。栈的特点是先进后出，也就是最后插入的元素最先被删除。

栈有两个基本操作：入栈（push）和出栈（pop）。当元素被插入到栈中时，我们称之为入栈操作；当元素从栈中被删除时，我们称之为出栈操作。入栈和出栈操作都是在栈顶进行的，也就是说，只有栈顶的元素可以被删除或者查询。

除了基本操作之外，栈还有一些其他的常用操作，例如查看栈顶元素（top）、判断栈是否为空（empty）等。这些操作的时间复杂度都是O(1)。

栈可以用数组或链表实现。使用数组实现栈时，需要注意栈的大小，当栈的大小超出数组的容量时，需要进行扩容操作。使用链表实现栈时，不需要担心栈的大小，但需要考虑链表的遍历和指针操作的复杂度。

栈的应用非常广泛，例如在编译器中用于语法分析、在计算器中用于表达式求值、在操作系统中用于函数调用栈等。掌握栈的基本操作和应用场景，对于理解和实现许多算法和数据结构都具有重要意义。

[NOIP2003 普及组] 栈

题目背景

栈是计算机中经典的数据结构，简单的说，栈就是限制在一端进行插入删除操作的线性表。

栈有两种最重要的操作，即 pop（从栈顶弹出一个元素）和 push（将一个元素进栈）。

栈的重要性不言自明，任何一门数据结构的课程都会介绍栈。宁宁同学在复习栈的基本概念时，想到了一个书上没有讲过的问题，而他自己无法给出答案，所以需要你的帮忙。

题目描述

宁宁考虑的是这样一个问题：一个操作数序列，$1,2,\ldots ,n$（图示为 1 到 3 的情况），栈 A 的深度大于 $n$。

现在可以进行两种操作，

将一个数，从操作数序列的头端移到栈的头端（对应数据结构栈的 push 操作）

将一个数，从栈的头端移到输出序列的尾端（对应数据结构栈的 pop 操作）

使用这两种操作，由一个操作数序列就可以得到一系列的输出序列，下图所示为由 1 2 3 生成序列 2 3 1 的过程。

（原始状态如上图所示）

你的程序将对给定的 $n$，计算并输出由操作数序列 $1,2,\ldots,n$ 经过操作可能得到的输出序列的总数。

输入格式

输入文件只含一个整数 $n$（$1 \leq n \leq 18$）。

输出格式

输出文件只有一行，即可能输出序列的总数目。

样例 #1

样例输入 #1
3
样例输出 #1
5
提示

【题目来源】

NOIP 2003 普及组第三题

[su_accordion][su_spoiler title=”题解” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

cpp version:
#include<iostream>
using namespace std;
long n,f[20][20];//f数组记录方案
long dfs(int x,int y)//x是操作队列里元素的个数，y是栈里的个数
{
    if(f[x][y]!=0) return f[x][y];//记忆化，走过的方案直接调用
    if(x==0) return 1;//当操作队列里没有了，就只有一种方案了
    if(y>0) f[x][y]+=dfs(x,y-1);//栈里不为空的时候才可以把栈里的元素推出
    f[x][y]+=dfs(x-1,y+1);//操作队列里元素减一，栈里元素加一
    return f[x][y];//返回方案值
}
int main()
{
    cin>>n;
    cout<<dfs(n,0)<<endl;
    return 0;
}
java version:
import java.util.Scanner;
public class noip2003 {
    static int[][] f=new int[20][20];
    static long dfs(int x,int y)
    {
        if(f[x][y]!=0) return f[x][y];
        if(x==0) return 1;
        if(y>0)
        f[x][y]+=dfs(x,y-1);
        f[x][y]+=dfs(x-1,y+1);
        return f[x][y];
    }
public static void main(String[] args)  
{
    try (Scanner cin = new Scanner(System.in)) {
        int n=cin.nextInt();
        System.out.println(dfs(n,0));
    }
}
}

[/su_spoiler] [/su_accordion]

优先队列

优先队列具有以下几个特点:

1. 每个元素都有一个优先级,出队顺序按照优先级从高到低。

2. 如果两个元素优先级相同,则按照插入的顺序出队。

3. 新插入的元素默认会插入到队尾,然后根据其优先级来调整其位置。

4. 出队操作总是返回优先级最高的元素。

以上操作都是基于优先队列“自动排序”的特征

5. 支持常见的入队、出队以及返回顶部元素的操作。

6. 可以通过堆来实现,时间复杂度为 O(logN)。也可以通过优先级队列等其他数据结构来实现。

7. 主要应用在需要频繁删除最重要元素的场景,比如任务调度等。

大多数的优先队列用堆来实现，所以插入、删除操作的时间复杂度基本上都是log(N)

a place for progressing

二叉树

二叉树是一种常见的树状数据结构，在计算机科学中被广泛应用。它由一组节点组成，这些节点通过指向其它节点的引用（称为指针）来连接。每个节点最多可以有两个子节点，分别称为左子节点和右子节点。

二叉树的特点是具有层级结构，顶部的节点称为根节点，根节点下面的节点称为子节点。每个节点最多可以有一个父节点，除了根节点没有父节点。节点没有子节点的节点称为叶节点。

二叉树的结构可以用递归的方式定义。每个节点包含一个值和指向其左子节点和右子节点的指针。左子节点的值小于或等于父节点的值，而右子节点的值大于父节点的值。这使得在二叉树中可以高效地进行搜索、插入和删除操作。

二叉树的应用非常广泛。它们可以用于构建高效的搜索和排序算法

以下是使用c++搭建的一个简单的二叉树的类[su_accordion][su_spoiler title=”detail” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

class BinaryTree {
private:
    struct Node {
        int data;
        Node* left;
        Node* right;

        Node(int value) {
            data = value;
            left = nullptr;
            right = nullptr;
        }
    };

    Node* root;

public:
    BinaryTree() {
        root = nullptr;
    }

    ~BinaryTree() {
        destroyTree(root);
    }

    void insert(int value) {
        if (root == nullptr) {
            root = new Node(value);
        } else {
            insertNode(root, value);
        }
    }

    bool search(int value) const {
        return searchNode(root, value);
    }

    void remove(int value) {
        removeNode(root, value);
    }

    void displayInOrder() const {
        displayInOrderTraversal(root);
    }

private:
    void destroyTree(Node* node) {
        if (node != nullptr) {
            destroyTree(node->left);
            destroyTree(node->right);
            delete node;
        }
    }

    void insertNode(Node* node, int value) {
        if (value < node->data) {
            if (node->left == nullptr) {
                node->left = new Node(value);
            } else {
                insertNode(node->left, value);
            }
        } else if (value > node->data) {
            if (node->right == nullptr) {
                node->right = new Node(value);
            } else {
                insertNode(node->right, value);
            }
        }
    }

    bool searchNode(const Node* node, int value) const {
        if (node == nullptr) {
            return false;
        } else if (value == node->data) {
            return true;
        } else if (value < node->data) {
            return searchNode(node->left, value);
        } else {
            return searchNode(node->right, value);
        }
    }

    void removeNode(Node*& node, int value) {
        if (node == nullptr) {
            return;
        } else if (value < node->data) {
            removeNode(node->left, value);
        } else if (value > node->data) {
            removeNode(node->right, value);
        } else {
            if (node->left == nullptr && node->right == nullptr) {
                delete node;
                node = nullptr;
            } else if (node->left == nullptr) {
                Node* temp = node;
                node = node->right;
                delete temp;
            } else if (node->right == nullptr) {
                Node* temp = node;
                node = node->left;
                delete temp;
            } else {
                Node* minRight = findMin(node->right);
                node->data = minRight->data;
                removeNode(node->right, minRight->data);
            }
        }
    }

    Node* findMin(Node* node) const {
        while (node->left != nullptr) {
            node = node->left;
        }
        return node;
    }

    void displayInOrderTraversal(const Node* node) const {
        if (node != nullptr) {
            displayInOrderTraversal(node->left);
            std::cout << node->data << " ";
            displayInOrderTraversal(node->right);
        }
    }
};

[/su_spoiler] [/su_accordion]

二叉树的建立

[su_accordion][su_spoiler title=”java version” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

public static void CreateTree(Node root, String s) {
//这段代码有参考，所以写的怪怪的
        int len = s.length();
        while (cnt < len) {
            if (s.charAt(cnt) == ')') {
                cnt++;
                return;
            }
            else if (s.charAt(cnt) == '(') {
                cnt++;
                if (s.charAt(cnt) == ',') {
                    cnt++;
                }
                else {
                    Node node = new Node(s.charAt(cnt));
                    root.setLeft(node);
                    cnt++;
                    CreateTree(node, s);
                }
            }
            else if (s.charAt(cnt) == ',') {
                cnt++;
                Node node = new Node(s.charAt(cnt));
                root.setRight(node);
                cnt++;
                CreateTree(node, s);
            }
            else {
                Node node = new Node(s.charAt(cnt));
                root.setLeft(node);
                cnt++;
                CreateTree(node, s);
            }
        }
    }

[/su_spoiler] [/su_accordion]

cpp version

#include "btree.h"
void CreateBTNode(BTNode * &b,char *str)
{
        BTNode *ST[MaxSize],*p;
        int top = -1,k,j=0;
        char ch;
        b=NULL;
        ch=str[j];
        while(ch!='\0')
      { 
           switch(ch){
                case '(':
                     top++; ST[top]=p;k=1;  break;
                case ')':top--;break;
                case ',':k=2;break;
                default: p =(BTNode *)malloc (sizeof(BTNode));
                             p->data = ch;p->lchild = p->rchild = NULL;
                             if(b==NULL)
                                 b=p;//即为根节点
                             else{
                              switch(k){
                                   case 1:St[top]->lchild=p;break;
                                   case 2:St[top]->rchild=p;break; 
                                   }   
                              }
                 j++;
                ch = str[j];
          }
}

分块思想

根据OI WIKI 精神，做一个与分块相关的专题讨论

分块的基本思想是，通过对原数据的适当划分，并在划分后的每一个块上预处理部分信息，从而较一般的暴力算法取得更优的时间复杂度。
-OI WIKI

并查集

并查集是一种用于管理元素所属集合的数据结构，实现为一个森林，其中每棵树表示一个集合，树中的节点表示对应集合中的元素。

顾名思义，并查集支持两种操作：

合并（Union）：合并两个元素所属集合（合并对应的树）
查询（Find）：查询某个元素所属集合（查询对应的树的根节点），这可以用于判断两个元素是否属于同一集合

并查集在经过修改后可以支持单个元素的删除、移动；使用动态开点线段树还可以实现可持久化并查集。

初始化

初始时，每个元素都位于一个单独的集合，表示为一棵只有根节点的树。方便起见，我们将根节点的父亲设为自己。

cpp实现

struct dsu {
  vector<size_t> pa;

  explicit dsu(size_t size) : pa(size) { iota(pa.begin(), pa.end(), 0); }
};

查询

我们需要沿着树向上移动，直至找到根节点。

size_t dsu::find(size_t x) { 
return pa[x] == x ? x : find(pa[x]); 
}

路径压缩

查询过程中经过的每个元素都属于该集合，我们可以将其直接连到根节点以加快后续查询。

size_t dsu::find(size_t x) { 
return pa[x] == x ? x : pa[x] = find(pa[x]); 
}

合并

要合并两棵树，我们只需要将一棵树的根节点连到另一棵树的根节点。

启发式合并

合并时，选择哪棵树的根节点作为新树的根节点会影响未来操作的复杂度。我们可以将节点较少或深度较小的树连到另一棵，以免发生退化。

struct dsu {
  vector<size_t> pa, size;
  explicit dsu(size_t size_) : pa(size_), size(size_, 1) {
    iota(pa.begin(), pa.end(), 0);
  }
  void unite(size_t x, size_t y) {
    x = find(x), y = find(y);
    if (x == y) return;
    if (size[x] < size[y]) swap(x, y);
    pa[y] = x;
    size[x] += size[y];
  }
};

删除

要删除一个叶子节点，我们可以将其父亲设为自己。为了保证要删除的元素都是叶子，我们可以预先为每个节点制作副本，并将其副本作为父亲。即必须保证删除叶子

struct dsu {
  vector<size_t> pa, size;
  explicit dsu(size_t size_) : pa(size_ * 2), size(size_ * 2, 1) {
    iota(pa.begin(), pa.begin() + size_, size_);
    iota(pa.begin() + size_, pa.end(), size_);
  }
  void erase(size_t x) {
    --size[find(x)];
    pa[x] = x;
  }
};

移动

与删除类似，通过以副本作为父亲，保证要移动的元素都是叶子。

void dsu::move(size_t x, size_t y) {
  auto fx = find(x), fy = find(y);
  if (fx == fy) return;
  pa[x] = fy;
  --size[fx], ++size[fy];
}

复杂度

时间复杂度

同时使用路径压缩和启发式合并之后，并查集的每个操作平均时间仅为，其中为阿克曼函数的反函数，其增长极其缓慢，也就是说其单次操作的平均运行时间可以认为是一个很小的常数。

时间复杂度的证明在这个页面中。

空间复杂度

显然为。

带权并查集

我们还可以在并查集的边上定义某种权值、以及这种权值在路径压缩时产生的运算，从而解决更多的问题。比如对于经典的「NOI2001」食物链，我们可以在边权上维护模 3 意义下的加法群。

例题：NOI2015 程序自动分析（参考了一个讲解的很清楚的大佬的题解）

题解 P1955 【[NOI2015]程序自动分析】

posted on 2018-06-15 12:59:10 | under 题解 | 33

并查集

并查集实际上是一个森林。

并查集有两种基本操作：find和merge

find

俗称找爸爸。

本人习惯于使用递归实现

int find(int x)
{
    if(f[x]==x) return x;
    else return find(f[x]);
}

即可，比较简单请自行理解

merge

合并两棵树，直接把第2棵树的最早的祖先的爸爸改为第1棵树的最早的祖先

代码很简单：

void merge(int a,int b)
{
    f[find(a)]=find(b);
}

路径压缩

按照上面的代码，很快这棵树就会变成类链的东西，那么每一次find操作的耗时就不能接受了。

我们考虑在整个过程中，我们只需要利用一个结点的最早的祖先，那么只需要每次find操作过程中，把这个结点的爸爸改成它最早的祖先。

只需要把find操作的代码稍作修改即可。

int find(int x)
{
    if(f[x]==x) return x;
    else return f[x]=find(f[x]);
}

离散化

离散化，就是把无限空间中有限的个体映射到有限的空间中去，以此提高算法的时空效率。

来自百度~~（反正我一开始没看懂）~~

讲通俗一点，就是把一个可以很大的数据排个序，去个重，放在新的数组里，每次用二分找它在新数组的位置。

并查集和离散化

离散化之后，对于每一个x进行并查集操作

将条件转化为并查集的基本操作

两个x相等，则相当于在一个集合里，否则不在一个集合。

细节

进行限制条件的判断前，先排个序，11在前面，00在后面，为什么请自己思考

代码

~~直接复制你会爆零的~~[su_accordion][su_spoiler title=”Spoiler 标题” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

#include<iostream>
#include<cstdio>
#include<algorithm>
using namespace std;
#define maxn 100007
#define maxnt 1000000007
struct node{
    int a,b,e;
}a[maxn];
int ttttt,f[maxn*2],x[maxn*2],n,lsh[maxn*2],nn,wz[maxn*2];
bool flag;
void mak()
{
    for(register int i=1;i<=nn;i++) f[i]=i;
}
int find(int x)
{
    return f[x]==x?x:f[x]=find(f[x]);
}
bool comp(int a,int b)
{
    return a<b;
}
bool cmp(node a,node b)
{
    return a.e>b.e;
}
int er(int zz)
{
    int l=1,r=nn,m;
    while(l<=r)
    {
        m=(l+r)/2;
        if(lsh[m]==zz) return m;
        if(zz>lsh[m]) l=m+1;
        else r=m-1;
    }
    return zz;
}
int mian()
{
    scanf("%d",&ttttt);
    while(ttttt--)
    {
        scanf("%d",&n);
        nn=0;
        flag=0;
        for(register int i=1;i<=n;i++)
        {
            scanf("%d%d%d",&a[i].a,&a[i].b,&a[i].e);
            x[i*2-1]=a[i].a,x[i*2]=a[i].b;
        }
        sort(x+1,x+2*n+1,comp);
        for(register int i=1;i<=2*n;i++)
        {
            if(x[i]!=x[i-1])
            {
                lsh[++nn]=x[i];
//              wz[i]=nn;
            }
//          else wz[i]=wz[i-1];
        }
        mak();
        sort(a+1,a+n+1,cmp);
        for(register int i=1;i<=n;i++)
        {
            if(a[i].e)
            {
                int aaa=find(er(a[i].a)),bbb=find(er(a[i].b));
                if(aaa!=bbb) f[aaa]=bbb;
            }
            else
            {
                int aaa=find(er(a[i].a)),bbb=find(er(a[i].b));
                if(aaa==bbb)
                {
                    printf("NO\n");
                    flag=1;
                    break;
                }
            }
        }
        if(!flag) printf("YES\n");
    }
    return 0;
}

[/su_spoiler] [/su_accordion]

Collection类详解：

public interface Collection<AnyType> extends Iterator<Anytype>
{
     int size();
     boolean IsEmpty();
     void clear();
     boolean contains(AnyType x);
     boolean add(AnyType x);
     boolean remove(AnyType x);
     Java.util.Iterator<AnyType> iterator();
}

在Java中,获取collection类中的Iterator接口主要有两种方式:

1. 使用集合类的iterator()方法

每一个集合类如List、Set、Queue等都实现了iterator()方法,用于获取该集合的Iterator对象。

例如:java

List<String> list = new ArrayList<>();

Iterator iterator = list.iterator();

2. 通过JDK1.5提供的增强for循环语法

增强for循环可以隐式获取集合的Iterator对象,语法简洁。

例如:java

List<String> list = new ArrayList<>();

for(String s : list){ // 内部实现相当于Iterator迭代 }

增强for循环本质上是语法糖,会自动调用集合的iterator()方法并遍历。

线性表例题：

P2234 [HNOI2002] 营业额统计

题目描述

Tiger 最近被公司升任为营业部经理，他上任后接受公司交给的第一项任务便是统计并分析公司成立以来的营业情况。

Tiger 拿出了公司的账本，账本上记录了公司成立以来每天的营业额。分析营业情况是一项相当复杂的工作。由于节假日，大减价或者是其他情况的时候，营业额会出现一定的波动，当然一定的波动是能够接受的，但是在某些时候营业额突变得很高或是很低，这就证明公司此时的经营状况出现了问题。经济管理学上定义了一种最小波动值来衡量这种情况：当最小波动值越大时，就说明营业情况越不稳定。

而分析整个公司的从成立到现在营业情况是否稳定，只需要把每一天的最小波动值加起来就可以了。你的任务就是编写一个程序帮助 Tiger 来计算这一个值。

我们定义，一天的最小波动值 = $\min{|\text{该天以前某一天的营业额}-\text{该天营业额}|}$。

特别地，第一天的最小波动值为第一天的营业额。

输入格式

第一行为正整数 $n$（$n \leq 32767$），表示该公司从成立一直到现在的天数，接下来的 $n$ 行每行有一个整数 $a_i$（$|a_i| \leq 10^6$) ，表示第 $i$ 天公司的营业额，可能存在负数。

输出格式

输出一个正整数，即每一天最小波动值的和，保证结果小于 $2^{31}$。

样例 #1

样例输入 #1

样例输出 #1

提示

结果说明：$5+|1-5|+|2-1|+|5-5|+|4-5|+|6-5|=5+4+1+0+1+1=12$

高端のanswer[su_accordion][su_spoiler title=”详细代码” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

#include<cstdio>
#include<set>
using namespace std;
typedef multiset<int>::iterator iter;
//定义iter类型，免去每次打一大串的苦恼
const int inf=0x7fffffff;
//定义无限大，0x7fffffff为16进制
int n,x,ans;
multiset<int>s;
inline int min(int a,int b){return a<b?a:b;}
inline int abs(int x){return x<0?-x:x;}
int main(){
	s.insert(inf);s.insert(-inf);
    //插入正无穷和负无穷，防止迭代器访问到一些奇奇怪怪的内存
	scanf("%d",&n);
	scanf("%d",&x);s.insert(x);ans=x;
    //第一个单独考虑
	while(--n){
		scanf("%d",&x);
		iter it=s.insert(x);
        //插入x，并返回x在s中的位置（迭代器）
		it--;
		iter l=it;
		it++;it++;
		iter r=it;
		it--;
        //迭代器只支持++，--运算符，所以看上去很麻烦。。
        //其实l就是上一个数，r是下一个数（在s中）
		if(*l==-inf)ans+=abs(x-*r);
        //在最前面
		if(*r==inf)ans+=abs(x-*l);
        //在最后面
		if(*l!=-inf&&*r!=inf)ans+=min(abs(x-*r),abs(x-*l));
        //一般情况
	}
	printf("%d",ans);
}

[/su_spoiler]

堆

堆是一棵树，其每个节点都有一个键值，且每个节点的键值都大于等于/小于等于其父亲的键值。

（小根）堆主要支持的操作有：插入一个数、查询最小值、删除最小值、合并两个堆、减小一个元素的值。

我们首先着力讨论一下二叉堆

从二叉堆的结构说起，它是一棵二叉树，并且是完全二叉树，每个结点中存有一个元素（或者说，有个权值）。

作为一棵完全二叉树，二叉堆完全可以用一个1-index的数组来存储，对于节点p，p*2即为左儿子，p*2+1即为右节点。同时，用size记录当前二叉堆中节点的个数。

插入操作

插入操作是指向二叉堆中插入一个元素，要保证插入后也是一棵完全二叉树。

最简单的方法就是，最下一层最右边的叶子之后插入。

如果最下一层已满，就新增一层。

插入之后可能会不满足堆性质？

向上调整：如果这个结点的权值大于它父亲的权值，就交换，重复此过程直到不满足或者到根。

可以证明，插入之后向上调整后，没有其他结点会不满足堆性质。

向上调整的时间复杂度是 O（logN）的。

删除操作

删除操作指删除堆中最大的元素，即删除根结点。

但是如果直接删除，则变成了两个堆，难以处理。

所以不妨考虑插入操作的逆过程，设法将根结点移到最后一个结点，然后直接删掉。

然而实际上不好做，我们通常采用的方法是，把根结点和最后一个结点直接交换。

于是直接删掉（在最后一个结点处的）根结点，但是新的根结点可能不满足堆性质……

向下调整：在该结点的儿子中，找一个最大的，与该结点交换，重复此过程直到底层。

可以证明，删除并向下调整后，没有其他结点不满足堆性质。

实现二叉堆

基本依托于上浮下沉的两种交换[su_accordion][su_spoiler title=”代码” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

void up(int x) {
  while (x > 1 && h[x] > h[x / 2]) {
    swap(h[x], h[x / 2]);
    x /= 2;
  }
}
void down(int x) {
  while (x * 2 <= n) {
    t = x * 2;
    if (t + 1 <= n && h[t + 1] > h[t]) t++;
    if (h[t] <= h[x]) break;
    std::swap(h[x], h[t]);
    x = t;
  }
}

[/su_spoiler]

所以解决了上浮与下沉问题之后，插入和删除迎刃而解。

建堆

考虑这么一个问题，从一个空的堆开始，插入 n个元素，不在乎顺序。

直接一个一个插入需要nlogn的时间，有没有更好的方法？

方法一：使用 decreasekey（即，向上调整）

`void build_heap_1() {`

`for (i = 1; i <= n; i++)`

`up(i);`

`}`

从根开始，按 BFS 序进行。

为啥这么做：对于第 k 层的结点，向上调整的复杂度为logk而不是logn 。

总复杂度：O（logN）。

（在「基于比较的排序」中证明过）

方法二：使用向下调整

`void build_heap_2() {`

`for (i = n; i >= 1; i--)`

`down(i);`

`}`

这时换一种思路，从叶子开始，逐个向下调整

换一种理解方法，每次「合并」两个已经调整好的堆，这说明了正确性。

注意到向下调整的复杂度，为 logk，另外注意到叶节点无需调整，因此可从序列约n/2的位置开始调整，可减少部分常数但不影响复杂度。

[su_spoiler title=”建堆代码复杂度证明” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

[/su_spoiler] [/su_accordion]

SP16254 RMID2 – Running Median Again

[su_accordion][su_spoiler title=”detail” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

[/su_spoiler] [/su_accordion]

[su_accordion][su_spoiler title=”题解” open=”no” style=”default” icon=”plus” anchor=”” anchor_in_url=”no” class=””]

#include <cstdio>
#include <iostream>
#include <queue>
using namespace std;
int main() {
  int t, x;
  scanf("%d", &t);
  while (t--) {
    // 大根堆，维护前一半元素（存小值）
    priority_queue<int, vector<int>, less<int> > a;
    // 小根堆，维护后一半元素（存大值）
    priority_queue<int, vector<int>, greater<int> > b;
    while (scanf("%d", &x) && x) {
      // 若为查询并删除操作，输出并删除大根堆堆顶元素
      // 因为这题要求输出中位数中较小者（偶数个数字会存在两个中位数候选）
      // 这个和上面的第k大讲解有稍许出入，但如果理解了上面的，这个稍微变通下便可理清
      if (x == -1) {
        printf("%d\n", a.top());
        a.pop();
      }
      // 若为插入操作，根据大根堆堆顶的元素值，选择合适的堆进行插入
      else {
        if (a.empty() || x <= a.top())
          a.push(x);
        else
          b.push(x);
      }
      // 对对顶堆进行调整
      if (a.size() > (a.size() + b.size() + 1) / 2) {
        b.push(a.top());
        a.pop();
      } else if (a.size() < (a.size() + b.size() + 1) / 2) {
        a.push(b.top());
        b.pop();
      }
    }
  }
  return 0;
}

[/su_spoiler] [/su_accordion]

配对堆

配对堆是一个支持插入，查询/删除最小值，合并，修改元素等操作的数据结构，是一种可并堆。有速度快和结构简单的优势，但由于其为基于势能分析的均摊复杂度，无法可持久化。

配对堆是一棵满足堆性质的带权多叉树（如下图），即每个节点的权值都小于或等于他的所有儿子

通常我们使用儿子 – 兄弟表示法储存一个配对堆（如下图），一个节点的所有儿子节点形成一个单向链表。每个节点储存第一个儿子的指针，即链表的头节点；和他的右兄弟的指针。

这种方式便于实现配对堆，也将方便复杂度分析。

配对堆比较明显的特征在于其合并

合并

合并两个配对堆的操作很简单，首先令两个根节点较小的一个为新的根节点，然后将较大的根节点作为它的儿子插入进去。（见下图）

Node* meld(Node* x, Node* y) {
  // 若有一个为空则直接返回另一个
  if (x == nullptr) return y;
  if (y == nullptr) return x;
  if (x->v > y->v) std::swap(x, y);  // swap后x为权值小的堆，y为权值大的堆
  // 将y设为x的儿子
  y->sibling = x->child;
  x->child = y;
  return x;  // 新的根节点为 x
}

删除最小值

首先要提及的一点是，上文的几个操作都十分偷懒，完全没有对数据结构进行维护，所以我们需要小心设计删除最小值的操作，来保证总复杂度不出问题。

根节点即为最小值，所以要删除的是根节点。考虑拿掉根节点之后会发生什么：根节点原来的所有儿子构成了一片森林；而配对堆应当是一棵树，所以我们需要通过某种顺序把这些儿子全部合并起来。

一个很自然的想法是使用 meld 函数把儿子们从左到右挨个并在一起，这样做的话正确性是显然的，但是会导致单次操作复杂度退化到 n。

为了保证总的均摊复杂度，需要使用一个「两步走」的合并方法：

把儿子们两两配成一对，用 meld 操作把被配成同一对的两个儿子合并到一起（见下图 1），
将新产生的堆 从右往左（即老的儿子到新的儿子的方向）挨个合并在一起（见下图 2）。

Node* merges(Node* x) {
  if (x == nullptr || x->sibling == nullptr)
    return x;  // 如果该树为空或他没有下一个兄弟，就不需要合并了，return。
  Node* y = x->sibling;                // y 为 x 的下一个兄弟
  Node* c = y->sibling;                // c 是再下一个兄弟
  x->sibling = y->sibling = nullptr;   // 拆散
  return meld(merges(c), meld(x, y));  // 核心部分
}

最后一句话是该函数的核心，这句话分三部分：

meld(x,y)「配对」了 x 和 y。
merges(c) 递归合并 c 和他的兄弟们。
将上面 2 个操作产生的 2 个新树合并。

需要注意到的是，上文提到了第二步时的合并方向是有要求的（从右往左合并），该递归函数的实现已保证了这个顺序，如果读者需要自行实现迭代版本的话请务必注意保证该顺序，否则复杂度将失去保证。

有了 merges 函数，delete-min 操作就显然了。

Node* delete_min(Node* x) {
  Node* t = merges(x->child);
  delete x;  // 如果需要内存回收
  return t;
}

B 树

The limitations of traditional binary search trees can be frustrating. Meet the B-Tree, the multi-talented data structure that can handle massive amounts of data with ease. When it comes to storing and searching large amounts of data, traditional binary search trees can become impractical due to their poor performance and high memory usage. B-Trees, also known as B-Tree or Balanced Tree, are a type of self-balancing tree that was specifically designed to overcome these limitations.

properties

~Every node except the root must contain at most t-1 keys. The root may contain a minimum of 1 key.
~All nodes (including root) may contain at most (2*t – 1) keys
~B-Tree grows and shrinks from the root which is unlike Binary Search Tree. Binary Search Trees grow downward and also shrink from downward.

Traversal in B-Tree:

Traversal is also similar to Inorder traversal of Binary Tree. We start from the leftmost child, recursively print the leftmost child, then repeat the same process for the remaining children and keys. In the end, recursively print the rightmost child.

Search Operation in B-Tree:

Search is similar to the search in Binary Search Tree. Let the key to be searched is k.

Start from the root and recursively traverse down.
For every visited non-leaf node,
- If the node has the key, we simply return the node.
- Otherwise, we recur down to the appropriate child (The child which is just before the first greater key) of the node.
If we reach a leaf node and don’t find k in the leaf node, then return NULL.

Searching a B-Tree is similar to searching a binary tree. The algorithm is similar and goes with recursion. At each level, the search is optimized as if the key value is not present in the range of the parent then the key is present in another branch. As these values limit the search they are also known as limiting values or separation values. If we reach a leaf node and don’t find the desired key then it will display NULL.

Insert Operation in B-Tree

How to make sure that a node has space available for a key before the key is inserted? We use an operation called splitChild() that is used to split a child of a node. See the following diagram to understand split. In the following diagram, child y of x is being split into two nodes y and z. Note that the splitChild operation moves a key up and this is the reason B-Trees grow up, unlike BSTs which grow down.

Insertion
1) Initialize x as root.
2) While x is not leaf, do following
..a) Find the child of x that is going to be traversed next. Let the child be y.
..b) If y is not full, change x to point to y.
..c) If y is full, split it and change x to point to one of the two parts of y. If k is smaller than mid key in y, then set x as the first part of y. Else second part of y. When we split y, we move a key from y to its parent x.
3) The loop in step 2 stops when x is leaf. x must have space for 1 extra key as we have been splitting all nodes in advance. So simply insert k to x.

针对一棵高度为 h 的 m阶 B 树，插入一个元素时，首先要验证该元素在 B 树中是否存在，如果不存在，那么就要在叶子节点中插入该新的元素，此时分 3 种情况：

如果叶子节点空间足够，即该节点的关键字数小于 m-1，则直接插入在叶子节点的左边或右边；
如果空间满了以至于没有足够的空间去添加新的元素，即该节点的关键字数已经有了 m个，则需要将该节点进行「分裂」，将一半数量的关键字元素分裂到新的其相邻右节点中，中间关键字元素上移到父节点中，而且当节点中关键元素向右移动了，相关的指针也需要向右移。
1. 从该节点的原有元素和新的元素中选择出中位数
2. 小于这一中位数的元素放入左边节点，大于这一中位数的元素放入右边节点，中位数作为分隔值。
3. 分隔值被插入到父节点中，这可能会造成父节点分裂，分裂父节点时可能又会使它的父节点分裂，以此类推。如果没有父节点（这一节点是根节点），就创建一个新的根节点（增加了树的高度）。

如果一直分裂到根节点，那么就需要创建一个新的根节点。它有一个分隔值和两个子节点。

// C++ program for B-Tree insertion
#include<iostream>
using namespace std;

// A BTree node
class BTreeNode
{
	int *keys; // An array of keys
	int t;	 // Minimum degree (defines the range for number of keys)
	BTreeNode **C; // An array of child pointers
	int n;	 // Current number of keys
	bool leaf; // Is true when node is leaf. Otherwise false
public:
	BTreeNode(int _t, bool _leaf); // Constructor

	// A utility function to insert a new key in the subtree rooted with
	// this node. The assumption is, the node must be non-full when this
	// function is called
	void insertNonFull(int k);

	// A utility function to split the child y of this node. i is index of y in
	// child array C[]. The Child y must be full when this function is called
	void splitChild(int i, BTreeNode *y);

	// A function to traverse all nodes in a subtree rooted with this node
	void traverse();

	// A function to search a key in the subtree rooted with this node.
	BTreeNode *search(int k); // returns NULL if k is not present.

// Make BTree friend of this so that we can access private members of this
// class in BTree functions
friend class BTree;
};

// A BTree
class BTree
{
	BTreeNode *root; // Pointer to root node
	int t; // Minimum degree
public:
	// Constructor (Initializes tree as empty)
	BTree(int _t)
	{ root = NULL; t = _t; }

	// function to traverse the tree
	void traverse()
	{ if (root != NULL) root->traverse(); }

	// function to search a key in this tree
	BTreeNode* search(int k)
	{ return (root == NULL)? NULL : root->search(k); }

	// The main function that inserts a new key in this B-Tree
	void insert(int k);
};

// Constructor for BTreeNode class
BTreeNode::BTreeNode(int t1, bool leaf1)
{
	// Copy the given minimum degree and leaf property
	t = t1;
	leaf = leaf1;

	// Allocate memory for maximum number of possible keys
	// and child pointers
	keys = new int[2*t-1];
	C = new BTreeNode *[2*t];

	// Initialize the number of keys as 0
	n = 0;
}

// Function to traverse all nodes in a subtree rooted with this node
void BTreeNode::traverse()
{
	// There are n keys and n+1 children, traverse through n keys
	// and first n children
	int i;
	for (i = 0; i < n; i++)
	{
		// If this is not leaf, then before printing key[i],
		// traverse the subtree rooted with child C[i].
		if (leaf == false)
			C[i]->traverse();
		cout << " " << keys[i];
	}

	// Print the subtree rooted with last child
	if (leaf == false)
		C[i]->traverse();
}

// Function to search key k in subtree rooted with this node
BTreeNode *BTreeNode::search(int k)
{
	// Find the first key greater than or equal to k
	int i = 0;
	while (i < n && k > keys[i])
		i++;

	// If the found key is equal to k, return this node
	if (keys[i] == k)
		return this;

	// If key is not found here and this is a leaf node
	if (leaf == true)
		return NULL;

	// Go to the appropriate child
	return C[i]->search(k);
}

// The main function that inserts a new key in this B-Tree
void BTree::insert(int k)
{
	// If tree is empty
	if (root == NULL)
	{
		// Allocate memory for root
		root = new BTreeNode(t, true);
		root->keys[0] = k; // Insert key
		root->n = 1; // Update number of keys in root
	}
	else // If tree is not empty
	{
		// If root is full, then tree grows in height
		if (root->n == 2*t-1)
		{
			// Allocate memory for new root
			BTreeNode *s = new BTreeNode(t, false);

			// Make old root as child of new root
			s->C[0] = root;

			// Split the old root and move 1 key to the new root
			s->splitChild(0, root);

			// New root has two children now. Decide which of the
			// two children is going to have new key
			int i = 0;
			if (s->keys[0] < k)
				i++;
			s->C[i]->insertNonFull(k);

			// Change root
			root = s;
		}
		else // If root is not full, call insertNonFull for root
			root->insertNonFull(k);
	}
}

// A utility function to insert a new key in this node
// The assumption is, the node must be non-full when this
// function is called
void BTreeNode::insertNonFull(int k)
{
	// Initialize index as index of rightmost element
	int i = n-1;

	// If this is a leaf node
	if (leaf == true)
	{
		// The following loop does two things
		// a) Finds the location of new key to be inserted
		// b) Moves all greater keys to one place ahead
		while (i >= 0 && keys[i] > k)
		{
			keys[i+1] = keys[i];
			i--;
		}

		// Insert the new key at found location
		keys[i+1] = k;
		n = n+1;
	}
	else // If this node is not leaf
	{
		// Find the child which is going to have the new key
		while (i >= 0 && keys[i] > k)
			i--;

		// See if the found child is full
		if (C[i+1]->n == 2*t-1)
		{
			// If the child is full, then split it
			splitChild(i+1, C[i+1]);

			// After split, the middle key of C[i] goes up and
			// C[i] is splitted into two. See which of the two
			// is going to have the new key
			if (keys[i+1] < k)
				i++;
		}
		C[i+1]->insertNonFull(k);
	}
}

// A utility function to split the child y of this node
// Note that y must be full when this function is called
void BTreeNode::splitChild(int i, BTreeNode *y)
{
	// Create a new node which is going to store (t-1) keys
	// of y
	BTreeNode *z = new BTreeNode(y->t, y->leaf);
	z->n = t - 1;

	// Copy the last (t-1) keys of y to z
	for (int j = 0; j < t-1; j++)
		z->keys[j] = y->keys[j+t];

	// Copy the last t children of y to z
	if (y->leaf == false)
	{
		for (int j = 0; j < t; j++)
			z->C[j] = y->C[j+t];
	}

	// Reduce the number of keys in y
	y->n = t - 1;

	// Since this node is going to have a new child,
	// create space of new child
	for (int j = n; j >= i+1; j--)
		C[j+1] = C[j];

	// Link the new child to this node
	C[i+1] = z;

	// A key of y will move to this node. Find the location of
	// new key and move all greater keys one space ahead
	for (int j = n-1; j >= i; j--)
		keys[j+1] = keys[j];

	// Copy the middle key of y to this node
	keys[i] = y->keys[t-1];

	// Increment count of keys in this node
	n = n + 1;
}

// Driver program to test above functions
int main()
{
	BTree t(3); // A B-Tree with minimum degree 3
	t.insert(10);
	t.insert(20);
	t.insert(5);
	t.insert(6);
	t.insert(12);
	t.insert(30);
	t.insert(7);
	t.insert(17);

	cout << "Traversal of the constructed tree is ";
	t.traverse();

	int k = 6;
	(t.search(k) != NULL)? cout << "\nPresent" : cout << "\nNot Present";

	k = 15;
	(t.search(k) != NULL)? cout << "\nPresent" : cout << "\nNot Present";

	return 0;
}

分类目录算法与数据结构