Meaning of Block Partitioning#
Block partitioning is a concept, not a data structure.
The basic idea of block partitioning is to achieve better time complexity than general brute-force algorithms by appropriately dividing the original data and preprocessing some information on each block after partitioning.
Time Complexity#
The time complexity of block partitioning mainly depends on the block length, which can generally be determined by the mean inequality to find the optimal block length for a given problem, along with the corresponding time complexity.
Mean Inequality#
For the perfect square formula we learned in middle school, $(x+y)^2 \ge 0$, through simple transformation we can derive $x^2+y^2 \ge 2xy$.
By substituting $x = \sqrt{a}$ and $y = \sqrt{b}$, we get $a + b \ge 2\sqrt{ab}$ $(a, b > 0)$, where equality holds when $a = b$, which is the mean inequality.
Analysis#
Taking a problem as an example: Segment Tree 1
That's right, although it's a template problem for segment trees, it can also be solved using block partitioning, and it runs very fast.
Let the block length be $s$. Regardless of the operation on [l, r]:
- If $l$ and $r$ are in the same block, enumerate: $O(s)$.
- If $l$ and $r$ are not in the same block: enumerate the block containing $l$, then enumerate from the next block of $l$ to the previous block of $r$, and enumerate the block containing $r$: $O(s + \frac{n}{s})$.
If there are $m$ operations in total, the overall time complexity is $O(m(s + \frac{n}{s}))$. Using the mean inequality, we know that when $s = \frac{n}{s}$, this expression achieves its minimum value, i.e., optimal time complexity, at this point $s = \sqrt{n}$.
(tips: For different problems, the block length $s$ may vary, and specific analysis is required, but generally speaking, for operations like this problem, the block length is typically $\sqrt{n}$.)
Template#
For the segment tree problem mentioned earlier, we can write the block partitioning code.
#include <bits/stdc++.h>
using namespace std;
const int N = 1e5 + 5;
int n, m;
long long a[N], blk[N], b[N];
int pos[N];
int R;
void add(int l, int r, long long k)
{
if(pos[l] == pos[r])
{
for(int i = l;i <= r;i++)
a[i] += k;
blk[pos[l]] += (r - l + 1) * k;
return;
}
for(int i = l;pos[i] == pos[l];i++)
{
a[i] += k;
blk[pos[i]] += k;
}
for(int i = pos[l] + 1;i <= pos[r] - 1;i++)
{
blk[i] += R * k;
b[i] += k;
}
for(int i = r;pos[i] == pos[r];i--)
{
a[i] += k;
blk[pos[i]] += k;
}
}
long long query(int l, int r)
{
long long ans = 0;
if(pos[l] == pos[r])
{
for(int i = l;i <= r;i++)
{
ans += a[i] + b[pos[i]];
}
return ans;
}
for(int i = l;pos[i] == pos[l];i++)
{
ans += a[i] + b[pos[i]];
}
for(int i = pos[l] + 1;i <= pos[r] - 1;i++)
{
ans += blk[i];
}
for(int i = r;pos[i] == pos[r];i--)
{
ans += a[i] + b[pos[i]];
}
return ans;
}
int main()
{
clock_t c1 = clock();
#ifdef LOCAL
freopen("in.in", "r", stdin);
freopen("out.out", "w", stdout);
#endif
//------------------------------
scanf("%d%d", &n, &m);
R = sqrt(n) + 1;
for(int i = 1;i <= n;i++)
{
scanf("%lld", &a[i]);
pos[i] = i / R + 1;
blk[pos[i]] += a[i];
}
long long opt, x, y, z;
while(m--)
{
scanf("%lld", &opt);
if(opt == 1)
{
scanf("%lld%lld%lld", &x, &y, &z);
add(x, y, z);
}
else if(opt == 2)
{
scanf("%lld%lld", &x, &y);
printf("%lld\n", query(x, y));
}
}
//------------------------------
end:
cerr << "Time Used:" << clock() - c1 << "ms" << endl;
return 0;
}
tips:
- The $pos$ array records the block number of each element.
- The $blk$ array records the actual sum within each block.
- The $b$ array acts like the $lazytag$ in segment trees, recording values that have been added to the $blk$ array but have not yet been "propagated" to the original array (however, propagation is not necessary here, as it is not implemented in the code).
Other Problems#
Point Update + Query Count of Numbers Less Than (or Equal to) or Greater Than (or Equal to) a Certain Number in a Range#
For each block, we can create a vector and sort it. When updating, we can use binary search to find the position of the number in the vector, erase it, insert the new number, and then sort again.
For queries, if it is not a complete block, we enumerate violently; if it is a complete block, we can simply use the vector's $lower_bound$ to query the count.