Bit Manipulation#
Function#
The underlying implementation of computers is binary. Mastering bit manipulation can improve the time and space efficiency of program execution.
Easter Egg#
The preface of the blue book numbered 0xFF is -1, and the postscript is 0x7F which is 127 (under the condition of signed 8-bit binary numbers).
Preliminary Understanding#
- And: and, &, ∧ (can be remembered in conjunction with set operations)
- Or: or, |, ∨
- Not: not, ~, ¬
- Xor: xor, ^
In an $m$-bit binary number, the lowest bit is usually the $0$th bit, incrementing from right to left, and the highest bit is the $(m-1)$th bit.
Storage and Operations of Integers#
Below, we introduce the example of 32-bit binary numbers, specifically int and unsigned int in C++.
Two's Complement#
-
unsigned int
Directly treat encoding C as a 32-bit binary number S. -
int
The highest bit is the sign bit, where 0 indicates non-negative numbers and 1 indicates negative numbers.
For sign bit 0, directly treat encoding C as a 32-bit binary number S.
For sign bit 1, define the value represented by the bitwise negation of encoding C, ~C, as $-1-S$.It can be observed that under two's complement, each value has a unique representation method, and performing addition or subtraction on any two values is equivalent to performing binary addition or subtraction under 32-bit two's complement with no carry from the highest bit (meaning if the highest bit overflows, it is ignored, like 1+1=0, but the carry from the second highest bit must be considered, similar to this). Upon overflow, a 32-bit unsigned integer effectively takes modulo $2^{32}$. This also explains the phenomenon of negative numbers appearing when signed integers overflow arithmetically. (However, I don't quite understand this; can someone help clarify?)
One's Complement#
Skipped, it's just one less than two's complement.
Simple Representation#
Using binary to represent an int requires 32 bits, which is cumbersome. Using decimal does not express each bit of two's complement, so hexadecimal is commonly used to represent a constant. This way, only 8 characters are needed.
Common value: 0x3F3F3F3F (doubling will not overflow, each byte is the same, convenient for assignment).
Shift Operations#
Left shift: In binary representation, move the number to the left, filling low bits with 0, and discarding high bits that overflow.
Arithmetic right shift: In binary two's complement representation, move the number to the right, filling high bits with the sign bit, and discarding low bits that overflow. This is equivalent to dividing by 2 and rounding down. ($(-3)>>1=-2$)
Logical right shift: In binary two's complement representation, move the output to the right, filling high bits with 0, and discarding low bits that overflow.
Most compilers use arithmetic right shift.
Applications#
- Calculate $a^b \mod p$
According to mathematical knowledge, every number can be uniquely represented as a sum of distinct powers of 2 (since any integer can be converted to binary), meaning if $b$ has $k$ bits in its binary representation, where the $i (0\leq i\leq k)$th bit is $c_i$, then
Thus,
We can write the following code:
#include<bits/stdc++.h>
using namespace std;
int main()
{
int a = 0;
int b = 0;
int p = 0;
int ans = 1;
cin >> a >> b >> p;
for(;b > 0;b >>= 1)
{
if(b & 1)
{
ans = (long long)ans * a % p;
}
a = (long long)a * a % p;
}
cout << ans % p;
return 0;
}
In C++, when two values perform arithmetic operations, the highest value type involved in the operation serves as the basis, regardless of the type used to store the result.
However, there is a problem here. The highest built-in integer type in C++ is 64 bits, so operations like $a*b \mod p$ at the $10^{18}$ level require special handling.
- Calculate $a*b \mod p$ ($1\leq a,b,p\leq 10^{18}$)
Method 1: Similar to the idea of fast exponentiation.
Then,
This way, it won't exceed 64 bits. The code is as follows:
#include<bits/stdc++.h>
using namespace std;
#define ll long long
int main()
{
ll a = 0, b = 0, p = 0;
cin >> a >> b >> p;
ll ans = 0;
for(;b > 0;b >>= 1)
{
if(b & 1)
{
ans = (ans + a) % p;
}
a = 2 * a % p;
}
cout << ans % p;
}
Method 2:
Use $ab \mod p=ab-\lfloor a*b/p\rfloor p$. Let $c=\lfloor ab/p\rfloor$.
How to calculate $c$? Use floating-point numbers (long double). When the precision of floating-point numbers is insufficient to retain accurate digits, lower bits will be discarded. Thus, we get $c._\ ____$.
(The role of long double is to allow $a*b$ to be calculated successfully without overflow errors; if unsigned long long or long long were used, it wouldn't compute correctly, let alone the errors. Even though long double may have errors, in the multiplication of 64-bit numbers, the maximum would be 128 bits, and the resulting error would not be significant, allowing for the correct result.)
Next, we just need to calculate $ab-cp$. Since $ab-cp$ is actually $ab \mod p$, we have $ab-c*p \leq p < 2^{64}$, thus
Taking modulo $2^{64}$ is the natural overflow of unsigned long long, and we can compute the final result using long long (to prevent negative results).
The code is as follows:
#include<bits/stdc++.h>
using namespace std;
typedef unsigned long long ull;
int main()
{
ull a, b, p;
cin >> a >> b >> p;
a %= p, b %= p;
ull c = (long double)a * b / p;
ull x = a * b, y = c * p;
long long ans = (long long)(x % p) - (long long)(y % p);
if(ans < 0) ans += p;
cout << ans % p;
}
Essentially: converting division into subtraction.
- State Compression DP
Binary state compression refers to the method of representing and storing a boolean array of length $m$ using an $m$-bit binary representation. The following bit operations can achieve the corresponding operations.
Operation | Calculation |
---|---|
Extract the $k$th bit of integer $n$ in binary representation | $(n>>k)&1$ |
Extract the last $k$ bits of integer $n$ in binary representation | $n&((1<<k)-1)$ |
Toggle the $k$th bit of integer $n$ in binary representation | $n\ xor\ (1<<k)$ |
Set the $k$th bit of integer $n$ in binary representation to 1 | $n\vert (1<<k)$ |
Set the $k$th bit of integer $n$ in binary representation to 0 | $n & (~(1<<k))$ |
This method can save a lot of space. However, when $m$ is too large, one can choose bitset from the STL.
Example problem: Shortest Hamiltonian Path (Acwing91)
"The naive algorithm": Enumerate all permutations of $n$ points, calculate the length and take the minimum, with enumeration complexity $O(n!)$, length calculation complexity $O(n)$, overall complexity $O(nn!)$. Using state compression DP can optimize this to $O(n^22^n)$.
We use an $n$-bit binary number to represent
{% mermaid %}
graph LR
A[Dynamic Programming]
A-->B[State Representation f]
A-->C[State Calculation]
{% endmermaid %}