Blue Book 0x01 - histcat

Bit Manipulation#

Function#

The underlying implementation of computers is binary. Mastering bit manipulation can improve the time and space efficiency of program execution.

Easter Egg#

The preface of the blue book numbered 0xFF is -1, and the postscript is 0x7F which is 127 (under the condition of signed 8-bit binary numbers).

Preliminary Understanding#

And: and, &, ∧ (can be remembered in conjunction with set operations)
Or: or, |, ∨
Not: not, ~, ¬
Xor: xor, ^

In an $m$-bit binary number, the lowest bit is usually the $0$th bit, incrementing from right to left, and the highest bit is the $(m-1)$th bit.

Storage and Operations of Integers#

Below, we introduce the example of 32-bit binary numbers, specifically int and unsigned int in C++.

Two's Complement#

unsigned int
Directly treat encoding C as a 32-bit binary number S.
int
The highest bit is the sign bit, where 0 indicates non-negative numbers and 1 indicates negative numbers.
For sign bit 0, directly treat encoding C as a 32-bit binary number S.
For sign bit 1, define the value represented by the bitwise negation of encoding C, ~C, as $-1-S$.

It can be observed that under two's complement, each value has a unique representation method, and performing addition or subtraction on any two values is equivalent to performing binary addition or subtraction under 32-bit two's complement with no carry from the highest bit (meaning if the highest bit overflows, it is ignored, like 1+1=0, but the carry from the second highest bit must be considered, similar to this). Upon overflow, a 32-bit unsigned integer effectively takes modulo $2^{32}$. This also explains the phenomenon of negative numbers appearing when signed integers overflow arithmetically. (However, I don't quite understand this; can someone help clarify?)

One's Complement#

Skipped, it's just one less than two's complement.

Simple Representation#

Using binary to represent an int requires 32 bits, which is cumbersome. Using decimal does not express each bit of two's complement, so hexadecimal is commonly used to represent a constant. This way, only 8 characters are needed.
Common value: 0x3F3F3F3F (doubling will not overflow, each byte is the same, convenient for assignment).

Shift Operations#

Left shift: In binary representation, move the number to the left, filling low bits with 0, and discarding high bits that overflow.

1 << n = 2^n

n << 1 = 2 \times n

Arithmetic right shift: In binary two's complement representation, move the number to the right, filling high bits with the sign bit, and discarding low bits that overflow. This is equivalent to dividing by 2 and rounding down. ($(-3)>>1=-2$)
Logical right shift: In binary two's complement representation, move the output to the right, filling high bits with 0, and discarding low bits that overflow.
Most compilers use arithmetic right shift.

Applications#

Calculate $a^b \mod p$

According to mathematical knowledge, every number can be uniquely represented as a sum of distinct powers of 2 (since any integer can be converted to binary), meaning if $b$ has $k$ bits in its binary representation, where the $i (0\leq i\leq k)$th bit is $c_i$, then

b=c_{k-1}2^{k-1}+c_{k-2}2^{k-2}+......+c_02^0

Thus,

a^b=a^{c_{k-1}*2^{k-1}}*a^{c_{k-2}*2^{k-2}}*......*a^{c_0*2^0}

We can write the following code:

#include<bits/stdc++.h>

using namespace std;

int main()
{
    int a = 0;
    int b = 0;
    int p = 0;
    int ans = 1;
    cin >> a >> b >> p;
    for(;b > 0;b >>= 1)
    {
        if(b & 1)
        {
            ans = (long long)ans * a % p;
        }
        a = (long long)a * a % p;
    }
    cout << ans % p;
    return 0;
}

In C++, when two values perform arithmetic operations, the highest value type involved in the operation serves as the basis, regardless of the type used to store the result.

However, there is a problem here. The highest built-in integer type in C++ is 64 bits, so operations like $a*b \mod p$ at the $10^{18}$ level require special handling.

Calculate $a*b \mod p$ ($1\leq a,b,p\leq 10^{18}$)

Method 1: Similar to the idea of fast exponentiation.

b=c_{k-1}2^{k-1}+c_{k-2}2^{k-2}+......+c_02^0

Then,

a*b=c_{k-1}*a*2^{k-1}+c_{k-2}*a*2^{k-2}+......+c_0*a*2^0

This way, it won't exceed 64 bits. The code is as follows:

#include<bits/stdc++.h>
using namespace std;
#define ll long long
int main()
{
    ll a = 0, b = 0, p = 0;
    cin >> a >> b >> p;
    ll ans = 0;
    for(;b > 0;b >>= 1)
    {
        if(b & 1)
        {
            ans = (ans + a) % p;
        }
        a = 2 * a % p;
    }
    cout << ans % p;
}

Method 2:
Use $ab \mod p=ab-\lfloor a*b/p\rfloor p$. Let $c=\lfloor ab/p\rfloor$.
How to calculate $c$? Use floating-point numbers (long double). When the precision of floating-point numbers is insufficient to retain accurate digits, lower bits will be discarded. Thus, we get $c._\ ____$.

(The role of long double is to allow $a*b$ to be calculated successfully without overflow errors; if unsigned long long or long long were used, it wouldn't compute correctly, let alone the errors. Even though long double may have errors, in the multiplication of 64-bit numbers, the maximum would be 128 bits, and the resulting error would not be significant, allowing for the correct result.)

Next, we just need to calculate $ab-cp$. Since $ab-cp$ is actually $ab \mod p$, we have $ab-c*p \leq p < 2^{64}$, thus

a*b-c*p=(a*b-c*p)\mod 2^{64}=(a*b)\mod 2^{64} - (c*p) \mod 2^{64}

Taking modulo $2^{64}$ is the natural overflow of unsigned long long, and we can compute the final result using long long (to prevent negative results).

The code is as follows:

#include<bits/stdc++.h>

using namespace std;

typedef unsigned long long ull;
int main()
{
    ull a, b, p;
    cin >> a >> b >> p;
    a %= p, b %= p;
    ull c = (long double)a * b / p;
    ull x = a * b, y = c * p;
    long long ans = (long long)(x % p) - (long long)(y % p);
    if(ans < 0) ans += p;
    cout << ans % p;
}

Essentially: converting division into subtraction.

State Compression DP

Binary state compression refers to the method of representing and storing a boolean array of length $m$ using an $m$-bit binary representation. The following bit operations can achieve the corresponding operations.

Operation	Calculation
Extract the $k$th bit of integer $n$ in binary representation	$(n>>k)&1$
Extract the last $k$ bits of integer $n$ in binary representation	$n&((1<<k)-1)$
Toggle the $k$th bit of integer $n$ in binary representation	$n\ xor\ (1<<k)$
Set the $k$th bit of integer $n$ in binary representation to 1	$n\vert (1<<k)$
Set the $k$th bit of integer $n$ in binary representation to 0	$n & (~(1<<k))$

This method can save a lot of space. However, when $m$ is too large, one can choose bitset from the STL.

Example problem: Shortest Hamiltonian Path (Acwing91)
"The naive algorithm": Enumerate all permutations of $n$ points, calculate the length and take the minimum, with enumeration complexity $O(n!)$, length calculation complexity $O(n)$, overall complexity $O(nn!)$. Using state compression DP can optimize this to $O(n^22^n)$.

We use an $n$-bit binary number to represent

{% mermaid %}

graph LR

A[Dynamic Programming]

A-->B[State Representation f]

A-->C[State Calculation]

{% endmermaid %}