The 3rd Universal Cup Semifinals Analysis (Draft)

The 3rd Universal Cup Semifinals: Analysis Report

Qingyu Shi

This is the draft of the problems of the 3rd Universal Cup Semifinals. We expect to officially release a version of this document in a few months. If you find an error, please send an e-mail to [email protected] or a private message to me (@Qingyu) about it.

Summary

This year, we received 96 problem submissions from 71 different proposals all over the world. We admitted 17 problems to our pool (and used 13 of them in the end), and shortlisted 4 more problems for backup.

Congratulations to USA1 for definding their second time winning on The Universal Cup Semifinals! As the team got the highest rating in online stage and won the last semifinals and finals by a wide margin, they were considered favorites to win the semifinals. They lead the contest by solving 6 problems in the first hour, and solved 10 problems before the frozen period. They eventually secured their place by solving 11 --- 2 more than the second place. Unbelievable achievements!

Expected Difficulties (Easy to Hard)

Problem	H	F	C	G	D	E	L	M	J	A	I	B	K
Expected Solves						50	30	15	10	5	1	1	0
Actual Solves	181	130	136	104	76	84	37	24	11	4	3	0	0

Expected Medals Cut-off

	Champion	Finalists Candidate	Finalists Cut-off
Expected Solves	10	8	7
Actual Solves	11	8	?

Staff

Qingyu
Chairman
Executive Director of Contest

jiangly
Deputy Chairman
Contest Director

quailty
Chief Judge

bulijiojiodibuliduo Judge	zhoukangyang Judge	Gellyfish Judge
Crysfly Judge	w4p3r Judge	Heltion Judge
Nyaan Judge	kotatsugame Judge	risujiroh Judge
chenxinyang2006 Judge	Kevin090228 Judge

twingy Member of Live Stream Affairs	cubercsl Technical Director	Dup4 Member of Live Stream Affairs
dailongao Member of Live Stream Affairs	Tangent Member of Live Stream Affairs

A. Iridescent Universe 2

Prepared by quailty.

Additional test cases & tester solutions by bulijiojiodibuliduo, peehs_moorhsum and fjzzq2002

Binary search the answer $d$ ($0 < d < \pi$), we need to find a point such that at most $k$ great circle segments have a distance to this point of less than $d$. By expanding each great circle segment outward by a distance of $d$, we obtain $n$ sausages, and we need to find a point that lies within the interiors (excluding the boundaries) of at most $k$ of these sausages. The boundaries of the $n$ sausages divide the sphere into several regions, and we only need to find points on the boundaries of these regions.

If $d < \pi/2$, the boundary of a sausage consists of $4$ small circle segments; otherwise, it consists of $2$ small circle segments. We can extract all corresponding small circles (up to $4n$ in total), find the intersection points between these small circles, and check the validity of each intersection point. The time complexity is $O(n^3 \log(1/\epsilon))$, where $\epsilon$ is the required precision, and it can pass with careful implementation.

It is observed that we can binary search on each sausage individually. By enumerating these sausages in random order, we expect that $O(\log{n})$ sausages can lead to a better answer. We only need to binary search on these sausages and apply the aforementioned method on the boundary of each sausage. The expected time complexity is $O(n^3 + n^2 \log n \log(1/\epsilon))$.

Theoretically, we could apply the sweep line technique on the boundaries of the sausages, which yields an expected time complexity of $O(n^2 \log n + n \log^2 n \log(1/\epsilon))$. However, it is anticipated that the implementation would be too complicated to fit into a contest.

B. Queue Editor

Prepared by bulijiojiodibuliduo, jiangly and Kubic.

Analysis will be added a bit later.

C. Adjacent Add

Prepared by Nyaan.

Consider the following invariant $C$. The value of $C$ does not change through the operation.

$$C = A_1 - \frac{1}{K}A_2 + \frac{1}{K^2} A_3 + \dots + \left(-\frac{1}{K}\right)^{N-1} A_N$$

Suppose that through the operation, all values of $A$ can be made equal to $x$. In this case, the following equation holds:

$$C = x - \frac{1}{K} x + \frac{1}{K^2} x + \dots + \left(-\frac{1}{K}\right)^{N-1} x$$

Since $C$ is invariant, subtracting both expressions gives:

$$(x-A_1) - \frac{1}{K} (x-A_2) + \dots + \left(-\frac{1}{K}\right)^{N-1} (x-A_N) = 0$$

Let the left-hand side of the above equation be $G(x)$. Under the constraint $K \geq 2$, $G(x)$ can be expressed as a strictly increasing linear function. Therefore, there exists exactly one real number $x$ such that $G(x) = 0$. Moreover, if the solution $x$ satisfying $G(x)=0$ is an integer, then setting $A_1, A_2, \dots, A_{N-1}$ to $x$ in sequence will naturally make $A_N$ equal to $x$ as well, and all elements of $A$ can be made equal. Hence, the problem reduces to determining whether the solution $x$ to $G(x) = 0$ is an integer.

Whether the solution $x$ to $G(x) = 0$ is an integer can be determined using binary search. Due to the constraint $K \geq 2$, it suffices to search the range $\vert x \vert \leq 2 \times 10^9$. Furthermore, given a value of $x$, one can determine whether $G(x)$ is positive, zero, or negative in $\mathrm{O}(N)$ time. (This part is somewhat intricate and will be omitted here.) Therefore, this problem can be solved in $\mathrm{O}(N \log (\max(A)))$ time, which is sufficiently fast.

D. Circular Matching

Prepared by Nyaan.

First, we solve a subproblem. The subproblem is to compute the weight of the minimum weight perfect matching between balls of different colors when $2M$ balls are evenly placed on a circle, with half of them painted red and the other half blue. (Here, the weight is defined as the shortest distance between two balls.) We can prove the following proposition:

The answer to the subproblem is the same as the answer obtained when the circle is cut open at an appropriate position and reduced to a problem on a line.

Proof. Consider the circle as a cycle graph with $N$ vertices and $N$ edges. Take one optimal solution of the subproblem and denote it by $(u_1,v_1), (u_2,v_2), \dots, (u_M,v_M)$.

We define a \textbf{crossing matching} as a set of four balls $(w_1, w_2, w_3, w_4)$ that satisfy the following conditions:

When starting from $w_1$ and listing vertices counterclockwise, $w_1, w_2, w_3, w_4$ appear in this order along the circumference. (They do not need to be adjacent.)
$w_1$ is matched with $w_3$, and $w_2$ is matched with $w_4$.

Take one crossing matching. Assume the color of $w_1$ is red; thus, the color of $w_3$ is blue. If we assume that $w_2$ is blue, then replacing the matching by pairing $w_1$ with $w_2$ and $w_3$ with $w_4$ would reduce the total weight, leading to a contradiction. Hence, $w_2$ must also be red. In this case, replacing the matching with $w_1$ with $w_4$ and $w_2$ with $w_3$ decreases or maintains the solution weight while reducing the number of crossing matchings. Therefore, as long as crossing matchings exist, we can apply this operation to remove them, proving that there exists an optimal solution with no crossing matchings.

Now, let $(u_1,v_1), (u_2,v_2), \dots, (u_M,v_M)$ be such an optimal solution without crossing matchings. For each $(u_i,v_i)$, take one shortest path between $u_i$ and $v_i$ and color the edges along this path. If there exists at least one uncolored edge, we can cut the circle open at that position and reduce the problem to one on a line. Moreover, when there are no crossing matchings, we can confirm that at least one such uncolored edge must exist. Thus, we have proven that the problem can always be reduced to a problem on a straight line by appropriately cutting the circle open. \hfill $\square$

Therefore, let us first consider a simpler version of the subproblem: computing the minimum weight matching when balls $1, 2, \dots, 2M$ are placed evenly on a straight line. The answer to this problem is straightforward and can be obtained in linear time using the following cumulative sum algorithm:

Set $s_0 = 0$, and for $i=1,2,\dots,N$, compute $s_i = s_{i-1} + (1\text{ if (i-th ball is red) else } -1)$. Then, $\sum_{i=1}^N \vert s_i \vert$ is the answer.

Returning to the original subproblem, we can derive a formula for the answer using the properties of cumulative sums:

Set $s_0 = 0$, and for $i=1,2,\dots,N$, compute $s_i = s_{i-1} + (1\text{ if (i-th ball is red) else } -1)$.
Choose an integer $x$ such that $\min(s) \leq x \leq \max(s)$.
The answer is $\displaystyle \min_x \sum_{i=1}^N \vert s_i - x \vert$.

The value of $x$ that minimizes $\displaystyle \min_x \sum_{i=1}^N \vert s_i - x \vert$ is the $M$th largest value among $s_1, s_2, \dots, s_{2M}$. Substituting this value into the formula and simplifying, the answer becomes:

$$\Bigl(\text{sum of the largest $M$ values among } s_1, s_2, \dots, s_{2M}\Bigr) \ - \ \Bigl(\text{sum of the smallest $M$ values among } s_1, s_2, \dots, s_{2M}\Bigr).$$

Since we have simplified the subproblem, the remaining task is to handle queries. This is a typical query problem, and one possible solution uses parallel binary search combined with a Fenwick Tree to achieve a time complexity of $\mathrm{O}(Q \log^2 N)$, which is fast enough. (Alternatively, there are solutions using a wavelet matrix or combining Mo's algorithm with square root decomposition of the array. As long as you construct an algorithm with sufficiently good complexity, you can solve this problem.)

E. Circular Convolution 2

Prepared by Kubic.

Analysis will be added a bit later.

F. Even Circuit

Prepared by Nyaan.

Let us consider a subproblem: determining whether $K=4$ can be the answer. This is equivalent to determining whether there exists a quadruple of distinct integers $(x, y, z, w)$ such that $A_i$ are all distinct and

$$A_{x} \oplus A_{y} \oplus A_{z} \oplus A_{w} = 0.$$

This condition is equivalent to:

$$A_{x} \oplus A_{y} \oplus A_{z} \oplus A_{w} = 0 \iff A_x \oplus A_y = A_z \oplus A_w.$$

Therefore, this problem can be solved in $\mathrm{O}(N + \max(A))$ time by using the pigeonhole principle and examining all pairs $(A_i, A_j)$.

This approach can be generalized for arbitrary $K$. The condition that the answer is at most $K$ is equivalent to the existence of at least two sets of $\frac{K}{2}$ indices such that the XOR sum of $A_i$ over each set is equal. Hence, this problem can also be solved using the pigeonhole principle in the general case.

Given the constraint $A_i < 2^{22}$, we can observe that the answer is at most $24$. By performing a suitably bounded search, the time complexity can be reduced to approximately $\mathrm{O}((N + \max(A)) \log \max(A))$, which is efficient enough to solve the problem.

G. Master of Cards

Prepared by Nyaan.

We construct a graph $G$ with $3N$ vertices numbered from $1$ to $3N$ and $3N$ edges, where each edge is assigned a color, using the following procedure:

For each $1 \leq i \leq N$, perform the following steps:
- Add an edge of color $i$ between vertex $a_i$ and vertex $N + b_i$.
- Add an edge of color $i$ between vertex $N + b_i$ and vertex $2N + c_i$.
- Add an edge of color $i$ between vertex $2N + c_i$ and vertex $a_i$.

This construction makes the original problem equivalent to a task on graph $G$, which consists of performing the following operation as many times as possible. (The validity of this transformation can be shown with a brief argument, so we omit the proof.)

Select two edges that share a vertex, remove them, and output the pair of colors of the removed edges.

The reformulated problem can be solved in $\mathrm{O}(N)$ time using a graph traversal algorithm such as depth-first search. Therefore, the problem can be solved in $\mathrm{O}(N)$ time, which is sufficiently fast.

H. Shortcut on Tree

Prepared by kotatsugame and Nyaan.

Color all vertices except the root recursively in red and blue from the leaves according to the following rules:

If a vertex is a leaf, color it red.
If a vertex is not a leaf, color it blue if it has any red children; otherwise, color it red.

Then, construct the $N-1$ directed edges according to the following steps:

Draw a directed edge from each red vertex to the root.
Draw a directed edge from the root to each blue vertex.

This is the solution. Its correctness can be justified by confirming the following:

From any vertex, the root can be reached by traversing at most $2$ edges, and
From the root, any vertex can be reached by traversing at most $2$ edges.

I. Ethanol

Prepared by risujiroh.

As a subproblem, consider the same setting but with an added constraint: in each operation, the amount of mixture transferred must be exactly $\Delta$. (Taking the limit $\Delta \to 0$ yields the original problem's solution.)
Let's focus on the operations involving container 1. We can identify a total of $2X / \Delta$ such operations:

$X / \Delta$ times, a mixture flows from container 0 into container 1. The concentration of the mixture being transferred is always 0, but for future purposes, we impose the condition that the concentration of the mixture transferred must be non-increasing over time.
$X / \Delta$ times, a mixture flows from container 1 into container 2. Again, the concentration of the mixture transferred must be non-increasing over time.

Now consider the order of these $2X / \Delta$ operations. We can prove that the following two types of operations worsen the final result and should be avoided:

When container 1 contains a mixture of concentration $d$ and mass at least $1 + \Delta$, transferring mixture of concentration $d' < d$ from container 0 into container 1.
When container 1 contains a mixture of concentration $d$ and mass at least $1 + \Delta$, and it is possible to transfer mixture of concentration $d' > d$ from container 0, transferring mixture from container 1 into container 2 instead.

Therefore, we may ban these two types of operations. As a result, the order of the $2X / \Delta$ operations becomes uniquely determined, and can be described as follows:

While the concentration of the mixture coming from container 0 is greater than or equal to the concentration of the mixture currently in container 1, transfer mixture from container 0 to container 1.
At the moment when the above inequality reverses, transfer as much mixture as possible from container 1 to container 2.
Then, repeatedly alternate the two operations: (transfer from container 0 to container 1) and (transfer from container 1 to container 2).

Now that we understand the timing of inflow and outflow of mixtures around container 1, let's consider container 2. Due to the non-increasing concentration condition of transferred mixtures, we can apply the same reasoning. Repeating this process for containers 3, 4, $\dots$, up to container $N$, we can compute everything sequentially.

Since we understand the structure of the solution to the subproblem, we can now take the limit as $\Delta \to 0$ to solve the original problem. This involves:

Managing the concentration changes using functions derived from differential equations, and
Using binary search to find the switching points where the concentration order changes.

In the end, the entire process can be modeled using a linear combination of constants and functions of the form $x^n e^{-x}$. The time complexity is $\mathrm{O}(N^3 \times \text{(binary search cost)})$, which is sufficiently fast.

J. International Olympiad in Fishing

Prepared by Gellyfish.

Comments from the Chair: This problem was originally proposed to APIO (Asia Pacific Informatics Olympiad)... Huh.

The first press of the button is obviously symmetrical. So let's assume that the first button pressed is the one labelled "Rank". It's not hard to see that pressing the button labelled "Rank" again is ineffective, so the next time we will press the one labelled "Suit". Similarly, we can conclude that we will press the button labelled "Rank" for the third time.

We can prove that subsequent operations can't get any more information. Because we assume that the permutaion formed by the indices of the cards after we press the first three buttons is $p,q,r$ in order. They correspond to the following three sorted results, respectively:

$p$ is sort by "Rank" as the first keyword and index as the second keyword
$q$ is sort by "Suit" as the first keyword, "Rank" as the second keyword and index as the third keyword
$r$ is sort by "Rank" as the first keyword, "Suit" as the second keyword and index as the third keyword

So if we click back and forth between the button labelled "Rank" and the button labelled "Suit", the permutaions formed by the indices of the cards will be: $p, q, r, q, r, q, r\dots$; It also confirms that subsequent operations are meaningless.

In fact at this point we can transform the problem into counting the number of all the possible $(p, q, r)$. Because if there are $L$ different kinds of $(p, q, r)$, then the answer will be $\frac L {N^{2N}}$. Because for each possible $(p, q, r)$, suppose there are $x$ initial situations that could result in this outcome, then we can calculate the probability of this outcome occurring as $\frac x {N^{2N}}$; And we can't know which of these $x$ initial situations it is when we answer, so we can only choose one of them at random, and the probability of getting it right is $\frac 1 x$. it's easy see that $\frac x {N^{2N}} \times \frac 1 x = \frac 1 {N^{2N}}$, thus each possible $(p, q, r)$ contributes $\frac 1 {N^{2N}}$ to the answer, so of course the answer is $\frac L {N^{2N}}$.

Also we have a $O(N^{2N+1} \log N)$ solution now, we can enumerate each initial situation and simulate the $(p, q, r)$ that would be obtained, and just see how many different results there are!

We have transformed the problem into counting the number of all the possible $(p, q, r)$. So let's try to solve the problem now.

Since $p, q$ are relatively independent of each other and $p, r$ are more closely related to each other, we try to count how many possible $r$ there are for each pair $(p, q)$ and sum up.

First we have the following properties:

For each position that $p_i > p_{i+1}$, we split $p$ here, and it is easy to see that it will be splited into a number of contiguous segments, It is easy to see that the exchange of positions only occurs independently within each segment, because $p_i > p_{i+1}$ means this two cards have different ranks.

This reveals that we have to split $p$ into a number of segments, and then sort $q$ as the keyword within each segment to get $r$.

Now we have a $O((N!)^2 \times 2^N \times N^2)$ solution: We enumerate all $(p, q)$ and the way to split $p$, and see how many different $r$ can be obtained.

We consider how to split $p$ so that $r$ is not duplicated nor omitted. For ease of illustration, for two arrays $X, Y$, we write $X+Y$ to denote the array obtained by directly splicing $X$ before $Y$.

Let's give our final conclusions first:

For each postion $p_i > p_{i+1}$, we must split $p$ in the middle.
For other positions which be splited, let the array corresponding to the front and back segments be $A, B$ respectively, and they become $A'$ and $B'$ when sorted by $q$. Then $A+B$ must not be $A'+B'$ when sorted by $q$, otherwise it means we can just not split this position.

For the $p_i > p_{i+1}$ position is obvious, now let's give the proof of the other partition:

First of all, it's not hard to find that every possible $r$ can be counted.
Next we show that each $r$ is not counted more than once. For different two split cases, assume that the set of positions they split is $\bar{A}, \bar{B}$ respectively, we can conclude that only all splitting positions within $\bar{A} \cap \bar{B}$ can exist, for other positions, the results will be the same when the two segments before and after are merged and sorted, It contradicts our second limitation.

The second restriction is still not formal enough, let us denote by $q^{-1}_i$ the position of $i$ in $q$, We can translate the second restriction into:

For other positions which be splited, let the array corresponding to the front and back segments be $A, B$, we should have $\max_{i \in A} q^{-1}_i > \min_{j \in B} q^{-1}_j$.

Now we have a $O(N! \times N^3)$ solution: We enumerate all $p$ and use Dynamic Programming.

Consider swapping the order of summation, We first enumerate the way to split $p$, and then count how many possible $q^{-1}$ and $(p, r)$. Now the problem is much easier:

If we split $p$ into $m$ segments and the set of indices in the $k$-th segment is $S_k$. For all $k>1$, we should have $\max_{i \in S_{k-1}} i> \min_{j \in S_k} j$ or $\max_{i \in S_{k-1}} q^{-1}_i> \min_{j \in S_k} q^{-1}_j$.

It's not hard to find that for each $[S_1, S_2, \dots, S_m]$ and $q^{-1}$, We can uniquely determine that $(p, r)$. Because $p$ and $r$ are the result of sorting directly by index within each segment and by $q^{-1}$ within each segment, respectively.

Let $T_k = \{q^{-1}_i | i \in S_k\}$, the restriction can be written as $\max(S_{k-1})> \min(S_k)$ or $\max(T_{k-1})> \min(T_k)$.

We consider that instead of enumerating $q^{-1}$, we now can only enumerate $S$ and $T$.

For each $[(S_1, T_1), (S_2, T_2), \dots, (S_m, T_m)]$, there are $\prod_{i=1}^m |S_i|!$ possible $q$, and $S, T$ have to satisfy the following restrictions:

$\forall i \in [1, m], |S_i| = |T_i|$
$\forall 1 \leq i < j \leq m, S_i \cap S_j = T_i \cap T_j = \varnothing$
$\cup_{i=1}^K S_i = \cup_{i=1}^K T_i = \{1, 2, 3, \dots, N\}$
$\forall i \in [2, m], (\max(S_{i-1}) > \min(S_i)) \operatorname{or} (\max(T_{k-1})> \min(T_k))$

Now we can solve the problem with Principle of Inclusion-Exclusion and Dynamic Programming.

Let $g_n$ means the number of $S, T$ that $\cup_{i=1}^m S_i = \cup_{i=1}^m T_i = \{1, 2, 3, \dots, n\}$ and satisfies the other conditions.

$f_n$ means there are some contiguous segments that $\max(S_{k-1}) < \min(S_k)$ and $\max(T_{k-1}) < \min(T_k)$ for each interval, and the total length is $n$.

$f_0 = g_0 = 1$
$f_i = -\sum_{j=1}^i f_{i-j} \times j!$
$g_i = -\sum_{j=1}^i g_{i-j} \times f_j \times \binom i j^2$

$g_N$ is the answer we need. And now we have solved the problem in $O(N^2)$ time complexity. We can achieve a time complexity of $O(N \log N)$ by Fast Fourier Transform and Polynomial Inverse.

K. Degree Sequence 3

Prepared by bulijiojiodibuliduo, Qingyu, zhoukangyang, and Crysfly.

Comments from the Chair: This task was originally purposed to The 10th China Collegiate Programming Contest Finals. When it was proposed, we only have an $O(n^3)$ solution. Kangyang then optimized the solution to an amazing $O(n \log n)$ time, but the solution was based on some intuitions and we cannot prove the correctness of the greedy solution. After CCPC Finals, we reconsidered the problem and decided to fix all the proofs before the semifinals. Our judges spent weeks of intensive discussions and finally finished all the details of the solutions. However, the $O(n \log n)$ solution Kangyang proposed is too difficult to implement. Later, the judges came up with an easier $O(n \log^2 n)$ solution, which is much easier to implement.

In practical, we do not expect any efficiency difference between those two solutions. Both solutions should pass without any issues.

Analysis will be added a bit later.

L. Bot Friends 2

Prepared by Qingyu and Kubic.

Analysis will be added a bit later.

M. Traveling in Cells 3

Prepared by Qingyu and w4p3r.

Analysis by Milmon

Comments from the Chair: This task was originally purposed to The 2nd Universal Cup Finals. It was waitlisted but did not used in the end. This is one of the earliest proposals we have for The 3rd Universal Cup Semifinals pool, and it was later accepted to serve as an easy and lightweight implementation task.

The author's original solution is in $O(n \log n \log V)$, where $V$ is the range of the coordinates. Our judges later found a solution in $O(n \log n)$, which is much faster and easier to implement. When setting the TL, I decided to set it to 4 seconds, which is about 20 times of our $O(n \log n)$ solution. Therefore, any $O(n \log n)$ solution can pass the problem easily, while $O(n \log n \log V)$ solutions can barely pass by constant optimizations.

Let's just assume that $b \leq a$. The minimum cost between two points is clearly achieved by first moving to the nearest two points congruent to the destination's coordinate modulo $a$, and then jumping to the destination. Thus, we consider partitioning the number axis into blocks of length $a$.

It can be easily proven by the adjustment method: the optimal destination must lie in the block containing the median or the two adjacent blocks. By enumerating these blocks, the problem reduces to selecting a point within a block such that the total minimum cost for moving the $n$ points to this point is minimized.

Suppose we are considering points in the block $[L, L + a]$ as potential destinations. For each given point $x_i$ among the $n$ points, we analyze its contribution. The current block is divided at $r = x_i \bmod a$ into two parts. The left part will only choose to move to either $L + r$ or $L + r - a$, while the right part will only choose to move to either $L + r$ or $L + r + a$. For both parts, it is easy to determine the breakpoint between choosing the left or right option. Thus, the contribution of each given point can be described by $O(1)$ piecewise linear functions.

Finally, we only need to process the obtained $O(n)$ linear functions offline by sorting the breakpoints and handling them efficiently.

The time complexity is $O(n \log n)$.

Qingyu✨ 的網誌

網誌

The 3rd Universal Cup Semifinals: Analysis Report

Summary

Staff

A. Iridescent Universe 2

B. Queue Editor

C. Adjacent Add

D. Circular Matching

E. Circular Convolution 2

F. Even Circuit

G. Master of Cards

H. Shortcut on Tree

I. Ethanol

J. International Olympiad in Fishing

K. Degree Sequence 3

L. Bot Friends 2

M. Traveling in Cells 3

評論

發表評論