Problem Description
Given a sequence $a_{1\sim n}$, there are $q$ operations of two types:
1 l r x: Add $x$ to $a_l \sim a_r$, where $x$ can be negative.2 l r: Find the GSS of $a_l \sim a_r$, with the option of selecting nothing (i.e., answer $\ge 0$).
Constraints
- $1 \le n, q \le 10^5$
- $|a_i|, |x| \le 10^9$
- It is guaranteed that $|a_i| \le 2 \times 10^9$ always holds.
- Time limit: $1\text{s}$, Memory limit: $64\text{MB}$
Solution
Assume $n, q$ are of the same order.
Simplified Version
First, consider global addition. This is equivalent to querying $(l, r, \Delta)$ for the GSS after adding $\Delta$ to $a_l \sim a_r$. Consider maintaining $\text{pre}$, $\text{suf}$, $\text{ans}$. Taking $\text{pre}$ as an example, let $\text{pre}_i$ denote the prefix sum of length $i$. After adding $\Delta$, the value becomes $\max\{ \Delta \times i + \text{pre}_i \}$. Treating the expression inside $\max$ as a linear function in $\Delta$, we are taking the maximum of functions. Maintain an upper convex hull of $(i, \text{pre}_i)$ and binary search for the tangent point. The same applies to $\text{suf}$.
For merging: - $\text{pre}$ and $\text{suf}$ are obtained by shifting one side and taking the corresponding $\max$. - For $\text{ans}$, first take the $\max$ of both sides. For the case crossing the midpoint, we need to add both the lengths and values from both sides, which can be done using Minkowski sum.
Build a segment tree where each node maintains $\text{sum}$ and convex hulls for $\text{pre}$, $\text{suf}$, $\text{ans}$. This can be built in $O(n \log n)$ time and space. For a query, split it into $O(\log n)$ nodes, perform binary search on each node's convex hull, and finally merge the answers, achieving $O(n \log^2 n)$.
For optimization, offline the queries and sort them by $\Delta$ in ascending order. This provides monotonicity, allowing us to scan pointers on the convex hulls, achieving $O(n \log n)$ amortized.
At this point, you've succeeded halfway. However, to pass #7495 you need to replace the segment tree with divide-and-conquer to achieve $O(n)$ space.
Back to the Original Problem
The pushup operation described above is linear and cannot be directly extended to range addition.
First, go offline. Partition the sequence into blocks, and associate each operation with all blocks it affects. Each block maintains a segment tree as described above. Then process each block's operations, keeping track of $\Delta$ as the current block addition value. Add an offset to $a_i$ to make all $\Delta$ non-negative.
For partial block modifications, extract nodes from the segment tree and apply a $\text{tag}$ for overall translation on the corresponding convex hulls. Note that pushup resets the convex hull pointers. Since pushup is $O(\text{len})$, each modification is $O(\sqrt n)$.
Since partial block modifications affect monotonicity, we only sort queries between two modifications. Note that the total length for sorting within a single block is $O(n)$, so use radix sort. For each sorted segment, first apply the modifications, then binary search on the root node's convex hull to find the pointer for the current $\Delta$. For whole block queries, simply move the pointer, achieving $O(1)$ amortized per query. For partial block queries, use the segment tree to query, still requiring binary search, giving $O(\log^2 n)$ per query.
Time complexity: $O(n\sqrt n + n \log^2 n)$, Space complexity: $O(n\sqrt n)$.
Performance Optimization
- If space is insufficient, process blocks one by one to achieve $O(n)$ space.
- Add fast I/O. Don't use
vectorfor convex hulls; use a memory pool. - Set a threshold for sorting: use
sortfor small sizes, radix sort for large sizes. - Due to the large constant factor of segment trees, direct block partitioning is unbalanced. Let the weight of each point be: the number of operations for which it is an endpoint $+1$, then perform weighted block partitioning. This achieves better balance. With block size tuned to $900$, it passes.