Nearly optimal first-order methods for convex optimization under gradient norm measure: An adaptive regularization approach

For optimization problems minimizing smooth functions, the gradient (resp., gradient mapping) norm is a fundamental optimality measure for which a regularization technique of first-order methods is known to be nearly optimal. In this paper, we report an adaptive regularization approach that attains this iteration complexity without prior knowledge of the distance from the initial point to the optimal solution set, which the existing regularization approach required to be known. To obtain further adaptive acceleration, we then apply this approach to construct a first-order method that is adaptive to the H\"olderian error bound condition (or, equivalently, the {\L}ojasiewicz gradient property), which covers a moderately wide class of applications. The proposed method attains the nearly optimal iteration complexity with respect to the gradient mapping norm.
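The regularization technique underlying this line of work admits a small sketch: to make the gradient norm of a convex $f$ small, minimize the regularized objective $f(x)+\frac{\sigma}{2}\|x-x_0\|^2$ with a plain first-order method. The toy objective, step size, and fixed $\sigma$ below are illustrative assumptions; the paper's contribution is precisely choosing $\sigma$ adaptively rather than from the (unknown) distance to the solution set.

```python
import numpy as np

def grad_f(x):
    # toy convex objective f(x) = 0.5 * ||x||^2
    return x

def regularized_descent(x0, sigma=1e-3, step=0.5, iters=300):
    """Gradient descent on f_sigma(x) = f(x) + sigma/2 * ||x - x0||^2.

    Minimizing the regularized objective yields a point whose
    f-gradient norm scales with sigma times the distance ||x0 - x*||.
    """
    x = x0.copy()
    for _ in range(iters):
        g = grad_f(x) + sigma * (x - x0)
        x -= step * g
    return x

x0 = np.full(3, 10.0)
x = regularized_descent(x0)
print(np.linalg.norm(grad_f(x)))  # small gradient norm
```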

4/10 relevant

arXiv

What do QAOA energies reveal about graphs?

The Quantum Approximate Optimization Algorithm (QAOA) is a hybrid classical-quantum algorithm to approximately solve NP optimization problems such as MAX-CUT. We describe a new application area of QAOA circuits: graph structure discovery. We omit the time-consuming parameter-optimization phase and utilize the dependence of the QAOA energy on the graph structure, for randomly or judiciously chosen parameters, to learn about graphs. In the first part, following up on Wang et al. and Brandao et al., we give explicit formulas. We show that the layer-one QAOA energy for the MAX-CUT problem for three-regular graphs carries exactly the information {\em (# of vertices, # of triangles)}. We have calculated our explicit formulas differently from \cite{wang2018quantum}, by developing the notion of the $U$ polynomial of a graph $G$; many of our discoveries can be interpreted as computing $U(G)$ under various restrictions. The most basic question when comparing the structure of two graphs is whether they are isomorphic. We find that the QAOA energies separate all non-isomorphic three-regular graphs up to size 18, all strongly regular graphs up to size 26, and the Praust and the smallest Miyazaki examples. We observe that the QAOA energy values can also be used as a proxy for how much two graphs differ. Unfortunately, we have also found a sequence of non-isomorphic pairs of graphs for which the energy gap seems to shrink at an exponential rate as the size grows. Our negative findings, however, come with a surprise: if the QAOA energies do not measurably separate two graphs, then both of their energy landscapes must be extremely flat (indistinguishable from constant), already when the number of QAOA layers is moderately large. This holds due to a remarkable uncoupling phenomenon that we have so far only deduced from computer simulation.
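As an illustration of the kind of explicit formula involved, the layer-one MAX-CUT energy of Wang et al. depends on each edge only through the endpoint degrees and the number of triangles containing it. The sketch below implements that edge-wise formula (our reading of it, not the authors' code):

```python
import math
import itertools

def qaoa1_maxcut_energy(edges, gamma, beta):
    """Layer-one QAOA energy for MAX-CUT via the explicit edge-wise
    formula of Wang et al. (2018): only the endpoint degrees and the
    number of triangles through each edge enter."""
    nodes = set(itertools.chain.from_iterable(edges))
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    energy = 0.0
    for u, v in edges:
        d = len(adj[u]) - 1          # degree of u, excluding v
        e = len(adj[v]) - 1          # degree of v, excluding u
        t = len(adj[u] & adj[v])     # triangles through edge (u, v)
        energy += 0.5
        energy += 0.25 * math.sin(4 * beta) * math.sin(gamma) * (
            math.cos(gamma) ** d + math.cos(gamma) ** e)
        energy -= 0.25 * (math.sin(2 * beta) ** 2) * (
            math.cos(gamma) ** (d + e - 2 * t)) * (1 - math.cos(2 * gamma) ** t)
    return energy

# Sanity check: on a single edge, layer-one QAOA attains the
# true MAX-CUT value 1 at gamma = pi/2, beta = pi/8.
print(qaoa1_maxcut_energy([(0, 1)], math.pi / 2, math.pi / 8))  # -> 1.0
```

For three-regular graphs the degree terms are identical for every edge, so the layer-one energy is determined by the number of vertices (hence edges) and the triangle counts, matching the (# of vertices, # of triangles) claim above.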

4/10 relevant

arXiv

Upper and Lower Bounds for Large Scale Multistage Stochastic Optimization Problems: Decomposition Methods

We consider a large scale multistage stochastic optimization problem involving multiple units. Each unit is a (small) control system. Static constraints couple the units at each stage. To tackle such large scale problems, we propose two decomposition methods, handling the coupling constraints either by prices or by resources. We introduce the sequence (one per stage) of global Bellman functions, depending on the collection of local states of all units. We show that every Bellman function is bounded above by a sum of local resource-decomposed value functions, and below by a sum of local price-decomposed value functions, each local decomposed function having as arguments the corresponding local unit state variables. We provide conditions under which these local value functions can be computed by Dynamic Programming. These conditions are established assuming a centralized information structure, that is, when the information available to each unit consists of the collection of noises affecting all the units. We finally study the case where each unit only observes its own local noise (decentralized information structure).
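The bounding mechanism can be seen on a toy one-stage, two-unit problem (all functions and numbers below are invented for illustration): relaxing the coupling constraint by a price yields a lower bound by weak duality, while fixing an a-priori split of the shared resource yields a feasible point and hence an upper bound.

```python
def local_cost(a, x):
    # local cost of one unit: a * x^2 (illustrative quadratic)
    return a * x * x

def exact_optimum(a1, a2, r):
    # minimize a1*x1^2 + a2*x2^2  s.t.  x1 + x2 = r  (closed form)
    return a1 * a2 / (a1 + a2) * r * r

def price_lower_bound(a1, a2, r, price):
    # Price decomposition: relax the coupling constraint with multiplier
    # `price`; each unit minimizes a*x^2 + price*x independently, which
    # is attained at x = -price/(2a) with value -price^2/(4a).
    return -price * price / (4 * a1) - price * price / (4 * a2) - price * r

def resource_upper_bound(a1, a2, r, r1):
    # Resource decomposition: allocate r = r1 + r2 a priori; each unit
    # solves its own problem. Any split is feasible, hence an upper bound.
    return local_cost(a1, r1) + local_cost(a2, r - r1)

a1, a2, r = 1.0, 2.0, 3.0
lb = price_lower_bound(a1, a2, r, price=-3.0)
ub = resource_upper_bound(a1, a2, r, r1=1.5)
opt = exact_optimum(a1, a2, r)
print(lb <= opt <= ub)  # -> True
```

Maximizing the lower bound over prices and minimizing the upper bound over allocations squeezes the two bounds toward the optimum, which is the coordination idea behind both decomposition schemes.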

9/10 relevant

arXiv

Upper and Lower Bounds for Large Scale Multistage Stochastic Optimization Problems: Application to Microgrid Management

The microgrid management problem is coupled both in time and in space, so that a direct resolution of the problem for large microgrids is out of reach (curse of dimensionality). By assigning prices or resources to each node in the network and solving each nodal subproblem independently by Dynamic Programming, we provide decomposition algorithms that allow a set of decomposed local value functions to be computed in parallel. By summing the local value functions together, we are able, on the one hand, to obtain upper and lower bounds for the optimal value of the problem and, on the other hand, to design global admissible policies for the original system. Numerical experiments are conducted on microgrids of different sizes, derived from data given by the research and development centre Efficacity, dedicated to urban energy transition. These experiments show that the decomposition algorithms give better results than the standard SDDP method, both in terms of bounds and of policy values. Moreover, the decomposition methods are much faster than SDDP in terms of computation time, thus making it possible to tackle problem instances incorporating more than 60 state variables in a Dynamic Programming framework.

7/10 relevant

arXiv

Convolutional Neural Network-based Topology Optimization (CNN-TO) By Estimating Sensitivity of Compliance from Material Distribution

We propose a topology optimization method that applies a convolutional neural network (CNN), a deep learning technique, to topology optimization problems. Using this method, we acquire structures with slightly higher performance that could not be obtained by the conventional topology optimization method. In particular, we solve a topology optimization problem aimed at maximizing stiffness under a mass constraint, which is a common type of topology optimization. We first formulate the conventional topology optimization by the solid isotropic material with penalization (SIMP) method. Next, we formulate the topology optimization using a CNN. Finally, we show the effectiveness of the proposed method by solving a verification example, namely a topology optimization problem aimed at maximizing stiffness. As a result of solving the verification example for a small design domain of 16x32 elements, we obtain a solution different from that of the conventional topology optimization method. This result suggests that stiffness information of a structure can be extracted for structural design by treating the density distribution as an image and analyzing it with a CNN, and hence that CNN technology can be utilized in structural design and topology optimization.
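The SIMP interpolation mentioned above is essentially a one-liner; the penalization exponent p = 3 below is the textbook default, an assumption rather than this paper's specific setting.

```python
def simp_stiffness(rho, e0=1.0, e_min=1e-9, p=3.0):
    """SIMP interpolation: Young's modulus as a function of the element
    density rho in [0, 1]. The exponent p > 1 penalizes intermediate
    densities, pushing optimized designs toward 0/1 material."""
    return e_min + (rho ** p) * (e0 - e_min)

# Intermediate density is penalized: half-density material contributes
# only about 1/8 of full stiffness when p = 3.
print(simp_stiffness(1.0), simp_stiffness(0.5))
```

The small floor `e_min` keeps void elements from making the global stiffness matrix singular in the finite element solve.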

7/10 relevant

arXiv

Distributed Online Optimization with Long-Term Constraints

We consider distributed online optimization problems, where the distributed system consists of various computing units connected through a time-varying communication graph. At each time step, each computing unit selects a constrained vector, experiences a loss equal to an arbitrary convex function evaluated at this vector, and may communicate with its neighbors in the graph. The objective is to minimize the system-wide loss accumulated over time. We propose a decentralized algorithm with regret and cumulative constraint violation in $\mathcal{O}(T^{\max\{c,1-c\}})$ and $\mathcal{O}(T^{1-c/2})$, respectively, for any $c\in (0,1)$, where $T$ is the time horizon. When the loss functions are strongly convex, we establish improved regret and constraint violation upper bounds of $\mathcal{O}(\log(T))$ and $\mathcal{O}(\sqrt{T\log(T)})$. These regret scalings match those obtained by state-of-the-art algorithms and fundamental limits in the corresponding centralized online optimization problem (for both convex and strongly convex loss functions). In the case of bandit feedback, the proposed algorithms achieve regret and constraint violation in $\mathcal{O}(T^{\max\{c,1-c/3\}})$ and $\mathcal{O}(T^{1-c/2})$ for any $c\in(0,1)$. We numerically illustrate the performance of our algorithms for the particular case of distributed online regularized linear regression problems.
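As a centralized, single-unit analogue of handling long-term constraints (a generic primal-dual sketch, not the paper's decentralized algorithm), a dual variable can accumulate constraint violation so that $g(x)\le 0$ only needs to hold on average over the horizon; the loss, constraint, and step size below are illustrative assumptions.

```python
def online_primal_dual(grad_f, g, grad_g, x0, eta=0.05, steps=2000):
    """Primal-dual online gradient method for a long-term constraint:
    the dual variable lam grows while g(x) > 0 and pushes the iterates
    back toward feasibility, enforcing the constraint on average."""
    x, lam = x0, 0.0
    for _ in range(steps):
        x = x - eta * (grad_f(x) + lam * grad_g(x))
        lam = max(0.0, lam + eta * g(x))
    return x, lam

# Toy round-invariant loss f(x) = (x - 1)^2 with constraint x <= 0.5:
x, lam = online_primal_dual(
    grad_f=lambda x: 2.0 * (x - 1.0),
    g=lambda x: x - 0.5,
    grad_g=lambda x: 1.0,
    x0=0.0,
)
print(round(x, 2))  # -> 0.5
```

The iterates settle at the constrained optimum x = 0.5 with multiplier lam = 1, the saddle point of the Lagrangian for this toy instance.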

7/10 relevant

arXiv

First order optimization methods based on Hessian-driven Nesterov accelerated gradient flow

First-order optimization methods are proposed from ODE solvers for the Hessian-driven Nesterov accelerated gradient (H-NAG) flow. It is shown that (semi-)implicit schemes can always achieve a linear rate and that explicit schemes achieve the optimal (accelerated) rates for convex and strongly convex objectives. In particular, Nesterov's optimal method is recovered from an explicit scheme for our H-NAG flow. Furthermore, accelerated splitting algorithms for composite optimization problems are also developed.
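Nesterov's optimal method, viewed as an explicit scheme, fits in a few lines; the quadratic objective and the deliberately conservative Lipschitz estimate below are illustrative assumptions, not the paper's H-NAG construction.

```python
def nesterov(grad, x0, lipschitz, iters=200):
    """Nesterov accelerated gradient: an explicit scheme whose
    continuous-time limit is an accelerated gradient flow. Uses the
    standard (k - 1)/(k + 2) momentum schedule."""
    step = 1.0 / lipschitz
    x_prev = x0
    y = x0
    for k in range(1, iters + 1):
        x = y - step * grad(y)                    # gradient step at the
        y = x + (k - 1) / (k + 2) * (x - x_prev)  # extrapolated point
        x_prev = x
    return x_prev

# f(x) = 5 * x^2 (gradient 10x), run with a conservative estimate L = 20
x = nesterov(lambda x: 10.0 * x, x0=5.0, lipschitz=20.0)
print(abs(x) < 1e-6)  # -> True
```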

4/10 relevant

arXiv

On local quasi efficient solutions for nonsmooth vector optimization

We study local quasi efficient solutions for nonsmooth vector optimization problems under new generalized approximate invexity assumptions. We formulate necessary and sufficient optimality conditions based on Stampacchia and Minty types of vector variational inequalities involving Clarke's generalized Jacobians. We also establish the relationship between local quasi weak efficient solutions and vector critical points.
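For orientation, the two classical forms of vector variational inequality referred to above can be stated in their standard smooth, single-valued form (a simplification of the Clarke-Jacobian setting of the paper): find $x^*\in K$ such that

```latex
\[
\text{(Stampacchia)}\qquad
\langle F(x^*),\, y - x^* \rangle \notin -\operatorname{int} C
\quad \forall\, y \in K,
\]
\[
\text{(Minty)}\qquad
\langle F(y),\, y - x^* \rangle \notin -\operatorname{int} C
\quad \forall\, y \in K,
\]
```

where $K$ is the feasible set and $C$ the ordering cone (componentwise order: $C=\mathbb{R}^m_+$).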

4/10 relevant

arXiv

TSSOS: A Moment-SOS hierarchy that exploits term sparsity

We consider polynomial optimization problems. We show how to exploit term (or monomial) sparsity of the input polynomials to obtain a new converging hierarchy of semidefinite programming relaxations. The novelty (and distinguishing feature) of these relaxations is that they involve block-diagonal matrices obtained by an iterative procedure performing completion of the connected components of certain adjacency graphs. The graphs are related to the terms arising in the original data and not to the links between variables. Our theoretical framework is then applied to compute lower bounds for polynomial optimization problems that are either randomly generated or come from the networked systems literature.
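A toy version of the term-sparsity pattern (only the first step, ignoring the iterative completion of connected components) can be sketched as follows: connect two candidate SOS basis monomials whenever their product lies in the polynomial's support or has all even exponents, then split the SDP blocks by connected components. The edge rule is our simplified reading, not the full TSSOS construction.

```python
from itertools import combinations

def term_sparsity_blocks(support, basis):
    """Toy term-sparsity pattern: basis monomials u, v (exponent tuples)
    are connected when u + v lies in the polynomial's support or is an
    even (square) monomial; SDP blocks = connected components."""
    def joined(u, v):
        s = tuple(a + b for a, b in zip(u, v))
        return s in support or all(c % 2 == 0 for c in s)
    parent = {b: b for b in basis}   # union-find over the basis
    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b
    for u, v in combinations(basis, 2):
        if joined(u, v):
            parent[find(u)] = find(v)
    blocks = {}
    for b in basis:
        blocks.setdefault(find(b), set()).add(b)
    return sorted(map(sorted, blocks.values()))

# f = x^4 + x^2 y^2 + y^4; candidate SOS basis {x^2, x y, y^2}
support = {(4, 0), (2, 2), (0, 4)}
basis = [(2, 0), (1, 1), (0, 2)]
print(term_sparsity_blocks(support, basis))
# blocks {x^2, y^2} and {x y}: two small SDP blocks instead of one 3x3
```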

4/10 relevant

arXiv

Active strict saddles in nonsmooth optimization

We show that, under the strict saddle property, standard first-order methods for nonsmooth optimization problems converge only to local minimizers, when randomly initialized. We argue that the strict saddle property may be a realistic assumption in applications, since it provably holds for generic semi-algebraic optimization problems.
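The intuition behind the strict saddle property can be seen on the classical example $f(x,y)=x^2-y^2$ (a smooth toy, not the paper's nonsmooth setting): gradient descent from a generic initialization is repelled from the saddle at the origin.

```python
def gradient_descent(x, y, step=0.1, iters=100):
    """Gradient descent on f(x, y) = x^2 - y^2, whose only critical
    point (0, 0) is a strict saddle: the Hessian has a negative
    eigenvalue along y, so any nonzero y-component is amplified."""
    for _ in range(iters):
        x, y = x - step * 2 * x, y + step * 2 * y
    return x, y

# A generic initialization (nonzero y) escapes the saddle ...
x, y = gradient_descent(0.5, 1e-3)
print(abs(x) < 1e-6, abs(y) > 1.0)  # -> True True
# ... while the measure-zero set y = 0 flows straight into it.
x0, y0 = gradient_descent(0.5, 0.0)
```

Random initialization lands on the attracting set of the saddle with probability zero, which is the mechanism behind the "converge only to local minimizers" claims above.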

4/10 relevant

arXiv