Introduction to Nonsmooth Analysis and **Optimization**

**optimization** **problems** that arise in inverse problems, imaging, and PDE-constrained **optimization**. They cover convex subdifferentials, Fenchel duality, monotone operators and resolvents, Moreau--Yosida regularization, as well as Clarke and (briefly) limiting subdifferentials. Both first-order (proximal point and splitting) methods and second-order (semismooth Newton) methods are treated. The required background from functional analysis and calculus of variations is also briefly summarized.
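The proximal point and splitting methods surveyed here all reduce to evaluating proximal operators. As a concrete reference point, a minimal forward-backward (proximal gradient) sketch using the closed-form prox of the $\ell_1$ norm on a toy least-squares problem; all names and numbers are illustrative, not from the notes:

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t*||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(grad_f, prox_g, x0, step, n_iter=2000):
    """Forward-backward splitting: x+ = prox_{step*g}(x - step*grad_f(x))."""
    x = x0
    for _ in range(n_iter):
        x = prox_g(x - step * grad_f(x), step)
    return x

# Toy LASSO instance: min_x 0.5*||Ax - b||^2 + lam*||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = A @ np.array([1.0, 0.0, -2.0, 0.0, 0.0]) + 0.01 * rng.standard_normal(20)
lam = 0.5
grad_f = lambda x: A.T @ (A @ x - b)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the smooth part
x_hat = proximal_gradient(grad_f, lambda v, t: prox_l1(v, lam * t),
                          np.zeros(5), step)
```

The Moreau--Yosida machinery discussed in the notes is exactly what makes the nonsmooth $\ell_1$ term tractable in this iteration.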

5/10 relevant

arXiv

Model Inversion Networks for Model-Based **Optimization**

**optimization** **problems**, where the goal is to find an input that maximizes an unknown score function given access to a dataset of inputs with corresponding scores. When the inputs are high-dimensional and valid inputs constitute a small subset of this space (e.g., valid protein sequences or valid natural images), such model-based **optimization** **problems** become exceptionally difficult, since the optimizer must avoid out-of-distribution and invalid inputs. We propose to address such **problems** with model inversion networks (MINs), which learn an inverse mapping from scores to inputs. MINs can scale to high-dimensional input spaces and leverage offline logged data for both contextual and non-contextual **optimization** **problems**. MINs can also handle both purely offline data sources and active data collection. We evaluate MINs on tasks from the Bayesian **optimization** literature, high-dimensional model-based **optimization** **problems** over images and protein designs, and contextual bandit **optimization** from logged data.

7/10 relevant

arXiv

Stochastic Recursive Variance Reduction for Efficient Smooth Non-Convex
Compositional **Optimization**

**optimization** arises in many important machine learning tasks such as value function evaluation in reinforcement learning and portfolio management. The objective function is the composition of two expectations of stochastic functions, and is more challenging to optimize than vanilla stochastic **optimization** **problems**. In this paper, we investigate stochastic compositional **optimization** in the general smooth non-convex setting. We employ a recently developed idea of \textit{Stochastic Recursive Gradient Descent} to design a novel algorithm named SARAH-Compositional, and prove a sharp Incremental First-order Oracle (IFO) complexity upper bound for stochastic compositional optimization: $\mathcal{O}((n+m)^{1/2} \varepsilon^{-2})$ in the finite-sum case and $\mathcal{O}(\varepsilon^{-3})$ in the online case. This complexity is known to be the best among IFO complexity results for non-convex stochastic compositional optimization, and is believed to be optimal. Our experiments validate the theoretical performance of our algorithm.
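For context, the \textit{Stochastic Recursive Gradient Descent} (SARAH) estimator referenced here maintains $v_k = \nabla f_{i_k}(x_k) - \nabla f_{i_k}(x_{k-1}) + v_{k-1}$, with a full-gradient restart at each epoch. A minimal non-compositional sketch on a finite-sum least-squares problem (illustrative only; the paper's SARAH-Compositional additionally estimates the inner expectation):

```python
import numpy as np

def sarah(grads, full_grad, x0, step, epochs=30, inner=None, seed=0):
    """Minimal SARAH loop for min_x (1/n) sum_i f_i(x).

    grads: list of per-sample gradient functions grad f_i.
    full_grad: gradient of the full objective (used at epoch starts).
    """
    rng = np.random.default_rng(seed)
    n = len(grads)
    inner = inner or n
    x = x0
    for _ in range(epochs):
        v = full_grad(x)            # full gradient at the epoch start
        x_prev = x
        x = x - step * v
        for _ in range(inner):
            i = rng.integers(n)
            # recursive update: v_k = grad f_i(x_k) - grad f_i(x_{k-1}) + v_{k-1}
            v = grads[i](x) - grads[i](x_prev) + v
            x_prev = x
            x = x - step * v
    return x

# Toy finite-sum least squares: f_i(x) = 0.5*(a_i.x - b_i)^2
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 3))
x_star = np.array([1.0, -1.0, 2.0])
b = A @ x_star
grads = [lambda x, a=A[i], bi=b[i]: a * (a @ x - bi) for i in range(50)]
full_grad = lambda x: A.T @ (A @ x - b) / 50
x_hat = sarah(grads, full_grad, np.zeros(3), step=0.05)
```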

5/10 relevant

arXiv

FEM Based Preliminary Design **Optimization** in Case of Large Power Transformers

**optimization** **problems**. Most published algorithms use a copper filling factor based winding model to calculate the main dimensions of the transformer during this first, preliminary design step. These cost **optimization** methods therefore do not consider the detailed winding layout or the conductor dimensions. However, knowledge of the exact conductor dimensions is essential to calculate the thermal behaviour of the windings and to make a more accurate stray loss calculation. The paper presents a novel, evolutionary algorithm-based transformer **optimization** method which can determine the optimal conductor shape for the windings during this preliminary design stage. The accuracy of the presented FEM method was tested on an existing transformer design. The results of the proposed **optimization** method were then compared with a validated transformer design **optimization** algorithm.

4/10 relevant

Preprints.org

Nearly optimal first-order methods for convex **optimization** under
gradient norm measure: An adaptive regularization approach

**optimization** **problems** minimizing smooth functions, the gradient (resp., gradient mapping) norm is a fundamental optimality measure for which a regularization technique of first-order methods is known to be nearly optimal. In this paper, we report an adaptive regularization approach attaining this iteration complexity without prior knowledge of the distance from the initial point to the optimal solution set, which was required in the existing regularization approach. Second, to obtain even faster convergence adaptively, we apply this approach to construct a first-order method that is adaptive to the H\"olderian error bound condition (or equivalently, the {\L}ojasiewicz gradient property), which covers a moderately wide class of applications. The proposed method attains the nearly optimal iteration complexity with respect to the gradient mapping norm.

4/10 relevant

arXiv

What do QAOA energies reveal about graphs?

The Quantum Approximate **Optimization** Algorithm (QAOA) is a hybrid classical-quantum algorithm to approximately solve NP **optimization** **problems** such as MAX-CUT. We describe a new application area of QAOA circuits: graph structure discovery. We omit the time-consuming parameter-**optimization** phase and utilize the dependence of the QAOA energy on the graph structure, for randomly or judiciously chosen parameters, to learn about graphs. In the first part, following up on Wang et al. and Brandao et al., we give explicit formulas. We show that the layer-one QAOA energy for the MAX-CUT **problem** for three-regular graphs carries exactly the information {\em (# of vertices, # of triangles)}. We have calculated our explicit formulas differently from \cite{wang2018quantum}, by developing the notion of the $U$ polynomial of a graph $G$. Many of our discoveries can be interpreted as computing $U(G)$ under various restrictions. The most basic question when comparing the structure of two graphs is whether they are isomorphic. We find that the QAOA energies separate all non-isomorphic three-regular graphs up to size 18, all strongly regular graphs up to size 26, and the Praust and the smallest Miyazaki examples. We observe that the QAOA energy values can also be used as a proxy for how much graphs differ. Unfortunately, we have also found a sequence of non-isomorphic pairs of graphs for which the energy gap seems to shrink at an exponential rate as the size grows. Our negative findings, however, come with a surprise: if the QAOA energies do not measurably separate two graphs, then both of their energy landscapes must be extremely flat (indistinguishable from constant), already when the number of QAOA layers is intermediately large. This holds due to a remarkable uncoupling phenomenon that we have only deduced from computer simulation.

4/10 relevant

arXiv

Upper and Lower Bounds for Large Scale Multistage Stochastic
**Optimization** **Problems**: Decomposition Methods

**optimization** **problem** involving multiple units. Each unit is a (small) control system. Static constraints couple the units at each stage. To tackle such large scale problems, we propose two decomposition methods, handling the coupling constraints either by prices or by resources. We introduce the sequence (one per stage) of global Bellman functions, depending on the collection of local states of all units. We show that every Bellman function is bounded above by a sum of local resource-decomposed value functions, and below by a sum of local price-decomposed value functions, each local decomposed function having as arguments the corresponding local unit state variables. We provide conditions under which these local value functions can be computed by Dynamic Programming. These conditions are established assuming a centralized information structure, that is, when the information available for each unit consists of the collection of noises affecting all the units. We finally study the case where each unit only observes its own local noise (decentralized information structure).
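The upper/lower-bound mechanism described here can be illustrated on a toy static analogue: for two units coupled by one resource constraint, any fixed resource split yields an upper bound, and any price (Lagrange multiplier) yields a lower bound via the Lagrangian. A hedged sketch, with all costs and numbers purely illustrative:

```python
import numpy as np

def unit_cost(u, c):
    """Local cost of one unit (toy quadratic)."""
    return c * (u - 1.0) ** 2

# Two units coupled by a single static constraint u1 + u2 = R.
c1, c2, R = 1.0, 2.0, 3.0
grid = np.linspace(-2.0, 5.0, 701)   # brute-force grid for the 1-D minimizations

# Exact optimal value (substituting u2 = R - u1)
exact = np.min(unit_cost(grid, c1) + unit_cost(R - grid, c2))

# Resource decomposition: any split r1 + r2 = R gives an UPPER bound
r1 = 1.5
upper = unit_cost(r1, c1) + unit_cost(R - r1, c2)

# Price decomposition: any price p gives a LOWER bound via the Lagrangian,
# since min over (u1, u2) splits into independent local problems
p = -1.0
lower = (np.min(unit_cost(grid, c1) + p * grid)
         + np.min(unit_cost(grid, c2) + p * grid) - p * R)

assert lower <= exact <= upper
```

In the paper this sandwich is applied stage-wise to the Bellman functions, with local value functions playing the role of the local minima above.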

9/10 relevant

arXiv

Upper and Lower Bounds for Large Scale Multistage Stochastic
**Optimization** **Problems**: Application to Microgrid Management

**problem** is coupled both in time and in space, so that a direct resolution of the **problem** for large microgrids is out of reach (curse of dimensionality). By assigning prices or resources to each node in the network and solving each nodal sub-**problem** independently by Dynamic Programming, we provide decomposition algorithms that allow us to compute a set of decomposed local value functions in a parallel manner. By summing the local value functions together, we are able, on the one hand, to obtain upper and lower bounds for the optimal value of the problem, and, on the other hand, to design global admissible policies for the original system. Numerical experiments are conducted on microgrids of different sizes, derived from data given by the research and development centre Efficacity, dedicated to urban energy transition. These experiments show that the decomposition algorithms give better results than the standard SDDP method, both in terms of bounds and of policy values. Moreover, the decomposition methods are much faster than SDDP in terms of computation time, thus allowing us to tackle **problem** instances incorporating more than 60 state variables in a Dynamic Programming framework.

7/10 relevant

arXiv

Convolutional Neural Network-based Topology **Optimization** (CNN-TO) By
Estimating Sensitivity of Compliance from Material Distribution

**optimization** method that applies a convolutional neural network (CNN), a deep learning technique, to topology **optimization** **problems**. Using this method, we obtain a structure with slightly higher performance that could not be obtained by the previous topology **optimization** method. In particular, in this paper, we solve a topology **optimization** **problem** aimed at maximizing stiffness under a mass constraint, which is a common type of topology **optimization**. We first formulate the conventional topology **optimization** by the solid isotropic material with penalization method. Next, we formulate the topology **optimization** using a CNN. Finally, we show the effectiveness of the proposed topology **optimization** method by solving a verification example, namely a topology **optimization** **problem** aimed at maximizing stiffness. Solving the verification example for a small design area of 16x32 elements, we obtain a solution different from that of the previous topology **optimization** method. This result suggests that stiffness information of a structure can be extracted and analyzed for structural design by analyzing the density distribution with a CNN, as if it were an image, and that CNN technology can be utilized in structural design and topology **optimization**.
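For reference, the solid isotropic material with penalization (SIMP) interpolation mentioned here, and the compliance sensitivity that the CNN is trained to estimate, have simple closed forms. A minimal sketch, with the usual textbook default parameters rather than values taken from the paper:

```python
import numpy as np

def simp_youngs_modulus(rho, E0=1.0, Emin=1e-9, p=3.0):
    """SIMP interpolation: E(rho) = Emin + rho^p * (E0 - Emin)."""
    return Emin + rho ** p * (E0 - Emin)

def compliance_sensitivity(rho, elem_strain_energy, E0=1.0, Emin=1e-9, p=3.0):
    """Per-element compliance sensitivity under SIMP:
    dc/drho_e = -p * rho_e^(p-1) * (E0 - Emin) * u_e^T k0 u_e."""
    return -p * rho ** (p - 1.0) * (E0 - Emin) * elem_strain_energy

rho = np.array([0.2, 0.5, 1.0])       # element densities
E = simp_youngs_modulus(rho)          # interpolated stiffness per element
dc = compliance_sensitivity(rho, np.ones_like(rho))
```

The penalization exponent $p > 1$ drives intermediate densities toward 0 or 1, which is what makes the density field look like a black-and-white image a CNN can analyze.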

7/10 relevant

arXiv

Distributed Online **Optimization** with Long-Term Constraints

**optimization** **problems**, where the distributed system consists of various computing units connected through a time-varying communication graph. In each time step, each computing unit selects a constrained vector, experiences a loss equal to an arbitrary convex function evaluated at this vector, and may communicate to its neighbors in the graph. The objective is to minimize the system-wide loss accumulated over time. We propose a decentralized algorithm with regret and cumulative constraint violation in $\mathcal{O}(T^{\max\{c,1-c\}})$ and $\mathcal{O}(T^{1-c/2})$, respectively, for any $c\in (0,1)$, where $T$ is the time horizon. When the loss functions are strongly convex, we establish improved regret and constraint violation upper bounds in $\mathcal{O}(\log(T))$ and $\mathcal{O}(\sqrt{T\log(T)})$. These regret scalings match those obtained by state-of-the-art algorithms and fundamental limits in the corresponding centralized online **optimization** **problem** (for both convex and strongly convex loss functions). In the case of bandit feedback, the proposed algorithms achieve a regret and constraint violation in $\mathcal{O}(T^{\max\{c,1-c/3\}})$ and $\mathcal{O}(T^{1-c/2})$ for any $c\in (0,1)$. We numerically illustrate the performance of our algorithms for the particular case of distributed online regularized linear regression **problems**.
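As a centralized baseline for the regularized linear regression experiments mentioned last, a minimal online projected gradient descent sketch; the paper's decentralized algorithm adds communication and long-term constraint handling on top of this, and all names and parameters here are illustrative:

```python
import numpy as np

def online_pgd(features, targets, radius=1.0, lam=0.1):
    """Online projected gradient descent on the per-round losses
    f_t(x) = 0.5*(a_t.x - b_t)^2 + 0.5*lam*||x||^2,
    with projection onto the Euclidean ball of the given radius."""
    T, d = features.shape
    x = np.zeros(d)
    losses = []
    for t in range(T):
        a, b = features[t], targets[t]
        losses.append(0.5 * (a @ x - b) ** 2 + 0.5 * lam * (x @ x))
        g = a * (a @ x - b) + lam * x        # gradient of f_t at x
        x = x - g / np.sqrt(t + 1)           # step size ~ 1/sqrt(t)
        nrm = np.linalg.norm(x)
        if nrm > radius:
            x = x * (radius / nrm)           # project back onto the ball
    return x, losses

rng = np.random.default_rng(2)
A = rng.standard_normal((500, 4))
x_true = np.array([0.5, -0.3, 0.2, 0.0])
b = A @ x_true + 0.05 * rng.standard_normal(500)
x_T, losses = online_pgd(A, b)
```

With the $1/\sqrt{t}$ step schedule this attains the classical $\mathcal{O}(\sqrt{T})$ regret for convex losses, the centralized benchmark the paper compares against.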

7/10 relevant

arXiv