Single Versus Union: Non-parallel Support Vector Machine Frameworks

…optimization problems to obtain a series of hyperplanes, but it is hard to measure the loss of each sample. The other type constructs all the hyperplanes simultaneously: it solves one big optimization problem with the ascertained loss of each sample. We give the characteristics of each framework and compare them carefully. In addition, based on the second framework, we construct a max-min distance-based nonparallel support vector machine for the multiclass classification problem, called NSVM. It constructs hyperplanes with a large distance margin by solving an optimization problem. Experimental results on benchmark data sets and human face databases show the advantages of our NSVM.

7/10 relevant

arXiv

Solving Dynamic Multi-objective Optimization Problems Using Incremental Support Vector Machine

…Dynamic Multi-objective Optimization Problems (DMOPs) is that the optimization objective functions will change with time or environments. One of the promising approaches for solving DMOPs is to reuse the obtained Pareto optimal set (POS) to train prediction models via machine learning. In this paper, we train an Incremental Support Vector Machine (ISVM) classifier with the past POS, and the solutions of the DMOP to be solved at the next moment are then filtered through the trained ISVM classifier. A high-quality initial population is generated by the ISVM classifier, and a variety of population-based dynamic multi-objective optimization algorithms can benefit from it. To verify this idea, we incorporate the proposed approach into three evolutionary algorithms: multi-objective particle swarm optimization (MOPSO), the Nondominated Sorting Genetic Algorithm II (NSGA-II), and the Regularity Model-based multi-objective Estimation of Distribution Algorithm (RM-MEDA). Experiments with these algorithms show the effectiveness of the approach.
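The filtering idea in this abstract can be sketched with a minimal incremental linear classifier. This is a hypothetical stand-in for the ISVM: the toy data, the one-pass hinge-loss update, and the learning rate are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy decision space: past Pareto-optimal solutions (label +1) cluster near
# the origin, dominated solutions (label -1) cluster elsewhere.
good = rng.normal(0.0, 0.3, size=(100, 2))
bad = rng.normal(2.0, 0.3, size=(100, 2))
X = np.vstack([good, bad])
y = np.array([1.0] * 100 + [-1.0] * 100)

# Incremental linear-SVM stand-in: one streaming pass of subgradient descent
# on the hinge loss, so newly obtained POS data could be fed in sample by sample.
w, b, lr = np.zeros(2), 0.0, 0.1
for xi, yi in zip(X, y):
    if yi * (xi @ w + b) < 1.0:   # margin violation -> update
        w += lr * yi * xi
        b += lr * yi

# Filter random candidates: keep only those the classifier labels "good"
# as the initial population for the next environment.
candidates = rng.uniform(-1.0, 3.0, size=(500, 2))
initial_population = candidates[candidates @ w + b > 0.0]
print(len(initial_population), "of 500 candidates kept")
```

Any population-based optimizer could then be seeded with `initial_population` instead of a fully random one, which is the benefit the abstract claims.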

10/10 relevant

arXiv

S-DIGing: A Stochastic Gradient Tracking Algorithm for Distributed Optimization

…optimization problems where the agents of a network cooperatively minimize a global objective function that consists of multiple local objective functions. Different from most existing works, the local objective function of each agent is presented as the average of finitely many instantaneous functions. The intention of this work is to solve large-scale optimization problems where the local objective functions are complicated and numerous. Integrating the gradient tracking algorithm with stochastic averaging gradient technology, we propose a novel distributed stochastic gradient tracking algorithm (termed S-DIGing). At each time instant, only one randomly selected gradient of an instantaneous function is computed and applied to approximate the gradient of the local objective function. Based on a primal-dual interpretation of the S-DIGing algorithm, we show that it converges linearly to the global optimal solution when the step-size lies in an explicit interval, under the assumptions that the instantaneous functions are strongly convex and have Lipschitz-continuous gradients. Numerical experiments on the logistic regression problem are presented to demonstrate the practicability of the algorithm and the correctness of the theoretical results.
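The gradient-tracking skeleton that S-DIGing builds on can be sketched on a toy network. This uses exact local gradients where S-DIGing would substitute a SAGA-style stochastic averaged gradient; the network, objectives, and step-size below are illustrative assumptions.

```python
import numpy as np

# Gradient-tracking (DIGing-style) skeleton on a ring of 4 agents, each with
# a quadratic local objective f_i(x) = 0.5 * (x - t_i)^2; the global minimizer
# is the average of the targets t_i. S-DIGing replaces grad() below with a
# stochastic averaged gradient; exact gradients keep the sketch short.
n = 4
t = np.array([1.0, 2.0, 3.0, 4.0])        # local targets
W = np.array([[0.50, 0.25, 0.00, 0.25],   # doubly stochastic mixing
              [0.25, 0.50, 0.25, 0.00],   # matrix for a 4-cycle
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

def grad(x):
    return x - t                          # gradient of each local f_i

x = np.zeros(n)
y = grad(x)                               # gradient tracker, y_0 = grad(x_0)
alpha = 0.1
for _ in range(200):
    x_new = W @ x - alpha * y             # consensus step plus tracked descent
    y = W @ y + grad(x_new) - grad(x)     # track the average gradient
    x = x_new

print(x)   # each agent's estimate approaches the consensus optimum 2.5
```

The tracker `y` preserves the average-gradient invariant at every step, which is what lets the method converge linearly with a constant step-size, unlike plain decentralized gradient descent.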

5/10 relevant

arXiv

A solution for fractional PDE constrained optimization problems using reduced basis method

…optimization problem governed by a fractional parabolic equation, where the fractional derivative in time, of order $\beta \in (0,1)$, is defined by the Caputo fractional derivative.

9/10 relevant

arXiv

Optimization Hierarchy for Fair Statistical Decision Problems

…optimization hierarchy for fair statistical decision problems. Because our hierarchy is based on the framework of statistical decision problems, it provides a systematic approach for developing and studying fair versions of hypothesis testing, decision-making, estimation, regression, and classification. We use the insight that qualitative definitions of fairness are equivalent to statistical independence between the output of a statistical technique and a random variable that measures the attributes for which fairness is desired. Using this insight, we construct an optimization hierarchy that lends itself to numerical computation, and we use tools from variational analysis and random set theory to prove that higher levels of this hierarchy are consistent, in the sense that they asymptotically impose this independence as a constraint in the corresponding statistical decision problems. We demonstrate the numerical effectiveness of our hierarchy on several data sets, and we conclude by using our hierarchy to fairly perform automated dosing of morphine.

4/10 relevant

arXiv

A Saddle-Point Dynamical System Approach for Robust Deep Learning

…optimization problem in the presence of uncertainties. The robust learning problem is formulated as a robust optimization problem, and we introduce a discrete-time algorithm based on a saddle-point dynamical system (SDS) to solve it. Under the assumptions that the cost function is convex and the uncertainties enter concavely in the robust learning problem, we analytically show that, using a diminishing step-size, the stochastic version of our algorithm (SSDS) converges asymptotically to the robust optimal solution. The algorithm is deployed for the training of adversarially robust deep neural networks. Although such training involves highly non-convex, non-concave robust optimization problems, empirical results show that the algorithm can achieve significant robustness for deep learning. We compare the performance of our SSDS model to other state-of-the-art robust models, e.g., those trained using the projected gradient descent (PGD) training approach. From the empirical results, we find that SSDS training is computationally inexpensive compared to PGD training while achieving comparable performance. SSDS training also helps robust models maintain a relatively high level of performance on clean data as well as under black-box attacks.
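The saddle-point dynamics behind this approach can be sketched on a toy min-max problem. The objective, step-size schedule, and iteration count are illustrative assumptions, not the paper's SSDS formulation; the sketch only shows descent in the minimizing variable and ascent in the adversarial one with a diminishing step-size.

```python
import numpy as np

# Toy robust problem: min over x, max over u of
#   phi(x, u) = 0.5*x**2 + u*x - 0.5*u**2
# which is convex in x and concave in u, with saddle point (0, 0).
x, u = 3.0, -2.0
for k in range(1, 5001):
    step = 1.0 / np.sqrt(k)   # diminishing step-size, as in the SSDS analysis
    gx = x + u                # d(phi)/dx
    gu = x - u                # d(phi)/du
    # descend in the decision variable, ascend in the uncertainty
    x, u = x - step * gx, u + step * gu

print(x, u)   # both coordinates approach the saddle point (0, 0)
```

In the deep-learning setting, `x` plays the role of the network weights and `u` the adversarial perturbation, with the gradients replaced by stochastic estimates over minibatches.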

7/10 relevant

arXiv

The Quantum Approximate Optimization Algorithm and the Sherrington-Kirkpatrick Model at Infinite Size

The Quantum Approximate Optimization Algorithm (QAOA) is a general-purpose algorithm for combinatorial optimization problems whose performance can only improve with the number of layers $p$. While QAOA holds promise as an algorithm that can be run on near-term quantum computers, its computational power has not been fully explored. In this work, we study the QAOA applied to the Sherrington-Kirkpatrick (SK) model, which can be understood as energy minimization of $n$ spins with all-to-all random signed couplings. There is a recent classical algorithm by Montanari that can efficiently find an approximate solution for a typical instance of the SK model to within $(1-\epsilon)$ times the ground state energy, so we can only hope to match its performance with the QAOA. Our main result is a novel technique that allows us to evaluate the typical-instance energy of the QAOA applied to the SK model. We produce a formula for the expected value of the energy, as a function of the $2p$ QAOA parameters, in the infinite size limit that can be evaluated on a computer with $O(16^p)$ complexity. We found optimal parameters up to $p=8$ running on a laptop. Moreover, we show concentration: with probability tending to one as $n\to\infty$, measurements of the QAOA will produce strings whose energies concentrate at our calculated value. As an algorithm running on a quantum computer, there is no need to search for optimal parameters on an instance-by-instance basis, since we can determine them in advance. What we have here is a new framework for analyzing the QAOA, and our techniques can be of broad interest for evaluating its performance on more general problems.
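The classical side of the SK model in this abstract is easy to make concrete. The helper name and normalization below are our own illustrative choices, not from the paper:

```python
import numpy as np

# The SK model: n spins s_i in {-1, +1} with all-to-all random signed
# couplings J_ij. A common normalization of the energy is
#   H(s) = (1 / sqrt(n)) * sum_{i<j} J_ij * s_i * s_j,
# and the QAOA (or any optimizer) seeks strings s with low H(s).
rng = np.random.default_rng(0)
n = 200
J = np.triu(rng.standard_normal((n, n)), 1)   # keep only i < j couplings

def sk_energy(s):
    return float(s @ J @ s) / np.sqrt(n)

s = rng.choice([-1, 1], size=n)
print(sk_energy(s))   # a random string's energy fluctuates on the order of sqrt(n)
```

Measuring a QAOA output state yields strings `s` like this one; the paper's concentration result says their energies cluster tightly around the computed typical-instance value.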

4/10 relevant

arXiv

Semiclassical optimization of entrainment stability and phase coherence in weakly forced quantum nonlinear oscillators

…optimization problems, one for the stability and the other for the phase coherence of the entrained state, are considered. The optimal waveforms of the periodic amplitude modulation can be derived by applying classical optimization methods to the semiclassical phase equation that approximately describes the quantum limit-cycle dynamics. Using a quantum van der Pol oscillator with squeezing and Kerr effects as an example, the performance of the optimization is numerically analyzed. It is shown that the optimized waveform for the entrainment stability yields faster entrainment to the driving signal than a simple sinusoidal waveform, while that for the phase coherence yields little improvement over the sinusoidal case. These results are explained by the properties of the phase sensitivity function.

4/10 relevant

arXiv

Constrained Bayesian Optimization with Max-Value Entropy Search

Bayesian optimization (BO) is a model-based approach for sequentially optimizing expensive black-box functions, such as the validation error of a deep neural network with respect to its hyperparameters. In many real-world scenarios, the optimization is further subject to a priori unknown constraints. For example, training a deep network configuration may fail with an out-of-memory error when the model is too large. In this work, we focus on a general formulation of Gaussian process-based BO with continuous or binary constraints. We propose constrained Max-value Entropy Search (cMES), a novel information-theoretic acquisition function implementing this formulation. We also revisit the validity of the factorized approximation adopted for rapid computation of the MES acquisition function, showing empirically that it leads to inaccurate results. On an extensive set of real-world constrained hyperparameter optimization problems, we show that cMES compares favourably to prior work while being simpler to implement and faster than other constrained extensions of Entropy Search.

4/10 relevant

arXiv

ZO-AdaMM: Zeroth-Order Adaptive Momentum Method for Black-Box Optimization

…optimization methods for solving machine learning problems. However, AdaMM is not suited for solving black-box optimization problems, where explicit gradient forms are difficult or infeasible to obtain. In this paper, we propose a zeroth-order AdaMM (ZO-AdaMM) algorithm that generalizes AdaMM to the gradient-free regime. We show that the convergence rate of ZO-AdaMM for both convex and nonconvex optimization is roughly a factor of $O(\sqrt{d})$ worse than that of the first-order AdaMM algorithm, where $d$ is the problem size. In particular, we provide a deep understanding of why the Mahalanobis distance matters in the convergence of ZO-AdaMM and other AdaMM-type methods. As a byproduct, our analysis makes a first step toward understanding adaptive learning rate methods for nonconvex constrained optimization. Furthermore, we demonstrate two applications: designing per-image and universal adversarial attacks on black-box neural networks. We perform extensive experiments on ImageNet and empirically show that ZO-AdaMM converges much faster to a solution of high accuracy compared with six state-of-the-art ZO optimization methods.
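The core move in zeroth-order methods like ZO-AdaMM is replacing the true gradient with a finite-difference estimate along random directions. The sketch below pairs a standard two-point Gaussian-smoothing estimator with a plain Adam-style update as a stand-in; the objective, smoothing parameter, and learning-rate constants are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return float(np.sum(x ** 2))   # black-box objective (toy example)

def zo_grad(f, x, mu=1e-3):
    # Two-point random-direction gradient estimate: for u ~ N(0, I),
    # E[(f(x + mu*u) - f(x)) / mu * u] approximates the smoothed gradient.
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x)) / mu * u

# Adam-style update driven by the zeroth-order estimate (bias correction
# omitted for brevity).
x = np.ones(10)
m, v = np.zeros(10), np.zeros(10)
b1, b2, lr, eps = 0.9, 0.999, 0.05, 1e-8
for _ in range(2000):
    g = zo_grad(f, x)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    x = x - lr * m / (np.sqrt(v) + eps)

print(f(x))   # far below the starting value f(np.ones(10)) = 10.0
```

The $O(\sqrt{d})$ slowdown quoted in the abstract reflects the variance of estimates like `zo_grad`, which grows with the dimension of `x`.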

4/10 relevant

arXiv