  • Copula Entropy: Theory and Applications

    Subjects: Mathematics >> Statistics and Probability; Statistics >> Mathematical Statistics; Information Science and Systems Science >> Basic Disciplines of Information Science and Systems Science submitted time 2024-05-22

    Abstract: Statistical independence is a core concept in statistics and machine learning, and representing and measuring it are of fundamental importance in related fields. Copula theory provides a tool for representing statistical independence, while Copula Entropy (CE) provides a tool for measuring it. This paper first introduces the theory of CE, including its definition, theorem, properties, and estimation method. It then reviews the theoretical applications of CE to structure learning, association discovery, variable selection, causal discovery, system identification, time lag estimation, domain adaptation, multivariate normality test, two-sample test, and change point detection. The relationships among these theoretical applications, and their connections to correlation and causality, are discussed. The frameworks based on CE, the kernel method, and distance correlation for measuring statistical independence and conditional independence are compared. The advantages of CE-based methods over comparable methods are evaluated with simulated and real data. Finally, the applications of CE in theoretical physics, astrophysics, geophysics, theoretical chemistry, cheminformatics, materials science, hydrology, climatology, meteorology, environmental science, ecology, animal morphology, agronomy, cognitive neuroscience, motor neuroscience, computational neuroscience, psychology, systems biology, bioinformatics, clinical diagnostics, geriatrics, psychiatry, public health, economics, management, sociology, pedagogy, computational linguistics, mass media, law, political science, military science, informatics, energy, food engineering, architecture, civil engineering, transportation, manufacturing, reliability, metallurgy, chemical engineering, aeronautics and astronautics, weaponry, automobile, electronics, communication, high performance computing, cybersecurity, remote sensing, ocean, and finance are briefly introduced.
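
    The abstract does not spell out the estimation method. As a rough sketch only, one commonly described nonparametric approach maps each variable to its normalized ranks (an empirical copula) and then applies a k-nearest-neighbor entropy estimator to the ranked data. The pure-Python code below assumes that two-step scheme, with a Kozachenko-Leonenko style estimator under the maximum norm. The function names `to_ranks`, `knn_entropy`, and `copula_entropy`, and the choice k=3, are illustrative assumptions rather than the paper's implementation.

```python
import math
import random


def to_ranks(column):
    """Map a sample to normalized ranks in (0, 1] (empirical copula margins)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    ranks = [0.0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position / n
    return ranks


def knn_entropy(points, k=3):
    """Kozachenko-Leonenko style entropy estimate using the maximum norm.

    Uses H ~ psi(n) - psi(k) + (d / n) * sum(log(eps_i)), where eps_i is
    twice the distance from point i to its k-th nearest neighbor.
    """
    n, d = len(points), len(points[0])
    log_eps_sum = 0.0
    for i, p in enumerate(points):
        dists = sorted(
            max(abs(a - b) for a, b in zip(p, q))
            for j, q in enumerate(points)
            if j != i
        )
        log_eps_sum += math.log(2.0 * dists[k - 1])
    # For integers, psi(n) - psi(k) equals sum_{j=k}^{n-1} 1/j.
    digamma_gap = sum(1.0 / j for j in range(k, n))
    return digamma_gap + d * log_eps_sum / n


def copula_entropy(columns, k=3):
    """Estimate copula entropy from a list of equally long sample columns."""
    ranked = [to_ranks(c) for c in columns]
    return knn_entropy(list(zip(*ranked)), k)


if __name__ == "__main__":
    random.seed(0)
    x = [random.gauss(0, 1) for _ in range(300)]
    e = [random.gauss(0, 1) for _ in range(300)]
    y = [0.9 * a + math.sqrt(1 - 0.81) * b for a, b in zip(x, e)]
    print("dependent pair:  ", copula_entropy([x, y]))
    print("independent pair:", copula_entropy([x, e]))
```

    Because mutual information equals negative copula entropy, an estimate near zero suggests independence, and more negative values suggest stronger dependence. The brute-force neighbor search is quadratic in the sample size, so this sketch only suits small samples.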

  • Change Point Detection with Copula Entropy based Two-Sample Test

    Subjects: Statistics >> Mathematical Statistics submitted time 2024-03-01

    Abstract: Change point detection is a typical task that aims to find changes in time series and can be tackled with a two-sample test. Copula Entropy is a mathematical concept for measuring statistical independence, and a two-sample test based on it was introduced recently. In this paper we propose a nonparametric multivariate method for multiple change point detection with the copula entropy-based two-sample test. Single change point detection is first formulated as a group of two-sample tests, one at every point of the time series, and the change point is taken to be the point with the maximum test statistic. Multiple change point detection is then obtained by combining the single change point detection method with a binary segmentation strategy. We verified the effectiveness of our method and compared it with other similar methods on simulated univariate and multivariate data and on the Nile data.
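
    The two-stage scheme described in the abstract (scan every candidate split with a two-sample test, take the maximum, then recurse by binary segmentation) can be sketched as follows. This is a minimal univariate illustration under stated assumptions: the copula entropy-based statistic is replaced by a simple Welch-style standardized mean difference as a stand-in, and the names `two_sample_stat`, `single_change_point`, `binary_segmentation`, and the parameters `threshold` and `min_size` are hypothetical.

```python
import math
import random
import statistics


def two_sample_stat(left, right):
    """Stand-in statistic: absolute Welch-style standardized mean difference.

    Assumption: the paper uses a copula entropy-based statistic here instead.
    """
    mean_gap = abs(statistics.fmean(left) - statistics.fmean(right))
    std_err = math.sqrt(
        statistics.variance(left) / len(left)
        + statistics.variance(right) / len(right)
    )
    if std_err == 0.0:
        return math.inf if mean_gap > 0.0 else 0.0
    return mean_gap / std_err


def single_change_point(series, min_size=15):
    """Test every admissible split and return (index, statistic) of the maximum."""
    best_t, best_stat = None, 0.0
    for t in range(min_size, len(series) - min_size + 1):
        stat = two_sample_stat(series[:t], series[t:])
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat


def binary_segmentation(series, threshold, min_size=15, offset=0):
    """Recursively split at the strongest change point until none passes threshold."""
    t, stat = single_change_point(series, min_size)
    if t is None or stat < threshold:
        return []
    left = binary_segmentation(series[:t], threshold, min_size, offset)
    right = binary_segmentation(series[t:], threshold, min_size, offset + t)
    return left + [offset + t] + right


if __name__ == "__main__":
    random.seed(1)
    data = (
        [random.gauss(0, 1) for _ in range(60)]
        + [random.gauss(4, 1) for _ in range(60)]
        + [random.gauss(8, 1) for _ in range(60)]
    )
    print(binary_segmentation(data, threshold=6.0))
```

    The fixed `threshold` is a simplification; a real procedure would calibrate it, for example from the null distribution of the chosen statistic.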

  • Uniform convergence of empirical distribution and its application in the limit behavior of No Free Lunch theorem

    Subjects: Statistics >> Mathematical Statistics; Computer Science >> Other Disciplines of Computer Science submitted time 2024-01-08

    Abstract: The No Free Lunch (NFL) theorem is an important result of statistical learning theory. Under Bayesian modeling, the expectation of the loss/utility can be derived, and its form depends on the choice of the hypothesis space of prediction functions. If the real prediction function space is considered unknowable, then an arbitrarily selected hypothesis function space may not yield the optimal expected loss.

    In this paper, the limit behavior of the NFL theorem is analyzed based on a local form of the uniform convergence of the empirical distribution (i.e., the Glivenko-Cantelli theorem), and the following result is obtained: under certain conditions on the deterministic and non-deterministic prediction problems, the expectation of the loss/utility is independent of the specific choice of the hypothesis function space as the sample size tends to infinity. A by-product of this work is that the total variation of the distribution can be deduced from the local form of uniform convergence derived in this paper. Previously, this property was generally considered not to exist.
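
    For readers unfamiliar with the classical Glivenko-Cantelli theorem referenced above, it states that the empirical distribution function F_n converges uniformly to the true distribution function F almost surely, i.e. sup_x |F_n(x) - F(x)| tends to 0. The short simulation below illustrates only this classical statement for Uniform(0, 1) samples, not the local form derived in the paper; the function name `sup_distance_uniform` is an illustrative assumption.

```python
import random


def sup_distance_uniform(sample):
    """Return sup_x |F_n(x) - F(x)| where F(x) = x is the Uniform(0, 1) CDF.

    The supremum is attained at a sample point, just before or just after
    the jump of the empirical CDF, so checking both sides of each order
    statistic is enough.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


if __name__ == "__main__":
    random.seed(0)
    for n in (100, 1_000, 10_000):
        sample = [random.random() for _ in range(n)]
        print(n, sup_distance_uniform(sample))
```

    The printed distances should shrink as n grows, roughly in proportion to 1 / sqrt(n).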

  • Modification and Application of the PPS Sampling Estimator under Ranking

    Subjects: Statistics >> Mathematical Statistics submitted time 2018-09-26

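    Abstract: Inspired by the Zipf phenomenon exhibited by many things, this paper proposes a method of PPS sampling after ranking and gives a modified Hansen-Hurwitz estimator together with its variance. In doing so, it addresses a long-standing problem in survey sampling practice: when important units are included in the sample directly, there has been no clear method for deciding how many of them to include. This paper provides a theoretical basis and a concrete procedure for determining that number. Finally, an example and a case study of a sampling survey of China's urban population demonstrate the advantages of the modified Hansen-Hurwitz estimator, and the research method is summarized along with an outlook for future work.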
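
    As background, the classical (unmodified) Hansen-Hurwitz estimator for PPS sampling with replacement estimates a population total as the average of y_i / p_i over the n draws, where p_i is the per-draw selection probability of unit i. The sketch below implements only that classical estimator and its usual variance estimator. The paper's modified estimator and its rule for how many important units to include directly are not reproduced here, and the names `hansen_hurwitz` and `pps_draw` and the Zipf-like toy population are illustrative assumptions.

```python
import math
import random


def pps_draw(sizes, n, rng):
    """Draw n unit indices with replacement, with probability proportional to size."""
    total_size = sum(sizes)
    probs = [s / total_size for s in sizes]
    indices = rng.choices(range(len(sizes)), weights=probs, k=n)
    return indices, probs


def hansen_hurwitz(y_sampled, p_sampled):
    """Classical Hansen-Hurwitz estimate of the total and its variance estimate."""
    n = len(y_sampled)
    ratios = [y / p for y, p in zip(y_sampled, p_sampled)]
    total_hat = sum(ratios) / n
    var_hat = sum((r - total_hat) ** 2 for r in ratios) / (n * (n - 1))
    return total_hat, var_hat


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy Zipf-like population: the size of the unit at rank r is proportional to 1 / r.
    sizes = [1000.0 / rank for rank in range(1, 51)]
    y = [2.0 * s + rng.gauss(0, 5) for s in sizes]
    idx, probs = pps_draw(sizes, n=10, rng=rng)
    est, var = hansen_hurwitz([y[i] for i in idx], [probs[i] for i in idx])
    print("estimate:", est, "std. error:", math.sqrt(var), "true total:", sum(y))
```

    When y is exactly proportional to the size measure, every ratio y_i / p_i equals the true total, so the estimate is exact and its estimated variance is zero; this is why PPS designs work well when the size measure tracks the study variable closely.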