The Stats Map

      • active statistical inference
      • adversarial contamination model
      • alpha-divergence
      • anisotropic distribution
      • anti-concentration
      • anytime-valid
      • anytime-valid p-values
      • asymptotic confidence sequences
      • Banach space
      • basic inequalities
      • basic matrix inequalities
      • Bayes factors
      • Bayesian interpretation of probability
      • Bayesian nonparametrics
      • Bayesian parametrics
      • Bayesian statistics
      • Bernstein von-Mises theorem
      • Berry-Esseen bounds
      • betting strategies
      • BH procedure
      • bootstrapping
      • bounded difference inequalities
      • bounded scalar concentration
      • calibration
      • Catoni-Giulini M-estimator
      • causal inference
      • cdf concentration
      • cdf estimation
      • central limit theorems
      • chaining
      • characteristic function
      • Chernoff method
      • chi-squared divergence
      • CLTs in Banach spaces
      • coarsened filtrations can increase power
      • comparing forecasters by betting
      • concentration in Banach spaces
      • concentration inequalities
      • concentration of functions
      • concentration of measure
      • concentration of self-bounding functions
      • concentration via convex optimization
      • concentration via covering
      • conditional independence testing
      • confidence intervals
      • confidence sequences
      • confidence sequences for convex functionals
      • confidence sequences for quantiles
      • confidence sequences via conjugate mixtures
      • confidence sequences via predictable plug-ins
      • conformal p-value
      • conformal prediction
      • conjugate transpose
      • contextual bandit
      • covering and packing
      • Cramer-Rao lower bound
      • credible intervals
      • current statistical practice combines the Fisherian and Neyman-Pearson perspectives
      • deep density estimation
      • density estimation
      • differential privacy
      • Dirichlet process
      • distributional distance
      • Donsker class
      • Doob's maximal inequality
      • doubly robust estimator
      • duality between hypothesis tests and CIs
      • Dudley chaining
      • Dudley's entropy bound
      • e-BH procedure
      • e-process
      • e-value
      • e-value calibrators
      • e-values enable post-hoc hypothesis testing
      • Efron-Stein inequality
      • empirical Bernstein bounds
      • empirical process theory
      • empirical risk minimization
      • ensemble learning
      • entropy number
      • ergodic theorems
      • estimating means by betting
      • evidence against the null
      • evidence is quantifiable in small-worlds
      • exchangeable distribution
      • exponential families
      • exponential inequalities
      • external randomization
      • f-divergence
      • FDR control
      • Fisher information
      • Fisher information distance
      • Fisher's paradigm
      • fixed-time
      • fork-convex
      • foundations of statistics
      • frequentist interpretation of probability
      • frequentist statistics
      • from boundedness to variance adaptivity
      • from independence to iid
      • game theory
      • game-theoretic convergence of opinions
      • game-theoretic hypothesis testing
      • game-theoretic LLN
      • game-theoretic probability
      • game-theoretic statistics
      • Gaussian complexity
      • Gaussian process
      • Gaussian process regression
      • Gaussian sequence model
      • generalized linear model
      • generic chaining
      • Glivenko-Cantelli class
      • goodness-of-fit test
      • GRO e-variable
      • GROW e-variable
      • growth rate conditions in sequential testing
      • heavy-tailed concentration
      • Hellinger distance
      • Hermitian matrix
      • hilbert space
      • histograms
      • Hölder space
      • Huber contamination model
      • hypothesis testing
      • ideal metrics
      • infinitely divisible distribution
      • information processing inequality
      • information theory
      • instrumentalist theory of probability
      • integral probability metric
      • interpolating between Markov and Chernoff
      • inverse problems
      • irregular problems in hypothesis testing
      • isotropic distributions
      • issues with p-values
      • Jeffreys prior
      • Jeffreys' paradigm of hypothesis testing
      • Karlin-Rubin theorem
      • Kelly betting
      • kernel density estimation
      • kernel MMD
      • kernel regression
      • kernel trick
      • KL divergence
      • knn
      • KS distance
      • lady tasting tea
      • law of likelihood
      • laws of large numbers
      • laws of the iterated logarithm
      • learning theory
      • light-tailed maximal inequalities
      • light-tailed, unbounded scalar concentration
      • likelihood principle
      • likelihood-ratio test
      • Lindeberg-Feller CLT
      • Lindeberg-Levy CLT
      • Linear Regression
      • linear smoothers
      • list of maximal inequalities
      • local differential privacy
      • local polynomial regression
      • Loewner order
      • log-concave distribution
      • Lp norm
      • Lyapunov CLT
      • M-estimation
      • marginal consistency
      • Markovian alternatives
      • martingale CLT
      • martingale concentration
      • martingale dependence
      • matrix inequalities
      • matrix martingale inequalities
      • maximal inequalities
      • maximizing log-wealth
      • Mayo's error statistics
      • MCMC
      • mean estimation
      • median-of-means
      • Mercer kernel
      • merging e-values
      • method of moments for concentration
      • method of moments for estimation
      • metric entropy
      • metric space
      • MGF
      • minimal sufficiency
      • MLE
      • model selection
      • model-X assumption
      • Monge formulation
      • monotone likelihood ratio
      • multi-group calibration
      • multi-group consistency
      • multiarmed bandit
      • multiple testing
      • multivariate concentration
      • multivariate heavy-tailed mean estimation
      • multivariate light-tailed concentration
      • mutual information
      • Nash equilibrium
      • negative correlation can improve concentration
      • Neyman-Pearson lemma
      • Neyman-Pearson lemma for discrete distributions
      • Neyman-Pearson paradigm
      • Neyman-Pearson paradigm with losses
      • nonparametric classification
      • nonparametric density estimation
      • nonparametric regression
      • numeraire e-variable
      • online calibration
      • online gradient descent
      • online marginal estimation
      • Online Newton Step
      • operator norm inequalities
      • optimal transport
      • optimal transport costs
      • optimality of Markov and Chebyshev
      • optimization perspective on Markov's inequality
      • optional continuation
      • optional stopping
      • Orlicz norm
      • p-hacking
      • p-value
      • PAC learning
      • PAC-Bayes
      • parametric density estimation
      • parametric versus nonparametric statistics
      • partitions and trees
      • permutation test
      • permutation testing by betting
      • Petrov's CLT template
      • pinball loss
      • Pinelis approach to concentration
      • portfolio optimization
      • post-hoc confidence sequences via e-processes
      • post-hoc hypothesis testing
      • post-hoc hypothesis testing with losses
      • post-hoc valid confidence sequences
      • PRDS
      • prediction-powered inference
      • proper scoring rule
      • quantile estimation
      • quantitative CLT template with ideal metrics
      • Rademacher complexity
      • randomized inequalities
      • Rao-Blackwell theorem
      • REGROW e-variable
      • representer theorem
      • reverse information projection (RIPr)
      • RKHS
      • rkhs regression
      • robust statistics
      • Royall's three questions
      • safe, anytime-valid inference (SAVI)
      • scalar heavy-tailed mean estimation
      • score function
      • self-normalized concentration
      • self-supervised learning
      • semi-supervised learning
      • sequential hypothesis testing
      • sequential probability ratio test
      • sequential statistics
      • small worlds vs large worlds
      • splines
      • squared error
      • statistical decision theory
      • statistical inference
      • stitching for LIL rates
      • stopping-time
      • strong approximations
      • sub-exponential distributions
      • sub-Gaussian distributions
      • sub-Gaussian process
      • sub-psi process
      • submartingale
      • sufficiency and the likelihood
      • sufficient statistic
      • supermartingale
      • supervised learning
      • survey sampling
      • t-test
      • techniques for multivariate concentration
      • test-martingale
      • testing by betting—composite vs composite
      • testing by betting—simple vs composite
      • testing by betting—simple vs simple
      • testing by betting—two-sample testing
      • testing exchangeability
      • testing forecasters by betting
      • testing group invariance
      • the missing factor in Hoeffding's bounds
      • the problem of approximate inference in deep learning
      • time-uniform
      • total variation distance
      • trimmed mean estimator
      • truncation-based estimators
      • two-sample testing
      • u-statistics
      • uncertainty quantification
      • uniform convergence bounds
      • uniformly most powerful test
      • universal inference
      • unsupervised learning
      • v-statistics
      • Vapnik-Chervonenkis theory
      • variational approach to concentration
      • variational inference
      • Ville's inequality
      • Wald interval
      • Wald test
      • Warner's randomized response
      • Wasserstein Distance
      • wavelets
      • weighted least squares
      • zero sum game

    unsupervised learning

    Modified Jan 04, 2025 · 1 min read

    Machine learning in which no explicit labels are provided to the algorithm. Common tasks include clustering and dimensionality reduction. Distinct from, but sometimes confused with, self-supervised learning. Semi-supervised learning combines supervised learning and unsupervised learning.
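
    As a rough illustration of clustering without labels, here is a minimal k-means (Lloyd's algorithm) sketch in plain Python. The function name `kmeans`, the toy data, and the deterministic initialization are all assumptions made for this example; they are not taken from any particular library.

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's algorithm on a list of (x, y) tuples; no labels are used."""
    # Deterministic initialization (an assumption, for reproducibility):
    # take k points evenly spaced through the input list.
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = []
        for j, cl in enumerate(clusters):
            if cl:
                new_centers.append(tuple(sum(c) / len(cl) for c in zip(*cl)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    # Final cluster index for every input point.
    assign = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
    return centers, assign

# Toy data (hypothetical): two well-separated groups, with no labels attached.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
centers, assign = kmeans(data, k=2)
print(assign)
```

    The algorithm only ever sees the coordinates, so the grouping it returns comes entirely from the geometry of the data. Practical implementations typically use randomized initialization with restarts rather than the fixed choice above.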

    Honestly, if you’re here, you should just read the wiki page on unsupervised learning. This is mostly a signpost for me to be able to link to.


    Backlinks

    • semi-supervised learning
    • supervised learning

      BC © 2025
