This page contains resources about Statistical Learning Theory and Computational Learning Theory.

Subfields and Concepts

  • Asymptotics
  • Vapnik-Chervonenkis (VC) Theory
    • VC dimension
    • Symmetrization
    • Chernoff Bounds
  • Kernel Methods
  • Support Vector Machines
  • Probably Approximately Correct (PAC) Learning
  • Boosting
  • Estimation Theory
  • Decision Theory
  • Information Theory
    • Entropy
    • Kullback-Leibler (KL) Divergence
    • Kolmogorov Complexity
  • Game Theory
    • Minimax Theorem
    • Blackwell's Approachability
  • Occam's razor / Occam Learning
  • Solomonoff's Theory of Inductive Inference
  • No Free Lunch Theorem
  • Principle of Maximum Entropy
  • Maximum Entropy (Maxent) Models / Entropic priors
    • Multinomial logistic regression / Softmax regression
  • Online Learning and Online Convex Optimization
    • Regret Bounds
    • Bregman Divergence
    • No-regret Learning
    • Online Gradient Descent
    • Online Subgradient Descent
    • Mirror Descent
    • Stochastic Gradient Descent (SGD)
    • Mini-batch Gradient Descent
    • Follow The Regularized Leader (FTRL)
    • Multi-Armed Bandit (MAB)
    • Regularization
      • L2-regularization / Tikhonov regularization / Ridge regression
      • L1-regularization / Least absolute shrinkage and selection operator (LASSO)
      • Matrix Regularization
  • Reinforcement Learning

Online Courses

Video Lectures

Lecture Notes

Books and Book Chapters

  • Kearns, M. J. (1990). The Computational Complexity of Machine Learning. MIT Press.
  • Natarajan, B. K. (1991). Machine Learning: A Theoretical Approach. Morgan Kaufmann.
  • Kearns, M. J., & Vazirani, U. V. (1994). An Introduction to Computational Learning Theory. MIT Press.
  • Devroye, L., Györfi, L., & Lugosi, G. (1997). A Probabilistic Theory of Pattern Recognition. Springer Science & Business Media.
  • Anthony, M. H. G., & Biggs, N. (1997). Computational Learning Theory. Cambridge University Press.
  • Mitchell, T. M. (1997). "Chapter 7: Computational Learning Theory". Machine Learning. McGraw Hill.
  • Vapnik, V. N. (1998). Statistical Learning Theory (Vol. 1). New York: Wiley.
  • Vapnik, V. (1999). The Nature of Statistical Learning Theory. Springer Science & Business Media.
  • Devroye, L., & Lugosi, G. (2001). Combinatorial Methods in Density Estimation. Springer Science & Business Media.
  • Cesa-Bianchi, N., & Lugosi, G. (2006). Prediction, Learning, and Games. Cambridge University Press.
  • Vapnik, V. (2006). Estimation of Dependences Based on Empirical Data. Springer Science & Business Media.
  • Rissanen, J. (2007). Information and Complexity in Statistical Modeling. Springer Science & Business Media.
  • Anderson, D. R. (2008). "Section 3.2: Linking Information Theory to Statistical Theory". Model Based Inference in the Life Sciences. Springer New York.
  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning. 2nd Ed. New York: Springer.
  • Shalev-Shwartz, S. (2011). Online Learning and Online Convex Optimization. Foundations and Trends® in Machine Learning, 4(2), 107-194.
  • Sridharan, K. (2012). Learning From An Optimization Viewpoint. arXiv preprint arXiv:1204.4145.
  • Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2012). Foundations of Machine Learning. MIT Press.
  • Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press.
  • Hazan, E. (2015). Introduction to Online Convex Optimization. Foundations and Trends® in Optimization, 2(3-4), 157-325.

Scholarly Articles

  • Vapnik, V. N. (1999). An overview of statistical learning theory. IEEE Transactions on Neural Networks, 10(5), 988-999.
  • Bousquet, O., Boucheron, S., & Lugosi, G. (2004). Introduction to Statistical Learning Theory. In Advanced Lectures on Machine Learning (pp. 169-207). Springer Berlin Heidelberg.
  • Boucheron, S., Bousquet, O., & Lugosi, G. (2005). Theory of classification: A survey of some recent advances. ESAIM: Probability and Statistics, 9, 323-375.
  • Ying, Y., & Pontil, M. (2008). Online gradient descent learning algorithms. Foundations of Computational Mathematics, 8(5), 561-596.
  • Shalev-Shwartz, S. (2011). Online learning and online convex optimization. Foundations and Trends® in Machine Learning, 4(2), 107-194.
  • Sridharan, K. (2012). Learning from an optimization viewpoint. arXiv preprint arXiv:1204.4145.
  • Villa, S., Rosasco, L., & Poggio, T. (2013). On Learning, Complexity and Stability. arXiv preprint arXiv:1303.5976.
  • Bubeck, S. (2014). Convex optimization: Algorithms and complexity. arXiv preprint arXiv:1405.4980.


See also


Other Resources