Research and mathematics

This section collects my main research papers, major projects and other significant work in mathematics, statistics and operational research. While pursuing my education in mathematics and statistics, I developed a keen interest in large-dimensional data analysis, machine learning and data mining, studied mostly through applications to finance and technology.

Disclaimer: Copyright Artur Kotlicki. No document may be cited, reproduced or distributed without express written permission. For more details, email me at

Research papers

Random matrix theory and estimation of high-dimensional covariance matrices
Imperial College London, Final Year Project
This project aims to present significant results of random matrix theory relevant to principal component analysis, including Wigner’s semicircle law and the Marchenko-Pastur law, which describe the limiting spectral distributions of large-dimensional random matrices. The work relies on large-dimensional data assumptions, under which both the number of variables and the sample size tend to infinity while their ratio tends to a finite limit. The project states the key results that make it possible to establish a low-dimensional factor model from large noisy data and outlines a general way of proving them. A significant portion of the proofs relies on the Stieltjes transform, a common tool for studying the convergence of spectral distributions of large matrices, which is also discussed. The established theory is applied to real-life financial data based on the S&P 500 index.
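As a rough illustration of the Marchenko-Pastur law mentioned above, the sketch below evaluates its limiting density for unit-variance noise. It is not taken from the project: the function names, the symbol `c` for the limiting ratio of variables to sample size, and the restriction to `c <= 1` are all assumptions made for this example.

```python
import math


def mp_support(c, sigma2=1.0):
    """Edges of the Marchenko-Pastur support for ratio c = p / n (assumed 0 < c <= 1)."""
    root = math.sqrt(c)
    return sigma2 * (1.0 - root) ** 2, sigma2 * (1.0 + root) ** 2


def mp_density(x, c, sigma2=1.0):
    """Marchenko-Pastur density at x; zero outside the support."""
    lower, upper = mp_support(c, sigma2)
    if x <= lower or x >= upper:
        return 0.0
    return math.sqrt((upper - x) * (x - lower)) / (2.0 * math.pi * sigma2 * c * x)


def integrate_midpoint(f, lo, hi, n=20000):
    """Plain midpoint rule, used here only to sanity-check the density."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    c = 0.25  # hypothetical ratio: four observations per variable
    lower, upper = mp_support(c)
    mass = integrate_midpoint(lambda x: mp_density(x, c), lower, upper)
    print(f"support = [{lower:.4f}, {upper:.4f}], total mass ~ {mass:.4f}")
```

In a factor-model setting, sample eigenvalues well above the upper edge of this support are the usual candidates for genuine factors, while those inside it are consistent with pure noise.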

Concave-Convex Adaptive Rejection Sampling (CCARS) algorithm, Monte Carlo integration, and Metropolis-Hastings sampler
Imperial College London, Stochastic Simulation Project
The aim of the project is to study and implement efficient algorithms for simulating from a given density, and to determine its normalising constant using Monte Carlo integration. We present a detailed study of the Concave-Convex Adaptive Rejection Sampling (CCARS) algorithm, use its results to estimate the normalising constant of a given density, and finally employ a Metropolis-Hastings algorithm as an alternative to CCARS so that the different sampling techniques can be compared.

January 17, 2014
Available upon request
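For readers unfamiliar with the techniques named in this project, here is a minimal sketch of a random-walk Metropolis-Hastings sampler and a plain Monte Carlo estimate of a normalising constant. It does not implement CCARS and is not the project's code: the target density (an unnormalised standard normal), the uniform sampling interval, and all function names are assumptions chosen for illustration.

```python
import math
import random


def unnormalised_density(x):
    """Hypothetical target: a standard normal without its normalising constant."""
    return math.exp(-0.5 * x * x)


def metropolis_hastings(density, n_samples, step=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x, p = x0, density(x0)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        p_new = density(proposal)
        # Symmetric proposal, so accept with probability min(1, p_new / p).
        if rng.random() * p < p_new:
            x, p = proposal, p_new
        samples.append(x)
    return samples


def mc_normalising_constant(density, lo, hi, n_samples, seed=0):
    """Monte Carlo integration with uniform draws on [lo, hi].

    Assumes the density is negligible outside [lo, hi].
    """
    rng = random.Random(seed)
    total = sum(density(rng.uniform(lo, hi)) for _ in range(n_samples))
    return (hi - lo) * total / n_samples


if __name__ == "__main__":
    draws = metropolis_hastings(unnormalised_density, 20000)
    print("sample mean:", sum(draws) / len(draws))
    z_hat = mc_normalising_constant(unnormalised_density, -10.0, 10.0, 100000)
    print("estimated constant:", z_hat, "exact:", math.sqrt(2.0 * math.pi))
```

The sampler only ever uses ratios of the density, which is why the normalising constant has to be estimated separately.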

Final Project in Statistical Pattern Recognition, Assessment of Data Classification Methods
Imperial College London, Statistical Pattern Recognition Project
A comprehensive study of a high-dimensional dataset, implementing dimension reduction techniques in combination with various classification methods, such as Quadratic Discriminant Analysis (QDA), k-Nearest Neighbours (kNN), classification and regression trees, logistic discrimination, and the multilayer perceptron. The project provides an in-depth analysis and evaluation of the classification methods, including an assessment of their performance and hypothesis testing using McNemar's test.
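McNemar's test compares two classifiers on the same test set by looking only at the cases where exactly one of them is correct. The sketch below shows the chi-squared version of the test. It is an illustration rather than the project's implementation, and the function name, argument layout and the optional continuity correction are assumptions.

```python
import math


def mcnemar_test(truth, pred_a, pred_b, continuity_correction=True):
    """Return (statistic, p_value) for McNemar's chi-squared test with 1 degree of freedom."""
    only_a = only_b = 0  # discordant counts: exactly one classifier is correct
    for y, a, b in zip(truth, pred_a, pred_b):
        a_ok, b_ok = (a == y), (b == y)
        if a_ok and not b_ok:
            only_a += 1
        elif b_ok and not a_ok:
            only_b += 1
    discordant = only_a + only_b
    if discordant == 0:
        return 0.0, 1.0  # the classifiers never disagree in correctness
    diff = abs(only_a - only_b)
    if continuity_correction:
        diff = max(0.0, diff - 1.0)
    statistic = diff ** 2 / discordant
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical labels: classifier A is right on 10 cases B misses, B on 2 that A misses.
    truth = [1] * 12
    pred_a = [1] * 10 + [0] * 2
    pred_b = [0] * 10 + [1] * 2
    print(mcnemar_test(truth, pred_a, pred_b))
```

For small discordant counts an exact binomial version of the test is often preferred over the chi-squared approximation used here.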

Stein's Paradox
Imperial College London, Second Year Group Project
A group project studying Stein's paradox and the mathematical concepts related to it, including a detailed discussion of a transformation from a binomial random variable to a normal random variable. The result is verified empirically using real-life data from a manufacturing company.
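Stein's paradox is usually demonstrated with the James-Stein estimator, which shrinks a vector of three or more independent normal observations towards a common point and achieves lower total squared error on average than the raw observations. The following sketch assumes known unit variance and shrinkage towards zero (positive-part version). The function names and simulated means are hypothetical and not taken from the project.

```python
import random


def james_stein(observations, sigma2=1.0):
    """Positive-part James-Stein estimate, shrinking towards zero (needs p >= 3)."""
    p = len(observations)
    if p < 3:
        raise ValueError("James-Stein shrinkage requires at least 3 means")
    norm_sq = sum(x * x for x in observations)
    if norm_sq == 0.0:
        return [0.0] * p
    shrink = max(0.0, 1.0 - (p - 2) * sigma2 / norm_sq)
    return [shrink * x for x in observations]


def total_squared_error(estimate, truth):
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


def compare_risks(true_means, n_reps=2000, seed=0):
    """Average total squared error of raw observations vs James-Stein estimates."""
    rng = random.Random(seed)
    raw_total = js_total = 0.0
    for _ in range(n_reps):
        obs = [rng.gauss(m, 1.0) for m in true_means]
        raw_total += total_squared_error(obs, true_means)
        js_total += total_squared_error(james_stein(obs), true_means)
    return raw_total / n_reps, js_total / n_reps


if __name__ == "__main__":
    raw_risk, js_risk = compare_risks([0.5] * 10)  # hypothetical true means
    print(f"raw observations: {raw_risk:.2f}, James-Stein: {js_risk:.2f}")
```

Applying this to binomial data would first require a variance-stabilising transformation to approximate normality, which is the role of the transformation discussed in the project.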

Forecast Models for Sunspot Number (poster)
Imperial College London, First Year Individual Project
A poster summarising a study of a trigonometric model used to forecast the sunspot number. After various modifications and improvements to a basic model, a very accurate prediction model was obtained (as could be verified a few years after the study was written).

Which stochastic time series forecasting methods are most suitable for a short-term sale forecast?
International Baccalaureate, Extended Essay
In the approach taken, the sales forecast is treated as a forecast of a stochastic process, and economic and market factors are not taken into consideration. The question investigated is a major dilemma for goods-producing companies, whose supply chain has to be planned to maximise sales while keeping the costs of transporting and storing goods low. The problem is investigated by building mathematical models that provide a three-month sales forecast based on the past sales data of a manufacturing company. The essay presents unweighted and double moving average models, as well as Holt’s and Winters’ models, which belong to the class of exponential smoothing models, and the autoregressive integrated moving average (ARIMA) model.
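Two of the simpler models named above can be sketched in a few lines: an unweighted moving average and Holt's linear exponential smoothing, each producing a three-step-ahead forecast. This is an illustrative sketch only. The smoothing parameters, the initialisation of level and trend, and the sales figures are assumptions, not values from the essay.

```python
def moving_average_forecast(series, window=3):
    """Unweighted moving average: forecast is the mean of the last `window` values."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return sum(series[-window:]) / window


def holt_forecast(series, alpha=0.5, beta=0.3, horizon=3):
    """Holt's linear method: smooth a level and a trend, then extrapolate.

    Initialisation (an assumption): level = first value, trend = first difference.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        level = alpha * y + (1.0 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    # h-step-ahead forecasts continue the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]


if __name__ == "__main__":
    monthly_sales = [120.0, 126.0, 131.0, 139.0, 142.0, 150.0]  # hypothetical data
    print("moving average:", moving_average_forecast(monthly_sales))
    print("Holt, next three months:", holt_forecast(monthly_sales))
```

The moving average gives a flat forecast, whereas Holt's method carries the estimated trend forward, which matters when sales are steadily growing or declining.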