Write a super-efficient OLS regression function in Java

Write an efficient Java function to perform multiple Ordinary Least Squares (OLS) regression. The function should have the form:

double[] getLineParams(double[][] independentVars, double[] dependentVar);

The length of dependentVar, and of each array in independentVars, is the number of datapoints in the sample. The length of the return array is the number of independent variables (the length of independentVars) plus 1. It should describe the line that minimizes the sum of squared errors, with the nth element giving the linear coefficient of the nth independent variable (which may be negative). The last element of the return array should contain the constant (offset) part of the line equation.

So, assuming there were three independent variables, the objective would be to minimize totalError in the following test code:

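double totalError=0;
for (int i=0;i<n;i++) {
  // Residual: observed value minus the value predicted by the fitted line.
  double error=dependentVar[i] - (
    returnArray[0] * independentVars[0][i] +
    returnArray[1] * independentVars[1][i] +
    returnArray[2] * independentVars[2][i] +
    returnArray[3]);  // last element is the constant (offset) term
  totalError += error * error;  // sum of squared errors
}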

Any version of Java may be used. Solutions should use core Java and basic Math functions only. In particular, no linear algebra libraries may be used. The function is not permitted to spawn any threads.
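
To make the required contract concrete, here is a minimal single-threaded sketch that uses only core Java. It is one possible approach, not a tuned or winning entry: it builds the normal equations and solves them by Gaussian elimination with partial pivoting. The class name OlsSketch and the singularity tolerance are assumptions made for illustration. Normal equations can lose accuracy on nearly collinear data, where a QR-based solver would be more stable.

```java
final class OlsSketch {

    // Hypothetical tolerance for detecting a (near-)singular system; not part of the bounty spec.
    private static final double SINGULAR_TOLERANCE = 1e-12;

    /** Returns the k slope coefficients followed by the constant (offset) term. */
    static double[] getLineParams(double[][] independentVars, double[] dependentVar) {
        int k = independentVars.length;  // number of independent variables
        int n = dependentVar.length;     // number of datapoints
        int m = k + 1;                   // unknowns: k slopes plus the constant

        // Augmented normal-equation matrix [X'X | X'y]; the last column of X is implicitly all ones.
        double[][] a = new double[m][m + 1];
        for (int r = 0; r < k; r++) {
            double[] xr = independentVars[r];
            for (int c = r; c < k; c++) {
                double[] xc = independentVars[c];
                double s = 0;
                for (int i = 0; i < n; i++) s += xr[i] * xc[i];
                a[r][c] = s;
                a[c][r] = s;  // X'X is symmetric
            }
            double sumX = 0, sumXY = 0;
            for (int i = 0; i < n; i++) {
                sumX += xr[i];
                sumXY += xr[i] * dependentVar[i];
            }
            a[r][k] = sumX;
            a[k][r] = sumX;
            a[r][m] = sumXY;
        }
        double sumY = 0;
        for (int i = 0; i < n; i++) sumY += dependentVar[i];
        a[k][k] = n;
        a[k][m] = sumY;

        // Forward elimination with partial pivoting.
        for (int col = 0; col < m; col++) {
            int pivot = col;
            for (int r = col + 1; r < m; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < SINGULAR_TOLERANCE) {
                throw new ArithmeticException("Independent variables are (nearly) collinear");
            }
            for (int r = col + 1; r < m; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c <= m; c++) a[r][c] -= f * a[col][c];
            }
        }

        // Back substitution.
        double[] beta = new double[m];
        for (int r = m - 1; r >= 0; r--) {
            double s = a[r][m];
            for (int c = r + 1; c < m; c++) s -= a[r][c] * beta[c];
            beta[r] = s / a[r][r];
        }
        return beta;
    }
}
```

For data generated exactly from y = 2*x1 - 3*x2 + 5, a call such as OlsSketch.getLineParams(new double[][] {{1,2,3,4,5},{2,1,4,3,6}}, new double[] {1,6,-1,4,-3}) should return values close to {2, -3, 5}, up to floating-point rounding.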

The winning entry will be the function that gives the correct results as efficiently as possible. Efficiency will be measured as total execution time over a very large sample of pseudo-random datasets, which will vary wildly in width (the number of independent variables) and in length (the number of datapoints).
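
For anyone who wants to time a candidate locally, the sketch below shows one way such a measurement could be set up. It is not the actual judging harness: the class name OlsTimingSketch, the Solver interface, the size ranges, and the data distribution are all assumptions made for illustration.

```java
import java.util.Random;

final class OlsTimingSketch {

    /** Hypothetical stand-in for the function under test (same signature as getLineParams). */
    interface Solver {
        double[] getLineParams(double[][] independentVars, double[] dependentVar);
    }

    /** Returns the total nanoseconds spent inside the solver over the given number of random datasets. */
    static long totalNanos(Solver solver, int datasets, int maxWidth, int maxLength, long seed) {
        Random rng = new Random(seed);  // fixed seed so every candidate sees the same datasets
        long total = 0;
        for (int d = 0; d < datasets; d++) {
            int width = 1 + rng.nextInt(maxWidth);             // number of independent variables
            int length = width + 2 + rng.nextInt(maxLength);   // more datapoints than unknowns
            double[][] x = new double[width][length];
            double[] y = new double[length];
            for (int i = 0; i < length; i++) {
                double value = rng.nextGaussian();             // noise term
                for (int j = 0; j < width; j++) {
                    x[j][i] = rng.nextDouble() * 100;
                    value += (j + 1) * x[j][i];                // arbitrary linear relationship
                }
                y[i] = value;
            }
            // Time only the call itself, not the data generation.
            long start = System.nanoTime();
            double[] result = solver.getLineParams(x, y);
            total += System.nanoTime() - start;
            if (result.length != width + 1) {
                throw new IllegalStateException("Expected " + (width + 1) + " parameters");
            }
        }
        return total;
    }
}
```

A candidate can be passed in as a lambda or method reference matching the Solver signature. JIT warm-up affects timings on the JVM, so running a few untimed passes first is usually advisable.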

awarded to Wikimedia


0 Solutions