User Info  Posts 
davethegr8  Find a “best fit” equation  #1  It's been a while since I was in college and knew how to calculate a best fit line, but I find myself needing to. Suppose I have a set of points, and I want to find the line that best fits those points. What is the equation to determine a best fit line? How would I do that with PHP? posted date: 20081212 16:25:00



kawhi  Re: Find a “best fit” equation  #2  


Svante  Re: Find a “best fit” equation  #3  An often-used approach is to iteratively minimize the sum of squared y-differences between your points and the fit function. posted date: 20081212 16:29:00
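As a rough illustration of the iterative idea, here is a minimal sketch that fits y = a + bx by plain gradient descent on the mean squared y-difference. The function name `fit_line_iterative`, the `[x, y]` pair layout, and the default step size and iteration count are all assumptions made for this example, not anything specified in the thread.

```php
<?php
// Hypothetical helper (not from the thread): fits y = a + b*x by gradient
// descent on the mean squared y-difference between points and line.
// $points is a list of [x, y] pairs; $rate and $steps are illustrative defaults.
function fit_line_iterative(array $points, $rate = 0.01, $steps = 10000)
{
    $n = count($points);
    $a = 0.0; // intercept, starting guess
    $b = 0.0; // slope, starting guess
    for ($i = 0; $i < $steps; $i++) {
        $gradA = 0.0;
        $gradB = 0.0;
        foreach ($points as list($x, $y)) {
            $residual = ($a + $b * $x) - $y;    // signed y-difference
            $gradA += 2 * $residual / $n;       // d(MSE)/da
            $gradB += 2 * $residual * $x / $n;  // d(MSE)/db
        }
        // Step against the gradient to reduce the squared error.
        $a -= $rate * $gradA;
        $b -= $rate * $gradB;
    }
    return array('intercept' => $a, 'slope' => $b);
}

// Points taken from the line y = 1 + 2x, so the fit should approach it.
$fit = fit_line_iterative(array(array(0, 1), array(1, 3), array(2, 5), array(3, 7)));
printf("y = %.4f + %.4f x\n", $fit['intercept'], $fit['slope']);
```

The step size has to be small relative to the scale of the x values, otherwise the iteration diverges; for a straight line the direct calculation is usually simpler.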


Tim Whitcomb  Re: Find a “best fit” equation  #4  Although you can use an iterative approach, you can directly calculate the slope and intercept of a line given a set of observations using a least-squares approach. See the "Univariate Linear Case" section of the Wikipedia article on linear regression for how to calculate the coefficients a and b in y = a + bx given sets of (x,y) points. posted date: 20081212 16:35:00
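For reference, the standard closed-form least-squares coefficients for y = a + bx over n points (x_i, y_i) can be written as follows. The notation is an assumption chosen for this note rather than copied from the Wikipedia article, and the formula for b is undefined when all x values are equal.

```latex
b = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}
         {n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2},
\qquad
a = \frac{1}{n} \left( \sum_{i=1}^{n} y_i - b \sum_{i=1}^{n} x_i \right)
```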


Zach Scrivena  Re: Find a “best fit” equation  #5  You may want to check out linear regression, or more generally, curve fitting. posted date: 20081212 16:39:00 


John D. Cook  Re: Find a “best fit” equation  #6  Here's an article comparing two ways to fit a line to data. One thing to watch out for is that there is a direct solution that is correct in theory but can have numerical problems. The article shows why that method can fail and gives another method that is better. posted date: 20081212 16:40:00 
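The linked article is not reproduced in this thread, so the following is only a sketch of one widely used alternative, not necessarily the method it recommends. The direct formula subtracts two sums that can be nearly equal when the x values are large compared with their spread, which loses precision. Subtracting the means first (a two-pass approach) avoids most of that cancellation. The function name `fit_line_centered` and the `[x, y]` pair layout are assumptions for illustration.

```php
<?php
// Hypothetical helper (not from the thread): two-pass least-squares line fit.
// $points is a list of [x, y] pairs. Returns the intercept and slope of
// y = intercept + slope * x.
function fit_line_centered(array $points)
{
    $n = count($points);
    if ($n < 2) {
        throw new InvalidArgumentException('Need at least two points.');
    }

    // Pass 1: means of x and y.
    $meanX = 0.0;
    $meanY = 0.0;
    foreach ($points as list($x, $y)) {
        $meanX += $x;
        $meanY += $y;
    }
    $meanX /= $n;
    $meanY /= $n;

    // Pass 2: sums of centered products, which stay small even when the
    // raw x and y values are large.
    $sxx = 0.0;
    $sxy = 0.0;
    foreach ($points as list($x, $y)) {
        $dx = $x - $meanX;
        $sxx += $dx * $dx;
        $sxy += $dx * ($y - $meanY);
    }
    if ($sxx == 0.0) {
        throw new InvalidArgumentException('All x values are equal; the slope is undefined.');
    }

    $slope = $sxy / $sxx;
    return array('intercept' => $meanY - $slope * $meanX, 'slope' => $slope);
}

$fit = fit_line_centered(array(array(0, 1), array(1, 3), array(2, 5), array(3, 7)));
printf("y = %g + %g x\n", $fit['intercept'], $fit['slope']);
```

Unlike the key => value layout, a list of pairs also allows repeated or non-integer x values.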


FryGuy  Re: Find a “best fit” equation  #7  Implemented from wiki page, untested.
    // $data maps each x value (array key) to its y value
    $sx = 0;
    $sy = 0;
    $sxy = 0;
    $sx2 = 0;
    $n = count($data);
    foreach ($data as $x => $y)
    {
        $sx += $x;
        $sy += $y;
        $sxy += $x * $y;
        $sx2 += $x * $x;
    }
    // slope, then intercept, of y = alpha + beta x
    $beta = ($n*$sxy - $sx*$sy) / ($n*$sx2 - $sx*$sx);
    $alpha = $sy/$n - $sx*$beta/$n;
    echo "y = $alpha + $beta x";
posted date: 20081212 16:43:00


ruquay  Re: Find a “best fit” equation  #8  Of additional interest is probably how good a fit the line is. For that, use the Pearson correlation, here in a PHP function:
    /**
     * returns the pearson correlation coefficient (least squares best fit line)
     *
     * @param array $x array of all x vals
     * @param array $y array of all y vals
     */
    function pearson(array $x, array $y)
    {
        // only keys present in both arrays are used
        $keys = array_keys(array_intersect_key($x, $y));
        // number of values actually summed
        $n = count($keys);
        // get all needed values as we step through the common keys
        $x_sum = 0;
        $y_sum = 0;
        $x_sum_sq = 0;
        $y_sum_sq = 0;
        $prod_sum = 0;
        foreach ($keys as $k)
        {
            $x_sum += $x[$k];
            $y_sum += $y[$k];
            $x_sum_sq += pow($x[$k], 2);
            $y_sum_sq += pow($y[$k], 2);
            $prod_sum += $x[$k] * $y[$k];
        }
        // no common keys: nothing to correlate
        if ($n == 0) return 0;
        $numerator = $prod_sum - ($x_sum * $y_sum / $n);
        $denominator = sqrt( ($x_sum_sq - pow($x_sum, 2) / $n) * ($y_sum_sq - pow($y_sum, 2) / $n) );
        return $denominator == 0 ? 0 : $numerator / $denominator;
    }
posted date: 20081212 16:45:00


ruquay  Re: Find a “best fit” equation  #9  btw, the Pearson coefficient ranges from -1.0 to 1.0: 0 means no linear correlation, and 1.0 or -1.0 means the points lie exactly on a straight line (rising or falling, respectively) posted date: 20081212 16:46:00


Muhd  Re: Find a “best fit” equation  #10  +1 This is by far the best answer; the other method is vastly inferior, albeit more popular. posted date: 20110608 10:02:00

