Friday, February 08, 2013

Results Are In

I scored 46/50 on yesterday's test.  The test was open book/notes, but the bad news is that there are no bonus or extra credit points on it to make up for the goofy errors I made.  I'm still satisfied with the results.

Grades in that course come 90% from tests and 10% from a project.  My instructor approved my project yesterday:  I'm going to try multiple regression to determine if I can predict an NFL team's score in a game if I'm given only total yards, total number of offensive plays, and number of turnovers.  If my model works, I can try it out next season as a predictor for which teams should win--and see if I do better than 50/50.  If nothing else, it's fun :)
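
For readers curious what such a multiple regression might look like in practice, here is a minimal sketch using ordinary least squares in NumPy. The function names (`fit_score_model`, `predict_score`) are hypothetical, and the numbers are made up purely so the example runs; they are not real NFL data, and this is not necessarily how the project will be carried out.

```python
import numpy as np

def fit_score_model(yards, plays, turnovers, points):
    """Least-squares fit of: points ~ b0 + b1*yards + b2*plays + b3*turnovers."""
    # Design matrix with a leading column of ones for the intercept b0.
    X = np.column_stack([np.ones(len(yards)), yards, plays, turnovers])
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(points, dtype=float), rcond=None)
    return coeffs  # array [b0, b1, b2, b3]

def predict_score(coeffs, yards, plays, turnovers):
    """Apply the fitted coefficients to one game's stats."""
    return coeffs[0] + coeffs[1] * yards + coeffs[2] * plays + coeffs[3] * turnovers

# Made-up illustration data (NOT real game stats).
yards = np.array([310, 420, 275, 390, 350, 460])
plays = np.array([58, 70, 55, 62, 66, 68])
turnovers = np.array([1, 3, 0, 2, 1, 2])
# Synthetic "scores" built from a known rule, so the fit can be checked.
points = 0.05 * yards + 0.1 * plays + 3.0 * turnovers

coeffs = fit_score_model(yards, plays, turnovers, points)
estimate = predict_score(coeffs, yards=400, plays=65, turnovers=2)
```

Because the synthetic scores follow an exact linear rule, the fit should recover that rule's coefficients; real game data would be noisy, and the fit would only be approximate.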

6 comments:

maxutils said...

You may have collinearity problems if you use both yards and plays ... you might also consider net turnovers rather than gross ... a team that turns the ball over 6 times should lose ... unless their opponent turns it over 9 times.
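
One common way to check for the collinearity mentioned here is the variance inflation factor (VIF): regress each predictor on the others and see how much of it they explain. The sketch below is an illustration only; the helper name `vif` and the numbers are assumptions made up for the example, not real game stats.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (one row per game)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        # Regress column j on all the other columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        ss_res = resid @ resid
        ss_tot = ((target - target.mean()) ** 2).sum()
        # VIF = 1 / (1 - R^2), which simplifies to ss_tot / ss_res.
        factors.append(ss_tot / ss_res)
    return np.array(factors)

# Made-up data (NOT real stats): plays track yards closely, turnovers do not.
yards = [300, 350, 400, 450, 300, 350, 400, 450]
plays = [55, 60, 66, 70, 56, 61, 65, 71]
turnovers = [0, 3, 1, 2, 2, 1, 3, 0]

factors = vif(np.column_stack([yards, plays, turnovers]))
```

A VIF near 1 means a predictor is nearly independent of the others. Values above roughly 5 to 10 are a common rule-of-thumb warning sign, although that threshold is a convention, not a hard rule.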

Darren said...

Actually, I'll use their opponent's number of turnovers.

I've considered the collinearity problems; I just don't have anything better to use. I considered time of possession, but our school's former football coach said number of offensive plays would be better.

maxutils said...

That's where the joy of composite variables comes in ... having done my thesis in a similar area, my econometrics background taught me that the first thing a model should be is theoretically sound. Using two related offensive variables could seriously skew your results. Why not use % of time possessed, multiplied by yardage, + offensive plays as one variable? Definitely more work. I have no doubt that turnovers are going to be statistically significant -- whether for, or against, or net -- but I bet you get the best result with net, second with opposing team, and last with home team. By the way, Bill James' Stats Inc. in Baltimore is where to go for data ... they will do it digitally for you.
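
A small sketch of how the composite and net-turnover variables suggested here could be computed. It takes one possible reading of the formula, (possession share x yards) + plays; the function names and sample inputs are hypothetical and only there for illustration.

```python
def composite_offense(possession_share, yards, plays):
    """One reading of the suggestion: (share of time possessed * yards) + plays.

    possession_share is a fraction between 0 and 1 (e.g. 0.5 for 50%).
    """
    return possession_share * yards + plays

def net_turnovers(own_turnovers, opp_turnovers):
    """Positive when the opponent gave the ball away more often than we did."""
    return opp_turnovers - own_turnovers

# Hypothetical inputs: 50% possession, 400 yards, 60 offensive plays.
offense = composite_offense(0.5, 400, 60)
# The example from the earlier comment: 6 turnovers against the opponent's 9.
net = net_turnovers(6, 9)
```

Note that this composite adds quantities on different scales (yards and play counts), so whether it behaves better than the separate variables would need to be checked against actual data.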

Steve USMA '85 said...

So you are going strictly for predicting the score of the team you are gathering the stats for, correct? From what you write, you're not concerned with the points scored by the other team. Looks like you are constrained by not knowing what the other team does.

What other variables do you have available? I would think stats such as number of receivers who caught a pass would be an indicator, along with average yards/catch. Another indicator could be how the offense does compared to normal. For example, if they normally have a 60/40 run/pass ratio, how close to that did they come? If they switch to 40/60, they are probably behind and need quick points to catch up.

As to turnovers, I don't know that this stat is as good an indicator of how many points you score as it is of whether the team will win.

Of course, we can all analyze this to death. If there were a good predictor, Las Vegas would have figured it out already. Curious to see your results, Darren. I've done similar projects over time as both student & instructor.

Anonymous said...

Darren: "Actually, I'll use their opponent's number of turnovers."

You want to use both, right? A turnover by your team would cost you points in any sane model, and a turnover by the other team would gain you points.

A question:
*) Does a team gain credit for "yards gained" when returning a fumble/interception?

NOTE: The model you are describing will apply to a game that has already been played ... unless you expect to be able to predict yards/plays/turnovers, predicting a game before you have the stats won't work. Right?

-Mark Roulo

maxutils said...

Mark ... I had the same issue, but I think the idea is to use past stats to predict likelihood of wins ... a perfect fit would be to use net points scored as a predictor of wins. But now I sound like a geekier version of Troy Aikman.