class: center, middle, inverse, title-slide

# An Interpretable Method of Learning Stochastic Game Dynamics
## Nicholas Ho, Sifan Tao, Adhvaith Vijay
### Advisor : Kostas Pelechrinis
### 2021/11/6

---
<!-- Nick -->
## Introduction / Motivation

--

- Soccer is complicated

--

- StatsBombR data only has tracking data of the ball, but not players
- Avoid a player-location agnostic approach to modeling ball statistics

--

- Use potential functions (underlying functions that guide forces) to model ball dynamics
- Use this to approximate final game outcomes

---
<!-- Sifan -->
## [StatsBombR Data](https://github.com/statsbomb/StatsBombR)

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> team </th>
   <th style="text-align:left;"> time </th>
   <th style="text-align:left;"> type </th>
   <th style="text-align:right;"> location.x </th>
   <th style="text-align:right;"> location.y </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Barcelona </td>
   <td style="text-align:left;"> 00:02:40.501 </td>
   <td style="text-align:left;"> Dribble </td>
   <td style="text-align:right;"> 66.2 </td>
   <td style="text-align:right;"> 11.1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Barcelona </td>
   <td style="text-align:left;"> 00:00:37.890 </td>
   <td style="text-align:left;"> Foul Committed </td>
   <td style="text-align:right;"> 57.2 </td>
   <td style="text-align:right;"> 66.3 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Getafe </td>
   <td style="text-align:left;"> 00:01:39.139 </td>
   <td style="text-align:left;"> Interception </td>
   <td style="text-align:right;"> 27.7 </td>
   <td style="text-align:right;"> 41.8 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Valencia </td>
   <td style="text-align:left;"> 00:48:06.967 </td>
   <td style="text-align:left;"> Offside </td>
   <td style="text-align:right;"> 86.3 </td>
   <td style="text-align:right;"> 23.9 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Barcelona </td>
   <td style="text-align:left;"> 00:00:00.229 </td>
   <td style="text-align:left;"> Pass </td>
   <td
style="text-align:right;"> 61.0 </td>
   <td style="text-align:right;"> 40.1 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Barcelona </td>
   <td style="text-align:left;"> 00:11:44.435 </td>
   <td style="text-align:left;"> Shot </td>
   <td style="text-align:right;"> 93.2 </td>
   <td style="text-align:right;"> 44.4 </td>
  </tr>
</tbody>
</table>

#### Data provided by [StatsBomb](https://github.com/statsbomb/open-data)

---
<!-- Sifan 1-->
## Random Walk

A sequence of steps in random directions on some mathematical space.

- A point randomly moves along the integer line
- A point randomly moves on the x-y plane

--

<center><img src="./pics/random_walk.png" width = "400px"/>

---
<!-- Sifan 2-->
## Potential Function

- Idea: The ball drifts randomly but is attracted to the goal by some "force".

--

- A ball placed in this vector field will move along the arrows of the vector field.

<br>
<center>![](https://cdn.kastatic.org/googleusercontent/wBalF7MINVVbEksIOvIBlO6RupAqyg5yY7nkoZ5p3bPTBCGjkrWAyTQtspKaMm9BIWRlMKj089R3R15hBDI5yFY)
<br>

.footnote[Khan Academy]

---
<!-- Sifan 3-->
## Potential Function

- We can try to model this underlying force based on the movements of the ball

--

- Our Potential Functions
  - Gravity

$$ V(x,y) = -\frac{G}{\sqrt{x^2 + y^2}} $$

---
## Random Walk under Harmonic Potential Function

Use the potential function to guide the random walk

--

- A step by the random particle under a force.
`$$r(t_{i+1}) - r(t_i) = - \nabla V(r(t_i)) (t_{i+1} - t_i) + \sigma \sqrt{(t_{i+1} - t_i)} Z_{i+1}$$`

- `\(r(t_i)\)`: location of the ball at time `\(t_i\)`
- `\(V(r(t_i))\)`: potential function evaluated at the ball's location at time `\(t_i\)`
- `\(Z_{i}\)`: standard Gaussian

--

<br>

- Small_Change = Estimated_Velocity x TimeStep + Noise

<!-- END OF SIFAN SLIDES -->

---
<!-- Nick -->
## Learning Potential Functions from Trajectories

- [Learning a Potential Function from a Trajectory - Brillinger](https://statistics.berkeley.edu/sites/default/files/tech-reports/723.pdf)
- Assumptions

--

- Overdamped system: friction is high, so force affects the velocity, not the acceleration.
- `\(V(x,y)\)` can be approximated as a linear combination of basis functions

---
## The Basis Functions

- Our basis functions are a set of gravitational points on the field
- The coefficients scale the strength of each hole (attractive or repulsive)

$$V(x,y) = \sum_i \frac{\beta_i}{\sqrt{(x-x_i)^2 + (y-y_i)^2}} $$

<img src="./pics/basis_pic.png" width = "720px"/>

---
## Overview of the Potential Fitting Model

`$$V(x,y) = \phi(x,y)^T \beta$$`
`$$Force = -\nabla V(x,y) = -\nabla \phi(x,y)^T \beta$$`

--

`$$Velocity \sim Force$$`

--

`$$\frac{dr(x,y)}{dt} = - \nabla V(x,y)$$`
`$$\frac{dr(x,y)}{dt} = -\nabla \phi(x,y)^T \beta$$`

This is just linear regression in `\(\beta\)`!

---
<!-- Nick -->
## Learning Potential Functions from Trajectories
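
A minimal sketch of the regression step, assuming velocities are already available as finite differences of consecutive ball locations. It regresses the velocity components on `\(-\nabla \phi(x,y)\)` by ordinary least squares. The function names, the softening term `eps`, the basis centers, and the synthetic data are all illustrative assumptions, not the code behind these slides.

```python
import random


def neg_grad_basis(x, y, centers, eps):
    """Return (-d phi_k/dx, -d phi_k/dy) for every basis center.

    phi_k(x, y) = 1 / sqrt((x - x_k)^2 + (y - y_k)^2 + eps^2); eps is a
    softening term (an assumption) that avoids the singularity at a center.
    """
    gx, gy = [], []
    for cx, cy in centers:
        dx, dy = x - cx, y - cy
        denom = (dx * dx + dy * dy + eps * eps) ** 1.5
        gx.append(dx / denom)
        gy.append(dy / denom)
    return gx, gy


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_beta(samples, centers, eps=5.0):
    """Ordinary least squares for beta in velocity = -grad(phi)^T beta.

    samples: iterable of (x, y, vx, vy) observations.
    """
    k = len(centers)
    ata = [[0.0] * k for _ in range(k)]  # normal-equation matrix A^T A
    atb = [0.0] * k                      # right-hand side A^T v
    for x, y, vx, vy in samples:
        gx, gy = neg_grad_basis(x, y, centers, eps)
        # The x and y velocity components each contribute one regression row.
        for row, target in ((gx, vx), (gy, vy)):
            for p in range(k):
                atb[p] += row[p] * target
                for q in range(k):
                    ata[p][q] += row[p] * row[q]
    return solve_linear(ata, atb)


# Synthetic check on a 120 x 80 pitch with hypothetical centers and coefficients.
rng = random.Random(0)
centers = [(30.0, 40.0), (60.0, 40.0), (90.0, 40.0)]
beta_true = [-50.0, 20.0, -100.0]
samples = []
for _ in range(200):
    x, y = rng.uniform(0, 120), rng.uniform(0, 80)
    gx, gy = neg_grad_basis(x, y, centers, 5.0)
    vx = sum(b * g for b, g in zip(beta_true, gx))
    vy = sum(b * g for b, g in zip(beta_true, gy))
    samples.append((x, y, vx, vy))
beta_hat = fit_beta(samples, centers, eps=5.0)
```

With noise-free synthetic velocities, `beta_hat` should match `beta_true` up to rounding. On real event data the fit would be noisy, and a negative coefficient corresponds to an attractive point.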
---
<!-- Nick -->
## Simulating Games using the learned Potential Functions

- Overlay (add) the offensive and defensive potential coefficients, then simulate
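
A rough sketch of one simulated ball trajectory, using the random-walk step `\(r(t_{i+1}) - r(t_i) = -\nabla V \, \Delta t + \sigma \sqrt{\Delta t} \, Z\)` with additively overlaid coefficients. The function names, the softening term `eps`, the clamping at the pitch boundary, the 120 x 80 pitch size, and every numeric value are assumptions for illustration. Possession swaps and shot models are not included.

```python
import math
import random


def drift(x, y, centers, beta, eps):
    """-grad V at (x, y) for V = sum_k beta_k / sqrt(d_k^2 + eps^2)."""
    fx = fy = 0.0
    for (cx, cy), b in zip(centers, beta):
        dx, dy = x - cx, y - cy
        denom = (dx * dx + dy * dy + eps * eps) ** 1.5
        fx += b * dx / denom
        fy += b * dy / denom
    return fx, fy


def simulate(start, centers, beta_off, beta_def, n_steps, dt, sigma, rng,
             eps=5.0, pitch=(120.0, 80.0)):
    """Random walk under the potential: step = -grad V * dt + sigma * sqrt(dt) * Z."""
    # Additive overlay of the two coefficient sets (hypothetical inputs).
    beta = [o + d for o, d in zip(beta_off, beta_def)]
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        fx, fy = drift(x, y, centers, beta, eps)
        x += fx * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        y += fy * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Clamp to the pitch; this boundary rule is an assumption.
        x = min(max(x, 0.0), pitch[0])
        y = min(max(y, 0.0), pitch[1])
        path.append((x, y))
    return path


# One attractive point at the goal mouth (hypothetical location and strengths).
goal_centers = [(120.0, 40.0)]
noise_free = simulate((60.0, 40.0), goal_centers, [-300.0], [-200.0],
                      n_steps=100, dt=0.1, sigma=0.0, rng=random.Random(1))
noisy = simulate((60.0, 40.0), goal_centers, [-300.0], [-200.0],
                 n_steps=100, dt=0.1, sigma=2.0, rng=random.Random(1))
```

With `sigma=0.0` the ball drifts steadily toward the attractive point. With `sigma > 0` each run gives a different realization, which is what makes repeated simulation produce a distribution of outcomes.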
---
## Potential Surface Contour

- Contour map of Arsenal's defensive and Manchester's offensive potential surfaces
- Shows how the maps change over time

<img src="./pics/3d_2d.gif" width="100%" style="display: block; margin: auto;" />

---
## One realization of a game simulation

- The offensive and defensive coefficients are swapped based on possession
- Heatmap of a simulated game trajectory

<img src="./pics/trajectory.png" width = "600px"/>

---
<!-- Adhvaith and Nick-->
### Simulated Game Shot Example

<!-- <img src="./pics/min16.png" width = "600px"/> -->
<img src="./pics/first_goal.gif" width="100%" />

---
### Web App to Generate Simulations

- https://cmsac-soccer-potential.herokuapp.com/
- Select teams and run simulations

---
### Connecting Statistical Mechanics to Soccer Potential Functions

- By setting up our model this way, we can find the stationary distribution of the particle

$$ P(E) = \frac{1}{Z} \exp{\left[\frac{-E}{k_B T}\right]} $$

- Thus we know the equilibrium probability distribution conditioned on time.
- Step size and noise are directly influenced by temperature

--

- We can also marginalize over time to get a time-independent equilibrium distribution

`$$\sum_{t = 0}^{t = 89} P_{\pi} (x,y | t) dt$$`

- Instead, we opt to sample score differentials by directly simulating games; this way we can add more detailed submodels

---
### Now we can sample score distributions

- The model can provide a distribution of score differentials instead of a point estimate
- May be unimodal or multimodal based on the teams competing

<img src="./pics/Score_Distributions.png" width="90%" />

---
## Comparisons with Other Models

<table>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> Baseline Model </th>
   <th style="text-align:right;"> Poisson Regression Model </th>
   <th style="text-align:right;"> Our Potential Model </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> MSE </td>
   <td style="text-align:right;"> 3.597 </td>
   <td style="text-align:right;"> 3.607 </td>
   <td style="text-align:right;"> 3.379 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Residual Variance </td>
   <td style="text-align:right;"> 3.388 </td>
   <td style="text-align:right;"> 3.519 </td>
   <td style="text-align:right;"> 3.367 </td>
  </tr>
</tbody>
</table>

<img src="./pics/baseline_comparisons.png" width = "1500" height = "500"/>

---
## Comparisons Summary

- The averaged baseline model is biased and fails to capture larger score differentials
- The Poisson regression model is better at predicting larger score differentials, but has larger residual variance
- Our model can predict larger score differentials and has a lower residual variance

--

- Our model ("trained" on velocities) does just as well as models trained directly on the scores

---
<!-- Nick -->
## Discussion and Future Work

- Potential functions are a viable way to summarize ball dynamics, as can be seen from the score predictions.
- Make the model more expressive by using other basis functions
- Lots of potential to improve the framework with more accurate sub-models
  - Shot Decision model and Goal Decision model
- Incorporate team formation and score differential into the potential function
- This framework can be used to determine other things, such as:
  - Player-based potential functions to determine player impacts
- This framework of using potential energy can be further tied to methods in statistical mechanics
  - For example, using the idea of "free energy" to quantify the value of certain trajectories.

---
<!-- Nick -->
## Appendix: Game Simulation