Wednesday, 17 February 2021

Using Machine Learning to Identify T20 Opener Styles

There is little doubt that getting off to a good start in T20 cricket, as in any format, is crucial to being a successful team. Teams often pick their best batsmen to open so that they face as many balls as possible in the shortened format. However, history shows that there are several different types of successful T20 opener - Virat Kohli and David Warner play a very different style of innings from Sunil Narine, but given the right balance around them, all three are very effective opening batsmen.


I wanted to look at finding a way to identify these different styles of opening batsmen using a simple machine learning technique. The important thing here is that I am looking to identify the styles of opening batsmen, not necessarily the quality of the batsman. Whilst that may come through within some of the styles, that is not the direct purpose of this. A certain style of opener may work well in certain teams based on their particular balance of other players whereas another team may be looking for something different. If you have a team of big hitters, an opener that plays as more of an accumulator might be what you are looking for to provide the stability for those other hitters. If you have accumulators in the middle order, you might want a more aggressive opener.


To achieve this, we will use K-means clustering. I will not go into the details here, but at a basic level, it groups players so that each player sits in the cluster whose central point is closest to his own numbers. For the data, I took every player who had opened the batting in at least 10 innings in a decent-standard T20 competition (IPL, Big Bash, T20 Blast, PSL, CPL, BPL, Super Smash and internationals) since the start of 2018. For players who have batted in multiple positions, I only took their stats from the matches where they opened. This gave us a dataset of 128 batsmen. So, let's take a look at what we got from the algorithm.
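For anyone wanting to reproduce the selection step, here is a minimal sketch of the filter described above. The record layout (`player`, `date`, `position`) is purely an assumption for illustration, as is treating batting positions 1 and 2 as "opening"; it is not the actual data format used for this post.

```python
from collections import defaultdict
from datetime import date

MIN_INNINGS = 10            # at least 10 innings as an opener
START = date(2018, 1, 1)    # since the start of 2018


def select_openers(innings):
    """Keep only opening innings since START, then drop players with too few.

    `innings` is an iterable of dicts with hypothetical keys:
    'player' (str), 'date' (datetime.date) and 'position' (int, 1 or 2 = opener).
    """
    by_player = defaultdict(list)
    for inn in innings:
        if inn["position"] in (1, 2) and inn["date"] >= START:
            by_player[inn["player"]].append(inn)
    # Only players with enough opening innings make it into the dataset.
    return {p: rows for p, rows in by_player.items() if len(rows) >= MIN_INNINGS}
```

Innings played in other batting positions are simply ignored, so a player who usually bats at three only qualifies on the strength of the matches where he opened.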



Each cluster is represented by a different colour, the shapes mark the boundaries of each cluster and each cluster has a central point that represents it. We can see that we have six different clusters, although there is some overlap between three of the clusters. Hopefully we can see some similarities between the styles of the players in each cluster.
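The clustering itself can be sketched in a few lines. This is a plain-Python version of the standard K-means loop (Lloyd's algorithm), not the exact code behind the chart above. Standardising the features first is an assumption on my part here, as are the function names and the random starting points.

```python
import random
from statistics import mean, pstdev


def standardise(rows):
    """Z-score each column so no single feature dominates the distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by 0
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, max_iter=100):
    """Return (labels, centroids) after running Lloyd's algorithm."""
    rng = random.Random(seed)
    # Start from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids
```

With one row of numbers per batsman, something like `kmeans(standardise(rows), k=6)` would return a cluster label for each player plus the six central points. The result can depend on the starting points, so it is worth trying a few different seeds.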


Cluster 1 (The Power Hitters)



Players: Sunil Narine, Ed Pollock, Rahkeem Cornwall, Moeen Ali and Finn Allen

The first cluster represents the pure immediate power hitters. There are only 5 players in this cluster, and their most identifying feature is an incredible strike rate right from the very start of their innings. Their innings do not necessarily tend to be the longest, but they are not going to waste any balls. They have the power and the freedom to find the boundary regularly and they can get your team off to a blistering start, although you may also lose early wickets.


Cluster 2 (Powerplay Specialists)



Players: Jason Roy, Parthiv Patel, James Vince, Usman Khawaja, Suryakumar Yadav, Liton Das, Joe Clarke, Adam Rossington, Steven Davies, Neil Broom, Neil Dexter, George Worker, Mark Cosgrove, Afif Hossain, Ken McClure, David Lloyd, Junaid Siddique, Tim Seifert, Johnson Charles, Sam Heazlett and Mehidy Hasan

The obvious feature of the second cluster is their struggles outside of the powerplay overs. They consist of players that are relatively capable of picking the gaps during the powerplay as shown by a reasonably good fours percentage and a solid strike rate during this period, but lack the power to really push on once the fielding restrictions are relaxed. Having these players in your team is not a major problem, but you do run the risk of a real slowdown once the powerplay is over.

Cluster 3 (Attacking Openers)


Players: Jos Buttler, Aaron Finch, Colin Munro, Alex Hales, Phil Salt, Ben Stokes, Tom Banton, Mayank Agarwal, Luke Ronchi, Mohammad Shahzad, Chad Bowes, Graham Clark, Adam Wheater, Anton Devcich, Adam Lyth, Josh Inglis, Richard Levi, Johann Myburgh, Paul Stirling, Zak Crawley, Wriddhiman Saha, Hamish Rutherford, Ben Duckett, Rilee Rossouw, Miles Hammond, Aneurin Donald and Max Bryant

This is a cluster of attacking openers that are able to sustain that approach. They are certainly not quite as outwardly aggressive as the power hitters, but they are able to start attacking early in their innings and sustain that outside the powerplay. It is no surprise to see plenty of players in this group that would be considered as very good T20 openers that are capable of big scores. They may not necessarily have the range of shots to crank up the strike rate to insane levels, but they are very capable of starting well and continuing to score quickly once the fielding restrictions are relaxed.

Cluster 4 (Poor Openers)


Players: Shaun Marsh, Faf du Plessis, Hashim Amla, Andre Fletcher, Michael Klinger, Max Holden, Michael Pollard, Dom Sibley, Tamim Iqbal, Sean Solia, Chadwick Walton, Chandrapaul Hemraj, Daniel Hughes, Marcus Harris, Ben Dunk, Dane Cleaver, Harry Swindells, Mackenzie Harvey, Anamul Haque, Ahmed Shehzad, Luis Reece, Chris Nash, Jack Edwards and Nic Maddinson

I know I said that the aim of this was not to make judgments on the quality of players, but this cluster has all the hallmarks of those that you do not really want opening the batting for your T20 team. They are very slow starters, struggle to find the boundary and tend not to have the power to push on once they are settled. They are capable of making decent scores, but these are often the 60-off-50-balls type of score that you look at and think was likely a match-losing innings given the circumstances, regardless of what the commentators may claim.

Cluster 5 (The Accumulators)


Players: Virat Kohli, KL Rahul, Quinton de Kock, David Warner, Shikhar Dhawan, Babar Azam, Matthew Wade, Tom Latham, Stevie Eskinazi, Tom Kohler-Cadmore, Devon Conway, Josh Philippe, Tom Westley, Ian Bell, D'Arcy Short, Daniel Bell-Drummond, Scott Steel, Marcus Stoinis, Luke Wright, Ajinkya Rahane, Imam-ul-Haq, Alex Carey, Rahul Tripathi, Joe Denly, Alex Davies, Devdutt Padikkal, Shubman Gill, Varun Chopra and Billy Godleman

The clear standout feature of this cluster is the average number of balls faced - at 25.5 balls per innings, it is more than five balls higher than in any other cluster. These are the players that play long innings in this format and do it at a reasonable rate. They may not have the power to really go crazy, but they are good anchor players to build your power players around, and their scoring rate will not prove too much of an issue until the late stages of the innings.
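Comparisons like this come from averaging a feature within each cluster. A minimal sketch, assuming you already have a cluster label and an average balls-faced figure for each batsman (the function name and inputs are hypothetical):

```python
from collections import defaultdict
from statistics import mean


def cluster_means(labels, values):
    """Average `values` within each cluster.

    labels[i] is the cluster of batsman i and values[i] is his figure for the
    feature of interest, e.g. average balls faced per innings.
    """
    groups = defaultdict(list)
    for lab, v in zip(labels, values):
        groups[lab].append(v)
    return {lab: mean(vs) for lab, vs in groups.items()}
```

Running this once per feature gives a small table of cluster profiles, which is the easiest way to see what sets one group apart from the others.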

Cluster 6 (The Accelerators)


Players: Chris Gayle, Jonny Bairstow, Chris Lynn, Rohit Sharma, Martin Guptill, Brendon McCullum, Evin Lewis, Dawid Malan, Liam Livingstone, Kamran Akmal, Brandon King, Cameron Delport, Will Jacks, Glenn Phillips, Prithvi Shaw, Riki Wessels, Shane Watson, Sharjeel Khan, Lendl Simmons, Ambati Rayudu, Fakhar Zaman and Jake Weatherald

This is a cluster of players that can both frustrate and excite. Only the players in cluster 4 start their innings more slowly than this group, but only the pure power hitters score faster once they have their eye in. They know that they have the power to clear the boundary when they get going, so they take their time early to ensure that they are seeing the ball well. This can lead to problematic innings if they are dismissed around the 10-15 ball mark, but if they bat long, you can be pretty certain of a rapid acceleration.
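To put a number on "slow start, fast finish", you can split each innings at a fixed ball count and compare the strike rates on either side. This is only a sketch: the 10-ball cut-off and the ball-by-ball input format are my assumptions for illustration, not necessarily the features used in the clustering.

```python
def strike_rate(runs, balls):
    """Runs per 100 balls, or None if no balls were faced."""
    return 100.0 * runs / balls if balls else None


def phase_strike_rates(innings, settle_after=10):
    """Return (early, late) strike rates across a batsman's innings.

    `innings` is a list of innings, each a list of runs scored off every
    legal ball faced, in order. Balls up to `settle_after` count as the
    start of the innings; everything after that counts as 'settled'.
    """
    early_runs = early_balls = late_runs = late_balls = 0
    for balls in innings:
        early, late = balls[:settle_after], balls[settle_after:]
        early_runs += sum(early)
        early_balls += len(early)
        late_runs += sum(late)
        late_balls += len(late)
    return strike_rate(early_runs, early_balls), strike_rate(late_runs, late_balls)
```

A large gap between the two numbers is the signature of this cluster, whereas the power hitters would be high on both and the poor openers low on both.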

Team Structure

Each cluster has its distinguishing features, and each has its positives and negatives. As good as some of the players in cluster 5 are, ideally you might not want two openers of that style unless you have an outstanding bowling attack. The players in cluster 6 give you excellent upside, but if you had two players from that cluster, you would run a major risk of getting off to a very slow start that could be fatal if you lose wickets at the wrong time. Cluster 1 will undoubtedly get you off to a rapid start, but you have to accept that you are also likely to lose early wickets.

Ahead of writing this, I asked around on Twitter as to who people considered to be the 'best' T20 opening batsmen in recent years. The four most mentioned players were KL Rahul, Jos Buttler, Chris Gayle and David Warner - all very valid answers. However, the most interesting aspect of that is that these four players fall across three different clusters, further showing that the style of opener alone on the whole does not make a batsman 'good' or 'bad'.

Knowing the different types of opener allows you to work out how to structure the opening pair and the rest of your team. Knowing the quality of players within each of these styles clearly requires a different way of analysing the data though. Whilst Virat Kohli and Billy Godleman are classified into the same style of play, nobody would argue that Kohli is not a far superior player. However, the balance of styles within a team is crucial and you could actually weaken your overall team by just crowding in the 'best' players if their styles do not mesh well within the greater team structure. This is something worth considering for the IPL teams ahead of tomorrow's auction.
