<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Prime Sports ]]></title><description><![CDATA[Sports betting content and opinion]]></description><link>https://journal.primesports.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png</url><title>Prime Sports </title><link>https://journal.primesports.com</link></image><generator>Substack</generator><lastBuildDate>Fri, 08 May 2026 10:39:53 GMT</lastBuildDate><atom:link href="https://journal.primesports.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Prime Sports]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[primesports@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[primesports@substack.com]]></itunes:email><itunes:name><![CDATA[Prime Sports]]></itunes:name></itunes:owner><itunes:author><![CDATA[Prime Sports]]></itunes:author><googleplay:owner><![CDATA[primesports@substack.com]]></googleplay:owner><googleplay:email><![CDATA[primesports@substack.com]]></googleplay:email><googleplay:author><![CDATA[Prime Sports]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Series (V): T20 cricket to diversify bankroll deployment]]></title><description><![CDATA[Final chapter and code]]></description><link>https://journal.primesports.com/p/series-v-t20-cricket-to-diversify</link><guid isPermaLink="false">https://journal.primesports.com/p/series-v-t20-cricket-to-diversify</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Mon, 01 Jul 2024 14:30:51 
GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!h1XF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the last couple of articles, we&#8217;ve done some exploratory analysis on cricket stats to help us understand player performance. Today, we&#8217;re going to use the metrics we created to build our very first model and see how it performs.</p><p>To recap, we&#8217;ve identified a handful of per-delivery metrics for both batters and bowlers. Batter metrics are as follows:</p><ul><li><p>Runs per delivery: a base rate of a batter&#8217;s offensive contribution</p></li><li><p>Average number of deliveries: a measure of a batter&#8217;s longevity in their at-bat</p></li><li><p>Fours rate: how many fours per delivery a batter hits</p></li><li><p>Sixes rate: how many sixes per delivery a batter hits</p></li></ul><p>Bowler metrics are as follows:</p><ul><li><p>Runs per delivery allowed: how much offense a bowler allows</p></li><li><p>Wide rate: a bowler&#8217;s propensity to give up extras</p></li><li><p>Fours allowed rate: how often a bowler gives up fours</p></li><li><p>Sixes allowed rate: how often a bowler gives up sixes</p></li><li><p>Wicket rate: how many wickets per delivery a bowler takes</p></li></ul><p>To use these metrics in a predictive model, we have to put ourselves in the position of predicting a game that hasn&#8217;t been played yet. If two teams are about to play each other, we have to compute the historical record of these metrics for every player in the match and condense it into a single number per metric for each player. 
We&#8217;ll use a simple rolling average of each batter&#8217;s and bowler&#8217;s last 10 matches, computed for every one of these metrics, as the base inputs to our model.</p><p>Model selection is a much bigger topic than this series can cover: there are numerous tradeoffs between simplicity, interpretability, accuracy, and computational cost across the different modeling approaches out there. The main use of our first pass model is to develop better intuition about the predictive power of each of these metrics, so if we&#8217;re less concerned about matching the modeling approach to the specifics of the sport (aka no simulation of outcomes required), one approach is to just toss all of these metrics into a black-box non-linear model and see how it ranks each feature in terms of importance. <a href="https://xgboost.readthedocs.io/en/stable/">XGBoost</a> is a popular model choice: it is a gradient-boosted decision tree library, engineered for speed, that fits non-linear relationships in data very well. This is a sensible choice for our first pass, but to use it, we need to make sure we set up our data set correctly so that the predictions are meaningful.</p><p>One of the biggest sources of bad models is <a href="https://en.wikipedia.org/wiki/Overfitting">overfitting</a>, where the model latches onto features or data points that happen to correlate strongly with the outcome in the training data but don&#8217;t have as much predictive power in out-of-sample observations. A particularly nasty cause is leakage, where advance knowledge of the outcome accidentally slips into the data set. This can happen in sneaky ways that sometimes don&#8217;t look like overfitting at all. 
For example: in the data set we&#8217;re working with, the results are listed in terms of Team 1 and Team 2, but upon inspection, the order is not random: Team 1 is always the team batting first, and Team 2 is always the team batting second. When we are predicting future games, we have no idea which team will bat first, as that is determined by a coin toss whose winner chooses whether to bat or bowl first. In theory, if there is a structural advantage or disadvantage to batting first, a model whose player features are assigned by Team 1 and Team 2 would bake it in, so this is something we have to check for. Interestingly, the team batting first wins about 50% of its matches, so there does not appear to be any structural bias either way, but this is still a check worth incorporating.</p><p>So here&#8217;s our first approach: we calculate rolling average summary stats for all players in each match based on previous performances, toss all of those metrics into a black-box non-linear model, and try to predict the probability of Team 1 winning the match. 
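</p><p>As a rough sketch of the rolling-average step, and not the code from our repository, here is one way to attach a pre-match average to each player using only their earlier matches. The function name, the tuple layout, and the toy values are all hypothetical; the one thing taken from the article is the 10-match window.</p>
<pre>
```python
from collections import defaultdict, deque

WINDOW = 10  # the article's choice: each player's last 10 matches


def rolling_pre_match_average(rows, window=WINDOW):
    """Return one pre-match rolling average per row.

    rows: iterable of (match_order, player, value) tuples; this layout
    is hypothetical. The average attached to a match uses only that
    player's earlier matches, so the match being predicted never leaks
    into its own feature.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    averages = []
    for match_order, player, value in sorted(rows):
        past = history[player]
        # None marks a player with no prior matches (nothing to average yet).
        averages.append((match_order, player, sum(past) / len(past) if past else None))
        past.append(value)  # only now does this match join the history
    return averages
```
</pre>
<p>The key design choice is appending the current match to the history only after its feature has been recorded; averaging first and appending second is what keeps the outcome we are predicting out of its own inputs.</p><p>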
Here&#8217;s what the distribution of the predicted probabilities looks like:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!saNk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!saNk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!saNk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!saNk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!saNk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!saNk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png" width="392" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!saNk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!saNk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!saNk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!saNk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F04f88a70-0a53-4d4f-a31d-9524593fddc9_392x262.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Not bad! The probability is centered around 50%, which tracks with our overall average win rate identified in the paragraph above, and we show a healthy amount of dispersion in the estimated probabilities. (An example of a poor model would be a model that just predicted everything at 50% and shows no ability to show any kind of variation in probabilities.)</p><p>Another good check on model results is to see if they are <em>calibrated</em>, aka when a model says a team is 70% to win, the actual win rates should be close to 70%. 
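</p><p>To make the idea of calibration concrete, here is a minimal sketch, not the code behind our chart: it buckets predicted probabilities into equal-width bins and compares each bucket&#8217;s average prediction with its observed win rate. The function name, the five-bin default, and the inputs are our own assumptions for illustration.</p>
<pre>
```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width probability bins.

    Returns a list of (mean_predicted, observed_win_rate, count) for
    each non-empty bin. A calibrated model has the first two numbers
    close to each other in every bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, won in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, won))
    table = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        win_rate = sum(won for _, won in members) / len(members)
        table.append((round(mean_pred, 3), round(win_rate, 3), len(members)))
    return table
```
</pre>
<p>Plotting the first column against the second gives a calibration curve; points near the diagonal mean the stated probabilities can be taken roughly at face value.</p><p>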
Here&#8217;s what the calibration curve looks like for our first pass model:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hPkP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hPkP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 424w, https://substackcdn.com/image/fetch/$s_!hPkP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 848w, https://substackcdn.com/image/fetch/$s_!hPkP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 1272w, https://substackcdn.com/image/fetch/$s_!hPkP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hPkP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png" width="386" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/73c8c468-ab25-4779-b7da-de291592567d_386x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:386,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hPkP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 424w, https://substackcdn.com/image/fetch/$s_!hPkP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 848w, https://substackcdn.com/image/fetch/$s_!hPkP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 1272w, https://substackcdn.com/image/fetch/$s_!hPkP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73c8c468-ab25-4779-b7da-de291592567d_386x262.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The curve looks reasonably well calibrated, so this model has at least some predictive power: our features are doing something.</p><p>Now, the million dollar question: which of these metrics are most important? One of the downsides of a black-box model like XGBoost is that it doesn&#8217;t lend itself nicely to <em>weighting </em>the importance of metrics, so it can&#8217;t give you clean answers like &#8220;Runs per delivery is 5x more important than fours rate&#8221; the way a simpler model like logistic regression can. However, we can at least look at supplemental scores like <em>feature importance</em>, which here counts how often a metric is used to split the decision trees inside the overall model. We can use feature importance as a proxy to boost our understanding of which metrics matter for predicting cricket matches. 
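</p><p>To illustrate what a split count means, here is a toy sketch that walks a small ensemble of decision trees and tallies how many times each feature is used to split. The nested-dict tree layout and the function name are hypothetical stand-ins, not the structure of a real fitted model. In XGBoost&#8217;s Python API, this style of score corresponds, as far as we know, to importance_type="weight" in Booster.get_score.</p>
<pre>
```python
from collections import Counter


def split_counts(trees):
    """Count how many times each feature is used as a split.

    trees: list of nested dicts, a hypothetical stand-in for a fitted
    ensemble. An internal node looks like
    {"feature": name, "left": node, "right": node}; a leaf is
    {"leaf": value}.
    """
    counts = Counter()
    stack = list(trees)
    while stack:
        node = stack.pop()
        if "feature" in node:
            counts[node["feature"]] += 1
            stack.extend([node["left"], node["right"]])
    return counts
```
</pre>
<p>A count like this says how often the model reaches for a feature, not how much each use improves the prediction, which is one reason to treat it as a proxy rather than a true weighting.</p><p>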
Here&#8217;s what the feature scores look like for the top 30 most important features:&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h1XF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h1XF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 424w, https://substackcdn.com/image/fetch/$s_!h1XF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 848w, https://substackcdn.com/image/fetch/$s_!h1XF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 1272w, https://substackcdn.com/image/fetch/$s_!h1XF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h1XF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png" width="419" height="362" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:362,&quot;width&quot;:419,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h1XF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 424w, https://substackcdn.com/image/fetch/$s_!h1XF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 848w, https://substackcdn.com/image/fetch/$s_!h1XF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 1272w, https://substackcdn.com/image/fetch/$s_!h1XF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feaf68f40-74bf-4b12-b8e8-cb931f3181ad_419x362.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The mix of good and bad news: there&#8217;s no magical single feature that blows all the other ones out of the water in terms of their importance: good because it shows we need all the metrics we can get, and bad because it doesn&#8217;t give us a specific direction on what to focus on for developing our understanding of what matters in predicting cricket. But if we look a little closer, we can see runs per delivery appearing disproportionately in the list of important features, suggesting our boring basic stat is doing a lot of heavy lifting. This at least helps us narrow down where we should focus our future understanding: runs are runs at the end of the day, and the more we can refine our understanding of these runs, the better our metrics will be.</p><p>The last question some of you may be wondering: is this model any <em>good? </em>Can I take the results of this model and start hammering the books and turn it into a money printing machine? 
My guess is no: after all, we took some very simple metrics and haven&#8217;t made any of the numerous adjustments that enhance their predictive power. Knowing when a model is good enough to bet into open markets is a lifetime of study: how to benchmark its accuracy, how to tell whether the model still holds or needs to be re-fit with additional data, and so on. But at a minimum, the dispersion we see in estimated probabilities suggests we&#8217;re off to a meaningful start. For the true bettors, the finish line never really arrives, as there&#8217;s always something else that can be added to improve the model, but even those models have to start somewhere. It looks like we&#8217;ve accomplished exactly that: a good start.</p><p>I&#8217;m confident that this model can be improved upon when put in the hands of people able and willing to grind out the marginal improvements required to make it accurate enough to be profitable. And if you&#8217;re not one of those people, but always wanted to be, this code is a great place to get your hands dirty and try some actual data exploration and modeling yourself. 
To that end, I&#8217;m releasing the code that this series is based on here: <a href="https://github.com/PrimeSportsDataScience/2024CricketModel">https://github.com/PrimeSportsDataScience/2024CricketModel</a>.</p><p>I&#8217;ll see you all again when they ask me to predict rugby or something.</p>]]></content:encoded></item><item><title><![CDATA[Series (IV): T20 cricket to diversify bankroll deployment]]></title><description><![CDATA[Part 4: Individual bowling performance]]></description><link>https://journal.primesports.com/p/series-iiii-t20-cricket-to-diversify</link><guid isPermaLink="false">https://journal.primesports.com/p/series-iiii-t20-cricket-to-diversify</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Mon, 24 Jun 2024 14:32:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the last article, we looked at batting stats in cricket and came up with some elementary metrics that we hope will eventually have some predictive power. Today, we&#8217;ll take a look at the other side: statistics for bowlers. Like last time, we&#8217;ll draw on our understanding of analysis for baseball pitchers, find the similarities that do and do not apply, and see if we can come up with some interesting metrics for cricket bowlers that will help us in our eventual cricket model.</p><p>Hitting and pitching stats in baseball are essentially two sides of the same coin: whatever a batter produces on offense, the pitcher allows on defense. The analysis and understanding of where offense comes from for batters can essentially be inverted when looking at pitchers: K/9 and BB/9 are rates per nine innings, while hitter walk rates and strikeout rates are per plate appearance. 
<br>If we assume a reasonable expected number of plate appearances per inning, so that plate appearances and innings are interchangeable for rate calculations, the statistics become functionally equivalent. There are other similar inversions: ISO and slugging percentage for batters tell a similar story to ground ball rate for pitchers (power hitting produced versus power hitting allowed). <br><br>With that in mind, our starting metric for analyzing bowler performance should probably be the same as one of our base hitter metrics: runs per delivery. For bowlers, we&#8217;ll specifically look at runs per delivery allowed, since bowlers are trying to minimize this metric while batters are trying to maximize it. There shouldn&#8217;t be too many surprises compared to hitters: after all, extras aside, the average runs per delivery scored across all of cricket has to equal the average runs per delivery allowed. But let&#8217;s see what the distribution of runs per delivery allowed looks like for each bowler in a given match:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!38pO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!38pO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 424w, https://substackcdn.com/image/fetch/$s_!38pO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 848w, 
https://substackcdn.com/image/fetch/$s_!38pO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 1272w, https://substackcdn.com/image/fetch/$s_!38pO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!38pO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png" width="392" height="265" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:265,&quot;width&quot;:392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!38pO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 424w, https://substackcdn.com/image/fetch/$s_!38pO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 848w, 
https://substackcdn.com/image/fetch/$s_!38pO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 1272w, https://substackcdn.com/image/fetch/$s_!38pO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3920863-d95c-429f-942f-2e3a75ff55b0_392x265.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Remarkably similar to the same distribution for batter runs per delivery. 
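</p><p>For readers who want to reproduce these per-match bowler rates, here is a minimal sketch under stated assumptions rather than the code from our repository. The tuple layout and function name are hypothetical, extras are ignored, and any delivery that concedes exactly 4 or 6 runs is counted as a boundary, which is a simplification.</p>
<pre>
```python
from collections import defaultdict


def bowler_match_rates(deliveries):
    """Per-(match, bowler) rates from ball-by-ball rows.

    deliveries: iterable of (match_id, bowler, runs) tuples, where runs
    is what was conceded on that delivery. The layout is hypothetical
    and extras are ignored to keep the sketch short.
    """
    totals = defaultdict(lambda: {"balls": 0, "runs": 0, "fours": 0, "sixes": 0})
    for match_id, bowler, runs in deliveries:
        t = totals[(match_id, bowler)]
        t["balls"] += 1
        t["runs"] += runs
        t["fours"] += runs == 4  # simplification: 4 runs counted as a four
        t["sixes"] += runs == 6  # simplification: 6 runs counted as a six
    return {
        key: {
            "runs_per_delivery_allowed": t["runs"] / t["balls"],
            "fours_allowed_rate": t["fours"] / t["balls"],
            "sixes_allowed_rate": t["sixes"] / t["balls"],
        }
        for key, t in totals.items()
    }
```
</pre>
<p>Each value in the returned dictionary is one observation in a distribution like the ones plotted in this article.</p><p>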
For reference, we&#8217;ll also look up fours and sixes allowed per delivery:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r0D2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r0D2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!r0D2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!r0D2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!r0D2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r0D2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png" width="392" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r0D2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!r0D2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!r0D2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!r0D2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70dd93f0-5518-4979-b487-2d9409a19d6a_392x262.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 
7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Again, pretty similar to the distribution from batters. If all we did was create mirror images of runs scored and allowed per delivery and sliced them in a more granular way we would be in pretty good shape. There are, however, some other additional wrinkles from bowling stats we need to toss in to help us shape our evaluation of bowling performance.</p><p><strong>Defense Independent Bowling Stats</strong></p><p>One additional metric tracked for bowlers is <em>dots</em>, or legal deliveries where no runs are scored. Dots are obviously good for bowlers, so a metric like dots per delivery seems like a worthwhile metric to construct as well. You might be wondering why bother calculating this as a separate metric when it&#8217;s largely overlapping with our workhorse runs per delivery allowed metric. If a bowler is good at not allowing runs, that should be reflected in a low runs per delivery allowed. 
While this is true, dots give us a metric with a critical advantage: fielding is not remotely involved.</p><p>We mentioned defense independent pitching statistics in the last article, and how they are valuable because they offer a cleaner signal of attribution. When a ball is hit in cricket, once it&#8217;s in the field of play and it&#8217;s not a four or a six, how many runs it produces is a combination of how good or bad the delivery was and how good the fielders and field placement are at minimizing the runs. Attributing blame between the two is very hard for balls in play, and in baseball it has only been made possible by more granular data like PITCHf/x and HITf/x, which track velocity vectors of pitches and balls in play to help isolate skills like fielding. <br>We&#8217;re probably not going to get BowlFX any time soon, so we&#8217;ll have to rely on metrics where we know fielding has no influence to identify repeatable and predictable bowler attributes. Dots are a perfect metric: they&#8217;re roughly equivalent to strikeouts in baseball, which the defense also has nothing to do with (outside of catcher framing, which is a niche subject we can basically ignore for now). <br><br>A practical application of dots per delivery: if a bowler&#8217;s dots per delivery has been high over the last couple of games but their runs per delivery allowed is also high, we would expect that some of those runs are due to factors outside the bowler&#8217;s control. Maybe their fielders dropped a disproportionate number of balls, which is essentially variance.&nbsp;</p><p>Similarly, we can also measure mistakes the bowler makes with stats such as <em>wides</em> and <em>no balls</em>. Wides are similar to wild pitches: deliveries that the umpire decides the batter had no reasonable chance of hitting. Unlike wild pitches, wides automatically award extras to the batting team, so the consequences are much higher. 
No balls are deliveries where a bowler&#8217;s technique is ruled illegal: roughly similar to a balk in baseball. No balls are slightly more complicated in that illegal actions by fielders can also result in a no ball, and even if a fielder is responsible, the no ball is charged to the bowler. Wides and no balls per delivery seem like reasonable metrics to include in our models as well: not only do they impact the final score, but they could also be useful proxies for other, harder-to-measure attributes of bowlers. If a bowler has a high dot rate, they might generally bowl harder-to-hit deliveries and produce poor contact from batters more often.&nbsp;</p><p>Now that we have an initial list of aggregated metrics, we can start calculating them for eventual use in a model. We&#8217;ll outline our first modeling approach in our next article.&nbsp;</p>]]></content:encoded></item><item><title><![CDATA[Series (III): T20 cricket to diversify bankroll deployment]]></title><description><![CDATA[Part 3: Individual batting performance]]></description><link>https://journal.primesports.com/p/series-iii-t20-cricket-to-diversify</link><guid isPermaLink="false">https://journal.primesports.com/p/series-iii-t20-cricket-to-diversify</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Mon, 17 Jun 2024 09:32:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the last article, we took a look at team-level data in T20 cricket to help us understand the overall shape of the game. Today, we&#8217;ll get more granular and look at how individual player analysis helps us understand where runs come from. 
From there, we&#8217;ll look at how we start converting raw individual statistics into metrics that might have some predictive power when fed into a model to predict cricket outcomes. We&#8217;ll use baseball hitters as our frame of reference, similar to last time, to anchor our understanding so we can draw on as many familiar concepts as possible.</p><p>A batter&#8217;s job in cricket seems broadly similar to a batter&#8217;s job in baseball: hit a ball in service of advancing the offense. In many ways, a batter&#8217;s contribution to the offense seems easier to analyze in cricket: batters in T20 cricket seem like they would always have the same job (score as many runs as possible). A batter&#8217;s job in baseball, by comparison, isn&#8217;t always to try and score as many runs as you can by yourself (sacrifice flies/bunts exist for that exact reason).&nbsp;</p><p>The trickiest part we have to account for in cricket is the dramatically higher level of variance in a batter&#8217;s contribution due to the nature of the offense. In baseball, batters&#8217; effectiveness is typically measured in terms of their contribution per plate appearance. Metrics like batting average, on-base percentage, and slugging percentage all use plate appearances (or at-bats, more or less) as their denominator, and plate appearances are a pretty stable quantity from game to game. The rules of baseball are naturally conducive to these metrics: a batter&#8217;s contribution to the offense is a result of what they do with each plate appearance, and the number of pitches they face in it does not change the value of the outcome. In other words, it doesn&#8217;t matter if it took you 2 pitches or 12 pitches to get a hit: the end result is exactly the same.&nbsp;</p><p>Cricket batters are always contributing to a team&#8217;s offense as long as they&#8217;re not out yet. 
If you make contact and you don&#8217;t get out, the default assumption should be that at least one run gets scored, due to the relative ease of scoring in cricket. This is why <em>strike rate</em>, or runs per delivery (conventionally quoted per 100 balls), is accepted as a standard metric for evaluating players&#8217; offensive capabilities. This doesn&#8217;t, however, seem to tell the whole story; a player&#8217;s total contribution also depends on their ability to stay batting and not get out. To give an extreme example: a batter who always hit a six (the equivalent of a home run) on the first ball and immediately got out on the second ball would have an incredibly high strike rate, but their longevity would be so short that their overall offensive contribution would be far less than that of someone who could stay in and generate more runs. To argue against myself, though: as we saw in the last article, longevity is notably less important in T20 cricket than in traditional cricket, since a bowling team only takes all 10 wickets about 16% of the time. So a batter with shorter longevity wouldn&#8217;t hurt their side nearly as much as in a format where an innings usually runs until all 10 wickets fall. Ideally, a predictive model will validate this observation, but we have to take our baby steps first, and that involves coming up with a list of metrics that do a good job of encapsulating a batter&#8217;s offensive production.&nbsp;</p><p>We&#8217;ll separate these concepts into two analyses: batters&#8217; ability to not get out, aka their longevity, and batters&#8217; ability to produce runs, aka their contribution to offense.&nbsp;</p><p><strong>Survivability</strong></p><p>Just like last time, we&#8217;ll start our understanding of cricket batters by spinning up some distributions. 
The first thing we&#8217;ll look at is what the distribution looks like for how many deliveries each batter receives per wicket before they get out:&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r4mO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r4mO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 424w, https://substackcdn.com/image/fetch/$s_!r4mO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 848w, https://substackcdn.com/image/fetch/$s_!r4mO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 1272w, https://substackcdn.com/image/fetch/$s_!r4mO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!r4mO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png" width="392" height="263" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:263,&quot;width&quot;:392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r4mO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 424w, https://substackcdn.com/image/fetch/$s_!r4mO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 848w, https://substackcdn.com/image/fetch/$s_!r4mO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 1272w, https://substackcdn.com/image/fetch/$s_!r4mO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf0dd02a-a601-4c4f-9d99-c08c97671781_392x263.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This looks like a pretty standard <a href="https://www.graphpad.com/guides/prism/latest/statistics/stat_key_concepts__survival_curves.htm">survival curve</a>, which matches our intuition of how cricket batting works: you&#8217;re constantly batting to stay alive, and as long as you&#8217;re in, you&#8217;re contributing to the offense. The steepness of the survival curve is kind of concerning: it&#8217;s heavily clustered to the left, with 37% of wickets lasting 10 deliveries or fewer. This doesn&#8217;t give us a huge sample size to work with for determining a batter&#8217;s strike rate in a given match, so we&#8217;ll have to be very wary of the effects of low sample size. For now though, we&#8217;ll proceed without strictly accounting for it.&nbsp;</p><p>There are five common ways for a batter to get out. We&#8217;ll list them here along with their rough equivalents in baseball:&nbsp;</p><ul><li><p>A fielder catches a ball after the batter hits it. 
Nearly identical to how it works in baseball, with the difference that it&#8217;s generally harder to catch a cricket ball, so fielding isn&#8217;t nearly as routine in cricket.&nbsp;</p></li><li><p>The bowler hits the wicket with a delivery. Most similar to a strikeout, though the analogy is looser than the one for a caught ball.&nbsp;</p></li><li><p>A run out, where a fielder breaks the wicket the batter is running towards before the batter gets there. Closest to a baserunner being thrown out on the base paths.</p></li><li><p>A &#8220;leg before wicket&#8221;, or LBW, where a batter illegally blocks a delivery with their leg to prevent it from potentially hitting the wickets. Roughly equivalent to batter&#8217;s interference, except it happens far more often than in baseball.</p></li><li><p>Stumping, a special case of a run out: the batter is called out if no part of their body or held bat is grounded behind the crease (roughly equivalent to the batter&#8217;s box) when the wicket keeper (a special defensive catcher) hits the wickets with the ball. 
This doesn&#8217;t have a great baseball equivalent: it&#8217;s roughly as if a batter swung so hard that their bat touched the ground outside the batter&#8217;s box, and that means you&#8217;re automatically out.&nbsp;</p></li><li><p>There are other ways players can get called out, but they&#8217;re infrequent enough that we&#8217;ll just label them as &#8220;other&#8221; for now.</p></li></ul><p>Here&#8217;s the overall frequency at which the outs occur over all of T20 cricket to establish a baseline expectation of how and when batters get called out:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iaqY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iaqY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 424w, https://substackcdn.com/image/fetch/$s_!iaqY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 848w, https://substackcdn.com/image/fetch/$s_!iaqY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 1272w, https://substackcdn.com/image/fetch/$s_!iaqY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!iaqY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png" width="386" height="299" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:299,&quot;width&quot;:386,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iaqY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 424w, https://substackcdn.com/image/fetch/$s_!iaqY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 848w, https://substackcdn.com/image/fetch/$s_!iaqY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 1272w, https://substackcdn.com/image/fetch/$s_!iaqY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32927558-d8c6-4e59-a482-5f1ebe0d9046_386x299.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Caught balls account for 58% of all T20 cricket outs, making them overwhelmingly the most common way to get out. Ideally, we would love to compute these percentages for individual players as well, because we could hypothesize that different players have different percentages in each of these categories, which might say something about their survivability. 
If, for example, we later find out that something like LBW is more of a fluky out due to lots of extrinsic factors (batter technique errors that aren&#8217;t typically repeatable, an aggressive umpire that made too harsh a judgment call etc) and we see that a player has had a disproportionate amount of LBW outs in their past performances, we can surmise their survivability might look artificially low due to bad luck, and that it is not predictive of future survivability.&nbsp;</p><p>As an illustration, we can compile statistics on individual batters, and look at the distributions of those stats to get an idea of the variability of stats like the above percentages. Here, for example, is the distribution of batters&#8217; caught ball percentages of their overall outs to see how much variation there is around that 58%:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FINc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FINc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!FINc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!FINc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 1272w, 
https://substackcdn.com/image/fetch/$s_!FINc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FINc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png" width="392" height="262" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FINc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!FINc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!FINc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 1272w, 
https://substackcdn.com/image/fetch/$s_!FINc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d8256a0-6f7f-49c6-9226-baabb89daee1_392x262.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>While the distribution still centers around that 58% average, there&#8217;s still a decent amount of variability, indicating that not all batters have 58% of their outs from caught balls on average. A natural next step to refine our understanding will be to eventually explore some of the stability and predictive power of these stats down the road. 
In other words, if a batter&#8217;s caught ball percentage is higher than 58%, we will eventually want to know if we can expect their caught ball percentage to also be higher than 58% in the future.</p><p><strong>Offensive Production&nbsp;</strong></p><p>Let&#8217;s also take a look at the distribution of runs per delivery without any kind of adjustments:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PTuW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PTuW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!PTuW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!PTuW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!PTuW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PTuW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png" width="392" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PTuW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!PTuW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!PTuW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!PTuW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7e9a29a-f5fd-4389-8310-65b5e2b7d1be_392x262.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This is what happens when you explore data without accounting for outliers: all it takes is one batter to score six runs on a single delivery and get out afterwards to skew your graph and make it harder to get a sense of the more typical distribution. (This could prompt a much longer discussion of managing outliers in your data, but that&#8217;s a topic in itself). 
Here&#8217;s what that distribution looks like for batters who face at least 8 deliveries to help tamp down some of the visual effects of those outliers:&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ssAH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ssAH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!ssAH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!ssAH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!ssAH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ssAH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png" width="392" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:392,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ssAH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 424w, https://substackcdn.com/image/fetch/$s_!ssAH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 848w, https://substackcdn.com/image/fetch/$s_!ssAH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 1272w, https://substackcdn.com/image/fetch/$s_!ssAH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feba98f66-e9f5-45ea-9f87-e12bdd2921d9_392x262.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We still have a little bit of outlier management to do, but it&#8217;s not as bad. We can at least see a little more clearly that the average runs per ball is around 1.15, which sets a baseline expectation of what overall offensive production looks like. Now, we can get more granular with the runs themselves to describe where offense comes from.</p><p>Baseball hitters have distinct styles: the classic oversimplified comparison is between speedy contact hitters who get on base a lot and can run the bases fast, and slow power hitters who may not run fast but can hit for lots of extra bases. Distinguishing between these types of offensive production is not just useful for descriptive purposes; it&#8217;s also very helpful for predictive purposes. 
The rise of <a href="https://www.baseball-reference.com/bullpen/Defense-Independent_Pitching_Statistics#:~:text=Defense%20Independent%20Pitching%20Statistics%20(DIPS,strikeouts%2C%20hit%20batters%20and%20walks.">defense-independent pitching statistics</a> in the early sabermetrics days helped separate the outcomes a pitcher could control from those the defense could control, which helped separate signal from noise in predicting future outcomes. In cricket, there are also runs that come predominantly from hitting power: a batter scores six runs for balls that clear the fence (aka a home run), and four runs for balls that reach the boundary of the playing field (roughly equivalent to a ground rule double). We can see the value of separating out how many of a player&#8217;s runs come from sixes: the defense has nothing to do with that type of offensive production, so isolating power as a statistic will likely have some predictive value. On the other hand, if we get a model that determines how important power is in boosting a team&#8217;s chance to win, I would not be surprised if it&#8217;s less important than in baseball. One of the secondary benefits of power hitters in baseball is their ability to bring baserunners home, but in cricket, baserunners have less of an impact, so we shouldn&#8217;t expect the same importance level as in baseball.</p><p>As always, we at least want to see what level of importance fours and sixes generally play in a batter&#8217;s offensive production. 
Here are the distributions of what percent of total runs are accounted for by fours and sixes on a per batter basis:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ch7p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ch7p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 424w, https://substackcdn.com/image/fetch/$s_!Ch7p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 848w, https://substackcdn.com/image/fetch/$s_!Ch7p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 1272w, https://substackcdn.com/image/fetch/$s_!Ch7p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ch7p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png" width="400" height="262" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:262,&quot;width&quot;:400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ch7p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 424w, https://substackcdn.com/image/fetch/$s_!Ch7p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 848w, https://substackcdn.com/image/fetch/$s_!Ch7p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 1272w, https://substackcdn.com/image/fetch/$s_!Ch7p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa46d25db-a307-4111-bcd8-6bc2c04d320e_400x262.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In general, fours account for 31% of batters&#8217; runs on average, and sixes account for 20%. Combined, they&#8217;re just over half of players&#8217; runs. Eventually, we would like to know if these percentages have some DIPS-like predictive qualities; that is, if a player gets more than 20% of runs from sixes over some time period, can we expect that share to stay above 20% in the future?&nbsp;</p><p>At the end of this, we have several new metrics (out type percentages, deliveries per wicket, runs per delivery, fours percentage, and sixes percentage) that we can at least generate and eventually feed into a model of some kind. It has yet to be seen if any of these metrics are especially predictive, but we at least have a way to break down offensive production to understand how and where it comes from. 
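</p><p>As a rough illustration, here is a minimal sketch of how these batter metrics could be generated from ball-by-ball data. The record layout (batter, runs off the bat, dismissal flag), the sample rows, and the function name <code>batter_metrics</code> are assumptions made up for this example, not the data set or code used in this series. It also treats every 4 or 6 off the bat as a boundary, which is a simplification.</p>

```python
from collections import defaultdict

# Hypothetical ball-by-ball rows: (batter, runs_off_bat, dismissed).
# This layout is an assumption for illustration only.
DELIVERIES = [
    ("A", 4, False), ("A", 6, False), ("A", 1, False), ("A", 0, True),
    ("B", 1, False), ("B", 1, False), ("B", 4, False), ("B", 2, False),
]


def batter_metrics(deliveries, min_deliveries=1):
    """Aggregate per-batter rate metrics from ball-by-ball rows."""
    totals = defaultdict(
        lambda: {"balls": 0, "runs": 0, "fours": 0, "sixes": 0, "outs": 0}
    )
    for batter, runs, dismissed in deliveries:
        t = totals[batter]
        t["balls"] += 1
        t["runs"] += runs
        # Runs scored via fours and sixes (assumes 4 or 6 means a boundary).
        t["fours"] += 4 if runs == 4 else 0
        t["sixes"] += 6 if runs == 6 else 0
        t["outs"] += 1 if dismissed else 0

    metrics = {}
    for batter, t in totals.items():
        # Skip small samples to tamp down outliers.
        if t["balls"] >= min_deliveries:
            runs = t["runs"]
            metrics[batter] = {
                "runs_per_delivery": runs / t["balls"],
                "fours_pct": t["fours"] / runs if runs else 0.0,
                "sixes_pct": t["sixes"] / runs if runs else 0.0,
                # None when the batter was never dismissed in the sample.
                "deliveries_per_wicket": (
                    t["balls"] / t["outs"] if t["outs"] else None
                ),
            }
    return metrics


if __name__ == "__main__":
    for name, row in batter_metrics(DELIVERIES).items():
        print(name, row)
```

<p>Raising <code>min_deliveries</code> (for example to 8, the cutoff used for the runs per delivery chart) drops small-sample batters before any of the rates are computed.</p><p>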
Next article, we&#8217;ll do the same for bowlers.</p>]]></content:encoded></item><item><title><![CDATA[Series (II): T20 cricket to diversify bankroll deployment]]></title><description><![CDATA[Part 2: Starting with teams]]></description><link>https://journal.primesports.com/p/series-t20-cricket-to-diversify-bankroll-26a</link><guid isPermaLink="false">https://journal.primesports.com/p/series-t20-cricket-to-diversify-bankroll-26a</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Mon, 10 Jun 2024 16:04:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!3PXI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We left off last time with a laundry list of observations on the similarities between cricket and baseball as a means of establishing some foundational understandings of how to go about handicapping. I may have zero familiarity with cricket, but some concepts are standard across all modeling, and concepts from a similar sport can serve as a helpful reference point.&nbsp;</p><p>Today, we&#8217;ll get into it with some numbers using real data sets to help understand the overall shape of T20 cricket: how its stats look and feel, and what we might need to do with the raw data to unlock applicable techniques from other sports. We&#8217;ll start to develop additional metrics and data transformations we can use to originate our own model. Let&#8217;s start with the highest-level data possible: the humble box score, which just tells us what the score of the game was.&nbsp;</p><p>Scoring differential is a simple yet powerful metric that is used across a lot of sports to gauge strengths of teams against one another. 
We can derive all sorts of formulas from scoring differential, from the simple Pythagorean expectation of a team&#8217;s win/loss record to more complex algorithms like Elo or Markov chains to convert scoring differential into steady-state team rankings. Scoring differential across multiple games can also tell us something about the behavior of the sport.&nbsp;</p><p>When evaluating scoring differential for a given sport, I like to start with looking at the <em>distribution</em> of scoring differential to get a high-level sense of the sport. Distributions are the expanded version of stats like average and median; they give a complete picture of what outcomes of the sport can look like. I prefer distributions over simple summary stats because they can provide key context that simple averages can&#8217;t. The classic example is football, where scoring differential disproportionately occurs with values of 3 and 7 due to the quirks of how scoring works in that sport.</p><p>Here, for example, is what the distribution of run differential looks like between the home team and away team for baseball:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3PXI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3PXI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 424w, https://substackcdn.com/image/fetch/$s_!3PXI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 848w, 
https://substackcdn.com/image/fetch/$s_!3PXI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 1272w, https://substackcdn.com/image/fetch/$s_!3PXI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3PXI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png" width="751" height="422" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:422,&quot;width&quot;:751,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3PXI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 424w, https://substackcdn.com/image/fetch/$s_!3PXI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 848w, 
https://substackcdn.com/image/fetch/$s_!3PXI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 1272w, https://substackcdn.com/image/fetch/$s_!3PXI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff482b5d5-43b4-4a57-a080-4023f52b4f55_751x422.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>This more or less resembles a normal distribution, which has consistent and stable properties to unlock all sorts of analyses and 
formulas. By contrast, here is what the scoring differential looks like for T20 cricket games, where instead of home versus away, we use second team to bat versus first team to bat (functionally the same thing in baseball, since the home team bats second):</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iS7N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iS7N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 424w, https://substackcdn.com/image/fetch/$s_!iS7N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 848w, https://substackcdn.com/image/fetch/$s_!iS7N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 1272w, https://substackcdn.com/image/fetch/$s_!iS7N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iS7N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png" width="1300" height="636" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:636,&quot;width&quot;:1300,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iS7N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 424w, https://substackcdn.com/image/fetch/$s_!iS7N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 848w, https://substackcdn.com/image/fetch/$s_!iS7N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 1272w, https://substackcdn.com/image/fetch/$s_!iS7N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda2c82b0-e3ec-447c-ad42-0dfab14e78f4_1300x636.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>Right away, we can tell that a simple run differential distribution doesn&#8217;t do a good job of explaining the intricacies of cricket scoring due to how clustered the results are. Isn&#8217;t it weird that when the second batting team wins, it&#8217;s almost always by one run? It&#8217;s not actually weird; it&#8217;s a reflection of the rules of the game.&nbsp;</p><p>Recall that in cricket, the first team completes its entire turn at bat before the second team bats: it stops when the other team gets 10 outs or bowls 120 deliveries (simplified). If the second team exceeds that run total when it goes up to bat, the match is over. 
This means that our usual concept of scoring differential needs to be adjusted, because one of the implicit assumptions of scoring differential is that each offense is provided equal opportunities for production according to the rules (in sports like football, things like dominating time of possession can limit scoring opportunities for the opposing offense, but all else being equal, both teams are given the same opportunities to control time of possession). So what can we do to the data in order to make it work a little nicer?</p><p>Given the caps on offensive opportunities (max of 120 deliveries, ending the game if the second team exceeds the other team&#8217;s runs), we are much better off utilizing rates, aka scores per opportunity, in order to get a clearer picture. Coming up with a good rate metric is a challenge in its own right because of how cricket scoring works, especially compared to other sports. We&#8217;ll dive into each half of our potential rate, aka the numerator and the denominator, to see if we can come up with some distinct summary stats that we think will have some predictive power when we eventually feed them into a model.</p><h2><strong>The Numerator: Not All Runs Are Created Equal</strong></h2><p>Runs in cricket come from two places: runs created by batters hitting the ball around the field (known as <em>team runs</em>), and runs created when the bowling (aka pitching) team commits rule infractions, like the bowler overstepping the crease. These penalty runs are known as <em>extras</em>. It will be good to understand how often runs come from batted runs versus extras to help prioritize our analytical focus. 
This is another great application of distributions: we can look at the distribution of what percent of a team&#8217;s total runs come from extras.&nbsp;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!koyk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!koyk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 424w, https://substackcdn.com/image/fetch/$s_!koyk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 848w, https://substackcdn.com/image/fetch/$s_!koyk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 1272w, https://substackcdn.com/image/fetch/$s_!koyk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!koyk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png" width="381" height="248" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0790a38c-407f-4247-9573-4606da84b995_381x248.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:248,&quot;width&quot;:381,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!koyk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 424w, https://substackcdn.com/image/fetch/$s_!koyk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 848w, https://substackcdn.com/image/fetch/$s_!koyk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 1272w, https://substackcdn.com/image/fetch/$s_!koyk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0790a38c-407f-4247-9573-4606da84b995_381x248.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Extras are typically around 6% of a team&#8217;s total runs, but can be as high as 30% of a team&#8217;s total. This is another example of how distributions can tell a story that simple averages can&#8217;t. If we just took the average of 6%, we might conclude that extras aren&#8217;t all that important to account for. But when we see that in 10% of the games, extras account for 10% or more of a team&#8217;s total runs, that&#8217;s a significant enough threshold to warrant specifically accounting for extras. (As a side note, there&#8217;s no hard and fast rule for analyzing distributions like this and determining significance thresholds; a lot of that judgment comes from repetition and experience slicing and dicing sports data.) So at a minimum, it&#8217;s worth our time to separate out extras from batted runs.&nbsp;</p><p>Runs that come from extras feel like they should be flukier and noisier than runs that come from a team&#8217;s bread and butter offense. 
Maybe there&#8217;s something to be said about a batter&#8217;s ability to induce extras (does he require more aggressive deliveries that risk inducing extras? Is he particularly good at capitalizing on no-ball situations where he essentially gets a free crack at the ball? etc.), but a reasonable starting assumption is that runs that come from extras probably shouldn&#8217;t be attributed to a team&#8217;s offensive capabilities. Conversely, runs <em>allowed </em>from extras might have a little more predictive power. If a bowler is consistently wild in their deliveries, we would expect them to give up more runs from extras over the long term than a bowler who has more stable and consistent deliveries. In other words, runs allowed from extras are a lot more within the control of the bowler than runs scored from extras are within the control of the batter. Fortunately, the cricket data set we&#8217;re working with already separates out team runs from extras, so there&#8217;s not a lot of additional work we have to do.</p><h2><strong>The Denominator: What Determines Opportunity?</strong></h2><p>The first team to bat in cricket stops batting when they lose 10 wickets (aka outs) or when they receive 120 deliveries. Having a sport where there are multiple conditions for the offense&#8217;s turn to be over is particularly interesting: it&#8217;s roughly equivalent to one half of a baseball inning being over after 3 outs or after 15 minutes have elapsed. If we need to use offensive rates of production, then what should the denominator be? Runs per wicket? Runs per delivery? Both? Neither? We can at least size the problem by seeing how often an inning in T20 cricket ends under each condition, so we know where to focus our efforts.</p><p>Let&#8217;s start by looking at the distribution of wickets lost by the first batting team. 
We focus on the first batting team, because they are afforded a more consistent set of batting opportunities under the same conditions each time, as opposed to the second batting team, which has their opportunities capped once they meet their target run score.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L4rh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L4rh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 424w, https://substackcdn.com/image/fetch/$s_!L4rh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 848w, https://substackcdn.com/image/fetch/$s_!L4rh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 1272w, https://substackcdn.com/image/fetch/$s_!L4rh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L4rh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png" width="751" height="452" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:751,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L4rh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 424w, https://substackcdn.com/image/fetch/$s_!L4rh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 848w, https://substackcdn.com/image/fetch/$s_!L4rh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 1272w, https://substackcdn.com/image/fetch/$s_!L4rh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F75d777fd-72ad-429b-93be-6e4cf6de3b4d_751x452.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>T20 cricket is all-or-nothing for the first batting team with respect to their wickets: losing 8 wickets is the same as losing 2, because deliveries are capped at 120, so we can reduce this to &#8220;how often does the first batting team stop after 10 wickets?&#8221; This turns out to be about 16% of the time, enough that it matters to account for. As a starting point, the default assumption should be that the first batting team will receive a full 120 deliveries since it happens the majority of the time, which leads us to use deliveries as the default denominator. 
We will eventually have to account for cases where deliveries are cut short by all 10 wickets falling, but even though it happens frequently enough that it will affect our predictions, we can more or less treat it as an exception to start.</p><h2><strong>Our Metric Of Choice</strong></h2><p>Now that we&#8217;ve settled on team runs scored as the numerator and deliveries as the denominator, we&#8217;ll calculate the runs scored per delivery for each team in each match and see what the distribution of that metric looks like across all matches.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EU_J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EU_J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 424w, https://substackcdn.com/image/fetch/$s_!EU_J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 848w, https://substackcdn.com/image/fetch/$s_!EU_J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 1272w, https://substackcdn.com/image/fetch/$s_!EU_J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!EU_J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png" width="861" height="537" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d338af0-3711-4110-9a49-1bc364207691_861x537.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:537,&quot;width&quot;:861,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EU_J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 424w, https://substackcdn.com/image/fetch/$s_!EU_J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 848w, https://substackcdn.com/image/fetch/$s_!EU_J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 1272w, https://substackcdn.com/image/fetch/$s_!EU_J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d338af0-3711-4110-9a49-1bc364207691_861x537.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Something familiar, finally! This is much closer to our expected baseball distribution, so hopefully we&#8217;re on the right track transforming the data into something we can actually feed into our models. 
In the next article, we&#8217;ll get more granular on team runs scored at the batter level, slice and dice where batter runs come from, and see what other baseball concepts we can start utilizing.</p>]]></content:encoded></item><item><title><![CDATA[Series: T20 cricket to diversify bankroll deployment]]></title><description><![CDATA[Part 1: I'm sorry, what?]]></description><link>https://journal.primesports.com/p/series-t20-cricket-to-diversify-bankroll</link><guid isPermaLink="false">https://journal.primesports.com/p/series-t20-cricket-to-diversify-bankroll</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Mon, 03 Jun 2024 18:50:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Editor&#8217;s note: We asked Colin to go in cold: zero to starter model for the T20 World Cup running through June. This is a series starting from nothing and building towards a basic prediction model that we&#8217;ll (hopefully) test against the last games of the tournament later this month.</em></p><p>&#8220;We want you to build a cricket model. Don&#8217;t know a single thing about cricket? Great, you&#8217;re perfect.&#8221;</p><p>I feel like I&#8217;ve been chosen for the <a href="https://three-body-problem.fandom.com/wiki/Wallfacer">Wallfacer program</a>. Am I missing something? Don&#8217;t they know I&#8217;ve never watched more than 5 minutes of cricket in my life? I&#8217;m vaguely aware it kind of looks like baseball if I squint hard enough, and that matches famously last days at a time, and didn&#8217;t they make a movie one time about cricket prospects trying to get converted into baseball? 
Then again, this is how I started off doing <a href="https://www.sbnation.com/tennis/2014/4/15/5615896/tennis-aging-curves-advanced-baseline">tennis modeling</a> a decade ago, never having watched the sport very seriously and trying to predict it anyway. At least back then, I had a decent base modeling approach, honed across other sports, to apply to tennis, but this is going to be a new challenge: starting from scratch, knowing very little about the rules and having no clue about how to model or predict the sport.</p><p>I do understand why cricket has some things going for it from a betting angle, though. It lands in a dead spot in the sports betting calendar, where baseball is so picked over at this point that there&#8217;s not much left to activate otherwise idle bankrolls. Even if there aren&#8217;t huge edges to be found, it&#8217;s better than zero edges, as long as the return on time is worth it. I&#8217;m always a fan of adding new sports to the toolbox and, inevitably, you pick up on one or two things in each sport you model that carry over in some unforeseen way to your other sports, so I can get behind this as an intellectual challenge.</p><p>My starting point is the same as any other idiot American trying to understand a predominantly foreign sport: look up the rules, and try to shape my understanding according to the sports I already know (baseball is going to be the obvious comparison here). 
Mercifully, the ask is to focus on T20 cricket, a variant specifically developed to induce more action and conclude faster so that it appeals to younger audiences, so at least I don&#8217;t have to worry about multiple days of varying conditions.</p><p>After some <a href="https://www.cheryl-morgan.com/writing/sport/cricket-for-baseball-fans/">helpful primers</a> on the rules of cricket specifically through a <a href="https://www.cheryl-morgan.com/writing/sport/cricket-for-baseball-fans/understanding-cricket-statistics/">baseball lens</a>, I can at least start to identify what I think are some key similarities and differences between the two. Baseball has a ton of already applicable modeling techniques that rely on a lot of assumptions, so I want to see what techniques and approaches I can take from baseball, and what differences will require a rethink of how I might go about trying to model and predict any kind of cricket outcomes.</p><h2><em>Similarities</em></h2><p><strong>Cricket can be well reconstructed from box score data.&nbsp;</strong></p><p>There&#8217;s no clock in T20 cricket (technically there is, as there are penalties for not finishing batting in your allotted 75 minutes, but apparently this rarely happens, so we can basically overlook this for this iteration). Decision: I won&#8217;t account for any kind of clock-dependent situations you find in other sports (pulling the goalie when trailing, playing for the field goal win when down by 3 or less, etc.).</p><p><strong>Cricket outcomes are approximately the sum of individual actions on each team.</strong></p><p>One simplifying assumption about baseball is there aren&#8217;t a lot of interactive effects between team members to account for: baseball is more or less individual team members doing their own pitching, batting, fielding, and baserunning, and outcomes can be predicted fairly well by summing individual performances. 
The dynamics of cricket lend themselves to the same assumptions.</p><p><strong>Pitcher vs. batter stats drive most of the dynamic; things like fielding and running are less important. </strong>Yes, cricket balls are harder to catch, which makes things like fielding a little more unpredictable, but it&#8217;s still probably a decent assumption that defense holds the same weight in cricket that it does in baseball. It&#8217;s absolutely a skill that some players are better at than others, but not as important as their offensive contributions. Also, since there&#8217;s less total distance to cover when running to score a run, we can likely assume we don&#8217;t have to estimate a player&#8217;s run decision-making to start.</p><p><strong>Relievers sort of exist in cricket. </strong>There are bowlers who specialize in late-game situations, similar to relievers and/or closers. From the rules alone, it&#8217;s not clear if late-game bowlers have the degree of specialization relievers do (less workload, more situational, etc.), but at a minimum it&#8217;s good context to be aware of.</p><p><strong>Accounting for weather is important. </strong>Temperature, humidity, and field conditions all have a huge impact on how the ball travels. Much like in baseball, any good model will account for weather conditions. It remains to be seen which weather conditions matter most, and whether they have the same relative importance as in baseball.</p><h2>Things that will be notably different:</h2><p><strong>Scoring differential doesn&#8217;t tell the same story. </strong>One of the initially confusing things when trying to get a feel from box scores is seeing the winning team&#8217;s margin described in terms of runs <em>or </em>wickets. 
This is one of the biggest structural changes from baseball: the equivalent in baseball would be if the away team batted all 9 of their innings first, then the teams switched sides and the home team started batting, with an automatic walk-off win the moment the home team scored more runs than the away team. (This happens to a very small degree now with the home team not batting the bottom of the 9th if they&#8217;re winning, but this is magnified to a much higher degree in cricket.) So many equations around run differential and Pythagorean expectation rely on an implicit assumption of equal opportunity for both offensive lineups, and that&#8217;s structurally not the case in cricket, so we&#8217;re going to have to dive a little deeper than scoring differential alone to gauge team strengths.</p><p><strong>&#8220;Plate appearances&#8221; have a lot more variance than in baseball. </strong>A plate appearance for a baseball hitter is restricted to a single outcome (on base or out), but a batter appearance in cricket has a whole lot more variance. You could score 50 runs before you&#8217;re out, or you could get out on the first ball (pitch). A cursory look at cricket statistics suggests this is why batters&#8217; offensive production is described in terms of things like runs per over, but rates have their own problem in telling the whole story: a batter who produces 6 runs but only lasts one over is much more efficient, but also much less productive in the aggregate, than a batter who produces 30 runs over 7 overs. This tradeoff between efficiency and volume is something that will have to be navigated very carefully when assessing players&#8217; performances.</p><p><strong>Batters&#8217; offensive contributions have less to do with their teammates&#8217; production. </strong>If you&#8217;re a great hitter in baseball, your RBIs depend heavily on whether your teammates in front of you in the batting order already got on base. 
That dynamic doesn&#8217;t exist in cricket. Your runs are your runs alone, which should produce some more helpful and simplifying modeling assumptions.</p><p><strong>Batters face a different bowler (pitcher) every 6 balls (pitches). </strong>After every 6 balls, the batter faces a different bowler and hits to the other side of the field (more on that in a bit). The equivalent of a plate appearance is not homogeneous: a batter can face multiple pitchers in the same appearance, which at a minimum needs to be accounted for in summary stats.</p><p><strong>The field is not symmetrical.</strong> When batters switch sides, the dimensions of the field in front of them change as well. Cricket fields have what&#8217;s known as both a long boundary and a short boundary, and while it is still possible to hit effectively to the field <em>behind you</em>, most contact will still send the ball out in front of the batter. I don&#8217;t know if long/short boundary splits are a thing, but it seems safe to assume this matters until proven otherwise.</p><p><strong>Umpires have much less influence. </strong>The equivalent of balls and strikes is a much less influential part of the game, which means accounting for the equivalent of where in the count the batter is won&#8217;t be nearly as important, to say nothing of not having to worry about more advanced concepts like catcher framing. Umpires still probably have their tendencies in the calls they do make, but at first glance, their influence appears much smaller than in baseball.</p><p><strong>T20 cricket has a maximum number of pitches. </strong>Each side has a maximum of 20 overs, and with 6 balls in an over, that&#8217;s a maximum of 120 balls to each batting team (unless errors from the bowler lead to extra balls, to be explained in more detail later). 
How would baseball look if you weren&#8217;t allowed to bat any more after facing 120 pitches? Batters wouldn&#8217;t have as much luxury to feel out each pitcher&#8217;s style the first couple times through the order, or even within the same at-bat. I&#8217;m not quite sure how that dynamic would play out, but at a minimum it&#8217;s probably worth accounting for.</p><p><strong>The home team has a heavy influence on the playing surface. </strong>Home field advantage is about much more than home crowd influence on umpire calls. The home team&#8217;s groundskeepers can prepare the playing surface to suit how the home team would prefer it to play. It&#8217;s not clear how to account for this yet, because that requires understanding how certain players do better or worse under different playing conditions, but if home field advantage ends up being stronger than in other sports after running some numbers, at least we have a ready-made hypothesis to explain this.</p><p><strong>You have to throw the ball back if you catch it in the stands. </strong>Originally, this point was just supposed to be a throwaway joke about the inherent offensiveness of a sport that doesn&#8217;t let you keep a souvenir. But digging into this a little more shows there&#8217;s a reason that this is a rule: cricket balls get deadened over the course of a match, and that deadening effect is something both the bowler and the batter get used to over time, so later batters will almost be using different equipment than the initial batters. This is additional context that&#8217;s probably important for interpreting players&#8217; stats: how deep in the game were they? And how much of their change in performance is attributable to an older ball? 
It just goes to show there&#8217;s never any shortage of potential things to account for when building your model.</p><p>I&#8217;ve found that when starting a sport from scratch, there&#8217;s a drinking-from-the-firehose feeling of having to account for every little thing, which can lead to a sort of paralysis by analysis about what to prioritize. Years of practice have taught me the wisdom of being okay with building a model I know will be bad at first, and accepting that it will take a lot of hypothesis testing, data additions, and refinements to get a model that I&#8217;ll be comfortable firing on.</p><p>Hopefully, we&#8217;ll do exactly that in the next couple of articles: build a bad model to start, and get it to a place where additional specific hypotheses can be chased down over time.</p>]]></content:encoded></item><item><title><![CDATA[The forever game of sports betting]]></title><description><![CDATA[or, why we don't build houses from the roof down]]></description><link>https://journal.primesports.com/p/the-forever-game-of-sports-betting</link><guid isPermaLink="false">https://journal.primesports.com/p/the-forever-game-of-sports-betting</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Mon, 03 Jun 2024 18:37:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Someone once described the stock market as a series of puzzles of escalating difficulty, with each level requiring more skill and experience, and with the potential rewards also scaling with difficulty.<br>You can start out identifying basic patterns that signal inefficiencies and trade on those, but since most market participants can identify those patterns as 
well, the returns on those trades won&#8217;t be all that large. As you get better at the mechanics of trading and your knowledge base increases, you are in a position to expand the range of trades you can make, each requiring a higher level of some skill as you go: forecasting, evaluation, market timing, etc. At each successive level, while your margins may go down over time, the volume at which you trade goes up, resulting in higher overall earnings potential.</p><p>This is an incredibly useful framework for sports betting as well. People jump into the deep end of sports betting way too early, absolutely sure they can handicap the hardest markets better than the books can, and get frustrated when their envisioned success doesn&#8217;t materialize. In reality, establishing and executing more basic edges first, then working up the ladder of puzzle difficulty, teaches you the fundamentals of identifying value and lays the groundwork for improvement.</p><p>There&#8217;s no clear-cut definition of which puzzles/problems should be at the bottom versus the top of this ladder, but a good rule of thumb is the higher up the ladder you go, the more original handicapping you&#8217;ll need to do yourself. There are still plenty of profitable puzzles to solve that require little to no original handicapping on your own, and learning how to solve those lower-level problems provides foundational experience to tackle the next-level ones on the ladder. If someone analytically minded came up to me with zero sports betting experience but wanted to get good at it, here&#8217;s the order of problems I&#8217;d recommend becoming proficient at solving:</p><h2><strong>Level 1</strong>: <strong>Matched Betting/Arbitrage</strong></h2><p>This is a great introductory step, because it requires close-to-zero underlying sports knowledge about the bets that drive these opportunities. 
Matched betting, or arbitrage, is where two prices on a bet at two different sportsbooks are divergent enough that you can place bets on both sides of the bet and make a small return while learning more about how these markets play out.</p><p>You&#8217;re usually making bets on the order of betting $240 to win $2, but it is a very low-risk approach and offers a safe path to starting to make money. Plenty of services offer step-by-step guides on how to utilize signup bonuses with these bets, resulting in a low 4-figure return from these bonuses alone. <br>Clearing these bonuses does require you to have accounts at multiple sportsbooks to take advantage of every price offered, an essential and foundational step for achieving higher levels of success at sports betting.</p><p>Getting comfortable with matched betting ingrains a level of price sensitivity at a very early point, which is a habit that most bettors don&#8217;t acquire if they enter sports betting through entertainment-driven entry points. Realistically, your effective hourly rate with arbitrage alone probably won&#8217;t be high enough to make it worth your while, but if you use it as a method to clear sportsbook deposit bonuses, it&#8217;s a great introductory step to get comfortable with the fundamentals.</p><h2><strong>Level 2: Exploiting Bad Prices, aka Top-Down Betting</strong></h2><p>One natural question people start to ask after getting comfortable betting arbitrage opportunities: why <em>do </em>sportsbooks have wildly divergent prices for bets sometimes? By definition, arbitrage can&#8217;t exist unless at least one of the books is &#8220;wrong&#8221; according to whatever the real fair price is for a bet. 
If you can start to think critically about how to determine which book is on the wrong side of an arbitrage opportunity, you can start betting into those wrong prices exclusively, taking on some risk but earning a much higher rate of return over the long term.</p><p>One simple way to determine which book is wrong is to compare each book&#8217;s price to the average price from other sportsbooks. Usually, one of the books sticks out like a sore thumb with a bad price, and guessing that the outlier is the bad price is usually profitable. Pay close attention to what we did here: once again, we made no effort to come up with an original estimate of the fair price from our own handicapping; we just compared one price to the rest of the market and guessed it&#8217;s probably bad. This is called <em>top-down betting</em>, where you&#8217;re using other market prices to identify books that are offering bad prices. The goal with top-down betting is to figure out which bets are profitable using only data from other sportsbooks and no original handicapping of our own.</p><p>One common implementation of this strategy is to compare prices at all sportsbooks to prices offered by &#8220;sharp&#8221; sportsbooks, aka sportsbooks that have high limits on all their bets and dedicate significant resources to making their pricing as sharp as possible, and to bet into non-sharp books whose price is significantly different from the sharp books&#8217;. Not only is this a significantly more profitable approach than matched betting, but it still stays in the sweet spot of <em>requiring little to no sports handicapping knowledge to turn a profit. 
</em>In my opinion, top-down betting is an underrated place to stay on the sports betting puzzle ladder: you can clear low-to-mid 5 figures with a moderate amount of time spent, and you don&#8217;t have to invest a lot of time into original handicapping.</p><p>So why doesn&#8217;t everyone do this if it&#8217;s such easy money? Because most people get into sports betting not because they want to make money, but because they either just want to have fun or, more crucially, are playing the meta-game of becoming a better bettor. For many people, it&#8217;s just not fun or satisfying if you&#8217;re not using your own sports knowledge to come up with your own price and try to beat the market. And while that&#8217;s still an attainable goal at higher levels, most people think they&#8217;re ready to do exactly this when they should first get more proficient at the basics of the lower-level steps on the ladder we&#8217;ve been talking about. Before you start handicapping markets yourself, you should at least be confident that you can profit with top-down betting. If you don&#8217;t want to get into handicapping yourself but you&#8217;re cognizant that sportsbooks can and do hang bad prices and you want to exploit that fact, top-down betting is a great place to stay on the ladder to make money in sports betting.</p><h2><strong>Level 3: Handicapping Low-Limit Markets</strong></h2><p>One of the benefits of top-down betting is that you really internalize how wide the range of prices is for so many betting markets across sportsbooks. Wait though, aren&#8217;t the books supposed to be the source of truth for predicting how the game is going to go? If that&#8217;s the case, then why do they have such different prices for the same markets? 
And when you end up limited after becoming profitable in some of the lower-limit markets, maybe you start to wonder what exactly the books are so afraid of, given that they have such low limits and they&#8217;re quick to restrict you. <br>Slowly but surely, you come to the conclusion that these markets are not in fact efficient, and that you might be able to come up with a better number than the books. You think you might be ready to start handicapping your own number: not just relying on bad prices to find profitable spots, but relying more on your own numbers to bet into the market.</p><p>How to get good at coming up with your own number is well beyond the scope of this article, as it&#8217;s a never-ending process of refinement. Many bettors are successful blending one or more projection sources as their number without making any themselves, knowing when and where to execute on those numbers. Others go deep into the modelling and originating process, getting good at the fundamentals of predictive analytics and coming up with their own bettable numbers. All of these are viable options, and some form of them is required if you want to advance to the next rung of the sports betting ladder and open up your range of bettable opportunities. And if you try to go down this route, your best place to start will be lower-limit markets. Typically, this means prop betting, alternate spreads/totals, and opening lines for regular spreads and totals markets. Low-limit markets have low limits for a reason: they&#8217;re far less efficient and more beatable than mature markets, so sportsbooks will want to minimize their liability in these areas. If you&#8217;re ready to take the plunge and move away from top-down betting, proving your edge in these low-limit markets should be your first test.</p><p>Additionally, the skills you&#8217;ve built up from the previous steps on the ladder will help you juice your edge in these markets. 
Line shopping will be second nature to you at this point, and you should have some intuition built up around which books are sharper or weaker in certain markets from your time betting into them via top-down betting.</p><p>The advantage of moving up to this level is your expanded range of betting options. If you can beat low-limit markets without needing to bet into only weakly priced markets, you&#8217;ll be able to capitalize on more betting opportunities, increasing your betting volume and your overall returns. And getting the fundamentals of handicapping down is essential if you want to move up to the final stage of sports betting.</p><h2><strong>Level 4: Handicapping High-Limit Markets</strong></h2><p>This is the final step of sports betting mastery: being able to bet your original numbers into the highest limits allowed by sportsbooks. This means your numbers are sustainably better than the sportsbooks&#8217; numbers during all periods and not just during openers. It takes a lot of work to get numbers that are reliably better than the sportsbooks&#8217;: you&#8217;ve priced in all of the major angles and variables, pored over your outliers to find your leaks and tamped them down, and know how to blend objective and subjective information into an informed bet/no-bet decision for every number on the board.</p><p>This refinement is a never-ending process. The best handicappers never feel like their originating process is truly done, as there are constantly new data sources, modelling techniques, and changes in the fundamental conditions to account for. They also know their edges are constantly in flux, and many will go away over time as the sports betting markets advance toward further efficiency (i.e., trading teams catch up in their own modelling strategies), requiring them to stay a step ahead and find new sources of edges as old ones dry up. 
All of them share another crucial trait: an absolute love of the grind, and a high tolerance for the minutiae and details of getting the best number possible.</p><p>This may all sound like a lot of unnecessary work to get to this point. Maybe you&#8217;re already betting into these markets and have had some success, so you feel like you don&#8217;t need to go through the baby steps and learn the fundamentals. In the majority of cases, any success from novice sports bettors in these markets is likely due to variance, and their long-term win rate is going to regress to the mean. Maybe this has happened to you already: you had a hot streak to start, but are frustrated that your initial success isn&#8217;t being replicated. If that&#8217;s the case, you&#8217;ll be set up for far more success going through the steps outlined in this ladder, as you will learn the fundamentals of successful sports betting along the way.</p><p>A reminder from the top of the article: you are not required to progress along the entire ladder to be a successful sports bettor. Ultimately, a successful sports bettor is simply one that makes money from sports betting; the amount you make depends on what your goals are and how much time and effort you are willing to put into it. Just like there are successful amateur poker players or traders who can turn a profit but don&#8217;t make it their full-time job, there are also sports bettors who can make X amount of money with Y amount of time invested. Where your X and Y land depends entirely on you, but no matter what those values are, there&#8217;s room to make money at whatever level you choose to be.</p><p>Playing with Prime Sports gives you a key time-saving advantage over other books: you never have to spend time figuring out how to get money down, because we never limit our players and we welcome arbitrage action.</p><p></p><h6><em>At Prime Sports, we are committed to our members playing responsibly. 
It's important to always bet within your limits.<br>If you or someone you know has a gambling problem, call 1-800-GAMBLER</em></h6>]]></content:encoded></item><item><title><![CDATA[Everything In Sports Betting Is A Guess]]></title><description><![CDATA[On the nature of information overload]]></description><link>https://journal.primesports.com/p/everything-in-sports-betting-is-a</link><guid isPermaLink="false">https://journal.primesports.com/p/everything-in-sports-betting-is-a</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Wed, 01 May 2024 13:00:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Anyone who&#8217;s worked in statistics and data science can tell you that numbers can instil a false sense of confidence in &#8216;data driven&#8217; decisions. Pick your tactic: seven decimal places to convey the illusion of seven-decimal-place accuracy; a long and multifaceted end-to-end modelling process to show sophisticated estimation techniques; p-value hunting from A/B tests. There&#8217;s no shortage of ways to launder fundamentally flawed analysis into a pretty end result loaded with pre-conceived bias. </p><p>It&#8217;s part of why I&#8217;ve always preferred to stay in sports betting: your bankroll doesn&#8217;t lie over the long term, and the betting markets reward being accurate and don&#8217;t care if you got there with a flashy approach. <br>What we&#8217;re increasingly seeing in the tools and social side of the industry is nice packaging, bold claims, and no results to back anything up. To be clear, there are excellent tools out there, but they&#8217;re always making educated guesses based on the biases of their creators, and their output needs a certain amount of analysis to establish its true value. 
The start of that is some insight into what&#8217;s going on under the hood of these engines.</p><p>Projections are one of the most common tools out there. You can find projections for just about anything in sports these days (team strengths, spreads / totals, prop values), and many places offer betting recommendations off of them. The projection-to-bet process is simple: produce your projections, see where they differ from the market, and bet accordingly. It&#8217;s easy to see why projections are as popular as they are: they have the feeling of elegance and simplicity all in one, giving confidence that someone is crunching numbers for you and all you have to do is find where they&#8217;re different. It&#8217;s even more intoxicating if you&#8217;re making your own projections and betting them: you get a genuine rush when your projections are different from market values, because you feel like you may have found an edge no one else has. <br>The UI for these projection-based tools gives them a sort of authority as well: they all have some explanation of their official-sounding methodology, and the numbers they display make it feel like there&#8217;s a robust maturity behind them. This, by the way, could apply to literally any sports betting tool that shows numbers and percentages: the way the numbers are displayed makes them feel scientific, which ends up short-circuiting any questioning of the accuracy or methodology of these numbers.</p><p>In practice, most projections don&#8217;t actually beat the market, and if they show a different number than the market, it usually means the projection process is wrong, not the market. There are any number of common explanations why: their prediction process simply isn&#8217;t accurate enough, they can&#8217;t keep up with breaking news fast enough to produce actionable recommendations, or they have some flaw in converting projected outcomes to the distributions that betting requires. 
<br>Anyone who bets their own projections into markets has learned the hard way that profitable projections are <em>hard</em>; getting hit by negative returns remains one of the best ways to drive improvements in your projections process. All tools are powered by many different methodologies and assumptions, each of which has its own mistakes, all of which get obscured by a seemingly official number at the end of the process.</p><p>What value, then, should be given to all this data? Should every ROI number just be thrown out entirely? Should projections be combined with home-grown opinions on the game?</p><p>Just knowing that the numbers you see on these tools can come from wildly different approaches is empowering on its own. If you know what questions to ask about these tools, you can start to deduce for yourself which tools might be good and which ones might be bad. If you start to see certain patterns in the results they produce, you can start to question some of the assumptions these tools might be making. And even when you get good numbers from tools, you can pair those numbers with information beyond just their calculations to put together pieces of the puzzle for how to find profitable bets.</p><p>There are ways to intelligently incorporate this information. The arduous but reliable route is long-term ROI tracking of bet recommendations produced by betting tools. In the end, if something works, the money will show up. The burden will be on you to track each tool&#8217;s results yourself, but there are no shortcuts here: this is a game of time and detail. 
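</p><p>As one way to picture that tracking, here is a minimal Python sketch that summarizes a bet log per tool. The log layout (tool name, stake, profit), the tool names, and the function name are all hypothetical, and it assumes every stake is positive; ROI here simply means total profit divided by total amount staked.</p>

```python
from collections import defaultdict


def roi_by_tool(bet_log):
    """Summarize (tool, stake, profit) rows as {tool: (number_of_bets, roi)}."""
    staked = defaultdict(float)
    profit = defaultdict(float)
    bets = defaultdict(int)
    for tool, stake, pnl in bet_log:
        staked[tool] += stake
        profit[tool] += pnl
        bets[tool] += 1
    # ROI is total profit divided by total amount staked for that tool.
    return {tool: (bets[tool], profit[tool] / staked[tool]) for tool in staked}


# Hypothetical log: profit is negative when the bet loses.
sample_log = [
    ("projection_tool", 110.0, 100.0),
    ("projection_tool", 110.0, -110.0),
    ("odds_screen", 100.0, 120.0),
]

for tool, (count, roi) in roi_by_tool(sample_log).items():
    print(f"{tool}: {count} bets, ROI {roi:+.1%}")
```

<p>A handful of rows like this says very little on its own, so a log of this kind is only worth reading once it covers a long run of recommendations from each tool.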
</p>]]></content:encoded></item><item><title><![CDATA[Why the juice matters]]></title><description><![CDATA[A guest post by a friend of Prime Sports and one of the sharpest players out there, Captain Jack Andrews (@capjack2000)]]></description><link>https://journal.primesports.com/p/why-the-juice-matters</link><guid isPermaLink="false">https://journal.primesports.com/p/why-the-juice-matters</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Fri, 12 Apr 2024 10:53:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Prime Sports asked Captain Jack if he could go into detail of what juice is and why understanding it is key to good bankroll management across the lifetime of your playing experience. <br>Jack was not compensated for the article, and no editorial requirements to promote Prime in particular were given to him.<br>Follow Captain Jack <a href="https://twitter.com/capjack2000">@capjack2000</a></em></p><div><hr></div><h1>Why the juice matters</h1><p>Some people call it juice, some call it vig, others call it the house edge. Whatever you call it, the vigorish is one of the unfortunate realities of sports betting. It&#8217;s why being the sportsbook operator is profitable, and being the sportsbook bettor is so frustrating.</p><p></p><h3>The Basics of Juice</h3><p>First, let&#8217;s review the basics for anyone unfamiliar. The sportsbook sets lines on a variety of games and bet types. We, the bettors, get to choose which games we want to bet and which ones we&#8217;d rather not touch. To even the playing field a little against adverse selection, the sportsbook gets to charge a vigorish.</p><p>For instance, a coin-flip would be 50/50, or +100 expressed in US odds. 
But a sportsbook might offer that wager at -110 on either Heads or Tails. That difference between +100 and -110 is the juice.</p><p>Some people quickly assume the juice in sports betting is 10%, since you bet $110 to win $100. However, if you win, you get your stake back plus your winnings. &#8220;The loser pays the juice&#8221; is the refrain you&#8217;ll hear from experienced sports bettors.</p><p>If there are two bettors on the coin-flip and one takes Heads while the other takes Tails, they&#8217;d both put up $110 to win $100. The winner gets their $110 back plus an extra $100. The loser gets nothing, and the sportsbook gets $10. The juice represents $10 of the total $220 wagered between the two parties. Divide $10 by $220 and you get the juice in terms of a percentage. In this case, it&#8217;s 4.55%. There&#8217;s your house edge.</p><p>Now that we have the basics down, it should be obvious that the lower the juice, the better for the bettor. One of the benefits of betting at Prime Sportsbook is the reduced juice on their lines. Where most books use a standard -110, they use -108. Reworking the example from above:</p><p>$108 to win $100 on Heads<br>$108 to win $100 on Tails</p><p>$108 + $100 to the winner; $8 to the sportsbook. 8/216 = 0.037 or 3.7%</p><p></p><h3>Win More When You Win - Lose Less When You Lose</h3><p>Sports betting is a low-margin business. Over the course of thousands of wagers, sportsbooks make their money by having that small theoretical margin on every bet made. The more you wager and the more frequently you wager, the more the juice impacts you.</p><p>Remember when I mentioned sports bettors often say &#8220;the loser pays the juice&#8221;? Well, the inconvenient truth is that you&#8217;d better be OK with losing, because sports bettors lose a lot. Even the best sports bettors can&#8217;t expect to win more than 55-57% of the time. When you&#8217;re dealing with such tight margins, the difference between -110 and -108 is big. 
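</p><p>The hold arithmetic from the coin-flip example can be sketched in a few lines of Python. This is a minimal illustration that assumes a balanced book, with one bettor on each side at the same negative American price; the function name is hypothetical.</p>

```python
def coin_flip_hold(price: int, to_win: float = 100.0) -> float:
    """Share of total handle the book keeps on a balanced two-way market.

    Assumes both sides are offered at the same negative American price,
    for example -110 on Heads and -110 on Tails.
    """
    risk = to_win * abs(price) / 100  # -110 means risking 110 to win 100
    handle = 2 * risk                 # one bettor on each side
    book_keeps = risk - to_win        # loser's stake minus winner's profit
    return book_keeps / handle


print(f"{coin_flip_hold(-110):.2%}")  # 4.55%
print(f"{coin_flip_hold(-108):.2%}")  # 3.70%
```

<p>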
A -110 bettor needs to be right more than 52.4% of the time to turn a profit, while a -108 bettor has it a little easier at 51.9%.</p><p>Let&#8217;s say you&#8217;re able to hit 55% on your picks. You wager to win $500 per game (risking $550 at -110 or $540 at -108). After 1000 games, you&#8217;re, as expected, 550-450. You&#8217;d be +$27,500 against a -110 line, and +$32,000 against a -108 line. Sure, you print money with your picks, but would you turn down an extra $4,500? Me neither.</p><p></p><h3>The Power of Synthetic Hold</h3><p>If you read the great book <strong>The Logic of Sports Betting</strong> by Matthew Davidow and Ed Miller, you&#8217;re familiar with the concept of Synthetic Hold. For those of you who haven&#8217;t, Synthetic Hold is the hold you get when you combine the best price in the market on each side of a wager, and it helps you identify beatable lines.</p><p>If Sportsbook A has<br><em>Cleveland Guardians +220<br>New York Yankees -260</em></p><p>While Sportsbook B has</p><p><em>Cleveland Guardians +250<br>New York Yankees -300</em></p><p>You&#8217;d take the CLE +250 from B, and NYY -260 from A: the best price on each side from either book. The theoretical house edge on this synthesized market between the two books is only 0.79%. Very low! This is then, theoretically, an easier market for you to beat because you&#8217;re playing against less of a house edge.</p><p>The advantage of having Prime Sportsbook in your rotation of books is that their low-juice model is going to make them one side of that synthetic market very often. In Ohio and New Jersey you have a lot of sportsbooks. There&#8217;s a very good chance Prime ends up in some arbitrage situations for you due to their low juice.</p><p>To some, arbitrage is the Holy Grail of sports betting. Place two opposing sports bets at different sportsbooks and guarantee a profit with no risk. Others like having some risk in their betting.</p><p>It&#8217;s one of the reasons I can&#8217;t wait to have Prime Sportsbook on our odds screen at <a href="https://unabated.com">Unabated.com</a>. 
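</p><p>The break-even rates, the 1,000-game results, and the synthetic hold quoted in this section can be reproduced with a short Python sketch. It assumes American odds, a bettor who always stakes enough to win a fixed amount at a negative price, and the common formula hold = 1 - 1 / (sum of implied probabilities); all function names are hypothetical.</p>

```python
def implied_prob(american: int) -> float:
    """Implied probability of an American price; also its break-even win rate."""
    if american > 0:
        return 100 / (american + 100)
    return -american / (-american + 100)


def season_profit(wins: int, losses: int, price: int, to_win: float = 500.0) -> float:
    """Net result when every bet is sized to win `to_win` at a negative price."""
    risk = to_win * -price / 100  # -110 means risking 550 to win 500
    return wins * to_win - losses * risk


def synthetic_hold(best_price_side_a: int, best_price_side_b: int) -> float:
    """Hold of the market built from the best available price on each side."""
    book_sum = implied_prob(best_price_side_a) + implied_prob(best_price_side_b)
    return 1 - 1 / book_sum


print(f"{implied_prob(-110):.1%}")          # 52.4%
print(f"{implied_prob(-108):.1%}")          # 51.9%
print(season_profit(550, 450, -110))        # 27500.0
print(season_profit(550, 450, -108))        # 32000.0
print(f"{synthetic_hold(250, -260):.2%}")   # 0.79%
```

<p>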
We identify low synthetic hold and arbitrage opportunities for bettors. It&#8217;s great for bettors regardless of their risk tolerance.</p><p></p><h3>Lower Juice at Sharp Sportsbooks Hits Different</h3><p>Prime Sportsbook has to be sharp with their bookmaking in order to survive with lower juice. The thinner the margins for the sportsbook, the sharper they need to be in adjusting their lines. There are going to be many aspirational bettors that believe they can&#8217;t beat a book like Prime. Some won&#8217;t even try. I actually think Prime is exactly the type of sportsbook you should try to beat.</p><p>The opposite of a sharp sportsbook is a recreational sportsbook who relies on ads and gimmicks to draw in bettors. You know who they are. At the recreational sportsbooks, I pick off the low-hanging fruit or take advantage of their poor bookmaking.</p><p>With a bigger edge at a recreational sportsbook, the juice doesn&#8217;t scare me. I can beat those sportsbooks if I can get into the long run. And that&#8217;s a big if because those sportsbooks are quick to limit or restrict me.</p><p>However, in a sharp book, like Prime, the juice does matter. Because it&#8217;s very likely my edge is smaller and the juice has a greater impact. If I get an edge though, it&#8217;s more sustainable. I don&#8217;t have to sweat whether there will be lower limits next time I try to bet.</p><p></p><h3>Capitalism Breeds Innovation</h3><p>One more slightly esoteric point. The average American sports bettor is currently not price sensitive. They don&#8217;t seek out the best price when betting sports. As a result, we have these large flashy sportsbooks fleecing consumers and they&#8217;re none the wiser.</p><p>In many markets, there&#8217;s still room for growth in sports betting. Room for someone to come in and do it better than it&#8217;s being done and win over business. That&#8217;s what makes capitalism such a powerful economic driver. Capitalism breeds innovation. 
Innovation benefits consumers.</p><p>You may not consider lower juice to be a cutting-edge innovation. However, if it is successful, other operators will look to copy it or improve upon it. This is why I wrote this article for Prime. I want to encourage bettors to support lower juice options to stop the current trend of higher house edges in sports betting. We can make a more sustainable sports betting experience with our bets and where we encourage others to play.</p>]]></content:encoded></item><item><title><![CDATA[The professional bettor]]></title><description><![CDATA[An introduction on how to transition to a professional, winning, player]]></description><link>https://journal.primesports.com/p/the-professional-bettor</link><guid isPermaLink="false">https://journal.primesports.com/p/the-professional-bettor</guid><dc:creator><![CDATA[Prime Sports]]></dc:creator><pubDate>Thu, 11 Apr 2024 11:00:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Z5g5!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0b46edb0-7b2a-48f0-83b1-56b9ca57aa5a_600x600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If I had to sum up what distinguishes a professional sports bettor in a single sentence, it would be this:</p><p><em>A professional bettor will bet on any side of any market for the right price, and will only bet when a market offers a favorable price.&nbsp;&nbsp;&nbsp;</em></p><p>This may sound obvious, and maybe even no more descriptive than what you might expect out of a dictionary definition for &#8220;professional sports bettor.&#8221; However, when you contrast this with the ways most sports bettors approach betting, whether it&#8217;s the media they consume, the strategies they employ, or the tools they use, it becomes clear most people don&#8217;t actually approach sports betting this way. 
A few examples:</p><ul><li><p>Take the standard article of sports betting media: the staff picks / betting analysis / recommended plays, where the author provides some combination of statistics, background, and information about a game, concluding with which bet to take. Providing analysis and picking a bet is all well and good, but the majority of these articles don&#8217;t provide a crucial piece of information: the price at which they would <em>not </em>bet their side, i.e. the implied fair market line. Market prices change constantly, so without knowing when <em>not </em>to bet, the authors are not providing all the necessary information on what market circumstances will make their bet profitable. There is no &#8220;for the right price&#8221; component of these articles.</p></li><li><p>The other most popular price-insensitive approach to betting involves some type of betting system, typically built around some specific game circumstances (certain weather conditions, how teams play after a bye/back-to-back/etc.) and evaluated on how these system bets perform against the spread, or ATS. These ATS systems don&#8217;t provide any information on whether a bet is still expected to be profitable if the market line changes: they treat a -7, -3, and +3 line as exactly the same. This price insensitivity is why most of these systems end up not turning a profit: they are not able to articulate what the exact boundary is for when a bet is and is not expected to be profitable.</p></li><li><p>Many of these articles/systems contain some type of reasoning for why they might like a certain team to win or why they have a preference for one side of the over/under line, usually relating to specific players (this quarterback has been playing well, this defense will struggle, etc.). 
If you expect a team to cover because their quarterback will play well, why are you not targeting quarterback prop markets that are more of a direct match for your beliefs, and may offer more favorable prices than the overall game line? Where is the analysis for which markets are the best ones to target for your predicted outcomes, and what their corresponding fair prices are? The best professional bettors are keenly aware of every price in every market, know which ones will provide the maximal return on their investment, and target them accordingly.</p></li><li><p>One critical way those bettors find those markets is to have an open account at as many sportsbooks as will accept their bets, and to place each bet at the sportsbook that offers the best price among them, a practice commonly known as line shopping. This is far more critical to most professional bettors&#8217; success than the models, insights, or plays they are able to devise: only executing on those insights at the best price possible across the entire sports betting market. The popularity of parlays is a telling indicator of how often most sports bettors are doing the exact opposite: placing all their bets at a single book, while getting worse expected prices than if they had placed each bet individually. These bets are long-term losers in over 99% of cases, steering users away from seeking out the best prices across all books and having them place bets with much higher vig than standalone bets, eating into their long-term ROI.</p></li></ul><p>Why are most bettors using so many of these losing strategies? Part of it is education: recent sports betting legalization means there are still a lot of inexperienced bettors out there who aren&#8217;t familiar with the best ways to maximize their opportunities for profit. A big reason why we&#8217;re starting this newsletter is to help some of those new bettors understand how to improve their sports betting approach. 
But even for the bettors who have some idea of what they should be doing, implementing these best practices on their own can be a <em>lot </em>of work. Developing intuition around at what price to bet your beliefs for a single market (spread / totals / props) can take a long time, and developing it for <em>all </em>markets takes even longer. Scanning every market across all books to find the best lines can be incredibly time-consuming; even the best currently available line shopping tools aren&#8217;t all that easy to use and put a high burden on the people using them. Understanding how one of your beliefs correlates to other beliefs and other markets is very difficult without developing your own mathematical models.</p><p>Fortunately, sports betting legalization has spurred a wave of innovation for both sportsbooks and sports bettors, building on all kinds of new technology from the last decade (advanced modelling / AI techniques, cloud computing, etc.).</p><p>We&#8217;re in the process of scoping out our own tools that utilize all of these advancements to help our members, and in the meantime, we&#8217;ll explore what it really means to be a winning sports bettor: how to gauge expectations, develop strategies for finding profitable bets, and be uncompromising in maximizing your returns at all times. We believe that anyone who is willing to put in the work can learn how to become a winning bettor, safely and profitably.</p>]]></content:encoded></item></channel></rss>