Space Recruits, Part 1: The State of Analytics


A conversation with Jan Van Haaren on what's up with soccer data and how it can help clubs sign players.

Ah, late July. The big summer tournaments are cooked, the big transfer deals not yet done, and most of the players you care about are hard at work posting thirst trap yacht pics from the Seychelles. Unless you’re heavily invested in a Gold Cup group stage game between Qatar and Grenada, soccer’s basically on break for the first time since back when Zoom happy hours still seemed like a fun way to keep in touch and not a torture from the grimmest circle of hell.

Which makes it a perfect time to catch up on some soccer posting, if you ask me. For the next couple weeks, while clubs haggle over multimillion-euro transfer fees for French toddlers or mortgage their stadium to mount a hopeless bid for Erling Håland, space space space will be churning out letters on how recruiting works — or should work, maybe, in a less fallen world.

We’ll start with some stuff on data. Rather than dive into numbers right away, I thought it’d be worthwhile to talk to somebody who does this for a living first. So back in May, not long after he left his position as Chief Analytics Officer at the recruitment analytics company SciSports and before the Belgian champions Club Brugge hired him to be their new data scientist, I had a long, fascinating conversation with Jan Van Haaren about the state of soccer analytics.

If you care at all about soccer nerdery, you probably already know Van Haaren. As a doctoral student at the Belgian university KU Leuven, which I’ve written about before, he helped create the action value model VAEP, which I’ve also written about before. Action value models, also known as possession value models, are the thing you’re going to hear a lot about once everyone’s caught up on expected goals. They estimate how much every touch of the ball changes a team’s chances of scoring and/or conceding, which seems like it’d probably be a pretty useful thing to know when evaluating potential recruits.
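To make that arithmetic concrete, here’s a minimal sketch in Python. In a real VAEP-style model the probabilities come from classifiers trained on event data; the numbers below are made-up placeholders purely for illustration.

```python
# Minimal sketch of a VAEP-style action value: how much one touch of the
# ball changes a team's chances of scoring and conceding in the near future.
# The probabilities are hypothetical stand-ins for trained models.

def action_value(p_score_before, p_score_after,
                 p_concede_before, p_concede_after):
    """Value = change in scoring probability minus change in conceding probability."""
    offensive = p_score_after - p_score_before
    defensive = p_concede_after - p_concede_before
    return offensive - defensive

# A line-breaking pass that doubles the scoring probability while
# slightly raising the risk of a dangerous turnover:
print(action_value(0.02, 0.04, 0.010, 0.012))  # ≈ 0.018
```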

SciSports thought so too, and for the last four years Van Haaren helped develop their software platform that allows clubs to scout based on what he calls “a commercial implementation of VAEP.” He also keeps close tabs on the soccer analytics community as a whole, so he’s the perfect guy to explain where this whole data business came from, where it’s headed, and how it can help clubs make smarter signings. The conversation below has been edited for length (lol) and clarity.


JOHN MULLER: I want to do some newsletters focused on what analytics can contribute to sporting departments’ scouting efforts. I think you’re uniquely situated to talk about it, having worked at a recruiting analytics company before looking to join the club side.

What I understand about the space you’re leaving is that the big event data providers, Opta and StatsBomb, offer player scouting platforms, and then companies like Analytics FC and SciSports have their own software built around custom models. Are there other important competitors I don’t know about?

JAN VAN HAAREN: InStat and Wyscout both have their own software platforms. They’re more focused on video, but they also offer some data. I think Analytics FC probably comes closest to SciSports in terms of the product they’re offering. SciSports has this contribution ratings model that’s a commercial implementation of the VAEP framework that we worked on at KU Leuven, and Analytics FC has something similar.

Yeah, they’ve got a Markov-based version of a similar thing.

Exactly. There’s been so much incremental work in this area, and it’s not even incremental, it’s like people doing the same thing alongside each other. I’ve always found that weird. I’m definitely not going to claim that the VAEP framework is the best framework out there — there’s also g+, which is an improvement in some areas. But instead of reinventing the wheel over and over again, I always wonder, why don’t people start from one of the existing frameworks and try to improve that? That’s how things usually go in science. But I’m getting slightly off track.

No, this is interesting, because this is your specialty. Between VAEP and goals added and now StatsBomb’s OBV, it seems like we’re kind of converging on an idea of how these event data-based action value models, or whatever you want to call them, should work. Is there a gold standard that we’ll eventually agree on, like, this is the right way to do it, or will there always be a million different versions doing slightly different things?

Tough question, but I don’t think we’ll ever have a gold standard. I think you should always tailor the approach to the question you’re trying to answer. These VAEP-like models are very good at answering certain questions, but they’re not very good at answering others. For instance, if you’re going to look at defensive performance, there are probably better ways of building metrics to evaluate that. All these models kind of agree that success is scoring a goal. But depending on the question you ask, I think it maybe makes sense to use different criteria.

Yeah, I think there’s an emerging consensus that this is a valuable question to ask — “How much does this action contribute to the probability of scoring or conceding a goal?” — but I’m interested in the idea of other success criteria. Can you think of examples of why we might want to ask questions that aren’t “Are we likely to produce a goal this way?”

This is one of the reasons I’m very eager to move to a club, because I’ve always felt like I don’t understand these questions well enough and I don’t have enough opportunities to talk to practitioners who are supposed to use these models.

One way to think about it is that most coaches have a gameplan. They expect their players to do certain things. And each of the things a coach expects from their players, you could transform into a success criterion. For instance, one coach might expect his wingbacks or wingers to execute early crosses. We could say, “Okay, each time a sequence results in an early cross, we mark it as a success.” And then everyone who contributed to setting up this early cross gets credit for that. It’s probably not the best example, but each aspect of a gameplan you could eventually transform somehow into a success criterion.
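[A rough sketch of what swapping in that success criterion could look like in Python. The event fields and the early-cross test are invented for illustration; the point is just that the label on each possession sequence changes, and everything downstream stays the same:]

```python
# Toy example: label possession sequences by "ended in an early cross"
# instead of "ended in a goal". A model trained on these labels would then
# credit the build-up actions, not just the cross itself.

def is_early_cross(event):
    # Hypothetical definition: a cross from a wide channel played before
    # reaching the final 15 meters of the pitch. Thresholds are made up.
    return (event["type"] == "cross"
            and event["x"] < 85
            and abs(event["y"] - 50) > 20)

def label_sequences(sequences):
    """Return (sequence, success) pairs under the coach's criterion."""
    return [(seq, int(any(is_early_cross(e) for e in seq))) for seq in sequences]
```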

One thing I wonder about with that is like, okay, yeah, you might want to deviate from an action value model that’s based on average goal values if you think your game model is likely to produce goals from an early cross at a substantially different rate than an average team, but how do you validate that? Because there just aren’t enough early crosses in a season to say whether your game model’s actually different from the average, right?

I wouldn’t use this approach to evaluate these crosses themselves, but more like all the actions leading up to this cross. Because I would value how much they contributed to making sure your teammate could perform this early cross.

So it’d be like, okay, we know how much an early cross is worth, but what are the most efficient and effective ways to get there?

Yeah, people would get credit for getting there. I’m just thinking out loud, by the way. But if a coach really believes that these early crosses are important, that they’re part of the gameplan because maybe they have a striker who’s really good at finishing those early crosses, then you would want to reward players for performing actions that contribute to creating those.

But your question is spot on: How can you evaluate or validate this? One of the things I’ve learned in the last few years is there’s a lot of focus on developing new metrics, and I’m always happy to see people coming up with new ideas, but at the same time I think that validating and evaluating the metrics is equally important, and it gets almost no attention.

You’ve been drawing a distinction between validating and evaluating. Can you explain the difference?

I’m not sure if everyone would agree on my definition, but how I usually explain this to people is, for me, validation is about transforming a question you want to answer from soccer language into a data science problem. You want to make sure you’re properly translating the actual question that a practitioner has. Does the approach make sense to answer a particular question?

And then once you’ve settled on your approach, you can usually evaluate it in a more technical, formal way. In the case of expected goals, for instance, if we agree that measuring goal-scoring opportunities using a goals model makes sense, we’ve done the validation. But then we start building this model, and we want to make sure that the model actually does a good job of predicting the outcome of shots. That’s the evaluation part.
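[For the evaluation step he describes, a minimal sketch: check that the model’s shot probabilities actually line up with outcomes, using a Brier score and a simple calibration table. The shot data would be real goals and predicted xG values; everything here is hypothetical:]

```python
import numpy as np

def brier_score(goals, xg):
    """Mean squared error between predicted probability and outcome (0/1).
    Lower is better; a well-calibrated model beats the base-rate guess."""
    goals, xg = np.asarray(goals), np.asarray(xg)
    return float(np.mean((xg - goals) ** 2))

def calibration_table(goals, xg, bins=10):
    """Per probability bin: mean predicted xG vs. actual conversion rate."""
    goals, xg = np.asarray(goals), np.asarray(xg)
    edges = np.linspace(0.0, 1.0, bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (xg >= lo) & (xg < hi)
        if mask.any():
            rows.append((float(xg[mask].mean()), float(goals[mask].mean()), int(mask.sum())))
    return rows  # close agreement between the first two columns = well calibrated
```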

So you were saying that one of the main lessons you’ve learned is that evaluation and validation are important, that there’s not enough focus on them.

Yeah, just observing what’s going on in the analytics community as a whole. When people write blog posts about what metrics they’ve developed, usually they use some cherry-picked examples to demonstrate how well their approach works. And I usually wonder, like, can you please do a technical comparison with one of the other models, like VAEP or g+? Because then I will truly understand how well your model works.

It seems to me that there are parallel analytics-community and public timelines. On one timeline you’re sometimes like, why are we not moving toward better metrics already? But also, like, people are just finding out about expected goals. That’s only hit the mainstream in the last year.

Good point. A lot of the work that’s being done is ahead of the curve. A certain number of clubs are convinced of the added value of what you can do with data, but this number is still fairly small. I really hoped this adoption would be much faster among clubs.

Why do you think adoption isn’t faster?

That’s something I’ve thought about a lot, and I think it’s mostly a lack of expertise within clubs. We’re producing all these metrics, but what would be really valuable to many clubs would be people who can actually teach them about all this.

People at clubs will reach out to me on Twitter, and I guess the same thing happens to you, with some basic questions. There are many analytics companies out there that approach these clubs and offer all these things, but people in the club often don’t really understand what these products do and how they could be useful, and in the end nothing happens.

I’ve noticed especially in the last year that there’s an increasing number of people at clubs that have started to realize this data thing has value and they want to do something with it, but right now it’s often very person-centric. There’s like one person in an organization who’s interested in data analytics and helps to get things started, but when this one person leaves the club, everything’s gone. There’s no culture.

It does definitely feel like there’s been a boom in hires in the last year or two, so that most clubs — maybe not most, but more — at least have somebody who knows something about this stuff.

I agree. In the past year or so, I would say, things seem to be changing, but it’s still going slow.

What’s your estimate of what proportion of clubs are doing really advanced analytics work in house versus signing up with companies to handle their analytics load?

I think the number of clubs doing something more advanced in house is very small, way smaller than people seem to think. My feeling is that even bigger clubs have very little in house. There are the big teams everyone knows about — Liverpool, City, Barcelona — and a few others, like Dortmund. I think all the big clubs do something with data to some extent, and some are more secretive about it than others, but below that I don’t think there’s much going on, actually, as far as I understand.

You say some are more secretive than others, but as far as I can tell they’re all secretive. Liverpool and Barça will publish stuff from time to time, but that’s about it.

Yeah, I mean not necessarily being secretive about what they’re doing, but whether they’re doing something with data at all. Bayern Munich, for instance — I know they have a data scientist, but they’re not very public about it. Maybe Bayern Munich is more advanced than City. Probably not, but I have no idea.

If the analytics departments are really contributing, you’d think the first and most obvious place they would contribute is in recruiting. And hopefully you’d see clubs with better analytics departments make better recruiting decisions. Do you think there’s any evidence that’s happening, and what would that evidence look like?

That’s a tricky question. Looking at the players that Liverpool’s recruiting, I’m pretty certain that involves some data. Liverpool’s head of analytics Ian Graham talked about it at Barça’s Sports Summit —

That’s cheating, though, because Graham talks about what he does. But from the outside looking in, what would it look like to look at a club and say, okay, this club is using data and this club isn’t?

From my perspective, if they’re signing players that often pop up near the top of my list for different metrics, that might be a clue that they’re using data. And the other way around. But you obviously don’t know for sure.

Even if you show that a club that’s using data performs better than expected given its budget, is it causality or is it just correlation? Because if they invest in analytics, it probably means they’re doing many things in a more structured or objective way. They’re better organized.

Teams with a positive net transfer balance might be another indication that they’re “doing Moneyball,” but there’s no guarantee. You can also just be lucky, of course. You can sign some very talented player by accident and sell him for 20 million euros. That happens.

When you talk about talented players popping up at the top of your list, I’m wondering — obviously people use different metrics to evaluate players, but there are probably some players who are good according to most analytics models. And I’m wondering whether we’ve seen transfer fees for those players rise relative to the market.

I don’t think we’re at the point yet where analytics has enough influence to be used as a factor in transfer fee negotiations. Many clubs still rely on Transfermarkt, and we all know how those “market values” come about. I honestly think it will happen in the future. It’s a very natural thing to do, to build models that can accurately value players, and clubs will use them.

Yeah, I wasn’t necessarily thinking of analytics being used in negotiations, but in funneling demand toward certain players. If there are 30 clubs using SciSports and Analytics FC and they’re all seeing the same three players are good, I would expect to see fees rise for those players just because more clubs are inquiring about them.

Not enough clubs are data-driven yet to really notice those effects in practice. Probably those effects won’t really be there for really high-profile players — no one needs SciSports or Analytics FC or StatsBomb to know that Messi is good. Maybe for slightly lower-profile players, talented youngsters, that could have an impact. If more and more clubs are able to identify promising players at an early age, that might affect their fees because there’s more demand.

But you don’t feel like you’ve seen that happen, where there was some 20-year-old in your data who looked great, and suddenly there was a bunch of interest around him, and you thought, oh, maybe everybody’s looking at similar numbers?

I remember conversations with colleagues about some guy we were talking about a few months ago because he was really high on the list, and now he’s being chased by two or three clubs. There’s also a client of SciSports — I won’t reveal the name — who noticed that more and more clubs were scouting the same players. In the beginning, they’d often find players using our metrics that no one else seemed to have discovered yet. Then they said, “We’ve noticed that when we find a player who pops up in your metrics, there’s also other clubs who seem to know about this player already.” This was a smaller club, and for them it was really important to be the first one to discover a player, because if one of the bigger clubs discovers the same player, there’s a very slim chance they can still sign him.

I think that actually leads into the next thing I wanted to ask about: What does a good in-house analytics department do on the recruiting side that a consulting company or a software platform can’t do?

That’s a tough question. Do you mean in terms of the metrics or the general approach?

Like, if I’m a sporting director and I have to decide, “Do I just want to get an account with some company or do I want to hire Jan to come run my analytics department,” how’s that going to play out differently for me on the recruiting side?

Having an in-house department would allow you to tailor everything to your needs. We talked about how each team has their gameplan and you can translate principles into metrics. That’s one obvious thing to do.

But I think there’s more to it — it’s also about squad management, squad optimization. Using the metrics we already have, we can estimate the impact of a potential signing on your team, and you can optimize your budget allocation across positions. How much money do you want to spend on your forwards, midfielders, defenders, goalkeepers?

There are many, many questions that analytics can help answer, but most clubs haven’t realized it yet. Usually when I talk to clubs, they’re very interested to hear that, but it seems they haven’t really thought about it. To me, that’s weird. I’m like, “I would have expected that a club your size at least would have thought about this.” This is one area I find really interesting, optimizing the entire squad given the gameplan and the squad-building philosophy, translating that into data and metrics. Easier said than done, but I really believe that’s the future of recruitment.
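[A toy illustration of that budget-allocation idea in Python. All names, projected contributions, and fees are invented; a real department would solve this as an integer program over hundreds of candidates:]

```python
from itertools import product

# (name, projected contribution, fee in millions of euros) per position
targets = {
    "forward":    [("A", 9.0, 20), ("B", 6.5, 8)],
    "midfielder": [("C", 7.0, 15), ("D", 5.0, 5)],
    "defender":   [("E", 4.5, 10), ("F", 3.0, 4)],
}

def best_signings(targets, budget):
    """Brute-force search: one signing per position, maximizing total
    projected contribution subject to the overall budget."""
    best, best_value = None, float("-inf")
    for combo in product(*targets.values()):
        fee = sum(player[2] for player in combo)
        value = sum(player[1] for player in combo)
        if fee <= budget and value > best_value:
            best, best_value = combo, value
    return best, best_value

print(best_signings(targets, budget=30))
# Picks A, D, and F: total value 17.0 for a fee of 29, the best affordable mix.
```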

When clubs are convinced they want to do something with data, the first thing they often try to do is integrate data into their current process. But that’s not the way I would do it. Data offers you the possibility to completely revamp the way that you’re doing things and make it way more efficient. Many clubs are quite old-fashioned and are used to having scouts who watch games and report back to their clubs. But once you start using data, you probably don’t need this setup anymore. You can use data as a first filter and then you can scout specific players in more detail after that.

Clubs often say they don’t have a budget to work with data. But if we can prevent you from making one or two unnecessary trips to visit a match, then you are probably already making a profit. So yeah, clubs really need to change their thinking to leverage the full potential that the data offers them. But it takes a bit of courage to make that happen. Most clubs would rather fail in the traditional way than fail in an innovative way. I talk to a lot of people who are open to trying new things but are reluctant to implement them because if they failed, people would tell them, “Yeah, you should have done it the traditional way. This data approach doesn’t make any sense.” But probably those clubs would have failed anyway, right?

You talked about clubs trying to shoehorn data into their existing process. What does that look like — what’s the bad, Frankenstein process where data gets jammed in where it shouldn’t be?

Often data is just used as yet another opinion, like, we have the voice of the scouts and we have the voice of the data, and we blend it together. That’s not wrong per se, but I think you can use the data in a much better way.

I agree that the most efficient way to use data is as a first filter. The question’s not just what is your scout’s opinion of a player, but which players are your scouts giving you opinions on? You need to narrow down thousands of players.

But you also at some point do need to say, okay, here’s what the data can tell us about the player, and here’s what the scouts can tell us about the player, and how do we weigh these two together?

What I often see is data is used in this way only when it supports the opinion of the scouts. If the data shows a different picture of the player, then the data won’t get a lot of attention.

I learned a lot by reading Thinking, Fast and Slow by Daniel Kahneman and The Undoing Project by Michael Lewis. I think there are a lot of interesting ideas in those books that are very relevant for soccer analytics. I think there’s a lot of bias in the opinion of a scout. Insights from data are also biased, but usually to a lesser extent than people’s personal opinions.

Do you think it’s a stretch to say that one of the important contributions of an analytics department isn’t even the data work itself so much as bringing a more objective or structured approach to some of these questions?

Maybe. To avoid misconceptions, I do think that the opinions of scouts are very relevant and very important. Obviously there are many aspects we can’t really evaluate yet using data. But it’s a matter of weighing all these different insights in the right way.

An increasing number of football clubs are being run like actual companies. Until recently, almost all clubs were making decisions based on gut feeling. Now there are a number of clubs thinking through things in a more disciplined way. At some point, as analytics gets more advanced, clubs that use it will get a bigger edge over clubs that aren’t using it. Right now the edge is still relatively small, but as time goes on metrics will get better, and eventually clubs will be forced to start using analytics or they will simply disappear. Especially in Europe, where you have promotion and relegation, the clubs that are outsmarting their competitors will climb the pyramid and the others will fall down.

So far we’ve been talking about data as a singular thing, but there’s a pretty big difference between what you can measure with event data versus with tracking data. We’re starting to see more widely available tracking data. How is that going to change the role of data in recruiting?

I don’t think it will have a huge impact on the role of analytics, but it will have an impact on the accuracy and applicability of the metrics. It might also help to convince more clubs of the added value of data.

The argument has long been that we’re using event data because we want to recruit players in lower divisions and Eastern Europe where we don’t have tracking data available, because tracking data is often only available for your own league or even your own matches. But nowadays you have broadcast tracking data as well, through companies like Sportlogiq and SkillCorner, so this argument is disappearing. Tracking data is becoming increasingly available even for smaller leagues. So I don’t think in two or three years’ time we’ll still use a lot of models that are purely based on event data. Especially with the StatsBomb 360 data, I think that’s a huge thing for soccer analytics.

I tend to think of the major leaps forward with tracking data as (a) you can measure attacking off-ball work; (b) you can see decisionmaking, which is huge — until pretty recently that was something only scouts could give you a read on; and (c) it gives you some read on defense, which as far as I can tell is basically impossible to do well with event data.

I wouldn’t necessarily agree there’s nothing you can do in those areas. For each of those things you just mentioned, we’ve come up with some proxies based on event data. For the decisionmaking part, we had a section about it in the Choke or Shine paper at Sloan. I’m the first person to admit that the decision rating is very limited and there are ways to improve it. Tracking data will definitely help to do that.

But then again, the question is, how do you want to use those metrics? Because often in recruitment, the metrics are only used to find players. You want players who performed well in a certain area to pop up at the top of your list, and the actual number doesn’t matter that much. You obviously want your metrics to be as accurate, reliable, and robust as possible, but it really depends on the task at hand how reliable and robust they need to be.

So I’m very curious to see, once we have a sufficiently large amount of tracking or broadcast tracking data, how metrics with that type of data compare to what we already have. It will definitely be better, don’t get me wrong, but the question is, will the same players still appear near the top of the list? Because if the same players still appear near the top of the list, what’s the added value of the tracking data?

There’s been a lot of criticism of the work that we’ve published because people seem to look at it from a match analysis perspective. And yeah, of course, I totally agree that if you want to evaluate or analyze one specific match using VAEP — please, don’t do it! It’s not designed to be used in that way. It’s not designed to evaluate individual actions. We know there’s a lot of contextual information missing. We know we don’t know all the locations of the players, and obviously it would help to have those locations. But if you’re looking at recruiting metrics, you’re usually aggregating across many different actions over multiple matches.

I wonder if the list comes out the same for the top ten players, who most clubs don’t have a prayer of signing, but maybe in the middle those little differences matter a lot more.

Our numbers suggest that, just ranking-wise, there’s a really high correlation.
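[That ranking comparison is easy to make concrete: compute a rank correlation between the two metrics’ player ratings. The numbers here are invented:]

```python
from scipy.stats import spearmanr

# Hypothetical ratings for the same six players under two models
event_rating    = [0.42, 0.31, 0.28, 0.22, 0.19, 0.15]
tracking_rating = [0.50, 0.29, 0.33, 0.20, 0.18, 0.16]

rho, _ = spearmanr(event_rating, tracking_rating)
print(rho)  # close to 1.0 means both metrics surface largely the same players
```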

What about on defense, do you think there’s interesting stuff you can do with event data or do you think it’s a lost cause?

We’ve also done some work in that area, and there it’s the same argument. We know about all the limitations that the approach has, but it’s better to have some insights than no insights at all.

We’ve come up with an approach that’s a proxy for how well someone is doing defensively. It compares the actual output of your opponent against the expectation based on previous performances. You can project how much output a player is expected to generate in terms of VAEP, for instance, and compare it to the actual VAEP that someone produced in a match. If it’s considerably lower than your expectation beforehand, then the assumption is you likely did something well as a defender. Obviously there’s a lot of assumptions here and a lot of limitations, but we’ve noticed that in some cases it’s able to find players who are known to be good at defending.
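[A minimal sketch of that proxy in Python, with invented per-match numbers and a deliberately naive baseline; a real version would adjust for opponent strength, venue, minutes played, and so on:]

```python
def defensive_proxy(opponent_past_vaep, opponent_actual_vaep):
    """Positive value = the opponent produced less than their recent baseline,
    which (with all the stated caveats) we credit to the defense."""
    expected = sum(opponent_past_vaep) / len(opponent_past_vaep)
    return expected - opponent_actual_vaep

# A winger averaging about 0.45 VAEP per match was held to 0.10 against us:
print(defensive_proxy([0.50, 0.40, 0.45], 0.10))  # ≈ 0.35
```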

As long as you know what the limitations are, it can still help you to identify players. Maybe there will be a few false positives, but you still have the eye test and other metrics you can rely on. If you just use this very basic metric as a first filter on 50 competitions that might have potentially interesting players, maybe there are five players who really stand out, and you look at them and there’s like two who are actually good and three that are just false positives.

To analyze individual matches, these metrics are useless. You can throw them away. But if you have one season’s worth of data, well, it does tell you something. It can tell you whether a player’s worth looking at or not, and often that’s what you’re interested in.

How would you explain to a coach how you know that a metric is good?

That’s something I’m not very good at, because I haven’t had a lot of opportunities to work with coaches and scouts. That’s part of the reason I want to move to a club, because I think there’s a lot for me to learn in that area. I probably won’t learn too much on a technical level — I’d be better off elsewhere — but working at a club improves communication and lets you know how people are thinking about this, what their concerns are.

We sometimes do evaluations that involve practitioners. When we did this work on mental pressure, for instance, we had male and female professional and semi-professional footballers label data, and we used that to evaluate how well our models were performing at estimating how much pressure someone would experience in a game. And we actually found that the inter-labeller correspondence was about as high as the correspondence between this entire group of people and our model. That made us believe our model was doing something useful.
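[That agreement check could look something like this in Python, with hypothetical pressure ratings for the same set of game situations:]

```python
import numpy as np
from itertools import combinations

def mean_pairwise_corr(raters):
    """Average Pearson correlation between every pair of human labellers."""
    return float(np.mean([np.corrcoef(a, b)[0, 1] for a, b in combinations(raters, 2)]))

labellers = [np.array([3, 5, 2, 4, 1]),
             np.array([2, 5, 3, 4, 1]),
             np.array([3, 4, 2, 5, 2])]
model = np.array([2.8, 4.6, 2.2, 4.1, 1.3])

human_vs_human = mean_pairwise_corr(labellers)
model_vs_human = float(np.mean([np.corrcoef(model, h)[0, 1] for h in labellers]))
print(human_vs_human, model_vs_human)
# If the model agrees with the humans about as well as they agree with each
# other, it's probably capturing something real.
```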

That’s something I’ve always wanted to see with scouts. People are always like, oh, data is limited, it’s imperfect. But I’ve never seen anybody show me: How much do scouts agree with one another? How well do their scouting reports pan out over time?

Yeah, that’s a really good observation. Obviously the data isn’t always right, it has its limitations, but the scouts probably have bigger limitations. They can look at fewer players, for instance. They have all these biases. I think it would be really interesting to see how well the data performs in comparison to the scouts. You should be able to do an experiment where you compare decisions purely based on data, kind of the Midtjylland/Brentford approach, with an approach that doesn’t use any data at all. I’m pretty sure the data will have things wrong quite often, but I’m also pretty sure the scouts will have it wrong even more often.

Well, when you get to a club, I want to know how your scouts do.

The important thing is educating people, making sure they know how to use these metrics and what they can and cannot do for you. I think that’s the way to build trust. There are roughly two ways the data can be useful to them. One way is like a force multiplier — it can take away a lot of repetitive tasks, a lot of time and effort that can easily be automated. And it can also provide new insights, things they might not have noticed while watching a player.

I think the data can be very useful also to make sure the scout doesn’t fool themselves. Sometimes scouts notice something but it’s just an artifact, a coincidence, because they’ve only watched two or three games of this player. Something might have stood out in those games but it’s not something the player always does. Or something stood out but the scout didn’t notice because he had this one form to fill out and it wasn’t one of the criteria.

Yeah, I always go back and forth — I see something on the field and I think, okay, is this a trend in the data? Or I see something in the data and I look for what’s happening in the video that’s causing these numbers.

Right, for me it’s not two worlds next to each other. It should be intertwined.

When you look at the market, are there certain kinds of players you think are under- or overvalued based on what you’ve been seeing in the data?

I’m afraid I’m not the right person to ask, and it’s not because I don’t want to answer the question, it’s because I don’t really know. I’m a very technical guy and I’m really passionate about building metrics, but I’m not really the person who extensively uses those metrics himself.

Do you think that will change if you go work for a club, or do you think there’s a good way to structure a sporting department where you can focus on your models and somebody else can figure out how to use them?

That’s something I’ve been wondering about as well, to be honest. I think it’ll be good to be slightly more involved in actually using the metrics, because you get to know the limitations of your model better.

Speaking of limitations, earlier you were talking about what data can and can’t do. Are there things you feel like clubs think data can do for them that actually you would tell them, no, that’s not what this is for?

There was a period when people at clubs who first heard about expected goals thought it could answer all their questions about football. Some clubs were asking about expected goals numbers for a particular player or match, and I’d tell them, yeah, we can provide it to you, but why do you need it? They would give some answer and it’s like, we have something that’s way better at answering that particular question than expected goals.

That happened quite a lot, actually, until maybe a year or two ago, that people who had heard about expected goals on television or read about it in a newspaper would be like, “Yeah, I want this ‘expected goals’ — it will help answer all our questions.” But obviously it answers only a few of the questions you may have as a soccer club.

What are the most interesting questions you think data could help us answer but we’re not really there yet? Metrics we could theoretically develop, or ones that are out there but aren’t being used.

A few of the things you mentioned earlier. Decisionmaking, I think that’s an underexplored area. Also the defensive part, especially as broadcast tracking data becomes more widely available. What’s often holding people back is a lack of skills. That might sound a bit rude, but I think many people have excellent ideas but are simply unable to execute them properly. Once that changes, we might see more interesting metrics being developed.

In the beginning, people said, “It’s hard to get access to data. I have these ideas but I can’t work on them.” These days there’s more than enough data to play around with, but I don’t see that many truly innovative ideas popping up in the public scene. I would find it way more interesting if people tried to solve new problems.

It seems to me that the blessing and curse right now for the analytics community is that clubs are hiring all the most talented people and then they’re refusing to let these people talk to each other publicly, so the conversations that would move the field forward happen, if at all, behind closed doors.

The people who do interesting work usually disappear after a while, they get to work for a club, and then they’re no longer actively involved in the discussion. New people enter the stage and they’re inexperienced, which is normal, and they do stuff that people have already done. It’s nice for them to learn, and I would encourage them to do that, but in the end it doesn’t really move the field forward. People are just doing the same tasks over and over again.

I don’t know if the need to seek a competitive edge will always lead clubs to withdraw their analysts from the public sphere, or if it’s possible to be more open about it. Some leading clubs do publish analytics work.

I really like that strategy, and if I get the chance I hope to be able to do something similar at a club. Because I think there’s a lot of value to the club in doing that.

I’m happy to see an increasing number of people discovering the potential value of data, but we’re still just scratching the surface. There’s a lot of possibility with the richer data available nowadays — I think the StatsBomb 360 data and SkillCorner will open a lot of new avenues for research. We’re in a good spot to take the next step. ❧

Image: Paul Verhoeven, Starship Troopers
