You can measure RAM and CPU usage or per-query database latency to find bottlenecks in your code, but how do you identify your team's performance issues?
Five years ago I started tracking metrics for my team's performance at Filestage. Measuring development productivity has always been controversial. Is it really better to write more lines of code? We've all heard stories of massive performance improvements from changing a single line of code. And by the way, fast isn't enough if your app is full of bugs or if customers don't understand how to use it... are we actually delivering any value?
Before we lose our minds, we have to understand that metrics will never be 100% accurate representations of our objectives; they can always be gamed and should be used only as tools to quickly check the state of things. If you really want to draw conclusions, you have to dig deeper and actually talk to your team. This doesn't mean metrics are useless, but they must not cloud our judgment, and our focus should always be on our objectives.
While we're at it, let's also clear another thing up: not every metric has to be quantitative. Interviews and surveys will give you valuable insights; if your team is small, they are often the most efficient approach.
So what are the objectives of a software development team? Delivering on time and maintaining high quality. If the software is delivered on time and has no bugs, no one will complain about engineering.
So what is out there? When I started tracking, everyone was talking about DORA metrics and DevOps; now it's about DevEx and SPACE, or is it the Core 4? It's easy to get lost among so many frameworks, and if it's your first time approaching them they can seem too abstract. I'll share what has worked for me.
I'm a firm believer that the best work is done when people are happy and satisfied. Yes, you can squeeze performance with pressure in the short term, but eventually people will burn out and leave. Always play the long game. I'm proud that my team consistently scores over 9/10 in satisfaction, even after we had to lay off 20% of the team to avoid bankruptcy at one point.
I insist on putting this metric first because it's the foundation of a high-performing team. If you have very low satisfaction or participation, start digging deeper. Talk to your team; if you genuinely care about them you will gain their trust, and eventually they will share their biggest problems. Is your team drowning in technical debt and firefighting? Has your product team committed to unrealistic expectations, putting too much pressure on everyone? Are team members ever recognized for their work?
In our case, the People department sends surveys every two weeks, and we receive reports per department using Officevibe. Every month I copy the satisfaction and participation values into a tracking Google Sheet.
This is the best measure of speed I've found so far. In our setup, every PR that is merged into master is released to production. Each PR must be approved by at least one other team member, and reviewers are assigned automatically. All automated CI checks (tests, linting, etc.) must pass before a PR can be merged. Therefore every PR represents a production-ready, completed technical task.
Yes, like every other metric you could game this one by making absurdly small PRs, but what would the rest of the team think? Remember, PRs have to be reviewed. If anything, this metric encourages small PRs, and if you ask me that's a good thing.
Industry benchmarks suggest merging about 3 PRs per engineer each week (elite teams reach up to 5). If you're below that, talk to your engineers and understand their biggest pain points. Are the tests taking too long, or are they too flaky? Is it simply taking a long time for PRs to be reviewed? After fixing the problems you can track your progress and see whether your improvements paid off. That's what it's all about: having a way to measure progress.
In practice, this metric is easy to obtain: I query the GitHub API with a simple script that counts all PRs merged in the last 30 days and writes the number to a Google Sheet.
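For reference, here is a minimal sketch of that kind of script, assuming a personal access token and the GitHub search API; the repository name, token, and the Google Sheets step are placeholders you would replace with your own setup.

```python
from datetime import datetime, timedelta, timezone

import requests

GITHUB_TOKEN = "ghp_..."          # personal access token (placeholder)
REPO = "your-org/your-repo"       # hypothetical repository


def merged_prs_last_30_days(repo: str, token: str) -> int:
    """Count PRs merged into the repo in the last 30 days via the search API."""
    since = (datetime.now(timezone.utc) - timedelta(days=30)).strftime("%Y-%m-%d")
    query = f"repo:{repo} is:pr is:merged merged:>={since}"
    response = requests.get(
        "https://api.github.com/search/issues",
        params={"q": query, "per_page": 1},
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    response.raise_for_status()
    # total_count is the number of matching PRs, no need to page through them
    return response.json()["total_count"]


if __name__ == "__main__":
    count = merged_prs_last_30_days(REPO, GITHUB_TOKEN)
    print(f"PRs merged in the last 30 days: {count}")
    # From here, append `count` to the tracking Google Sheet
    # (e.g. with gspread or the Sheets API).
```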
Moving fast is only beneficial if we can deliver with high quality. I like measuring the number of technical tickets; every ticket comes from a customer who actively uses our product and was bothered enough to contact support about an issue. Most frameworks recommend using change-failure rate, but for us it is very rare to need immediate remediation after a deployment, so that metric isn't meaningful. Don't blindly follow frameworks; always choose what is best for your current situation.
Our company uses Asana, so I use their API to query the technical-tickets board and measure how many new tickets have been opened in the last month. As with other metrics, the script pushes this number to the Google Sheet every week.
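A minimal sketch of what such a script could look like, assuming the technical tickets live in a dedicated Asana project (board); the project GID and token are placeholders, and the endpoint and pagination follow Asana's public REST API.

```python
from datetime import datetime, timedelta, timezone

import requests

ASANA_TOKEN = "1/123456:..."                    # personal access token (placeholder)
TECH_TICKETS_PROJECT_GID = "1200000000000000"   # hypothetical project GID


def new_tickets_last_30_days(project_gid: str, token: str) -> int:
    """Count tasks created in the project during the last 30 days."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=30)
    url = f"https://app.asana.com/api/1.0/projects/{project_gid}/tasks"
    params = {"opt_fields": "created_at", "limit": 100}
    headers = {"Authorization": f"Bearer {token}"}
    count = 0
    while True:
        response = requests.get(url, params=params, headers=headers, timeout=30)
        response.raise_for_status()
        body = response.json()
        for task in body["data"]:
            created = datetime.fromisoformat(task["created_at"].replace("Z", "+00:00"))
            if created >= cutoff:
                count += 1
        next_page = body.get("next_page")
        if not next_page:
            break
        params["offset"] = next_page["offset"]  # Asana uses offset-based pagination
    return count


if __name__ == "__main__":
    print(new_tickets_last_30_days(TECH_TICKETS_PROJECT_GID, ASANA_TOKEN))
```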
In software development, not all of our work directly impacts the customer. Many times we refactor code or update a library to maintain our codebase, but the product doesn't improve for customers. In an ideal world we would spend 100% of our time working on new features. Benchmarks are interesting here: elite teams spend 65% on new features, while average teams spend about 59%.
To gauge our team's impact, we require PR titles to follow the Conventional Commits spec and count all PRs merged in the last month whose type is feat. This isn't perfect, because working on new features isn't only about coding; it also involves talking with Product, understanding the concept, proposing a technical solution, researching, etc., whereas a bug fix is mostly coding.
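As a sketch, the same GitHub search query from the PR-counting script above can be extended to group merged PRs by their Conventional Commits type; the repository and token are again placeholders, and the ratio of feat PRs to the total is only a rough proxy for time spent on new features.

```python
from datetime import datetime, timedelta, timezone

import requests

GITHUB_TOKEN = "ghp_..."      # placeholder token
REPO = "your-org/your-repo"   # hypothetical repository


def merged_prs_by_type(repo: str, token: str, days: int = 30) -> dict[str, int]:
    """Group PRs merged in the last `days` days by Conventional Commits type."""
    since = (datetime.now(timezone.utc) - timedelta(days=days)).strftime("%Y-%m-%d")
    query = f"repo:{repo} is:pr is:merged merged:>={since}"
    headers = {"Authorization": f"Bearer {token}"}
    counts: dict[str, int] = {}
    page = 1
    while True:
        # Note: the search API paginates up to 1000 results, which is plenty
        # for a month of PRs on most teams.
        response = requests.get(
            "https://api.github.com/search/issues",
            params={"q": query, "per_page": 100, "page": page},
            headers=headers,
            timeout=30,
        )
        response.raise_for_status()
        items = response.json()["items"]
        if not items:
            break
        for pr in items:
            # "feat(editor): add comments" -> "feat"
            prefix = (
                pr["title"].split(":", 1)[0].split("(", 1)[0].rstrip("!").strip().lower()
            )
            counts[prefix] = counts.get(prefix, 0) + 1
        page += 1
    return counts


if __name__ == "__main__":
    counts = merged_prs_by_type(REPO, GITHUB_TOKEN)
    total = sum(counts.values())
    feat = counts.get("feat", 0)
    print(f"feat PRs: {feat}/{total} ({feat / max(total, 1):.0%})")
```

The resulting share of feat PRs is what I compare against the benchmarks mentioned above, keeping in mind that it undercounts the non-coding work that goes into features.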
Comparing your team with others out there will help you identify the biggest levers for improving your team's performance. Most importantly, it will let you know when you're reaching a point of diminishing returns and when it's time to switch focus.
Use metrics to communicate the state of the team to the rest of the company; it's easier to tell a story if you can measure and showcase progress.
Start simple: you don't need an automated way to gather the data. You can create a quick Google Form or simply ask your team to estimate how many PRs they merge each week, how much time they spend on new features compared with fixing bugs, or how they would grade the development experience.