Engineering Management, Leadership and Tech Stuff

Do We Need Unit Tests?

2019-10-15T00:00:00+00:00

It happened to me several times in my career that senior engineers told me: unit tests are not useful. These are the common arguments:

You must hit a real database/3rd-party/dependency to “really” test stuff.
Unit tests pass, but then the final result is buggy.
High coverage means nothing.

I’m sure you’ve seen various images making the same point.

Of course, that’s fair: unit tests don’t test a system as a whole. And the farther our testing environment is from production, the more likely we are to face unpredictable outcomes. We would love to run our tests on production! Involve real dependencies, real data, real everything. Just imagine this QA heaven: high-level automated testing on production that hits every possible dependency and tests every possible click of a user. That’s what we need, not some detailed low-level testing. Right?

Describing perfect apples doesn’t deny the need for oranges. It’s true that testing a system as a whole, in an environment close to production, is very useful. But why do we compare that to unit testing?

Software without unit tests

Let’s imagine a system that processes numbers, applying a set of simple operations: adding, subtracting, and multiplying. For simplicity, let’s describe our system as a function:

S(a, b) => mul(sum(a, b), sub(a, b))

where:

sum(a, b) => a + b
sub(a, b) => a - b
mul(a, b) => a * b

Any real application can be described as a super complex function, with a multitude of simple operations and with complex inputs and outputs.

High-level tests for our system may look as follows:

S(0, 0) == 0  // (0 + 0) * (0 - 0) = 0
S(0, 1) == -1 // (0 + 1) * (0 - 1) = -1
S(1, 1) == 0  // (1 + 1) * (1 - 1) = 0

The system consists of 3 units: sum, sub, and mul. In any high-level testing, we usually narrow expectations to a simple result: we check if a record was created or a message was displayed, etc. That’s because we can’t think of all the possible outcomes of a complex operation. Even in the artificial example above, we have a simple result: a number.

Now let’s imagine somebody introduces a bug by changing sum(a, b) from a + b to a + 1. Our system tests would appear greener than summer grass, because the change didn’t affect them, and we would happily ship the update to production:

S(0, 0) == 0  // (0 + 1) * (0 - 0) = 0
S(0, 1) == -1 // (0 + 1) * (0 - 1) = -1
S(1, 1) == 0  // (1 + 1) * (1 - 1) = 0
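The scenario above can be sketched in Python. This is a minimal illustration, not part of the original example: the function `add` stands in for `sum` (to avoid shadowing Python’s built-in), `system` stands in for `S`, and the unit test inputs are an assumption.

```python
# Hypothetical Python rendering of the example system.
# `add` stands in for the text's `sum`; `system` stands in for `S`.

def add(a, b):
    return a + 1  # the bug: should be `a + b`


def sub(a, b):
    return a - b


def mul(a, b):
    return a * b


def system(a, b):
    """S(a, b) from the text: mul(sum(a, b), sub(a, b))."""
    return mul(add(a, b), sub(a, b))


# The three high-level tests from the text: (inputs, expected result).
SYSTEM_CASES = [((0, 0), 0), ((0, 1), -1), ((1, 1), 0)]
system_tests_pass = all(
    system(*args) == expected for args, expected in SYSTEM_CASES
)

# A single unit test aimed at `add` alone (inputs chosen for illustration).
unit_test_passes = add(2, 3) == 5

print("system tests pass:", system_tests_pass)  # True: the bug is invisible
print("unit test passes:", unit_test_passes)    # False: the bug is caught
```

The system tests stay green only because the chosen inputs happen to hide the bug; the unit test looks at `add` directly, so it has nowhere to hide.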

In our example, we had 3 units and 3 system tests. That’s a 1:1 ratio. Real systems have way more units, and the ratio of system tests to units can easily be 1:1000 or even higher.

The more units we have, the higher the chance of missing a bug in one of those units. Add to this the complexity of modern software: we use languages that support NULLs and dynamic typing, and we use tons of 3rd-party dependencies. Even a simple a + b operation in a dynamic language, such as JavaScript, can produce lots of different results, depending on the values and types of a and b. Good luck catching all those nuances in high-level testing!
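The text names JavaScript, but the same effect is easy to see in any dynamically typed language. Here is a small sketch in Python; the helper names `add` and `describe` are made up for illustration:

```python
def add(a, b):
    return a + b


def describe(a, b):
    """Return the result of a + b, or the name of the exception it raises."""
    try:
        return add(a, b)
    except TypeError as error:
        return type(error).__name__


outcomes = {
    "int + int": describe(1, 2),        # 3
    "str + str": describe("1", "2"),    # "12" (concatenation, not addition)
    "list + list": describe([1], [2]),  # [1, 2]
    "bool + int": describe(True, 1),    # 2 (bool is a subclass of int)
    "int + str": describe(1, "2"),      # "TypeError"
    "None + int": describe(None, 1),    # "TypeError"
}

for label, result in outcomes.items():
    print(f"{label}: {result!r}")
```

One line of code, six different behaviors. A unit test can pin down each of them; a high-level test will rarely reach more than one or two.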

After all, when you have a bug, it’s usually in a single unit (read: method, function, line). It may cause other units to fail, but the source, in many cases, is limited to one location.

Software with unit tests only

A set of unit tests does not guarantee the system is working as a whole. When we run 10K unit tests, we shouldn’t think that we’re testing our system. We’re merely running 10K independent tests. We may have a set of tests that involve the same unit, and when they all pass, we may assume the unit is working as expected. That would be a valid assumption. But all unit tests combined don’t test the system as a whole. Just like a high-level test that validates a feature doesn’t test each individual unit.

It is, however, much easier to grow a layer of high-level tests on top of the existing layer of unit tests. Adding unit tests to an existing system is often much harder.

Additional benefits of unit tests

Until now, we talked about the granularity of testing. Unit tests are supposed to be on a very low level as opposed to other kinds of tests. At the same time, there are at least two additional reasons to invest in unit tests: code decoupling (reducing complexity) and documentation.

You won’t be able to create proper unit tests if your code is highly coupled to its dependencies. That’s especially painful when your code depends on an external service: a web API, a database, etc. Some developers think it’s cool to interact with a real database in tests. Yes, it is cool. In high-level testing. When good practices are followed, and unit tests are created for every unit, there’ll be hundreds if not thousands of tests over time. They need to run fast. Otherwise, developers will tend not to run the whole suite as often as they should, which will delay bug detection. Eventually, the entire deployment pipeline will be longer than desired. Low coupling and mocking are great tools for designing unit tests. The tests should be fast so that developers can run the complete suite before pushing their changes.
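As a rough sketch of what low coupling plus mocking looks like, here is a unit that receives its data source as a parameter, tested with Python’s standard `unittest.mock`. The function `greeting_for` and the repository’s `find` method are hypothetical names invented for this example:

```python
from unittest.mock import Mock


def greeting_for(user_id, user_repository):
    """Unit under test. It only needs *something* with a find() method,
    so a test never has to touch a real database."""
    user = user_repository.find(user_id)
    if user is None:
        return "Hello, guest!"
    return f"Hello, {user['name']}!"


def test_known_user():
    repo = Mock()
    repo.find.return_value = {"name": "Ada"}
    assert greeting_for(42, repo) == "Hello, Ada!"
    repo.find.assert_called_once_with(42)


def test_unknown_user():
    repo = Mock()
    repo.find.return_value = None
    assert greeting_for(7, repo) == "Hello, guest!"


# Run the tests directly; a test runner such as pytest would discover them too.
test_known_user()
test_unknown_user()
print("all unit tests passed")
```

Both tests run in memory with no I/O, which is what keeps a suite of thousands of them fast. The real database still deserves a test, just at a higher level.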

Apart from coupling, there’s also complexity. If your unit does not follow the single responsibility principle, it will be harder to account for all possible scenarios and do thorough testing.

TDD (test-driven development) is an excellent way to keep code simple and loosely coupled. When you practice it, you also get great test coverage. Without TDD, it may be challenging to create high-quality unit tests.

Finally, when you have proper unit tests, they serve as technical documentation of your code. You can go through the tests to learn the code’s behavior. It may be harder to understand all possible outcomes from the code itself, even if it’s super clean. Tests serve as examples, and when we learn something, we need examples to understand faster.

How to Evolve Legacy Software Systems

2019-10-06T00:00:00+00:00

There’s probably no IT company without a legacy software product. To some extent all software systems have legacy code. Whenever a system becomes flooded with it, there’s a solution many developers start proposing: rewrite!

I’m a big fan of evolution. I’m also a big fan of making software commercially successful. When software engineers start suggesting expensive solutions, it’s probably a sign of miscommunication between the product team, the engineering team, and high-level management.

Business aims to earn money. The engineering practices it uses should focus on reducing the cost of change and increasing commercial value. That may well include a rewrite, but in my experience there are often far less radical measures to apply before going to extremes. Let’s go through them.

Acknowledgement

A system may not seem like legacy at all. It may be the core product of a business, one that brings in big $$$ and, thanks to designers, looks nice and shiny. Non-technical people will have no clue about any internal technical issues until they start surfacing and causing financial losses.

It’s the responsibility of engineers to identify and signal technical issues. Engineering management should create a crystal clear communication pipeline between engineers and non-engineers. Everyone from a PM to the CEO should have clarity on possible technical problems, the reasons why they exist, and potential threats. If engineers signal serious technical debt, but the CEO thinks it’s all going great, someone is breaking the communication and silencing problems. While the CEO may not be directly responsible for or knowledgeable about technical issues, they should be aware of business risks.

The silencing of technical issues may be a byproduct of a poor understanding of technical debt. Therefore, we should make sure engineers know how to detect, classify, and fix technical debt. Then we should make sure non-engineers have a high-level understanding of what it is.

How can we translate technical issues into business issues? Here are a few examples:

Churn: low software quality may lead to an increased number of bugs, which will create a lot of hassle for the support team and eventually may cause customers to leave the product.
Low team morale: when the software is hard to understand and change, the team isn’t motivated to work on features.
Bad product reputation: unhappy customers may give a bad rating to the product, which will make it difficult to sell even when issues are fixed.
Slow time to market: feature development may be super slow because of the high code coupling, absence of tests, or obsolete tech stack.
Impossible growth: new users are coming, the product is getting popular, but the architecture doesn’t scale well or is very expensive to scale.

If we go a bit more technical, we can say that technical debt affects the following product factors:

Robustness: stability of an application and the likelihood of introducing new defects when modifying it.
Performance: the responsiveness of an application.
Security: an application’s ability to prevent unauthorized intrusions.
Transferability: the ease with which a new team can understand the application and quickly become productive working on it (the onboarding period).
Changeability: an application’s ability to be easily and quickly modified.

Eventually, we can distinguish the following technical debt categories:

Code Quality Debt: the code is hard to test and change.
Architecture Debt: high coupling between components, which makes the project difficult to work on.
Infrastructure Debt: scalability issues, broken or non-existent CI/CD pipeline.
Security Debt: vulnerabilities, broken or non-existent security policies.
Performance Debt: non-optimal execution, can be a byproduct of code, architecture and/or infrastructure debt.

Now that we know how to acknowledge and classify technical debt, let’s see how to identify and prevent it.

Visibility

How do we know there’s a problem? How do we know how big the problem is? Just a feeling is not good enough to justify changes. We may have an architecture that doesn’t scale to business goals, or we may have an obsolete version of a library X. Those two may be valid issues, yet we should know more about them.

Making things visible is key. If you go through product functional and non-functional requirements, you should be able to select several vital metrics. There are numerous tools for all kinds of monitoring. The choice of them depends on your goals and your technologies. I will try to summarize the most critical metrics, which should be monitored on any software project:

Code quality: apply static analysis to identify code issues.
Test coverage: it can be a good indicator of code quality issues. Adding tests shouldn’t be hard, and reaching 90%+ coverage is super easy when the code has low coupling.
Test suite run time: forget about CI/CD if your tests run for hours.
CD lead time: how fast can you deploy something to production?
Team velocity: whatever methodology you use, measure speed. If it goes down, you can investigate why.
The number of bugs: along with the velocity metric, this can be a good indicator of technical challenges.
Vulnerabilities: make sure you get fresh updates on vulnerabilities of the components you rely on. Try to apply automatic security checks.
New versions: learn about new versions of your dependencies as soon as possible.
Performance: it’s an extensive topic, but you should measure whatever matters to you. For an API, it can be response time, memory, CPU, throughput, various DB metrics.
Errors: make sure you record errors from all components and that they are easy to search.

Those are the essential metrics. They should be easily trackable and visible. If you can’t track one of those, it’s already a sign of technical debt.

Improving knowledge and skills

Why do we have technical debt? It’s, of course, the fault of those product people who come up with crazy requirements, don’t wanna hear about tests and refactoring, and always push for deadlines. Or is it? While non-engineers may have difficulty understanding the concept of technical debt, it’s the engineers’ responsibility to apply their best knowledge to foresee and prevent technical issues.

We don’t know what our project will look like in X years. But we do know that most software projects grow feature by feature and that requirements change.

People have been building software for decades now. We’ve collected loads of best practices and solutions to most known problems. At the same time, now and then, someone creates new software and invents a “new” way of organizing dependencies, a “new” way of working with a DB. A “new” wheel is invented every day. Why do engineers spend so much effort on re-inventing things? Well, mostly because they don’t know any better! The famous Design Patterns book was published in 1994. Twenty-five years later, we live in a world where many companies don’t even have a single design pattern question in their interview process. Numerous engineers who use dynamic languages either don’t know or don’t care about design patterns. To them, it’s a Java thing.

When we create software, we face a multitude of challenges, and most of them are shared between software projects. No matter what kind of software we build. No matter what type of framework or language we use. The same problems arise now and then. Examples? Ok, how can we create business logic independent of a specific data store? Or let’s say we have a selection of algorithms, solving the same problem, but each time we need to pick an algorithm depending on current conditions. I can invent something to solve those problems. But why should I spend any time on re-inventing what was already invented long ago? To this day, we already know:

How to write clean code.
How to design scalable systems.
How to test software.
… and many more.

Cultivating and improving this knowledge in engineering teams will have a direct impact on the technical debt level. Most problems were already solved. Reusing proven solutions will decrease the chance of error.
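The two examples mentioned earlier in this section (business logic that is independent of a specific data store, and picking an algorithm depending on current conditions) both have long-established answers, commonly known as the Repository and Strategy patterns. Below is a minimal Python sketch of both. Every name in it (`OrderStore`, `InMemoryOrderStore`, `pick_discount`, `place_order`) and the discount rule itself are invented for illustration:

```python
from typing import Callable, Dict, List, Optional, Protocol


class OrderStore(Protocol):
    """Repository: the business logic only knows this interface."""

    def save(self, order_id: str, total_cents: int) -> None: ...

    def total_for(self, order_id: str) -> Optional[int]: ...


class InMemoryOrderStore:
    """One possible store. A SQL- or API-backed one would expose the same methods."""

    def __init__(self) -> None:
        self._totals: Dict[str, int] = {}

    def save(self, order_id: str, total_cents: int) -> None:
        self._totals[order_id] = total_cents

    def total_for(self, order_id: str) -> Optional[int]:
        return self._totals.get(order_id)


# Strategy: interchangeable algorithms with the same signature.
def no_discount(total_cents: int) -> int:
    return total_cents


def bulk_discount(total_cents: int) -> int:
    return total_cents * 90 // 100  # 10% off, integer cents


def pick_discount(item_count: int) -> Callable[[int], int]:
    """Choose an algorithm based on current conditions (an assumed rule)."""
    return bulk_discount if item_count >= 10 else no_discount


def place_order(store: OrderStore, order_id: str, prices_cents: List[int]) -> int:
    """Business logic: unaware of which store or which discount it uses."""
    discount = pick_discount(len(prices_cents))
    total = discount(sum(prices_cents))
    store.save(order_id, total)
    return total


store = InMemoryOrderStore()
print(place_order(store, "small", [1000, 500]))  # 1500
print(place_order(store, "bulk", [100] * 10))    # 900
```

Neither idea is new, which is exactly the point: `place_order` can be tested with the in-memory store and shipped with a real one, without inventing anything.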

Change of habits

Habits go along with knowledge. Let’s say you can name all the SOLID principles, but in your daily coding, the dependency inversion doesn’t exist, and code components have multiple responsibilities. That means you don’t have a habit of applying your knowledge. It should be impossible for you to create bad code. When you have a habit of doing TDD, you will not feel that you have the option of either creating tests or not. It’s part of your routine now.

A good theory should come down to good practice and fixate as a good habit.

Tests

In his book “Working Effectively with Legacy Code,” Michael Feathers defines legacy code as code without tests. No matter how tidy and clean the code is, no tests mean it’s legacy. I tend to agree with him. Tests are also an excellent first step toward improving a legacy software system.

Unit tests and TDD

TDD is a great approach to creating unit tests. It’s not even testing; it’s just a method of building software. You develop two co-dependent parts (code and test) in such a way that a change in one part breaks the other. The two parts validate each other all the time. It’s similar to eliminating a single point of failure.

But even if TDD isn’t your choice, unit tests are great at pointing to high coupling and high complexity in your code. In a legacy system, you can start applying TDD to any new code. The old code you can refactor by creating tests for it. In most cases, the old code will be very resistant to change, and tests will expose the inflexible parts.

I highly recommend starting any refactoring by writing tests. Without tests, refactoring is a very high-risk change.
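One way to do this on old code is to first record what the code currently does, quirks included, and only then change it. Here is a minimal sketch; the legacy function, its refactored version, and the sample inputs are all hypothetical:

```python
def legacy_price_label(amount, currency):
    # Imagine this is old, tangled code nobody dares to touch.
    if currency == "USD":
        s = "$" + str(round(amount, 2))
    else:
        if currency == "EUR":
            s = str(round(amount, 2)) + " EUR"
        else:
            s = str(amount)
    return s


# Step 1: pin down the current behavior before changing anything.
CASES = [(5, "USD"), (5.129, "USD"), (2.5, "EUR"), (3, "GBP")]
pinned = {case: legacy_price_label(*case) for case in CASES}


# Step 2: refactor.
def price_label(amount, currency):
    if currency == "USD":
        return f"${round(amount, 2)}"
    if currency == "EUR":
        return f"{round(amount, 2)} EUR"
    return str(amount)


# Step 3: the refactored code must reproduce every pinned result.
for case, expected in pinned.items():
    assert price_label(*case) == expected, case

print(pinned)
```

The pinned results are not a statement of what the code should do, only of what it does today. That is enough to refactor safely; fixing the behavior itself is a separate, deliberate change.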

Acceptance and integration tests

If you have a complex legacy system, consider creating integration/system tests (between components) as well as high-level acceptance tests to validate all functional requirements of the system. Such tests will guard you against unwanted regressions. Some engineers even think that acceptance tests are the only ones needed because, in unit testing, you don’t involve all dependencies together. Acceptance tests have great value without a doubt, but they don’t help you pay down technical debt at the code level.

The evolution of any legacy software system should start with investment in tests. Here’s the basic approach I would use:

Cover functional requirements with high-level tests.
Start applying TDD to any new code.
Refactor the old code by creating tests for it.

The evolutionary approach

A rewrite of the whole system is a very bold and expensive move. How do you know that the rewrite will be better? Especially if the same developers are involved.

In my experience, gradual evolution works pretty well. Slowly, over time, piece by piece, you can rebuild the whole system. I’ve seen it happen in several projects. Evolving a legacy software system implies growing development skills, techniques, and habits. You don’t get those from a rewrite.

Define the new rules and apply them to every new change. That’s the primary tactic. Be firm and brave. Don’t let the old habits come back. Whenever possible, apply the Boy Scout Rule: leave your code better than you found it.

A legacy software system is a great pain, but it can also be a great learning experience. I hope you can make the best of it!

3 Pillars of a Successful Remote Team

2019-09-27T00:00:00+00:00

Where are you the most productive: at home or at your office desk? What’s your most productive time of the day? Ask different people those questions, and you’ll hear different answers. The fact is: some people work more efficiently outside of their offices. On top of that, various business factors may lead companies to hire remote workers and eventually have remote teams. In this article, I don’t want to discuss remote vs. on-site. Both setups have their pros and cons. Instead, I want to focus on 3 essential pillars of successful remote teams, which I derived from my experience. As it happened, only my first company was entirely on-site. All other companies in my career had remote teams or workers.

Transparency

Members of a remote team are limited in communication. Slacking or emailing isn’t as convenient as just chatting in the office. Remote work makes it easier to focus on a task but limits knowledge exchange. It also raises valid performance concerns: we don’t know how much time a remote worker invests. Do we care about that?

Each business tries to do a simple thing: earn more than it spends. Paying employees is an expense. Therefore, we want to make sure that whatever those employees do brings in way more money than we spend. For the sake of simplicity, let’s break down the whole business process into two phases: planning and execution. We’ll have the best results when a perfect plan is followed by efficient implementation. Let’s assume we have the best plan ever. How can we be efficient?

Now we’ve asked the right question. We don’t really care how much time team members spend. We care about their efficiency. And the key to that is transparency. At any point in time, each team member should have answers to these questions:

What should I do now?
How should I do it?
What happens if I do it?
What happens if I don’t do it?

When the answers aren’t known, there should be a quick way to find them. That’s transparency. On-site teams need it as well, but remote teams depend on transparency even more, because there is less trust and less communication by default. A high level of transparency enables fast knowledge exchange and guides people’s decisions. Consider these examples:

Peter knows what to do and knows how to do it. But nobody told him what happens if he spends 10 days on the feature instead of 5. This creates a temptation for Peter to be less efficient and devote time to something else.
Cora has a clear deadline. She knows what has to be done and what happens if the deadline isn’t met. But she has never worked in that part of the product, so she doesn’t completely understand how to do certain things. Cora risks missing the deadline.

It’s normal when team members don’t know something, but they should have easy access to information. All aspects of their job should be transparent to them.

Trust by Default

As mentioned above, remote teams have less communication and less trust by default. These two aspects depend on each other. And of course, how can you trust someone who occasionally pops up in video calls and writes critical pull request feedback on GitHub? Trust isn’t something you can get overnight. You have to build it. Constant communication (1on1s, stand-ups, team events) and mutual success are the ways to build trust.

In the first months of collaboration, trust doesn’t exist at a reasonable level. When someone is lagging, you think they’re not putting in enough effort. When someone is too critical, you think they’re just a jerk. You don’t trust, so you assume the worst. How can this be solved? Well, make sure everything is transparent and expectations are set clearly. Start building trust via communication and, in the meantime, trust by default. Time will set things straight. If someone is really not suitable for your team, everyone will notice that. But if you don’t trust by default, you risk focusing more on management, and even micro-management, than on the actual work of building the product.

Team Building

When you’re working remotely, you don’t have a strong sense of being on a team. No one is around, and dedicated Slack channels don’t help to create that feeling.

Why should you care? Well, if you want to benefit from the effects of collaboration and joint thinking, you should have a team. Without a team, you’ll just have a bunch of individual contributors. With such a setup, people tend to focus on themselves, their contributions, and their goals. They will not care much about the bigger picture. Most companies want employees to care about their products and not just code commits. It’s also helpful when contributors know the impact of their contributions. Do they see changes to people’s lives or only code functions? Do they know what happens if they underperform? What will happen if they don’t document their code or write it messily? Thinking beyond the individual level has a significant impact on all aspects of contribution. Therefore, the team effect matters. And how can we create it?

Remote workers need to be reminded that the team exists. One way to do it is by having daily stand-up meetings. Another is regular team events.

At one of my jobs, I spent years talking to my colleagues over Skype, and I never met them in person. We didn’t have a team-building policy at the time; we just worked, all remote, more like a group of private contractors than colleagues. In another company, when I had trust issues with one of my colleagues, meeting him in person (and drinking beer ;) broke the ice. It gave our relationship a massive boost.

Creating a strong team culture requires physical meetings once in a while. It’s an investment in the future productivity of the team and the success of the company. Teams without strong connections will quickly fall apart under stress. Meet your colleagues in person and have fun with them. Maybe that guy who argues with you on Slack will not seem that bad when you shake his hand ;)

Daily Stand-Up Meetings in Remote Teams

2019-09-21T00:00:00+00:00

A daily stand-up meeting (also known as a scrum meeting) is a short Agile ritual where each team member shares what they worked on, what they plan to work on, and what is blocking them. It’s recommended to keep such meetings short and move any in-depth technical discussion to dedicated sessions.

Some engineers and managers don’t feel the need for daily stand-up meetings. They can either observe the process in their project management tools or have frequent discussions with team members. Managers may even see daily stand-ups as a way to control their reports. When team members feel that way too, the trust gap between them and the manager only widens.

I agree that a cool project management tool and a fine-tuned agile process can give very high observability of the development process. I also think that communication between team members gives even more clarity. Still, I don’t see the daily stand-up as a useless meeting. And when it comes to remote teams, daily stand-ups play an even more significant role. Let me tell you why.

On-Site Team

When all team members work together and even sit close to each other, having daily stand-up meetings becomes very easy. They can literally just stand up from their desks and exchange statuses. For on-site teams, daily stand-up meetings are no more than a casual chat. I encourage holding them close to the team’s workspace. Going to a meeting room creates this weird meeting sensation, which some people want to avoid. If the team is small and communication is reasonably frequent, they may skip daily stand-ups. In larger teams, there are several areas of focus. Therefore, team members may not be aware of others’ work. A daily stand-up meeting reminds each team member that they’re part of a team and that there are high-level goals the team is trying to achieve. When the team becomes too large or too distributed, the feeling of being part of a team gets lost.

On-Site Team with Remote Members

When most team members work in the same office but some are remote, the team gets split, and people feel left out of things. I was a big fan of text status updates on Slack. At one of my jobs, we even used a specialized tool for text status updates. As long as it’s about status, text works pretty well. One day, our remote colleague shared his concerns in our 1on1 meeting. He was the only remote team member at the time, and he told me that he felt a bit separated from the others. He worked in the team without feeling the team. Those of us who sat together didn’t feel that way, of course, and we were blind to the problem. Gradually, we moved from text updates to stand-ups twice a week and then to daily stand-ups. I got very positive feedback from my remote colleague: since then, his working day started with seeing other team members, exchanging little jokes, and learning about others’ progress. He felt he was part of the team.

Fully Remote Teams

I’ve had my fair share of experience with remote teams. This kind of organization is a good fit for some people, especially those who don’t like open-space offices. When organized well, remote teams can be very agile. In my opinion, a daily stand-up meeting is a must for completely remote teams. Not because of trust. If you don’t trust remote workers, you shouldn’t solve it with more control, but with transparency and clear expectations. I’ll talk about it in a different article. Daily stand-ups in remote teams play a key unifying role. People who work remotely don’t feel that they have a team. They feel like individual contributors. This feeling may not have a visible effect on performance. But as individual contributors, people don’t take others into account, and their communication style and decisions will steer toward personal goals rather than team or organization goals. If you want to share a common goal or a product vision among all team members, you should invest in team-building activities. A daily stand-up meeting is a powerful and simple team-building activity for remote teams. Start doing daily stand-ups, and you’ll see great results shortly. Here are a few tips to make this meeting useful:

Do it in the morning. You’re starting a new working day with your team members, encouraging each other.
Spend a couple of minutes on chit-chat and jokes. We’re all human, and we should make each other smile!
Keep the meeting very short and avoid in-depth technical discussions and brainstorming. Move those to separate sessions.
Keep each status very simple—no need to share every 5 minutes of your day with others.
Listen and clarify! A daily stand-up meeting can surface many issues and help you react quickly.

Overall I recommend discussing daily stand-up meetings with team members during 1on1s and retrospective sessions, especially if you just started doing them. It should be a pleasant meeting, and you should know if someone doesn’t feel that way.