What I Learned from Being the Worst Developer at My Company

A little more than a year into my role as a senior software engineer at a Seattle hardware startup, my manager told me in my weekly 1-1 that I was consistently missing my feature deadlines. He said the entire leadership team, from the CEO down, viewed me as unreliable, and that he was no longer able to defend my results, particularly since everyone else on his team reliably met their deadlines. He wanted to know what I was going to do to become consistent like everyone else. While my code was well written and often met all the requirements, in his eyes I just didn’t seem to understand that delivering quickly and iterating was much more important to the success of the company than trying to be right and complete the first time. My estimates had no buffer for uncertainty, for interruptions such as bug fixing, or for the inevitable complications that arise in the middle of implementation. What he wanted from me going forward was a consistent record of meeting my commitments every two weeks. Until then, I was essentially on notice.

While the message about missing deadlines had come up multiple times in the past six months, I was still shocked by the realization that I was viewed as the worst developer on the team. I had been so focused on making my current feature work reliably that I’d lost track of the week-by-week slips. Being a self-aware person who’s committed to improvement, I spent a lot of time identifying my mistakes and figuring out what I could do differently. What I learned from this introspection helped me formulate a set of guidelines for working effectively in a loosely structured startup development environment. For me, these are the guardrails I need to set myself up for success.

Guidelines for Startup Developer Success

  1. Your highest priority as a developer is to figure out what you can deliver with high confidence (e.g. 90% or better) in the next iteration and nail down the Definition of Done with the key stakeholders before the clock starts ticking. To reach high confidence, you need to know empirically how many development hours you’re likely to have in the next iteration, what your dependencies are and what the technical risks are. I give specifics about how to do this later in the post, and there is a small worked example right after this list.
  2. If you are going to be making architectural and/or technology choices that require buy-in from any of your teammates or other stakeholders, it is critical to budget several days to a week to build consensus, write design documents, incorporate review feedback and update your estimates. Build this non-coding time into your estimates.
  3. As soon as you have any evidence that the scope of the deliverable is bigger than can be delivered with high confidence in the next iteration, raise your hand and ask for help. Specifically, you want to negotiate with your stakeholders and teammates on reducing scope, adding resources and/or splitting the deliverable across multiple iterations.
  4. Once the iteration is underway, if you find that the effort is ballooning past your estimates and/or your available development hours are shrinking below what you can reasonably make up, raise your hand and ask for help. Again, you are looking to negotiate on scope, load balancing and deadline.
  5. Keep a record of all tech debt you create or find during implementation, particularly the former, because you will have to cut corners at times to make your deadlines. You may want to log bugs in your defect tracking system as you go. The point of the tech debt record is to make visible the additional work that will be required to support future features and/or improve product quality.
  6. If you have signed up for a deliverable that requires significant exploration/tuning (e.g. a new DSP pipeline), difficult algorithms (e.g. porting a Matlab prototype to native code) or critical components that have to be “right” from the start (e.g. a bespoke communication stack), embrace the fact that multiple iterations will likely be required to deliver these things. Make sure your estimates reflect this reality and that you have the buy-in needed to take multiple iterations to deliver.
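
To make guideline #1 concrete, here’s a minimal sketch, with made-up numbers, of the sanity check I have in mind: compare a buffered estimate against the coding hours you’ve actually measured in a sprint. The constants are assumptions you should replace with your own time log data (covered later in this post).

```python
# A minimal illustration, with made-up numbers, of checking whether a
# deliverable fits in the next iteration with high confidence. Replace the
# constants with values measured from your own time log.
SPRINT_CODING_HOURS = 45.0    # measured: actual coding hours in a two-week sprint
INTERRUPTION_BUFFER = 0.20    # assumed: bug fixes, unplanned discussions
RISK_BUFFER = 0.30            # assumed: technical unknowns, design/review time

def fits_next_iteration(estimate_hours: float) -> bool:
    """Return True if the buffered estimate fits the measured sprint capacity."""
    buffered = estimate_hours * (1 + INTERRUPTION_BUFFER + RISK_BUFFER)
    return buffered <= SPRINT_CODING_HOURS

print(fits_next_iteration(30))  # True: 45 buffered hours just fits
print(fits_next_iteration(35))  # False: 52.5 buffered hours, so negotiate scope now
```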

I became the least reliable developer at my last company in large part because I did not follow these guidelines. I made other mistakes too, such as reacting to my manager’s style of supervision by spending extra, unbudgeted time testing and refining my code. However, failing to impose this additional structure and margin of safety on my work and communication is what hurt me the most.

What’s the Deliverable?

In an ideal world, the deliverable you are signing up for in the next iteration is a user story, but unless you are methodical about writing the user stories and the tasks that support them, you might not be on the same page with the rest of your organization about what your deliverable is. If the organization wants a demo of the deliverable at the end of the iteration, and the story and tasks you’ve identified don’t add up to that demo, you can have a conversation about how to resolve the differences before jumping into implementation.

Writing user stories takes practice, but there is a framework you can follow called INVEST (Independent, Negotiable, Valuable, Estimable, Small, Testable).

I found a good explanation of how to apply INVEST, along with a helpful diagram, in a Medium article by Philip Rogers entitled “Back to basics: Writing and splitting user stories.”

Had I followed the INVEST framework, the six-week slip that pushed my standing on the development team over the edge might have been avoided, and here’s how. First, rather than the deliverable being “WiFi credentials over Bluetooth,” it would have been several independent user stories:

  1. As the homeowner, I want to be able to set up WiFi for a device during initial setup or at any time thereafter. 
  2. As the homeowner, I need to be able to change the WiFi configuration for a device at any time.
  3. As the homeowner, I want all of my devices to get on WiFi once I’ve configured or changed one device.

When I initially scoped the deliverable, I was unaware that the WiFi module in our firmware did not support the second user story because I did not write that code. It was only able to support initial setup. Discovering this myself during implementation and modifying it cost me a week of additional time. If we had written down and agreed to story #2, the developer who wrote the WiFi module might have been able to take on the firmware tasks associated with it. At the very least, he could have helped me scope that work. When I agreed to the initial due date, it was with the understanding that story #1 was the highest priority. Had we planned the sprint with all three stories on the table, I would have found out that story #3 was the highest priority. It took an additional sprint to implement story #3 because I was in the middle of implementing #1 and didn’t have time to change course. 

I believe the rest of the six-week delay could have been avoided by having testable user stories. That would have forced us to have an upfront discussion about the acceptance criteria, specifically reliability metrics for representative numbers of devices installed in the home. For example, had we agreed that the reliability metric was 90% success for 10 installed devices, we would have been able to plan for and estimate the testing effort needed to validate that metric. I would also have been able to better estimate the additional debugging time needed to pass the acceptance criteria. Armed with all of that information, we might have decided to relax the reliability bar in order to ship the feature sooner to our dogfood users. Defining acceptance criteria should have been a required step in our Definition of Done (DoD) for user stories.
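
For illustration, here’s a hypothetical sketch of what an automated acceptance check for that metric could have looked like. The helper function is a stand-in for whatever provisioning API our product actually exposed, not a real call; the pass/fail arithmetic is the point.

```python
# Hypothetical acceptance test for "WiFi configuration succeeds for at least
# 90% of 10 installed devices". configure_wifi_over_bluetooth() is a stand-in
# for the product's real provisioning call, not an actual API.
REQUIRED_SUCCESS_RATE = 0.90
DEVICE_COUNT = 10

def configure_wifi_over_bluetooth(device_id: str, ssid: str, password: str) -> bool:
    """Stand-in: push WiFi credentials to one device and report success."""
    raise NotImplementedError("replace with the real provisioning call")

def test_wifi_provisioning_reliability():
    device_ids = [f"device-{i:02d}" for i in range(DEVICE_COUNT)]
    successes = sum(
        configure_wifi_over_bluetooth(d, "TestNetwork", "test-password")
        for d in device_ids
    )
    assert successes / DEVICE_COUNT >= REQUIRED_SUCCESS_RATE
```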

“Fragile” Versus Agile

My last company, like most of the places I’ve worked in the last five or so years, cherry-picks a few things from Agile methodology, usually standup meetings, time-boxed iterations and demos, and skips the rest. One of my former coworkers dubbed this approach Fragile because it deprives the team of the empirical velocity data needed to make better estimates, doesn’t enforce clear Definitions of Done, and skips the retrospectives that, done well, allow the team to continuously improve. In a Fragile version of Agile, I’ve learned that it’s even more important to follow the guidelines above, because the organization will still hold you accountable for your commitments and judge the doneness of your work even though neither is grounded in a clear contract.

Time Tracking for Empirical Estimates

Since we didn’t track story points and velocity across iterations at my last company, we didn’t have any empirical basis for our estimates. Whether your organization tracks velocity or not, a relatively easy way to see how many development hours you actually have in a given work week is to keep a detailed time tracking log for a week or two. In the log, record every activity, interruption and meeting as it happens, or soon after, to keep the data accurate. If you’re using a Google Sheet or Excel spreadsheet, I recommend something like this for each workday:

| Activity | Type | Start | Finish | Duration (mins) |
| --- | --- | --- | --- | --- |
| Stand-up | Meeting | 10:00AM | 10:23AM | 23 |
| Discussion | Unplanned | 10:23AM | 10:50AM | 27 |
| JIRA ticket 1028 | Bug Fix | 10:50AM | 12:00PM | 70 |
| Lunch | Break | 12:00PM | 1:00PM | 60 |
| JIRA ticket 8980 | Sprint coding | 1:00PM | 3:30PM | 150 |

This format lets you put an AutoFilter on the Type column and easily total the minutes you actually spent coding for the current sprint. It also accounts for the unplanned interruptions that weren’t on the calendar and the time spent dealing with tech debt/quality/rework in the form of bug fixes. Once you have a week or two of data, you should be able to predict the number of work weeks required to complete a given feature. If your organization has historical velocity data, the number of story points you are able to complete in a sprint should correlate with your available coding hours.
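
If you export the log to CSV, totaling the minutes doesn’t even require spreadsheet formulas. Here’s a minimal sketch that assumes the column names from the table above and a hypothetical file name:

```python
# Total logged minutes by activity type from a CSV export of the time log.
# Assumes the columns shown above: Activity, Type, Start, Finish, Duration (mins).
import csv
from collections import defaultdict

def minutes_by_type(path: str) -> dict[str, int]:
    totals: defaultdict[str, int] = defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["Type"]] += int(row["Duration (mins)"])
    return dict(totals)

if __name__ == "__main__":
    totals = minutes_by_type("timelog_week.csv")  # hypothetical file name
    for activity_type, minutes in sorted(totals.items()):
        print(f"{activity_type:15} {minutes / 60:5.1f} hours")
    print(f"Sprint coding time: {totals.get('Sprint coding', 0) / 60:.1f} hours")
```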

Had I kept time logs at my last company, I would have been able to make realistic commitments that were quantitatively defensible, rather than guesstimating and trying to fit within a time frame I thought my manager would find acceptable. Instead of velocity tracking and sprint planning, my manager kept a spreadsheet of developer assignments by work week and updated the assignments and estimates in his weekly 1-1s with us. He may not have intended the 1-1s to be the venue for us to make on-the-spot commitments for the next several weeks, but that is how I perceived them. Armed with my time log data, I would have been able to argue more effectively that a feature would take two sprints rather than one, and I would have been more willing to ask for additional time to scope the feature.

To cross-check your estimates over time, you may also want to keep track of the actual number of hours you needed to complete your features. If you had to work 12-hour days for two weeks straight to meet a feature deadline, having a record of that fact will not only help you fine-tune your estimating skills, it will also help in discussions about load balancing, performance and compensation. Unless you are OK with your effective hourly compensation being discounted by regularly working more than 40 hours a week, you may want to ask for help getting more coding time back during business hours. Perhaps your manager would be willing to excuse you from some meetings and/or reassign some of your bugs in exchange for completing a critical feature on time.

Definition of Done is Better Than Perfect

My last company, like many other lean startups, embraced the principle of “done is better than perfect”. This principle is great as a way to force continuous improvement: put something of value in the hands of customers, get their feedback and respond quickly with the next update. However, there is nothing in the principle that defines what done means. That’s where the Definition of Done (DoD) comes in, and why it’s so important to nail down DoDs for tasks, user stories, sprints and releases. In our case, the DoD for a task/story was tribal knowledge rather than documented, and included the following steps:

  1. Merge request into master approved, with all code review comments addressed, and merged
  2. Build passed, including unit and “e2e” functional tests
  3. No new bugs found in manual sanity test pass

The DoD did not require unit or functional tests to cover the new code being committed, nor did it require any written acceptance criteria. We had an unwritten, high-level set of acceptance criteria that roughly applied to all tasks and stories. They looked something like this: 

  1. It has to work reliably in the CEO’s house
  2. It has to work as well as the same functionality in our Gen1 firmware
  3. It has to work 90-100% of the time in a 40 device configuration

Without more detail, none of these statements can be used to build up acceptance tests that confirm the task or story is done. And so none of our sprints resulted in verifiably done tasks and stories. We declared tasks and stories done based on their presence in passing master builds. If the deliverable reached this state within some number of days of the promised date, the developer had met their commitment more or less on time.

Our DoD for releases was also informal. The master branch was always deemed ready for dogfood, and if a dogfood release seemed stable after a number of weeks in the CEO’s and participating employee homes, it was deemed ready for Beta. If a Beta release had accrued a sufficient number of weeks/months of usage without new high priority issues being reported, it was moved to production.

The Role of QA

My last company did not hire any Software QA Engineers (SQA) or Software Development Engineers in Test (SDET), in part because the CEO believed all software engineers should write feature code. Our customer support person performed a set of manual sanity tests on demand, in her spare time. My manager wrote all of the automation test infrastructure for firmware and many of the tests. When he needed to get back to leading the software team full-time, he and the CEO took the approach of hiring engineering interns to maintain and extend the automation suite. My manager successfully convinced the CEO that automation testing was necessary to support the business, but unfortunately he also reached the conclusion that it was sufficient.

In my view, in a small startup development environment it is sufficient to hire a single SDET as the QA Engineer and empower them to decide whether user stories and releases are done based on their respective DoDs, which should include some amount of acceptance, ad hoc and exploratory testing. The automation suite should cover sanity and regression testing. If there are gaps in the automation suite, the QA Engineer should prioritize them and decide which should be filled by the developers and which they can implement themselves. The QA Engineer provides the product-wide defect filter by owning the overall test plan, running manual test passes to augment the automation suite as needed, tracking and communicating test pass results, and leading the discussion about whether a given release is done based on open bugs and test pass results. This person, supported by a development team that implements unit and functional tests as part of the task and story DoDs, is sufficient for a product pipeline that produces high quality releases.

Conclusion

What I learned from being the worst developer at my company is that I don’t do well in a loosely structured environment unless I impose the missing processes on myself. I don’t expect the company I work for to conform to my development practices, but I can make the argument that delivering my work on time consistently depends on me being able to use those practices. In the best case, I’ll model the behavior that other people will want to adopt. In the worst case, I’ll find out the company is not willing to back their desire for speed with a commitment to good engineering and I can decide whether I want to take my skills and experience elsewhere.