Wednesday, May 14, 2025

A Case for Pragmatic Engineering Leadership – Communications of the ACM


In 2021, with blockchains all the rage, I persuaded my team to pivot some core capability onto this new technology, foolishly believing it made us a progressive organization that was ahead of the curve. We were vision-driven, not need-validated. Six months later, I faced two problems: an over-engineered feature that solved no user problem, and the resignations of three talented, experienced senior engineers, burned out and frustrated that their very real technical concerns had been largely disregarded in favor of pursuing the new and bold. The lesson I took away: leading in engineering involves not only anticipating future developments but also thoughtfully considering how those changes may affect the team. You need to plan carefully and make deliberate decisions, not leave outcomes to chance. Many times, it is about shielding teams from damaging distractions.

The Digg v4 Relaunch: When ‘Vision’ Took Precedence Over Empathy

Problem: In 2010, in response to competitive pressure, Digg’s executive leadership team decided to perform a radical relaunch (v4) of its system to “modernize.” The problem was not the vision, but its execution. During beta testing, user feedback on v4 was overwhelmingly negative because of the removal of features that users loved and the number of bugs. Although the leadership team received and acknowledged this feedback from the site’s community, it ultimately sidelined that feedback in favor of the top-down vision, without empathy for loyal longtime users or for those who had just joined. Result: the infamous “Digg Exodus,” in which users fled a buggy and unfamiliar platform, and Digg lost much of its traffic, relevance, and user confidence. Learning: Think about what your users need and care about. Ignoring their wishes while pushing through a big change is not bold leadership.

Zillow Offers: Where AI Hype Outpaced Risk Management

Challenge: Zillow set out to change how people buy homes with “Zillow Offers.” It used AI to estimate home prices so it could make offers quickly and grow rapidly, and it rolled this model out across the U.S. As growth accelerated, a significant gap emerged: risk management. The AI models did not adapt to a changing real estate market, leading Zillow to offer purchase prices that the fundamentals could not support. Leadership did not think deeply enough about where the model might go wrong, both financially and operationally. Result: Zillow had to close the division in late 2021. The company missed its predicted revenue for the year by millions. Learning: The leaders failed because they did not adequately test their models or account for market changes before expanding. Validate extensively, acknowledge limitations, and identify risks before scaling.

Principles and Practices for Pragmatic Leadership

Focus on Real Validated Problems: Leaders and team members should focus on solving genuine user problems or team problems with validated objective evidence (user research, internal data), not the latest hype.

Practice: Decision Checklist: “Does this solve a validated user/internal problem? What evidence do we have?” should come before adding any new work.

Practice: No-Nonsense Roadmap Reviews: Hold regular sessions where builders can share anonymous opinions about the feasibility and value of potential work. Let them voice their doubts, and listen!

Require Validated Risk Planning: Do not scale up a new technology or make major changes without doing a thorough analysis and identifying the impact of adopting it (especially if that decision is based primarily on hope and pressure).

Practice: Pilot Programs: Require every major new technology or feature to go through a small-scale pilot with success (and failure) metrics defined before scaling.

Practice: Hype Audits: Require reasoning for adopting a major trend in writing (e.g., “Why this tech?”, “Why now?”, “What problems does it solve better than the alternatives?”, “What are the risks?”, “What is the rollback if needed?”).

Practice: Enable Technical Veto: Establish clear structures (e.g., architecture review boards) through which senior technical staff can flag or veto major priority initiatives, and through which more junior staff can escalate concerns to them, whenever there are reasonably clear risks to business continuity, security, feasibility, etc.
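The Pilot Programs and Hype Audits practices above can be combined into a simple gate that a proposal must clear before scaling. What follows is a minimal sketch in Python, not a prescribed tool: the class names, field names, and the rule that every audit question needs a non-empty written answer are assumptions made for illustration. Only the five audit questions come from the text.

```python
from dataclasses import dataclass, field

# The five questions are quoted from the Hype Audits practice.
HYPE_AUDIT_QUESTIONS = (
    "Why this tech?",
    "Why now?",
    "What problems does it solve better than the alternatives?",
    "What are the risks?",
    "What is the rollback if needed?",
)


@dataclass
class PilotMetric:
    """One pilot metric with a target agreed before the pilot starts (hypothetical shape)."""
    name: str
    target: float
    observed: float
    higher_is_better: bool = True

    def passed(self) -> bool:
        if self.higher_is_better:
            return self.observed >= self.target
        return self.observed <= self.target


@dataclass
class Proposal:
    """A proposed technology adoption or major feature (hypothetical shape)."""
    title: str
    audit_answers: dict = field(default_factory=dict)
    pilot_metrics: list = field(default_factory=list)


def unanswered_questions(proposal: Proposal) -> list:
    """Return the audit questions that have no written answer yet."""
    return [
        question
        for question in HYPE_AUDIT_QUESTIONS
        if not proposal.audit_answers.get(question, "").strip()
    ]


def ready_to_scale(proposal: Proposal) -> bool:
    """Assumed rule: a complete written audit plus a pilot that met every metric."""
    if unanswered_questions(proposal):
        return False
    if not proposal.pilot_metrics:
        return False
    return all(metric.passed() for metric in proposal.pilot_metrics)
```

A proposal with no pilot metrics is rejected outright in this sketch, which reflects the idea that success and failure criteria should exist before any scaling decision is made.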

Measure and Reward What Matters for Sustainability: Your metrics and reward incentives matter; let everybody see what you value. Make it public that you value stability, reliability, and team health.

Practice: Stability Scorecard: Make time for you and your team to track and review the key metrics that you care about, like Mean Time to Recovery (MTTR target <30 mins), critical bug rates, system uptime, and team health scores (target >8/10).

Practice: Stability Narratives: Build a narrative for your stakeholders around “Reliability ROI” and risk reduction, so the business value of working on stability is concrete to them.

Practice: Celebrate “Boring” Wins: Explicitly recognize and reward people who make substantial improvements to operational reliability, pay down tech debt in sizeable chunks, or simplify core infrastructure.
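To illustrate the Stability Scorecard practice, here is a minimal sketch in Python of how two of the named metrics could be checked against the targets stated in the text (MTTR under 30 minutes, team survey score above 8 out of 10). The function names, the input shapes, and the choice of a simple average are assumptions; critical bug rates and uptime are left out because the text gives no targets for them.

```python
from statistics import mean

MTTR_TARGET_MINUTES = 30.0  # from the text: MTTR target <30 mins
TEAM_SCORE_TARGET = 8.0     # from the text: team score target >8/10


def mean_time_to_recovery(recovery_minutes):
    """Average minutes taken to recover per incident; 0.0 when there were no incidents."""
    return mean(recovery_minutes) if recovery_minutes else 0.0


def stability_scorecard(recovery_minutes, team_scores):
    """Summarize one review period.

    recovery_minutes: minutes to recover, one entry per incident (assumed input).
    team_scores: anonymous survey answers on a 0-10 scale (assumed input).
    """
    mttr = mean_time_to_recovery(recovery_minutes)
    # With no survey responses there is nothing to average, so the target is not met.
    team_score = mean(team_scores) if team_scores else None
    return {
        "mttr_minutes": mttr,
        "mttr_on_target": mttr < MTTR_TARGET_MINUTES,
        "team_score": team_score,
        "team_on_target": team_score is not None and team_score > TEAM_SCORE_TARGET,
    }


if __name__ == "__main__":
    # Hypothetical data for one quarter: three incidents, three survey answers.
    print(stability_scorecard([10, 20, 45], [9, 8, 9]))
```

Treating a missing survey as "not on target" is a deliberate choice in this sketch: a team that is not being asked how it is doing should not count as healthy by default.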

Acknowledge & Address the Human Element: Accept that technology decisions have direct implications for morale, burnout, and attrition of the team building the systems. Accept the human element as input into your planning. 

Practice: Watch for Red Flags: Take note if engineers start to seem cynical for no clear reason, or if a project relies on technology that is poorly understood. Be worried if there is pressure for your team to skip validation. Be wary if senior engineers are leaving your organization because they can no longer tolerate the instability and lack of direction. These are often indicators of much deeper problems.

Practice: Promote Mental Well-being: Develop a culture where engineers feel psychologically safe to discuss known technical risks or present information about unrealistic hype without any fear of retribution from management.

If Things Go Awry (Recovering from Hype-Induced Shortfalls)

Encourage Transparency: If leadership made a strategic gamble based on hype and faulty validation that didn’t pay off (like Digg v4 or Zillow Offers), it is important that leadership own the failure. Be as transparent as possible in your post-mortem, including the choices that were made and which assumptions and processes failed. Transparency will build far greater trust than shifting blame or making excuses.

Document Lessons Learned: In the post-mortem, make it clear that the lessons go beyond the technical failure to the bigger failure in how decisions were made. Consider asking: What validation steps were missed? What concerns were raised? What was heard, and what was arguably disregarded? Use the answers to strengthen the validation gates, piloting requirements, and overall review process you rely on to assess risk and hype before starting new projects.

Conclusion

Effective engineering leaders recognize that long-term success comes from developing systems that are maintainable and predictable, not from keeping track of the latest technology trends. Technical leadership means protecting teams from distractions as much as it means keeping decisions rooted in good evidence. By focusing on stability and on the foundation the team builds on, leaders can sustain both the product and the team. The goal is a company history of steadily improving decision-making, and teams that grow stronger over time, rather than products that do not outlast the hype that inspired them.

References

Olanoff, D. (2010). Digg launches V4, receives a swift kick in the pants from users. TechCrunch.

Soper, T. (2021). Why the iBuying algorithms failed Zillow, and what it says about the business world’s love affair with AI. GeekWire.

Fried, J., & Hansson, D. H. (2018). It Doesn’t Have to Be Crazy at Work. Harper Business.

Edmondson, A. (1999). Psychological Safety and Learning Behavior in Work Teams. Administrative Science Quarterly.

Beyer, B. et al. (2016). Site Reliability Engineering. O’Reilly Media.

Rahul Chandel is an engineering leader with 15+ years of experience building high-performance systems across fintech, blockchain, and cloud platforms at companies like Coinbase, Twilio, and Citrix. He specializes in scalable, resilient architectures and has presented at AWS re:Invent.
