At Google, I was responsible for creating developer-facing documentation for an internal SDK that enabled machine learning researchers and engineers to run TensorFlow models efficiently on specialized hardware. The SDK, known internally as DarwiNN, provided a complete toolchain for model training, conversion, profiling, and deployment.
Overview
My role was to make this complex, hardware-integrated workflow clear, navigable, and productive for diverse developer audiences.
The Challenge
- Complex Toolchain – The SDK spanned compilers, profilers, simulators, runtimes, and numerics validation.
- Diverse Users – From novice TensorFlow users who wanted "drop-in" simplicity, to advanced engineers seeking low-level TPU programmability.
- Documentation Gaps – Critical workflows were scattered across systems, and engineers struggled to understand why models failed to compile or how to optimize for TPU hardware.
- Internal Infrastructure – Work required navigating Google's internal systems (bug trackers, changelists, internal search engines, go links) to locate SMEs and surface authoritative information.
My Approach
1. User Journeys
I mapped the end-to-end developer workflow into four critical stages:
- Build a Model – Author in TensorFlow, train, evaluate quality.
- Port a Model – Convert to TFLite, then compile/lower it to the target hardware.
- Profile a Model – Validate accuracy, measure performance KPIs.
- Deploy a Model – Integrate with runtime, optimize, and run at scale.
This framework became the backbone of the SDK documentation.
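The four stages above can be sketched as a simple pipeline. This is an illustrative skeleton only: every function name and data field here is a hypothetical stand-in for the internal SDK's tooling, which is not public, but it shows the build → port → profile → deploy ordering that structured the documentation.

```python
# Illustrative sketch of the four documented user-journey stages.
# All names and fields are hypothetical stubs, not the internal SDK's API.

def build_model(config):
    """Stage 1: author a model in TensorFlow, train it, evaluate quality."""
    return {"stage": "built", "config": config}

def port_model(model):
    """Stage 2: convert to TFLite and compile/lower for the target hardware."""
    model["stage"] = "ported"
    return model

def profile_model(model):
    """Stage 3: validate accuracy and measure performance KPIs."""
    model["kpis"] = {"latency_ms": None, "accuracy": None}  # placeholders
    model["stage"] = "profiled"
    return model

def deploy_model(model):
    """Stage 4: integrate with the runtime and run at scale."""
    model["stage"] = "deployed"
    return model

def run_pipeline(config):
    """Run the stages in the order the documentation presents them."""
    model = build_model(config)
    for step in (port_model, profile_model, deploy_model):
        model = step(model)
    return model
```

Each guide in the SDK documentation mapped onto one of these stages, so a reader always knew where they were in the overall journey.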
2. Audience Segmentation
I identified three primary user types and aligned documentation strategy to their needs:
- Novice Developers – wanted "out-of-the-box" functionality.
- Motivated Developers – willing to adapt models and use SDK tools.
- Advanced Developers – required low-level TPU programmability.
3. Toolchain Documentation
For each tool (Compiler, Profiler, Simulator, Runtime, Numerics), I created:
- Conceptual overviews – what the tool is and when to use it.
- User guides – step-by-step workflows.
- Best practices – performance tuning and optimization tips.
- Sample code and troubleshooting – concrete examples with error handling.
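The "sample code and troubleshooting" deliverable paired each common failure with actionable guidance rather than a bare stack trace. A minimal sketch of that pattern, using hypothetical names (`convert_to_tflite`, `UnsupportedOpError`, and the supported-op set are all invented for illustration; the real internal converter and its errors are not public):

```python
# Illustrative troubleshooting wrapper for a model-conversion step.
# `convert_to_tflite` and `UnsupportedOpError` are hypothetical stand-ins
# for the internal toolchain's API.

class UnsupportedOpError(Exception):
    """Raised when a model uses an op the target hardware cannot lower."""
    def __init__(self, op_name):
        super().__init__(f"op not supported on target: {op_name}")
        self.op_name = op_name

def convert_to_tflite(model):
    # Stand-in converter: rejects models that declare an unsupported op.
    for op in model.get("ops", []):
        if op not in {"conv2d", "relu", "dense"}:
            raise UnsupportedOpError(op)
    return {"converted": True, **model}

def convert_with_diagnostics(model):
    """Convert a model, turning a raw failure into actionable guidance."""
    try:
        return convert_to_tflite(model)
    except UnsupportedOpError as err:
        # Docs pattern: state the failing op AND the fix, not just the symptom.
        raise RuntimeError(
            f"Conversion failed on '{err.op_name}'. "
            "Replace it with a supported op (conv2d, relu, dense) "
            "or fall back to running the unsupported subgraph on CPU."
        ) from err
```

The troubleshooting guides followed the same shape: name the failing component, then list the concrete remediations, so engineers could self-serve instead of filing a bug.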
4. Cross-Team Collaboration
- Partnered with SMEs across engineering to validate technical accuracy.
- Used bug tracking systems and changelists to manage documentation requests.
- Implemented process improvements for doc accountability and timely reviews.
Deliverables
- Comprehensive SDK User Guide covering build → port → profile → deploy.
- Best Practices and Troubleshooting Guides for compilers, profilers, and runtime integration.
- Sample Code Snippets for TensorFlow/Keras workflows.
- Audience-specific guidance (novice vs. advanced usage paths).
- Documentation process improvements to streamline review and approval cycles.
Impact
- Created a single, authoritative source of truth for DarwiNN SDK users.
- Significantly reduced developer confusion around model compilation and TPU deployment.
- Enabled ML researchers and engineers to move from prototype to deployment faster by clarifying workflows and exposing best practices.
- Demonstrated ability to translate highly complex, proprietary systems into accessible, developer-friendly documentation.
Skills Demonstrated
- API & SDK documentation
- Developer workflows (TensorFlow, model training & inference)
- Information architecture & user journeys
- Working with SMEs & cross-functional engineering teams
- Managing documentation processes with bug trackers and changelists
- Writing for diverse technical audiences