September 25, 2025

Principal Software Engineer

Senior • On-site

$163,000 - $296,400/yr

Redmond, WA

Overview

OneDrive and SharePoint are rapidly growing services at the center of Microsoft's cloud, reaching almost every part of the company, like Windows and Office. You would be a part of a team that can fundamentally change the way that millions of people use their devices and interact with the most important content in their lives.

 

We are looking for a smart, agile, and intellectually curious Principal Software Engineer who loves creating and building things that delight and protect our customers. You will work with a team of amazing engineers, PMs, and designers, and closely with other teams across Microsoft, to deliver large-scale distributed architectures and features that meet OneDrive and SharePoint's core infrastructure needs.

 

You'll design and deliver systems that enable partners and ISVs to migrate from other cloud providers, improve core systems performance and efficiencies, and ensure zero customer impact throughout the change management cycle. You will deliver systems to meet our business continuity planning goals, provide telemetry for optimizing the service, and drive down the time it takes to detect and resolve service issues.

 

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required Qualifications:

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
    • OR equivalent experience.
  • 6+ years of experience collaborating with partner teams to meet the engineering goals in a unified manner. 
  • 6+ years of delivering and interacting with REST APIs and web services across multiple systems.
  • 6+ years of experience in coding, debugging, algorithm design, and problem solving.
  • 6+ years of experience with cloud-scale services and server/service management features.
Other Requirements:
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred Qualifications:
  • Experience with building cloud-scale infrastructure components.
  • Awareness, passion, and experience related to cloud scale distributed design and patterns.
  • Familiarity with secure software design concepts.
  • Proven track record of delivering projects that include multiple components. 
  • Ability and eagerness to work across and partner with multiple engineering teams to achieve business goals. 

Software Engineering IC6 - The typical base pay range for this role across the U.S. is USD $163,000 - $296,400 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $220,800 - $331,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft will accept applications for the role until September 29, 2025. 

 

 

#ODSPEng

Responsibilities

Coding
  • Provides technical leadership during code reviews for a solution/product area to assure it meets team standards, contains the correct test coverage, and is appropriate for the product or solution area. Brings expertise to code reviews to help improve code quality, proactively coaching and providing feedback to develop other engineers' skills. Ensures coding standards are followed. Screens for and establishes best practices in reviews and provides feedback on code to drive adherence to best practices. Uses automated source code analysis tools that are incorporated into the build/development process.
  • Leads by example across teams and mentors others to produce extensible, maintainable, well-tested, secure, and performant code used across the company that adheres to design specifications. Leads efforts to continuously improve code performance, testability, maintainability, effectiveness, and cost, while accounting for and incorporating relevant trade-offs. Identifies best practices and coding patterns (e.g., leveraging state-of-the-art generative artificial intelligence [GenAI], approaches to source code organization, naming conventions) and provides deep expertise in the coding and validation strategy. Creates and applies metrics to drive code quality and stability, appropriate coding patterns, and best practices. Leads efforts to identify and anticipate blockers or unknowns during the development process, escalate them, and communicate how they will impact timelines, and then drives the identification and implementation of strategies and/or opportunities to address them.
  • Acts as an expert on using debugging tools, tests, logs, telemetry, and other methods, and proactively leads verification of assumptions while developing code, before issues occur in production across products and teams. Leverages minimal telemetry data, triangulates issues, and resolves them with minimal iterations. Leads incident retrospectives to identify root causes of problems, and owns the implementation of repair actions and the identification of mechanisms to prevent incident recurrence. Drives applying least-access principles, using logging, telemetry, and other appropriate mechanisms to investigate issues while retaining privacy and security, and champions those practices across the team.
Design
  • Establishes best practices and mentors others to create a clear test strategy that ensures solution quality and prevents regressions from being introduced into existing code. Establishes best practices and mentors others on ensuring test plans incorporate security testing to validate security invariants (including negative cases). Provides technical leadership on adding new tests to cover gaps, deleting or fixing broken tests, and improving the speed, reliability, and defect localization of the overall test suite across a solution or product. Mentors others on, and builds, testable code, and considers testability during design across solutions and/or products. Acts as a thought leader for understanding the different types of tests that can be done on a particular system (e.g., unit tests), maintains an up-to-date understanding of testing architectures used both across Microsoft and across the industry, and applies them across the architecture as appropriate. Designs and executes plans for redesigning or rearchitecting difficult or untestable sections of code across solutions and/or products. Leverages artificial intelligence (AI) tools for test automation.
  • Provides technical leadership for the identification of dependencies and incorporating them into the development of design documents for a product, application, service, or platform. Leads the active identification of other teams and technologies to leverage, how they interact, and where their own system or team can support others. Helps to create relationships and links impacting upstream and downstream interactions between systems and ensures security, compliance, performance, and reliability can be achieved across the entire stack. Drives coordination and collaboration with other teams to reach common goals where dependencies and validation concerns overlap. Enables and fosters communications and proactively negotiates across teams to resolve conflicts around dependency ownership and required work. Drives agreements between dependent teams to align to the delivery schedule.
  • Oversees, influences, and owns efforts and design discussions for the overall system architecture of entire products/solutions that are deeply complex and often ambiguous. Owns the testing and exploration of various design options for entire products/solutions, ensuring the strengths and weaknesses of each option are outlined and making recommendations for which design option is best. Owns creating proposals for architecture and design documents, and leads testing of hypotheses and deeply complex proposed solutions. Shares and acts on findings from investigations, owns design decisions, and oversees the less experienced team members. Leads the development of design documents that support user stories and other product requirements. Proactively identifies and evaluates new technologies to solve classes of problems, and determines and advocates for how to integrate these technologies within existing systems. Provides technical leadership to ensure system architecture and individual designs meet performance, scalability, resiliency, disaster recovery, cost of goods sold (COGS), and other requirements and expectations. Upholds Microsoft's standards of security, privacy, and other compliance requirements and expectations. Understands and coaches less experienced engineers on the importance of building solutions that expand upon the work of others. Leads the refinement of products through deeply complex data analytics, and makes informed decisions in engineering products through data integration. Reviews deeply complex designs/architectures within and across teams to provide recommendations for improvements.
Engineering Excellence
  • Leads the identification of requirements for, and the comprehensive application of automation within production and deployment across complex products, targeting zero-touch deployment when possible. Runs code in simulated or other non-production environments to confirm functionality and error-free runtime across complex products.
  • Applies and helps to create best practices and shares information with other engineers for building code based on well-established methods and secure design principles while also applying best practices for new code development and formal validation of security invariants. Leads product development and scaling to customer requirements and applies best practices for meeting scaling needs and performance expectations and security promises, and holds accountability for product/solution areas that do not meet expectations.
  • Provides technical leadership through efforts to ensure the correct processes are followed to achieve a high degree of security, privacy, safety, and accessibility across solutions and teams. Leads in developing and assures the presence of visible evidence (e.g., audit trail) to demonstrate compliance for products. Develops and maintains a deep understanding of the implications of onboarding new technologies following expectations of compliance at Microsoft. Provides thought leadership and maintains an up-to-date understanding of both global and local regulations for technologies and system applications to ensure regulations are followed and met.
  • Remains current by investing time and effort into staying abreast of current developments. Proactively seeks new knowledge, evaluating new trends, technical solutions, and patterns, assessing how to adapt them to current problems, and shares knowledge with other engineers. Conducts learning and literary sessions to raise awareness on relevant engineering design principles (e.g., security, testability, performance, scalability, accessibility, product knowledge).
  • Shares and teaches others best practices about new tools and strategies. Leads efforts and mentors others to build software developer tools to support easier, faster, and more effective software engineering across products. Identifies whether open source or internal code is available to address coding needs for a set of complex products, and reuses it in a responsible manner where applicable. Holds subject matter expertise in tools inside and outside current areas of expertise. Leads identification and/or creation of tools that are useful for building the product.
  • Drives understanding and applying security best practices and establishes code invariants to model "security as code," ensuring each layer is independently secure, and minimizing risk. Supports and/or adopts, and may set security standards for clear security code review practices for a set of complex products that align with design and engineering principles to raise the security hardening for both protections and detections. Provides thought leadership on proactively incorporating deployment gates on security controls, and scanners for a set of complex products to prevent regressions and/or vulnerabilities that would have customer impact. Includes required security monitoring to ensure detection of violations. Drives collaboration with relevant security partners to define security promises and security invariants for the design of a product/solution while factoring in attacker/investigator personas for security monitoring and telemetry needs, ensure threat models and premortems validate upstream and downstream assumptions and security invariants, establish security breach drills and security incident response processes (e.g., impact analysis, containment), and ensure that artificial intelligence (AI) safety features are implemented for the AI production systems tied to a set of complex products.
  • Drives collaborating with partner teams to ensure a set of complex products works well with the components of the partner team, ensuring proper end-to-end testing, live-site coverage, scalability, performance, and DRI escalation pathways are established before going live.
Implement
  • Leads efforts for experiments that determine the impact of changes using feature flags/flighting in their code, interprets results, and decides on next steps or whether to ship based on the results. Drives identification of the correct metrics for experimentation to determine improvements in customer value. Drives collaboration efforts with internal partners (e.g., Data Science, product managers) to ensure incorporation of success and guard rail metrics for experimentation.
  • Leverages their deep subject-matter expertise to partner with appropriate stakeholders (e.g., technical program managers) to lead multiple products' project plans, release plans, and work items. Breaks down long-term project vision into milestones. Guides other members for project estimation and escalates only the most critical issues. Owns efforts to ensure required security protections and detection processes are accounted for in planning. Drives efforts to ensure project plans adhere to security, privacy, and compliance requirements. Proactively drives efforts to ensure all code across multiple products/solutions is properly flighted for quicker mitigation of production incidents. Calculates capacity for planning, accounting for appropriate failover and backup/restore mechanisms for disaster recovery for a set of complex solutions. Drives making considerations for efficient operation of a set of complex products and/or solutions after it is live. Drives proactively establishing rollback plans for a set of complex products and/or solutions.
  • Leads leveraging existing deployment frameworks in the implementation of solutions within the existing framework, driving the automation of deployment tasks where possible to ensure efficiency. Drives following safe change deployment best practices (including ensuring that flights are set correctly) for their team to minimize adverse impact to users and other services. Optimizes deployments across products and components to meet differing business objectives. Leads efforts to ensure that solutions are deployed safely, rolling out security-sensitive features only to applicable, relevant customers and scenarios to reduce the attack surface. Leads efforts to monitor dependency status and ensure that only the latest, secure versions are deployed. Leads efforts to define when rollback plans should be enacted for a set of complex products. Drives building deployment infrastructure to allow developers' private builds for a set of complex solutions to be tested in a production-like environment.

Reliability and Supportability

  • Acts as an expert in design and integration, and signs off on the work of others across teams or multiple products, on logging and telemetry in systems and products. This telemetry provides feedback on system behavior (such as performance, reliability, availability, and usage), implements safety mechanisms, and allows monitoring and investigation of security-related concerns and scenarios for both live and A/B experiments for products, services, and offerings, creating iterative feedback loops that inform subsequent designs. Ensures solutions are scalable, financially responsible, and meet capture/storage guidelines. Provides technical leadership in efforts to classify and analyze complex data on a range of metrics (e.g., health of the system, where bugs might be occurring), and sets expectations for outputs (e.g., notifications, dashboards) that improve system monitoring, issue identification and mitigation, and the monitoring and investigation of security-related concerns and scenarios. Proactively considers the privacy implications of telemetry code changes, and of adding new data points.
  • Holds accountability as a designated responsible individual (DRI) and mentors other engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions. Alerts stakeholders as to status and initiates actions to restore system/product/service for complex issues. Develops a playbook for the team to resolve issues. Coordinates people and resources to ensure DRI responsibilities are covered across teams. Responds within service level agreement (SLA) timeframe. Has line of sight to incidents and plans to address emerging issues. Leads efforts to reduce incident volume, looking globally at incidents and providing broad resolutions. Escalates issues to appropriate owners.
  • Leads efforts in the maintenance of live site service on a rotational, on-call basis, following security best practices and using the minimum required permissions when responding quickly to mitigate issues that arise. Implements and helps others implement solutions and mitigations to complex issues impacting the performance or functionality of live site services. Reviews systemic issues and ensures they are resolved. Ensures playbooks are logical and understandable. Uses feedback from other solutions to inform preventative measures. Reviews and writes complex incident postmortems and presents insights that drive changes to reduce or eliminate incidents across teams. Drives improving troubleshooting guides (TSGs), wikis, tests, and telemetry to make on-call better, and defining user-facing support documentation and additional test coverage to reduce the likelihood of future user-initiated incidents. Drives the enablement of secure operations, security monitoring, and integration with live site investigation activities. Leads efforts to identify opportunities (e.g., lunch talks, automation, practices, tools) that can be leveraged to improve the live site experience, and executes on them.
Understand User Requirements
  • Partners with and guides appropriate internal (e.g., product manager, privacy/security subject matter expert, technical lead) and external (e.g., customer escalation team, public forums) stakeholders and leverages expertise to anticipate, determine, and confirm customer/user requirements and their feasibility for one or more complex scenarios. Proactively seeks and leverages a variety of feedback channels to incorporate customer insights into future designs or solution fixes. Leads incorporation of unwritten requirements, such as appropriate continuous feedback loops that measure actionable, quantitative (e.g., customer value, usage patterns, solution performance) and qualitative (e.g., accessibility, globalization) indicators of value. Determines additional critical metrics. Understands, leads providing feedback on, and advocates for the security and privacy needs of the customer who will be using the complex set of solutions.

 

Embody our culture and values.