Infra Hardware TPM - AI Systems at Meta

Posted in Other about 4 hours ago.

Location: Menlo Park, California





Job Description:

Meta is seeking a Technical Program Manager (TPM) with experience in AI server hardware design, development, and deployment at scale. This position will work with cross-functional teams in Meta's Infrastructure organization to drive product definition, proof of concept generation, design, component selection, integration, development, validation, and end-to-end adoption of new hardware products. The position focuses on emerging next-generation AI/ML servers specialized for Meta's AI software workloads, and requires engagement with external vendors as well as a range of internal engineering and specialized teams.

The position also focuses on creating strategies and executing plans to support the development and deployment of new-generation AI platforms. These platforms are critical to supporting Meta's various software workloads and are a key enabler of the company's push into AI. The role works with external and internal partners to influence and define roadmaps based on technical and business considerations, influence software development and adoption strategies including internal customer and stakeholder alignment, and drive integration with Meta's capacity planning tools and systems across the program space.

The role owns one or more individual development programs in which the candidate is responsible for leading the end-to-end success of the product. This includes design and development, delivering hardware into data centers, influencing cluster enablement and management systems, software tooling, host provisioning, testing, software performance, and enabling new and existing software applications for production turn-up. The role works with Infrastructure Hardware development, Infrastructure Software, Capacity Planning, Data Center, Network Infrastructure, and Infrastructure Sourcing teams.

Meta's Infrastructure Engineering organization is responsible for the growth, management, and 24x7 upkeep of all of Meta's products and services.



Infra Hardware TPM - AI Systems Responsibilities:



  • Lead technical program management of next-generation Artificial Intelligence/Machine Learning (AI/ML) platform(s) for Meta Infrastructure in a matrix organization covering a range of areas (Data Center, Network, Hardware Systems, Infrastructure Engineering, Software Engineering, Capacity Management) and across multiple physical locations.

  • Own overall program success spanning the end-to-end development of the hardware product. This includes the server, modules, chassis, and subsystems, and spans internal and external development work through successful ingestion into Meta's infrastructure and support of production workloads at scale.

  • Develop and manage programs including defining scope, requirements, development model, schedules, and deliverables with engineering teams, partners, and stakeholders.

  • Influence broader roadmaps through product interception and market fit, competitive analysis, and feasibility studies.

  • Provide hands-on program management during analysis, design, development, testing, implementation, and post implementation phases.

  • Partner with Engineering counterparts across a range of specialties as well as other teams to define product roadmaps.

  • Drive overall communication to leadership, stakeholders, and core working teams on a regular cadence to maintain awareness.

  • Drive internal process improvements across multiple teams and functions.

  • Analyze infrastructure needs and produce hardware designs and prototypes to meet those needs.

  • Manage and drive strategic vendor engagement and deliveries.





Minimum Qualifications:



  • B.S. in Computer Science, Electrical Engineering or a related technical discipline, or equivalent experience.

  • 7+ years of technical program management, hardware engineering, systems engineering, technical product management, or similar experience.

  • Understanding of the AI hardware stack, including bottlenecks and dependencies arising from software workloads.

  • Experience delivering complex tech programs and/or products from inception to delivery.

  • Experience operating autonomously across multiple teams, demonstrated critical thinking, and thought leadership.

  • Communication experience and experience working with technical management and leadership teams to develop systems, solutions, and products.

  • Organizational, coordination and multi-tasking experience.

  • Analytical and problem-solving experience with large-scale systems.

  • Experience establishing work relationships across multi-disciplinary teams and multiple partners in different time zones.





Preferred Qualifications:



  • Experience with data center architecture and deployment.

  • Hardware Systems Design experience.

  • Experience in large-scale AI cluster build-out.

  • Experience working with Original Design Manufacturers (ODMs) and other vendors.

  • Web or Internet start-up environment and technical infrastructure management experience.





About Meta:



Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today: beyond the constraints of screens, the limits of distance, and even the rules of physics.



Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.

Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.


$167,000/year to $230,000/year + bonus + equity + benefits


Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.