Scaling Data Ingestion For Machine Learning Training At Meta

Many of Meta’s products, such as search, ads ranking and Marketplace, utilize AI models to continuously improve user experiences. As the performance of hardware we use to support training infrastructure increases, we need to scale our data ingestion infrastructure accordingly to handle workloads more efficiently. GPUs, which are used for training infrastructure, tend to double [...]
The post Scaling data ingestion for machine learning training at Meta appeared first on Engineering at Meta.

How Thermal Simulation Helps Optimize Meta’s Data Centers

Data center optimization has always played an important role at Meta. By optimizing our data centers’ environmental controls, we can reduce our environmental impact while ensuring that people can always depend on our products. With most other complex systems, optimization of energy consumption is a trial-and-error process. But experimenting on any component of a live [...]

MemLab: An Open Source Framework For Finding JavaScript Memory Leaks

We’ve open-sourced MemLab, a JavaScript memory testing framework that automates memory leak detection. Finding and addressing the root cause of memory leaks is important for delivering a quality user experience on web applications. MemLab has helped engineers and developers at Meta improve user experience and make significant improvements in memory optimization. We hope it will [...]

Open-sourcing TAOBench: An End-to-end Social Network Benchmark

What the research is: The continued emergence of large social network applications has introduced a scale of data and query volume that challenges the limits of existing data stores. However, few benchmarks accurately simulate these request patterns, leaving researchers in short supply of tools to evaluate and improve upon these systems. To address this issue, [...]

Network Entitlement: A Contract-based Network Sharing Solution

Meta’s overall network usage and traffic volume have increased as we’ve continued to add new services. Due to the scarcity of fiber resources, we’re developing an explicit resource reservation framework to effectively plan, manage, and operate the shared consumption of network bandwidth, which will help us keep up with demand and limit network disruptions during [...]

Viewing The World As A Computer: Global Capacity Management

Meta currently operates 14 data centers around the world. This rapidly expanding global data center footprint poses new challenges for service owners and for our infrastructure management systems. Systems like Twine, which we use to scale cluster management, and RAS, which handles perpetual region-wide resource allocation, have provided the abstractions and automation necessary for service [...]

Introducing Meta’s Creators Of Tomorrow

We’re unveiling a showcase of digital creators across multiple countries who are displaying innovative approaches to content across our apps.
The post Introducing Meta’s Creators of Tomorrow appeared first on Meta.

Introducing Velox: An Open Source Unified Execution Engine

Meta is introducing Velox, an open source unified execution engine aimed at accelerating data management systems and streamlining their development. Velox is under active development. Experimental results from our paper published at the International Conference on Very Large Data Bases (VLDB) 2022 show how Velox improves efficiency and consistency in data management systems. Velox helps [...]

Improving Meta’s SLO Workflows With Data Annotations

When we focus on minimizing errors and downtime here at Meta, we place a lot of attention on service-level indicators (SLIs) and service-level objectives (SLOs). Consider Instagram, for example. There, SLIs represent metrics from different product surfaces, like the volume of error response codes to certain endpoints, or the number of successful media uploads. Based [...]

Five Security Principles For Billions Of Messages Across Meta’s Apps

At Meta, our messaging apps help billions of people around the world stay connected to those who matter most to them. This scale brings potential threats from criminals and hackers, so we have a responsibility to keep people and their data safe. We’re sharing a set of principles to ensure that security is central to [...]


Recently, we open sourced Superconsole, a Text-based User Interface (TUI) library written in Rust. The reason for creating Superconsole was to power the future iteration of the Buck build system, giving user-friendly information about what a build is d...

Facebook Wi-Fi

Facebook Wi-Fi is an easy, affordable way for you to connect people, provide a digital experience to engage offline customers, and improve customer retention.