35.8 CloudWatch Embedded Metric Format (EMF): Logging Custom Metrics
Right, let’s talk about getting your custom metrics out of your application logs and into CloudWatch where they belong. You see, CloudWatch is a bit of a diva. It loves metrics, but it demands they be presented in a very specific, structured way. You could use the PutMetricData API call from your application code, but that’s a great way to drown yourself in network calls, SDK overhead, and code that’s more about telemetry than business logic.
The smarter way, the way AWS actually wants you to do it but is hilariously bad at advertising, is the CloudWatch Embedded Metric Format (EMF). The concept is brilliantly simple: instead of making a separate network call for every data point, you just print a special, highly structured JSON log message. Once that message lands in CloudWatch Logs (via Lambda’s built-in log forwarding or, on servers and containers, the unified CloudWatch Agent), CloudWatch sees the magic structure and extracts the metrics into CloudWatch Metrics for you. It’s like slipping a note to the bouncer instead of waiting in the long line.
How EMF Actually Works (The Magic Trick Explained)
Think of EMF as a two-part message. The first part is for the metric system: it says, “Hey, here are some numbers I want you to track under these names.” The second part is for the log system: it says, “Also, please store the full context of this event as a log, so some future human can figure out why that number was 100 at 3:04 PM on a Tuesday.”
An EMF log is just a JSON object that follows a strict schema. It needs a _aws section that contains all the metric witchcraft, and you can have any other custom fields you want for logging context.
Here’s the simplest possible example. Let’s say you want to track the duration of a specific function.
{
  "_aws": {
    "Timestamp": 1677639742000,
    "CloudWatchMetrics": [
      {
        "Namespace": "MyApp/MyService",
        "Dimensions": [["FunctionName"]],
        "Metrics": [
          {
            "Name": "Duration",
            "Unit": "Milliseconds"
          }
        ]
      }
    ]
  },
  "FunctionName": "ProcessOrder",
  "Duration": 145.7,
  "requestId": "a1b2c3d4",
  "orderId": "ord-987"
}
When this log line reaches CloudWatch, two things happen:
- The metric Duration: 145.7 is extracted and published to CloudWatch Metrics in the namespace MyApp/MyService with the dimension FunctionName=ProcessOrder.
- The entire JSON blob, including your custom requestId and orderId, is stored in CloudWatch Logs.
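The beauty is that you ingest the data once, as a log, and get both a searchable log and a first-class metric out of it. (The extracted metrics are still billed as custom metrics, so this saves you API calls and plumbing, not the metric charge itself.)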
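If the two-part idea still feels abstract, here is a rough, standard-library-only sketch of the split. The function name split_emf_event is made up for illustration, and this is only a simulation of the idea: it is not how CloudWatch is actually implemented, and it ignores most of the real schema rules.

```python
import json


def split_emf_event(raw_line):
    """Roughly simulate splitting one EMF log line into metric datapoints and a log record."""
    event = json.loads(raw_line)
    datapoints = []
    for directive in event["_aws"]["CloudWatchMetrics"]:
        for dimension_set in directive["Dimensions"]:
            # Dimension *names* live in the directive; their *values* are top-level fields.
            dimensions = {name: event[name] for name in dimension_set}
            for metric in directive["Metrics"]:
                datapoints.append({
                    "namespace": directive["Namespace"],
                    "name": metric["Name"],
                    "value": event[metric["Name"]],  # metric values are top-level fields too
                    "unit": metric.get("Unit", "None"),
                    "dimensions": dimensions,
                    "timestamp_ms": event["_aws"]["Timestamp"],
                })
    # The metrics side gets the datapoints; the logs side keeps the whole event.
    return datapoints, event


raw = json.dumps({
    "_aws": {
        "Timestamp": 1677639742000,
        "CloudWatchMetrics": [{
            "Namespace": "MyApp/MyService",
            "Dimensions": [["FunctionName"]],
            "Metrics": [{"Name": "Duration", "Unit": "Milliseconds"}],
        }],
    },
    "FunctionName": "ProcessOrder",
    "Duration": 145.7,
    "requestId": "a1b2c3d4",
    "orderId": "ord-987",
})

datapoints, log_record = split_emf_event(raw)
print(datapoints)
print(sorted(log_record))
```

Notice that requestId and orderId never show up in the datapoints. They only exist on the log side, which is exactly where high-cardinality context belongs.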
Generating EMF: Use a Library, You Maniac
You could handcraft these JSON objects yourself. Please don’t. It’s a tedious, error-prone nightmare. AWS provides official client libraries for this exact purpose. Here’s how you’d do it in Python with the aws-embedded-metrics library.
First, install it: pip install aws-embedded-metrics
Now, use it properly:
import time

from aws_embedded_metrics import metric_scope
from aws_embedded_metrics.config import get_config

# Optional: Configure the library to use a different log group or disable debug logging
config = get_config()
config.log_group_name = "MyApp/ApplicationMetrics"


@metric_scope
def process_order(order_id, metrics):
    # Start a timer
    start_time = time.time()

    # ... your actual business logic here ...
    # Let's pretend it takes some time
    time.sleep(0.1)

    # Calculate duration
    duration = (time.time() - start_time) * 1000  # Convert to milliseconds

    # Set the namespace and dimensions for all metrics in this function
    metrics.set_namespace("MyApp/MyService")
    metrics.put_dimensions({"FunctionName": "ProcessOrder", "Environment": "prod"})

    # This is the key line: it records your custom metric
    metrics.put_metric("Duration", duration, "Milliseconds")

    # Add high-cardinality context for the log stream
    metrics.set_property("requestId", "a1b2c3d4")
    metrics.set_property("orderId", order_id)
    metrics.set_property("duration", duration)  # Also include it as a property for logging

    return f"Processed order {order_id}"


# Call the function; the decorator supplies the metrics argument
result = process_order("ord-987")
print(result)
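When the decorated function returns, the library flushes everything it collected as a single EMF log event. In a Lambda environment, that event goes to stdout, and that’s all you need, because Lambda automatically sends stdout to CloudWatch Logs. On EC2, you need the CloudWatch agent running to pick it up; outside Lambda the library typically hands events to the agent over a local socket rather than stdout, so check the library’s environment settings if nothing arrives.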
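To make the "collect everything, then flush once" behaviour concrete, here is a toy version built only on the standard library. TinyMetrics and tiny_metric_scope are hypothetical names invented for this sketch. They are not part of aws-embedded-metrics and skip almost everything the real library does (environment detection, sinks, validation, limits).

```python
import json
import time
from contextlib import contextmanager


class TinyMetrics:
    """Hypothetical stand-in for a metrics logger; NOT the aws-embedded-metrics API."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.dimensions = {}
        self.metrics = {}  # metric name -> {"unit": str, "values": [numbers]}
        self.properties = {}

    def put_dimensions(self, dimensions):
        self.dimensions.update(dimensions)

    def put_metric(self, name, value, unit="None"):
        # Repeated calls for the same name are buffered, not sent one by one.
        entry = self.metrics.setdefault(name, {"unit": unit, "values": []})
        entry["values"].append(value)

    def set_property(self, key, value):
        self.properties[key] = value

    def to_event(self, timestamp_ms):
        """Build one EMF-shaped dict from everything collected so far."""
        event = dict(self.properties)
        event.update(self.dimensions)
        for name, entry in self.metrics.items():
            values = entry["values"]
            event[name] = values[0] if len(values) == 1 else values
        event["_aws"] = {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [{
                "Namespace": self.namespace,
                "Dimensions": [list(self.dimensions)],
                "Metrics": [
                    {"Name": name, "Unit": entry["unit"]}
                    for name, entry in self.metrics.items()
                ],
            }],
        }
        return event


@contextmanager
def tiny_metric_scope(namespace):
    metrics = TinyMetrics(namespace)
    try:
        yield metrics
    finally:
        # One JSON line per scope, written only when the scope ends.
        print(json.dumps(metrics.to_event(int(time.time() * 1000))))


with tiny_metric_scope("MyApp/MyService") as metrics:
    metrics.put_dimensions({"FunctionName": "ProcessOrder"})
    metrics.put_metric("Duration", 145.7, "Milliseconds")
    metrics.put_metric("Duration", 98.2, "Milliseconds")
    metrics.set_property("orderId", "ord-987")
```

Two put_metric calls, one log line. That is the whole trick, and it is also the reason to use the real library instead of this toy: it handles the edge cases this sketch happily ignores.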
The Critical Importance of Dimensions
Dimensions are how you slice and dice your metrics. They’re the key to going from “my latency is high” to “my latency is high for the ProcessOrder function in the us-east-1c availability zone for customers on the premium plan.” You declare the dimension names in the Dimensions array inside your _aws block and supply their values as top-level fields in the same JSON object.
The biggest “gotcha” with EMF dimensions is that they are immutable for the life of the log entry. You can’t change them halfway through your function. This is why the library examples use the @metric_scope decorator or a similar context manager—it creates a single EMF event. Plan your dimensions upfront.
# Good: Setting dimensions at the start
metrics.put_dimensions({"Service": "OrderAPI", "Function": "CreateOrder"})
# Bad: Trying to change them later (in the Python library this tends to add a second
# dimension set rather than replace the first, so metrics get published under both)
metrics.put_dimensions({"Service": "PaymentAPI"})  # Don't do this.
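Planning dimensions upfront is easier once you see what a Dimensions array actually buys you. Each inner list is one dimension set, and each set produces its own time series for every metric in the event. The sketch below uses a hypothetical helper, series_for, to list the series that a single event would feed; it is an illustration of the schema, not library code.

```python
def series_for(namespace, metric_name, dimension_sets, fields):
    """List the time series one EMF event contributes to, one per dimension set."""
    series = []
    for dimension_set in dimension_sets:
        # Resolve each dimension name to the value carried by this event.
        key = tuple((name, fields[name]) for name in dimension_set)
        series.append((namespace, metric_name, key))
    return series


fields = {"Service": "OrderAPI", "Function": "CreateOrder"}

# Two sets: a per-service rollup, plus a per-function breakdown.
dimension_sets = [["Service"], ["Service", "Function"]]

series = series_for("MyApp/MyService", "Duration", dimension_sets, fields)
for entry in series:
    print(entry)
```

One event, two time series: a service-wide rollup and a per-function view. Every extra dimension set multiplies your series count, which is a good reason to decide on them once, at the top of the function.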
Best Practices and Common Face-Palms
- Mind the Cardinality: CloudWatch Metrics has hard limits. Each unique combination of metric name, namespace, and dimension value becomes a new time series. If you use a userId as a dimension, you’re going to have a bad (and very expensive) time. Use dimensions for bounded sets like Environment, ServiceName, AZ, or ErrorType.
- Libraries Handle Batching: The official libraries are smart. They buffer multiple metric calls (put_metric) into a single EMF log message, which is far more efficient. Don’t try to outsmart them; just call put_metric every time you have a new value.
- You Still Need Alarms: Just because your metric is in CloudWatch doesn’t mean you’re done. The whole point is to set alarms on these things! Navigate to your new metric in the CloudWatch console and create an alarm for when Duration is above, say, 500 milliseconds for 3 consecutive periods.
- Debugging: If your metrics aren’t showing up, check your CloudWatch agent configuration first. Then, look at the raw logs in your log group. You should see the perfectly formatted EMF JSON objects. If you don’t, the agent isn’t seeing them or your code isn’t outputting them. If you see them but no metrics, check the schema: a missing _aws field or a typo in CloudWatchMetrics will break the whole operation.
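For that last debugging step, a small sanity checker can save you some squinting. The sketch below is a hypothetical helper, find_emf_problems, that you could run over a raw log line copied from your log group. It covers only the handful of mistakes mentioned above and is nowhere near a full validation of the official EMF schema.

```python
import json


def find_emf_problems(raw_line):
    """Return a list of human-readable problems found in one raw EMF log line."""
    try:
        event = json.loads(raw_line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(event, dict):
        return ["top-level value is not a JSON object"]

    aws = event.get("_aws")
    if not isinstance(aws, dict):
        return ["missing _aws object"]

    problems = []
    if not isinstance(aws.get("Timestamp"), int):
        problems.append("_aws.Timestamp is missing or not an integer (epoch milliseconds)")

    directives = aws.get("CloudWatchMetrics")
    if not isinstance(directives, list) or not directives:
        problems.append("_aws.CloudWatchMetrics is missing or empty")
        return problems

    for index, directive in enumerate(directives):
        if not isinstance(directive, dict):
            problems.append(f"directive {index} is not an object")
            continue
        if not directive.get("Namespace"):
            problems.append(f"directive {index} has no Namespace")
        for dimension_set in directive.get("Dimensions", []):
            for name in dimension_set:
                if name not in event:
                    problems.append(f"dimension {name!r} has no top-level value")
        for metric in directive.get("Metrics", []):
            name = metric.get("Name")
            if name not in event:
                problems.append(f"metric {name!r} has no top-level value")
    return problems


good = json.dumps({
    "_aws": {
        "Timestamp": 1677639742000,
        "CloudWatchMetrics": [{
            "Namespace": "MyApp/MyService",
            "Dimensions": [["FunctionName"]],
            "Metrics": [{"Name": "Duration", "Unit": "Milliseconds"}],
        }],
    },
    "FunctionName": "ProcessOrder",
    "Duration": 145.7,
})
# Same event with the classic typo: "CloudWatchMetric" instead of "CloudWatchMetrics".
typo = good.replace("CloudWatchMetrics", "CloudWatchMetric")

print(find_emf_problems(good))
print(find_emf_problems(typo))
```

An empty list means the line passes these particular checks, not that CloudWatch is guaranteed to accept it.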
The shift from active PutMetricData calls to passive EMF logging is a game-changer. It reduces code complexity, offloads work, and neatly ties your metrics to their originating log events for deep introspection. It’s one of those features that feels too good to be true until you use it, and then you wonder how you ever lived without it.