18.7 Aurora Machine Learning Integration: Calling SageMaker from SQL

Right, so you’ve got your data in Aurora. Good for you. It’s safe, it’s probably got decent replication, and you can query it with SQL. But let’s be honest, sometimes the data in the database isn’t the whole story. You want to run it through a machine learning model. The old, painful way was to write a script that SELECTs data, connects to some ML service (or worse, loads a library), runs the prediction, and then UPDATEs the rows. It’s a round-trip nightmare of latency, complexity, and boilerplate code.

Aurora Machine Learning integration is Amazon’s way of saying, “Stop that. You’re being ridiculous.” It lets you call a model hosted on Amazon SageMaker—or even an AWS Lambda function—directly from your SQL query. It’s the kind of feature that feels like a parlor trick at first but quickly becomes indispensable. You’re literally doing a SELECT that includes a call to a live ML endpoint. It’s absurd. I love it.

How It Actually Works (The Clever Bit)

Don’t worry, it’s not actual magic. When you execute a SQL function that calls the ML endpoint, the Aurora query processing layer does something smart. It doesn’t drag the entire result set back to your client application to then make the call. Instead, the Aurora backend itself—the layer that’s already co-located in the same AWS network fabric—makes a massively parallelized set of calls to your SageMaker endpoint. It streams the data from your tables, calls the endpoint with small batches, gets the predictions, and injects the results directly into your query’s result set.

The key here is the “co-located” part. The network hop from an Aurora instance to a SageMaker endpoint in the same region is measured in single-digit milliseconds. This is why doing it from your application code in, say, Ohio would be a terrible idea, but doing it from within Aurora is shockingly performant. It’s shifting the compute to where the data lives, which is almost always the right answer.

The Syntax: It’s Just a Function

They’ve kept the SQL part dead simple. You use one of two functions: aws_sagemaker.invoke_endpoint for… well, invoking a SageMaker endpoint, or aws_lambda.invoke for Lambda functions. The Lambda option is a fantastic escape hatch for when you need to do some custom pre- or post-processing that SageMaker doesn’t handle.

Here’s the basic structure for a SageMaker call. Imagine we have a model that predicts customer churn risk based on their activity data.

SELECT customer_id,
       aws_sagemaker.invoke_endpoint(
           'my-churn-prediction-endpoint',  -- Your SageMaker Endpoint name
           'JSON',                          -- The format for the request
           CONCAT(
             '{"features": ["',
             account_age, '","',
             login_count_last_month, '","',
             support_tickets, '"]}'
           )
        ) as prediction_result
FROM customers
WHERE account_age > 30;

This is the simplest form. You pass the endpoint name, the format (JSON or CSV), and then a string that contains your payload. The biggest “gotcha” right here? That payload. You are responsible for crafting a JSON string that exactly, and I mean exactly, matches the format your SageMaker model expects. If your model wants a flat JSON object with a key called "features" that is an array of numbers, and you send it a string instead, it will fail silently and you’ll get a null back. More on error handling later—it’s a trip.

Shaping Your Data and Managing the Payload

Building that JSON string by hand with CONCAT is a recipe for pain and SQL injection-style problems (even if it’s just “data injection” in this context). You will go insane escaping quotes. Let’s do this properly with JSON_OBJECT and JSON_ARRAY. This is a non-negotiable best practice.

SELECT customer_id,
       aws_sagemaker.invoke_endpoint(
           'my-churn-prediction-endpoint',
           'JSON',
           JSON_OBJECT(
             'features', JSON_ARRAY(
               account_age,
               login_count_last_month,
               support_tickets
             )
           )
        ) as raw_prediction
FROM customers;

See? Clean, readable, and far less prone to catastrophic string-building errors. The JSON_OBJECT and JSON_ARRAY functions ensure the payload is properly formatted JSON. The result, raw_prediction, will come back as a JSON string itself. You’ll likely want to extract a value from it.

SELECT customer_id,
       JSON_UNQUOTE(
         JSON_EXTRACT(
           aws_sagemaker.invoke_endpoint(
             'my-churn-prediction-endpoint',
             'JSON',
             JSON_OBJECT('features', JSON_ARRAY(account_age, login_count_last_month, support_tickets))
           ),
           '$.predicted_label'
         )
       ) as churn_risk -- This might be 'YES' or 'NO'
FROM customers;

The Inevitable Sharp Edges and How to Handle Them

This is cool, but it’s not perfect. You must be aware of the pitfalls.

Cold Starts & Timeouts: If your SageMaker endpoint scales down to zero, the first call from Aurora will suffer a cold start delay. Aurora has a default timeout of 30 seconds for the ML function. If your model hasn’t responded by then, the query fails. For mission-critical stuff, ensure your endpoint is always warm or use a provisioned concurrency setup.
Error Handling is… Basic: If the endpoint returns a 4xx or 5xx error, the function returns NULL. That’s it. No error message in your result set. The only way to know what went wrong is to check your CloudWatch logs for SageMaker. This is the single biggest rough edge. You must have logging enabled on your endpoint.
Cost: You are making potentially thousands of ML calls per query. You pay for each of those SageMaker invocations. A SELECT * on a billion-row table would be financially catastrophic and performance-wise apocalyptic. ALWAYS filter your dataset before invoking the function. Use a WHERE clause. Be surgical. This is not a tool for bulk offline batch processing; it’s for real-time inference on a relevant subset of data.
Data Leakage: This is a subtle one. If you’re filtering based on the result of the ML call, you have to be careful. For example, WHERE churn_risk = 'YES' won’t work because the churn_risk alias isn’t available in the WHERE clause. You have to wrap the whole thing in a subquery.

SELECT *
FROM (
  SELECT customer_id,
         JSON_UNQUOTE(JSON_EXTRACT(aws_sagemaker.invoke_endpoint(...), '$.predicted_label')) as churn_risk
  FROM customers
) AS predictions
WHERE churn_risk = 'YES';

This is a classic SQL problem, but it catches everyone off guard when they mix it with these fancy ML functions.

When To Use This (And When To Run Away)

This integration is brilliant for:

Real-time feature enrichment: Tagging users with a propensity score during a live session.
Anomaly detection: Flagging fraudulent transactions as they’re written to the database.
Dynamic content filtering: Applying a sentiment analysis model to user-generated content in a CMS.

It is categorically the wrong tool for:

Training models. Just no.
Batch processing your entire multi-terabyte table. Your CFO will want to have a word.
Anything requiring complex, multi-step model pipelines. Offload that to a proper SageMaker pipeline or Lambda function and just call the final result from Aurora.

The bottom line? This feature is a powerful tool that blurs the line between your data store and your intelligence layer. Use it wisely, respect the timeouts and costs, and always, always check your CloudWatch logs. Now go INVOKE something.