43.5 Transforming Data with Zod: .transform() and .refine()
Right, so you’ve got your Zod schemas validating that your data looks the way you expect. Fantastic. But sometimes, just knowing your data is valid isn’t enough. You need to change it. You need to take a string and turn it into a Date object, or you need to ensure that two fields are related in a specific way. This is where Zod moves from being a simple bouncer checking IDs to being a skilled bartender who also mixes your drink. Enter .transform() and .refine(). They’re the dynamic duo for adding logic to your data validation, but they have very different superpowers. Don’t mix them up, or you’ll have a bad time.
The Shape-Shifter: .transform()
Think of .transform() as a pure function. It takes a value, does something to it, and returns a new value. The key here is that it can change the output type of your schema: what goes in may be one type, and what comes out another. This is your go-to for data coercion and normalization.
The most classic, “why isn’t this built-in” example is parsing a string into a Date object.
import { z } from 'zod';
const stringToDateSchema = z.string().transform((str) => new Date(str));
type Output = z.infer<typeof stringToDateSchema>; // Type is Date
// Usage
const result = stringToDateSchema.parse("2023-09-15T12:00:00Z");
console.log(result); // Output: 2023-09-15T12:00:00.000Z (a Date object)
console.log(typeof result); // "object"
This is incredibly powerful. Your input is a raw string from an API or form, but the output of your validation is a proper, useful Date object. The type inference seamlessly updates to reflect this. It’s pure magic. Well, it’s pure functions, which is basically the same thing.
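To make the "one type in, another type out" idea concrete, here is a minimal sketch that puts the two types side by side. The schema and type names (isoStringToDate, DateInput, DateOutput) are made up for illustration; z.input and z.infer are Zod's helpers for a schema's input and output types.

```typescript
import { z } from "zod";

// Hypothetical schema name, mirroring the string-to-Date example above.
const isoStringToDate = z.string().transform((str) => new Date(str));

// z.input is the type the schema accepts; z.infer is the type it produces
// once the transform has run.
type DateInput = z.input<typeof isoStringToDate>; // string
type DateOutput = z.infer<typeof isoStringToDate>; // Date

const raw: DateInput = "2023-09-15T12:00:00Z";
const parsed: DateOutput = isoStringToDate.parse(raw);

console.log(parsed instanceof Date); // true
console.log(parsed.toISOString()); // "2023-09-15T12:00:00.000Z"
```

Keeping both types around is handy when a form library wants the input type while the rest of your code wants the output type.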
Here’s another practical one: trimming whitespace and converting to lowercase on the fly.
const cleanStringSchema = z.string().transform((str) => str.trim().toLowerCase());
const email = cleanStringSchema.parse(" User@EXAMPLE.COM "); // "user@example.com"
Crucial Pitfall Warning: The transformation only runs after the initial validation passes. If you pass stringToDateSchema a number, it will fail on the z.string() check and the transform function will never even get a chance to run. This is good! It means your transform logic can safely assume it’s working with a valid string.
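If you want to see that ordering for yourself, a small sketch like this one works. The counter and schema names are hypothetical, and it uses safeParse so a failure comes back as a result object instead of a thrown error.

```typescript
import { z } from "zod";

// Hypothetical counter: tracks how often the transform callback actually runs.
let transformCalls = 0;

const countedDateSchema = z.string().transform((str) => {
  transformCalls += 1;
  return new Date(str);
});

// A number fails the z.string() check, so the transform is skipped entirely.
const bad = countedDateSchema.safeParse(12345);
console.log(bad.success); // false
console.log(transformCalls); // 0

// A string passes the base check, so the transform runs exactly once.
const good = countedDateSchema.safeParse("2023-09-15T12:00:00Z");
console.log(good.success); // true
console.log(transformCalls); // 1
```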
The Gatekeeper: .refine()
While .transform() changes the data, .refine() just checks it. It’s a supercharged validation step. It doesn’t change the output type; it just imposes additional, custom logic on the value. If the logic doesn’t pass, the validation fails.
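Let’s say you have a password field. It’s a string, but you want to ensure it’s at least 8 characters long. (Zod’s built-in .min(8) already covers this exact case; the custom check below is just the simplest way to see how .refine() works.)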
const passwordSchema = z.string().refine((val) => val.length >= 8, {
  message: "Password must be at least 8 characters long. I'm not asking for much.",
});
type Output = z.infer<typeof passwordSchema>; // Type is still string
passwordSchema.parse("short"); // Throws ZodError: Password must be at least 8 characters...
passwordSchema.parse("longenoughpassword"); // passes, returns "longenoughpassword"
The second argument to .refine() is an object where you can customize the error message and other parameters. Always provide a clear message. Your future self, debugging at 2 AM, will thank you.
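Here is a rough sketch of where that message ends up when you read the error instead of letting it throw. The schema name passwordCheck is made up for this example; safeParse and the issues array are part of Zod's error API.

```typescript
import { z } from "zod";

// Hypothetical schema name; same idea as the password example above.
const passwordCheck = z.string().refine((val) => val.length >= 8, {
  message: "Password must be at least 8 characters long.",
});

// safeParse never throws; on failure the custom message sits on each issue.
const result = passwordCheck.safeParse("short");
const messages = result.success
  ? []
  : result.error.issues.map((issue) => issue.message);

console.log(result.success); // false
console.log(messages); // [ 'Password must be at least 8 characters long.' ]
```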
Where .refine() truly shines is with relational checks. The classic example is validating that a “confirm password” field matches the “password” field. This requires access to the entire data context, which you get using .refine() on an object schema.
const signUpSchema = z.object({
  password: z.string().min(8),
  confirmPassword: z.string(),
}).refine((data) => data.password === data.confirmPassword, {
  message: "Passwords don't match. Were you not paying attention?",
  path: ["confirmPassword"], // This pins the error to the specific field
});
// This will fail, with the error on the 'confirmPassword' field
signUpSchema.parse({ password: "superSecret123", confirmPassword: "superSecret456" });
See that path option? That’s a pro move. It tells Zod where to attach the validation error in the resulting error object. Without it, the error would be at the root level of the object, which is useless for displaying field-specific errors in a UI.
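To see what path actually changes, this sketch runs the same refinement with and without it and prints where each issue lands. All the names here (fields, passwordsMatch, withoutPath, withPath) are hypothetical.

```typescript
import { z } from "zod";

const fields = z.object({
  password: z.string().min(8),
  confirmPassword: z.string(),
});

// Hypothetical helper: the relational check shared by both schemas.
const passwordsMatch = (data: { password: string; confirmPassword: string }) =>
  data.password === data.confirmPassword;

const withoutPath = fields.refine(passwordsMatch, {
  message: "Passwords don't match.",
});
const withPath = fields.refine(passwordsMatch, {
  message: "Passwords don't match.",
  path: ["confirmPassword"],
});

const input = { password: "superSecret123", confirmPassword: "superSecret456" };

const rootResult = withoutPath.safeParse(input);
const fieldResult = withPath.safeParse(input);

// Collect the path of every issue so the difference is visible.
const rootPaths = rootResult.success ? [] : rootResult.error.issues.map((i) => i.path);
const fieldPaths = fieldResult.success ? [] : fieldResult.error.issues.map((i) => i.path);

console.log(rootPaths); // [ [] ]  (issue attached to the object itself)
console.log(fieldPaths); // [ [ 'confirmPassword' ] ]
```

A form library that looks up errors by field name can find the second one, while the first just sits at the root.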
The Critical Difference and When to Use Which
This is the part everyone gets wrong at least once, so pay attention.
- Use .transform() when you want to CHANGE the value and its type. (string -> Date, string -> number, raw input -> cleaned input).
- Use .refine() when you want to CHECK the value without altering it. (password length, number ranges, relational integrity).
The most common mistake is trying to use .refine() to change data. It can’t. It’s a guard, not a factory. If you find yourself doing this:
// ❌ WRONG. DON'T DO THIS.
z.string().refine((val) => new Date(val)); // A Date object is always truthy (even an invalid one), so this check always passes and the output is still a string.
…you actually want .transform().
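If you want proof of how quietly that mistake fails, here is a small sketch; the name wrongSchema is made up. Because .refine() only looks at whether the callback's return value is truthy, the Date it returns is thrown away.

```typescript
import { z } from "zod";

// Hypothetical schema reproducing the mistake: the callback returns a Date,
// which .refine() treats as "truthy, so the check passed".
const wrongSchema = z.string().refine((val) => new Date(val));

// Even garbage input passes, and the output is the original string.
const out = wrongSchema.parse("not a date");

console.log(out); // "not a date"
console.log(typeof out); // "string"
```

No error, no Date, no warning. That is why this one tends to survive until production.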
Conversely, if you need to check the transformed data, you chain them. This is the power move.
const validDateSchema = z.string()
  .transform((str) => new Date(str)) // First, transform to Date
  .refine((date) => !isNaN(date.getTime())); // Then, check if it's a valid date
validDateSchema.parse("not a date"); // Fails on the .refine() step.
The string “not a date” passes the initial z.string() check, gets transformed into an invalid Date object, and then the .refine() catches the problem. This is the correct, robust way to validate a date string.
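In application code you will usually want to handle that failure without a try/catch. The sketch below is one way to do it, assuming a custom error message and the hypothetical name dateFromString; it is the same chain, read through safeParse.

```typescript
import { z } from "zod";

// Hypothetical schema name; transform first, then refine the transformed value.
const dateFromString = z
  .string()
  .transform((str) => new Date(str))
  .refine((date) => !Number.isNaN(date.getTime()), {
    message: "Invalid date string.",
  });

const ok = dateFromString.safeParse("2023-09-15T12:00:00Z");
const notOk = dateFromString.safeParse("not a date");

if (ok.success) {
  // ok.data is typed as Date here, not string.
  console.log(ok.data.getUTCFullYear()); // 2023
}

if (!notOk.success) {
  console.log(notOk.error.issues.map((issue) => issue.message)); // [ 'Invalid date string.' ]
}
```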
So, to recap: .transform() is for conversion, .refine() is for assertion. Chain them like a pro to first mold your data into the right shape, and then ensure it meets all your nuanced criteria. It’s the one-two punch that separates a basic type check from a truly robust data processing pipeline.