Right, let’s talk about Keras APIs. You’ve probably seen the Sequential model. It’s the one they show you in the “Hello, World!” of deep learning tutorials because it’s dead simple. You basically stack layers like a very boring, very predictable Lego tower.

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),  # First layer declares the input shape via `input_shape`
    Dense(32, activation='relu'),
    Dense(10, activation='softmax')  # Output layer for 10-class classification
])

You hand the constructor a list of layers (or call model.add() once per layer), and boom, you’re done. It’s fantastic for quick prototypes, simple feedforward networks, and when you’re feeling intellectually lazy (we all have those days). But here’s the thing it can’t do: anything interesting. The moment you need to fork your data, merge two branches, have multiple inputs (like image AND text), or multiple outputs (predicting a category AND a bounding box), the Sequential API throws its hands up and says, “Not my department, pal.”
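For reference, here is roughly what the model.add() style looks like for the same three-layer network. Treat it as a minimal sketch, assuming a TensorFlow 2.x install; declaring the input with Input(shape=...) instead of input_shape is a choice made for this sketch, not something the example above requires.

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense

# Start empty and grow the stack one layer at a time.
model = Sequential()
model.add(Input(shape=(784,)))              # declares the input shape up front
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))  # 10-class output

model.summary()
```

Both styles produce the same straight-line model; the list form is just less typing.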

That’s where the Functional API comes in, and it’s the workhorse you should actually be using for most of your work. It treats your model as a directed acyclic graph (DAG) of layers. Instead of just stacking them, you explicitly define how they connect. You start with an Input tensor, and you manipulate it, passing it from layer to layer.

Why the Functional API is Your New Best Friend

The Functional API isn’t just for complex architectures; it forces you to think clearly about the data flowing through your model. You’re no longer just stacking abstract layers; you’re defining a computational graph. This clarity is a godsend when you inevitably have to debug a dimension mismatch at 2 AM.

Let’s build the same model as before, but functionally. Notice how we’re explicitly creating the input and then passing the x tensor through each layer.

from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input

# Define the input tensor. Shape is (batch_size, 784). We omit the batch size.
inputs = Input(shape=(784,))

# The magic: Pass the tensor to a layer, get a new tensor back.
x = Dense(64, activation='relu')(inputs)
x = Dense(32, activation='relu')(x)
outputs = Dense(10, activation='softmax')(x)

# The key step: Instantiate the Model by specifying its inputs and outputs.
model = Model(inputs=inputs, outputs=outputs)

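The beauty is in that syntax: layer_instance(tensor). It looks weird at first, but you get used to it. The layer is a Python object, and calling it invokes that object’s __call__ method. When the argument is a symbolic tensor that came from Input, no real data flows yet: the call records the connection in the graph and hands back a new symbolic tensor. This is the core concept.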
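If you want to see what a layer call actually hands back, the following sketch may help. It assumes TensorFlow 2.x and NumPy; the variable names (dense, batch, symbolic_shape, concrete_shape) are made up for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Dense, Input

inputs = Input(shape=(784,))
dense = Dense(64, activation='relu')

# Called on a symbolic tensor: the layer creates its weights and returns
# another symbolic tensor. The batch dimension is still unknown (None).
x = dense(inputs)
symbolic_shape = tuple(x.shape)

# Called on real data: the very same layer object runs the computation.
batch = np.zeros((2, 784), dtype='float32')
y = dense(batch)
concrete_shape = tuple(y.shape)

print(symbolic_shape, concrete_shape)
```

Printing shapes like this as you wire up the graph is a cheap way to catch a mismatch before it turns into a 2 AM problem.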

Building Real, Non-Linear Architectures

This is where the Functional API sings. Let’s build a classic: a model with a skip connection, also known as a residual block. This is trivial with Functional and impossible with Sequential.

from tensorflow.keras import Model
from tensorflow.keras.layers import Add, Conv2D, Input

# Input for an image, say 32x32 RGB
input_img = Input(shape=(32, 32, 3))

# First convolutional block
x = Conv2D(32, 3, activation='relu', padding='same')(input_img)
x = Conv2D(32, 3, activation='relu', padding='same')(x)

# Here's the skip connection: we Add() the original input to the transformed output.
# But wait! The input has 3 channels, and 'x' has 32. They can't be added.
# This is a classic pitfall. We need a projection.
skip = Conv2D(32, 1, padding='same')(input_img)  # 1x1 conv to match dimensions

# Now we can add them.
output = Add()([skip, x])  # Note: The Add layer takes a *list* of tensors.

model = Model(inputs=input_img, outputs=output)

See that? We had to use a 1x1 convolution to make the dimensions match before the Add operation. This is the kind of gritty detail the Functional API makes you confront, which is actually a good thing. A mismatched Add fails loudly while you’re building the graph, not quietly somewhere in the middle of training.
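Once you’ve written one residual block, you’ll want to stack them, and then the projection is only needed when the channel count actually changes. Here is one way that could look. The residual_block helper is hypothetical (it is not a Keras function), and the sketch assumes TensorFlow 2.x.

```python
from tensorflow.keras import Model
from tensorflow.keras.layers import Add, Conv2D, Input


def residual_block(x, filters):
    """Two 3x3 convolutions plus a skip connection (hypothetical helper)."""
    shortcut = x
    y = Conv2D(filters, 3, activation='relu', padding='same')(x)
    y = Conv2D(filters, 3, activation='relu', padding='same')(y)
    # Project the shortcut with a 1x1 conv only if the channel counts differ.
    if x.shape[-1] != filters:
        shortcut = Conv2D(filters, 1, padding='same')(x)
    return Add()([shortcut, y])


input_img = Input(shape=(32, 32, 3))
x = residual_block(input_img, 32)  # 3 -> 32 channels: needs the projection
x = residual_block(x, 32)          # 32 -> 32 channels: plain identity skip
model = Model(inputs=input_img, outputs=x)
```

The second block adds the input straight back in, which is the textbook identity skip; the first one is the projected variant from the example above.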

Multi-Input and Multi-Output Mayhem

This is the Functional API’s killer feature. Let’s say you’re building a model that takes both a user’s profile data (a vector) and their recent activity (a time series). Two inputs, one prediction.

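from tensorflow.keras import Model
from tensorflow.keras.layers import Concatenate, Dense, Input, LSTM

# Input 1: User profile data (e.g., 10 features)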
profile_input = Input(shape=(10,), name='profile_data')
# Input 2: Recent activity (e.g., 20 time steps, 5 features each)
activity_input = Input(shape=(20, 5), name='activity_data')

# Process each input with its own sub-network
processed_profile = Dense(16, activation='relu')(profile_input)

# Use an RNN or 1D Conv for the time series
processed_activity = LSTM(16)(activity_input)

# Now, concatenate the two processed streams
merged = Concatenate()([processed_profile, processed_activity])

# Final prediction
output = Dense(1, activation='sigmoid', name='prediction')(merged)

# Build the model by declaring BOTH inputs.
model = Model(inputs=[profile_input, activity_input], outputs=output)

When you train this model, you’ll feed your data as a list of two arrays, in the same order as the inputs list you gave to Model: model.fit([X_profile, X_activity], y, ...). The explicit naming (name='profile_data') is a best practice that saves your sanity when you’re looking at model summaries or using TensorBoard.
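To make the feeding part concrete, here is a small end-to-end sketch. The data is random noise standing in for real features (so the model learns nothing useful), the optimizer, loss, and batch size are arbitrary picks, and it assumes TensorFlow 2.x and NumPy. It also shows that named inputs let you pass a dict instead of a list.

```python
import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.layers import Concatenate, Dense, Input, LSTM

# Same two-input architecture as above, rebuilt so this runs on its own.
profile_input = Input(shape=(10,), name='profile_data')
activity_input = Input(shape=(20, 5), name='activity_data')
merged = Concatenate()([
    Dense(16, activation='relu')(profile_input),
    LSTM(16)(activity_input),
])
output = Dense(1, activation='sigmoid', name='prediction')(merged)
model = Model(inputs=[profile_input, activity_input], outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')

# Synthetic stand-in data: 64 users, random features, random 0/1 labels.
rng = np.random.default_rng(0)
X_profile = rng.normal(size=(64, 10)).astype('float32')
X_activity = rng.normal(size=(64, 20, 5)).astype('float32')
y = rng.integers(0, 2, size=(64, 1)).astype('float32')

# Option 1: a list, in the same order as the Model's inputs.
model.fit([X_profile, X_activity], y, epochs=1, batch_size=16, verbose=0)

# Option 2: a dict keyed by the Input names, so order no longer matters.
model.fit({'profile_data': X_profile, 'activity_data': X_activity},
          y, epochs=1, batch_size=16, verbose=0)

preds = model.predict([X_profile, X_activity], verbose=0)
```

The dict form is a little more typing, but it is hard to swap two inputs by accident when each one is labeled.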

The One Weird Trick: Getting Intermediate Outputs

A huge debugging and feature-engineering advantage of the Functional API is that any tensor in the graph is accessible. Your model is a DAG, and you can create a new Model that uses the same input but has a different output. Want to see what the activations look like after the second dense layer? No problem.

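# 'model' is our original Functional model from the first example
# (rebuild it first if you ran the later examples, which reuse the name 'model').
# We define a new model that shares the input but outputs the intermediate value.
# model.layers[0] is the InputLayer, so index 2 is the second Dense layer.
feature_extractor = Model(inputs=model.inputs, outputs=model.layers[2].output)

# Now, we can use this to get the features for any input.
# 'some_data' stands for any array of shape (n_samples, 784).
intermediate_features = feature_extractor.predict(some_data)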

This is invaluable for debugging, for creating embeddings, or for using a pre-trained network as a feature extractor for another task. The Sequential API can do this too, but it’s clunkier and less intuitive.
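Counting layer indices gets old fast, so one alternative is to name the layers and look them up with get_layer. The sketch below assumes TensorFlow 2.x and NumPy; the layer names (hidden_1, hidden_2, class_probs) and the random some_data array are invented for illustration.

```python
import numpy as np
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input

# The first Functional model again, this time with named layers.
inputs = Input(shape=(784,))
x = Dense(64, activation='relu', name='hidden_1')(inputs)
x = Dense(32, activation='relu', name='hidden_2')(x)
outputs = Dense(10, activation='softmax', name='class_probs')(x)
model = Model(inputs=inputs, outputs=outputs)

# Look the layer up by name instead of by position in model.layers.
feature_extractor = Model(
    inputs=model.inputs,
    outputs=model.get_layer('hidden_2').output,
)

# Random stand-in for real inputs: 4 samples, 784 features each.
some_data = np.random.rand(4, 784).astype('float32')
intermediate_features = feature_extractor.predict(some_data, verbose=0)
print(intermediate_features.shape)
```

The extractor shares its weights with the original model, so anything you train in one shows up in the other.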

The bottom line: start with the Sequential API if you’re absolutely sure your network is a straight line. For everything else—which is practically everything in modern deep learning—default to the Functional API. It gives you the clarity and flexibility you need without any real performance cost. It’s the serious practitioner’s choice.