Exploring MLX Swift: Structured Generation with @Generable Macro

Apple's latest framework in iOS 26, Foundation Models, introduced the @Generable macro to help with guaranteed type-safe structured output. You define a schema for the output of the model using the normal Swift structures and conform to the Generable protocol. The model will then generate the exact response that conforms to the schema, so you do not have to parse the response manually.

In the latest release of Foundation Models, the team made these schemas exportable as JSON, which means that you can now use the same approach with MLX Swift VLMs and LLMs as well.

This chapter shows you how to combine Foundation Models' @Generable macro with MLX Swift to get structured responses from your on-device models. You can even use the PartiallyGenerated type to stream the response field by field, and use it in your SwiftUI views!

I will use the Qwen 3 1.7B model as the example to show you how to use the @Generable macro with MLX Swift. I will take the LLMEval example from the official mlx-swift-examples repository and modify it to use the @Generable macro.

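We will also use the /nothink tag to prevent the model from adding reasoning before the JSON response. (Qwen 3's own documentation spells this soft switch /no_think, so try that spelling if /nothink has no effect for you.)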

Unstructured Generation

Let's start with a simple example: creating a task from the user's request, first without asking the model for any particular output format.

An example user input that is my reality right now:

Remind me to clear the diner's credit card bill tomorrow morning. We missed it last time, and had to pay a late fee. Make sure I do not forget this time!

Because we do not have any instructions to generate structured output, the model will generate a simple response:

Sure! I'll make sure you clear the diner's credit card bill by tomorrow morning. I'll check the payment schedule and ensure it's handled properly. Don't want to miss this—let's get it done! 😊

This is unstructured text, which is not what we want. We want the model to generate a response in a particular format, so we can use it in our app, for example, to create a reminder using EventKit.

Prompting for Structured Generation

Let's take a naive approach to prompt the model for structured generation. We will ask the model to generate a response in a particular format, and then we will parse the response manually.

Here is the structured response we want to generate:

struct Reminder: Decodable {
    let title: String
    let priority: String
    let dueDate: String
    let isRecurring: Bool
}

And here is the prompt:

let input = "Remind me to clear the diner's credit card bill tomorrow morning. We missed it last time, and had to pay a late fee. Make sure I do not forget this time!"
 
let instructions = """
You are a helpful assistant that responds with valid JSON data only.
 
Extract task information from the user's request and format it as JSON with these fields:
- title: the main task (string)
- dueDate: date in YYYY-MM-DD format or empty string if not specified
- priority: "low", "medium", or "high" based on urgency
- isRecurring: true if it's a repeating task, false otherwise
 
Respond with JSON only, no other text.
"""

And here is the response:

{
  "title": "Clear diner's credit card bill",
  "dueDate": "2025-09-29",
  "priority": "high",
  "isRecurring": false
}

Then we can decode the response:

// decode(_:from:) throws, so it needs `try`; Data(output.utf8) avoids the force unwrap.
let response = try JSONDecoder().decode(Reminder.self, from: Data(output.utf8))
print(response)

And this correctly decodes the response into the Reminder struct:

Reminder(title: "Clear diner\'s credit card bill", priority: "high", dueDate: "2025-09-29", isRecurring: false)
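Note that dueDate is still a plain string at this point. To schedule a real reminder you would need a Date, so here is a minimal sketch of that conversion, assuming the model kept to the YYYY-MM-DD format requested in the prompt. The parseDueDate helper is a hypothetical name, not part of any framework:

```swift
import Foundation

/// Converts the model's `dueDate` string (YYYY-MM-DD) into a `Date`.
/// Returns nil for an empty or malformed string.
/// `parseDueDate` is a hypothetical helper used only for this illustration.
func parseDueDate(_ string: String) -> Date? {
    let formatter = ISO8601DateFormatter()
    // Accept a calendar date only, with no time component.
    formatter.formatOptions = [.withFullDate]
    return formatter.date(from: string)
}

let due = parseDueDate("2025-09-29")   // midnight UTC on that day
let missing = parseDueDate("")         // nil: the prompt allows an empty string
print(due as Any, missing as Any)
```

Returning an optional keeps the "empty string if not specified" case from the prompt explicit, instead of silently falling back to some default date.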

For a simple example, this works well, but for more complex schemas, you will need to add more parsing logic, and the JSON decoding gets brittle.

Let's take it a bit further and generate a travel itinerary, similar to Apple's Foundation Models example:

struct Itinerary: Codable, Equatable {
    let title: String
    let destinationName: String
    let description: String
    let rationale: String
    let days: [DayPlan]
}
 
struct DayPlan: Codable, Equatable {
    let title: String
    let subtitle: String
    let destination: String
    let activities: [Activity]
}
 
struct Activity: Codable, Equatable {
    let type: Kind
    let title: String
    let description: String
}
 
enum Kind: String, Codable {
    case sightseeing
    case foodAndDining
    case shopping
    case hotelAndLodging
}

And here is the prompt:

let instructions = """
You are a helpful travel assistant that responds with valid JSON data only.
 
Create a detailed 3-day travel itinerary based on the user's request. Format the response as JSON with this exact structure:
 
{
  "title": "string - exciting trip name",
  "destinationName": "string - destination city",
  "description": "string - brief trip overview",
  "rationale": "string - explanation of how this meets the user's requests",
  "days": [
    {
      "title": "string - unique day title",
      "subtitle": "string - day theme",
      "destination": "string - area/district for this day",
      "activities": [
        {
          "type": "sightseeing" | "foodAndDining" | "shopping" | "hotelAndLodging",
          "title": "string - activity name",
          "description": "string - activity details"
        }
      ]
    }
  ]
}
 
Requirements:
- Include exactly 3 days in the "days" array
- Each day must have exactly 3 activities
- Use only these activity types: "sightseeing", "foodAndDining", "shopping", "hotelAndLodging"
- Make titles engaging and descriptions detailed
- Ensure activities match the user's interests
 
Respond with JSON only, no other text.
"""

And here is the response:

{ "title": "Tokyo Cultural Culinary Escape", "destinationName": "Tokyo", "description": "A 3-day cultural trip to Tokyo, focusing on authentic dining experiences, historical sites, and immersive cultural interactions.", "rationale": "This itinerary balances cultural exploration with culinary delights, offering a unique blend of history, food, and authentic experiences in Tokyo.", "days": [ { "title": "Day 1: Cultural Awakening in Shinjuku", "subtitle": "Historical and Cultural Exploration", "destination": "Shinjuku", "activities": [ { "type": "sightseeing", "title": "Shinjuku Gyō-ō Shrine", "description": "A serene and traditional Japanese shrine in the heart of Shinjuku, offering a peaceful and cultural experience." }, { "type": "sightseeing", "title": "Tokyo National Museum", "description": "Explore Japanese history and culture with a focus on artifacts and art from the Edo period." }, { "type": "foodAndDining", "title": "Kamakura Sake Brewery", "description": "Experience traditional Japanese sake-making in a hands-on, authentic setting." } ] }, { "title": "Day 2: Culinary Delights in Asakusa", "subtitle": "Authentic Japanese Dining and Shopping", "destination": "Asakusa", "activities": [ { "type": "sightseeing", "title": "Asakusa Sengen Shrine", "description": "A beautiful and historically significant shrine in Asakusa, known for its traditional architecture and peaceful atmosphere." }, { "type": "foodAndDining", "title": "Sukagawa Soba Nishiki", "description": "A local favorite for authentic Japanese soba noodles and traditional dining in Asakusa." }, { "type": "shopping", "title": "Asakusa Market", "description": "A bustling market where you can experience local Japanese culture and enjoy fresh food and crafts." 
} ] }, { "title": "Day 3: Cultural Reflection and Culinary Fare", "subtitle": "Cultural Reflection and Authentic Dining", "destination": "Shinjuku", "activities": [ { "type": "sightseeing", "title": "Shinjuku Grand Shrine", "description": "A grand and traditional Japanese shrine in Shinjuku, offering a serene and spiritual experience." }, { "type": "foodAndDining", "title": "Hakutaka Sushi", "description": "A local favorite for fresh and authentic sushi, with a traditional Japanese dining experience." }, { "type": "hotelAndLodging", "title": "Shinjuku Hotel", "description": "A modern and comfortable hotel in Shinjuku, offering a perfect end to the trip." } ] } ] }

Parsing succeeded this time, but nothing guarantees it. The model might add extra text before or after the JSON, field names might not match exactly, data types are not enforced (you might get "false" as a string instead of a Boolean), and field constraints are not validated.
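To make two of these failure modes concrete, here is a small sketch using the Reminder type from earlier. The responses are made up for illustration, and extractJSONObject is a hypothetical helper: a heuristic that grabs everything between the first "{" and the last "}", not a real JSON parser.

```swift
import Foundation

struct Reminder: Decodable {
    let title: String
    let priority: String
    let dueDate: String
    let isRecurring: Bool
}

/// Heuristic: returns the text from the first "{" to the last "}", if any.
func extractJSONObject(from text: String) -> String? {
    guard let start = text.firstIndex(of: "{"),
          let end = text.lastIndex(of: "}"),
          start < end else { return nil }
    return String(text[start...end])
}

func decodeReminder(from text: String) throws -> Reminder {
    try JSONDecoder().decode(Reminder.self, from: Data(text.utf8))
}

// Failure mode 1: the model wraps the JSON in friendly prose.
let chatty = """
Sure! Here is your reminder:
{"title": "Pay bill", "priority": "high", "dueDate": "2025-09-29", "isRecurring": false}
Let me know if you need anything else.
"""
let rawAttempt = try? decodeReminder(from: chatty)   // decoding the raw text fails
let cleaned = extractJSONObject(from: chatty).flatMap { try? decodeReminder(from: $0) }
print(rawAttempt == nil, cleaned?.title ?? "no reminder")

// Failure mode 2: the right field with the wrong type.
let wrongType = """
{"title": "Pay bill", "priority": "high", "dueDate": "2025-09-29", "isRecurring": "false"}
"""
do {
    _ = try decodeReminder(from: wrongType)
} catch DecodingError.typeMismatch(_, let context) {
    print("Type mismatch at:", context.codingPath.map(\.stringValue))
} catch {
    print("Other decoding error:", error)
}
```

The extraction trick rescues the first case, but it is exactly the kind of extra parsing logic that piles up, and it does nothing for the second case.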

This naive approach works sometimes, but it is unreliable. You can make it much more reliable by writing out a JSON Schema and including it in the prompt, but that is still something that you will have to do manually:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "title": "Itinerary",
  "description": "A complete travel itinerary with days and activities",
  "properties": {
    "title": {
      "type": "string",
      "description": "An exciting name for the trip"
    },
    "destinationName": {
      "type": "string",
      "description": "The destination must be one of the available landmarks",
      "enum": [
        "Sahara Desert",
        "Serengeti",
        "Deadvlei",
        "Grand Canyon",
        "Niagara Falls",
        "Joshua Tree",
        "Rocky Mountains",
        "Monument Valley",
        "Muir Woods",
        "Amazon Rainforest",
        "Lençóis Maranhenses",
        "Uyuni Salt Flat",
        "White Cliffs of Dover",
        "Alps",
        "Mount Fuji",
        "Wulingyuan",
        "Mount Everest",
        "Great Barrier Reef",
        "South Shetland Islands"
      ]
    },
    "description": {
      "type": "string",
      "description": "A brief trip overview"
    },
    "rationale": {
      "type": "string",
      "description": "An explanation of how the itinerary meets the user's special requests"
    },
    "days": {
      "type": "array",
      "description": "A list of day-by-day plans",
      "minItems": 3,
      "maxItems": 3,
      "items": {
        "$ref": "#/definitions/DayPlan"
      }
    }
  },
  "required": ["title", "destinationName", "description", "rationale", "days"],
  "additionalProperties": false,
  "definitions": {
    "DayPlan": {
      "type": "object",
      "title": "DayPlan",
      "description": "A single day's plan with activities",
      "properties": {
        "title": {
          "type": "string",
          "description": "A unique and exciting title for this day plan"
        },
        "subtitle": {
          "type": "string",
          "description": "A subtitle describing the day's theme"
        },
        "destination": {
          "type": "string",
          "description": "The area or district for this day"
        },
        "activities": {
          "type": "array",
          "description": "List of activities for the day",
          "minItems": 3,
          "maxItems": 3,
          "items": {
            "$ref": "#/definitions/Activity"
          }
        }
      },
      "required": ["title", "subtitle", "destination", "activities"],
      "additionalProperties": false
    },
    "Activity": {
      "type": "object",
      "title": "Activity",
      "description": "A single activity within a day plan",
      "properties": {
        "type": {
          "$ref": "#/definitions/Kind"
        },
        "title": {
          "type": "string",
          "description": "The activity name"
        },
        "description": {
          "type": "string",
          "description": "Detailed description of the activity"
        }
      },
      "required": ["type", "title", "description"],
      "additionalProperties": false
    },
    "Kind": {
      "type": "string",
      "title": "Kind",
      "description": "The type of activity",
      "enum": ["sightseeing", "foodAndDining", "shopping", "hotelAndLodging"]
    }
  }
}

Again, the parsing was successful, but you will have to write and maintain a schema like this by hand for every type you want to generate.
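Keep in mind that a schema in the prompt is only a request. JSONDecoder knows nothing about minItems or maxItems, so you would still have to check those constraints yourself after decoding. A minimal sketch, using trimmed-down copies of the itinerary types and a hypothetical validate function:

```swift
import Foundation

// Trimmed-down versions of the itinerary types, just enough for the check.
struct Activity: Codable { let type: String; let title: String; let description: String }
struct DayPlan: Codable { let title: String; let activities: [Activity] }
struct Itinerary: Codable { let title: String; let days: [DayPlan] }

enum ItineraryValidationError: Error, Equatable {
    case wrongDayCount(Int)
    case wrongActivityCount(day: Int, count: Int)
}

/// Checks the count constraints that JSONDecoder does not enforce.
func validate(_ itinerary: Itinerary, expectedCount: Int = 3) throws {
    guard itinerary.days.count == expectedCount else {
        throw ItineraryValidationError.wrongDayCount(itinerary.days.count)
    }
    for (index, day) in itinerary.days.enumerated() where day.activities.count != expectedCount {
        throw ItineraryValidationError.wrongActivityCount(day: index + 1, count: day.activities.count)
    }
}

// A made-up response that decodes fine but has only one day and no activities.
let shortJSON = #"{"title": "Weekend", "days": [{"title": "Day 1", "activities": []}]}"#
let shortItinerary = try JSONDecoder().decode(Itinerary.self, from: Data(shortJSON.utf8))
do {
    try validate(shortItinerary)
} catch {
    print("Rejected:", error)
}
```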

Let's see how we can use the @Generable macro to solve this problem.

Setting Up @Generable for MLX Swift

The @Generable macro is part of the Foundation Models framework, which requires iOS 26/macOS 26 or later. Add the framework to your project alongside MLX Swift:

import FoundationModels
import MLXLLM

Let us define a structured response for the itinerary. This uses the same schema as the one we defined earlier, but with the @Generable macro:

@available(iOS 26.0, macOS 26.0, *)
@Generable
struct Itinerary: Equatable {
    @Guide(description: "An exciting name for the trip.")
    let title: String
    @Guide(.anyOf(ModelData.landmarkNames))
    let destinationName: String
    let description: String
    @Guide(description: "An explanation of how the itinerary meets the user's special requests.")
    let rationale: String
 
    @Guide(description: "A list of day-by-day plans.")
    @Guide(.count(3))
    let days: [DayPlan]
}
 
@available(iOS 26.0, macOS 26.0, *)
@Generable
struct DayPlan: Equatable {
    @Guide(description: "A unique and exciting title for this day plan.")
    let title: String
    let subtitle: String
    let destination: String
 
    @Guide(.count(3))
    let activities: [Activity]
}
 
@available(iOS 26.0, macOS 26.0, *)
@Generable
struct Activity: Equatable {
    let type: Kind
    let title: String
    let description: String
}
 
@available(iOS 26.0, macOS 26.0, *)
@Generable
enum Kind {
    case sightseeing
    case foodAndDining
    case shopping
    case hotelAndLodging
}

The @Generable macro handles the schema creation, while @Guide attributes help the model understand what you want for each field. Keep descriptions concise but specific.

Creating the JSON Schema Bridge

One thing to understand here is that Foundation Models can export your @Generable types as JSON schemas that any language model can understand. To go the other way and turn the model's JSON back into your type, you can add an extension like this to your structure:

@available(iOS 26.0, macOS 26.0, *)
extension Itinerary {
    /// Create from Foundation Models GeneratedContent
    init(from generatedContent: GeneratedContent) throws {
        self = try Itinerary(generatedContent)
    }
 
    /// Create from JSON string (for MLX responses)
    init(fromJSON json: String) throws {
        let generatedContent = try GeneratedContent(json: json)
        self = try Itinerary(generatedContent)
    }
}

Now, in the instructions, you can directly pass the schema:

let instructions = """
You are a helpful travel assistant that responds with valid JSON data only.
 
Create a detailed 3-day travel itinerary based on the user's request. Format the response as per the given JSON schema:
\(Itinerary.generationSchema)
 
/nothink
"""

The system instruction is important. It instructs the model to output JSON data, not the schema itself. The /nothink tag at the end prevents models like Qwen from adding reasoning before the JSON response.
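If the model emits a <think>…</think> block anyway (for example, an empty one), you can strip it before decoding. This is a defensive sketch under that assumption; strippingThinkBlocks is a hypothetical helper name:

```swift
import Foundation

/// Removes every <think>…</think> block and trims surrounding whitespace.
/// `strippingThinkBlocks` is a hypothetical helper for illustration.
func strippingThinkBlocks(_ text: String) -> String {
    // (?s) lets "." match newlines, so multi-line reasoning is removed too.
    let pattern = "(?s)<think>.*?</think>"
    let stripped = text.replacingOccurrences(
        of: pattern,
        with: "",
        options: [.regularExpression]
    )
    return stripped.trimmingCharacters(in: .whitespacesAndNewlines)
}

let rawOutput = "<think>\n\n</think>\n\n{\"title\": \"Tokyo Escape\"}"
let jsonOnly = strippingThinkBlocks(rawOutput)
print(jsonOnly)
```

Text without any think block passes through unchanged apart from the whitespace trimming, so the helper is safe to apply to every response.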

Generating the Structured Response

And when you generate the response, you can directly decode the response into the Itinerary struct:

let response = try Itinerary(fromJSON: output)
print(response)

And this will give you the structured response in the response variable.

Streaming Structured Responses

You can also stream the response field by field using the PartiallyGenerated struct. This is useful if you want to show the response in a SwiftUI view as it is generated.

In the method where you receive the response from the stream, you can convert the accumulated output to GeneratedContent and then to PartiallyGenerated:

for await batch in stream._throttle(
    for: updateInterval, reducing: Generation.collect)
{
    let chunk = batch.compactMap { $0.chunk }.joined(separator: "")
    if !chunk.isEmpty {
        Task { @MainActor [chunk] in
            // Accumulate the text so far; a single chunk is only a fragment of the JSON.
            self.output += chunk
 
            // Parse the accumulated JSON, not just the latest chunk.
            let content: GeneratedContent = try GeneratedContent(json: self.output)
            self.itinerary = try Itinerary.PartiallyGenerated(content)
            print(self.itinerary)
        }
    }
}

As tokens arrive, the printed value fills in field by field:

LLMEval.Itinerary.PartiallyGenerated(id: FoundationModels.GenerationID(value: "56F0EF4E-8617-4D41-8117-671C6B7C12D5"), title: nil, destinationName: nil, description: nil, rationale: nil, days: nil)
...
LLMEval.Itinerary.PartiallyGenerated(id: FoundationModels.GenerationID(value: "0843AFF9-748E-403B-85AF-8B1437180BB5"), title: Optional("Sahara Desert"), destinationName: nil, description: nil, rationale: nil, days: nil)
...
LLMEval.Itinerary.PartiallyGenerated(id: FoundationModels.GenerationID(value: "0F20F739-3777-4F87-8233-157936D48167"), title: Optional("Sahara Desert Cultural Odyssey"), destinationName: Optional("Sahara Desert"), description: Optional("A 3-day cultural trip through the Sahara Desert, focusing on authentic dining experiences, cultural immersion, and natural wonders of the region."), rationale: Optional(""), days: nil)
...
And so on.

And the final response contains all the fields for the three days of the itinerary. I have omitted it for the sake of brevity. This is the same as the response we got earlier, but it is streamed field by field!

You can use this to show the response in a SwiftUI view, with multiple elements updating as the response is generated.

MLX Structured Generation

However, the above examples still do not guarantee JSON output, because nothing constrains the model while it decodes each token; we only check the result afterwards.
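The idea behind constrained decoding can be shown with a toy example. This is only an illustration of the general technique with a made-up four-token vocabulary, not how any particular library implements it: at each step, tokens that the grammar does not allow get a logit of negative infinity, so they can never be picked.

```swift
/// Toy logit mask: tokens outside `allowed` get -infinity and can never win.
func maskLogits(_ logits: [Float], allowed: Set<Int>) -> [Float] {
    logits.enumerated().map { index, value in
        allowed.contains(index) ? value : -Float.infinity
    }
}

/// Greedy sampling: the index of the largest logit (nil for an empty array).
func greedyPick(_ logits: [Float]) -> Int? {
    logits.indices.max(by: { logits[$0] < logits[$1] })
}

// Made-up vocabulary and scores for the very first token of a response.
let vocabulary = ["Sure", "{", "}", "\""]
let logits: [Float] = [5.0, 2.0, 1.0, 0.5]

// Unconstrained, the model would rather start with "Sure".
let unconstrained = greedyPick(logits)

// A JSON object must start with "{", so the grammar allows only that token.
let constrained = greedyPick(maskLogits(logits, allowed: [1]))

print(vocabulary[unconstrained!], vocabulary[constrained!])
```

A real implementation recomputes the allowed set after every token from the grammar state, which is where the extra cost per token comes from.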

Thanks to Ivan, who created MLX Structured, you can get guaranteed JSON output from your MLX Swift models using constrained decoding. You can find the details in the MLX Structured repository.

I have updated the repository so you can directly use the Generable conforming structures as well. You start by defining the Grammar:

let grammar = try Grammar.schema(generable: Itinerary.self)

To use the above grammar during text generation, you create a logit processor and pass it to TokenIterator:

let configuration = LLMRegistry.qwen3_1_7b_4bit
let context = try await LLMModelFactory.shared.load(configuration: configuration) { _ in }
 
let parameters = GenerateParameters(maxTokens: 2056, temperature: 0.5)
 
let processor = try await GrammarMaskedLogitProcessor.from(configuration: context.configuration, grammar: grammar)
 
let iterator = try TokenIterator(input: input, model: context.model, processor: processor, sampler: parameters.sampler(), maxTokens: parameters.maxTokens)
 
let stream = MLXLMCommon.generate(input: input, context: context, iterator: iterator)

And then you can follow the same pattern as the above examples to get the structured response:

var output = ""
 
for await batch in stream {
    guard let chunk = batch.chunk else { continue }
 
    output += chunk
 
    do {
        let content: GeneratedContent = try GeneratedContent(json: output)
        let partialItinerary = try Itinerary.PartiallyGenerated(content)
 
        await MainActor.run {
            itinerary = partialItinerary
        }
    } catch {
        // Skip this update if the accumulated text cannot be parsed yet.
        continue
    }
}

This will give you the guaranteed streamed structured response in the itinerary variable.

Do note that this does affect the performance of the model, and I have seen around 15% to 25% slower generation times, depending on the complexity of the schema.

Model Selection and Performance

Not all MLX models perform equally with structured generation. In my testing:

Qwen models (especially Qwen 3 series) are excellent at following JSON schemas, even 1.7B! I like it more than Foundation Models.
Llama 3.2 models are also good with structured output when prompted properly.
Gemma models work well but may need more specific prompting, which takes some experimentation.
Smaller models (under 1B parameters) are much less reliable but faster.

And here are some tips for the best results:

Use temperature 0.1-0.3 for more deterministic JSON output.
Provide examples in your system prompt for complex schemas.
Test with different models to find the best fit for your use case.
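For the second tip, one way to keep the example in the prompt in sync with your type is to encode a hand-written value instead of typing the JSON by hand. A minimal sketch with the Reminder type from earlier (made Codable here so it can be encoded); the example request and the exampleJSON helper are made up for illustration:

```swift
import Foundation

struct Reminder: Codable {
    let title: String
    let priority: String
    let dueDate: String
    let isRecurring: Bool
}

/// Renders a hand-written example as JSON so the prompt and the decoder
/// always agree on field names and types.
func exampleJSON(for reminder: Reminder) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    return String(decoding: try encoder.encode(reminder), as: UTF8.self)
}

let example = Reminder(title: "Renew passport", priority: "medium", dueDate: "", isRecurring: false)
let exampleResponse = try exampleJSON(for: example)

let instructions = """
You are a helpful assistant that responds with valid JSON data only.

Example request: "I should renew my passport at some point."
Example response:
\(exampleResponse)

Respond with JSON only, no other text.
"""
print(instructions)
```

If you later rename a field in the struct, the example in the prompt changes with it.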

Moving Forward

While it is unfortunate that structured generation with @Generable is limited to iOS 26+ and macOS 26+, you can still use it as a fallback on devices that do not support Apple Intelligence. You can directly use the same structure to extract information without complex parsing logic.

The combination of Foundation Models' schema system with MLX Swift's models gives you the best of both worlds: developer-friendly APIs and complete control over model selection and deployment.

Try building structured generation into your existing MLX Swift features for iOS 26 or macOS 26. You will find that many use cases become simpler and more reliable when you can count on getting properly typed responses from your models!