
Exploring Runway ML: Generating Video from Image with SwiftUI

Turning a single image into a dynamic video is fun, and thanks to Runway ML's API, it is now possible in your own apps.

"Got access to the @runwayml API, exploring image to video generation! Love how the billboard updated too!! (Prompt: Shibuya Tokyo crossing neon lights) pic.twitter.com/M8MUXw0vWg" — Rudrank Riyam (@rudrankriyam), October 3, 2024

In this tutorial, you will learn how to create a SwiftUI app that uploads an image, sends it to Runway ML to generate a video, and displays the result—all while following Apple's Human Interface Guidelines for a seamless user experience.

What Is Runway ML?

Runway ML is a platform that brings state-of-the-art machine learning models to developers and creators. It offers an API that allows you to integrate advanced generative models into your applications effortlessly.

Overview of the Process

  1. Upload an Image: The user selects an image to upload.

  2. Start the Generation Task: The app sends a request to Runway ML's API to generate a video from the image.

  3. Poll for Task Status: Every five seconds, the app checks the status of the generation task.

  4. Display the Video: Once the task succeeds, the app retrieves and displays the video.

Understanding the API Request

Before diving into code, let's break down the API request you'll be making.

Endpoint

POST https://api.runwayml.com/v1/image_to_video

Required Headers

Authorization: Bearer token with your API key.

X-Runway-Version: Must be set to 2024-09-13.

Request Body Parameters

promptImage (string): An HTTPS URL pointing to the image you want to use. The image must be a JPEG, PNG, or WebP file under 16MB.

model (string): Specifies the model variant to use. Accepted value: "gen3a_turbo".

seed (integer): An optional number between 0 and 999,999,999. If not specified, a random seed is chosen.

promptText (string): Optional text up to 512 characters describing what should appear in the output.

watermark (boolean): Whether the output video should contain a Runway watermark. Default is false.

duration (integer): The length of the output video in seconds. Accepted values are 5 or 10. Default is 10.
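Putting these parameters together, here is a minimal sketch of a request body. The image URL and prompt text are placeholders, and the comments restate the constraints from the parameter list above:

```swift
import Foundation

// Minimal request body for the image_to_video endpoint, built from the
// parameters documented above. The URL and prompt here are placeholders.
let body: [String: Any] = [
    "promptImage": "https://example.com/input.jpg",      // HTTPS URL to a JPEG/PNG/WebP under 16MB
    "model": "gen3a_turbo",                              // the only accepted value
    "promptText": "Shibuya Tokyo crossing neon lights",  // optional, up to 512 characters
    "seed": 42,                                          // optional, 0 to 999,999,999
    "watermark": false,
    "duration": 5                                        // 5 or 10 seconds
]

let jsonData = try! JSONSerialization.data(withJSONObject: body)
print(String(data: jsonData, encoding: .utf8)!)
```

Omitting seed lets the API pick a random one, and omitting promptText is valid as well, since both are optional.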

Building the SwiftUI App

Let's start coding the SwiftUI app that brings this functionality to life.

Setting Up the User Interface

We'll create a simple interface where the user can select an image and initiate the video generation process.

import SwiftUI
import AVKit
 
struct ContentView: View {
    @State private var selectedImage: UIImage?
    @State private var videoURL: URL?
    @State private var isProcessing = false
 
    var body: some View {
        VStack {
            if let image = selectedImage {
                ZStack {
                    Image(uiImage: image)
                        .resizable()
                        .scaledToFit()
 
                    if isProcessing {
                        ProgressView("Generating Video...")
                            .progressViewStyle(CircularProgressViewStyle())
                            .background(Color.black.opacity(0.7))
                            .foregroundColor(.white)
                    }
                }
            } else {
                Text("Tap to select an image")
                    .foregroundColor(.blue)
                    .onTapGesture {
                        // Code to select image
                    }
            }
 
            if let url = videoURL {
                VideoPlayer(player: AVPlayer(url: url))
                    .frame(height: 300)
            }
        }
        .padding()
    }
}

Allowing Image Selection

You'll need to implement image selection using UIImagePickerController.

// Add this to ContentView
import UIKit
 
extension ContentView {
    func selectImage() {
        // Implement image picker code here
    }
}
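If you'd rather stay entirely in SwiftUI, here is a sketch using PhotosPicker (iOS 16+) instead of UIImagePickerController. The view name and bindings are illustrative, not part of the API:

```swift
import SwiftUI
import PhotosUI

// A SwiftUI-native alternative to UIImagePickerController (iOS 16+).
struct ImagePicker: View {
    @State private var pickerItem: PhotosPickerItem?
    @Binding var selectedImage: UIImage?

    var body: some View {
        PhotosPicker("Tap to select an image", selection: $pickerItem, matching: .images)
            .onChange(of: pickerItem) { item in
                Task {
                    // Load the picked item as raw data, then decode it into a UIImage.
                    if let data = try? await item?.loadTransferable(type: Data.self),
                       let image = UIImage(data: data) {
                        selectedImage = image
                    }
                }
            }
    }
}
```

You would bind selectedImage to the matching @State property in ContentView.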

Initiating the Generation Task

Once the user selects an image, upload it to a server to obtain an HTTPS URL (promptImage requires one). Alternatively, you can use a placeholder URL for testing.
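Because promptImage must reference an image under 16MB, it is worth validating the encoded image data before uploading it anywhere. A small helper (the limit comes from the API requirements above; the function name is mine):

```swift
import Foundation

// promptImage must point to a JPEG, PNG, or WebP under 16MB,
// so check the encoded image data before uploading it.
func isWithinRunwayImageLimit(_ data: Data) -> Bool {
    data.count < 16 * 1024 * 1024
}
```

Run this on the JPEG or PNG data you are about to upload, and re-encode at a lower quality if it fails.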

func startGeneration(with imageURL: String) {
    let url = URL(string: "https://api.runwayml.com/v1/image_to_video")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.addValue("Bearer YOUR_API_KEY", forHTTPHeaderField: "Authorization")
    request.addValue("2024-09-13", forHTTPHeaderField: "X-Runway-Version")
    request.addValue("application/json", forHTTPHeaderField: "Content-Type")
 
    let body: [String: Any] = [
        "promptImage": imageURL,
        "model": "gen3a_turbo",
        "watermark": false,
        "duration": 5
    ]
 
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)
    isProcessing = true
 
    URLSession.shared.dataTask(with: request) { data, response, error in
        guard let data = data,
              let taskResponse = try? JSONDecoder().decode(TaskResponse.self, from: data) else {
            // Reset the spinner if the request fails or the response can't be decoded.
            DispatchQueue.main.async { self.isProcessing = false }
            return
        }
        pollTaskStatus(taskID: taskResponse.id)
    }.resume()
}

Polling for Task Status

Create a function to check the status every five seconds.

func pollTaskStatus(taskID: String) {
    let url = URL(string: "https://api.runwayml.com/v1/tasks/\(taskID)")!
    var request = URLRequest(url: url)
    request.addValue("Bearer YOUR_API_KEY", forHTTPHeaderField: "Authorization")
    request.addValue("2024-09-13", forHTTPHeaderField: "X-Runway-Version")
 
    // Schedule the timer on the main run loop: URLSession callbacks arrive on a
    // background queue, where a scheduled timer would never fire.
    DispatchQueue.main.async {
        Timer.scheduledTimer(withTimeInterval: 5.0, repeats: true) { timer in
            URLSession.shared.dataTask(with: request) { data, response, error in
                guard let data = data,
                      let statusResponse = try? JSONDecoder().decode(StatusResponse.self, from: data) else { return }
 
                if statusResponse.status == "SUCCEEDED" {
                    DispatchQueue.main.async {
                        self.isProcessing = false
                        if let videoString = statusResponse.output.first {
                            self.videoURL = URL(string: videoString)
                        }
                    }
                    timer.invalidate()
                }
            }.resume()
        }
    }
}
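The two functions above decode TaskResponse and StatusResponse, which haven't been defined yet. Here is a minimal sketch of those models; the field names are assumptions based on what this tutorial reads from the responses, and the real payloads contain additional metadata:

```swift
import Foundation

// Minimal models covering only the fields this tutorial reads.
struct TaskResponse: Decodable {
    let id: String
}

struct StatusResponse: Decodable {
    let id: String
    let status: String   // e.g. "SUCCEEDED"; other values indicate pending or failed tasks
    let output: [String] // video URLs, present once the task succeeds
}
```

Because output is declared non-optional, in-progress responses simply fail to decode and the poll loop keeps waiting until the task succeeds.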

Displaying the Video

Once the video is ready, the VideoPlayer view will automatically display it.

Wrapping Up

By integrating Runway ML's API with SwiftUI, you've created an app that transforms images into videos using cutting-edge machine learning models. This project not only demonstrates the power of Runway ML but also shows how SwiftUI makes building interactive UIs straightforward and enjoyable.

Final Thoughts

Feel free to expand this app by adding features like custom prompts, seed values, or duration options. The combination of Runway ML and SwiftUI opens up endless possibilities for creative applications.

Note: Always ensure you handle API keys securely and comply with Runway ML’s usage policies.
