Exploring Runway ML: Generating Video from Image with SwiftUI
Turning a single image into a dynamic video is a lot of fun, and thanks to Runway ML's API, it is now only a few HTTP requests away.
In this tutorial, you will learn how to create a SwiftUI app that uploads an image, sends it to Runway ML to generate a video, and displays the result—all while following Apple's Human Interface Guidelines for a seamless user experience.
What Is Runway ML?
Runway ML is a platform that brings state-of-the-art machine learning models to developers and creators. It offers an API that allows you to integrate advanced generative models into your applications effortlessly.
Overview of the Process
1. Upload an Image: The user selects an image to upload.
2. Start the Generation Task: The app sends a request to Runway ML's API to generate a video from the image.
3. Poll for Task Status: Every five seconds, the app checks the status of the generation task.
4. Display the Video: Once the task succeeds, the app retrieves and displays the video.
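To make this flow concrete, here is one way you could model these phases as app state. The GenerationPhase enum and its case names are illustrative only, not part of Runway ML's SDK:

import Foundation

/// Illustrative state machine for the four phases above.
enum GenerationPhase {
    case idle                        // waiting for the user to pick an image
    case uploading                   // uploading the image to obtain an HTTPS URL
    case generating(taskID: String)  // task created; polling Runway ML for status
    case finished(videoURL: URL)     // task succeeded; video ready to play
}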
Understanding the API Request
Before diving into code, let's break down the API request you'll be making.
Endpoint
POST https://api.runwayml.com/v1/image_to_video
Required Headers
• Authorization: Bearer token with your API key.
• X-Runway-Version: Must be set to 2024-09-13.
Request Body Parameters
• promptImage (string): An HTTPS URL pointing to the image you want to use. The image must be a JPEG, PNG, or WebP file under 16MB.
• model (string): Specifies the model variant to use. Accepted value: "gen3a_turbo".
• seed (integer): An optional number between 0 and 999,999,999. If not specified, a random seed is chosen.
• promptText (string): Optional text up to 512 characters describing what should appear in the output.
• watermark (boolean): Whether the output video should contain a Runway watermark. Default is false.
• duration (integer): The length of the output video in seconds. Accepted values are 5 or 10. Default is 10.
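To keep these parameters in one place, you can model the request body as an Encodable struct. This is a sketch whose field names simply mirror the parameters above, not an official Runway ML type:

import Foundation

/// Mirrors the request body described above. Sketch only; not an official Runway ML type.
struct ImageToVideoRequest: Encodable {
    let promptImage: String        // HTTPS URL to a JPEG, PNG, or WebP under 16MB
    let model: String              // currently "gen3a_turbo"
    var seed: Int? = nil           // optional, 0...999_999_999
    var promptText: String? = nil  // optional, up to 512 characters
    var watermark: Bool = false
    var duration: Int = 10         // 5 or 10 seconds
}

You would then produce the request body with try JSONEncoder().encode(...); Swift's synthesized Encodable conformance omits the optional fields when they are nil.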
Building the SwiftUI App
Let's start coding the SwiftUI app that brings this functionality to life.
Setting Up the User Interface
We'll create a simple interface where the user can select an image and initiate the video generation process.
import SwiftUI
import AVKit

struct ContentView: View {
    @State private var selectedImage: UIImage?
    @State private var videoURL: URL?
    @State private var isProcessing = false

    var body: some View {
        VStack {
            if let image = selectedImage {
                ZStack {
                    Image(uiImage: image)
                        .resizable()
                        .scaledToFit()

                    if isProcessing {
                        ProgressView("Generating Video...")
                            .progressViewStyle(CircularProgressViewStyle())
                            .background(Color.black.opacity(0.7))
                            .foregroundColor(.white)
                    }
                }
            } else {
                Text("Tap to select an image")
                    .foregroundColor(.blue)
                    .onTapGesture {
                        // Code to select image
                    }
            }

            if let url = videoURL {
                VideoPlayer(player: AVPlayer(url: url))
                    .frame(height: 300)
            }
        }
        .padding()
    }
}

Allowing Image Selection
You'll need to implement image selection using UIImagePickerController.
// Add this to ContentView
import UIKit

extension ContentView {
    func selectImage() {
        // Implement image picker code here
    }
}
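If you want a concrete starting point, here is one possible UIViewControllerRepresentable wrapper around UIImagePickerController. The ImagePicker name is illustrative, and you could equally use PhotosUI's PhotosPicker on newer OS versions:

import SwiftUI
import UIKit

/// One possible wrapper around UIImagePickerController. Illustrative sketch.
struct ImagePicker: UIViewControllerRepresentable {
    @Binding var selectedImage: UIImage?
    @Environment(\.dismiss) private var dismiss

    func makeUIViewController(context: Context) -> UIImagePickerController {
        let picker = UIImagePickerController()
        picker.delegate = context.coordinator
        return picker
    }

    func updateUIViewController(_ uiViewController: UIImagePickerController, context: Context) {}

    func makeCoordinator() -> Coordinator { Coordinator(self) }

    final class Coordinator: NSObject, UIImagePickerControllerDelegate, UINavigationControllerDelegate {
        let parent: ImagePicker
        init(_ parent: ImagePicker) { self.parent = parent }

        func imagePickerController(_ picker: UIImagePickerController,
                                   didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey: Any]) {
            parent.selectedImage = info[.originalImage] as? UIImage
            parent.dismiss()
        }
    }
}

You can then present this from the tap gesture using a .sheet bound to a @State flag.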
Initiating the Generation Task
Once the user selects an image, upload it to a server to get an HTTPS URL (since promptImage requires a URL). Alternatively, you can use a placeholder URL for testing.
func startGeneration(with imageURL: String) {
    let url = URL(string: "https://api.runwayml.com/v1/image_to_video")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.addValue("Bearer YOUR_API_KEY", forHTTPHeaderField: "Authorization")
    request.addValue("2024-09-13", forHTTPHeaderField: "X-Runway-Version")
    request.addValue("application/json", forHTTPHeaderField: "Content-Type")

    let body: [String: Any] = [
        "promptImage": imageURL,
        "model": "gen3a_turbo",
        "watermark": false,
        "duration": 5
    ]
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)

    isProcessing = true

    URLSession.shared.dataTask(with: request) { data, response, error in
        guard let data = data else { return }
        if let taskResponse = try? JSONDecoder().decode(TaskResponse.self, from: data) {
            // Timers need a run loop, so hop back to the main queue before polling.
            DispatchQueue.main.async {
                pollTaskStatus(taskID: taskResponse.id)
            }
        }
    }.resume()
}
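The decoding above assumes two small Decodable models. Their shape is inferred from the fields this tutorial reads (id, status, output), so treat them as a sketch rather than Runway ML's full response schema:

import Foundation

/// Minimal models for the fields this tutorial reads. Sketch; not the full API schema.
struct TaskResponse: Decodable {
    let id: String          // identifier of the newly created task
}

struct StatusResponse: Decodable {
    let status: String      // e.g. "PENDING", "RUNNING", "SUCCEEDED", "FAILED"
    let output: [String]?   // output URLs, present once the task succeeds
}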
Polling for Task Status
Create a function to check the status every five seconds.
func pollTaskStatus(taskID: String) {
    let url = URL(string: "https://api.runwayml.com/v1/tasks/\(taskID)")!
    var request = URLRequest(url: url)
    request.addValue("Bearer YOUR_API_KEY", forHTTPHeaderField: "Authorization")
    request.addValue("2024-09-13", forHTTPHeaderField: "X-Runway-Version")

    Timer.scheduledTimer(withTimeInterval: 5.0, repeats: true) { timer in
        URLSession.shared.dataTask(with: request) { data, response, error in
            guard let data = data else { return }
            if let statusResponse = try? JSONDecoder().decode(StatusResponse.self, from: data) {
                if statusResponse.status == "SUCCEEDED",
                   let output = statusResponse.output?.first {
                    DispatchQueue.main.async {
                        self.isProcessing = false
                        self.videoURL = URL(string: output)
                    }
                    timer.invalidate()
                } else if statusResponse.status == "FAILED" {
                    // Stop polling on failure so the timer doesn't run forever.
                    DispatchQueue.main.async { self.isProcessing = false }
                    timer.invalidate()
                }
            }
        }.resume()
    }
}
Displaying the Video
Once the video is ready, the VideoPlayer view will automatically display it.
Wrapping Up
By integrating Runway ML's API with SwiftUI, you've created an app that transforms images into videos using cutting-edge machine learning models. This project not only demonstrates the power of Runway ML but also shows how SwiftUI makes building interactive UIs straightforward and enjoyable.
Final Thoughts
Feel free to expand this app by adding features like custom prompts, seed values, or duration options. The combination of Runway ML and SwiftUI opens up endless possibilities for creative applications.
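As a starting point, adding those features is mostly a matter of including the optional parameters from earlier in the request body. This sketch assumes the same startGeneration structure shown above, and the prompt text is just an example:

// Illustrative: the same body dictionary with the optional parameters filled in.
let body: [String: Any] = [
    "promptImage": imageURL,
    "model": "gen3a_turbo",
    "promptText": "Shibuya Tokyo crossing neon lights", // up to 512 characters
    "seed": 42,                                         // 0...999_999_999 for repeatable output
    "watermark": false,
    "duration": 10                                      // 5 or 10 seconds
]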
Note: Always ensure you handle API keys securely and comply with Runway ML’s usage policies.
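One common approach, sketched here, is to keep the key out of source control and read it at runtime. The RunwayAPIKey name is made up for this example:

// Illustrative: read the API key from Info.plist instead of hardcoding it.
// "RunwayAPIKey" is a hypothetical key name for this sketch.
func runwayAPIKey() -> String? {
    Bundle.main.object(forInfoDictionaryKey: "RunwayAPIKey") as? String
}

For production, consider the Keychain or a server-side proxy so the key never ships in the app bundle.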