I took a nap and woke up to my X feed flooded with details about the new model by OpenAI.
The OpenAI o1 series is a new family of large language models, which means there will be some changes in the way you have been prompting GPT-4o or Claude 3.5 Sonnet.
This post documents my exploration of working with o1 and o1-mini for iOS development, and how my prompting style changed to accommodate the new models and squeeze the most out of their thinking and reasoning capabilities.
Discovery requires experimentation.
## Advice on Prompting
OpenAI gave some advice on how to use these new models effectively. Here is what they suggest, in simpler terms:
- Keep it simple: These models are smart and do not need long, complicated instructions. Just tell them what you want, plain and simple.
- Skip the step-by-step stuff: The o1 models think things through on their own. You do not need to tell them to "think step by step" or explain their reasoning anymore.
- Use clear markers: If your input has different parts, use things like delimiters or section headings to separate them. This helps the model understand each part better (see the short example after this list).
- Do not overload with info: When you are giving the model extra information or documents to work with, only include what is really important. Too much info can make the model's response more complicated than it needs to be.
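Putting those four points together, a prompt for these models can stay short and plainly structured. Here is an illustrative sketch (not one of the prompts I actually sent), just to show the shape:

```
CONTEXT:
SwiftUI app, iOS 17. Images arrive as UIImage.

TASK:
Write a method that resizes a UIImage to a given target width while keeping its aspect ratio.

OUTPUT:
Code only, with brief comments.
```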
## Example of Fetching Prominent Colours
I wanted to create a SwiftUI view that could pick out the prominent and unique colors from an image. Here is the instruction I gave to the older GPT-4o model:
```
TASK:
Create a Swift method to identify the most prominent and unique colors in a UIImage, and develop a SwiftUI view to display the resulting color palette.
REQUIREMENTS:
1. Color Extraction Method:
- Input: UIImage
- Output: Array of UIColors representing the most prominent and unique colors
- Use efficient image processing techniques
- Implement color quantization and clustering algorithms
2. SwiftUI Palette View:
- Input: Array of UIColors from the extraction method
- Display colors in a visually appealing grid or list format
- Include color hex values for reference
3. Integration:
- Demonstrate how to use the color extraction method with the SwiftUI view
4. Performance:
- Optimize for speed and memory efficiency
- Handle various image sizes and color depths
5. Error Handling:
- Implement proper error handling for invalid inputs or processing failures
DELIVERABLES:
- Swift code for the color extraction method
- SwiftUI code for the palette display view
- Example usage demonstrating the integration of both components
Provide clear, concise code with brief comments explaining key functionality.
```
And here is the example image by Jason Chen:
Here is the response:
### Step 1: Color Extraction Method
For the color extraction method, we’ll use color quantization and clustering to identify the most prominent colors in a `UIImage`. The K-Means algorithm is a popular method for clustering similar colors together, which can be implemented efficiently using the Core Graphics framework.
```swift
import UIKit
import CoreImage
/// Extracts the most prominent and unique colors from a UIImage
/// - Parameters:
/// - image: The input UIImage to process
/// - numberOfColors: The desired number of prominent colors to extract
/// - Returns: An array of UIColor representing the prominent colors
func extractProminentColors(from image: UIImage, numberOfColors: Int = 5) -> [UIColor]? {
guard let cgImage = image.cgImage else { return nil }
let ciImage = CIImage(cgImage: cgImage)
let context = CIContext()
let inputImage = CIImage(image: image)
// Reduce image size for performance without losing significant color data
let scale = CGAffineTransform(scaleX: 0.1, y: 0.1)
let smallImage = inputImage?.transformed(by: scale)
// Get pixel data
guard let bitmapData = context.createCGImage(smallImage!, from: smallImage!.extent) else { return nil }
let width = bitmapData.width
let height = bitmapData.height
let data = CFDataGetBytePtr(bitmapData.dataProvider!.data)
var colors = [UIColor]()
var uniqueColors = Set<UIColor>()
// Read pixel data
for y in 0..<height {
for x in 0..<width {
let offset = ((width * y) + x) * 4
let red = CGFloat(data![offset]) / 255.0
let green = CGFloat(data![offset + 1]) / 255.0
let blue = CGFloat(data![offset + 2]) / 255.0
let color = UIColor(red: red, green: green, blue: blue, alpha: 1.0)
// Ensure the color is unique enough to avoid duplication
uniqueColors.insert(color)
}
}
// Convert Set to array and limit to `numberOfColors`
colors = Array(uniqueColors).prefix(numberOfColors).map { $0 }
return colors
}
```
### Step 2: SwiftUI Palette View
We’ll now create a SwiftUI view that takes the array of colors returned by the above function and displays them in a grid format, including the hex values for each color.
```swift
import SwiftUI
/// Converts a UIColor to a Hex String representation
func hexString(from color: UIColor) -> String {
var red: CGFloat = 0
var green: CGFloat = 0
var blue: CGFloat = 0
var alpha: CGFloat = 0
color.getRed(&red, green: &green, blue: &blue, alpha: &alpha)
let rgb: Int = (Int(red * 255) << 16) | (Int(green * 255) << 8) | Int(blue * 255)
return String(format: "#%06x", rgb)
}
/// SwiftUI view to display a color palette with hex values
struct ColorPaletteView: View {
let colors: [UIColor]
let columns: [GridItem] = [
GridItem(.flexible(), spacing: 10),
GridItem(.flexible(), spacing: 10)
]
var body: some View {
VStack {
Text("Color Palette")
.font(.title)
.padding()
LazyVGrid(columns: columns, spacing: 10) {
ForEach(colors, id: \.self) { color in
VStack {
Rectangle()
.fill(Color(color))
.frame(height: 100)
.cornerRadius(10)
Text(hexString(from: color))
.font(.caption)
.padding(.top, 5)
}
}
}
.padding()
}
}
}
```
### Step 3: Integration Example
Here’s an example of how to integrate the color extraction method with the SwiftUI palette view.
```swift
import SwiftUI
struct ContentView: View {
@State private var colors: [UIColor] = []
var body: some View {
VStack {
if let image = UIImage(named: "sampleImage") {
// Extract colors from the image
let extractedColors = extractProminentColors(from: image, numberOfColors: 6) ?? []
// Show the extracted colors in the palette view
ColorPaletteView(colors: extractedColors)
} else {
Text("Failed to load image.")
}
}
}
}
@main
struct MyApp: App {
var body: some Scene {
WindowGroup {
ContentView()
}
}
}
```
### Step 4: Performance and Error Handling
- **Performance:**
- The image is scaled down to a smaller size to speed up the color extraction process. This improves performance without significantly affecting the accuracy of color extraction.
- For larger images, we handle memory by reducing the input image size before processing pixel data.
- **Error Handling:**
- The method handles invalid images by returning `nil` in case the `UIImage` cannot be processed.
- The color extraction algorithm uses a Set to ensure the uniqueness of the colors, limiting the number of final colors to the desired count.
### Final Notes:
- This approach focuses on simplicity while providing good performance for small to medium-sized images. For larger image datasets, additional optimization techniques like downscaling or more advanced color clustering algorithms could be applied.
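One thing worth noting before moving on: despite the preamble about K-Means, that code never actually quantizes or clusters anything; it simply collects unique pixel colors into a Set and takes the first few. For contrast, here is a minimal sketch of the simpler channel-bucketing quantization the requirements hint at. This is my own illustration, not part of either model's response, and the function name and defaults are made up for the example.

```swift
import UIKit

/// Illustrative only: bucket each RGB channel into a few levels, count the buckets,
/// and return the most frequent ones. A real implementation would downscale first.
func quantizedColors(in image: UIImage, levelsPerChannel: Int = 4, maxColors: Int = 5) -> [UIColor] {
    guard let cgImage = image.cgImage else { return [] }
    let width = cgImage.width
    let height = cgImage.height
    var pixels = [UInt8](repeating: 0, count: width * height * 4)

    // Render the image into a known RGBA8 buffer so the byte layout is predictable.
    let rendered = pixels.withUnsafeMutableBytes { buffer -> Bool in
        guard let context = CGContext(data: buffer.baseAddress,
                                      width: width, height: height,
                                      bitsPerComponent: 8, bytesPerRow: width * 4,
                                      space: CGColorSpaceCreateDeviceRGB(),
                                      bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue) else {
            return false
        }
        context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
        return true
    }
    guard rendered else { return [] }

    // Count how many pixels fall into each quantized (r, g, b) bucket.
    let step = 256 / levelsPerChannel
    var counts: [Int: Int] = [:]
    for offset in stride(from: 0, to: pixels.count, by: 4) {
        let r = Int(pixels[offset]) / step
        let g = Int(pixels[offset + 1]) / step
        let b = Int(pixels[offset + 2]) / step
        counts[(r * levelsPerChannel + g) * levelsPerChannel + b, default: 0] += 1
    }

    // Map the most frequent buckets back to representative colors (bucket centers).
    let scale = CGFloat(step) / 255.0
    return counts.sorted { $0.value > $1.value }.prefix(maxColors).map { entry -> UIColor in
        let key = entry.key
        let b = key % levelsPerChannel
        let g = (key / levelsPerChannel) % levelsPerChannel
        let r = key / (levelsPerChannel * levelsPerChannel)
        return UIColor(red: (CGFloat(r) + 0.5) * scale,
                       green: (CGFloat(g) + 0.5) * scale,
                       blue: (CGFloat(b) + 0.5) * scale,
                       alpha: 1.0)
    }
}
```

It is deliberately crude; the o1 response below replaces this kind of bucketing with proper k-means clustering.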
Here is the final view:
I gave the same prompt to o1, and here is its much more detailed response:
Here's the Swift code for the color extraction method, the SwiftUI palette display view, and an example usage demonstrating the integration of both components.
### Color Extraction Method
This method processes a `UIImage` to identify the most prominent and unique colors using k-means clustering.
```swift
import UIKit
// MARK: - Color Extraction Method
/// Extracts the most prominent and unique colors from a UIImage.
/// - Parameters:
/// - image: The UIImage to process.
/// - numberOfColors: The number of prominent colors to extract (default is 5).
/// - Returns: An array of UIColors representing the prominent colors.
func extractColors(from image: UIImage, numberOfColors: Int = 5) throws -> [UIColor] {
// Ensure the image has a CGImage
guard let cgImage = image.cgImage else {
throw NSError(domain: "Invalid image", code: 0, userInfo: nil)
}
// Downsample the image to speed up processing
let size = CGSize(width: 100, height: 100 * image.size.height / image.size.width)
UIGraphicsBeginImageContext(size)
image.draw(in: CGRect(origin: .zero, size: size))
guard let resizedImage = UIGraphicsGetImageFromCurrentImageContext() else {
UIGraphicsEndImageContext()
throw NSError(domain: "Failed to resize image", code: 0, userInfo: nil)
}
UIGraphicsEndImageContext()
guard let inputCGImage = resizedImage.cgImage else {
throw NSError(domain: "Invalid resized image", code: 0, userInfo: nil)
}
let width = inputCGImage.width
let height = inputCGImage.height
let bytesPerPixel = 4
let bytesPerRow = bytesPerPixel * width
let bitsPerComponent = 8
// Create a pixel buffer to hold image data
guard let data = calloc(height * width, MemoryLayout<UInt32>.size) else {
throw NSError(domain: "Failed to allocate memory", code: 0, userInfo: nil)
}
defer { free(data) } // Ensure memory is freed
let colorSpace = CGColorSpaceCreateDeviceRGB()
let bitmapInfo = CGImageAlphaInfo.premultipliedLast.rawValue
// Create a CGContext to draw the image into the pixel buffer
guard let context = CGContext(data: data, width: width, height: height,
bitsPerComponent: bitsPerComponent,
bytesPerRow: bytesPerRow,
space: colorSpace,
bitmapInfo: bitmapInfo) else {
throw NSError(domain: "Failed to create CGContext", code: 0, userInfo: nil)
}
context.draw(inputCGImage, in: CGRect(x: 0, y: 0, width: width, height: height))
// Extract pixel data into an array of PixelData
let pixelBuffer = data.bindMemory(to: UInt8.self, capacity: width * height * bytesPerPixel)
var pixelData = [PixelData]()
for y in 0..<height {
for x in 0..<width {
let offset = ((width * y) + x) * bytesPerPixel
let r = pixelBuffer[offset]
let g = pixelBuffer[offset + 1]
let b = pixelBuffer[offset + 2]
// Ignore alpha value
pixelData.append(PixelData(r: Double(r), g: Double(g), b: Double(b)))
}
}
// Run k-means clustering to find prominent colors
let clusters = kMeansCluster(pixels: pixelData, k: numberOfColors)
// Convert cluster centers to UIColors
let colors = clusters.map { cluster -> UIColor in
return UIColor(red: CGFloat(cluster.center.r / 255.0),
green: CGFloat(cluster.center.g / 255.0),
blue: CGFloat(cluster.center.b / 255.0),
alpha: 1.0)
}
return colors
}
// Structure to hold RGB pixel data
struct PixelData {
let r: Double
let g: Double
let b: Double
}
// Structure to represent a cluster in k-means
struct Cluster {
var center: PixelData
var points: [PixelData]
}
// K-means clustering algorithm
func kMeansCluster(pixels: [PixelData], k: Int, maxIterations: Int = 10) -> [Cluster] {
// Initialize clusters with random centers
var clusters = [Cluster]()
for _ in 0..<k {
if let randomPixel = pixels.randomElement() {
clusters.append(Cluster(center: randomPixel, points: []))
}
}
// Iterate to refine clusters
for _ in 0..<maxIterations {
// Clear points in each cluster
for clusterIndex in 0..<clusters.count {
clusters[clusterIndex].points.removeAll()
}
// Assign each pixel to the nearest cluster
for pixel in pixels {
var minDistance = Double.greatestFiniteMagnitude
var closestClusterIndex = 0
for (index, cluster) in clusters.enumerated() {
let distance = euclideanDistance(pixel1: pixel, pixel2: cluster.center)
if distance < minDistance {
minDistance = distance
closestClusterIndex = index
}
}
clusters[closestClusterIndex].points.append(pixel)
}
// Update cluster centers
for clusterIndex in 0..<clusters.count {
let cluster = clusters[clusterIndex]
if cluster.points.isEmpty { continue }
let sum = cluster.points.reduce(PixelData(r: 0, g: 0, b: 0)) { (result, pixel) -> PixelData in
return PixelData(r: result.r + pixel.r, g: result.g + pixel.g, b: result.b + pixel.b)
}
let count = Double(cluster.points.count)
clusters[clusterIndex].center = PixelData(r: sum.r / count, g: sum.g / count, b: sum.b / count)
}
}
return clusters
}
// Calculates Euclidean distance between two pixels
func euclideanDistance(pixel1: PixelData, pixel2: PixelData) -> Double {
let dr = pixel1.r - pixel2.r
let dg = pixel1.g - pixel2.g
let db = pixel1.b - pixel2.b
return sqrt(dr * dr + dg * dg + db * db)
}
```
### SwiftUI Palette Display View
This SwiftUI view displays the extracted colors in a grid format, including their hex values for reference.
```swift
import SwiftUI
// MARK: - SwiftUI Palette View
/// A SwiftUI view to display a color palette.
/// - Displays colors in a grid with their hex values.
struct ColorPaletteView: View {
let colors: [UIColor]
// Define grid columns
let columns = [
GridItem(.flexible()),
GridItem(.flexible())
]
var body: some View {
ScrollView {
LazyVGrid(columns: columns, spacing: 16) {
ForEach(0..<colors.count, id: \.self) { index in
let uiColor = colors[index]
let color = Color(uiColor)
let hex = uiColor.toHexString()
VStack {
Rectangle()
.fill(color)
.frame(height: 100)
.cornerRadius(8)
Text(hex)
.font(.caption)
.foregroundColor(.primary)
}
}
}
.padding()
}
}
}
// Extension to convert UIColor to hex string
extension UIColor {
func toHexString() -> String {
var rFloat: CGFloat = 0
var gFloat: CGFloat = 0
var bFloat: CGFloat = 0
var aFloat: CGFloat = 0
self.getRed(&rFloat, green: &gFloat, blue: &bFloat, alpha: &aFloat)
let rInt = Int(rFloat * 255)
let gInt = Int(gFloat * 255)
let bInt = Int(bFloat * 255)
return String(format: "#%02X%02X%02X", rInt, gInt, bInt)
}
}
```
### Example Usage Demonstrating Integration
This example shows how to integrate the color extraction method with the SwiftUI palette view.
```swift
import SwiftUI
// MARK: - Example Usage
struct ContentView: View {
@State private var colors: [UIColor] = []
@State private var errorMessage: String?
var body: some View {
VStack {
// Load image from assets
if let image = UIImage(named: "exampleImage") {
Image(uiImage: image)
.resizable()
.aspectRatio(contentMode: .fit)
.frame(height: 200)
.padding()
// Button to extract colors
Button(action: {
do {
self.colors = try extractColors(from: image, numberOfColors: 5)
} catch {
self.errorMessage = error.localizedDescription
}
}) {
Text("Extract Colors")
}
.padding()
// Display color palette if colors are available
if !colors.isEmpty {
ColorPaletteView(colors: colors)
}
// Display error message if any
if let errorMessage = errorMessage {
Text("Error: \(errorMessage)")
.foregroundColor(.red)
}
} else {
Text("Image not found")
}
}
}
}
```
### Key Functionality Explained
- **Color Extraction Method (`extractColors`)**: This function downsamples the input image to optimize performance, extracts pixel data, and applies k-means clustering to find the most prominent colors.
- **K-Means Clustering**: The `kMeansCluster` function assigns pixels to clusters and recalculates cluster centers iteratively to find the best representation of the image's color distribution.
- **SwiftUI Palette View (`ColorPaletteView`)**: Displays the extracted colors in a grid using `LazyVGrid`. Each color is shown with a rectangle and its corresponding hex value.
- **Integration in `ContentView`**: Demonstrates loading an image, extracting colors upon a button press, and displaying the results. Error handling is included to manage any issues during processing.
### Performance and Error Handling
- **Performance Optimization**: The image is downsampled to a smaller size before processing to reduce computational load and improve speed.
- **Memory Management**: Allocated memory for pixel data is properly freed after use to prevent memory leaks.
- **Error Handling**: Errors during image processing are caught and presented to the user through the `errorMessage` state variable.
---
By following this implementation, you can efficiently extract prominent colors from any `UIImage` and display them in a SwiftUI view with minimal performance overhead.
I was awestruck. That is some well-written code in 54 seconds. Less than a minute! And here is the SwiftUI view:
I have spent hours over days with Claude 3.5 Sonnet, and here I got a better result with just one prompt, and that too in less than a minute.
Instead of panicking, I am grateful to exist in this new world. As of writing this post, o1 is limited to 30 messages per week, and I have already exhausted 10 of them. My plan is to use it only when both GPT-4o and Claude 3.5 Sonnet cannot get the job done.
I will continue updating this post with more experiments. Stay tuned!