App Development

In the previous post, we completed a first version of our app, which can spice up your visits to a zoo by identifying the animals you see. If, like me, you took it for a spin at your local zoo, you might have noticed that the app is not perfect. It can sometimes misidentify animals, and it can also fail to assign a label at the correct level of specificity. For example, it might identify a zebra as a horse, or it might not recognize a Labrador as a “canine” or even a “mammal”. Why does this happen? We will explore the answer to this question in this post.

Over the course of the past few posts (see the overview here), we’ve introduced the ZooScan app and developed its UI using SwiftUI. In this fourth part, we will focus on integrating Apple’s Vision framework to classify animals based on images captured by the app.

Creating a Protocol to Define Image Classifiers #

The first step is defining a protocol for our animal classification model. By using a standardized interface, we can easily switch between different models in the future if needed. Here’s how we can define the protocol:
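A minimal sketch of such a protocol is shown below. The names `AnimalClassifier` and `AnimalClassification` are placeholders chosen for this sketch, not necessarily the exact types used in the project:

```swift
import UIKit

/// One classification result: a label (e.g. "zebra") and the model's confidence.
struct AnimalClassification {
    let label: String
    let confidence: Float
}

/// Anything that can turn an image into a ranked list of animal classifications.
/// Hiding the model behind a protocol lets us swap implementations later,
/// e.g. Vision's built-in classifier versus a custom Core ML model.
protocol AnimalClassifier {
    func classify(_ image: UIImage) throws -> [AnimalClassification]
}
```

To illustrate why the protocol is useful, here is a sketch of one possible conformer backed by Vision’s built-in image classifier (`VNClassifyImageRequest`); it assumes a recent SDK where the request exposes typed results, and it is only one way to implement the interface:

```swift
import UIKit
import Vision

/// A sketch of a classifier backed by Vision's built-in image classifier.
struct VisionAnimalClassifier: AnimalClassifier {
    func classify(_ image: UIImage) throws -> [AnimalClassification] {
        guard let cgImage = image.cgImage else { return [] }

        // Run the system-provided classification request on the image.
        let request = VNClassifyImageRequest()
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        try handler.perform([request])

        // Map Vision's observations onto our protocol's result type.
        let observations = request.results ?? []
        return observations.map {
            AnimalClassification(label: $0.identifier, confidence: $0.confidence)
        }
    }
}
```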

In the previous post, we implemented the initial screen and the ImagePicker view. In this post, we will further develop the app. We will create a ViewModel and a ScannedAnimal model, and add the ‘Main’ and ‘Detail’ views. This will allow us to focus on the UI and the app structure before we dive into the machine learning part in later posts. By the way, if you’re looking for an overview of all the posts in this series, you can find them here.
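To give a rough idea of the structures just mentioned, here is a minimal sketch of the `ScannedAnimal` model and a view model for the Main and Detail views. The property names and the name `ZooScanViewModel` are assumptions for this sketch, not the post’s exact code:

```swift
import SwiftUI
import UIKit

/// A scanned animal as the app could store it. `ScannedAnimal` is the name
/// used in the post; the properties here are illustrative.
struct ScannedAnimal: Identifiable {
    let id = UUID()
    let name: String      // e.g. "Zebra"
    let image: UIImage    // the photo picked via the ImagePicker
    let scannedAt: Date
}

/// A view model the Main and Detail views can observe; `@Published` lets
/// SwiftUI refresh the UI whenever a new animal is added.
final class ZooScanViewModel: ObservableObject {
    @Published var scannedAnimals: [ScannedAnimal] = []

    func add(_ animal: ScannedAnimal) {
        scannedAnimals.append(animal)
    }
}
```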

In the previous post, I introduced the ZooScan app idea and shared a demo of the app in action. In this post, we’ll be getting our hands dirty. We will set up the project, create the basic UI, and implement the first steps of the app. By the way, if you’re looking for an overview of all the posts in this series, you can find them here.

To give you a basic idea of what we’ll be doing, here is an animated GIF that shows the app in action.

My son has always been fascinated by animals. We go to the local zoo multiple times a week, and when we’re on holiday, we always make a point to visit local zoos and other animal parks. On one of our holidays in Porto, we visited the local SeaLife. While we were there, their SeaScan app caught my attention. This clever app lets you scan fish and other creatures in the aquarium to instantly learn more about them. That sparked an idea: what if I built a similar app for zoo animals?

Over the past few years, we’ve seen some truly impressive advancements in AI and machine learning. I remember that 20 years ago, when I was studying Artificial Intelligence, even finding faces in images was a major challenge—something that smartphones now do effortlessly in real time. The same goes for speech recognition: once considered nearly impossible, it’s now embedded in tools like Siri, Alexa, Google Assistant, and even real-time translation apps. While voice assistants are still not perfect—as this video humorously shows—it’s clear that AI has come a long way. Today, AI opens the door to exciting new applications that were unimaginable just a few years ago. In this post, we explore key areas of AI and ML that app developers can leverage to create smarter, more engaging experiences. First, let’s look at Image Recognition…