The Power of Machine Learning: How I Built an App to Identify Dog Breeds with SwiftUI
Did you know that there are over 340 different dog breeds recognized worldwide, 200 of which are officially recognized by the American Kennel Club? From tiny Chihuahuas to majestic Great Danes, from popular breeds like Labrador Retrievers and German Shepherds to lesser-known breeds like the Xoloitzcuintli and the Lagotto Romagnolo, the diversity within the world of dog breeds is truly astounding. But what if I told you that technology could help us identify these breeds with astonishing accuracy? In this article, I’ll walk you through how Machine Learning and SwiftUI make that possible.
I. Introduction
Machine Learning used to be such a big and terrifying term for me. Like anyone else living in 2023, I believe Artificial Intelligence is a very interesting and amazing thing to explore. On the other hand, I used to be scared of it and avoided studying anything related to AI, especially when I needed to choose courses to fulfill my university credit requirements or pick a specialization before entering my fourth semester of college, since AI was well known for its difficulty.
For context, I’m currently enrolled as a Learner at Apple Developer Academy @BINUS. On May 16th, 2023, we started our new challenge, the second Nano Challenge. Unlike the previous two challenges, this time we needed to finish the challenge individually. We were challenged to explore a technology that would be selected randomly. Believe it or not, Machine Learning was the technology that came up when I opened the paper scroll that was given to our study group.
It wasn’t until I discussed it with my friends and mentors that I finally discovered Machine Learning wasn’t that terrifying and that this challenge was very doable.
II. Machine Learning and SwiftUI
According to the Oxford English Dictionary, Machine Learning is the use and development of computer systems that are able to learn and adapt without following explicit instructions, by using algorithms and statistical models to analyze and draw inferences from patterns in data. In essence, Machine Learning learns patterns from a given dataset and uses them to make predictions on new inputs.
According to developer.apple.com, SwiftUI is a framework that helps you build great-looking apps across all Apple platforms using the power of the Swift programming language. Using Apple’s latest machine learning technologies such as Core ML, the Machine Learning APIs, and Create ML, we can build, train, and deploy machine learning models in our iOS applications.
III. The Journey
I have built a strong bond with my dogs over the past three years. Along the way, I became increasingly curious about dog breeds around the world. Having encountered a lot of dogs of different breeds, I found it challenging to keep up with the breed names and their physical characteristics. When I realized that classifying images was a part of Machine Learning, I decided to explore it and create an app to identify dog breeds.
I started my journey by creating a new project in Create ML using the Image Classification template.
After trying out several datasets, parameters, and iteration counts, and spending more than 10 hours in total training the model, the best result I could achieve is shown in the picture below. I used the Stanford Dogs dataset from Kaggle (https://www.kaggle.com/datasets/jessicali9530/stanford-dogs-dataset), which contains over 20,000 images across 120 classes.
On the Evaluation tab, I could see that my model confuses several similar-looking breeds, such as the Siberian Husky and the Alaskan Malamute. On the positive side, it performs really well at identifying other dogs, including my own.
“The Husky and Malamute misidentification isn’t a big problem,” I thought, as I moved on to creating a new SwiftUI project in Xcode and developing the app. It wasn’t long until I realized that the Husky and Malamute problem was not the only issue I was facing. Image Classification essentially classifies a whole image based on its features. During training, the model learns the features of each image to work out the characteristics of each class. It has no notion of where the dog is in the image, or whether there is one at all; it simply picks among the classes it was trained on. That is why, on several images, my model mistakenly labels an image without any dog in it as a certain breed.
I came up with some alternative solutions. The first was to filter my model’s classification results by confidence: for example, the model has to be at least 50% confident before a result is accepted as a dog breed. However, I found cases where I submitted a selfie of a human friend and my model was 62% confident that it was a Newfoundland dog, which is a pretty high confidence level and would pass that filter.
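Here is a minimal sketch of that first idea in plain Swift. The `Prediction` type, the `confidentPredictions` function, and the second sample value are hypothetical and only for illustration, not code from the app; the 50% threshold and the 62% Newfoundland case come from the paragraph above.

```swift
// Hypothetical type: one label with the model's confidence (0.0 to 1.0).
struct Prediction {
    let label: String
    let confidence: Double
}

// Keep only predictions at or above the threshold (assumed to be 50% here).
func confidentPredictions(_ predictions: [Prediction],
                          threshold: Double = 0.5) -> [Prediction] {
    return predictions.filter { $0.confidence >= threshold }
}

// Illustrative classifier output for a selfie with no dog in it.
let selfie = [
    Prediction(label: "Newfoundland", confidence: 0.62),
    Prediction(label: "Chow", confidence: 0.11),
]

// The 62% Newfoundland result still passes the filter, so a threshold
// alone cannot tell "not a dog" apart from "a dog".
let kept = confidentPredictions(selfie)
print(kept.map { $0.label })
// prints ["Newfoundland"]
```

Raising the threshold would reject this particular selfie, but it would also start rejecting real dogs that the model is less sure about, such as Huskies and Malamutes.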
The other solution was to create a second Machine Learning model dedicated to Object Detection. In Object Detection, the model is trained to locate objects in specific areas of an image, along with their width, height, and coordinates. Although it takes more time, I chose to go with this solution. I used another dataset from Roboflow (https://universe.roboflow.com/thanh-vu-tomosia-com/dog-m2zka/dataset/1), containing images and their annotations. Unlike Image Classification, an Object Detection model requires annotation files in JSON format, which teach the model where the dogs are located in the given images.
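To give a feel for what such an annotation file holds, here is a small Swift sketch that decodes one entry. The layout follows my understanding of the JSON that Create ML accepts for object detection (one entry per image, each with labeled boxes whose `x` and `y` mark the box center in pixels). The type names, the `decodeAnnotations` helper, the file name, and the numbers are all made up for illustration and are not taken from the Roboflow dataset.

```swift
import Foundation

// One entry per image: the file name plus its labeled bounding boxes.
struct ImageAnnotation: Codable {
    let image: String
    let annotations: [BoxAnnotation]
}

struct BoxAnnotation: Codable {
    let label: String
    let coordinates: Coordinates
}

// Assumption: x and y are the center of the box, measured in pixels.
struct Coordinates: Codable {
    let x: Double
    let y: Double
    let width: Double
    let height: Double
}

// Hypothetical helper that decodes an annotation file's contents.
func decodeAnnotations(_ json: String) throws -> [ImageAnnotation] {
    return try JSONDecoder().decode([ImageAnnotation].self,
                                    from: Data(json.utf8))
}

// Made-up sample entry; file name and numbers are illustrative only.
let sampleJSON = """
[
  {
    "image": "dog-001.jpg",
    "annotations": [
      {
        "label": "dog",
        "coordinates": { "x": 160, "y": 120, "width": 200, "height": 180 }
      }
    ]
  }
]
"""

do {
    for entry in try decodeAnnotations(sampleJSON) {
        for box in entry.annotations {
            print("\(entry.image): \(box.label), \(box.coordinates.width) x \(box.coordinates.height)")
        }
    }
} catch {
    print("Could not decode annotations: \(error)")
}
```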
Now, all I have to do is run the input image through my Object Detection model first to check whether there is a dog in the picture. If there is, the app goes on to process the image with my previously trained Image Classification model.
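The two-step flow above can be sketched in plain Swift as follows. This is not the app's actual code: the types, the `identifyBreed` function, the `"dog"` label, and the 0.5 detection cut-off are assumptions, and the two models are represented by closures so that the sketch runs without Core ML.

```swift
// Hypothetical result types for the two stages.
struct Detection {
    let label: String
    let confidence: Double
}

struct BreedPrediction {
    let breed: String
    let confidence: Double
}

enum IdentificationResult {
    case noDogFound
    case breeds([BreedPrediction])
}

// Stage 1 decides whether there is a dog; stage 2 runs only if there is.
// Both stages are passed in as closures, standing in for the two models.
func identifyBreed<Image>(
    in image: Image,
    detect: (Image) -> [Detection],
    classify: (Image) -> [BreedPrediction],
    minimumDetectionConfidence: Double = 0.5  // assumed cut-off
) -> IdentificationResult {
    let foundDog = detect(image).contains {
        $0.label == "dog" && $0.confidence >= minimumDetectionConfidence
    }
    guard foundDog else { return .noDogFound }
    return .breeds(classify(image))
}

// Stub stages for demonstration; "images" are just file names here.
let stubDetect: (String) -> [Detection] = { image in
    image.hasPrefix("dog") ? [Detection(label: "dog", confidence: 0.9)] : []
}
let stubClassify: (String) -> [BreedPrediction] = { _ in
    [BreedPrediction(breed: "Siberian Husky", confidence: 0.8)]
}

let selfieResult = identifyBreed(in: "selfie.jpg",
                                 detect: stubDetect, classify: stubClassify)
let dogResult = identifyBreed(in: "dog-photo.jpg",
                              detect: stubDetect, classify: stubClassify)
```

With this arrangement the selfie never reaches the classifier, so it can no longer come back as a confident Newfoundland.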
IV. Unleashing the App
After days of investigation and development, I finally finished my app with the following main features:
- Capture an image directly or choose from the user’s library
- Show up to 3 classification results, starting from the highest confidence
- Save the results to the user’s personal collection (implemented using Core Data)
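As a rough illustration of the second feature in the list above, picking up to 3 results ordered by confidence could look like this in plain Swift. The `BreedResult` type, the `topResults` function, and the sample values are hypothetical, not taken from the app.

```swift
// Hypothetical type for one classification result.
struct BreedResult {
    let breed: String
    let confidence: Double
}

// Sort by confidence, highest first, and keep at most `limit` results.
func topResults(_ results: [BreedResult], limit: Int = 3) -> [BreedResult] {
    return Array(results.sorted { $0.confidence > $1.confidence }.prefix(limit))
}

// Illustrative, unsorted classifier output.
let raw = [
    BreedResult(breed: "Alaskan Malamute", confidence: 0.31),
    BreedResult(breed: "Siberian Husky", confidence: 0.55),
    BreedResult(breed: "Samoyed", confidence: 0.04),
    BreedResult(breed: "Eskimo Dog", confidence: 0.08),
]

let top = topResults(raw)
print(top.map { $0.breed })
// prints ["Siberian Husky", "Alaskan Malamute", "Eskimo Dog"]
```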
Application screenshots:
V. Conclusion
Machine Learning wasn’t that terrifying after all, especially with Apple’s technologies. Although I’ve only learned the “easy” part of Machine Learning, which is using Create ML, this experience has encouraged me to explore more about Machine Learning in the future. I’m quite satisfied with the final result of my app, even though it may not be perfect and still needs a lot of improvement. Because of that, I look forward to receiving any feedback on this app.
The source code is available on GitHub:
http://github.com/je-von/jebreed-v1
You can also try the app yourself using TestFlight: