Lossy decodable for arrays

August 3, 2019


This article explores a way to decode only the “good” items from an array.

It’s common for apps to decode arrays of data. For example, you may have a feed of user-generated content or a list of items for sale. To get this data, the app will make a network request to some backend API. Then, that API will most likely send the data back as JSON.

Swift gives us a great way to decode such data. Make your model types conform to the Decodable protocol, then use JSONDecoder to build them from the JSON.

Unfortunately, the data isn’t always perfect. If the data in the JSON doesn’t match your model, the decoder will throw an error. And if just one field of a single object or sub-object isn’t right, the entire list is thrown out.

So, what can we do about it? How can we allow the good items to go through and only reject the bad items without rejecting the entire list?

Starting with good data

Let’s look at an example. We’re going to decode a page of messages.

struct Message: Decodable {
    let sender: String
    let subject: String?
    let body: String
}

struct MessagePage: Decodable {
    let page: Int
    let limit: Int
    let items: [Message]
}

Each page contains a list of messages, and each message has a sender, an optional subject, and a body.

Let’s look at some good test data:

let testData = """
{
    "page": 1,
    "limit": 10,
    "items": [
        {
            "sender": "Sender One",
            "subject": null,
            "body": "Body one."
        },
        {
            "sender": "Sender Two",
            "body": "Body two."
        },
        {
            "sender": "Sender Three",
            "subject": "Third subject",
            "body": "Body three."
        }
    ]
}
""".data(using: .utf8)!

Since the subject is optional, it can be null, missing, or contain a valid value and the decoder will easily handle each of those cases.
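To see the three cases side by side, here is a minimal sketch that decodes a bare array of messages and collects only the subjects. The `optionalSamples` and `subjects` names are mine, purely for illustration.

```swift
import Foundation

// Same shape as the Message model above; `subject` is the only optional field.
struct Message: Decodable {
    let sender: String
    let subject: String?
    let body: String
}

// Hypothetical sample data: one explicit null, one missing key, one real value.
let optionalSamples = """
[
    { "sender": "A", "subject": null, "body": "explicit null" },
    { "sender": "B", "body": "key missing" },
    { "sender": "C", "subject": "Hello", "body": "value present" }
]
""".data(using: .utf8)!

// The synthesized initializer treats a null value and a missing key the same way
// for an optional property: both become nil.
let subjects = try! JSONDecoder().decode([Message].self, from: optionalSamples).map { $0.subject }
print(subjects) // [nil, nil, Optional("Hello")]
```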

Finally, we can test decoding this data with something like this:

let jsonDecoder: JSONDecoder = {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    return decoder
}()

do {
    let posts = try jsonDecoder.decode(MessagePage.self, from: testData)
    posts.items.forEach { print($0) }
} catch {
    print(error)
}

So far, there’s no issue. We tested the expected good data and everything is properly decoded. Ship it! Unfortunately, the real world is full of poorly constructed data. Our assumptions may be false.

Dealing with bad data

What if we were working with this data instead:

let testData = """
{
    "page": 1,
    "limit": 10,
    "items": [
        {
            "sender": "Sender One",
            "subject": null,
            "body": "Body one."
        },
        {
            "sender": "Sender Two",
            "body": "Body two."
        },
        {
            "subject": "Third subject",
            "body": "Body three."
        }
    ]
}
""".data(using: .utf8)!

Now we don’t get any items. Instead, we get an error telling us that our third item is missing a value for “sender”.
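If you want to see exactly where the decoder gave up, you can catch the specific DecodingError case and inspect its coding path. This is only a sketch with a trimmed-down payload; `oneBadItem` and `failureDescription` are names I made up for the example.

```swift
import Foundation

struct Message: Decodable {
    let sender: String
    let subject: String?
    let body: String
}

// Hypothetical payload: the second item has no "sender".
let oneBadItem = """
[
    { "sender": "Sender One", "body": "Body one." },
    { "subject": "Third subject", "body": "Body three." }
]
""".data(using: .utf8)!

var failureDescription = "decoded without error"
do {
    _ = try JSONDecoder().decode([Message].self, from: oneBadItem)
} catch DecodingError.keyNotFound(let key, let context) {
    // codingPath records where in the JSON the decoder was when it failed.
    let path = context.codingPath.map { $0.stringValue }
    failureDescription = "missing key '\(key.stringValue)' at \(path)"
} catch {
    failureDescription = "other error: \(error)"
}
print(failureDescription)
```

Note that the first, perfectly valid item is lost as well: the array decode is all or nothing.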

We could go back to our model and make sender optional. In some cases, it might make sense, but what does it mean if a message has no sender? This data may be required by our UI or even by another API that the app is using. Besides, if we make every field optional, we might as well have used the older JSONSerialization instead.

One option would be to manually implement the initializer and skip over any bad items by decoding them into a dummy object.

Custom Decodable

struct Dummy: Decodable { }

struct MessagePage: Decodable {
    let page: Int
    let limit: Int
    let items: [Message]
    
    enum CodingKeys: CodingKey {
        case page, limit, items
    }
    
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        page = try container.decode(Int.self, forKey: .page)
        limit = try container.decode(Int.self, forKey: .limit)
        var items = [Message]()
        var itemsContainer = try container.nestedUnkeyedContainer(forKey: .items)
        while !itemsContainer.isAtEnd {
            do {
                let item = try itemsContainer.decode(Message.self)
                items.append(item)
            } catch {
                _ = try? itemsContainer.decode(Dummy.self)
            }
        }
        self.items = items
    }
}

Now we get the good items in the array and allow the bad items to drop off.

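We need the dummy object because the unkeyed container doesn’t advance to the next element when a decode fails; without it, the loop would keep retrying the same bad item. We can abstract that away with a failable decodable wrapper whose own decode always succeeds and simply holds nil when the wrapped element can’t be decoded.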
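You can observe the stuck index directly. The following sketch uses a hypothetical `IndexProbe` type (not part of the solution) that records the container’s `currentIndex` before and after a decode that is guaranteed to fail.

```swift
import Foundation

// Hypothetical probe type, for illustration only.
struct IndexProbe: Decodable {
    let indexBeforeFailure: Int
    let indexAfterFailure: Int

    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        indexBeforeFailure = container.currentIndex
        // The first element is a string, so decoding it as Int throws.
        _ = try? container.decode(Int.self)
        indexAfterFailure = container.currentIndex
    }
}

let probeData = "[\"not a number\", 2]".data(using: .utf8)!
let probe = try! JSONDecoder().decode(IndexProbe.self, from: probeData)
print(probe.indexBeforeFailure, probe.indexAfterFailure)
```

With Foundation’s JSONDecoder, I’d expect both numbers to be the same, which is why the failed element has to be consumed some other way.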

Failable Decodable

struct FailableDecodable<Element: Decodable>: Decodable {
    var element: Element?
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        element = try? container.decode(Element.self)
    }
}
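Before wiring it into MessagePage, here is a small sketch of the wrapper on its own, using Int elements to keep it short. The `mixed`, `wrapped` and `numbers` names are just for this example.

```swift
import Foundation

struct FailableDecodable<Element: Decodable>: Decodable {
    var element: Element?
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // A failed decode becomes nil instead of an error.
        element = try? container.decode(Element.self)
    }
}

// Hypothetical input: the middle value is a string, not an Int.
let mixed = "[1, \"two\", 3]".data(using: .utf8)!

// Every element decodes "successfully", so the array as a whole never throws
// because of a bad item.
let wrapped = try! JSONDecoder().decode([FailableDecodable<Int>].self, from: mixed)
let numbers = wrapped.compactMap { $0.element }
print(numbers) // [1, 3]
```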

This allows us to remove the Dummy object and rewrite the initializer as follows:

init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    page = try container.decode(Int.self, forKey: .page)
    limit = try container.decode(Int.self, forKey: .limit)
    var items = [Message]()
    var itemsContainer = try container.nestedUnkeyedContainer(forKey: .items)
    while !itemsContainer.isAtEnd {
        if let item = try itemsContainer.decode(FailableDecodable<Message>.self).element {
            items.append(item)
        }
    }
    self.items = items
}

This isn’t scalable yet. It’s a lot of boilerplate to write any time you have an array whose items could fail. Let’s fix that.

Lossy Decodable Array

struct LossyDecodableArray<Element: Decodable>: Decodable {
    let elements: [Element]

    init(from decoder: Decoder) throws {
        var elements = [Element?]()
        var container = try decoder.unkeyedContainer()
        while !container.isAtEnd {
            let item = try container.decode(FailableDecodable<Element>.self).element
            elements.append(item)
        }
        self.elements = elements.compactMap { $0 }
    }
}

Now we can greatly simplify the MessagePage object.

struct MessagePage: Decodable {
    let page: Int
    let limit: Int
    let items: LossyDecodableArray<Message>
}
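Here is a self-contained sketch that runs the simplified model against a payload with one bad item. The types are condensed copies of the ones above; `payload`, `missingLimit` and the result names are mine.

```swift
import Foundation

struct FailableDecodable<Element: Decodable>: Decodable {
    var element: Element?
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        element = try? container.decode(Element.self)
    }
}

struct LossyDecodableArray<Element: Decodable>: Decodable {
    let elements: [Element]
    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        var elements = [Element]()
        while !container.isAtEnd {
            if let item = try container.decode(FailableDecodable<Element>.self).element {
                elements.append(item)
            }
        }
        self.elements = elements
    }
}

struct Message: Decodable {
    let sender: String
    let subject: String?
    let body: String
}

struct MessagePage: Decodable {
    let page: Int
    let limit: Int
    let items: LossyDecodableArray<Message>
}

// Hypothetical payload: the last item has no "sender".
let payload = """
{ "page": 1, "limit": 10, "items": [
    { "sender": "Sender One", "subject": null, "body": "Body one." },
    { "sender": "Sender Two", "body": "Body two." },
    { "subject": "Third subject", "body": "Body three." }
] }
""".data(using: .utf8)!

let messagePage = try! JSONDecoder().decode(MessagePage.self, from: payload)
let senders = messagePage.items.elements.map { $0.sender }
print(senders) // ["Sender One", "Sender Two"]

// Only the array is lossy. A missing top-level field still fails the whole decode.
let missingLimit = "{ \"page\": 1, \"items\": [] }".data(using: .utf8)!
let strictResult = try? JSONDecoder().decode(MessagePage.self, from: missingLimit)
print(strictResult == nil) // true
```

Keeping page and limit strict seems like the right trade-off to me: a broken envelope is a real error, while a single broken item usually isn’t.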

Looking pretty good. The thing I don’t like is that you have to access the messages using items.elements. So let’s fix it.

extension LossyDecodableArray: RandomAccessCollection {
    var startIndex: Int { return elements.startIndex }
    var endIndex: Int { return elements.endIndex }
    
    subscript(_ index: Int) -> Element {
        return elements[index]
    }
}

Now we can access the elements as we did originally and everything is working nicely.
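As a quick check, this sketch decodes a lossy array of Ints and uses it like an ordinary collection. It repeats the types from above so it can run on its own; `values` and `doubled` are illustrative names.

```swift
import Foundation

struct FailableDecodable<Element: Decodable>: Decodable {
    var element: Element?
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        element = try? container.decode(Element.self)
    }
}

struct LossyDecodableArray<Element: Decodable>: Decodable {
    let elements: [Element]
    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        var elements = [Element]()
        while !container.isAtEnd {
            if let item = try container.decode(FailableDecodable<Element>.self).element {
                elements.append(item)
            }
        }
        self.elements = elements
    }
}

extension LossyDecodableArray: RandomAccessCollection {
    var startIndex: Int { return elements.startIndex }
    var endIndex: Int { return elements.endIndex }

    subscript(_ index: Int) -> Element {
        return elements[index]
    }
}

// Hypothetical input with one element that is not an Int.
let values = try! JSONDecoder().decode(
    LossyDecodableArray<Int>.self,
    from: "[1, \"x\", 3]".data(using: .utf8)!
)

// count, subscripting, for-in and map all come from the collection conformance.
let doubled = values.map { $0 * 2 }
for value in values {
    print(value)
}
print(values.count, values[1], doubled) // 2 3 [2, 6]
```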

Conclusion

I like using Decodable models to represent objects returned by backend APIs. Unfortunately, the data isn’t always perfect, and it can be challenging to find a clean solution using Decodable. However, with some persistence and the right abstractions, we can create scalable solutions. I really like how this case turned out, and I hope you do too.

I’d love to hear your feedback, questions or thoughts. Find me on Twitter @kenboreham

🔥 Thanks for reading! 👍


