I recently wanted to use some very large numbers in Swift and quickly came across some limitations.

**Int**

My first implementation started out naively with Int (Int64) type. This is, after all, the easiest way to use integers.

The limitations are also easy to understand.

```
print(Int64.max) // 2^63 = 9223372036854775807
print(UInt64.max) // 2^64 = 18446744073709551615
```

and when you try to go beyond that you get an overflow error

```
// set value = Int.max + 1
let value = 9223372036854775808 // error: integer literal '9223372036854775808' overflows when stored into 'Int'
```

Those are some fairly large numbers. Large enough that you likely don’t ever think about it for most day to day development.

What about those times when you do need a bigger number?

**Float**

Floating point types can hold much larger values. The usual suspects Float, and Double are 32 and 64 bits respectively.

```
print(Float32.greatestFiniteMagnitude) // 3.40282e+38
print(Float64.greatestFiniteMagnitude) // 1.79769313486232e+308
```

So even a 32bit float can hold a number with nearly twice the number of digits as a 64 bit Integer. The reason a Float can hold much larger numbers is because a portion of those bits represents an exponent, which means the numbers, to some extent, increase exponentially.

These large numbers come at the cost of precision. As the value get larger, the space between values get larger. At small numbers you can have a lot of precision. This is probably how most Floating point numbers are used (at least that is my common use case). However, at very large numbers the precision gets quite bad.

```
print(Float64.ulpOfOne) // 2.22044604925031e-16
print(Float64.greatestFiniteMagnitude.ulp) // 1.99584030953472e+292
```

As you can see, if we care about not losing Integer precision then we can’t actually use the full range of Float values. So what’s the largest number that we can use before we start losing numbers? It’s going to be two to the power of the number of bits in the significand (including the implicit bit). Let’s try it out.

```
// Float range for ulp = 1
print(Float.significandBitCount) // 23
print(pow(2, Float.significandBitCount)) // 2^23 = 8388608
print(Float(8_388_607).ulp) // 0.5
print(Float(8_388_608).ulp) // 1.0
print(pow(2, Float.significandBitCount+1)) // 2^24 = 16777216
print(Float(16_777_215).ulp) // 1.0
print(Float(16_777_216).ulp) // 2.0
print(Float(16_777_217)) // 16777216 <- lost precision
```

So even though we can get much larger numbers, `16_777_216`

is the last number we can reliably use without the possibility of losing some data.

For 64 bit float

```
print(Float64.significandBitCount) // 52
print(pow(2, Float64.significandBitCount+1)) // 2^53 = 9007199254740992
print(Float64(9007199254740991).ulp) // 1.0
print(Float64(9007199254740992).ulp) // 2.0
print(String(format: "%.0f", Float64(9007199254740993))) // 9007199254740992 <- lost precision
```

So the maximum Float64 value we can use without losing precision is 9,007,199,254,740,992 vs the UInt64 18,446,744,073,709,551,615. Just as you might have suspected, the Int formats are pretty good at holding Integer values.

There is one more Float type available in Swift that might help.

```
print(Float80.significandBitCount) // 63
print(pow(2, Float80.significandBitCount+1)) // 2^64 = 18446744073709551616
let f80: Float80 = 1.8446744073709551615e19
print(f80.ulp) // 1.0
print((f80+1).ulp) // 2.0
print(f80+2) // 1.84467440737095516e+19 <- lost precision
```

Float80 is an extended precision format. The first thing you might notice about this type is that its storage size is not a power of 2. This type is not meant for storage, it’s a format that can be used to reduce roundoff and overflow error while performing intermediate computations with 64-bit floats. A requirement for doing this is that the significand is 64 bits (63 + 1 implicit bit). Therefore, it can hold all of the values of a UInt64 without losing precision.

Note: This is actually double the range of the UInt64 since Float80 can also use all of the negative numbers with the same magnitude.

But how can we get more than 64 bits to hold Integers?

**Decimal**

Decimal is another type, available in Swift, that holds Base 10 numbers. This is our last hope for built in types. Let’s look at the definition.

```
public struct Decimal {
public var _exponent: Int32
public var _length: UInt32 // length == 0 && isNegative -> NaN
public var _isNegative: UInt32
public var _isCompact: UInt32
public var _reserved: UInt32
public var _mantissa: (UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16)
public init()
public init(_exponent: Int32, _length: UInt32, _isNegative: UInt32, _isCompact: UInt32, _reserved: UInt32, _mantissa: (UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16, UInt16))
}
```

You can see that Decimal has some similar components to a Float. Length, isCompact and reserved are specific to Decimal. The other components are exponent, isNegative, and mantissa which are the components of a float type.

isNegative is just a flag and is represented with 1 bit in float formats.

exponent here is signed 32 bit Int. Float32 has 8 bits and Float64 has 11, both with an implicit offset for negative values.

mantissa (an alternate word for significand) is the part we’re most interested in. The Decimal type represents this as a tuple of 8 UInt16. What this really comes down to is that the significand can hold 8 * 16 = 128 bits! That’s double the amount of bits of a UInt64 and it can represent negative numbers as well.

In actuality, the full 128 bit range does not have ulp <= 1.0. It seems to break down between 2^127+2^126 and 2^128. I haven’t figured out why, so if you know, I’d love to hear about it.

Still if you need to deal in really big numbers with very high precision, Decimal is the key. You should still use Int64 (Int) and Float64 (Double) on 64 bit platforms wherever possible to maximize performance but know that Decimal is standing by when you need it.

How do you go beyond 128 bits? Read my follow up here.

Beyond 128-bit integers

DecimalFloatFloat32Float64Float80floating pointIntInt64Integermantissasignificandulp
## How to store large integer values without losing precision

December 23, 2017

General Coding, Reference

Comments Off on How to store large integer values without losing precision

Ken Boreham

I recently wanted to use some very large numbers in Swift and quickly came across some limitations.

IntMy first implementation started out naively with Int (Int64) type. This is, after all, the easiest way to use integers.

The limitations are also easy to understand.

and when you try to go beyond that you get an overflow error

Those are some fairly large numbers. Large enough that you likely don’t ever think about it for most day to day development.

What about those times when you do need a bigger number?

FloatFloating point types can hold much larger values. The usual suspects Float, and Double are 32 and 64 bits respectively.

So even a 32bit float can hold a number with nearly twice the number of digits as a 64 bit Integer. The reason a Float can hold much larger numbers is because a portion of those bits represents an exponent, which means the numbers, to some extent, increase exponentially.

These large numbers come at the cost of precision. As the value get larger, the space between values get larger. At small numbers you can have a lot of precision. This is probably how most Floating point numbers are used (at least that is my common use case). However, at very large numbers the precision gets quite bad.

As you can see, if we care about not losing Integer precision then we can’t actually use the full range of Float values. So what’s the largest number that we can use before we start losing numbers? It’s going to be two to the power of the number of bits in the significand (including the implicit bit). Let’s try it out.

So even though we can get much larger numbers,

`16_777_216`

is the last number we can reliably use without the possibility of losing some data.For 64 bit float

So the maximum Float64 value we can use without losing precision is 9,007,199,254,740,992 vs the UInt64 18,446,744,073,709,551,615. Just as you might have suspected, the Int formats are pretty good at holding Integer values.

There is one more Float type available in Swift that might help.

Float80 is an extended precision format. The first thing you might notice about this type is that its storage size is not a power of 2. This type is not meant for storage, it’s a format that can be used to reduce roundoff and overflow error while performing intermediate computations with 64-bit floats. A requirement for doing this is that the significand is 64 bits (63 + 1 implicit bit). Therefore, it can hold all of the values of a UInt64 without losing precision.

Note: This is actually double the range of the UInt64 since Float80 can also use all of the negative numbers with the same magnitude.

But how can we get more than 64 bits to hold Integers?

DecimalDecimal is another type, available in Swift, that holds Base 10 numbers. This is our last hope for built in types. Let’s look at the definition.

You can see that Decimal has some similar components to a Float. Length, isCompact and reserved are specific to Decimal. The other components are exponent, isNegative, and mantissa which are the components of a float type.

isNegative is just a flag and is represented with 1 bit in float formats.

exponent here is signed 32 bit Int. Float32 has 8 bits and Float64 has 11, both with an implicit offset for negative values.

mantissa (an alternate word for significand) is the part we’re most interested in. The Decimal type represents this as a tuple of 8 UInt16. What this really comes down to is that the significand can hold 8 * 16 = 128 bits! That’s double the amount of bits of a UInt64 and it can represent negative numbers as well.

In actuality, the full 128 bit range does not have ulp <= 1.0. It seems to break down between 2^127+2^126 and 2^128. I haven’t figured out why, so if you know, I’d love to hear about it.

Still if you need to deal in really big numbers with very high precision, Decimal is the key. You should still use Int64 (Int) and Float64 (Double) on 64 bit platforms wherever possible to maximize performance but know that Decimal is standing by when you need it.

How do you go beyond 128 bits? Read my follow up here.

## Share this:

DecimalFloatFloat32Float64Float80floating pointIntInt64Integermantissasignificandulp