Floating-Point Numbers

Header aarith/float/floating_point.hpp

The template class floating_point represents a floating-point number whose precision is arbitrary but fixed at compile time. The template parameters E and M give the widths of the exponent and the mantissa in bits.

template<size_t E, size_t M, typename WordType>
class floating_point

Public Functions

inline constexpr bool is_positive() const

Tests whether the floating-point number is positive.

Only the sign bit is inspected, so this also returns true for zeros and NaNs whose sign bit is not set.

Returns

True iff the sign bit is not set

inline constexpr bool is_negative() const

Tests whether the floating-point number is negative.

Only the sign bit is inspected, so this also returns true for zeros and NaNs whose sign bit is set.

Returns

True iff the sign bit is set
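Because both tests look only at the sign bit, they do not express an ordering relative to zero. The following sketch illustrates the same behaviour with native float values and std::signbit from the standard library. It does not use aarith itself, and the helper names sign_bit_clear and sign_bit_set are hypothetical.

```cpp
#include <cmath>

// Hypothetical helpers that mirror the documented semantics on native floats.
// They are illustrations only and are not part of aarith.

// True iff the sign bit of x is not set (compare with is_positive()).
inline bool sign_bit_clear(float x) { return !std::signbit(x); }

// True iff the sign bit of x is set (compare with is_negative()).
inline bool sign_bit_set(float x) { return std::signbit(x); }
```

With these helpers, sign_bit_clear(0.0f) and sign_bit_set(-0.0f) both hold, although neither zero compares greater or less than zero.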

inline constexpr bool is_finite() const

Returns whether the number is finite.

Note

NaNs are not considered finite

Returns

True iff the number is finite

inline constexpr bool is_nan() const

Checks whether the floating-point number is NaN (not a number).

Note

This check does not distinguish between signalling and quiet NaNs.

Returns

True iff the number is NaN

inline constexpr bool is_qNaN() const

Checks if the number is a quiet NaN.

Returns

True iff the number is a quiet NaN

inline constexpr bool is_sNaN() const

Checks if the number is a signalling NaN.

Returns

True iff the number is a signalling NaN
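The difference between the three NaN checks can be sketched on raw IEEE 754 binary32 bit patterns. The sketch assumes the common encoding in which the most significant mantissa bit marks a quiet NaN, which matches the default payloads of qNaN() and sNaN() documented in this reference. The function and constant names are hypothetical, and the code is not taken from aarith.

```cpp
#include <cstdint>

// Field masks of an IEEE 754 binary32 value (1 sign, 8 exponent, 23 mantissa bits).
constexpr std::uint32_t kExpMask = 0x7F800000u;   // all exponent bits
constexpr std::uint32_t kFracMask = 0x007FFFFFu;  // all mantissa bits
constexpr std::uint32_t kQuietBit = 0x00400000u;  // most significant mantissa bit

// NaN: exponent all ones and a non-zero mantissa.
constexpr bool is_nan_bits(std::uint32_t b) {
    return (b & kExpMask) == kExpMask && (b & kFracMask) != 0;
}

// Quiet NaN: a NaN whose most significant mantissa bit is set.
constexpr bool is_qnan_bits(std::uint32_t b) {
    return is_nan_bits(b) && (b & kQuietBit) != 0;
}

// Signalling NaN: a NaN whose most significant mantissa bit is clear.
constexpr bool is_snan_bits(std::uint32_t b) {
    return is_nan_bits(b) && (b & kQuietBit) == 0;
}

static_assert(is_qnan_bits(0x7FC00000u), "quiet bit set");
static_assert(is_snan_bits(0x7F800001u), "quiet bit clear, payload non-zero");
static_assert(!is_nan_bits(0x7F800000u), "infinity is not a NaN");
```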

inline constexpr bool is_zero() const

Checks whether the floating-point number is zero.

Returns true for both positive and negative zero.

Returns

True iff the floating-point number is zero

inline constexpr bool is_pos_zero() const

Checks whether the floating-point number is positive zero.

Returns

True iff the floating-point number is positive zero

inline constexpr bool is_neg_zero() const

Checks whether the floating-point number is negative zero.

Returns

True iff the floating-point number is negative zero
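The three zero checks differ only in how they treat the sign bit. A minimal sketch on IEEE 754 binary32 bit patterns follows. The names are hypothetical and the code is independent of aarith.

```cpp
#include <cstdint>

// Sign bit of an IEEE 754 binary32 value.
constexpr std::uint32_t kSignMask = 0x80000000u;

// Zero of either sign: every bit except the sign bit is clear.
constexpr bool is_zero_bits(std::uint32_t b) { return (b & ~kSignMask) == 0; }

// Positive zero: all bits clear.
constexpr bool is_pos_zero_bits(std::uint32_t b) { return b == 0; }

// Negative zero: only the sign bit is set.
constexpr bool is_neg_zero_bits(std::uint32_t b) { return b == kSignMask; }

static_assert(is_zero_bits(0x00000000u) && is_zero_bits(0x80000000u),
              "both zeros count as zero");
static_assert(is_pos_zero_bits(0x00000000u) && !is_pos_zero_bits(0x80000000u),
              "only +0 is positive zero");
static_assert(is_neg_zero_bits(0x80000000u) && !is_neg_zero_bits(0x00000000u),
              "only -0 is negative zero");
```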

inline constexpr bool is_normalized() const

Checks whether the number is normal, that is, not zero, subnormal, infinite, or NaN.

Returns

True iff the number is normalized

inline constexpr bool is_denormalized() const

Returns whether the number is denormalized.

Note

NaN, +/- infinity, and zero are not considered denormalized.

Returns

True iff the number is denormalized

inline constexpr bool is_subnormal() const

Tests if the number is subnormal.

Note

Zero is not considered subnormal.

Returns

True iff the number is subnormal
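The classification described above follows from the biased exponent and the stored mantissa. The sketch below shows the usual IEEE 754 rules for a format with E exponent bits. The function names are hypothetical, E is assumed to be smaller than 64, and the code is not aarith's implementation.

```cpp
#include <cstdint>

// Largest biased exponent of a format with E exponent bits (all bits set).
// Assumes E < 64.
template <unsigned E>
constexpr std::uint64_t all_ones_exponent() {
    return (std::uint64_t{1} << E) - 1;
}

// Zero: exponent and mantissa are both zero.
constexpr bool zero_fields(std::uint64_t exponent, std::uint64_t mantissa) {
    return exponent == 0 && mantissa == 0;
}

// Subnormal (denormalized): exponent zero, mantissa non-zero. Zero is excluded.
constexpr bool subnormal_fields(std::uint64_t exponent, std::uint64_t mantissa) {
    return exponent == 0 && mantissa != 0;
}

// Normal: exponent neither zero (zero, subnormal) nor all ones (infinity, NaN).
template <unsigned E>
constexpr bool normal_fields(std::uint64_t exponent) {
    return exponent != 0 && exponent != all_ones_exponent<E>();
}

static_assert(subnormal_fields(0, 1), "smallest subnormal");
static_assert(!subnormal_fields(0, 0), "zero is not subnormal");
static_assert(normal_fields<8>(127), "binary32 encoding of 1.0 has exponent 127");
static_assert(!normal_fields<8>(255), "infinity and NaN are not normal");
```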

inline constexpr bool is_special() const

Returns whether the number is denormalized, infinite, or NaN.

Returns

True iff the number is denormalized, infinite, or a NaN

inline explicit constexpr operator float() const

Casts the normalized float to the native float type.

Note

The cast is only possible when there will be no loss of precision

Returns

The value converted to float format

inline explicit constexpr operator double() const

Casts the normalized float to the native double type.

Note

The cast is only possible when there will be no loss of precision

Returns

The value converted to double format
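One plausible reading of "no loss of precision" is that the exponent and mantissa widths must not exceed those of the target type. The sketch below expresses that condition for IEEE 754 binary32 (8 exponent bits, 23 stored mantissa bits) and binary64 (11 and 52). This reading is an assumption: the source does not state how the library enforces the restriction, and the function names are hypothetical.

```cpp
#include <cstddef>

// Assumed condition: a format with E exponent bits and M stored mantissa bits
// converts exactly if neither field is wider than the target's field.

// Target float (IEEE 754 binary32): 8 exponent bits, 23 stored mantissa bits.
constexpr bool fits_in_binary32(std::size_t E, std::size_t M) {
    return E <= 8 && M <= 23;
}

// Target double (IEEE 754 binary64): 11 exponent bits, 52 stored mantissa bits.
constexpr bool fits_in_binary64(std::size_t E, std::size_t M) {
    return E <= 11 && M <= 52;
}

static_assert(fits_in_binary32(5, 10), "a half-precision layout fits into float");
static_assert(!fits_in_binary32(11, 52), "a double layout does not fit into float");
static_assert(fits_in_binary64(8, 23), "a float layout fits into double");
```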

Public Static Functions

static inline constexpr floating_point zero()

Returns

The value zero

static inline constexpr floating_point neg_zero()

Returns

The value negative zero

static inline constexpr floating_point one()

Returns

The value one

static inline constexpr floating_point neg_one()

Returns

The value negative one

static inline constexpr floating_point pos_infinity()

Returns

Positive infinity

static inline constexpr floating_point neg_infinity()

Returns

Negative infinity

static inline constexpr floating_point min()

Returns

The smallest finite value

static inline constexpr floating_point max()

Returns

The largest finite value

static inline constexpr floating_point smallest_normalized()

Returns

Smallest positive normalized value

static inline constexpr floating_point smallest_denormalized()

Returns

Smallest positive denormalized value

static inline constexpr floating_point round_error()

Returns

The maximal rounding error (assuming round-to-nearest)
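For orientation, the standard library exposes comparable constants for native types through std::numeric_limits. The sketch below lists them for float. It is an analogy only: the mapping to the functions above is an assumption, and the names do not line up exactly, since std::numeric_limits<float>::min() is the smallest positive normalized value while lowest() is the most negative finite value.

```cpp
#include <limits>

using float_limits = std::numeric_limits<float>;

// Most negative finite value.
constexpr float lowest_finite = float_limits::lowest();

// Largest finite value.
constexpr float largest_finite = float_limits::max();

// Smallest positive normalized value (note the name min() in the standard library).
constexpr float smallest_normal = float_limits::min();

// Smallest positive subnormal (denormalized) value.
constexpr float smallest_subnormal = float_limits::denorm_min();

// Maximal rounding error in units in the last place; 0.5 for float.
constexpr float max_round_error = float_limits::round_error();
```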

static inline constexpr floating_point qNaN(const IntegerFrac &payload = IntegerFrac::msb_one())

Creates a quiet NaN value.

Parameters

payload – The payload to store in the NaN

Returns

The bit representation of the quiet NaN containing the payload

static inline constexpr floating_point sNaN(const IntegerFrac &payload = IntegerFrac::one())

Creates a signalling NaN value.

Parameters

payload – The payload to store in the NaN (must not be zero)

Returns

The bit representation of the signalling NaN containing the payload

static inline constexpr floating_point NaN()

Returns a floating-point number representing not a number (NaN).

Returns

A non-signalling not a number value
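The default payloads above (most significant bit set for qNaN, the value one for sNaN) suggest how a payload ends up in the mantissa. The sketch below builds the corresponding IEEE 754 binary32 bit patterns. It assumes the quiet bit is the most significant mantissa bit; whether aarith forces or clears that bit itself is not stated in the source, and all names are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint32_t kNanExponent = 0x7F800000u;   // exponent all ones
constexpr std::uint32_t kMantissaMask = 0x007FFFFFu;  // 23 mantissa bits
constexpr std::uint32_t kMantissaMsb = 0x00400000u;   // quiet bit (assumed)

// Quiet NaN: exponent all ones, quiet bit set, remaining bits carry the payload.
constexpr std::uint32_t make_qnan_bits(std::uint32_t payload = kMantissaMsb) {
    return kNanExponent | kMantissaMsb | (payload & kMantissaMask);
}

// Signalling NaN: exponent all ones, quiet bit clear, payload in the lower bits.
// A zero payload would produce the encoding of infinity, which is why the
// payload must not be zero.
constexpr std::uint32_t make_snan_bits(std::uint32_t payload = 1u) {
    return kNanExponent | (payload & kMantissaMask & ~kMantissaMsb);
}

static_assert(make_qnan_bits() == 0x7FC00000u, "default quiet NaN");
static_assert(make_snan_bits() == 0x7F800001u, "default signalling NaN");
static_assert(make_snan_bits(0u) == 0x7F800000u, "zero payload yields infinity");
```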