Floating-Point Numbers

Header aarith/float/floating_point.hpp

The template class floating_point represents a floating-point number of arbitrary precision that is fixed at compile time.

template<size_t E, size_t M, typename WordType> class floating_point

Public Functions

inline constexpr bool is_positive() const

Tests whether the floating-point number is positive.

Only the sign bit is inspected, so this also returns true for positive zero and for NaNs whose sign bit is not set.

Returns: True iff the sign bit is not set

inline constexpr bool is_negative() const

Tests whether the floating-point number is negative.

Only the sign bit is inspected, so this also returns true for negative zero and for NaNs whose sign bit is set.

Returns: True iff the sign bit is set
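
The sign-bit behaviour described for is_positive() and is_negative() can be illustrated with native floats. The following is a minimal sketch using std::signbit from the standard library rather than the floating_point class itself; the helper names sign_is_positive, sign_is_negative and check_sign_examples are hypothetical and not part of this header.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-ins for the documented predicates, applied to a native
// float: only the sign bit is inspected, never the magnitude.
inline bool sign_is_positive(float x) { return !std::signbit(x); }
inline bool sign_is_negative(float x) { return std::signbit(x); }

// Self-check covering the edge cases: signed zeros and NaNs.
inline void check_sign_examples()
{
    assert(sign_is_positive(0.0f));
    assert(sign_is_negative(-0.0f));
    assert(-0.0f == 0.0f); // the two zeros still compare equal

    const float nan = std::numeric_limits<float>::quiet_NaN();
    // A NaN carries a sign bit too, so exactly one predicate holds for it.
    assert(sign_is_positive(nan) != sign_is_negative(nan));
    assert(sign_is_negative(std::copysign(nan, -1.0f)));
}
```

Because both zeros compare equal, a sign test of this kind cannot be replaced by a comparison against zero.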

inline constexpr bool is_finite() const

Returns whether the number is finite.

Note

NaNs are not considered finite

Returns: True iff the number is finite

inline constexpr bool is_nan() const

Checks whether the floating-point number is NaN (not a number).

Note

No distinction is made between signalling and quiet NaNs.

Returns: True iff the number is NaN

inline constexpr bool is_qNaN() const

Checks if the number is a quiet NaN.

Returns: True iff the number is a quiet NaN

inline constexpr bool is_sNaN() const

Checks if the number is a signalling NaN.

Returns: True iff the number is a signalling NaN

inline constexpr bool is_zero() const

Checks whether the floating point number is zero.

Returns true for both the positive and negative zero

Returns: True iff the floating point is zero

inline constexpr bool is_pos_zero() const

Checks whether the floating point number is positive zero.

Returns: True iff the floating point is positive zero

inline constexpr bool is_neg_zero() const

Checks whether the floating point number is negative zero.

Returns: True iff the floating point is negative zero

inline constexpr bool is_normalized() const

Checks whether the number is normal.

This is true if and only if the floating-point number is normal (not zero, subnormal, infinite, or NaN).

Returns: True iff the number is normalized

inline constexpr bool is_denormalized() const

Returns whether the number is denormalized.

Note

NaN, +/- infinity and zero are not considered denormalized.

Returns: True iff the number is denormalized

inline constexpr bool is_subnormal() const

Tests if the number is subnormal.

Note

Zero is not considered subnormal!

Returns: True iff the number is subnormal

inline constexpr bool is_special() const

Returns whether the number is denormalized, infinite or a NaN.

Returns: True iff the number is denormalized, infinite or a NaN
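
The classification predicates above partition values the same way std::fpclassify does for native types. The sketch below shows that partition on a native float; the function names is_normal_f, is_subnormal_f and is_special_f are hypothetical, and the definition of "special" follows the description of is_special() given here.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical free-function counterparts of the documented predicates,
// written for a native float.
inline bool is_normal_f(float x) { return std::fpclassify(x) == FP_NORMAL; }
inline bool is_subnormal_f(float x) { return std::fpclassify(x) == FP_SUBNORMAL; }

// "Special" as described above: subnormal, infinite, or NaN.
inline bool is_special_f(float x)
{
    const int c = std::fpclassify(x);
    return c == FP_SUBNORMAL || c == FP_INFINITE || c == FP_NAN;
}

// Self-check: zero is neither normal nor subnormal nor special.
inline void check_classification_examples()
{
    assert(!is_normal_f(0.0f));
    assert(!is_subnormal_f(0.0f));
    assert(!is_special_f(0.0f));
    assert(is_subnormal_f(std::numeric_limits<float>::denorm_min()));
}
```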

inline explicit constexpr operator float() const

Casts the floating-point number to the native float type.

Note

The cast is only possible when it cannot lose precision.

Returns: The value converted to float format

inline explicit constexpr operator double() const

Casts the floating-point number to the native double type.

Note

The cast is only possible when it cannot lose precision.

Returns: The value converted to double format
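
One plausible reading of the "no loss of precision" restriction is that the source format must have no more exponent bits and no more mantissa bits than the native target. The sketch below expresses that condition as a compile-time check. The helper fits_losslessly is hypothetical, and the sketch assumes that E and M count exponent and stored mantissa bits; how the library actually enforces the restriction is not stated here.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: true if a format with E exponent bits and M stored
// mantissa bits can be widened to Native without losing information.
template <std::size_t E, std::size_t M, typename Native>
constexpr bool fits_losslessly()
{
    static_assert(std::numeric_limits<Native>::is_iec559,
                  "Native must be an IEEE 754 type");
    static_assert(E >= 1, "at least one exponent bit is required");

    // digits counts the implicit leading bit, so the stored width is digits - 1.
    constexpr std::size_t native_mantissa_bits =
        static_cast<std::size_t>(std::numeric_limits<Native>::digits - 1);
    // max_exponent equals 2^(E_native - 1), e.g. 128 for float.
    constexpr std::size_t native_max_exponent =
        static_cast<std::size_t>(std::numeric_limits<Native>::max_exponent);

    return M <= native_mantissa_bits &&
           (std::size_t{1} << (E - 1)) <= native_max_exponent;
}

// A half-precision layout (5, 10) fits into float; a double layout does not.
static_assert(fits_losslessly<5, 10, float>(), "half fits into float");
static_assert(!fits_losslessly<11, 52, float>(), "double does not fit into float");
```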

Public Static Functions

static inline constexpr floating_point zero()

Returns: The value zero

static inline constexpr floating_point neg_zero()

Returns: The value negative zero

static inline constexpr floating_point one()

Returns: The value one

static inline constexpr floating_point neg_one()

Returns: The value negative one

static inline constexpr floating_point pos_infinity()

Returns: positive infinity

static inline constexpr floating_point neg_infinity()

Returns: negative infinity

static inline constexpr floating_point min()

Returns: The smallest finite value

static inline constexpr floating_point max()

Returns: The largest finite value

static inline constexpr floating_point smallest_normalized()

Returns: Smallest positive normalized value

static inline constexpr floating_point smallest_denormalized()

Returns: Smallest positive denormalized value
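
For an IEEE 754 style layout, the largest finite value and the smallest positive normalized and denormalized values follow directly from the exponent and mantissa widths. The sketch below computes them as doubles for small formats and compares against the native float. It assumes that E and M count exponent and stored mantissa bits; format_limits and limits_for are hypothetical names, not part of this header.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical container for the three characteristic magnitudes.
struct format_limits {
    double smallest_normalized;
    double smallest_denormalized;
    double largest_finite;
};

// Computes the magnitudes for a format with E exponent bits and M stored
// mantissa bits. Intended for small formats whose values fit into a double.
inline format_limits limits_for(int E, int M)
{
    const int bias = (1 << (E - 1)) - 1;
    return {
        std::ldexp(1.0, 1 - bias),                   // 2^(1 - bias)
        std::ldexp(1.0, 1 - bias - M),               // 2^(1 - bias - M)
        std::ldexp(2.0 - std::ldexp(1.0, -M), bias)  // (2 - 2^-M) * 2^bias
    };
}

// Self-check against the native binary32 format (E = 8, M = 23).
inline void check_limits_examples()
{
    const format_limits f = limits_for(8, 23);
    assert(f.smallest_normalized == static_cast<double>(std::numeric_limits<float>::min()));
    assert(f.smallest_denormalized == static_cast<double>(std::numeric_limits<float>::denorm_min()));
    assert(f.largest_finite == static_cast<double>(std::numeric_limits<float>::max()));
}
```

For a half-precision layout, limits_for(5, 10) yields a largest finite value of 65504.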

static inline constexpr floating_point round_error()

Returns: The maximal rounding error (assuming round-to-nearest)

static inline constexpr floating_point qNaN(const IntegerFrac &payload = IntegerFrac::msb_one())

Creates a quiet NaN value.

Parameters: payload – The payload to store in the NaN
Returns: The bit representation of the quiet NaN containing the payload

static inline constexpr floating_point sNaN(const IntegerFrac &payload = IntegerFrac::one())

Creates a signalling NaN value.

Parameters: payload – The payload to store in the NaN (must not be zero)
Returns: The bit representation of the signalling NaN containing the payload
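
The default payloads (most significant fraction bit for qNaN, least significant for sNaN) match the usual IEEE 754 convention in which the top fraction bit marks a NaN as quiet. The sketch below builds and inspects such bit patterns for the 32-bit binary format. It works on raw std::uint32_t values, all names in it are hypothetical, and whether the library uses exactly this encoding is an assumption.

```cpp
#include <cstdint>

// Field masks for IEEE 754 binary32 (1 sign, 8 exponent, 23 fraction bits).
constexpr std::uint32_t exponent_mask = 0x7F800000u;
constexpr std::uint32_t fraction_mask = 0x007FFFFFu;
constexpr std::uint32_t quiet_bit     = 0x00400000u; // MSB of the fraction

// Quiet NaN: all exponent bits set, quiet bit forced on, payload in the fraction.
constexpr std::uint32_t make_qnan_bits(std::uint32_t payload = quiet_bit)
{
    return exponent_mask | quiet_bit | (payload & fraction_mask);
}

// Signalling NaN: quiet bit cleared. The remaining payload must be non-zero,
// otherwise the pattern encodes infinity instead of a NaN.
constexpr std::uint32_t make_snan_bits(std::uint32_t payload = 1u)
{
    return exponent_mask | (payload & fraction_mask & ~quiet_bit);
}

constexpr bool is_nan_bits(std::uint32_t b)
{
    return (b & exponent_mask) == exponent_mask && (b & fraction_mask) != 0u;
}
constexpr bool is_qnan_bits(std::uint32_t b)
{
    return is_nan_bits(b) && (b & quiet_bit) != 0u;
}
constexpr bool is_snan_bits(std::uint32_t b)
{
    return is_nan_bits(b) && (b & quiet_bit) == 0u;
}

static_assert(make_qnan_bits() == 0x7FC00000u, "default quiet NaN pattern");
static_assert(make_snan_bits() == 0x7F800001u, "default signalling NaN pattern");
```

The zero-payload case shows why the sNaN payload must not be zero: make_snan_bits(0) yields 0x7F800000, which is the encoding of positive infinity.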

static inline constexpr floating_point NaN()

Returns a floating point number indicating not a number (NaN).

Returns: A non-signalling not a number value