DigitHelm

Floating Point Converter | IEEE 754 Binary

Convert any decimal to its IEEE 754 32-bit and 64-bit binary representation, coloured bit display, field breakdown, special value detection, and hex-to-float reverse conversion. Runs entirely in your browser using JavaScript typed arrays.

Presets

Accepts any real number, scientific notation (1e-10), Infinity, −Infinity, or NaN.

Press Enter to convert · Esc to reset

What Is the Floating Point Converter | IEEE 754 Binary?

IEEE 754 is the international standard (IEEE Std 754-2008) for binary floating-point arithmetic, implemented in virtually every modern CPU, GPU, and programming language runtime. It defines how real numbers are approximated using a fixed number of bits split into a sign, an exponent, and a significand (mantissa). The standard covers 32-bit single precision, 64-bit double precision, and extended formats up to 128 bits.

The key insight is the biased exponent: the exponent field stores an unsigned integer, and the actual exponent is obtained by subtracting the bias (127 for float32, 1023 for float64). This allows efficient comparison of floating-point magnitudes using integer comparison hardware.

The implicit leading 1 (the "1.mantissa" form) is a clever space-saving trick: since all normalised numbers have a leading 1 in binary, that bit need not be stored. Only subnormal numbers (exponent field = 0) use "0.mantissa", they represent the region nearest to zero where precision gracefully degrades rather than abruptly underflowing to zero.

This converter uses JavaScript's ArrayBuffer with typed arrays (Float32Array, Float64Array, Uint32Array) to extract the exact IEEE 754 bit pattern, the same bit layout that your CPU and JavaScript engine use internally. All conversion results reflect live computation in your browser with no rounding or server round-trip.

Formula

IEEE 754 Normal Number Formula
value = (−1)^s × 2^(e − bias) × 1.mantissa
Single (32-bit): 1 sign + 8 exponent + 23 mantissa = 32 bits · bias = 127
Double (64-bit): 1 sign + 11 exponent + 52 mantissa = 64 bits · bias = 1023
s = sign bit (0 = positive, 1 = negative) · e = stored exponent (unsigned) · mantissa = fractional bits after implicit leading 1
Special Values & Subnormal Numbers
Subnormal: e = 0 → value = (−1)^s × 2^(1−bias) × 0.mantissa
+Infinity: e = all 1s, mantissa = 0, s = 0 · e.g., 1/0 in IEEE 754
−Infinity: e = all 1s, mantissa = 0, s = 1
NaN: e = all 1s, mantissa ≠ 0 · e.g., 0/0, sqrt(−1)
±Zero: e = 0, mantissa = 0 · +0 ≠ −0 in IEEE 754 (but compare equal)
ULP (Unit in Last Place) = 2^(e−bias−23) for 32-bit · Machine epsilon: 2⁻²³ ≈ 1.19×10⁻⁷ (float32), 2⁻⁵² ≈ 2.22×10⁻¹⁶ (float64)
SymbolNameDescription
sSign bit0 = positive, 1 = negative; determines the sign of the value
eStored exponentUnsigned integer stored in the exponent field; e = actual_exp + bias
biasExponent bias127 for 32-bit; 1023 for 64-bit; centres the exponent range around 0
e − biasActual exponentTrue power of 2 used in the value formula; ranges from −126 to +127 (float32)
1.mantSignificandImplicit leading 1 plus the 23 (or 52) explicit mantissa bits
ULPUnit in Last PlaceSmallest difference between two adjacent representable numbers near a given value
εMachine epsilonSmallest number such that 1 + ε ≠ 1; 2⁻²³ for float32, 2⁻⁵² for float64
NaNNot a NumberResult of undefined operations like 0/0 or √(−1); not equal to any value including itself

How to Use

  1. 1
    Choose a mode: Select "Decimal → IEEE 754" to convert a decimal number to binary, or "Hex → Float" to decode a hex bit pattern back to decimal.
  2. 2
    Load a preset: Click a preset (0.1, 0.2, π, 1/3, Infinity, NaN…) to explore well-known floating-point values. Each preset has a note explaining why it is interesting.
  3. 3
    Enter your value: In decimal mode, type any number (including negative, scientific notation like 1.5e-10, or special values like Infinity). In hex mode, enter 8 hex digits for 32-bit or 16 for 64-bit.
  4. 4
    Press Convert or Enter: Click "Convert" or press Enter. Both 32-bit and 64-bit representations are computed simultaneously in decimal mode.
  5. 5
    Read the bit display: Each bit is shown as a coloured square, red for the sign bit, blue for exponent bits, green for mantissa bits. Hover over any bit to see its position and value.
  6. 6
    Check the field breakdown: The breakdown table shows the stored exponent value, the bias, the actual exponent (stored − bias), and the mantissa fraction as a decimal. Special values (NaN, ±Infinity, subnormal) are classified automatically.
  7. 7
    Compare precision: The comparison banner shows your input decimal, its float32 stored value, and its float64 stored value side by side, making rounding errors immediately visible.
  8. 8
    Press Esc to reset: Press Escape or click "Reset" to clear all inputs and return to the default state. Previous inputs are saved to localStorage and restored on your next visit.

Example Calculation

Example 1: 0.1 in float32, the classic inexact decimal

0.1 cannot be represented exactly in binary because it is a repeating fraction in base 2, just as 1/3 repeats in base 10. Float32 stores the nearest representable value.

0.1 in binary: 0.000110011001100110011... (repeating) IEEE 754 float32 bit pattern: Sign: 0 (positive) Exponent: 01111011 = 123 → actual = 123 − 127 = −4 Mantissa: 10011001100110011001101 Stored value = 1.10011001100110011001101 × 2⁻⁴ = 0.10000000149011611938... (not exactly 0.1) Relative error ≈ 1.49 × 10⁻⁷ (about 1 in 7 million)

Example 2: 1.5 in float32, an exactly representable value

1.5 = 1 + 1/2, which terminates in binary. Float32 stores it without any rounding error.

1.5 in binary: 1.1 (exact) IEEE 754 float32 bit pattern: Sign: 0 (positive) Exponent: 01111111 = 127 → actual = 127 − 127 = 0 Mantissa: 10000000000000000000000 Stored value = 1.1 × 2⁰ = 1.5 (exactly) Hex: 3FC00000 Numbers exactly representable in float32 are of the form m × 2^k where m is a 24-bit integer and k is an integer.

Example 3: Decoding hex 7F800000 in 32-bit

Entering a raw IEEE 754 hex pattern in "Hex → Float" mode decodes it to the corresponding floating-point value. 7F800000 is the bit pattern for positive infinity.

Input hex: 7F800000 Binary: 0 11111111 00000000000000000000000 ↑ ──────── ─────────────────────── Sign Exp=255 Mantissa = 0 Exponent field = all 1s (255) AND Mantissa = 0 Classification: +Infinity (result of overflow or division by zero) Other special 32-bit patterns: FF800000 = −Infinity 7FC00000 = NaN (quiet NaN) 00000000 = +Zero 80000000 = −Zero

Understanding Floating Point Converter | IEEE 754 Binary

float32 vs float64, When Does It Matter?

The choice between 32-bit (single) and 64-bit (double) precision is a fundamental trade-off between memory, speed, and accuracy. Most general-purpose programming (JavaScript, Python, Java) defaults to 64-bit double. Graphics and machine-learning workloads increasingly use 32-bit or even 16-bit (float16/bfloat16) for throughput on GPUs.

Propertyfloat32 (single)float64 (double)
Bits3264
Sign bits11
Exponent bits811
Mantissa bits2352
Bias1271023
Max finite value≈ 3.4 × 10³⁸≈ 1.8 × 10³⁰⁸
Min positive normal≈ 1.18 × 10⁻³⁸≈ 2.23 × 10⁻³⁰⁸
Decimal precision≈ 7 significant digits≈ 15–16 significant digits
Machine epsilon ε2⁻²³ ≈ 1.19 × 10⁻⁷2⁻⁵² ≈ 2.22 × 10⁻¹⁶
Memory4 bytes8 bytes

Why 0.1 + 0.2 ≠ 0.3

This is the most famous floating-point surprise. Both 0.1 and 0.2 are stored as the nearest representable float64. When added, the result is the nearest representable float64 to their sum, which is 0.30000000000000004, not exactly 0.3:

0.1 (float64) = 0.1000000000000000055511151231257827021181583404541015625 0.2 (float64) = 0.200000000000000011102230246251565404236316680908203125 Sum stored = 0.3000000000000000444089209850062616169452667236328125 0.3 (float64) = 0.299999999999999988897769753748434595763683319091796875 So: 0.1 + 0.2 === 0.30000000000000004 (in JavaScript, Python, Java, C)

The fix for financial applications is to use integer arithmetic in the smallest unit (e.g., cents instead of dollars), or a decimal library like Python's decimal module or Java's BigDecimal. For scientific computing, you typically accept the error and use relative tolerance checks (|a − b| < ε × |b|) instead of exact equality.

Special Values in Practice

  • +Infinity / −Infinity: Produced by overflow (e.g., 1e38 × 1e38 in float32) or division by zero in IEEE 754. Arithmetic with infinity follows mathematical rules: 1/∞ = 0, ∞ + 1 = ∞.
  • NaN (Not a Number): Produced by undefined operations: 0/0, ∞ − ∞, √(−1). NaN is contagious, any arithmetic involving NaN returns NaN. The test x !== x is only true when x is NaN.
  • +Zero and −Zero: IEEE 754 has two zeros with different sign bits. They compare equal (+0 === −0 in JavaScript) but behave differently: 1/(+0) = +∞ while 1/(−0) = −∞.
  • Subnormal numbers: When the exponent field is all zeros, the implicit leading 1 becomes a 0, allowing gradual underflow. The smallest positive subnormal float32 is ≈ 1.4 × 10⁻⁴⁵. Subnormal arithmetic can be slower on some CPUs.

Rounding Modes

When a value cannot be represented exactly, IEEE 754 defines five rounding modes:

  • Round to nearest, ties to even (default): Rounds to the nearest representable value; if exactly halfway, rounds to the value with an even last bit. This minimizes statistical bias in long computations.
  • Round toward +∞: Always rounds up (ceiling). Used to compute upper bounds in interval arithmetic.
  • Round toward −∞: Always rounds down (floor). Used to compute lower bounds.
  • Round toward zero: Truncation, drops the fractional part. Equivalent to Math.trunc().
  • Round to nearest, ties to away from zero: The "school rounding" mode; less common in hardware.

IEEE 754 in Programming Languages

Virtually every modern language uses IEEE 754 double precision by default for its primary floating-point type. Key type mappings:

Languagefloat32 typefloat64 type
JavaScript / TypeScriptFloat32Array elementsnumber (all floats)
Pythonnumpy.float32float (built-in)
C / C++floatdouble
Javafloatdouble
Rustf32f64
Gofloat32float64
GLSL / HLSL (GPU)mediump floathighp float / double

Bit patterns are extracted using language-specific mechanisms: in C via memcpy or unions, in JavaScript via ArrayBuffer and typed arrays, in Python via struct.pack. This converter uses the JavaScript typed-array approach, the same technique used in WebGL and WebAssembly.

Frequently Asked Questions

Why can't computers represent 0.1 exactly?

0.1 in binary is the repeating fraction 0.0001100110011…, analogous to 1/3 = 0.333… in decimal. It has no finite binary representation.

  • • IEEE 754 stores only 23 mantissa bits (float32) or 52 bits (float64), so the pattern is truncated and rounded.
  • • The nearest float32 value is 0.100000001490116119, close, but not exact.
  • • This is why 0.1 + 0.2 === 0.30000000000000004 in JavaScript, Python, Java, and C.

The fix for financial code is to work in integer units (e.g., cents) rather than fractional dollars.

What is the difference between float32 and float64?

Both formats follow IEEE 754, but they allocate bits differently:

  • float32 (single): 1 sign + 8 exponent + 23 mantissa = 32 bits. Approximately 7 significant decimal digits, range up to ≈ 3.4 × 10³⁸.
  • float64 (double): 1 sign + 11 exponent + 52 mantissa = 64 bits. Approximately 15–16 significant digits, range up to ≈ 1.8 × 10³⁰⁸.
  • • JavaScript's number type is always float64; float32 is only accessible via Float32Array.

GPUs and ML workloads often prefer float32 (or float16) to maximise throughput at the cost of precision.

What is the exponent bias and why is it needed?

The exponent field stores an unsigned integer, and the bias converts it to a signed actual exponent: actual = stored − bias.

  • • float32 bias = 127: stored 127 → actual exponent 0; stored 130 → actual +3; stored 124 → actual −3.
  • • float64 bias = 1023: stored 1023 → actual exponent 0.
  • • Exponent 0 (all zeros) is reserved for subnormals and ±zero.
  • • Exponent 255 / 2047 (all ones) is reserved for ±Infinity and NaN.

The biased representation means floating-point magnitudes can be compared using plain integer comparison, a useful hardware optimisation.

What are NaN and the two infinities?

IEEE 754 uses the all-ones exponent field to encode three special categories:

  • +Infinity: exponent = all 1s, mantissa = 0, sign = 0. Produced by overflow or 1 / 0.
  • -Infinity: same but sign = 1. Produced by -1 / 0 or large negative overflow.
  • NaN: exponent = all 1s, mantissa ≠ 0. Result of 0/0, ∞ − ∞, or sqrt(−1). NaN is the only value not equal to itself: NaN !== NaN is always true.

There are also two zeros: +0 (sign = 0) and -0 (sign = 1). They compare equal, but 1/(+0) = +∞ while 1/(−0) = −∞.

What is a subnormal (denormalized) number?

When the exponent field is all zeros, the number is subnormal. The formula changes: the implicit leading 1 becomes a 0, giving 0.mantissa × 2^(1−bias) instead of the normal 1.mantissa × 2^(e−bias).

  • • Subnormals allow gradual underflow: precision reduces smoothly near zero instead of abruptly jumping to 0.
  • • Smallest positive float32 subnormal: 2⁻¹⁴⁹ ≈ 1.4 × 10⁻⁴⁵.
  • • Smallest positive float32 normal: 2⁻¹²⁶ ≈ 1.18 × 10⁻³⁸.
  • • Subnormal arithmetic can be significantly slower on CPUs that handle it in microcode rather than hardware.

How do I check if two floats are equal in code?

Never use === to compare floating-point results that came from arithmetic, rounding errors make exact equality unreliable.

  • Relative tolerance: |a − b| < ε × max(|a|, |b|), scales with the magnitude of the values.
  • Absolute tolerance: |a − b| < ε, use when values are expected to be near zero.
  • • JavaScript machine epsilon: Number.EPSILON ≈ 2.22 × 10⁻¹⁶. A practical tolerance for most scientific work is 1e-9.
  • • For financial calculations, use integer arithmetic in the smallest unit (e.g., cents) to eliminate float errors entirely.

Why do GPUs use float16 or bfloat16?

Modern ML accelerators trade precision for throughput, packing more multiply-accumulate operations per second into the same silicon:

  • float16 (IEEE half precision): 1 sign + 5 exponent + 10 mantissa bits. About 3 decimal digits of precision, range up to 65504.
  • bfloat16 (Brain Float 16): 1 sign + 8 exponent + 7 mantissa bits. Same exponent range as float32, so it avoids gradient overflow in neural networks despite lower mantissa precision.
  • float8: Emerging format in transformer inference (e.g., FP8-E4M3, FP8-E5M2) used on H100 GPUs.

This converter focuses on the standard float32 and float64 formats defined by IEEE 754-2008.

How does the converter extract the exact bits?

All conversions run live in your browser using JavaScript typed arrays, no server round-trip.

  • float32: A Float32Array and a Uint32Array share the same ArrayBuffer. Writing the float value and reading the integer gives the raw bit pattern, identical to a C union or memcpy.
  • float64: Two Uint32Array elements hold the high and low 32-bit words, avoiding JavaScript's 53-bit integer precision limit for the full 64-bit pattern.
  • • The bit layout is then split into sign, exponent, and mantissa fields by masking and shifting, exactly as CPU hardware does internally.

Related Calculators