Fixed-point with fixed_pkg

Understand ufixed, sfixed, fractional bits, resize and conversion to std_logic_vector.

Why fixed-point

An FPGA naturally manipulates bits. Floating-point is often avoided for values such as 0.75, -1.25 or a filter gain, because it costs more logic and latency than integer arithmetic. A common alternative is fixed-point: a binary integer with an agreed binary point position.

Example on 8 bits in Q4.4:

4 bits for the integer part;
4 bits for the fractional part;
the physical value is the binary word divided by 2^4.

For example, the word 0000 1100 (decimal 12) represents 12 / 2^4 = 0.75.

The `fixed_pkg` package

VHDL-2008 provides fixed_pkg:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use IEEE.FIXED_PKG.ALL;

It mainly defines:

Type	Use
`ufixed`	unsigned fixed-point number
`sfixed`	signed fixed-point number

Important point: the index range does more than number the bits. It also fixes the binary point, which sits between index 0 and index -1.

signal r_gain : ufixed(1 downto -6);

Here:

indexes 1 downto 0: integer part;
indexes -1 downto -6: fractional part;
total width: 8 bits.
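
To make the index convention concrete, here is a minimal simulation-only sketch. The entity name tb_ufixed_format and the constant c_GAIN are hypothetical and only serve the illustration. It loads a bit pattern into a ufixed(1 downto -6) and checks the resulting value with to_real.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_PKG.ALL;

-- Hypothetical simulation-only entity, not part of the original text.
entity tb_ufixed_format is
end entity tb_ufixed_format;

architecture sim of tb_ufixed_format is
  -- Index:    1   0  .  -1   -2   -3    -4    -5    -6
  -- Weight:   2   1  .  1/2  1/4  1/8  1/16  1/32  1/64
  constant c_GAIN : ufixed(1 downto -6) := "01100000";
begin
  p_check : process
  begin
    assert c_GAIN'length = 8
      report "expected 8 bits" severity failure;
    -- Bits 0 and -1 are set, so the value is 1 + 1/2.
    assert c_GAIN(0) = '1' and c_GAIN(-1) = '1'
      report "unexpected bit pattern" severity failure;
    assert to_real(c_GAIN) = 1.5
      report "expected 1.5" severity failure;
    wait;
  end process p_check;
end architecture sim;
```

The same eight bits read as a plain unsigned integer would be 96; the index range is what turns them into 96 / 2^6 = 1.5.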

Simple example

signal r_a : ufixed(3 downto -4);
signal r_b : ufixed(3 downto -4);
signal r_y : ufixed(4 downto -4);  -- one extra integer bit for the carry

r_y <= resize(r_a + r_b, r_y'high, r_y'low);

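r_y has one more integer bit than r_a and r_b, so it can hold the carry. The "+" operator of fixed_pkg already returns a result one integer bit wider than its operands, here ufixed(4 downto -4), so this resize changes nothing; it only makes the target format explicit.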
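
The growth rule for addition can be checked in simulation. The sketch below is a hypothetical testbench (tb_ufixed_add and its constants are not from the original text); it adds the largest value of the format to one LSB and also shows what happens if the sum is forced back into the operand format.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_PKG.ALL;

-- Hypothetical simulation-only entity, not part of the original text.
entity tb_ufixed_add is
end entity tb_ufixed_add;

architecture sim of tb_ufixed_add is
  constant c_A : ufixed(3 downto -4) := to_ufixed(15.9375, 3, -4);  -- largest value
  constant c_B : ufixed(3 downto -4) := to_ufixed(0.0625, 3, -4);   -- one LSB
  -- Unconstrained constant: its bounds are taken from the result of "+".
  constant c_SUM : ufixed := c_A + c_B;
  -- Forcing the sum back into the operand format loses the carry.
  constant c_NARROW : ufixed(3 downto -4) := resize(c_SUM, 3, -4);
begin
  p_check : process
  begin
    assert c_SUM'high = 4 and c_SUM'low = -4
      report "unexpected sum range" severity failure;
    assert to_real(c_SUM) = 16.0
      report "carry lost" severity failure;
    -- With the default overflow style of fixed_pkg (saturation), the
    -- narrow result sticks at the largest representable value.
    assert to_real(c_NARROW) = 15.9375
      report "expected saturation" severity failure;
    wait;
  end process p_check;
end architecture sim;
```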

Signed: `sfixed`

For a value that can be negative, use sfixed.

signal r_x : sfixed(0 downto -11);
signal r_k : sfixed(1 downto -6);
signal r_y : sfixed(2 downto -11);

-- resize trims the wider product to the format of r_y.
r_y <= resize(r_x * r_k, r_y'high, r_y'low);

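Multiplication widens the result. Here the full-precision product is sfixed(2 downto -17): the high index is 0 + 1 + 1 and the low index is -11 + (-6). resize brings it back to the chosen format by dropping the six lowest fractional bits.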
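
As a rough check of the multiplication range, the following sketch (hypothetical testbench tb_sfixed_mul, with example values chosen so that no rounding occurs) lets the simulator derive the bounds of the product and then resizes it.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_PKG.ALL;

-- Hypothetical simulation-only entity, not part of the original text.
entity tb_sfixed_mul is
end entity tb_sfixed_mul;

architecture sim of tb_sfixed_mul is
  constant c_X : sfixed(0 downto -11) := to_sfixed(-0.5, 0, -11);
  constant c_K : sfixed(1 downto -6)  := to_sfixed(1.25, 1, -6);
  -- Unconstrained constant: its bounds are taken from the result of "*".
  constant c_P : sfixed := c_X * c_K;
  -- Back to the working format; the six lowest fractional bits are dropped.
  constant c_Y : sfixed(2 downto -11) := resize(c_P, 2, -11);
begin
  p_check : process
  begin
    assert c_P'high = 2 and c_P'low = -17
      report "unexpected product range" severity failure;
    assert to_real(c_P) = -0.625
      report "unexpected product value" severity failure;
    -- -0.625 fits in 11 fractional bits, so resize loses nothing here.
    assert to_real(c_Y) = -0.625
      report "unexpected resized value" severity failure;
    wait;
  end process p_check;
end architecture sim;
```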

Conversion from and to ports

Ports often remain std_logic_vector, especially for compatibility with IPs and tools.

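library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_PKG.ALL;

entity fixed_adapter is
  port (
    i_x : in  std_logic_vector(11 downto 0);  -- signed, 1 integer bit, 11 fractional bits
    o_y : out std_logic_vector(11 downto 0)   -- same format as i_x
  );
end entity fixed_adapter;

architecture rtl of fixed_adapter is
  signal w_x : sfixed(0 downto -11);
  signal w_y : sfixed(0 downto -11);
begin
  -- Reinterpret the raw port bits as a fixed-point number.
  w_x <= to_sfixed(i_x, w_x'high, w_x'low);
  -- Placeholder for the processing: same format in and out, so resize changes nothing.
  w_y <= resize(w_x, w_y'high, w_y'low);

  -- Back to raw bits for the output port.
  o_y <= to_slv(w_y);
end architecture rtl;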

The convention must be known on both sides: here, port i_x carries a signed two's-complement number with 1 integer bit (the sign bit, index 0) and 11 fractional bits, so it covers -1.0 to 1.0 - 2^-11.

Constants and testbench

Conversions from real are practical in testbenches or to define constants.

constant c_GAIN : sfixed(0 downto -7) := to_sfixed(0.75, 0, -7);

Do not use real as an RTL signal to describe hardware. real is for simulation and static calculations, not for a synthesizable bus.
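
A constant built from a real is quantized to the target format. The sketch below (hypothetical testbench tb_real_constants; the coefficient 0.1 is an arbitrary example) contrasts a value that is exactly representable with one that is not, and only assumes the stored value lies within one LSB of the requested one.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_PKG.ALL;

-- Hypothetical simulation-only entity, not part of the original text.
entity tb_real_constants is
end entity tb_real_constants;

architecture sim of tb_real_constants is
  constant c_LSB : real := 2.0**(-7);  -- weight of the last fractional bit
  -- 0.75 = 1/2 + 1/4 is exactly representable with 7 fractional bits.
  constant c_GAIN : sfixed(0 downto -7) := to_sfixed(0.75, 0, -7);
  -- 0.1 has no finite binary expansion, so the stored value is only close.
  constant c_COEFF : sfixed(0 downto -7) := to_sfixed(0.1, 0, -7);
begin
  p_check : process
  begin
    assert to_real(c_GAIN) = 0.75
      report "0.75 should be exact" severity failure;
    assert abs(to_real(c_COEFF) - 0.1) < c_LSB
      report "quantization error larger than one LSB" severity failure;
    -- Print the value that is actually stored.
    report "0.1 is stored as " & real'image(to_real(c_COEFF));
    wait;
  end process p_check;
end architecture sim;
```

Printing the stored value with to_real is a quick way to judge whether a coefficient needs more fractional bits.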

Saturation and rounding

With fixed-point, two problems matter:

rounding: what happens to removed fractional bits?
overflow: what happens if the value no longer fits?

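fixed_pkg controls both behaviors through two optional parameters of resize: overflow_style and round_style. With the standard fixed_pkg, the defaults are fixed_saturate and fixed_round; fixed_wrap and fixed_truncate cost less logic but change the numerical result. Exact tool support can vary, so keep formats simple at first and always validate in simulation.

-- Default styles: saturate on overflow, round the dropped fractional bits.
r_y <= resize(r_acc, r_y'high, r_y'low);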
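
The effect of the optional overflow_style and round_style parameters of resize can be compared side by side. The sketch below is a hypothetical testbench (tb_resize_styles and its values are chosen for the illustration). Note the extra use clause: the literals fixed_saturate, fixed_wrap, fixed_round and fixed_truncate are declared in IEEE.FIXED_FLOAT_TYPES.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_FLOAT_TYPES.ALL;  -- fixed_saturate, fixed_wrap, fixed_round, fixed_truncate
use IEEE.FIXED_PKG.ALL;

-- Hypothetical simulation-only entity, not part of the original text.
entity tb_resize_styles is
end entity tb_resize_styles;

architecture sim of tb_resize_styles is
  -- Target format ufixed(3 downto -1): largest value 15.5, step 0.5.
  constant c_BIG  : ufixed(4 downto -4) := to_ufixed(16.5, 4, -4);   -- too large
  constant c_FRAC : ufixed(4 downto -4) := to_ufixed(2.875, 4, -4);  -- too precise

  constant c_SAT : ufixed(3 downto -1) :=
    resize(c_BIG, 3, -1, overflow_style => fixed_saturate, round_style => fixed_truncate);
  constant c_WRAP : ufixed(3 downto -1) :=
    resize(c_BIG, 3, -1, overflow_style => fixed_wrap, round_style => fixed_truncate);
  constant c_ROUND : ufixed(3 downto -1) :=
    resize(c_FRAC, 3, -1, overflow_style => fixed_saturate, round_style => fixed_round);
  constant c_TRUNC : ufixed(3 downto -1) :=
    resize(c_FRAC, 3, -1, overflow_style => fixed_saturate, round_style => fixed_truncate);
begin
  p_check : process
  begin
    -- Overflow: saturation clamps to the maximum, wrapping drops the top bit.
    assert to_real(c_SAT) = 15.5 report "saturate" severity failure;
    assert to_real(c_WRAP) = 0.5 report "wrap" severity failure;
    -- Precision loss: rounding goes to the nearest step, truncation rounds down.
    assert to_real(c_ROUND) = 3.0 report "round" severity failure;
    assert to_real(c_TRUNC) = 2.5 report "truncate" severity failure;
    wait;
  end process p_check;
end architecture sim;
```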

For a critical design, also test:

minimum value;
maximum value;
negative value;
transition close to zero;
overflowing result.
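
The checklist above can be turned into a small self-checking testbench. This is only a sketch under stated assumptions: the accumulator format sfixed(2 downto -11), the output format sfixed(0 downto -11) and the name tb_resize_bounds are hypothetical, and resize is used with its default styles.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_PKG.ALL;

-- Hypothetical simulation-only entity, not part of the original text.
entity tb_resize_bounds is
end entity tb_resize_bounds;

architecture sim of tb_resize_bounds is
  constant c_LSB : real := 2.0**(-11);
  constant c_MAX : real := 1.0 - c_LSB;  -- largest sfixed(0 downto -11) value
  constant c_MIN : real := -1.0;         -- smallest sfixed(0 downto -11) value
begin
  p_check : process
    -- Resize one accumulator value and compare it with the expected result.
    procedure check (x : in real; expected : in real) is
      variable v_acc : sfixed(2 downto -11);
      variable v_y   : sfixed(0 downto -11);
    begin
      v_acc := to_sfixed(x, v_acc'high, v_acc'low);
      v_y   := resize(v_acc, v_y'high, v_y'low);
      assert to_real(v_y) = expected
        report "resize of " & real'image(x) & " gave " & real'image(to_real(v_y))
        severity failure;
    end procedure check;
  begin
    check(c_MIN, c_MIN);      -- minimum value
    check(c_MAX, c_MAX);      -- maximum value
    check(-0.25, -0.25);      -- negative value
    check(-c_LSB, -c_LSB);    -- just below zero
    check(c_LSB, c_LSB);      -- just above zero
    check(1.5, c_MAX);        -- positive overflow saturates to the maximum
    check(-3.0, c_MIN);       -- negative overflow saturates to the minimum
    wait;
  end process p_check;
end architecture sim;
```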

Best practices

Document the format: Q1.11, Q4.4, etc.
Keep ports as std_logic_vector if the block must integrate with IPs.
Convert quickly to sfixed or ufixed inside the block.
Use resize after additions, multiplications and format changes.
Verify boundary values in simulation.
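
Put together, these practices could look like the following gain stage. It is a minimal sketch, not a reference implementation: the entity fixed_gain, the gain of 0.75 and the Q1.11 port format are assumptions made for the example.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.FIXED_PKG.ALL;

-- Hypothetical gain stage: y = 0.75 * x, ports in Q1.11 (signed).
entity fixed_gain is
  port (
    i_x : in  std_logic_vector(11 downto 0);  -- Q1.11, signed
    o_y : out std_logic_vector(11 downto 0)   -- Q1.11, signed
  );
end entity fixed_gain;

architecture rtl of fixed_gain is
  -- Static conversion from real: evaluated at elaboration, no real in hardware.
  constant c_GAIN : sfixed(0 downto -7) := to_sfixed(0.75, 0, -7);

  signal w_x : sfixed(0 downto -11);
  signal w_y : sfixed(0 downto -11);
begin
  -- Convert to sfixed as soon as the data enters the block.
  w_x <= to_sfixed(i_x, w_x'high, w_x'low);

  -- The product is sfixed(1 downto -18); resize brings it back to Q1.11
  -- with the default styles of fixed_pkg (saturate and round).
  w_y <= resize(w_x * c_GAIN, w_y'high, w_y'low);

  -- Back to std_logic_vector at the boundary.
  o_y <= to_slv(w_y);
end architecture rtl;
```

For instance, i_x = x"400" (0.5 in Q1.11) gives o_y = x"300" (0.375), since 0.5 * 0.75 is exactly representable in the output format.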

Key takeaways

Need	Choice
Positive fractional value	`ufixed`
Signed fractional value	`sfixed`
Adapt a format	`resize`
Convert from a port	`to_sfixed` or `to_ufixed`
Convert to a port	`to_slv`
Real-valued testbench calculations	`real`, not synthesizable RTL
