ELECTRONIX.ru Electronics Developers Forum > Multiplication at maximum frequency

enzaime

Nov 15 2015, 17:38

I bought a DE0-Nano board, which carries an EP4CE22F17C6 FPGA.
The documentation says that a single 18x18 multiplier can run at 287 MHz and a 9x9 multiplier at 340 MHz:
https://www.altera.com/content/dam/altera-w.../cyiv-53001.pdf, p. 26.

I want to use 32-bit multiplication, but at clock frequencies above 10 MHz it does not produce correct results. Is there any way to speed it up? 10 MHz seems too low; I was expecting something like 150 MHz.
This is how I used it (I checked the result on the PC: the value of the mult register is read when ready is asserted, and the multiplication is given exactly one clock cycle, so you can tell whether it completes at the given frequency or not):

Code

library IEEE;
use IEEE.STD_LOGIC_1164.all;
use ieee.numeric_std.all;

entity arith is
     port(
         clk : in STD_LOGIC;
         start : in STD_LOGIC;
         a : in std_logic_vector(31 downto 0);
         b : in std_logic_vector(31 downto 0);
         ready : out STD_LOGIC;
         mult : out std_logic_vector(63 downto 0)
         );
end arith;

--}} End of automatically maintained section

architecture arch of arith is
signal state:natural:=0;
begin
    process(clk)
    begin
        if(rising_edge(clk)) then
            if(start='1') then
                if(state=0) then
                    state<=1;
                    ready<='0';
                end if;
            else
                if(state=1) then
                    mult<=std_logic_vector(unsigned(a)*unsigned(b));
                    state<=2;
                end if;

                if(state=2) then
                    ready<='1';
                    state<=0;
                end if;
            end if;
        end if;
    end process;
     -- enter your statements here --

end arch;

Maverick

Nov 15 2015, 18:29

Quote(enzaime @ Nov 15 2015, 19:38)

I bought a DE0-Nano board, which carries an EP4CE22F17C6 FPGA.
The documentation says that a single 18x18 multiplier can run at 287 MHz and a 9x9 multiplier at 340 MHz:
https://www.altera.com/content/dam/altera-w.../cyiv-53001.pdf, p. 26.

I want to use 32-bit multiplication, but at clock frequencies above 10 MHz it does not produce correct results. Is there any way to speed it up? 10 MHz seems too low; I was expecting something like 150 MHz.
This is how I used it (I checked the result on the PC: the value of the mult register is read when ready is asserted, and the multiplication is given exactly one clock cycle, so you can tell whether it completes at the given frequency or not):

For a quick solution, you can do it like this:

Code

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;

entity pipelined_multiplier is

    -- generic size is the width of multiplier/multiplicand;
    -- generic level is the intended number of stages of the
    -- pipelined multiplier;
    -- generic level is typically the smallest integer greater
    -- than or equal to base 2 logarithm of size, as returned by
    -- function log, which you define.
    generic (size : integer := 32; level : integer := 1); -- level : integer := log(size)

    port (
        a   : in  std_logic_vector (size-1 downto 0);
        b   : in  std_logic_vector (size-1 downto 0);
        clk : in  std_logic;
        pdt : out std_logic_vector (2*size-1 downto 0));
end pipelined_multiplier;

architecture exemplar of pipelined_multiplier is
    type levels_of_registers is array (level-1 downto 0) of
        unsigned (2*size-1 downto 0);
    signal a_int, b_int : unsigned (size-1 downto 0);
    signal pdt_int      : levels_of_registers;

begin
    pdt <= std_logic_vector (pdt_int (level-1));

    process(clk)
    begin
        if clk'event and clk = '1' then
            -- multiplier operand inputs are registered
            a_int <= unsigned (a);
            b_int <= unsigned (b);
            -- 'level' levels of registers to be inferred at the
            -- output of the multiplier
            pdt_int(0) <= a_int * b_int;
            for i in 1 to level-1 loop
                pdt_int (i) <= pdt_int (i-1);
            end loop;
        end if;
    end process;
end exemplar;

With level = 1 (the shallowest possible pipeline) this reaches a clock frequency of 346.5 MHz on a Stratix IV; on a Cyclone IV it will be lower...
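
To make the latency of the entity above concrete, here is a minimal simulation-only testbench sketch. The testbench name, the constants TB_SIZE and TB_LEVEL, the done flag and the 10 ns clock period are all assumptions made for illustration; they are not from the original post. Reading the process above, the operands are registered on the first clock edge, the product on the second, and every additional level adds one more edge, so the result should appear level + 1 edges after the operands are applied.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_pipelined_multiplier is
end tb_pipelined_multiplier;

architecture sim of tb_pipelined_multiplier is
    -- Hypothetical testbench parameters (not from the original post)
    constant TB_SIZE  : integer := 32;
    constant TB_LEVEL : integer := 2;

    signal clk  : std_logic := '0';
    signal done : boolean   := false;
    signal a, b : std_logic_vector(TB_SIZE-1 downto 0) := (others => '0');
    signal pdt  : std_logic_vector(2*TB_SIZE-1 downto 0);
begin
    -- 100 MHz simulation clock; stops toggling once the check has run
    clk <= not clk after 5 ns when not done else '0';

    dut : entity work.pipelined_multiplier
        generic map (size => TB_SIZE, level => TB_LEVEL)
        port map (a => a, b => b, clk => clk, pdt => pdt);

    stimulus : process
    begin
        a <= x"FFFFFFFF";
        b <= x"00010001";
        -- One edge for the input registers, one for pdt_int(0),
        -- and one more for each additional pipeline level.
        for i in 1 to TB_LEVEL + 1 loop
            wait until rising_edge(clk);
        end loop;
        wait for 1 ns;  -- let the registered output settle
        assert unsigned(pdt) = unsigned(a) * unsigned(b)
            report "product mismatch" severity failure;
        report "pipelined_multiplier: product matches";
        done <= true;
        wait;
    end process;
end sim;
```

A functional simulation like this only confirms the latency and the arithmetic; whether a given level meets timing on the EP4CE22F17C6 still has to be read from the timing analyzer report.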

bogaev_roman

Nov 16 2015, 09:04

Quote(enzaime @ Nov 15 2015, 20:38)

The documentation says that a single 18x18 multiplier can run at 287 MHz and a 9x9 multiplier at 340 MHz
I want to use 32-bit multiplication, but at clock frequencies above 10 MHz it does not produce correct results. Is there any way to speed it up? 10 MHz seems too low; I was expecting something like 150 MHz.

Everything written in Altera's documentation is accurate. The figures you quote are for the bare multiplier (the delay through the multiplier alone), whereas your code puts several layers of logic in the path. Judging by the 10 MHz maximum frequency, the delay (a very large one; I am curious how much logic is in there) builds up on one of the inputs. Post the TimeQuest report for the longest path: either there are several dozen sequential logic levels, or the elements are placed in different parts of the die.
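
One way to take the input path out of the picture is to capture the operands in registers before multiplying, so that the multiplier sits on a purely register-to-register path. The sketch below is only an illustration under assumptions: the entity name arith_registered, the three-stage split, and the handshake (start is a one-cycle pulse presented together with valid a and b) are hypothetical and differ from the state machine in the first post.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity arith_registered is
    port (
        clk   : in  std_logic;
        start : in  std_logic;  -- assumed: one-cycle pulse, a and b valid with it
        a     : in  std_logic_vector(31 downto 0);
        b     : in  std_logic_vector(31 downto 0);
        ready : out std_logic;
        mult  : out std_logic_vector(63 downto 0)
    );
end arith_registered;

architecture rtl of arith_registered is
    signal a_reg, b_reg : unsigned(31 downto 0) := (others => '0');
    signal prod_reg     : unsigned(63 downto 0) := (others => '0');
    signal valid        : std_logic_vector(2 downto 0) := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            -- Stage 1: capture the operands, cutting the external input path
            a_reg <= unsigned(a);
            b_reg <= unsigned(b);
            -- Stage 2: register-to-register multiply
            prod_reg <= a_reg * b_reg;
            -- Stage 3: output register
            mult <= std_logic_vector(prod_reg);
            -- Valid bit travels alongside the data through the three stages
            valid <= valid(1 downto 0) & start;
        end if;
    end process;

    -- High in the cycle in which mult holds the product for that start pulse
    ready <= valid(2);
end rtl;
```

With this structure the longest path reported by TimeQuest should be the one between a_reg/b_reg and prod_reg, which makes the report much easier to interpret than a path that starts at an unconstrained input.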

Maverick

Nov 16 2015, 09:35

Just in case, here is this as well:

Multiplication of Large Integers (Karatsuba Algorithm)

"The example design is a fully pipelined 64 x 64 bit multiply with a latency of 6. It uses
three 36 x 36 bit pipelined DSP block multipliers implemented in the sample file
mult_3tick.v. The adder/compressor logic occupies 431 combinational cells. The
pipeline registers are implemented in 520 cell registers and a small inferred
RAM-based shifter. You can disable the RAM inference with synthesis assignments or
a ”synthesis preserve” attribute. Operating frequency on a 2S15C3 (Stratix II) device is
approximately 265 MHz."

The example files are available on the Altera website at the following URL:
www.altera.com/literature/manual/cookbook.zip.

The examples are in Verilog.
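
For readers who prefer VHDL, the idea behind the Karatsuba trick mentioned above can be sketched as follows. Splitting each 64-bit operand into 32-bit halves, a = a1*2^32 + a0 and b = b1*2^32 + b0, the product is z2*2^64 + z1*2^32 + z0, where z2 = a1*b1, z0 = a0*b0 and z1 = (a1 + a0)*(b1 + b0) - z2 - z0, so three multiplications are needed instead of four. This is not the cookbook code: the entity name karatsuba64 and all signal names are hypothetical, the halves are 32 bits rather than the 36 x 36 blocks the quote mentions, and the sketch is purely combinational, so pipeline registers would have to be added to reach a useful clock frequency.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity karatsuba64 is
    port (
        a : in  std_logic_vector(63 downto 0);
        b : in  std_logic_vector(63 downto 0);
        p : out std_logic_vector(127 downto 0)
    );
end karatsuba64;

architecture comb of karatsuba64 is
    signal a1, a0, b1, b0 : unsigned(31 downto 0);
    signal sa, sb         : unsigned(32 downto 0);  -- half sums need 33 bits
    signal z2, z0         : unsigned(63 downto 0);
    signal zm, z1         : unsigned(65 downto 0);  -- 33 x 33 -> 66 bits
begin
    -- Split the operands into high and low halves
    a1 <= unsigned(a(63 downto 32));
    a0 <= unsigned(a(31 downto 0));
    b1 <= unsigned(b(63 downto 32));
    b0 <= unsigned(b(31 downto 0));

    -- Sums of the halves, widened by one bit to keep the carry
    sa <= resize(a1, 33) + resize(a0, 33);
    sb <= resize(b1, 33) + resize(b0, 33);

    -- The three multiplications
    z2 <= a1 * b1;
    z0 <= a0 * b0;
    zm <= sa * sb;

    -- Middle term: a1*b0 + a0*b1, never negative because zm >= z2 + z0
    z1 <= zm - resize(z2, 66) - resize(z0, 66);

    -- Recombine: z2 * 2^64 + z1 * 2^32 + z0
    p <= std_logic_vector(
             shift_left(resize(z2, 128), 64)
           + shift_left(resize(z1, 128), 32)
           + resize(z0, 128));
end comb;
```

The saving of one multiplier is paid for with extra adders and a wider (33 x 33) middle product, which is why the quoted design reports several hundred cells of adder/compressor logic around its three DSP multipliers.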