From 239106c2fd880e416ac0983cf492bf0af9077ca2 Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 18:03:16 +0200 Subject: [PATCH 1/8] Add nbsp before citations --- sections/soc/soc.tex | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/sections/soc/soc.tex b/sections/soc/soc.tex index 2a7b7f9..c4ba3fd 100644 --- a/sections/soc/soc.tex +++ b/sections/soc/soc.tex @@ -5,7 +5,7 @@ \part{Meta} \section{History} -The project started out with the desire to build a CPU from scratch. Examples such as The NAND Game\cite{nandgame} and Ben Eater's Breadboard Computer series\cite{breadboard_computer} served as inspirations and guidance during development. +The project started out with the desire to build a CPU from scratch. Examples such as The NAND Game~\cite{nandgame} and Ben Eater's Breadboard Computer series~\cite{breadboard_computer} served as inspirations and guidance during development. At first, a design similar to Ben Eater's consisting solely of discrete integrated circuits was considered, but soon discarded in favor of an FPGA-based design. Designing the logic alone was a difficult task, implementing it in discrete hardware would have pushed the project far over the allotted maximum development time. @@ -52,16 +52,16 @@ Vendor tools are usually free-of-charge for basic usage, but this also means the \subsection{Free Software Tools} -A somewhat recent development is the creation of Free Software\footnotemark{} FPGA toolchains. A breakthrough was achieved by Claire (formerly Clifford) Wolf in 2013 with yosys\cite{yosys-paper, yosys}, a feature-complete Verilog synthesis suite for Lattice's \texttt{iCE40} FPGA series. +A somewhat recent development is the creation of Free Software\footnotemark{} FPGA toolchains. A breakthrough was achieved by Claire (formerly Clifford) Wolf in 2013 with yosys~\cite{yosys-paper, yosys}, a feature-complete Verilog synthesis suite for Lattice's \texttt{iCE40} FPGA series. \footnotetext{``Free Software'' refers to software that grants its user the freedom to share, study and modify it - see \url{https://www.fsf.org/about/what-is-free-software}.} -Since then, both yosys and place-and-route tools like nextpnr\cite{nextpnr} have matured, however Lattice's iCE40 and ECP5 remained the only supported FPGA architectures for place-and-route. +Since then, both yosys and place-and-route tools like nextpnr~\cite{nextpnr} have matured, however Lattice's iCE40 and ECP5 remained the only supported FPGA architectures for place-and-route. -Thus, two obstacles remained for Free toolchains to be viable for this project: synthesizing \emph{from} VHDL code and synthesizing \emph{to} Artix-7 FPGAs. During the development of the project, both of these were solved: Tristan Gingold released ghdlsynth-beta\parencite*{ghdlsynth-beta}, a bridge between GHDL\cite{ghdl} and yosys allowing VHDL to be synthesized just the same as Verilog, and Dave Shah added Xilinx support to nextpnr\cite{nextpnr-xilinx}. The latter was preceded by many months of volunteer work reverse-engineering the Xilinx bitstream format as part of \textit{Project X-Ray}\parencite*{prjxray}. +Thus, two obstacles remained for Free toolchains to be viable for this project: synthesizing \emph{from} VHDL code and synthesizing \emph{to} Artix-7 FPGAs. During the development of the project, both of these were solved: Tristan Gingold released ghdlsynth-beta~\cite{ghdlsynth-beta}, a bridge between GHDL~\cite{ghdl} and yosys allowing VHDL to be synthesized just the same as Verilog, and Dave Shah added Xilinx support to nextpnr~\cite{nextpnr-xilinx}. The latter was preceded by many months of volunteer work reverse-engineering the Xilinx bitstream format as part of \textit{Project X-Ray}~\cite{prjxray}. With these two pieces in place, the project was switched over to a completely Free toolchain, removing any depencies on vendor tools: \begin{itemize} - \item yosys, with ghdl as a frontend for processing VHDL, is used to synthesize the design + \item yosys, with ghdl as a frontend for processing VHDL and ghdlsynth as a bridge between them, is used to synthesize the design \item nextpnr-xilinx, together with the Project X-Ray database, is used for place-and-route \item tools from Project X-Ray are used to convert the routed design to a bitstream \item openFPGALoader is used to transfer the bitstream to the FPGA via JTAG @@ -114,7 +114,7 @@ DVI and HDMI are serial digital transmission standards. Three data lines (corres \subsection{Ethernet} -The Arty development board contains an RJ-45 Ethernet jack connected to an Ethernet PHY, which exposes a standardized media-independent interface (MII) to the FPGA. The LiteEth core\cite{liteeth}, which is released under a Free Software license, is used to integrate the Ethernet interface into the SoC. +The Arty development board contains an RJ-45 Ethernet jack connected to an Ethernet PHY, which exposes a standardized media-independent interface (MII) to the FPGA. The LiteEth core~\cite{liteeth}, which is released under a Free Software license, is used to integrate the Ethernet interface into the SoC. \subsection{WS2812 driver} @@ -153,7 +153,7 @@ Due to a mistake in the adapter board layout, the nibbles of the address and dat \subsection{RISC-V Compliance Tests} -The RISC-V Compliance Test Suite\cite{riscv-compliance} can be used to empirically confirm the correct functionality of a RISC-V processor. It consists of a series of programs that perform some operations related to a specific feature, then write some result data to a memory region. This memory region is then compared to a ``golden signature'', which was produced by a processor implementation that is known to be correct. +The RISC-V Compliance Test Suite~\cite{riscv-compliance} can be used to empirically confirm the correct functionality of a RISC-V processor. It consists of a series of programs that perform some operations related to a specific feature, then write some result data to a memory region. This memory region is then compared to a ``golden signature'', which was produced by a processor implementation that is known to be correct. The initial implementation of the compliance tests uncovered several bugs in the processor core: From 89f0a1565e48e567d4476e46b89dc419c71c1ecd Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 18:04:34 +0200 Subject: [PATCH 2/8] Improve FPGA comparison tables --- preamble.tex | 2 ++ sections/soc/soc.tex | 22 ++++++++++++++++++---- 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/preamble.tex b/preamble.tex index ef3dcdd..b6379d9 100644 --- a/preamble.tex +++ b/preamble.tex @@ -338,3 +338,5 @@ minimum height=1cm, align=center, text width=3cm, draw=black, fill=blue!30] bottom=0pt } \newcommand{\icode}[1]{\codeBox{\texttt{#1}}} + +\usepackage{booktabs} diff --git a/sections/soc/soc.tex b/sections/soc/soc.tex index c4ba3fd..9d58c79 100644 --- a/sections/soc/soc.tex +++ b/sections/soc/soc.tex @@ -15,28 +15,42 @@ As a starting point, a Terasic DE0 development board\footnote{\url{https://www.t The only method of synthesis for Altera devices is to use the proprietary Quartus IDE. However, the last version of Quartus to support the Cyclone III series of FPGAs (version 13.1) had already been out of date for several years at the start of the project. Because of this and the increasing resource demand of the developing core, an Arty A7-35T development board\footnote{\url{https://store.digilentinc.com/arty-a7-artix-7-fpga-development-board-for-makers-and-hobbyists/}} with a Xilinx Artix-7\footnote{\url{https://www.xilinx.com/products/silicon-devices/fpga/artix-7.html}} FPGA was ordered from Digilent. -The two FPGAs compare as follows: +A comparison between the two FPGAs themselves can be seen in \autoref{tab:fpga-comparison}, a comparison between the peripherals on the development boards in \autoref{tab:devboard-comparison}. -\begin{tabular}{l r r} +\begin{table}[h] +\centering +\begin{tabular}{l|r|r} +\toprule & Altera EP3C16 & Xilinx XC7A35T \\ +\midrule Logic Elements & 15000 & 33280 \\ Multipliers & 56 & 90 \\ Block RAM (kb) & 504 & 1800 \\ PLLs & 4 & 5 \\ Global clocks & 20 & 32 \\ +\bottomrule \end{tabular} +\caption{Comparison between Altera and Xilinx FPGAs} +\label{tab:fpga-comparison} +\end{table} -The periphery on the development boards: - +\begin{table}[h] +\centering \begin{tabular}{l|r|r} +\toprule & Terasic DE0 & Digilent Arty A7-35T \\ +\midrule Switches & 10 & 4 \\ Buttons & 3 & 4 \\ LEDs & 10 + 4x 7-segment & 4 + 3 RGB \\ GPIOs & 2x 36 & 4x PMOD + chipKIT \\ Memory & 8MB SDRAM & 256MB DDR3L \\ Others & SD card, VGA & Ethernet \\ +\bottomrule \end{tabular} +\caption{Comparison between the peripherals on Terasic and Digilent FPGA development boards} +\label{tab:devboard-comparison} +\end{table} While the Digilent board offers fewer IO options, the DDR3 memory can be interfaced using Free memory cores and allows for much larger programs to be loaded, possibly even a full operating system. The missing VGA port has been substituted by a HDMI-compatible DVI interface that is accessible through one of the high-speed PMOD connectors. From 40bc2327fdc5a76c4033f6c5f5d36cdd77566535 Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 18:05:05 +0200 Subject: [PATCH 3/8] Add information about DRAM interface --- Diplomschrift.bib | 6 ++++++ sections/soc/soc.tex | 4 +++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/Diplomschrift.bib b/Diplomschrift.bib index d5cc914..3746535 100644 --- a/Diplomschrift.bib +++ b/Diplomschrift.bib @@ -60,6 +60,12 @@ url = {https://github.com/enjoy-digital/liteeth} } +@software{litedram, + author = {Florent Kermarrec}, + title = {LiteDRAM}, + url = {https://github.com/enjoy-digital/litedram} +} + @software{open-fpga-loader, author = {Gwenhael Goavec-Merou}, title = {openFPGALoader}, diff --git a/sections/soc/soc.tex b/sections/soc/soc.tex index 9d58c79..c8d8505 100644 --- a/sections/soc/soc.tex +++ b/sections/soc/soc.tex @@ -155,7 +155,9 @@ The exact timing differs between models, so all periods can be customized using \subsection{DRAM} -% TODO +The Arty A7 development board contains a 256MB DDR3 memory module. Since the FPGA only contains about 1.8MB of block RAM, of which some is already reserved for various hardware functions (e.g. the text buffer and WS2812 driver), the external memory is absolutely necessary to run larger programs. + +Interfacing with DDR3 memory is notoriously difficult, requiring complex logic on both physical and logical layers. For this reason, the Free Software LiteDRAM core~\cite{litedram} is used to integrate the entire memory interface into the SoC. While irrelevant to the SoC, it can still be considered a slight oddity the LiteDRAM core actually contains an entire separate RISC-V core to coordinate initialization of the memory. \subsection{External Bus} From e891568008b435f7d7cdf3f786cb817b89d55971 Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 18:05:21 +0200 Subject: [PATCH 4/8] Small fixes --- sections/soc/soc.tex | 2 +- sections/vhdl_intro/vhdl_intro.tex | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/sections/soc/soc.tex b/sections/soc/soc.tex index c8d8505..3f9b129 100644 --- a/sections/soc/soc.tex +++ b/sections/soc/soc.tex @@ -149,7 +149,7 @@ The driver is designed to be attached to external circuitry that provides color The LEDs are controlled using a simple one-wire serial protocol. After a reset (long period of logic 0), the data for all LEDs is transmitted serially in one single blob. Each LED consumes and stores the first 24 bits of the stream and applies them as its color value (8 bits each for red, green, blue), all following bits are passed through unmodified. The second LED thus uses the first 24 bits of the stream it receives, but since the first LED already dropped its data, these are actually the second set of 24 bits of the source data. -Every bit is encoded as a period of logic 1, followed by a period of logic 0. The timing of these sections determines the value, see \ref{fig:ws2812_timing}. +Every bit is encoded as a period of logic 1, followed by a period of logic 0. The timing of these sections determines the value, see \autoref{fig:ws2812_timing}. The exact timing differs between models, so all periods can be customized using generics in the VHDL entity. diff --git a/sections/vhdl_intro/vhdl_intro.tex b/sections/vhdl_intro/vhdl_intro.tex index f185710..f202cb1 100644 --- a/sections/vhdl_intro/vhdl_intro.tex +++ b/sections/vhdl_intro/vhdl_intro.tex @@ -59,7 +59,7 @@ gtkwave counter_tb.ghw counter_tb.gtkw \begin{figure} \includegraphics[width=\textwidth]{counter_gtkwave.png} -\caption{Screenshot of the resulting waveform in GTKWave} +\caption{Screenshot of the counter test bench waveform in GTKWave} \end{figure} \section{Synthesizing a design} From 3b65229eaef19e0a309b515323facea619461f85 Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 19:03:49 +0200 Subject: [PATCH 5/8] Add information about formal verification --- Diplomschrift.bib | 6 +++ sections/core/core.tex | 1 + sections/soc/alu_formal.vhd | 81 +++++++++++++++++++++++++++++++++++++ sections/soc/soc.tex | 18 +++++++++ 4 files changed, 106 insertions(+) create mode 100644 sections/soc/alu_formal.vhd diff --git a/Diplomschrift.bib b/Diplomschrift.bib index 3746535..360f2a9 100644 --- a/Diplomschrift.bib +++ b/Diplomschrift.bib @@ -83,3 +83,9 @@ title = {RISC-V Compliance Task Group}, url = {https://github.com/riscv/riscv-compliance}, } + +@online{symbiyosys-slides, + author = {Clifford Wolf}, + title = {Formal Verification withSymbiYosys and Yosys-SMTBMC}, + url = {http://www.clifford.at/papers/2017/smtbmc-sby/slides.pdf}, +} diff --git a/sections/core/core.tex b/sections/core/core.tex index 3fc3a7e..b710ae6 100644 --- a/sections/core/core.tex +++ b/sections/core/core.tex @@ -50,6 +50,7 @@ The decoder receives an instruction and interprets it. Among others, it determin The registers store the 32 general-purpose values required by \instrset{} (each 32-bit wide). They are accessible through two read ports and one write port. As specified by the RISC-V standard, the first register (\icode{x0}) is hard-wired to 0, and any writes to it are ignored. \section{Arithmetic and Logic Unit (ALU)} +\label{sec:core-alu} \entityheader{alu} diff --git a/sections/soc/alu_formal.vhd b/sections/soc/alu_formal.vhd new file mode 100644 index 0000000..db99aa5 --- /dev/null +++ b/sections/soc/alu_formal.vhd @@ -0,0 +1,81 @@ +formal: block + signal prev_a : MATH_WORD; + signal prev_b : MATH_WORD; + signal prev_operation : alu_operation_t; + + signal expected_result : MATH_WORD; + + signal has_run : std_logic := '0'; + + signal prev_enable_math : std_logic; +begin + default clock is rising_edge(clk); + + process(clk) + begin + if rising_edge(clk) then + prev_a <= a; + prev_b <= b; + prev_operation <= operation; + prev_enable_math <= enable_math; + end if; + end process; + + -- assume inputs won't change while calculation is ongoing + assume always not valid -> ( + a = prev_a and + b = prev_b and + operation = prev_operation + ); + + -- assume the "run" input is active at least until the result is valid + assume always prev_enable_math -> (enable_math or valid); + + process(clk) + begin + if rising_edge(clk) and enable_math = '1' and valid = '1' then + has_run <= '1'; + + expected_result <= (others => '0'); + case operation is + when ALU_AND => + expected_result <= a and b; + when ALU_OR => + expected_result <= a or b; + when ALU_XOR => + expected_result <= a xor b; + when ALU_SHIFTL => + expected_result <= std_logic_vector(shift_left( + unsigned(a), + to_integer(unsigned(b(4 downto 0))) + )); + when ALU_SHIFTR_L => + expected_result <= std_logic_vector(shift_right( + unsigned(a), + to_integer(unsigned(b(4 downto 0))) + )); + when ALU_SHIFTR_A => + expected_result <= std_logic_vector(shift_right( + signed(a), + to_integer(unsigned(b(4 downto 0))) + )); + when ALU_ADD => + expected_result <= std_logic_vector(signed(a) + signed(b)); + when ALU_SUB => + expected_result <= std_logic_vector(signed(a) - signed(b)); + end case; + end if; + end process; + + -- When a result has been computed, it must be correct + assert always valid and has_run -> math_result = expected_result; + -- Eventually, a result will always be available + assert always enable_math -> eventually! valid; + + -- Hints for induction + assert always not valid -> enable_math; + assert always not valid and ( + operation = ALU_SHIFTL or + operation = ALU_SHIFTR_L or + operation = ALU_SHIFTR_A) -> not (or current_b(current_b'left downto 5)); +end block; diff --git a/sections/soc/soc.tex b/sections/soc/soc.tex index 3f9b129..b563bcc 100644 --- a/sections/soc/soc.tex +++ b/sections/soc/soc.tex @@ -181,4 +181,22 @@ The initial implementation of the compliance tests uncovered several bugs in the \end{itemize} Since these tests are easily automated, they were added to the GitLab Continuous Integration (CI) configuration. Whenever a new git commit is pushed to GitLab, the tests are run automatically, and any failures are reported to the responsible committer via email. + +\subsection{Formal Verification} + +While carefully selected simulation is useful to uncover bugs and to ensure they can't happen again (regression testing), it never offers complete certainty - it is simply impossible to manually cover all possibilities of inputs. With formal verification, the circuit under test is expressed using a mathematical model and an algorithm (a \emph{SAT solver}) ensures that certain manually-selected criteria are always fulfilled. A detailed explanation of the algorithm can be found in~\cite{symbiyosys-slides}. + +As an example of formal verification, the core's ALU (\autoref{sec:core-alu}) has been extended with a formal verification definition, which can be seen in \autoref{lst:alu-formal}. Skipping over some helper logic in the beginning, the first statements add assumptions about the entity's input signals. These are rules that must be obeyed by designs using the component, otherwise the correct function cannot be guaranteed (and is indeed unproven). + +Below these assumptions, a process is used to calculate the expected result whenever a calculation is requested. While most operations are implemented the same as in the main entity and thus have little value as a known-good comparison value, the bit-shift operations are implemented incrementally in the main ALU and directly in the verification; thus, the less resource-intensive ALU implementation can be confirmed to function exactly like this more expensive method. + +Afterwards, the first assertions actually happen: these are the theorems that the formal verification suite will prove to be correct. The only two proven statements that are actually relevant to users are that when a result has been computed, it will equal the value computed using the aformentioned process, and that a computation will always finish eventually. + +Finally, two more assertions are used to give hints to the formal verification algorithm, specifically the induction step. It is sometimes very difficult or even impossible to arrive at a successful induction; these assertions can be proven trivially, eliminating a number of potential scenarios that would otherwise make a successful complete proof impossible. + +\lstinputlisting[ + style=vhdlstyle, + label={lst:alu-formal}, + caption={Formal verification block for ALU}, +]{alu_formal.vhd} \end{document} From 254a064680ef51bcdd29cfec549ec12227e2d6b6 Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 21:57:24 +0200 Subject: [PATCH 6/8] Check all citation links and add urldate= --- Diplomschrift.bib | 31 +++++++++++++++++++++++-------- 1 file changed, 23 insertions(+), 8 deletions(-) diff --git a/Diplomschrift.bib b/Diplomschrift.bib index 360f2a9..ad8538d 100644 --- a/Diplomschrift.bib +++ b/Diplomschrift.bib @@ -2,12 +2,14 @@ author = {Olav Junker Kjær}, title = {The Nand Game}, url = {http://nandgame.com}, + urldate = {2020-03-29}, } @online{breadboard_computer, author = {Ben Eater}, title = {Building an 8-bit breadboard computer!}, url = {https://www.youtube.com/playlist?list=PLowKtXNTBypGqImE405J2565dvjafglHU}, + urldate = {2020-03-29}, year = {2016}, } @@ -15,6 +17,7 @@ author = {Clifford Wolf, Johann Glaser}, title = {Yosys - A Free Verilog Synthesis Suite}, url = {http://www.clifford.at/yosys/files/yosys-austrochip2013.pdf}, + urldate = {2020-03-29}, year = {2013}, } @@ -22,70 +25,82 @@ author = {Various Contributors}, title = {Yosys - Yosys Open SYnthesis Suite}, url = {https://github.com/YosysHQ/yosys}, + urldate = {2020-03-29}, } @software{nextpnr, author = {Various Contributors}, title = {nextpnr - a portable FPGA place and route tool}, url = {https://github.com/YosysHQ/nextpnr}, + urldate = {2020-03-29}, } @software{nextpnr-xilinx, author = {David Shah}, title = {nextpnr-xilinx}, url = {https://github.com/daveshah1/nextpnr-xilinx}, + urldate = {2020-03-29}, } @online{prjxray, author = {SymbiFlow}, title = {Project X-Ray}, url = {https://github.com/SymbiFlow/prjxray}, + urldate = {2020-03-29}, } @software{ghdlsynth-beta, author = {Tristan Gingold}, title = {ghdlsynth-beta}, url = {https://github.com/tgingold/ghdlsynth-beta}, + urldate = {2020-03-29}, } @software{ghdl, author = {Tristan Gingold}, title = {ghdl}, url = {https://github.com/ghdl/ghdl}, + urldate = {2020-03-29}, } @software{liteeth, author = {Florent Kermarrec}, title = {LiteEth}, - url = {https://github.com/enjoy-digital/liteeth} + url = {https://github.com/enjoy-digital/liteeth}, + urldate = {2020-03-29}, } @software{litedram, author = {Florent Kermarrec}, title = {LiteDRAM}, - url = {https://github.com/enjoy-digital/litedram} + url = {https://github.com/enjoy-digital/litedram}, + urldate = {2020-03-29}, } @software{open-fpga-loader, - author = {Gwenhael Goavec-Merou}, - title = {openFPGALoader}, - url = {https://github.com/trabucayre/openFPGALoader}, + author = {Gwenhael Goavec-Merou}, + title = {openFPGALoader}, + url = {https://github.com/trabucayre/openFPGALoader}, + urldate = {2020-03-29}, } @software{gtkwave, - author = {Tony Bybell}, - title = {GTKWave}, - url = {http://gtkwave.sourceforge.net}, + author = {Tony Bybell}, + title = {GTKWave}, + url = {http://gtkwave.sourceforge.net}, + urldate = {2020-03-29}, } @online{riscv-compliance, author = {Jeremy Bennett, Lee Moore}, title = {RISC-V Compliance Task Group}, url = {https://github.com/riscv/riscv-compliance}, + urldate = {2020-03-29}, } @online{symbiyosys-slides, author = {Clifford Wolf}, title = {Formal Verification withSymbiYosys and Yosys-SMTBMC}, url = {http://www.clifford.at/papers/2017/smtbmc-sby/slides.pdf}, + urldate = {2020-03-29}, } From ffb1fe406088ab58e4b7f18c6217614f7857b3b7 Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 21:58:14 +0200 Subject: [PATCH 7/8] Add citations for GitLab CI and RISC-V spec --- Diplomschrift.bib | 14 ++++++++++++++ sections/core/core.tex | 2 +- sections/soc/soc.tex | 2 +- 3 files changed, 16 insertions(+), 2 deletions(-) diff --git a/Diplomschrift.bib b/Diplomschrift.bib index ad8538d..28683e8 100644 --- a/Diplomschrift.bib +++ b/Diplomschrift.bib @@ -98,9 +98,23 @@ urldate = {2020-03-29}, } +@online{gitlab-ci, + title = {GitLab CI/CD}, + url = {https://docs.gitlab.com/ee/ci/}, + urldate = {2020-03-29}, +} + @online{symbiyosys-slides, author = {Clifford Wolf}, title = {Formal Verification withSymbiYosys and Yosys-SMTBMC}, url = {http://www.clifford.at/papers/2017/smtbmc-sby/slides.pdf}, urldate = {2020-03-29}, } + +@reference{riscv-spec-unprivileged, + authors = {Andrew Waterman, Krste Asanović}, + title = {The RISC-V Instruction Set Manual - Volume I: Unprivileged ISA}, + year = {2019}, + url = {https://content.riscv.org/wp-content/uploads/2019/12/riscv-spec-20191213.pdf}, + urldate = {2020-03-29}, +} diff --git a/sections/core/core.tex b/sections/core/core.tex index b710ae6..e19768c 100644 --- a/sections/core/core.tex +++ b/sections/core/core.tex @@ -4,7 +4,7 @@ \part{The Core} -The core implements the \instrset{} architecture as specified by the RISC-V standard. +The core implements the \instrset{} architecture as specified by the RISC-V standard~\cite{riscv-spec-unprivileged}. It is constructed according to the traditional RISC pipeline: diff --git a/sections/soc/soc.tex b/sections/soc/soc.tex index b563bcc..496b1fa 100644 --- a/sections/soc/soc.tex +++ b/sections/soc/soc.tex @@ -180,7 +180,7 @@ The initial implementation of the compliance tests uncovered several bugs in the \item The Instruction Set Manual specifies exceptions that must be raised when a misaligned memory access occurs. These exceptions were not yet implemented, but since the compliance tests check for them, the functionality was added to make the tests pass. \end{itemize} -Since these tests are easily automated, they were added to the GitLab Continuous Integration (CI) configuration. Whenever a new git commit is pushed to GitLab, the tests are run automatically, and any failures are reported to the responsible committer via email. +Since these tests are easily automated, they were added to the GitLab Continuous Integration (CI)~\cite{gitlab-ci} configuration. Whenever a new Git commit is pushed to GitLab, the tests are run automatically, and any failures are reported to the responsible committer via email. \subsection{Formal Verification} From 3dc6bbf390b4f0176e955bd51ecf9587f47473fc Mon Sep 17 00:00:00 2001 From: Xiretza Date: Sun, 29 Mar 2020 22:01:05 +0200 Subject: [PATCH 8/8] Misc fixes and improvements --- sections/soc/soc.tex | 31 +++++++++++++++++++++---------- 1 file changed, 21 insertions(+), 10 deletions(-) diff --git a/sections/soc/soc.tex b/sections/soc/soc.tex index 496b1fa..41ad5ab 100644 --- a/sections/soc/soc.tex +++ b/sections/soc/soc.tex @@ -52,7 +52,7 @@ Others & SD card, VGA & Ethernet \\ \label{tab:devboard-comparison} \end{table} -While the Digilent board offers fewer IO options, the DDR3 memory can be interfaced using Free memory cores and allows for much larger programs to be loaded, possibly even a full operating system. The missing VGA port has been substituted by a HDMI-compatible DVI interface that is accessible through one of the high-speed PMOD connectors. +While the Digilent board offers fewer IO options, the DDR3 memory can be interfaced using Free memory cores and allows for much larger programs to be loaded, possibly even a full operating system. The missing VGA port has been substituted by an HDMI-compatible DVI interface that is accessible through one of the high-speed PMOD connectors. \section{Tooling} @@ -83,33 +83,43 @@ With these two pieces in place, the project was switched over to a completely Fr \section{Peripherals} +The complete FPGA design consists not only of the CPU core, but a number of components that allow it to operate and communicate with the outside environment. They are connected using a shared 32-bit bus. + \subsection{UART} % TODO \subsection{DVI graphics} -The graphics submodule consists of a VGA timing generator, a text renderer with a font ROM, and a DVI encoder frontend: +As can be seen in \autoref{fig:video-core-diagram}, the graphics module consists of several subcomponents: + +\begin{itemize} + \item The VGA timing generator creates the impulses and counters necessary to drive a VGA-, DVI- or HDMI-based display + \item The text renderer draws text characters onto the screen using a built-in font ROM + \item The TMDS encoder frontend converts the internal parallel signals into a set of high-speed serial streams necessary for DVI or HDMI. +\end{itemize} \begin{figure}[h] \includegraphics[width=\textwidth]{graphics.png} \caption{Block diagram of the video core} +\label{fig:video-core-diagram} \end{figure} \subsubsection{VGA timing} The timing of VGA signals dates back to analog monitors. Even though this original purpose is only very rarely used nowadays, the timing remained the same for analog and digital DVI all the way to modern HDMI. -In analog screens, the electron beams (one for each primary color red, green and blue) scan across the screen a single horizontal line at a time while being modulated by the color values, resulting in a continuous mixture of all three components. When a beam reaches the end of a scanline, it continues outside the visible area for a small distance (the ``Front Porch''), is then sent to the beginning of the next line by a pulse of the hsync (Horizontal Sync) signal, and draws the next line after another short off-screen period (the ``Back Porch''). +In analog screens, the electron beams (one for each primary colour red, green and blue) scan across the screen a single horizontal line at a time while being modulated by the colour values, resulting in a continuous mixture of all three components. When a beam reaches the end of a scanline, it continues outside the visible area for a small distance (the ``Front Porch''), is then sent to the beginning of the next line by a pulse of the hsync (Horizontal Sync) signal, and draws the next line after another short off-screen period (the ``Back Porch''). The same applies to vertical timings: after the beam reaches the end of the last line, a few off-screen Front Porch lines follow, then a pulse of the vsync (Vertical Sync) signal sends the beam to the top of the screen, where the first line of the next frame is drawn after several invisible Back Porch lines. \begin{figure}[h] \includegraphics[width=\textwidth]{vga_timing.png} \caption{Diagram of VGA timing intervals} +\label{fig:vga-timing} \end{figure} -The VGA timing module generates these hsync and vsync signals, along with a blanking signal (active during any front porch, sync and back porch) and, while in the visible area (i.e. not blanking), the row and column of the current pixel relative to the visible area. +The VGA timing module generates these hsync and vsync signals as visualized in \autoref{fig:vga-timing}, along with a blanking signal (active during any front porch, sync and back porch) and, while in the visible area (i.e. not blanking), the row and column of the current pixel relative to the visible area. \subsubsection{Text renderer} @@ -118,17 +128,18 @@ The text renderer converts a logical representation of a character, such as its \begin{figure}[h] \includegraphics[width=0.7\textwidth]{text_renderer.png} \caption{Block diagram of the text renderer} +\label{fig:text-renderer-diagram} \end{figure} -First, the current pixel coordinate (created by the VGA timing generator) is split up into two parts: the character index, which specifies the on-screen character the pixel belongs to, and the offset of the pixel in this character. The character index is passed to the text RAM, which contains the codepoint for each on-screen character. This codepoint, along with the pixel offset, is looked up in the font ROM to determine the color of the pixel. +As can be seen in \autoref{fig:text-renderer-diagram}, the current pixel coordinate (created by the VGA timing generator) is split up into two parts: the character index, which specifies the on-screen character the pixel belongs to, and the offset of the pixel within this character. The character index is passed to the text RAM, which contains the codepoint for each on-screen character. This codepoint, along with the pixel offset, is looked up in the font ROM to determine the colour of the pixel. \subsubsection{TMDS encoder} -DVI and HDMI are serial digital transmission standards. Three data lines (corresponding to red, green, and blue channels) along with a clock line transmit all color information as well as synchronization signals. The encoding used for these signals is Transition-minimized differential signaling (TMDS). It is a kind of 8b/10b encoding (transforming every 8-bit chunk of data into a 10-bit chunk) that is designed to minimize the number of changes of the output signal. +DVI and HDMI are serial digital transmission standards. Three data lines (corresponding to red, green, and blue channels) along with a clock line transmit all colour information as well as synchronization signals. The encoding used for these signals is Transition-Minimized Differential Signaling (TMDS). It is a kind of 8b/10b encoding (transforming every 8-bit chunk of data into a 10-bit chunk) that is designed to minimize the number of changes of the output signal. \subsection{Ethernet} -The Arty development board contains an RJ-45 Ethernet jack connected to an Ethernet PHY, which exposes a standardized media-independent interface (MII) to the FPGA. The LiteEth core~\cite{liteeth}, which is released under a Free Software license, is used to integrate the Ethernet interface into the SoC. +The Arty development board contains an RJ-45 Ethernet jack connected to an Ethernet PHY. The PHY handles the physical connection to an copper twisted pair Ethernet network (Layer 1 of the OSI model) and exposes a standardized media-independent interface (MII) to the FPGA. The LiteEth core~\cite{liteeth}, which is released under a Free Software license, is used to integrate the Ethernet interface into the SoC. \subsection{WS2812 driver} @@ -145,9 +156,9 @@ A hardware driver for WS2812 serially-addressable RGB LEDs is also included in t \label{fig:ws2812_timing} \end{figure} -The driver is designed to be attached to external circuitry that provides color data for any given LED index (address). This can either be discrete logic that generates the color value from the address directly, or a memory that stores a separate color value for each address. +The driver is designed to be attached to external circuitry that provides colour data for any given LED index (address). This can either be discrete logic that generates the colour value from the address directly, or a memory that stores a separate colour value for each address. -The LEDs are controlled using a simple one-wire serial protocol. After a reset (long period of logic 0), the data for all LEDs is transmitted serially in one single blob. Each LED consumes and stores the first 24 bits of the stream and applies them as its color value (8 bits each for red, green, blue), all following bits are passed through unmodified. The second LED thus uses the first 24 bits of the stream it receives, but since the first LED already dropped its data, these are actually the second set of 24 bits of the source data. +The LEDs are controlled using a simple one-wire serial protocol. After a reset (long period of logic 0), the data for all LEDs is transmitted serially in one single blob. Each LED consumes and stores the first 24 bits of the stream and applies them as its colour value (8 bits each for red, green, blue), all following bits are passed through unmodified. The second LED thus uses the first 24 bits of the stream it receives, but since the first LED already dropped its data, these are actually the second set of 24 bits of the source data. Every bit is encoded as a period of logic 1, followed by a period of logic 0. The timing of these sections determines the value, see \autoref{fig:ws2812_timing}. @@ -197,6 +208,6 @@ Finally, two more assertions are used to give hints to the formal verification a \lstinputlisting[ style=vhdlstyle, label={lst:alu-formal}, - caption={Formal verification block for ALU}, + caption={Formal verification block for the ALU}, ]{alu_formal.vhd} \end{document}