Compare commits

...

9 commits

8 changed files with 99 additions and 34 deletions

View file

@ -91,6 +91,15 @@
urldate = {2020-03-29},
}
@inbook{vhdl-types,
author = {Klaus Fricke},
title = {Digitaltechnik - Lehr- und Übungsbuch für Elektrotechniker und Informatiker},
publisher = {Springer Vieweg},
year = {2013},
doi = {10.1007/978-3-8348-2213-0},
chapter = {15.3},
}
@online{riscv-compliance,
author = {Jeremy Bennett and Lee Moore},
title = {RISC-V Compliance Task Group},

View file

@ -127,7 +127,7 @@ geschlechtsunabh"angig verstanden werden soll.
\clearpage
%\MR\input{sections/Kapitel/MR/EntwicklungAufgaben.tex}
\subfile{sections/vhdl_intro/vhdl_intro.tex}
\part{FPGA-based System on Chip (SoC)}
\subfile{sections/soc/soc.tex}
\subfile{sections/core/core.tex}
@ -200,10 +200,14 @@ geschlechtsunabh"angig verstanden werden soll.
%\subsection{Projektterminplanung}
%\MR\input{sections/Anhang/Projektterminplanung/projektterminplanungMR.tex}
\clearpage
%\subsection{Arbeitsnachweis Diplomarbeit}
%\MR\input{sections/Anhang/Arbeitsnachweis/arbeitsnachweisMR.tex}
\begin{appendices}
\subfile{sections/vhdl_intro/vhdl_intro.tex}
\end{appendices}
\clearpage
\label{LastPage}
%\addtocontents{toc}{\protect\end{multicols}}
\end{document}

View file

@ -340,3 +340,4 @@ minimum height=1cm, align=center, text width=3cm, draw=black, fill=blue!30]
\newcommand{\icode}[1]{\codeBox{\texttt{#1}}}
\usepackage{booktabs}
\usepackage[toc,page]{appendix}

View file

@ -2,11 +2,17 @@
\begin{document}
\part{The Core}
\section{The Core}
The core implements the \instrset{} architecture as specified by the RISC-V standard~\cite{riscv-spec-unprivileged}.
It is constructed according to the traditional RISC pipeline:
\begin{figure}[h]
\includegraphics[width=\textwidth]{core_diagram.png}
\caption{Block diagram of the CPU core}
\label{fig:core-diagram}
\end{figure}
As shown in Figure~\ref{fig:core-diagram}, it is constructed according to the traditional stages of a RISC pipeline:
\begin{description}
\item[Fetch] fetches the next instruction from memory.
@ -16,21 +22,13 @@ It is constructed according to the traditional RISC pipeline:
\item[Writeback] stores a potential result value from Execute or Memory stages to the destination register.
\end{description}
\section{Overview}
\begin{figure}
%\includegraphics[width=\textwidth]{core_diagram.png}
% TODO
\caption{Block diagram of the CPU core}
\end{figure}
\section{Control}
\subsection{Control}
\entityheader{control}
The control unit is responsible for coordinating subcomponents and the data flow between them. Internally, it is based on \icode{instruction\_info\_t} structures, which contain all the information required to pass an instruction along the different pipeline stages. Before the fetch stage, when an instruction is first scheduled, it contains only the instruction's address (because nothing else is known about it). Then, information is added incrementally by the different stages.
\section{Decoder}
\subsection{Decoder}
\entityheader{decoder}
@ -43,32 +41,32 @@ The decoder receives an instruction and interprets it. Among others, it determin
\item Whether the instruction should branch, and if so, under what condition
\end{itemize}
\section{Registers}
\subsection{Registers}
\entityheader{registers}
The register file stores the 32 general-purpose values required by \instrset{} (each 32 bits wide). They are accessible through two read ports and one write port. As specified by the RISC-V standard, the first register (\icode{x0}) is hard-wired to 0, and any writes to it are ignored.
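The behaviour of \icode{x0} can be illustrated with a small software model. This is only a sketch in C, not the VHDL entity itself; the names \icode{regfile\_t}, \icode{regfile\_read} and \icode{regfile\_write} are hypothetical and chosen for illustration.

```c
#include <stdint.h>

#define NUM_REGS 32u

/* Hypothetical software model of the register file (not the VHDL entity). */
typedef struct {
    uint32_t x[NUM_REGS];
} regfile_t;

/* Read port: x0 always reads as zero, regardless of what is stored. */
static uint32_t regfile_read(const regfile_t *rf, unsigned idx) {
    idx &= NUM_REGS - 1u;            /* register indices are 5 bits wide */
    return (idx == 0u) ? 0u : rf->x[idx];
}

/* Write port: writes to x0 are silently discarded. */
static void regfile_write(regfile_t *rf, unsigned idx, uint32_t value) {
    idx &= NUM_REGS - 1u;
    if (idx != 0u) {
        rf->x[idx] = value;
    }
}
```

In hardware, the two read ports would correspond to two independent instances of the read logic operating on the same storage.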
\section{Arithmetic and Logic Unit (ALU)}
\subsection{Arithmetic and Logic Unit (ALU)}
\label{sec:core-alu}
\entityheader{alu}
The ALU contains a math/logic unit as well as a comparator. It is used explicitly by instructions such as \icode{add} or \icode{shiftl}, and implicitly to add offsets to base addresses for memory instructions and to decide whether an instruction should branch.
\section{Control and Status Registers (CSR)}
\subsection{Control and Status Registers (CSR)}
\entityheader{csr}
The control and status registers contain configurations relevant to the core itself. For example, they can be used to control interrupts.
\section{Memory Arbiter}
\subsection{Memory Arbiter}
\entityheader{memory_arbiter}
Since both the fetch and memory stages need to access the same system memory, access to this common resource has to be controlled. The memory arbiter acts as a proxy for both fetch and data memory requests and stalls either one until the other completes.
\section{Exception Control}
\subsection{Exception Control}
\entityheader{exception_control}

View file

@ -1 +1 @@
[draw.io diagram source changed; compressed XML payload omitted]

Binary file not shown (before: 52 KiB, after: 408 KiB).

View file

@ -1,9 +1,7 @@
\documentclass[../../Diplomschrift.tex]{subfiles}
\begin{document}
\part{Meta}
\section{History}
\section{Development History}
The project started out with the desire to build a CPU from scratch. Examples such as The NAND Game~\cite{nandgame} and Ben Eater's Breadboard Computer series~\cite{breadboard_computer} served as inspirations and guidance during development.
@ -54,9 +52,11 @@ Others & SD card, VGA & Ethernet \\
While the Digilent board offers fewer IO options, the DDR3 memory can be interfaced using Free memory cores and allows for much larger programs to be loaded, possibly even a full operating system. The missing VGA port has been substituted by an HDMI-compatible DVI interface that is accessible through one of the high-speed PMOD connectors.
\section{Tooling}
\section{FPGA Tooling}
FPGA design is done using a Hardware Description Language (HDL). The two most well-known HDLs are Verilog and VHDL (VHSIC (Very high speed integrated circuit) HDL). As part of our studies at HTL, we exclusively worked with VHDL. For this reason, and because VHDL offers a better type system, it was chosen as the language of choice for the project.
FPGA design is done using a Hardware Description Language (HDL). The two best-known HDLs are Verilog and VHDL (VHSIC HDL, where VHSIC stands for Very High Speed Integrated Circuit). As part of our studies at HTL, we worked exclusively with VHDL. For this reason, and because VHDL offers a strong type system~\cite{vhdl-types}, it was selected as the language for the project.
Appendix~\ref{app:vhdl-intro} provides a short refresher on VHDL and a quick guide to the tools used in this project.
\subsection{Vendor Tools}
@ -87,7 +87,8 @@ The complete FPGA design consists not only of the CPU core, but a number of comp
\subsection{UART}
% TODO
The easiest way to communicate with an embedded system is usually through a serial interface. To ensure the best compatibility with existing software, a National Semiconductor 16550 UART was reimplemented from scratch instead of creating a new design. Thus, the module's functionality and design are described in the 16550's datasheet.
% TODO ref
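Because the register layout follows the 16550, software can drive the UART in the usual way. The following is a minimal sketch of a polled transmit routine. The register offsets and the THRE bit are those of the standard 16550; the byte-wide register spacing and the way the base pointer is obtained are assumptions, and \icode{uart\_putc} is a hypothetical name.

```c
#include <stdint.h>

/* Standard 16550 register offsets (assuming byte-wide, contiguous registers). */
enum {
    UART_THR = 0,  /* transmit holding register (write) */
    UART_LSR = 5   /* line status register */
};

#define UART_LSR_THRE 0x20u  /* LSR bit 5: transmit holding register empty */

/* Busy-wait until the transmitter can accept a byte, then send it. */
static void uart_putc(volatile uint8_t *uart, char c) {
    while ((uart[UART_LSR] & UART_LSR_THRE) == 0u) {
        /* wait for the previous character to leave the holding register */
    }
    uart[UART_THR] = (uint8_t)c;
}
```

On the target, \icode{uart} would point at the UART's base address in the SoC memory map; passing it as a parameter keeps the routine independent of that address.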
\subsection{DVI graphics}
@ -166,16 +167,67 @@ The exact timing differs between models, so all periods can be customized using
\subsection{DRAM}
The Arty A7 development board contains a 256MB DDR3 memory module. Since the FPGA only contains about 1.8MB of block RAM, of which some is already reserved for various hardware functions (e.g. the text buffer and WS2812 driver), the external memory is absolutely necessary to run larger programs.
The Arty A7 development board contains a 256MB DDR3 memory module. Since the FPGA only contains about 1.8MB of block RAM, some of which is already reserved for various hardware functions (e.g. the text buffer and WS2812 driver), the external memory is absolutely necessary to run larger programs.
Interfacing with DDR3 memory is notoriously difficult, requiring complex logic on both physical and logical layers. For this reason, the Free Software LiteDRAM core~\cite{litedram} is used to integrate the entire memory interface into the SoC. While irrelevant to the SoC, it can still be considered a slight oddity the LiteDRAM core actually contains an entire separate RISC-V core to coordinate initialization of the memory.
Interfacing with DDR3 memory is notoriously difficult, requiring complex logic on both physical and logical layers. For this reason, the Free Software LiteDRAM core~\cite{litedram} is used to integrate the entire memory interface into the SoC. While irrelevant to the SoC, it can still be considered a slight peculiarity that the LiteDRAM core actually contains an entire separate RISC-V core to coordinate initialization of the memory.
\subsection{External Bus}
Bridging the internal SoC bus with the external peripheral bus requires a few steps. For one, the external data bus is bidirectional, so tri-state outputs must be used on the FPGA. In addition, the internal bus arbitrates components using addresses alone, while the external bus uses chip enable signals and overlapping address spaces.
Bridging the internal SoC bus with the external peripheral bus requires a few steps. First, the external data bus is bidirectional, so tri-state outputs must be used on the FPGA. Second, the internal bus arbitrates components using addresses alone, while the external bus uses chip enable signals and overlapping address spaces. Lastly, the bus must be slowed down: while the internal bus runs at 50 MHz, a reasonable frequency for the external circuitry is around 1 MHz. To achieve this, a clock divider only lets the external bus interface change state every 64th clock cycle (50 MHz / 64 $\approx$ 781 kHz), resulting in an effective bus speed of under 1 MHz.
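The clock divider can be modelled in a few lines of C. This is a behavioural sketch only, not the VHDL used in the design; \icode{clkdiv\_t}, \icode{clkdiv\_tick} and \icode{effective\_hz} are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIVIDER 64u  /* the external bus may change state every 64th cycle */

typedef struct {
    uint32_t count;
} clkdiv_t;

/* Call once per internal clock cycle; returns true on every 64th call. */
static bool clkdiv_tick(clkdiv_t *d) {
    d->count = (d->count + 1u) % DIVIDER;
    return d->count == 0u;
}

/* Rate at which the external bus state can change, in Hz. */
static uint32_t effective_hz(uint32_t internal_hz) {
    return internal_hz / DIVIDER;
}
```

For an internal clock of 50 MHz, \icode{effective\_hz} yields 781250 Hz, which matches the figure of just under 1 MHz given above.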
Due to a mistake in the adapter board layout, the nibbles of the address and data buses are reversed (MSB to LSB are pins 7 to 0 on the FPGA, but 3 to 0 followed by 7 to 4 on the board). Because FPGA pins can be mapped arbitrarily, this can be corrected in the pin assignment without using any additional resources.
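The effect of the reversed nibbles can be expressed as a simple permutation. The sketch below assumes a single 8-bit bus group and is purely illustrative: in the actual design the correction happens in the pin mapping, not in logic or software. The function names are hypothetical.

```c
#include <stdint.h>

/* Board pin that carries a given bus bit: bits 7..4 land on pins 3..0,
 * bits 3..0 land on pins 7..4. */
static unsigned board_pin_for_bit(unsigned bit) {
    return bit ^ 4u;
}

/* Value seen on the board pins if the bus were wired straight through. */
static uint8_t swap_nibbles(uint8_t v) {
    return (uint8_t)((v << 4) | (v >> 4));
}
```

Applying the same permutation in the pin constraints cancels the swap, since swapping the nibbles twice restores the original value.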
\section{Software}
\subsection{Bootloader}
The CPU loads its machine code from an FPGA-internal block RAM. The initial value for this RAM is part of the bitstream, and if any changes to it are required, the entire project has to be resynthesized. Because this takes upwards of 5 minutes, a different solution was created: a fixed bootloader is encoded into the block RAM, which is able to read additional program code (the payload) from the UART at runtime and store it to available memory. After the transfer is complete, it simply jumps to the base address of the payload and continues execution from there. When the current payload exits or a hardware reset is actuated, a new program can be loaded instantly.
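The loading step can be sketched as follows. The framing used here (a 4-byte little-endian length followed by the payload bytes) is an assumption made for illustration, as are all names; the actual transfer format of the bootloader may differ. The byte source is a callback so that the logic can be exercised on a host, with a memory buffer standing in for the UART.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte source: returns the next byte, or -1 if none is left.
 * On the target this would wrap the UART receive routine. */
typedef int (*read_byte_fn)(void *ctx);

/* Host-side stand-in for the UART: reads bytes from a memory buffer. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;
} mem_source_t;

static int mem_source_read(void *ctx) {
    mem_source_t *s = ctx;
    return (s->pos < s->len) ? s->data[s->pos++] : -1;
}

/* Assumed framing: 4-byte little-endian length, then the payload bytes.
 * Copies the payload to dest and returns its length, or 0 on error. */
static size_t load_payload(read_byte_fn rd, void *ctx,
                           uint8_t *dest, size_t capacity) {
    uint32_t len = 0;
    for (int i = 0; i < 4; i++) {
        int b = rd(ctx);
        if (b < 0) {
            return 0;
        }
        len |= (uint32_t)b << (8 * i);
    }
    if (len == 0u || len > capacity) {
        return 0;
    }
    for (uint32_t i = 0; i < len; i++) {
        int b = rd(ctx);
        if (b < 0) {
            return 0;
        }
        dest[i] = (uint8_t)b;
    }
    return len;
}
```

After a successful transfer, the loader on the target would jump to the base address of the payload, for example by calling it through a function pointer.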
Because many subroutines are used in both the loader and the payload, duplicating them in the payload would be a waste of space. Using custom linker scripts and compiler flags, the payload is linked against the functions in the loader. Whenever a loader function is called from the payload, execution jumps to bootloader code, executes the requested actions and then returns to the payload.
\subsection{Drivers}
Several components required driver functions to make them easier to use. Some are as simple as writing a value to a specific memory location:
\begin{lstlisting}[
language=c,
label={lst:yarm-set-rgb-led},
caption={Function to set the colour of an RGB LED on the Arty board}]
// Each RGB LED is controlled by one 32-bit memory-mapped word
void set_rgb_led(size_t num, uint32_t color) {
    ((volatile uint32_t*)ADDRESS_RGB_LEDS)[num] = color;
}
\end{lstlisting}
Others, like the function to write a character to the screen, are more complicated and use further subroutines:
\begin{lstlisting}[
language=c,
label={lst:yarm-vga-putchar},
caption={Function to write a character to the screen}]
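void vga_putchar(screen_t *s, unsigned char c) {
    switch (c) {
    case '\n':
        // Newline: move to the start of the next row
        set_cursor_pos(s, s->row + 1, 0);
        break;
    case '\b':
        // Backspace: fall through, sharing the cursor movement with DEL
    // DEL
    case 0x7F:
        if (s->col > 0) {
            set_cursor_pos(s, s->row, s->col - 1);
        }
        // DEL additionally blanks the character at the new position
        if (c == 0x7F) {
            set_curr_char(s, ' ');
        }
        break;
    default:
        // Any other character: draw it and advance the cursor
        set_curr_char(s, c);
        set_cursor_pos(s, s->row, s->col + 1);
        break;
    }
}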
\end{lstlisting}
\section{Testing}
\subsection{RISC-V Compliance Tests}

View file

@ -1,11 +1,12 @@
\documentclass[../../Diplomschrift.tex]{subfiles}
\begin{document}
\part{A short introduction to VHDL}
\section{A short introduction to VHDL}
\label{app:vhdl-intro}
Designing a processor is a big task, so it is easiest to start very small. In software projects, the usual starting point is a ``Hello World'' program; this introduction builds a hardware equivalent of it.
\section{Prerequisites}
\subsection{Prerequisites}
Other than a text editor, the following Free Software packages have to be installed:
@ -21,7 +22,7 @@ Other than a text editor, the following Free Software packages have to be instal
\end{description}
\end{savenotes}
\section{Creating a design}
\subsection{Creating a design}
A simple starting design is an up/down counter. The following VHDL code describes the device:
@ -41,7 +42,7 @@ In order to test this design, a test bench has to be created:
title=\texttt{counter_tb.vhd},
]{vhdl/counter_tb.vhd}
\section{Simulating a design}
\subsection{Simulating a design}
\begin{lstlisting}[
style=terminal,
@ -62,7 +63,7 @@ gtkwave counter_tb.ghw counter_tb.gtkw
\caption{Screenshot of the counter test bench waveform in GTKWave}
\end{figure}
\section{Synthesizing a design}
\subsection{Synthesizing a design}
An additional Xilinx Design Constraints (XDC) file is required to assign the signals to pins on the FPGA: