Add information about getting started programming

Cadence Ember 2022-09-25 20:14:43 +13:00
parent 174664c775
commit 5b9d456f7a
Signed by untrusted user: cadence
GPG key ID: BC1C2C61CF521B17

@@ -227,3 +227,49 @@ Which port to run the server on.
@defthing[strict_proxy boolean #:value true]{
Whether to put more URLs through the proxy. If false, just a minimal set is proxied. If true, additionally proxies page stylesheets and links to image files, thereby reducing the potential for end-users to connect to Fandom servers.
}
@; -----------------------------------------------------------------------------
@section{Developing BreezeWiki}
@subsection{Racket}
BreezeWiki is programmed in the @hyperlink["https://racket-lang.org" "Racket"] programming language.
If you are entirely new to programming, BreezeWiki is likely not a good introductory project for you. I recommend purchasing a physical book about programming targeted at beginners. (In 2022, internet searches for beginner programming questions are sadly filled with low-quality results or straight up misinformation.)
If you already know programming concepts but are new to Racket, I recommend reading @hyperlink["https://docs.racket-lang.org/quick/" "Quick: An Introduction to Racket with Pictures"] and trying the instructions for yourself. This should fill you in on the basics of practically using the Racket language. After reading that, if you want to learn the fundamentals in more depth, you can check out the @hyperlink["https://docs.racket-lang.org/guide/index.html" "Racket Guide"], which is better to jump around in than to read from start to end.
@subsection{Files}
@code{breezewiki.rkt} and @code{dist.rkt} are entrypoints. They do as little as possible, just requiring the page functions and starting the web server.
The @code{src/page-*.rkt} files contain the page functions. Each file has instructions on how to render a specific kind of page. For example, @code{src/page-wiki.rkt} renders the usual wiki pages by contacting Fandom servers, altering the HTML tree, and sending the response to the browser.
Everything else in @code{src/} is a utility file that the pages require as needed.
@subsection{Fandom endpoints}
BreezeWiki mostly uses the MediaWiki APIs rather than scraping full pages. If I need to add functionality in the future that can't be covered through the APIs, I would consider changing to scraping full pages.
@subsection{HTML tree transformations}
@hyperlink["https://www.mediawiki.org/wiki/API:Parsing_wikitext" "MediaWiki's Parse API"] returns the contents of a page as HTML, but returning this wholesale isn't good enough. Several modifications need to be made to the page before it's suitable for display on BreezeWiki, such as altering links to have the correct prefix, making all images visible without JavaScript, and enclosing tables to allow horizontal scrolling.
To do these transformations, the @code{update-tree} function recurses through the whole HTML tree from MediaWiki, executing a function on each element. This function returns the element to use as the replacement, allowing @code{update-tree} to eventually build a brand new tree. If the function decides that there's nothing to be done for a particular element, it just returns the same element. You can see the largest example of this in @code{src/page-wiki.rkt}, where many transformations can be applied to every element.
Background: Racket is a dialect of Lisp, a class of programming languages that are good at manipulating lists of symbols. XML and HTML trees are represented as @italic{X-expressions}, which you can read a bit more about within the @hyperlink["https://docs.racket-lang.org/pollen/second-tutorial.html#%28part._.X-expressions%29" "Pollen documentation: X-expressions"]. (The rest of the Pollen documentation is unrelated.) An X-expression is often called an xexp or an xexpr for short. BreezeWiki includes @code{src/xexpr-utils.rkt} for some helpful functions that query and manipulate X-expressions.
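As a rough illustration of the idea, here is a minimal sketch of an X-expression and a recursive transformation over it. The names @code{walk-tree} and @code{prefix-links} and the @code{/examplewiki} prefix are made up for this example; this is not the actual @code{update-tree}, whose signature may differ.
@codeblock|{
#lang racket/base
(require racket/match)

;; An X-expression element: a tag symbol, an optional attribute list,
;; then any number of children (strings or nested elements).
(define page
  '(div ((class "content"))
        (a ((href "/wiki/Example")) "Example")
        (p "Some text")))

;; Hypothetical simplified walker (NOT BreezeWiki's update-tree).
;; It rebuilds the children first, then calls `f` on the element.
;; `f` always receives the normalised shape (tag attributes children ...).
(define (walk-tree f x)
  (define (walk-children children)
    (map (lambda (child) (walk-tree f child)) children))
  (match x
    ;; element with an explicit attribute list
    [(list (? symbol? tag)
           (list (list (? symbol? keys) (? string? vals)) ...)
           children ...)
     (f (list* tag (map list keys vals) (walk-children children)))]
    ;; element without an attribute list: insert an empty one
    [(list (? symbol? tag) children ...)
     (f (list* tag '() (walk-children children)))]
    ;; strings and other leaves are returned unchanged
    [_ x]))

;; Hypothetical transformer: prefixes the href of every link and
;; returns any other element unchanged.
(define (prefix-links element)
  (match element
    [(list 'a attributes children ...)
     (list* 'a
            (for/list ([attribute (in-list attributes)])
              (match attribute
                [(list 'href (? string? url))
                 (list 'href (string-append "/examplewiki" url))]
                [_ attribute]))
            children)]
    [_ element]))

(define new-page (walk-tree prefix-links page))
}|
Under these assumptions, @code{new-page} is a fresh tree in which the link points to @code{/examplewiki/wiki/Example}, and @code{page} itself is left unmodified.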
@subsection{Programming}
If @code{breezewiki.rkt} is launched (in a REPL) it will dynamically load in the pages and watch them and their dependencies for changes. If you edit and save a file within @code{src/} it will likely be reloaded automatically.
Creating a new page requires editing @code{src/dispatcher-tree.rkt} to define which URLs the page should be served from, and @code{breezewiki.rkt} and @code{dist.rkt} to cause the page to actually be loaded. The first time you create a page it won't be loaded automatically and you'll need to restart BreezeWiki.
Automatic reloading helps with rapid development and avoiding long process startup times. But if the long startup times still frustrate you, you can use @code{raco make breezewiki.rkt dist.rkt} to byte-compile files for faster startup.
@subsection{Testing}
Most files also contain unit tests, within the @code["(module+ test"] sections. Evaluating a file within either DrRacket or Emacs-racket-mode automatically runs the tests in that file. Since almost all functions are pure functions (i.e. their output depends only on their inputs rather than on application state) they are easy to unit test.
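For readers who have not seen this pattern before, here is a minimal sketch of a file that has a pure function and a test submodule. The function @code{add-prefix} is a made-up example and is not taken from the BreezeWiki source.
@codeblock|{
#lang racket/base
(provide add-prefix)

;; Hypothetical pure function: the result depends only on the arguments.
(define (add-prefix wikiname path)
  (string-append "/" wikiname path))

;; The test submodule runs under `raco test`, but not when another
;; file requires this one.
(module+ test
  (require rackunit)
  (check-equal? (add-prefix "examplewiki" "/wiki/Main_Page")
                "/examplewiki/wiki/Main_Page"))
}|
Because the function needs no server or shared state, the test can call it directly with literal arguments.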
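You can run all tests in all files with @code["raco test --direct (grep -l 'module+ test' (git ls-files))"] which is helpful to check for regressions before committing. That command uses fish shell syntax for command substitution. In a POSIX shell such as bash, write @code["raco test --direct $(grep -l 'module+ test' $(git ls-files))"] instead.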