Source code

From Emulation General Wiki

Source code is a collection of text files containing instructions that a computer either runs as-is (interpretation) or translates into an executable file beforehand (compilation or assembly). It is written in a human-readable programming language so that a compiler or an interpreter can translate it. It's also possible to go the other way and decompile an executable file back into source code, though decompilers aren't as common.

The language used changes how the source code is processed and, just like emulation, languages come in high-level and low-level types. If a program is written in an assembly language (which often involves taking advantage of system-specific attributes), an assembler writes the machine code "word for word," reflecting how the machine takes instructions. If a program is written in a low-level language, a compiler reads the code and translates it into equivalent machine code. If a program is written in a high-level language, it can often run through an interpreter or virtual machine without being compiled to machine code ahead of time. Many compilers and interpreters also perform error checking to make sure the programmer's code is properly written and formatted. Some languages, such as Rust, also check that the code avoids certain classes of bugs before it is allowed to run.
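As a rough illustration of the translate-then-run path and of error checking, the sketch below uses Python's built-in `compile` and `exec`. The names `SOURCE`, `BROKEN_SOURCE`, `run_source`, and `check_source` are made up for this example. It only shows that source code is plain text until something translates it, and that a malformed program is rejected before any of it runs.

```python
# Source code is just text until a compiler or interpreter processes it.
SOURCE = "result = 6 * 7\n"
BROKEN_SOURCE = "result = 6 *\n"  # deliberately malformed


def run_source(text):
    """Translate source text into a code object, then execute it.

    Returns the namespace the code ran in. Raises SyntaxError before
    anything runs if the text is not valid Python.
    """
    code = compile(text, "<example>", "exec")  # translation and error checking
    namespace = {}
    exec(code, namespace)  # execution of the translated form
    return namespace


def check_source(text):
    """Return True if the text compiles, False if it has a syntax error."""
    try:
        compile(text, "<example>", "exec")
    except SyntaxError:
        return False
    return True
```

Here `run_source(SOURCE)["result"]` evaluates to 42, while `check_source(BROKEN_SOURCE)` returns `False` without executing anything.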

Software can be ported to other types of computers, but without the source code this is often prohibitively difficult. Other ways to port software include binary translation and platform emulation.

Language levels

Software can be programmed in many different languages (even several in one program), and just like high-level and low-level emulation, they have different levels of abstraction. The levels are listed here from lowest to highest.

Assembly

Assembly is the closest representation of machine code without being machine code. There are basically no abstractions from the architecture, meaning everything is close to what the machine processes. This was ideal at a time when compilers weren't optimized enough to match the performance of hand-written assembly, and as a result early console games were programmed in assembly more often than in higher-level languages. Assembly is commonplace in dynamic recompilation as well because it allows developers to optimize code more closely for an architecture than even a low-level language like C or C++ allows.
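Because each assembly instruction corresponds directly to machine code, an assembler can be little more than a lookup table. The following is a minimal sketch of that idea. The opcode values follow the MOS 6502 encoding for these particular addressing modes, but the mnemonic spellings (`LDA_IMM` and so on), the `OPCODES` table layout, and the `assemble` function are assumptions invented for this example, not a real toolchain.

```python
# Hypothetical mnemonic -> (opcode byte, operand size in bytes)
OPCODES = {
    "CLC": (0x18, 0),      # clear the carry flag
    "LDA_IMM": (0xA9, 1),  # load the accumulator with an immediate value
    "ADC_IMM": (0x69, 1),  # add an immediate value to the accumulator
    "STA_ABS": (0x8D, 2),  # store the accumulator at a 16-bit address
    "RTS": (0x60, 0),      # return from subroutine
}


def assemble(lines):
    """Translate each source line into its machine code bytes, in order."""
    output = bytearray()
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        opcode, size = OPCODES[parts[0]]
        output.append(opcode)
        if size:
            value = int(parts[1], 16)  # operands are written in hexadecimal
            output += value.to_bytes(size, "little")  # the 6502 is little-endian
    return bytes(output)


PROGRAM = [
    "CLC",
    "LDA_IMM 02",
    "ADC_IMM 03",
    "STA_ABS 0200",
    "RTS",
]
```

Calling `assemble(PROGRAM).hex(" ")` gives `18 a9 02 69 03 8d 00 02 60`: every source line becomes exactly one instruction's bytes, with nothing added or reordered.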

Low

A low-level language allows programmers to get close to the system they work on, taking advantage of architecture or platform-specific quirks without having to learn the architecture in the detail that assembly requires. Compared to assembly, low-level languages have the advantage that they're easier to port to other platforms, by nature of being more abstracted from the hardware.

Examples of low-level languages include (but are in no way limited to) C and C++.

Medium

Medium-level languages have attributes of both low-level and high-level paradigms. One example is Rust, which is designed to be performant and system-focused but also memory safe. The boundaries are not strict: some high-level languages are also lower-level than others.

High

High-level languages push away most system-specific quirks in favor of instructions intended to work on any platform. This approach was popularized by Java, whose goal was for developers to "write once, run anywhere".

In high-level languages, many of the same instructions can be run across different architectures and platforms. They may have a compiler, a compiler cache, a dynamic recompiler, and/or an interpreter.

Esoteric

Esoteric languages are built around a specific idea, a joke, or a challenge. These languages are intended to be comedic, confusing, and/or thought-provoking.

One example is Brainfuck, a Turing-complete programming language with only eight one-character commands (as opposed to the far larger instruction sets of standard languages and architectures) and one instruction pointer. Another is Shakespeare, a programming language designed to resemble a Shakespearean play. There's also Rockstar, a language designed around "the lyrical conventions of 1980s hard rock and power ballads", meant to lampoon the software industry's use of "rockstar developers" in recruiting.
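Brainfuck is small enough that a complete interpreter fits in a few dozen lines. The sketch below is one possible implementation, not a reference one. The function name `run_brainfuck` is made up here, and the 30,000-cell tape, the 8-bit wrapping cells, and reading 0 at end of input are common conventions that this example assumes rather than rules every implementation follows.

```python
def run_brainfuck(program, input_text=""):
    """Interpret a Brainfuck program and return its output as a string."""
    # Pre-compute matching bracket positions so loops can jump directly.
    jumps = {}
    stack = []
    for pos, char in enumerate(program):
        if char == "[":
            stack.append(pos)
        elif char == "]":
            if not stack:
                raise ValueError("unmatched ]")
            start = stack.pop()
            jumps[start] = pos
            jumps[pos] = start
    if stack:
        raise ValueError("unmatched [")

    tape = [0] * 30000  # data cells (assumed size)
    ptr = 0             # data pointer
    ip = 0              # instruction pointer
    inputs = iter(input_text)
    output = []

    while ip < len(program):
        char = program[ip]
        if char == ">":
            ptr += 1
        elif char == "<":
            ptr -= 1
        elif char == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif char == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif char == ".":
            output.append(chr(tape[ptr]))
        elif char == ",":
            tape[ptr] = ord(next(inputs, "\0")) % 256  # 0 at end of input
        elif char == "[" and tape[ptr] == 0:
            ip = jumps[ip]  # skip the loop body
        elif char == "]" and tape[ptr] != 0:
            ip = jumps[ip]  # repeat the loop body
        ip += 1  # any other character is treated as a comment
    return "".join(output)
```

For example, `run_brainfuck("++++++++[>++++++++<-]>+.")` returns `"A"`: the loop adds 8 to the second cell eight times, one more `+` brings it to 65, and `.` outputs that cell as a character.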

Version control

Version control refers to the management of data as it changes. A version control system is a program that tracks changes in data. Its most common use is to allow programmers to collaborate on a source code repository without accidentally overwriting or breaking each other's work. There are several version control systems, but the most ubiquitous is Git, which was originally created for Linux kernel development; many services are built around it, such as GitHub and GitLab. Other systems include CVS (one of the earliest to see wide use), Subversion, and Mercurial, which was developed at around the same time as Git.
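The core idea of tracking changes can be sketched with Python's standard `difflib` module, which compares two versions of a file line by line. This is only an illustration of what "a change" looks like to such a tool. The sample lines and the `describe_change` helper are invented for this example, and real version control systems do considerably more (history, branching, merging).

```python
import difflib

# Two hypothetical versions of the same source file.
OLD = ["int lives = 3;", "int score = 0;"]
NEW = ["int lives = 5;", "int score = 0;", "int level = 1;"]


def describe_change(old_lines, new_lines):
    """Return the lines removed from and added to the old version."""
    removed, added = [], []
    for line in difflib.ndiff(old_lines, new_lines):
        if line.startswith("- "):
            removed.append(line[2:])
        elif line.startswith("+ "):
            added.append(line[2:])
        # Lines starting with "  " are unchanged; "? " lines are hints.
    return removed, added
```

With the sample data, `describe_change(OLD, NEW)` reports one removed line (`int lives = 3;`) and two added lines, while the unchanged `int score = 0;` line is left out of the change entirely.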

Licensing

Main article: Licensing

Software is copyrightable, but the source code can be made available to users however the author chooses. A copyright license is a legal document that tells people how the software can be used and what limitations come with using it.

The more successful emulation projects are often open source (though you definitely will find exceptions).

See also