Source code

From Emulation General Wiki

Source code is a collection of text files containing instructions that a computer either runs as-is (interpretation) or translates into an executable file beforehand (compilation / assembly). Source code is written in a human-readable programming language, which a compiler or an interpreter then translates for the machine. It's also possible to go the other way and decompile an executable file back into source code, though decompilers aren't as common.

The language used changes how the source code is processed, and just like emulation, languages come in high-level and low-level types. If a program is written in an assembly language (which often involves taking advantage of system-specific attributes), an assembler translates it into machine code nearly "word for word," reflecting how the machine takes instructions. If a program is written in a low-level language, a compiler reads the code and translates it into equivalent machine code. If a program is written in a high-level language, it can often be run by an interpreter without a separate compilation step. Compilers and interpreters also do error checking to make sure the programmer's code is properly written and formatted. Some languages, such as Rust, go further and check at compile time that the code avoids certain classes of bugs.
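As a rough illustration of the error checking described above, the sketch below uses Python's built-in compile() and exec() functions. The two source strings and the check_and_run helper are made up for this example; it only shows that malformed source text is rejected before anything runs.

```python
# Two tiny programs as plain text: one well-formed, one missing a parenthesis.
good_source = "total = sum(range(5))\n"
bad_source = "total = sum(range(5)\n"


def check_and_run(source: str) -> dict:
    """Compile the source text first, then execute it if it is well-formed."""
    try:
        # compile() translates text into a code object and rejects bad syntax.
        code = compile(source, "<example>", "exec")
    except SyntaxError as err:
        return {"error": f"SyntaxError: {err.msg}"}
    namespace: dict = {}
    exec(code, namespace)  # the interpreter runs the translated code
    return {"total": namespace["total"]}


print(check_and_run(good_source))
print(check_and_run(bad_source))
```

The first call runs the program and reports a total of 10. The second never executes, because the syntax error is caught during translation. The exact error message depends on the Python version.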

Software can be ported to other types of computers but, without the source code, it's often prohibitively difficult to do. Other ways to port software include binary translation and platform emulation.

Language levels

Software can be programmed in many different languages (even several in one program), and just like high- and low-level emulation, they have different levels of abstraction. Here are the different levels, from lowest to highest.

Assembly

Assembly is the closest representation of machine code without being machine code. There are basically no abstractions from the architecture, meaning everything is close to what the machine processes. This was ideal back when compilers weren't optimized enough to match the performance of hand-written assembly, and as a result early console games were programmed in assembly more often than in higher-level languages. Assembly is commonplace in dynamic recompilation as well because it lets developers tune code more closely to an architecture than even a low-level language like C or C++ allows.
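To make the near one-to-one mapping between assembly and machine code concrete, here is a toy assembler sketch in Python. It handles only five MOS 6502 instructions and two operand forms, and the helper names (OPCODES, assemble_line, assemble) are invented for this example. A real assembler also deals with labels, zero-page addressing, directives, and much more.

```python
# Opcode bytes for a handful of MOS 6502 instructions (a small subset only).
OPCODES = {
    ("LDA", "imm"): 0xA9,  # load accumulator with an immediate value
    ("STA", "abs"): 0x8D,  # store accumulator at a 16-bit address
    ("INX", None): 0xE8,   # increment the X register
    ("NOP", None): 0xEA,   # do nothing
    ("RTS", None): 0x60,   # return from subroutine
}


def assemble_line(line: str) -> bytes:
    """Translate one line such as 'LDA #$01' into its machine-code bytes."""
    parts = line.split()
    mnemonic = parts[0].upper()
    if len(parts) == 1:
        return bytes([OPCODES[(mnemonic, None)]])
    operand = parts[1]
    if operand.startswith("#$"):  # immediate: one operand byte
        return bytes([OPCODES[(mnemonic, "imm")], int(operand[2:], 16)])
    if operand.startswith("$"):  # treated as absolute: low byte first
        address = int(operand[1:], 16)
        return bytes([OPCODES[(mnemonic, "abs")], address & 0xFF, address >> 8])
    raise ValueError(f"unsupported operand: {operand}")


def assemble(source: str) -> bytes:
    """Assemble a multi-line program, skipping blank lines."""
    lines = (line for line in source.splitlines() if line.strip())
    return b"".join(assemble_line(line) for line in lines)


program = """
LDA #$01
STA $0200
RTS
"""
print(assemble(program).hex(" "))  # prints: a9 01 8d 00 02 60
```

Each line of assembly becomes one opcode byte followed by its operand bytes, which is why assembly stays so close to what the processor actually executes.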

Low

A low-level language allows programmers to get close to the system they work on, taking advantage of architecture or platform-specific quirks without having to learn the architecture in as much detail as assembly requires. Compared to assembly, low-level languages are also easier to port to other platforms because they are more abstracted from the hardware.

Examples of low-level languages include (but are in no way limited to) C and C++.

Medium

Medium-level languages have attributes of both low-level and high-level paradigms. One example is Rust, which is designed to be performant and system-focused but also memory safe. Some high-level languages can also be lower-level than others.

High

High-level languages push away most system-specific quirks in favor of instructions intended to work on any platform. This approach was popularized by Java, whose goal was for developers to "write once, run anywhere".

In high-level languages, many of the same instructions can be run across different architectures and platforms. They may have a compiler, a compiler cache, a dynamic recompiler, and/or an interpreter.
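One way to see this in practice is to look at the intermediate instructions a high-level language produces. The sketch below uses Python's standard dis module to list the bytecode generated for a small, made-up area function. The bytecode is an implementation detail of CPython and its instruction names vary between versions, so treat the listing as illustrative only.

```python
import dis

# Source code for a small function, held as plain text.
source = "def area(width, height):\n    return width * height\n"

# Compile the text and run the definition so the function object exists.
namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
area = namespace["area"]

# List the names of the bytecode instructions CPython generated for area().
opnames = [instruction.opname for instruction in dis.get_instructions(area)]
print(opnames)
print(area(3, 4))
```

None of these instructions name a specific processor. The interpreter on each platform is what turns them into work the local hardware can do, so the same source text runs unchanged wherever that interpreter exists.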

Esoteric

Esoteric languages are built around a specific idea, a joke, or a challenge. These languages are intended to be comedic, confusing, and/or thought-provoking.

One example is Brainfuck, a Turing-complete programming language with only eight one-character commands (far fewer than conventional languages and instruction sets provide), a data pointer, and an instruction pointer. Another is Shakespeare, a programming language designed to resemble a Shakespearean play. There's also Rockstar, a language designed around "the lyrical conventions of 1980s hard rock and power ballads", meant to lampoon the software industry's use of "rockstar developers" in recruiting.
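Because Brainfuck has so few commands, a complete interpreter fits in a few dozen lines. The following Python sketch is one possible implementation and rests on common conventions rather than any formal standard: a 30,000-cell tape, 8-bit cells that wrap around, and a zero byte when input runs out. The function name run_brainfuck is made up for this example.

```python
def run_brainfuck(program: str, input_data: str = "") -> str:
    """Interpret a Brainfuck program and return everything it prints."""
    # Pre-compute matching bracket positions so loops can jump directly.
    jumps, stack = {}, []
    for position, command in enumerate(program):
        if command == "[":
            stack.append(position)
        elif command == "]":
            start = stack.pop()
            jumps[start], jumps[position] = position, start

    tape = [0] * 30000  # memory cells; 30,000 is a conventional size
    pointer = 0         # data pointer
    pc = 0              # instruction pointer
    inputs = iter(input_data)
    output = []

    while pc < len(program):
        command = program[pc]
        if command == ">":
            pointer += 1
        elif command == "<":
            pointer -= 1
        elif command == "+":
            tape[pointer] = (tape[pointer] + 1) % 256
        elif command == "-":
            tape[pointer] = (tape[pointer] - 1) % 256
        elif command == ".":
            output.append(chr(tape[pointer]))
        elif command == ",":
            tape[pointer] = ord(next(inputs, "\0")) % 256
        elif command == "[" and tape[pointer] == 0:
            pc = jumps[pc]  # skip the loop body
        elif command == "]" and tape[pointer] != 0:
            pc = jumps[pc]  # repeat the loop body
        pc += 1  # any other character is treated as a comment
    return "".join(output)


# 8 * 8 = 64 in the second cell, plus 1 gives 65, the ASCII code for "A".
print(run_brainfuck("++++++++[>++++++++<-]>+."))  # prints: A
```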

Version control

Version control refers to the management of data as it changes. A version control system is a program that tracks changes in data. Its most common use is to allow programmers to collaborate on a source code repository without accidentally overwriting each other's work. There are several version control systems, but the most ubiquitous is Git, which was originally created for Linux kernel development. Many services are built around Git, like GitHub and GitLab. Other systems include CVS (one of the earliest widely used systems), Subversion, and Mercurial, which was developed around the same time as Git.
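The idea of tracking changes can be sketched with Python's standard difflib module, which produces the same unified diff format that version control tools commonly display. The two revisions and the file name greet.py are hypothetical. This only illustrates what a recorded change looks like, not how any particular system stores its history.

```python
import difflib

# Two revisions of the same (hypothetical) source file.
revision_1 = "def greet():\n    print('hello')\n"
revision_2 = "def greet(name):\n    print('hello', name)\n"

# Compare the revisions line by line and describe what changed.
diff = list(
    difflib.unified_diff(
        revision_1.splitlines(),
        revision_2.splitlines(),
        fromfile="greet.py (revision 1)",
        tofile="greet.py (revision 2)",
        lineterm="",
    )
)
print("\n".join(diff))
```

Lines starting with "-" were removed and lines starting with "+" were added, which is what lets collaborators review exactly what changed between versions.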

Licensing

Main article: Licensing

Software is copyrightable, but the source code can be made available to users however the author chooses. A copyright license is a legal document that tells people how the software can be used and what limitations come with using it.

Public domain
There is no copyright (i.e. No Rights Reserved). Works enter the public domain when they:
  1. are old enough that their copyright term has expired. This is why old paintings, plays, and books are so commonly quoted and reused in modern works: nobody has to negotiate the rights with the author. Most software isn't in the public domain this way because it is still covered by the current American copyright term.
  2. are dedicated to it through a license like Creative Commons Zero or the Unlicense. This is the only option for releasing modern works into the public domain because, per the Berne Convention, copyright applies automatically and is opt-out, not opt-in. If a public domain dedication can't be made (for example, because the jurisdiction doesn't recognize such dedications), the license instead grants users the equivalent freedoms.
Open-source
The program is released under a copyright license that permits four freedoms: to run the program at any time, to study and modify it for the user's own purposes, to distribute copies to anyone, and to distribute improved versions for everyone else. This bypasses most of the issues encountered with public domain works. For anything else copyrightable, the term "open content" often applies.
It's worth noting that an open-source license does not replace copyright; the author still holds it. Likewise, the license generally cannot be revoked for copies already released under it. For an overview of the open-source licenses available, see choosealicense.com and the appendix on the same website.
Source-available
The program is released under a copyright license more restrictive than an open-source license, but the source code is still publicly available. A well-known example is Snes9x, which is released under a non-commercial license. That restriction on commercial use is what keeps it from being open-source.
Closed source / Proprietary
The program's source code isn't available. This is often because the ecosystem behind the platform is closed, sometimes by nature (like Windows and Android) and sometimes by force (like every modern console).
Freeware
The source code isn't available, but the program is still free of charge.
Shareware / Trialware
A limited demo version of the program is free. This was common for DOS games.

The more successful emulation projects are often open-source (though you will definitely find exceptions).

See also