SimpleScript is an interpreted, BASIC-like programming language built with Python. It is small, clean, and powerful. The language itself is straightforward and allows for cleaner-looking mathematical operations and data processing. SimpleScript can be used in two ways: through direct commands in the interactive shell, or by executing complete programs. You can read more about both execution methods below.
Ensure Python 3 is installed on your system. SimpleScript does not use any external libraries, which makes it entirely portable. TL;DR: all you need to write and execute SimpleScript programs is Python 3.
You can use Python to launch the interactive shell. This lets you play with the SimpleScript language in your terminal and test expressions.
In your BASH terminal, run the `shell.py` script using Python.
$ python shell.py
You can run programs using the `RUN` builtin function in the interactive shell.
There is no file extension defined for SimpleScript. Feel free to name your file anything.
The interpreter will attempt to run all files executed, regardless of extension.
$ python shell.py
$ RUN ("my_program.simple")
By default, error messages are not displayed. To toggle the visibility of error messages, use the `debug` command.
This will allow all error messages to be printed after any interpretation. This is handy for improving the language itself. It's also handy for spotting smaller syntax and execution errors that the interpreter may have encountered.
The reason it's not enabled by default is that one of the principles of SimpleScript is to rarely stop you dead in your tracks. Error handling is built to inform only if needed; otherwise the interpreter will attempt to sally forth.
$ DEBUG
To exit the interactive shell, simply use the `exit` keyword. This will terminate your program and perform garbage collection.
You'll be dumped back into the BASH terminal you launched from.
$ EXIT
More specifically, the variables and functions you created and used will be garbage collected. All system variables and functions will return to their original state.
This means that if you overrode the system variable `FALSE` with the value `10`, for example, it would be restored to its default value of `0`.
Here is a small example program written in SimpleScript. It uses some of the language's features including loops, functions, and variables. SimpleScript is Turing-complete; the language is simple to use yet robust enough to handle whatever you throw at it.
############################################
# Sample SimpleScript Program #
# [bash]$ python3 shell.py #
# [SimpleScript]$ RUN("my_program.simple") #
############################################
# Function that prepends a prefix to the word
FUNC function(prefix) -> prefix + "SimpleScript"
# Join function for elements
FUNC join(elements, separator)
    VAR result = ""
    VAR len = LEN(elements)
    FOR i = 0 TO len THEN
        VAR result = result + elements/i
        IF i != len - 1 THEN VAR result = result + separator
    END
    RETURN result
END
# Maps elements to a function
FUNC map(elements, func)
    VAR new_elements = []
    FOR i = 0 TO LEN(elements) THEN
        APPEND(new_elements, func(elements/i))
    END
    RETURN new_elements
END
# Print using builtin function
PRINT("Greetings universe!")
# Loop example snippet
FOR i = 0 TO 5 THEN
    PRINT(join(map(["l", "sp"], function), ", "))
END
For more detailed documentation, you can continue below. Every feature of SimpleScript is outlined in the documentation. If you encounter any issues, or feel like the language is missing something, you can contribute to the project on GitHub. Happy coding!
The following special variables have set values in the language. However, they can be remapped in your program. This is by design and allows your program to define its own basic terms and concepts.
Variable | SimpleScript Variable | Value | Use |
---|---|---|---|
Null/NaN | `NULL` | `0` | Represents an empty value |
Logical True | `TRUE` | `1` | Represents a True Boolean |
Logical False | `FALSE` | `0` | Represents a False Boolean |
Note that since these values are stored in a symbol table, you can also define `true` as your own version of `TRUE`; the values of `TRUE` and `true` can be different if you want. SimpleScript will never tell you that you shouldn't reassign these special variables. Configure your program as you'd like.
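As a rough illustration of why this works, here is a minimal Python sketch of a symbol table. The `SymbolTable` class, its method names, and the `DEFAULTS` dictionary are hypothetical and are not taken from the SimpleScript source; the sketch only assumes that names are case-sensitive keys and that the defaults are restored on exit, as described earlier.

```python
# Hypothetical sketch of a symbol table; not the interpreter's actual class.
DEFAULTS = {"NULL": 0, "TRUE": 1, "FALSE": 0}


class SymbolTable:
    def __init__(self):
        # Start every session from a fresh copy of the defaults.
        self.symbols = dict(DEFAULTS)

    def set(self, name, value):
        # Names are plain dictionary keys, so `TRUE` and `true` are distinct.
        self.symbols[name] = value

    def get(self, name):
        return self.symbols.get(name)

    def reset(self):
        # Drop user definitions and restore the special variables.
        self.symbols = dict(DEFAULTS)


table = SymbolTable()
table.set("true", 42)    # a user-defined lowercase variant
table.set("FALSE", 10)   # overriding a special variable is allowed
before_reset = (table.get("TRUE"), table.get("true"), table.get("FALSE"))
print(before_reset)      # (1, 42, 10)

table.reset()
after_reset = (table.get("true"), table.get("FALSE"))
print(after_reset)       # (None, 0)
```

Because overrides live only in the table, nothing is written back to the interpreter itself.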
There are many builtin functions in SimpleScript. You can, of course, write your own. But these can be quite useful. You can chain together many functions into much larger compound functions that include both user-defined and builtin functions. You can also redefine builtin functions on a per-program basis; your changes will not be saved to the interpreter.
Function Name | SimpleScript Command | Description | Example |
---|---|---|---|
Run | `RUN` | Runs a program | `RUN("my_program.simple")` |
Print | `PRINT` | Prints strings of text | `PRINT("This is a string")` |
Print Return | `PRINT_RET` | Returns a String instance of the input value | `PRINT_RET(123)` |
Input | `INPUT` | Accepts input from the stream | `INPUT()` |
Input Int | `INPUT_INT` | Accepts integer input from the stream | `INPUT_INT()` |
Clear | `CLEAR`, `CLS` | Clears the terminal screen | `CLEAR()`, `CLS()` |
Is Number | `IS_NUM` | Returns `TRUE` if argument is a number | `IS_NUM(123)` |
Is String | `IS_STR` | Returns `TRUE` if argument is a string | `IS_STR("This is a string")` |
Is List | `IS_LIST` | Returns `TRUE` if argument is a list | `IS_LIST([1, 2, 3])` |
Is Function | `IS_FUNC` | Returns `TRUE` if argument is a function | `IS_FUNC(PRINT)` |
Append | `APPEND` | Appends a value to a list | `APPEND(list, 5)` |
Pop | `POP` | Removes an element from a list by index | `POP(list, 3)` |
Extend | `EXTEND` | Concatenates two lists together | `EXTEND(list_a, list_b)` |
For example, if you set `PRINT` to add two numbers instead of printing strings, that change will only take effect in your current program.
The next time you run a SimpleScript program, `PRINT` will default back to printing strings.
Here is a table including all supported mathematical operations in SimpleScript. Most are similar to Python or BASIC's implementations, but I've made some small changes which hopefully improve the look and feel of larger mathematical expressions.
Operation | SimpleScript Command | Description | Notes |
---|---|---|---|
Addition | `+` | Performs addition | |
Subtraction | `-` | Performs subtraction | |
Multiplication | `*` | Performs multiplication | |
Division | `/` | Performs typical floating division | Cannot divide by 0 |
Integer Division | `\|` | Performs division using integers | Cannot divide by 0 |
Power | `^` | Raises expression to a power | Supports negative powers |
Modulo | `%` | Returns remainder of division | |
All these operations can be performed on both numeric values and sub-expressions, including variables. Variables are executed before any other numeric operation, so you can actually assign variables while performing numeric operations on other values.
Basic arithmetic operations can be performed directly on numbers, variables, and other values. Mathematical operations all return their associated value. Anytime a floating point operation is introduced, the data type returned will be a float.
$ (2 + 1) | (4 - 1)
1
$ 10 % 5
0
$ 1 + 2 - 3 * 4 / 5
0.6000000000000001
$ ((1 + 2) * 3 ^ 2) / 2
13.5
$ 10 | 2 + (8 ^ 2) - 6
63
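For readers who think in Python, the session above can be cross-checked with a rough operator mapping. This is only a sketch: it assumes that `|` behaves like Python's `//` on these integer operands and that `^` behaves like `**`. The interpreter's actual implementation may differ.

```python
# Rough Python equivalents of the SimpleScript expressions above.
# Assumption: `|` maps to `//` and `^` maps to `**` for these operands.
results = [
    (2 + 1) // (4 - 1),       # (2 + 1) | (4 - 1)
    10 % 5,                   # 10 % 5
    1 + 2 - 3 * 4 / 5,        # 1 + 2 - 3 * 4 / 5
    ((1 + 2) * 3 ** 2) / 2,   # ((1 + 2) * 3 ^ 2) / 2
    10 // 2 + (8 ** 2) - 6,   # 10 | 2 + (8 ^ 2) - 6
]

for value in results:
    print(value)
```

The third result shows the usual floating point rounding, which is why the shell prints `0.6000000000000001` rather than `0.6`.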
As you'd expect, division by zero is not allowed. If you try, the interpreter will inform you that the operation you're trying to execute is invalid. It will even highlight the error in your file. Negative powers are valid operations that will evaluate to the same result as would a similar BASIC operation.
$ 12345 / 0
Traceback (most recent call last):
File <stdin>, line 1, in <program>
File <stdin>, on line 1
12345 / 0
^
$ 98765 / (4 - 5 + 1)
Traceback (most recent call last):
File <stdin>, line 1, in <program>
File <stdin>, on line 1
98765 / (4 - 5 + 1)
^^^^^^^^^
Divisors that evaluate to zero are also not allowed, as seen in the above example. In this case, the entire sub-expression is highlighted by the interpreter. This hopefully provides enough tracing and error information to fix the issue in your program. The stack trace will always be displayed in error messages.
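The highlighting can be pictured as a row of carets placed under the offending columns. The helper below is a hypothetical Python sketch, not code from the interpreter; it assumes the start and end columns of the failing sub-expression are already known (for example, tracked by the lexer).

```python
def underline(source_line, start, end):
    """Return the line followed by carets under columns start..end-1."""
    return source_line + "\n" + " " * start + "^" * (end - start)


# Columns 9..17 cover the divisor `4 - 5 + 1` inside the parentheses.
print(underline("98765 / (4 - 5 + 1)", 9, 18))

# A single-character divisor gets a single caret.
print(underline("12345 / 0", 8, 9))
```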
Variables can be declared by using the `VAR` keyword. Variable names can include letters and underscores. Variables can be re-assigned without restriction. Variables are stored in a symbol table, so they can be mutated and altered throughout execution.
$ VAR my_variable = 12345
12345
$ my_variable
12345
$ VAR my_other_variable = 67890
67890
$ my_other_variable
67890
$ VAR addition = my_variable + my_other_variable
80235
$ addition
80235
Variables can also be assigned in the middle of expressions, which may seem odd to those used to assigning variables before use. However, this lets you spawn helper variables in the context of a single statement instead of declaring them beforehand in the parent's scope.
$ VAR x = 10 + (VAR y = 5)
15
$ y
5
This kind of inline assignment is thanks to the interpretation of the abstract syntax tree (AST). Instead of searching line-by-line for variables to be assigned, SimpleScript simply treats the `VAR` keyword like a token that triggers the parsing of its sub-tree. In short, SimpleScript considers `VAR` definitions a higher priority than brackets in the BEDMAS order of operations.
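Python readers may recognize this pattern from assignment expressions (the `:=` operator, available since Python 3.8). The snippet below is an analogy only, not a description of how SimpleScript is implemented.

```python
# Python analogue of: VAR x = 10 + (VAR y = 5)
# The inner assignment binds y and also yields its value to the outer sum.
x = 10 + (y := 5)

print(x)  # 15
print(y)  # 5
```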
SimpleScript allows for lists to be defined. They are immutable, unlike Python lists. You can define a list, add elements to a list, remove elements from a list, and join two lists together. Since lists are immutable, you'll need to leverage list joins and value extraction to obtain your resulting list.
$ VAR list = [1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
$ VAR list = [1, 2, 3]
[1, 2, 3]
$ list + [4, 5]
[1, 2, 3, 4, 5]
$ VAR list = [1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
$ VAR other_list = [6, 7, 8, 9, 0]
[6, 7, 8, 9, 0]
$ list * other_list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
$ VAR list = [1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
$ list - 1
[1, 3, 4, 5]
$ list - 0
[3, 4, 5]
$ list - (-1)
[3, 4]
You can get elements by index in lists using the `/` operator. You can also use negative index values to fetch values from the end of a list.
As always, you can chain expressions together. For example, you can evaluate an expression that results in an index for the list.
$ [1, 2, 3, 4, 5] / 0
1
$ VAR list = [1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
$ list / -1
5
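One way to picture indexing with `/` is operator overloading on the list value. The `SketchList` class below is a hypothetical Python stand-in, assuming the division operator is simply routed to an index lookup; the real interpreter may structure this differently.

```python
class SketchList:
    """Hypothetical stand-in for a SimpleScript list value."""

    def __init__(self, elements):
        self.elements = list(elements)

    def __truediv__(self, index):
        # `list / index` fetches an element; negative indexes count from the end.
        return self.elements[index]


numbers = SketchList([1, 2, 3, 4, 5])
first = numbers / 0
last = numbers / -1
computed = numbers / (1 + 2)  # the index can itself be an expression

print(first, last, computed)  # 1 5 4
```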
Lists are useful in performing computations on data. For example, images can be expressed as arrays of integers. Using SimpleScript, you can use this list representation to perform computations on the image by interacting with its respective list.
Strings are essentially just lists of individual characters. In SimpleScript, you can define and operate on strings the same way you would in BASIC. You can concatenate two or more strings together, and you can repeat a string by multiplying it by a number.
$ "This is a string!"
This is a string!
$ "I can start here, " + "and end here!"
I can start here, and end here!
$ "Echo! " * 5
Echo! Echo! Echo! Echo! Echo!
$ FUNC welcome (name, repeat) -> "Welcome, " * repeat + name
<function welcome>
$ welcome ("Michael", 5)
Welcome, Welcome, Welcome, Welcome, Welcome, Michael
You can set variables to equal strings and you can also write anonymous functions which perform actions on strings.
The following comparison operators are supported in SimpleScript. They can be used in addition to variable assignment and function executions. This is due to how the interpreter understands the ASTs being generated. The result is that you can chain together long and complex comparisons without needing to stop to define anything.
Operator | SimpleScript Command | Description |
---|---|---|
Exactly Equals | `==` | Evaluates to `TRUE` if both sides are equal |
Not Equals | `!=` | Evaluates to `TRUE` if both sides are not equal |
Less Than | `<` | Evaluates to `TRUE` if the left side is smaller than the right side |
Greater Than | `>` | Evaluates to `TRUE` if the left side is larger than the right side |
Less Than Or Equals To | `<=` | Evaluates to `TRUE` if the left side is smaller than or equal to the right side |
Greater Than Or Equals To | `>=` | Evaluates to `TRUE` if the left side is larger than or equal to the right side |
These operators are the same as BASIC or Python's operators, so their syntax should hopefully be familiar to some. Comparisons between 1 and 0 can be interpreted as comparisons between TRUE and FALSE respectively.
No language would be complete without logical operators. These operate the same way as logic gates do. All of AND, OR, and NOT operate in the same way they do in regular BASIC and Python.
Operator | SimpleScript Command | Description | Notes |
---|---|---|---|
Logical AND | `AND` | Evaluates to `TRUE` if both sides are `TRUE` | |
Logical OR | `OR` | Evaluates to `TRUE` if at least one side is `TRUE` | |
Negation | `NOT` | Evaluates to the opposite Boolean value of the expression | `TRUE` becomes `FALSE`, and vice-versa |
These can be chained into variable definitions and other assignments and function declarations to evaluate the truth values of abstract statements. The underlying ASTs of these operations are built in such a way to be able to handle applications on abstract entities; you can apply these logical operators to any expression.
SimpleScript allows you to add break, continue, return, and end commands in your loops and functions. They all work the same way they do in BASIC and Python. Here is a table describing each control flow operator.
Statement | SimpleScript Command | Description |
---|---|---|
Return | `RETURN` | Tells the function to return something |
Break | `BREAK` | Breaks out of loops |
Continue | `CONTINUE` | Continues (skips) to the next iteration of the loop |
End | `END` | Delimits the end of a function or loop |
You can add these control flow statements into your loops and functions the same way you would in BASIC or Python. Please read the relevant documentation section for examples.
You can make if-statements by using the `IF`, `ELSE`, and `ELIF` keywords. Pairing these with logical or comparison operators allows for dynamic variable assignment.
$ IF TRUE THEN 123
123
$ IF (1 + 2) <= (3 + 4) THEN TRUE
1
$ IF 1 != 1 THEN 100
$ IF 1 != 1 THEN 100 ELSE 999
999
$ IF TRUE == 0 THEN (VAR x = 9) ELIF TRUE == 1 THEN (VAR x = 5) ELSE (VAR x = 0)
5
$ IF (10 - 5) == 5 THEN (VAR x = TRUE) ELSE (VAR x = FALSE)
1
$ x
1
These control flow operations also allow for the inline assignment of variables, as seen above. If no branch produces a value (e.g. `IF 0 THEN 123`), nothing is output.
For-loops are control flows that execute a set number of times before terminating. For-loops are great for iteration and compounding operations. SimpleScript follows the BASIC model of for-loops, where a loop range is defined and a body expression is grouped up and executed. More simply, SimpleScript for-loops are formatted like for-loops in C.
$ FOR i = 1 TO 10 THEN 2 ^ i
[2, 4, 8, 16, 32, 64, 128, 256, 512]
$ VAR x = 1
1
$ FOR i = 1 TO 5 THEN VAR x = x + i
[2, 4, 7, 11]
$ x
11
$ VAR y = 100
100
$ FOR i = 100 TO 0 STEP -1 THEN VAR y = y - i
[0, -99, ... -4949, -4950]
$ y
-4950
$ VAR z = 1
1
$ IF 10 == (20 - 10) THEN FOR i = 0 TO 10 THEN VAR z = z + i
[1, 2, 4, 7, 11, 16, 22, 29, 37, 46]
$ z
46
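Judging from the outputs above, a for-loop stops before its `TO` value and evaluates to a list of the values its body produced. The Python sketch below models that behavior; the `for_loop` helper and the `state` dictionary are hypothetical names used for illustration, not part of the interpreter.

```python
def for_loop(start, end, body, step=1):
    """Collect body(i) for each i from start up to, but excluding, end."""
    return [body(i) for i in range(start, end, step)]


# FOR i = 1 TO 10 THEN 2 ^ i
powers = for_loop(1, 10, lambda i: 2 ** i)
print(powers)  # [2, 4, 8, 16, 32, 64, 128, 256, 512]

# VAR x = 1, then FOR i = 1 TO 5 THEN VAR x = x + i
state = {"x": 1}


def add_to_x(i):
    # The body mutates a variable that lives outside the loop.
    state["x"] = state["x"] + i
    return state["x"]


running = for_loop(1, 5, add_to_x)
print(running)     # [2, 4, 7, 11]
print(state["x"])  # 11
```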
You can take advantage of the `BREAK` and `CONTINUE` commands to control the flow of your loops.
As you can see from the following example, we can chain together multi-line for-loops that execute different control flow commands on certain values of `i`.
$ VAR a = []
[]
$ FOR i = 0 TO 10 THEN; IF i == 4 THEN CONTINUE ELIF i == 8 THEN BREAK; VAR a = a + i; END
[0]
$ a
[0, 1, 2, 3, 5, 6, 7]
SimpleScript allows you to assign variables inline with your for-loop. This opens up the possibility for dynamic variable assignment depending on predefined contexts. The greatest benefit, however, is that because SimpleScript allows for mutations of variables, you can initialize a variable beforehand and then use it to perform meta-computations in the for-loop itself.
Like for-loops, while-loops take some condition and execute an expression. The difference here is that the provided expression will keep executing until the provided condition proves false. This allows us to repeat programs and expressions until we reach some desired goal.
$ VAR x = 1
1
$ WHILE x < 10 THEN VAR x = x + 1
[2, 3, 4, 5, 6, 7, 8, 9, 10]
$ x
10
$ VAR y = 10
10
$ WHILE y > 0 AND (TRUE == 1) THEN VAR y = y - 2
[8, 6, 4, 2, 0]
$ y
0
Just like for-loops, control flow statements like `BREAK` and `CONTINUE` can be used to augment your while-loops.
Here is the example from the previous section, rewritten with a while-loop instead of a for-loop. The resulting list is almost the same; it starts at 1 rather than 0 because `i` is incremented before it is added to the list.
$ VAR a = []; VAR i = 0
[, 0]
$ WHILE i < 10 THEN; VAR i = i + 1; IF i == 4 THEN CONTINUE; IF i == 8 THEN BREAK; VAR a = a + i; END
[0]
$ a
[1, 2, 3, 5, 6, 7]
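For comparison, here is the same while-loop transcribed into Python. It is a line-for-line sketch of the control flow only, and it reproduces the list shown above.

```python
a = []
i = 0
while i < 10:
    i = i + 1       # increment first, so 0 is never appended
    if i == 4:
        continue    # skip 4
    if i == 8:
        break       # stop before 8 is appended
    a = a + [i]

print(a)  # [1, 2, 3, 5, 6, 7]
```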
Variables and sub-expressions can be chained together to form large compound conditions for your while-loop. The body of the loop can be similarly built. These larger compound control flow programs can chain into function calls and other sub-routines, making them very powerful.
You can use the `FUNC` keyword to create functions in SimpleScript. Like Python, you can assign functions to variables for future use. Function names can be made up of letters and underscores.
$ FUNC my_math (a, b, c) -> a + b - c
<function my_math>
$ my_math (1, 2, 3)
0
$ VAR my_func = FUNC my_math (a, b, c) -> a + b - c
<function my_math>
$ my_func
<function my_math>
$ my_func (1, 2, 3)
0
$ my_math (1, 2, 3)
0
$ debug
$ FUNC bad_func (a, b) -> a | b ^ 2
<function bad_func>
$ bad_func ()
Traceback (most recent call last):
File <stdin>, line 1, in <program>
bad_func ()
^^^^^^^^
$ debug
$ FUNC bad_func (a, b) -> a | b
<function bad_func>
$ bad_func (1, 2, 3, 4)
Traceback (most recent call last):
File <stdin>, line 1, in <program>
bad_func (1, 2, 3, 4)
^^^^^^^^^^^^^^^^^^^^
$ FUNC test(); VAR foo = 5; RETURN foo; END
[<function test>]
$ test()
[5]
If `debug` is enabled, the interpreter will complain if you pass too many or too few arguments to a function. You can also create anonymous functions (similar to lambda functions in Python) using similar syntax.
$ FUNC (a, b) -> a ^ 2 + b ^ 2
<function <anonymous>>
$ VAR anon_func = FUNC (a, b) -> a ^ 2 + b ^ 2
<function <anonymous>>
$ anon_func (2, 3)
13
Like variables and flow control loops, you can chain together large compound function calls inside smaller anonymous function declarations. The interpreter's backend has been built to handle abstract layers of expressions and nesting; there is no restriction to the number of nested compound functions you can use.
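For comparison, the anonymous function from the example above maps onto a Python lambda, and the argument-count complaint can be sketched as a simple length check. The `check_arg_count` helper is hypothetical and only illustrates the kind of check the interpreter might perform.

```python
# Python analogue of: VAR anon_func = FUNC (a, b) -> a ^ 2 + b ^ 2
anon_func = lambda a, b: a ** 2 + b ** 2
print(anon_func(2, 3))  # 13


def check_arg_count(param_names, args):
    """Return an error message if the call has the wrong number of arguments."""
    if len(args) > len(param_names):
        return f"{len(args) - len(param_names)} too many args passed"
    if len(args) < len(param_names):
        return f"{len(param_names) - len(args)} too few args passed"
    return None


print(check_arg_count(["a", "b"], []))            # 2 too few args passed
print(check_arg_count(["a", "b"], [1, 2, 3, 4]))  # 2 too many args passed
```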
You can chain multiple statements together using multiple lines. Not only does this clean up your program, but it allows you to execute more than one operation in loops. Semi-colons are used to delimit newlines in your program. You can also write multi-line expressions in the interactive shell by using semi-colons.
$ VAR result = IF 5 == 5 THEN "Math is working well" ELSE "Not working"
[Math is working well]
$ result
[Math is working well]
$ IF 5 == (3 + 2) THEN; PRINT("Math is cool");PRINT("Working well") ELSE PRINT("Not working")
Math is cool
Working well
[0]
This same rule applies to all other loops and control flow operations. Same goes for functions. You write longer, more complex programs by chaining together multiple operations in loops and function bodies.
Comments are simply anything to the right of a `#` character. There are no multi-line comments in SimpleScript.
$ # This is a comment!
$ # I will not be executed!
Comments are for personal use. The interpreter will simply skip over them. It is a personal preference when and where to use comments. There is no standard way of using them, as the underlying language itself doesn't care about them. Do whatever brings you joy.
The SimpleScript language is built around the grammar of BASIC. The grammar was used in the creation of the Lexer and the Parser. It informed the design of the abstract syntax trees and their tokens.
statements : NEWLINE* statement (NEWLINE+ statement)* NEWLINE*
statement : KEYWORD:RETURN expr?
: KEYWORD:CONTINUE
: KEYWORD:BREAK
: expr
expr : KEYWORD:VAR IDENTIFIER EQ expr
: comp-expr ((KEYWORD:AND|KEYWORD:OR) comp-expr)*
comp-expr : NOT comp-expr
: arith-expr ((EE|LT|GT|LTE|GTE) arith-expr)*
arith-expr : term ((PLUS|MINUS) term)*
term : factor ((MUL|DIV) factor)*
factor : (PLUS|MINUS) factor
: power
power : call (POW factor)*
call : atom (LPAREN (expr (COMMA expr)*)? RPAREN)?
atom : INT|FLOAT|STRING|IDENTIFIER
: LPAREN expr RPAREN
: list-expr
: if-expr
: for-expr
: while-expr
: func-def
list-expr : LSQUARE (expr (COMMA expr)*)? RSQUARE
if-expr : KEYWORD:IF expr KEYWORD:THEN
(statement if-expr-b|if-expr-c?)
| (NEWLINE statements KEYWORD:END|if-expr-b|if-expr-c)
if-expr-b : KEYWORD:ELIF expr KEYWORD:THEN
(statement if-expr-b|if-expr-c?)
| (NEWLINE statements KEYWORD:END|if-expr-b|if-expr-c)
if-expr-c : KEYWORD:ELSE
statement
| (NEWLINE statements KEYWORD:END)
for-expr : KEYWORD:FOR IDENTIFIER EQ expr KEYWORD:TO expr
(KEYWORD:STEP expr)? KEYWORD:THEN
statement
| (NEWLINE statements KEYWORD:END)
while-expr : KEYWORD:WHILE expr KEYWORD:THEN
statement
| (NEWLINE statements KEYWORD:END)
func-def : KEYWORD:FUNC IDENTIFIER?
LPAREN (IDENTIFIER (COMMA IDENTIFIER)*)? RPAREN
(ARROW expr)
| (NEWLINE statements KEYWORD:END)
This grammar was lifted from davidcallanan/py-myopl-code as they already had the most accurate and feature-rich BASIC grammar I could find online. This grammar is completely barebones; all BASIC implementations would be identical. This was lifted to save me the hassle of drafting the initial grammar myself, a task reserved for POWs and grammar enthusiasts.
The process of building and interpreting a programming language from scratch can be split into three main components. Each component produces its own deliverable, which is used as the input for the following component.
- Lexing: Turns text strings into a list of tokens
- Parsing: Turns the list of tokens into an abstract syntax tree (AST)
- Interpreting: Evaluates every node of the AST to return a result
The three components are named accordingly in the `bin/` directory. They are: `lexer.py`, `parser.py`, and `interpreter.py`. These three components are the backbone of (most) programming languages.
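To make the three stages concrete, here is a deliberately tiny Python sketch of the same pipeline for integer arithmetic only. It is not the code in `lexer.py`, `parser.py`, or `interpreter.py`; the function names, token shapes, and tuple-based AST are all assumptions chosen for brevity.

```python
import operator
import re


def lex(text):
    """Lexing: turn a text string into a list of (type, value) tokens."""
    tokens = []
    for number, symbol in re.findall(r"(\d+)|(\S)", text):
        if number:
            tokens.append(("INT", int(number)))
        elif symbol in "+-*/()":
            tokens.append((symbol, symbol))
        else:
            raise SyntaxError(f"Illegal character: {symbol!r}")
    return tokens


def parse(tokens):
    """Parsing: turn the token list into a nested-tuple AST."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def atom():
        if peek() == "INT":
            return ("num", advance()[1])
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("Expected ')'")
            advance()
            return node
        raise SyntaxError(f"Unexpected token: {peek()}")

    def term():
        node = atom()
        while peek() in ("*", "/"):
            op = advance()[0]
            node = ("binop", op, node, atom())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()[0]
            node = ("binop", op, node, term())
        return node

    tree = expr()
    if peek() != "EOF":
        raise SyntaxError(f"Unexpected token: {peek()}")
    return tree


OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def interpret(node):
    """Interpreting: evaluate every node of the AST to return a result."""
    if node[0] == "num":
        return node[1]
    _, op, left, right = node
    return OPERATIONS[op](interpret(left), interpret(right))


print(interpret(parse(lex("(1 + 2) * 3 - 4 / 2"))))  # 7.0
```

Each stage consumes only the previous stage's output, which is what makes the components easy to develop and test separately.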
Here are some of the best physical and digital resources I could find on the subject of creating an interpreter for a programming language from scratch:
- Language Implementation Patterns: Create Your Own Domain-Specific and General Programming Languages (Pragmatic Programmers)
- Writing Compilers and Interpreters: A Software Engineering Approach
- Modern Compiler Implementation in Java
- Modern Compiler Design
- Compilers: Principles, Techniques, and Tools (2nd Edition)
- The "Let’s Build A Simple Interpreter" Blog Series
- The "Make YOUR OWN Programming Language" Video Series
- Introduction to Programming Languages/Interpreted Programs
- Clean Architecture: A Craftsman's Guide to Software Structure and Design (Robert C. Martin Series)
Hopefully these help inform similar projects in the future. While SimpleScript is "complete," there is a great deal that can be improved upon. I, for one, am less-than-satisfied with much of the variable and function mechanisms of SimpleScript.