Arrays in QDL are defined in var-statements by specifying the dimensions in square brackets. The contents of the square brackets have two basic formats:
Var Array1: [5] Integer;
Var Array2: [1..5] Integer;
Both of these statements make an array with five elements. The first is what I call a C-style declaration, and its subscripts range from zero to four. I call the second a Pascal-style declaration, and its subscripts range from one to five.
The syntax of a QDL array type is as follows:
array-spec: [[start-subscript..[end-subscript] | element-count]]
QDL supports multi-dimensional arrays:
Var AnArray: [1..2][1..4][1..3] Integer;
[1..2] is the "major dimension" and is listed first; [1..4][1..3] are the minor dimensions and are listed afterward. For example:
AnInteger = AnArray [2][4][3]; // Get very last entry in the array
No information about the size of an array is guaranteed to be available at run-time. Therefore, the QDL compiler can only generate code that checks array subscripts at run-time if the size of the array is specified explicitly. The SizeOf pseudo-function cannot be applied to an array of unknown size. The value of Class ([]).Size is zero, but this has no significance. Again, QDL arrays are the only variable-size data type.
When you use the syntax [N] to specify an array dimension, where N is the number of elements in the array, it means the same as if you had written [0..N-1].
Furthermore, when you use [] in function headers to mean "an array of unknown size", it means the same as if you had written [0..]. Thus, it is possible to declare an array of unknown size whose starting subscript is N using [N..].
It is possible to declare an array of zero size, using [0]. But why?
QDL fixed-size arrays are "deprived" forms of classes:
It would have been possible to implement arrays as non-classes, but that would prevent their use as subjects of template parameters. Because the Class instance must store the size of instances of an object, a new Class instance is created for each new set of array dimensions and subject of the array. The name stored therein is composed using the following syntax:
([count])*resolved-typeidentifier
The first part of this is the number of elements in each array dimension. The second part is the name of the type of the elements of the array, and the namespace containing it. If the type of the elements is parameterized, the parameters are not stored.
For example, if you make an object of type [2..5][-1..1] # FileStream, the name would be [4][3]Std::Memory::#.
The rest of the class structure would be filled as follows:
There is a second class instance for each minor dimension as well. Extending the example above, there would be a second Class instance with the name [3]Std::Memory::#.
Every array class has the following members, all of which can be expanded inline by the compiler.
An array class also has the a
A single Class instance exists for unknown-size arrays, and its name is []; however, an unknown-size array is not a class because all classes must have a known size.
Take a look at the following:
Function Main(): A: [10] Integer; MyFunc A; End;
Function MyFunc (B: @ [] Integer): ... End;
In this case, MyFunc has no way of knowing the size of array B. A function may specify a constant size in its argument list:
Function MyFunc (B: @ [8] Integer): ... End;
This example simply makes the compiler assume that the array passed to it has a size of 8. When a function is declared like this, any array passed to the function must either be:
The number of dimensions must remain the same during the pass. For example, the following is illegal:
Function Main(): A: [3][3] Integer; MyFunc A; End;
Function MyFunc (B: @ [8] Integer): ... End;
Either the function or the caller can use a range of array subscripts:
Function Main()
{
Var A: [-5..5] Integer;
MyFunc A;
}
Function MyFunc (B: @ [100..107] Integer)
{ ... }
In this case, B[100] references A[-5], and B[107] references A[2].
Furthermore, a function can have a multidimensional array reference as an argument. In this case, all minor dimensions must have matching ranges.
For example, the following is legal:
Function Main()
{
Var A: [5..10][2..4] Integer;
MyFunc A;
}
Function MyFunc (B: @ [1..2][1..3] Integer)
{ ... }
B[1][1] would access A[5][2], B[1][2] would access A[5][3], B[2][1] would access A[6][2], and so on.
You may have noticed that all of the declarations of B, the argument to MyFunc, have been references to arrays (signified by the @ sign) rather than simply arrays. This is because QDL does not allow you to pass arrays as arguments to functions. You can, however, pass instances of classes to functions.
A reference is a variable that acts as an alias for another variable, or an array. A @ symbol at the beginning of a data type indicates that it is a reference. For example:
Var X, Y: Integer (5); Var Ref: @ Integer (X);
The variable Ref is not an integer, even though it is used like one. Instead, it is an alias for an integer: when you access Ref, you are actually accessing X! For example:
StdOut.Print Ref;
This causes 5 to be printed. With me so far? Now, you can change the value of X with a statement such as this:
Ref = 1; StdOut.Print X;
This causes 1 to be printed. Now, unlike C++ references, QDL references can be changed to be an alias for something else. This is done by assigning the address of a variable to @ref-name, like this:
@Ref = @Y; Y = 10; StdOut.Print Ref;
This causes 10 to be printed. A shorthand operator is also provided, @=:
Ref @= Y;
So, you ask, what in tarnation are references good for? There are endless uses, but here are two common ones:
For example, you could write a function that converts a coordinate to a vector, using reference arguments:
Function Convert ("[Coords]" X, Y: Double "To Vector" Angle, Magnitude: @ Double);
Since the angle and magnitude parameters are aliases, the Convert function can modify the caller's variables. For example:
Var Ang, Mag: Double; Convert Coords 100, 50 To Vector Ang, Mag;
If you dynamically allocate an object, you must store it in a reference variable. For example, if you have a class called BigObject:
Var MyBigObj: @ BigObject; @MyBigObj = New(BigObject); Delete MyBigObj;
References support limited casting. A reference to an object of class C can be cast:
During the cast, whether explicit or implicit, the reference type may be changed; for example, from #C to @C. (The meaning of this will not become clear until much later.)
When it is required that the casting of references must be done explicitly, QDL makes use of RTTI to make sure the object of C is also an object of D or S. If the object is not, an exception is thrown. If the object of C does not have an RTTI pointer (i.e. C is Stripped), then cast types 1, 3 and 4 are not allowed.
In QDL you can create customized reference types. This is an advanced topic which is covered much later; for now, realize that:
In QDL, Null is a keyword that represents a reference or the value stored in a reference that doesn't point to anything.
Var Ref: @ Integer (Null);
In the first QDL specification, Null was defined as Cast (0: *), but I've identified several flaws with this approach. Therefore, Null instead has the special data type Null; this type name does not identify a class and cannot be used in any context where a type name is expected. The data type name exists only as a means for the compiler to identify the type in error messages.
Note: I've also removed the "plain pointer" data type, *, from QDL.
The value Null has the special property that it can be implicitly converted to any pointer or reference type, in particular contexts. Only two operators are applicable to Null alone:
@Ref = @Null; @Ref = Null; Ref @= Null;
No other operators are applicable to Null except in conjunction with another data type. The following rules apply in expressions involving Null:
Attempting to access a Null reference generally causes a program to crash. Don't do it.
@ is a reference to nothing in particular (a plain normal reference). It is allowable to make a variable of type @; its address can be assigned the value of a pointer to any data type that has run-time type information (RTTI), meaning any non-Stripped class. The referenced object in a plain reference cannot be used directly; there are only five things you can do with one:
A class with a Reference clause is a reference, and as such it can have a plain form if the specification class is empty or missing, and Inherits is not used in the clause.
An instance of any non-Stripped class, or reference thereto, can be implicitly converted to a plain reference.
You cannot create a pointer variable in QDL, but that's not to say there isn't a pointer data type. In fact, there is, and the good old C star operator (*) is used to name and dereference the type.
Note: QDL does not have a -> operator.
The size of a pointer type can be determined with SizeOf, but pointers are not classes, so the Class pseudo-function cannot be applied to pointer types. There are three distinct categories of pointers, each of which may have different sizes on a given platform:
Every pointer type that has only one level of indirection has a corresponding reference type, where the * at the beginning of the type name is replaced with the name of the reference class, typically @.
Note: Custom reference types can only reference data (category 1.)
A pointer can be converted to a pointer type in a different category, but data loss may result. The meaning of the resultant pointer is undefined and generally unusable. If a pointer is converted to a different type within the same category, the pointer will still point to the same memory. The pointer will have the same value if converted back.
The @ reference type that corresponds to a pointer type has the same size. A pointer can be converted to its corresponding reference type and back without data loss.
Since most of the people learning QDL will already know C, it seems appropriate to jump-start the understanding of these people with a comparison of QDL references with C pointers.
C had pointers as a central feature. C++ introduced references, which are basically pointers accessed in a different way. QDL drops C-style pointer variables altogether, but compensates the loss completely by giving new abilities to reference variables. I always figured the ampersand (&) was not a good symbol to represent references, so in QDL I use the "at" sign (@), which when you think about it (being an English-speaker) is more intuitive. To illustrate the use of references, look at the following C code:
int a, *aptr;
a = 0x1234;
aptr = &a;
*(unsigned char *)aptr = 0xFF;
printf ("%X %X", a, *aptr);
The equivalent QDL code would be:
a: Integer;
aRef: @Integer;
a = 0x1234;
@aRef = @a;
*(@aRef: *UInt8) = 0xFF;
StdOut.Print (a: String, Hex), ' ', (aRef: String, Hex);
On x86, both of these would output: 12FF 12FF
As shown, you can take the address of something in QDL using @. But unlike C, there are no pointer variables in which to store the address. You can store the address in a reference variable, however, because @reference-variable is an lvalue.
FYI, the method used above to set the low-order byte to 0xFF is nonportable; the preferred way would be:
aRef |= 0xFF;
The essential difference between C pointers and QDL references is how to imply the use of the address and the use of the object or value at that address. This little table covers it:
For a ptr/ref called P:   Address of P   Address in P   Value pointed to
C: Pointer P              &P             P              *P
C++: Reference P          unavailable    &P             P
QDL: Reference P          @@P            @P             P
And although I urge you to avoid it, QDL can also have references to references. In C:
int i1, i2, i3;
int *p[3] = { &i1, &i2, &i3 };
int **doublepointer = p;
*doublepointer[1] = 123; // Sets i2 to 123
The equivalent QDL code would be:
i1, i2, i3: Integer;
p: [3] @Integer { i1, i2, i3 };
DoubleRef: @[]@ Integer (p);
DoubleRef[1] = 123; // Sets i2 to 123
And, you can even do bad things like assigning integers to pointers and vice versa. In C:
int x, y; void *p; x = (int) p; p = (void *) y;
The almost equivalent, and equally dysfunctional, QDL code:
x, y: Integer; p: @ UInt8; x = (@p: Integer); @p = (y: * UInt8);
In C, an array's name acted as a pointer, and a pointer could be used as if it were an array. This is not true in QDL... not even close. Check out these variables:
AnArray: [10]Integer;
RefToArray: @ [] Integer;
Simple: Integer;
SimpleRef: @ Integer;
DoubleRef: @ @ Integer;
The data types of all these variables are different. The first two are array types, and the rest are not. You can't perform pointer arithmetic on an array, and you can't treat the references as if they were arrays, even if you drop to pointer level using @. The dereference (*) operator cannot be used on an array.
These statements are legal:
@RefToArray = @AnArray; // RefToArray now points to AnArray
RefToArray[5] = 123; // Change the value of AnArray[5]
@SimpleRef = @Simple; // SimpleRef now points to Simple
SimpleRef = 456; // Change the value of Simple to 456
@SimpleRef = @AnArray[5]; // SimpleRef now points to a member of the array
@DoubleRef = @SimpleRef; // DoubleRef now points to SimpleRef
DoubleRef = 456; // Change the value of AnArray[5] to 456
But here are some wrong statements with (possibly "unsafe") corrections:
@RefToArray = @Simple; // Wrong, RefToArray must point to an ARRAY!
@RefToArray = (@Simple: * []Integer); // Force it with a type cast
@SimpleRef = AnArray; // Wrong, SimpleRef must point to a specific integer
@SimpleRef = @AnArray; // Still wrong
@SimpleRef = @AnArray[0]; // Right, you must specify an element number
@RefToArray = @AnArray[4]; // Wrong, the address of AnArray[4] is a plain integer
@RefToArray = @@AnArray[4]; // Wrong again, can't take the address of the address
@RefToArray = AnArray + 4; // Wrong again, pointer arithmetic not allowed on arrays
@RefToArray = (@AnArray[4]: *[]Integer); // You can force the issue with a type cast
@DoubleRef = @Simple; // Wrong, DoubleRef must hold a reference to a reference
@DoubleRef = @@Simple; // Wrong, @Simple is an rvalue; its address can't be taken
@DoubleRef = @(@SimpleRef = @Simple); // Right, an intermediate reference is required
Since QDL has pointers, I included pointer arithmetic in the language. It works basically the same as in C. There are six operators that can use pointer arithmetic: plus, minus, increment and decrement, addition-assignment and subtraction-assignment (+, -, ++, --, +=, -=). The latter four require an lvalue to modify (in other words, they work on the address in a reference variable.) In the following list, Ptr and Ptr2 represent pointers to type Type, and Int represents an integral number or character. IntX represents the unsigned or signed integral type that is the same physical size as a pointer on a given platform.
Pointer arithmetic cannot be applied to pointers in categories (2) and (3) as defined above. If the pointer is to an array type, pointer arithmetic cannot be applied if the array size is unknown. Pointers to two different array types that contain the same kind of elements cannot be subtracted from each other.
You may be wondering why the array subscripts are on the left side, when C/C++ and Java have them on the right side. No, it's not because Pascal does it that way. Actually, at first I did have them on the right side, since I prefer that style; in a bit I'll explain how they got over on the left.
It took me several years to become confident in deciding what a particular C data type declaration might mean. For example:
int (*a)[20]; // Line 1
int *a[20]; // Line 2
int *(a[20]); // Line 3
int **a; // Line 4
unsigned *(*f)(void *g); // Line 5
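I bet that at least 3 out of 5 of these declarations would baffle 99% of new C programmers. Now, after some staring, I can tell that:
Line 1 declares a pointer to an array of 20 ints.
Line 2 declares an array of 20 pointers to int.
Line 3 declares exactly the same thing as Line 2.
Line 4 declares a pointer to a pointer to an int.
Line 5 declares a pointer to a function that takes a void pointer and returns a pointer to an unsigned int.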
I had designed QDL so that, when describing a data type, brackets were never required, and the variable name was separated from the data type. In order for the no-brackets thing to work, all modifiers have to be on one side of the data type. For example, it would not be obvious whether
X: @ Integer [10];
Declares an array of ten references to integers, or a reference to an array of ten integers. A C programmer would probably say it should be the former, so you would need brackets only if you wanted the latter. Of course, as I've been trying to illustrate, I don't want this to resemble C, so I had everything go on the right side:
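Var W: Integer @;
Var X: Integer [10] @;
Var Y: Integer @ [10];
Var Z: Integer @ [10] @;
Then, the rule for determining the data type becomes simple: just "read" the data type backwards:
W is a reference to an Integer.
X is a reference to an array of 10 Integers.
Y is an array of 10 references to Integers.
Z is a reference to an array of 10 references to Integers.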
The problem arises when you try to make a multidimensional array:
Var AnArray: Integer[1..2][1..4];
The English-language meaning is phrased in exactly the reverse order of the declaration. Therefore, it would stand to reason that the declaration of AnArray above would mean "An array of 4 arrays of 2 integers." But if this were the case, then 4 would become the major dimension and 2 the minor dimension, or, in other words, the arrays of 2 integers would be stored contiguously in memory. Thus, when you would go to access this array in code, you would have to reverse the subscripts:
AnArray [4][2]
This, of course, would be counterintuitive, so I made an exception for multidimensional arrays: the meaning would be reversed for adjacent subscripts. Thus, the above declaration would actually mean "An array of 2 arrays of 4 integers", so that you would access the array in code using the same order:
AnArray [2][4]
This makes sense only as long as you keep things this simple. When you start adding references and dynamic arrays, things break down:
Var A: Integer [2]@[4]; Var B: Integer [List][2];
In these cases the compiler cannot do the reversing thing, so the subscripts are again backwards. Very confusing indeed for the poor newbie coder! After writing eight more chapters I decided this wasn't acceptable, so I decided it should all go on the left after all. Reading a data type becomes even easier now, since you don't have to read it backwards:
Var C: [2]@[List] Integer; Var D: @[] String;
C is an array of 2 references to dynamic lists of Integers; D is a reference to an unknown-size array of Strings.
Initialization in QDL is done using constructor arguments. Initializing arrays is an extension of this: instead of providing a single set of constructor arguments in brackets, curly braces are used to enclose lists of constructor arguments. The syntax is as follows:
array-constructor-arg-list: ( constructor-arg-list | (constructor-arg-list) )
initialize-list: ( { ( <array-constructor-arg-list> | <initialize-list> ) [,][...] }
initialize-lists are used for initialization of two categories of data types: arrays and classes.
We will discuss array initialization first, but even for this discussion you should have a working knowledge of Classes. Therefore, you may wish to read further in this documentation to learn about other parts of the language, then return to this section to fully understand this discussion. Then again, for simple array initialization, you don't need knowledge about classes. Here is an example:
A: [-2..2] Integer { 1, 2, 3, 4, 5 };
This defines an array A with A[-2] initialized to 1, A[-1] initialized to 2, and so on. Because even simple data types are considered to be classes, each number in the list is actually an argument list for the constructor of Integer. Normally, constructor arguments are enclosed in brackets, but they are optional in this case because braces and commas are used to delimit the arguments. A constructor with multiple arguments, however, may require commas between arguments. In this case you must enclose the arguments in brackets. For example:
Class C:
Function Constructor (X, Y: Integer);
Function Constructor (X: Integer "AND" Y: Integer);
Function Constructor (X: Integer);
End Class;
ArrayOfC: [4] C { (1, 2), (3, 4), 5 AND 6, (7, 8) * 9 };
This example highlights three things:
initialize-lists can be nested when there is more than one subscript in the array. For example:
X: [2][3] Integer {
{ 1, 2, 3 }
{ 11, 12, 13 }
};
This nesting is optional; the above would have the same result if written like this:
X: [2][3] Integer { 1, 2, 3, 11, 12, 13 };
The nesting is not arbitrary; you cannot nest anywhere you want, and at any level within an array. You must be consistent about whether or not you are using nesting within a single level of an array. You cannot nest an inner level unless you also nest the enclosing level. All three of the following would be illegal:
W: [2][3] Integer { 1, { 2, 3 }, 11, { 12 }, 13 };
X: [2][3] Integer { 1, 2, 3, { 11, 12, 13 } };
Y: [4][3][2] Integer {
{ 0, 1, 10 }, { 11, 20, 21 },
{ 100, 101, 110 }, { 111, 120, 121 },
{ 200, 201, 210 }, { 211, 220, 221 },
{ 300, 301, 310 }, { 311, 320, 321 },
};
However, the following would be legal:
Z: [4][3][2] Integer {
{ 0, 1, 10, 11, 20, 21 },
{ 100, 101, 110, 111, 120, 121 },
{ 200, 201, 210, 211, 220, 221 },
{ 300, 301, 310, 311, 320, 321 },
};
Unlike in C/C++, you cannot simply break off an initializer list anywhere you want. Both of the following would be illegal:
X: [4] Integer { 1, 2, 3 }; // Not enough initializers!
Y: [3][2] Integer
{ { 1, 2 }, { 3, }, { 5, 6 } };
If you don't want to initialize the whole array, the ellipsis (...) becomes your friend. It specifies that the remaining elements should be initialized using the class's no-argument constructor (which, for integers, does nothing, so the rest of the array remains uninitialized.) The ellipsis cannot be used when any of the remaining elements to be initialized do not have a no-argument constructor, or have one that is inaccessible (because of the access mode).
For example:
X, Y, Z: Integer;
Refs: [10] @ Integer { X, Y, Z, ... };
After initialization, Refs[0] references X, Refs[1] references Y, Refs[2] references Z, and the remaining elements contain Null.
An array of objects of a class that does not have an accessible no-argument constructor cannot be created without the use of an initialize-list, and when an initialize-list is used, it cannot contain the ellipsis.
If you want to use the no-argument constructor in the middle of an initialize-list, you must include the brackets, or it will be considered an error:
X: [5]Integer { 1, 2, , 4, 5 }; // ERROR!
Y: [5]Integer { 1, 2, (), 4, 5 }; // OK
An array's most major subscript can be of unknown size. In this case, the compiler will count the number of initializers and set the array size to that many elements. For example:
Z: [][2] Integer { 1, 2, 3, 4, 5, 6, 7, 8 };
The compiler will count four arrays of two initializers and therefore set the size of the major subscript to 4. The data type of Z becomes [4][2] Integer, just as if you had written:
Z: [4][2] Integer { 1, 2, 3, 4, 5, 6, 7, 8 };
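Of course, when creating arrays in this way, you cannot use the ellipsis in the unknown-size major dimension, although it remains legal inside a minor dimension whose size is known: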
A: [][3] Integer { { 1, 2, 3 }, { 4, ... }, { 7, 8, 9 } };
// Legal, second subscript IS known
B: [][3] Integer { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9 }, ... };
// Illegal, first subscript is unknown
C: [][3] Integer { 1, 2, 3, 4, 5, 6, 7, ... };
// Also illegal
Classes can be initialized using an initialize-list. The initialize-list initializes all members of a class that use an empty-initializer (( )). For example:
Class C:
A: Integer(5);
C, D, E: Integer();
B: Integer;
Constructor(): B(7) { }
Constructor(X: Integer): B(X), D(X), E(X) { }
End Class;
C1: C { 1, 2, 3 };
C2: C (4);
In this case, C1 is initialized using 1 for C, 2 for D and 3 for E. A is then initialized to 5, while B is not yet initialized. After initializing the members according to the specification of the initialize-list, the compiler always calls the default constructor. Thus, C1.B ends up with a value of 7. An object cannot be initialized using an initialize-list unless the class has an accessible no-argument constructor.
Next, C2 is initialized. Since C2 is initialized using an ordinary constructor, its members are initialized in this order: first, A gets set to 5, then 4 is assigned to D, then E, then B.
The rules that apply to array initialization can be logically adapted to object initialization. Thus, the following are all legal:
Class C: A: Integer(); B: String(); C: Boolean(); End Class;
Var V, W: C { 3, "two", True };
Var X: C { 12 ... };
Var Y: [4] C {
12, "twelve", False,
13, "thirteen", True,
14, "fourteen", ...
};
Var Z: [3] C {
{ 1, "one", true },
{ 2, (), false },
{ 3 ... },
};
Class D: C: C(); X: [2][2] Integer(); Y: String(); End Class;
L: D { { 1, "one", true }, { 1, 2, 3, 4 } ... };
M: D { 2, "two", true, { { 1, 2 }, { 3, 4 } }, "hello" };
N: D { { 3, ... }, 1, 2, 3, 4, "hello" };
But these are illegal:
O: D { { 1, "one", true }, { 1, 2, 3 ... } };
P: D { 2, "two", true, { 1, 2 }, { 3, 4 }, "hello" };
Q: D { 3, ..., 1, 2, 3, 4, "hello" };
An initialize-list can be used in a call to New. For example:
IntArray: @ [] Integer;
IntArray @= New ([] Integer { 1, 2, 3, 4, 5 }); // Create and init array of five integers
This could also be shortened to:
IntArray: @ [] Integer (New ([] Integer { 1, 2, 3, 4, 5 }));
Template class objects can be initialized with initialize-lists.
Class Example Template: Template Type; J: Integer (); TypeVar: @ Type (); K: String (); End Class;
C: C;
T: Example Template C {
4, { 3, "Two", True }, "Zero"
};
When initializing an instance of a class, there is no way to initialize members of base class(es). For example:
Class C: W, X: String(); End Class; Class D: Inherits C; Y, Z: Integer(); End Class;
DInst: D { "Err", "or", 1, 2 }; // ERROR! Can only initialize immediate class
DInst: D { 1, 2 }; // That's better