AngelScript 2.0.0 WIP 4 (2005/01/19)

13 comments, last by WitchLord 19 years, 3 months ago
I've finally released the first WIP of AngelScript. The complete list of changes is too long to list here, so I'll just mention the major ones. The full list can be found on the site.

For this release I concentrated on fixing security issues (pointers, parameter references) and preparing the library for future features (script structures, debug hooks, memory management, etc). There are no real new features, only improvements.

- Pointers have been removed from the script language. (Object handles will somewhat replace these.)
- Arrays are now resizable by calling the resize() method.
- Script functions with parameters are now much easier to call due to the improved interface for passing arguments and retrieving return values. It will no longer be necessary to allocate memory and pass hidden pointers in order to work with object parameters.
- Function parameters that are sent by reference must use the flags in, out, or inout to tell the engine how the parameter is used. This is necessary in order to fix a security issue with parameter references, where a reference may become obsolete before its time. The drawback is that parameter references can no longer be used for increasing performance, as the compiler will still copy the object. (Object handles can be used instead.)
- The library has been prepared internally to support custom memory management per object type, e.g. if an object type is using a memory pool instead of dynamically allocating memory, this will be easily supported by registering a couple of memory management functions for that object type.

I still need to implement the object handles and fix a few minor issues, but AngelScript 2.0.0 is already usable and can execute all the tests in the feature test framework (except the ones that used pointers).

With the object handles the script will be able to declare reference counted pointers for those objects that support this. In expressions the object handles will work just like the real object, but will have the ability to replace the reference with one to another object by doing a reference assignment. Thus there will not be much need for manual dereferencing or -> operators.

The performance of AngelScript 2.0.0 is slightly lower than that of 1.10.1: about 10% lower when working with primitives, and more when working with objects. But this is likely to change before the final version.

Regards,
Andreas Jönsson
Author of AngelScript

[Edited by - WitchLord on January 19, 2005 6:43:01 PM]

AngelCode.com - game development and more - Reference DB - game developer references
AngelScript - free scripting library - BMFont - free bitmap font generator - Tower - free puzzle game

Please excuse my cynicism, but how are object handles different from pointers in this context? It seems to me what you have now are References - nothing but references - with almost-pointers tacked on to fix a couple of issues. Would it not perhaps be better to always move objects by value by default, and use references in the same way C++ does?
Your cynicism is excused [wink]

The object handles are not implemented yet. So at the moment all objects are handled by value, with the possibility of passing them by reference to functions. WIP 2 will include the object handles.

The problem with parameter references the way C++ does them is that they are not safe: there are no safeguards to control the lifetime of the value. Thus it is possible that, while the value is being used in the function, the reference is no longer valid. In AngelScript I have fixed this by copying the object to a temporary variable whose lifetime is guaranteed, and then passing the reference to that one. When the function returns, the value is copied from the temporary variable back to the original location. This is obviously not as efficient as passing the reference to the real variable, but it is safe, which is the main issue with scripting languages.
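The hazard and the copy-in/copy-back fix described above can be sketched in plain C++. This is only an illustration of the idea, not AngelScript's actual implementation; `grow_and_set`, `safe_call`, and `demo_safe_call` are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical callee: it receives a reference, but it also grows the
// container, which may reallocate and invalidate references into it.
void grow_and_set(std::vector<int>& storage, int& ref) {
    storage.resize(1000);
    ref = 42;
}

// Calling grow_and_set(v, v[0]) directly would risk writing through a
// dangling reference once resize() has reallocated the buffer.

// Safe variant: the callee only ever sees a temporary whose lifetime is
// guaranteed, and the value is copied back after the call returns.
void safe_call(std::vector<int>& storage, std::size_t index) {
    assert(index < storage.size());
    int temp = storage[index];    // copy in
    grow_and_set(storage, temp);  // callee works on the temporary
    storage[index] = temp;        // copy back; element is looked up again
}

// Small driver used to check the behaviour.
int demo_safe_call() {
    std::vector<int> v = {1, 2, 3};
    safe_call(v, 0);
    return v[0];
}
```

The extra copy is the efficiency cost mentioned in the post; what it buys is that the callee cannot invalidate the thing it was handed.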

Object handles are necessary because there are some things that are just not possible to do without them. A linked list, for example, wouldn't be possible without object handles. With object handles it will also be possible to pass objects to functions without the need to make inefficient copies of them behind the scenes, so they will also serve as a performance enhancement.

An object handle will add a reference to an object so that the lifetime of the object is guaranteed. Thus an object handle is not the same as a C++ pointer; it is more like a smart pointer. The syntax for object handles is as follows:

obj@ a;          // Declare an object handle that doesn't point to anything
obj@ c = @obj(); // An initialized object handle
obj b;           // A normal object
@a = @b;         // A reference assignment
b = a;           // A value assignment
a.mthd();        // Call the object method
@a = null;       // Clear the reference
@a == null;      // Compare the reference with null
@a != @b;        // Compare object references
a == b;          // Value comparison


[edit]Changed the # to @[/edit]

Note that in the script, ordinary (non-handle) object variables are in fact object handles too, except that they cannot be replaced and are guaranteed to always hold an object. Thus it is allowed to hold a reference to an object variable even after the original object variable goes out of scope.
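For readers who think in C++, the handle semantics described here map loosely onto `std::shared_ptr`. The mapping is an assumption made for illustration, not a description of how AngelScript implements handles, and `Obj` and `handle_analogy_holds` are hypothetical names.

```cpp
#include <cassert>
#include <memory>

struct Obj { int value = 0; };

// Walks through the handle operations using shared_ptr as a stand-in.
// The comment on each line names the script statement it mimics.
bool handle_analogy_holds() {
    std::shared_ptr<Obj> a;                // obj@ a;          empty handle
    auto c = std::make_shared<Obj>();      // obj@ c = @obj(); initialized handle
    {
        auto b = std::make_shared<Obj>();  // obj b;           a normal object
        b->value = 7;
        a = b;                             // @a = @b;         reference assignment
        *c = *b;                           // c = b;           value assignment
    }  // b's scope ends here, but the object survives through a

    assert(a->value == 7);  // still valid after b is gone
    assert(c->value == 7);  // c received a copy of the value
    assert(a != c);         // @a != @c: equal values, different objects

    a->value = 8;
    assert(c->value == 7);  // changing a's object does not touch c's copy

    a = nullptr;            // @a = null;       clear the reference
    assert(a == nullptr);   // @a == null
    return true;
}
```

The inner scope is the point of the sketch: the object first named by `b` stays alive because `a` still refers to it, which is the lifetime guarantee the post describes.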

I decided to use a new token for the object handles, instead of going with the * as C/C++ does. I did this so that C/C++ programmers won't mistake them for normal pointers, as they are not. Also, when using an object handle it behaves just like a normal object variable, so there is no extra syntax like -> to access the object's members. I believe this is for the best.

I know some people chose AngelScript because its syntax was the same as C/C++, and I'm sorry to have to disappoint those people. But keeping to the C/C++ syntax 100% isn't in the best interest of the library. I hope you will agree with me and continue to use AngelScript as it continues to improve.

Regards,
Andreas

PS. Opinions/critiques/suggestions are welcome as always.

[Edited by - WitchLord on January 6, 2005 7:28:48 AM]

Hi WitchLord

It's good to see that you are working on your new version of AS. Congrats to you, good job.

I started this post with some suggestions/opinions, but as I have been away from AS for some months, I don't remember the exact details of this new version, so I'll simply leave this post as a "good work, mate" message.

best regards

Lioric
Unfortunately, I won't be upgrading to 2.0 for a while, mainly because I already have 1.10.x working and partly because I'm not all that concerned with safety. If some schmuck messes with my scripts and crashes my game, that's his problem.

I think my other thought still stands, though. If passing by reference creates a temporary copy anyway, why call them references? Just pass everything by value. I think that, for your object-handles, you can handle them the same way you do references now. No changes to syntax required; just clone the object and use reference counting to manage the lifetime of this object.

Also, please don't use the # symbol. There are some situations where there's no way to distinguish it from a preprocessor command.
It's ok, the first version of 2.0.0 will probably not be all that stable anyway. I'll still keep the 1.10.1 as a stable version, but I'll put up warnings about the security issues.

Parameter references are still useful even though they are being copied, because they allow a function to return more than one value. Object handles can only be taken for object types that have been prepared for them, i.e. those that have some way of controlling their lifetime. Primitive types don't have this, so object handles cannot be used for them.

Another reason why I still want to keep the by-reference parameters is that they are compatible with the way C/C++ works. That is one of the reasons I'm making AngelScript in the first place: to have it be able to call normal C/C++ functions.

Still, maybe I shouldn't say that the parameters are sent by reference. I'll update the manual with this. Thanks for the idea.

Good point about the # symbol. I'll change it to something else. What about the @ or the $? I kind of like the @, because it can be read as 'at'.

OK, I haven't really got enough time to actually try out v2 yet, but there are a few things that seem strange on the WIP page, mainly it seems like you have added the following functions:
SetArgDWord()
SetArgQWord()
SetArgFloat()
SetArgDouble()
SetArgObject()

Can I ask why there is a need for all these functions? Could it not be replaced by one function (perhaps just SetArgument) that is overloaded to take a DWord, QWord etc? You could always cast the object to make sure the correct function is called.

A better idea might be to use the stream operators, like cout and cin. Then to set arguments and retrieve returned values, you could do something like:
float firstParam = 10;
string secondParam = "test";
float returnValue;
context << firstParam << secondParam;
//execute script
context >> returnValue;
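To make the suggestion concrete, here is a minimal sketch of what such a stream-style wrapper could look like. Everything in it is hypothetical: `StreamContext`, its `execute()` method, and the stand-in "script" that simply sums its inputs are invented for illustration and are not part of AngelScript.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a script context; not the AngelScript API.
class StreamContext {
public:
    // Each operator<< queues one argument, in call order.
    StreamContext& operator<<(float v) {
        args_.push_back(Arg{Arg::Float, v, ""});
        return *this;
    }
    StreamContext& operator<<(const std::string& v) {
        args_.push_back(Arg{Arg::String, 0.0f, v});
        return *this;
    }

    // Stand-in for running a script function: adds up the float
    // arguments and the lengths of the string arguments.
    void execute() {
        result_ = 0.0f;
        for (const Arg& a : args_) {
            result_ += (a.kind == Arg::Float)
                           ? a.number
                           : static_cast<float>(a.text.size());
        }
        args_.clear();
        executed_ = true;
    }

    // operator>> retrieves the return value after execution.
    StreamContext& operator>>(float& out) {
        assert(executed_);
        out = result_;
        return *this;
    }

private:
    struct Arg {
        enum Kind { Float, String } kind;
        float number;
        std::string text;
    };
    std::vector<Arg> args_;
    float result_ = 0.0f;
    bool executed_ = false;
};

// Mirrors the usage shown in the post.
float run_stream_example() {
    float firstParam = 10;
    std::string secondParam = "test";
    float returnValue = 0;
    StreamContext context;
    context << firstParam << secondParam;
    context.execute();
    context >> returnValue;
    return returnValue;
}
```

One property of this design is that argument positions are implicit in the order of the `<<` calls, so nothing stops a caller from queuing them in the wrong order.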

Hopefully when I get a bit more time, I'll download V2 and take a look.

Oh, and to answer your question about which symbol to use for references, why not use the same one as C++, i.e. '&'?
I personally don't like the stream interface. But I'll think about it.

I chose the different function names for consistency, because the GetReturn..() functions need different names.
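The asymmetry between setters and getters comes from a C++ rule: functions can be overloaded on their parameter types, but not on their return type alone. The sketch below uses a hypothetical `MockContext` to show this; its member names and signatures are assumptions made for the example, not the real AngelScript interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mock; it only records which overload was chosen.
class MockContext {
public:
    enum class Kind { None, DWord, Float, Double };

    // Setters can share one name: the argument type selects the overload.
    void SetArg(int /*index*/, std::uint32_t v) { kind_ = Kind::DWord;  dword_ = v; }
    void SetArg(int /*index*/, float v)         { kind_ = Kind::Float;  float_ = v; }
    void SetArg(int /*index*/, double v)        { kind_ = Kind::Double; double_ = v; }

    // Getters cannot be overloaded the same way. Declaring both
    //   std::uint32_t GetReturn();
    //   float         GetReturn();
    // is an error, because they differ only in return type.
    // Distinct names are therefore required.
    std::uint32_t GetReturnDWord() const  { return dword_; }
    float         GetReturnFloat() const  { return float_; }
    double        GetReturnDouble() const { return double_; }

    Kind LastKind() const { return kind_; }

private:
    Kind kind_ = Kind::None;
    std::uint32_t dword_ = 0;
    float float_ = 0.0f;
    double double_ = 0.0;
};

bool overloads_dispatch_by_type() {
    MockContext ctx;
    ctx.SetArg(0, 1.5f);
    assert(ctx.LastKind() == MockContext::Kind::Float);
    ctx.SetArg(0, 1.5);
    assert(ctx.LastKind() == MockContext::Kind::Double);
    ctx.SetArg(0, static_cast<std::uint32_t>(10));
    assert(ctx.LastKind() == MockContext::Kind::DWord);
    // ctx.SetArg(0, 10);  // would not compile: an int converts equally
    //                     // well to all three overloads (ambiguous call)
    return true;
}
```

The commented-out call shows the cost of overloading the setters: a plain integer literal is ambiguous and needs a cast, which is the casting the earlier post alludes to.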

I don't want to use the & token for the same reason I don't want to use *, to not confuse the object handles with C++ references or pointers, because they are not the same thing. The function parameters sent by reference will use the & token, because they do the same thing.



I thought about suggesting the & symbol, but declined because of your convincing argument against * above. Neither of those really ever made any sense to begin with. The only pointer symbol whose origin is understandable is ->.

With #, the only problem that would arise is if someone used a preprocessor directive name for a variable. Suppose they named a variable 'define' and later tried to apply referencing to it: '#define'. Uh-oh. I could ignore directives that the preprocessor didn't understand, but that means it could potentially make code that is correct now incorrect later. If they had the variable pragma, and later I added support for pragmas: uh-oh again. This could also make scripts compile to two different, correct results depending on whether the preprocessor is run. Ideally, if the developer is using the preprocessor and the script is actually dependent on it, it won't ever be used in both environments. (And I can't think of how it's actually possible unless it was specifically written just to prove it's possible, but you never know.)

I noticed you had some issues arise with passing objects on the stack. Is this part of why you decided not to pass things on the stack (by value) even if you mimic that behavior? My main concern is accidental modification of by-reference arguments. I'm going to assume that the in/out keywords are for this purpose, and that modifying an argument labeled as 'in' will have no effect on the original in the calling code. I'll also assume that 'out' would be equivalent to pass-by-reference in C++. If that is the case, might I suggest that the in keyword be optional? Assume every argument is of 'in' type, unless it is explicitly marked 'out'. The scriptwriter could, of course, explicitly specify the in. Furthermore, could you elaborate on inout? How is it different? Aren't the concepts embodied by in and out mutually exclusive?

On the stream operators - I was just trying to think of a way to elegantize the passing of parameters to script in my own project, and I'm going to have to go see how streaming works out for that now desertcube. The only problem I see is the potential for inserting them in reverse order.
I agree completely about the # symbol. I simply forgot about the potential use of a preprocessor similar to that of C/C++.

At the moment I've chosen to use the @ symbol instead, but it is a simple matter to change it if that should prove necessary later on.

Internally, AngelScript passes all objects to functions by pointer, but when calling application functions that are registered to take an object by value it copies the object to the C++ stack. I decided to do it like this because it will make it easier to write the exception handler, customized memory management per object type, dynamic type determination of objects, etc.

The decision to use the in, out, and inout keywords for reference parameters was because of security. AngelScript had to support passing variables by reference, because C++ does it, but it also had to be able to strictly control the lifetime of the references so as to provide a secure/stable environment for the scripts.

In the scripts there is really no use in declaring a parameter as &in, as the object will still be copied to a temporary object in order to avoid having the reference become invalid before its use. Since objects are always passed by pointer internally, the only difference between &in and passing by value is that when passing by ref the calling function is responsible for freeing the object instead of the called function. For application functions the &in type of reference is important, because it is an often used way of passing object parameters due to its efficiency.

The &out type of parameter reference passes the reference to an initialized temporary object to the function so that the function can assign values to it. When the function returns the value stored in the temporary object will be copied to its determined destination.

The &inout type of parameter reference creates a temporary object just like the &in type and copies the object to it before passing it to the function. When the function returns the value in the temporary object is copied back to its determined destination, just like the &out type.

One thing to note here is that for the &out type of parameters the argument expression is evaluated after the function returns, and for the &inout type the argument expression is evaluated both before and after the function returns. If the expression is complex, i.e. calls any functions etc, the compiler will warn that the expression is evaluated twice. The fact that the expression is evaluated only when it is needed serves to make sure the reference is valid (because it might change during the function call).
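The three reference modes, and the point about when the argument expression is evaluated, can be modelled in C++ as follows. This is a sketch of the semantics as described in this post, not the engine's code; `call_in`, `call_out`, `call_inout`, and the demo functions are hypothetical. The argument expression is represented as a callable `dest` so that each evaluation can be counted.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// &in: the value is copied to a temporary; nothing is copied back.
template <typename T, typename Fn>
void call_in(const T& arg, Fn fn) {
    T temp = arg;
    fn(temp);
}

// &out: the callee gets an initialized temporary; the destination
// expression is evaluated only after the callee returns.
template <typename Dest, typename Fn>
void call_out(Dest dest, Fn fn) {
    typename std::remove_reference<decltype(dest())>::type temp{};
    fn(temp);
    dest() = temp;
}

// &inout: the expression is evaluated before the call (copy in) and
// again after it (copy back), so it is evaluated twice.
template <typename Dest, typename Fn>
void call_inout(Dest dest, Fn fn) {
    typename std::remove_reference<decltype(dest())>::type temp = dest();
    fn(temp);
    dest() = temp;
}

struct Result { int value; int evaluations; };

int demo_in() {
    int original = 5;
    call_in(original, [](int& x) { x = 99; });
    return original;  // the caller's variable is untouched
}

Result demo_out() {
    std::vector<int> v = {1, 2, 3};
    int evaluations = 0;
    auto element = [&]() -> int& { ++evaluations; return v[0]; };
    call_out(element, [](int& x) { x = 42; });
    return Result{v[0], evaluations};
}

Result demo_inout() {
    std::vector<int> v = {1, 2, 3};
    int evaluations = 0;
    auto element = [&]() -> int& { ++evaluations; return v[0]; };
    call_inout(element, [&](int& x) {
        v.resize(1000);  // may reallocate; an early reference to v[0] would dangle
        x += 10;
    });
    return Result{v[0], evaluations};
}
```

In `demo_inout` the callee resizes the vector during the call. Because `v[0]` is looked up again only when the result is copied back, the write lands in valid storage, which is the reason the post gives for evaluating the expression late.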

I don't think the by-ref parameter types will be used that often, because they don't provide any advantage in performance. They will only be necessary when a function has to return more than one value, in which case it is a good idea to explicitly specify how the reference will be used.

Exactly how the interface looks for passing parameters to the script functions isn't very important and is a small thing to change, so I won't worry too much about that right now. First I want to get version 2.0.0 working with all the features necessary to make it as good as, or even better than, 1.10.1.



This topic is closed to new replies.
