Javascript (a.k.a. ECMAScript) is sort of the "mighty midget" of scripting
languages. Its small syntax and core object model give it a very tiny
footprint, yet it has some very strong introspective capabilities that allow
programmers who understand it to do some very powerful things with it. This
paper focuses on using these capabilities to address some of the language's
deficiencies. Specifically, it addresses the deficiencies that make it awkward
to develop larger, modular systems in Javascript.
Javascript's Deficiencies
In the interest, perhaps, of keeping Javascript's footprint small, the designers chose to omit several high level syntactic constructs found in most other object-oriented programming languages, notably classes and namespaces. The advantage of both is that they provide name scoping, eliminating the verbosity of explicitly identifying the scope of every symbol and simplifying the refactoring process. Javascript provides something similar to the basic functionality of classes, but requires that it be implemented explicitly by binding a name to a prototype:
    // Foo class "constructor"
    function Foo() { ... }
    // Foo method doSomething()
    Foo.prototype.doSomething = function () { ... }
The Foo.prototype usage is rather awkward (and is itself a simplification of defining the function prior to the assignment and then performing the assignment). One option is to simply assign the prototype to a variable and perform the bindings on that variable:
    // let "c" be my class...
    var c = Foo.prototype;
    c.doSomething = function() { ... }
    c.doSomethingElse = function() { ... }
This certainly decreases the verbosity of the declarations, and also removes the redundancy of having the class name present throughout all of them. However, it only provides one degree of encapsulation: we have classes, but no namespaces. Fortunately, we can use a similar pattern for namespaces:
    // create an object to hold our namespaces
    var ns = new Object;
    // create the Foo constructor as part of our namespace
    ns.Foo = function () { ... }
    var c = ns.Foo.prototype;
    c.doSomething = function () { ... }
    c.doSomethingElse = function () { ... }
Not too shabby! We still have some problems, though:
We now have to explicitly dereference the namespace on every symbol access.
We don't have the visual signals of encapsulation that most of us are used to (no curly braces surrounding the scope).
There are some easy remedies for these issues:
    var ns = new Object;
    // namespace ns {
        // class Foo {
            ns.Foo = function () { ... }
            var c = ns.Foo.prototype;
            c.doSomething = function () { ... }
        // } end Foo
    // } end ns
    // "import" foo into the global namespace
    var Foo = ns.Foo;
Here we are using special comment mark-ups to give us the appearance of scope, and simple variable assignment to import names from namespaces. But now we have a slight disconnect between the physical appearance of the code and the actual elements that are providing the scoping: for example, consider the case of the careless programmer who adds the "clueless" method to the class:
    // class Foo {
        ns.Foo = function () { ... }
        var c = ns.Foo.prototype;
        c.doSomething = function () { ... }
        function clueless() { ... }
    // } end Foo
The problem is that this code looks scoped but is not: clueless() is declared as an ordinary function in the enclosing context, not as a method of Foo or a member of ns.
In the end, most would agree that this is a fairly minor criticism, although
most would also probably prefer better scoping constructs.
Broken Inheritance
The recommended approach for inheritance - deriving one class from another - is through the use of the "prototype" attribute:
    // class A
    function A() { ... }
    // class B inherits from A
    function B() { ... }
    B.prototype = new A;
It's not immediately obvious to most Object-Oriented programmers what's going on here: that's because Javascript is a "prototype based" Object-Oriented language. It has no "classes" per se, only instances and delegation. These are powerful concepts in their own right, but this classless model of development doesn't sit well with most programmers, and even the patterns recommended by the standard documentation have some inherent flaws.
Consider, for example, the problem of constructors. Let's fill in our class A a little bit:
function A(name) { this.name = name; }
So now we can create an "instance" of A with a name attribute as follows:
    var a = new A('this is A!');
No problems so far. But now let's try to extend it:
function B(rank) { this.rank = rank; } B.prototype = new A;
We now might think we have a class "B" which "inherits" from A... but we don't. In fact, what we have is a function that can be used to create an instance which delegates to an instance of A. To fully grasp the limitations of this, consider the problem of defining A's "name" instance variable. We can't simply call "A" or even "new A" within B: A is for constructing new instances of A. The new instance of A that we've used for the prototype doesn't help us because we can't pass "name" to it for every new instance of B - there's only one A, but many B's.
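The limitation described above can be seen in a small sketch. The definitions follow the text; the string passed to A when building the prototype is only there for illustration, so that the shared value is visible.

```javascript
// Same shape as the definitions in the text.
function A(name) { this.name = name; }
function B(rank) { this.rank = rank; }

// One A instance, created once, serves as the prototype for every B.
// (The argument is illustrative; the text uses a bare "new A".)
B.prototype = new A('shared name');

var b1 = new B(1);
var b2 = new B(2);

// Neither instance has a "name" of its own: both delegate to the
// single prototype object.
var ownName = b1.hasOwnProperty('name');
var sameName = (b1.name === b2.name);

// Changing the prototype's name changes what every B reports.
B.prototype.name = 'renamed';
var bothRenamed = (b1.name === 'renamed' && b2.name === 'renamed');
```

Each B gets its own rank, but there is nowhere for a per-instance name to come from unless B sets it itself.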
The recommended approach for this is to simply do so explicitly within the B constructor:
function B(name, rank) { this.name = name; this.rank = rank; }
Obviously, this violates A's encapsulation. Is there a better way?
We can use the "constructor/initializer" pattern to work around this:
    function A(name) { this.A_init(name); }
    A.prototype.A_init = function(name) { this.name = name; }
    function B(name, rank) { this.B_init(name, rank); }
    B.prototype = new A;
    B.prototype.B_init = function(name, rank) {
        this.A_init(name);
        this.rank = rank;
    }
The obvious problem with this is that it's an awful lot of code - too much code, in fact. We don't need to create alternate constructors for every class, it is sufficient to simply alias those of the classes that we override:
    function A(name) { this.name = name; }
    function B(name, rank) {
        this.A(name);
        this.rank = rank;
    }
    B.prototype = new A;
    B.prototype.A = A;
This solution also works for the case of a non-constructor "extension method":
    B.prototype.A_overridenMethod = B.prototype.overridenMethod;
    B.prototype.overridenMethod = function () {
        this.A_overridenMethod();
        ... do my extensions ...
    }
Unfortunately, we are now back to the problem of class names embedded in our implementation, albeit not as widespread as the case considered earlier in this article. A seductive solution to this is to replace references to the base class name (B.prototype.A, A_overridenMethod...) with the word "super", the keyword used in Java and Smalltalk to indicate that the method is that of the superclass instead of the immediate class. The problem with this is that it's not transitive - calling the constructor "super" would work for our 2 level class hierarchy, but as soon as we derive from B and create another "super" constructor, we've made A's constructor inaccessible (and created an infinitely recursive constructor, to boot).
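The non-transitivity described above can be reproduced with a three-level hierarchy. This is a sketch only: the property name super_ is a hypothetical stand-in for "super", and class C is added purely for illustration.

```javascript
function A(name) { this.name = name; }

function B(name, rank) {
    this.super_(name);        // intended to mean "call A"
    this.rank = rank;
}
B.prototype = new A;
B.prototype.super_ = A;

function C(name, rank, unit) {
    this.super_(name, rank);  // intended to mean "call B"
    this.unit = unit;
}
C.prototype = new B;
C.prototype.super_ = B;       // shadows B.prototype.super_ for every C

// Two levels behave as intended.
var b = new B('b', 1);

// Three levels do not: inside B, "this" is the C instance, so
// this.super_ resolves to B again, and B keeps calling itself until
// the engine gives up.
var recursionError = null;
try {
    new C('c', 2, 'u');
} catch (e) {
    recursionError = e;
}
```

A's constructor is never reached from a C instance, because the lookup of super_ always starts at the instance and stops at C.prototype.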
There's another issue here that we haven't touched on yet: that of construction of prototypes. A prototype is an instance, not a class, and as such carries "instance baggage" even though it doesn't need to. This is not a very serious problem in our last example - B.prototype ends up with a variable "name" which is set to undefined. It can be more of an issue if any heavy calculation is performed or data-heavy structures are created in the constructor. The standard work-around for this sort of thing is to simply create your constructor so that it behaves differently when creating an instance as a prototype. This is normally done by checking for the absence of arguments:
function A(name) { if (name != undefined) this.name = name; }
The situation becomes more complicated if the instance can be constructed
without arguments - then we have to define a particular argument signature which
triggers prototype instantiation.
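One way to define such an argument signature is a dedicated sentinel object. The following is a minimal sketch under that assumption; AS_PROTOTYPE, Counter and LoudCounter are hypothetical names, not part of any library.

```javascript
// Hypothetical sentinel: a unique object that no caller would pass
// by accident.
var AS_PROTOTYPE = {};

function Counter(start) {
    // Skip all instance set-up when we are only building a prototype.
    if (start === AS_PROTOTYPE) return;
    // A Counter may legitimately be constructed with no arguments,
    // so "no arguments" cannot be the prototype signal here.
    this.count = (start === undefined) ? 0 : start;
}
Counter.prototype.increment = function () {
    this.count += 1;
    return this.count;
};

function LoudCounter(start) {
    this.Counter(start);      // aliased base constructor, as shown earlier
    this.loud = true;
}
LoudCounter.prototype = new Counter(AS_PROTOTYPE);
LoudCounter.prototype.Counter = Counter;

// The prototype carries no instance data...
var protoHasCount = LoudCounter.prototype.hasOwnProperty('count');

// ...while a real instance built without arguments is fully set up.
var c = new LoudCounter();
var first = c.increment();
```

The cost is that every constructor in the hierarchy has to know about the sentinel.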
A Better Solution?
The solutions I've described so far all have the virtue of simplicity - they can be easily implemented without any special support code and their intent is very clear to anyone with basic knowledge of the language. However, they do leave something to be desired - we've essentially worked around Javascript's warts, we haven't really extended the language in any way.
A richer solution to all of these problems is to take advantage of the
language's essentially minimalistic and introspective nature and create
constructs to stand in for the absence of built-in features. The remainder of
this article describes my "ispect" module, which you are free to use, modify and
redistribute for any purpose.
Namespaces
I described a namespace strategy earlier which uses objects as namespaces. We can do the same thing within the scope of a constructor function:
    // namespace ns
    function ns() {
        // class Foo
        this.Foo = function () { ... }
        // class Bar
        this.Bar = function () { ... }
        const privateConstant = 100;
    }
    var ns = new ns();
This is essentially the equivalent of what we described earlier, but it has two additional advantages:
variables and functions can be defined as private to the namespace (access protection)
it looks more like traditional scoping
We get access protection by virtue of Javascript's use of closures - Foo and Bar have access to ns's local variable context even after ns's execution is complete. So Foo and Bar can read privateConstant, but no one outside the namespace can.
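A small sketch of this access protection follows. The namespace function is named NsBody here (a hypothetical name) to keep it distinct from the resulting ns object, and the limit() method is added only so that there is something that reads the private value.

```javascript
// namespace body, built as in the text
function NsBody() {
    const privateConstant = 100;

    // class Foo: its method reads privateConstant through the closure
    this.Foo = function () {};
    this.Foo.prototype.limit = function () {
        return privateConstant;
    };
}
var ns = new NsBody();

var f = new ns.Foo();

// Code defined inside the namespace can see the constant...
var seenInside = f.limit();

// ...but it is neither a member of the namespace object nor a
// variable visible from outside.
var visibleOnNs = ('privateConstant' in ns);
var typeOutside = (typeof privateConstant);
```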
We can still use the trick of assigning the members to variables in the local context to import them:
    var Foo = ns.Foo;
A somewhat more elegant approach to this is to write a function to do this importing for us:
    function importNames(namespace, names) {
        if (!names) {
            // the user didn't provide any names to import - build a list of all
            // names in the namespace
            names = [];
            for (var name in namespace)
                names.push(name);
        }
        // import the names by assigning them to the global context ("this")
        for (var i in names) {
            name = names[i];
            this[name] = namespace[name];
        }
    }
importNames() makes use of the fact that "this" in a normal (non-method) function call refers to the global context, at least in non-strict code, so when we set its attributes we are, in fact, setting attributes in the global context.
From within the global namespace we can use importNames() to import all of the public symbols in ns like so:
    importNames(ns);
    // create an instance of the newly imported Foo class
    var f = new Foo();
Unfortunately, this function isn't much use within another namespace: it implicitly imports symbols into the global context. To import within a namespace it is necessary to use the variable assignment trick described earlier:
    function other_ns() {
        // XXX won't work! imports the symbols globally!
        importNames(ns);
        // this is what we want - import privately into this namespace
        var Foo = ns.Foo;
    }
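A possible middle ground, sketched here as an assumption rather than as part of the module described in this article, is a variant that takes its destination explicitly instead of relying on "this". The name importNamesInto and the holder object are hypothetical. Note that this does not create bare local names: imported symbols are reached through the holder, so the variable assignment trick is still needed when unqualified names are wanted.

```javascript
// Hypothetical variant of importNames() with an explicit target object.
function importNamesInto(target, namespace, names) {
    if (!names) {
        // no names given: import everything enumerable in the namespace
        names = [];
        for (var key in namespace) names.push(key);
    }
    names.forEach(function (name) {
        target[name] = namespace[name];
    });
    return target;
}

// A stand-in namespace for the example.
var ns = { Foo: function () {}, Bar: function () {} };

function other_ns() {
    // import into a private holder rather than the global context
    var imported = importNamesInto({}, ns, ['Foo']);
    this.makeFoo = function () { return new imported.Foo(); };
}

var other = new other_ns();
var made = other.makeFoo();
var everything = importNamesInto({}, ns);
```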
The very same approach that we use to create scope in namespaces can be used to create classes. This trick merits some further discussion, because there's a lot involved in it.
Strictly speaking, Javascript has no classes. It only has instances and prototypes. Attribute resolution in an instance delegates back up the prototype chain, so if an attribute is not resolved in the instance, the interpreter attempts to resolve it in the prototype, then in the prototype's prototype, and so on upwards. The prototypes, in turn, are simply other instances. This is what creates our problem with instance variables: we're always constructing instances, even when we don't want to.
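The delegation described above can be observed directly. In this sketch the names grandparent, Parent and Child are illustrative only.

```javascript
// The top of the chain: an ordinary object.
var grandparent = { greeting: 'hello' };

// parent is an instance whose prototype is grandparent.
function Parent() {}
Parent.prototype = grandparent;
var parent = new Parent();
parent.farewell = 'goodbye';

// child is an instance whose prototype is parent: a prototype is
// simply another instance.
function Child() {}
Child.prototype = parent;
var child = new Child();
child.ownValue = 42;

var fromInstance = child.ownValue;     // resolved on child itself
var fromPrototype = child.farewell;    // resolved on parent
var fromGrandparent = child.greeting;  // resolved on grandparent
var missing = child.nothingHere;       // chain exhausted: undefined

// Only ownValue actually lives on the instance.
var ownsFarewell = child.hasOwnProperty('farewell');
```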
Of course, there's nothing preventing us from making some of our instances more classlike. We could define all of our inheritable objects so that they contain no instance specific data (essentially what the constructor/init pattern described above does) and only create instance data for final, non-derivable instances. However, in order to do this we have to once again do a lot more coding.
Fortunately, Javascript is minimalistic enough that we can play some tricks with it to get the kind of results that we want.
Let's start with our namespace approach, but instead of a namespace let's say we are defining a class:
    // the body of the class
    function FooBody() {
        // this will be our constructor
        this.init = function(name) { this.name = name; }
        // ... and an accessor method, just to drive the point home
        this.getName = function() { return this.name; }
    }
    // instantiate the class
    var FooClassObj = new FooBody();
    // define a constructor
    function Foo(name) { this.init(name); }
    Foo.prototype = FooClassObj;
The key to this example is that all of the personality of the class is contained within the FooBody function. Everything else is just support infrastructure. What we are doing is essentially defining a dedicated class object. Foo is just an empty constructor - all functionality is delegated to the FooBody instance (FooClassObj).
It is important to note that the "this" in the init() and getName() functions in FooBody is not the same as the "this" in the FooBody function itself. The latter refers to the class object, the former refers to an instance.
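The two meanings of "this" can be checked with a lightly instrumented copy of the example. The classObj variable and the sawClassObj flag are hypothetical additions made only for this demonstration.

```javascript
function FooBody() {
    // Here "this" is the class object being built by "new FooBody()".
    var classObj = this;

    this.init = function (name) {
        // Here "this" is whatever object init() is called on: an instance.
        this.name = name;
        this.sawClassObj = (this === classObj);
    };
}
var FooClassObj = new FooBody();

function Foo(name) { this.init(name); }
Foo.prototype = FooClassObj;

var foo = new Foo('example');

var instanceIsClassObj = foo.sawClassObj;
var classObjHasName = FooClassObj.hasOwnProperty('name');
var instanceHasName = foo.hasOwnProperty('name');
var methodOnClassObj = FooClassObj.hasOwnProperty('init');
```

The method lives on the class object, while the data it writes lands on the instance.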
Seeing that all of the real class personality is in FooBody, we can easily generalize the rest:
    // function which returns a normalized class object
    function makeClass(body) {
        // instantiate the class
        var classObj = new body;
        // define a generic constructor
        function constructor() { this.init.apply(this, arguments); }
        constructor.prototype = classObj;
        // ... and return it!
        return constructor;
    }
    // now define the class with our class-creator
    var Foo = makeClass(function() {
        // this will be our constructor
        this.init = function(name) { this.name = name; }
        // ... and an accessor method, just to drive the point home
        this.getName = function() { return this.name; }
    });
We've basically taken everything that we were doing in the previous example and wrapped it up into a function (makeClass()). Then we've used this function to create Foo.
Foo now looks very much like any other Javascript constructor function: we can instantiate it with new and we'll get an object with a name attribute and a getName() method.
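As a quick check of that claim, the following sketch repeats the single-argument makeClass() from above so that it runs on its own, then exercises the resulting constructor.

```javascript
// makeClass() as defined above (single-argument form)
function makeClass(body) {
    var classObj = new body;
    function constructor() {
        this.init.apply(this, arguments);
    }
    constructor.prototype = classObj;
    return constructor;
}

var Foo = makeClass(function () {
    this.init = function (name) { this.name = name; };
    this.getName = function () { return this.name; };
});

var first = new Foo('first');
var second = new Foo('second');

var firstName = first.getName();
var secondName = second.getName();

// Instances are recognised by instanceof, share their methods through
// the class object, and keep their data to themselves.
var isFoo = (first instanceof Foo);
var sharedMethod = (first.getName === second.getName);
var protoClean = !Foo.prototype.hasOwnProperty('name');
```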
Now let's consider what we've actually accomplished by having done this:
We have created a prototype for Foo that looks very much like the class objects supported by other languages - it has no instance variable assignments.
We have defined all of the characteristics of the class in a genuine scope (that of the anonymous function passed into makeClass()).
One consequence of these details is that we can now define public and private class variables: variables defined within the anonymous "body function" will be accessible from all functions in that scope, but not externally (a private class variable). Variables assigned to "this" from within the body function (but not within its nested "methods") will be accessible from the class object, all instances, derived classes, and externally (a public class variable).
Here's a more complex example to illustrate these points:
    var InstanceCounter = makeClass(function () {
        // private counter variable - always accessible from nested functions
        // since Javascript defines them as closures, which carry their
        // environment with them wherever they go.
        var counter = 0;
        // public class variable to hold the text we want to show before the
        // count
        this.text = 'Counter value: ';
        // constructor
        this.init = function() { counter += 1; }
        this.printCounter = function () { print(this.text + counter); }
    });
    // set the title text for all instances
    InstanceCounter.prototype.text = 'Check this out: ';
    var i = new InstanceCounter();
    // print "Check this out: 1"
    i.printCounter();
Our makeClass() function is missing one very important feature - there's no way to define a base class. This is easily remedied: simply add another argument for an optional base class. But now that we've introduced inheritance into the picture, we've set ourselves up for the problem we described earlier: how do we access base class methods that we've overridden?
We could use the same trick we used before: moving the base class method to an alternate name and then defining the new method in terms of it. Overriding a constructor using such a technique would look something like this:
    var Derived = makeClass(Base, function() {
        this.Base_init = this.init;
        this.init = function() {
            this.Base_init();
            ... my construction ...
        }
    });
Again, this brings us to the problem of potentially having the base class name scattered throughout the derived class. What we'd really like is something along the lines of the aforementioned "super" keyword.
Once again, we can turn to Javascript's own object model and introspection features for a solution. First of all, we need access to the base class methods that we want to call. In the previous example, we obtained these by storing them in another variable prior to overriding them. An alternate approach is to access them through the superclass, which is the prototype of the prototype of the instance object.
Unfortunately, Javascript does not provide a platform-independent mechanism for obtaining an object's prototype - the "__proto__" instance variable works in Mozilla/SpiderMonkey, but not on IE. We'll ignore techniques for morphing to accommodate the platform (that's not the subject of this article) and simply create our own references by adding the "_class" and "_base" instance variables to the instances and classes, respectively. Our makeClass() method now looks like this:
    // function which returns a normalized class object
    function makeClass(base, body) {
        // make the base class our prototype - "base" will be a construction
        // function, so we obtain its prototype which will be our "class
        // object".
        if (base) body.prototype = base.prototype;
        // instantiate the class
        var classObj = new body;
        // add the _base variable to the class object
        if (base) classObj._base = base.prototype;
        // define a generic constructor
        function constructor() {
            // assign our _class variable
            this._class = classObj;
            this.init.apply(this, arguments);
        }
        constructor.prototype = classObj;
        // ... and return it!
        return constructor;
    }
At this point, we can directly call a base class function with this._class._base.func, but this alone will not work very well because the "this" pointer passed into func will not be our "this", but rather this._class._base. We need to use the function's call() method to provide it with an alternate "this":
    var Derived = makeClass(Base, function() {
        this.init = function() {
            // call the base init function, propagating the current "this"
            // variable
            this._class._base.init.call(this);
            ... my construction ...
        }
    });
This is rather ugly. It would be preferable to hide the expansion of _class._base and the function calling mechanism. We can do this with a couple of wrapper functions and some heavy use of closures. First we define a "super-wrapper":
    // constructor for an object that wraps a superclass. Uses the "getter"
    // mechanism to create "superclass methods" (methods of the base class
    // which are bound to an instance of a derived class).
    function SuperWrapper(base) {
        // returns a function that is bound to an instance - calling the
        // returned function will be the equivalent of "inst.func(...)"
        function bind(func, inst) {
            return function () { return func.apply(inst, arguments); }
        }
        // function to return a "getter" that returns a superclass method that
        // has been bound to an instance
        function getterMaker(key) {
            return function () { return bind(base[key], this._inst); }
        }
        // create properties for each method in the base class
        for (var key in base) {
            if (typeof(base[key]) == 'function')
                this[key] getter = getterMaker(key);
        }
    }
Next, we'll change part of our makeClass() function so that it creates one of these for every base class and ensures that the class has a superclass() method:
    // add the _base variable to the class object
    if (base) {
        classObj._base = base.prototype;
        // create a generic super-wrapper for the class
        var superWrapper = new SuperWrapper(base.prototype);
        // create a base wrapper which uses it as a prototype - this provides
        // the getters of the super-wrapper with an instance to bind to
        // superclass methods
        classObj._baseWrapper = function BindingSuperWrapper(inst) {
            print('creating base wrapper');
            this._inst = inst;
        }
        classObj._baseWrapper.prototype = superWrapper;
        // give the class a superclass method if it hasn't inherited one from
        // the base class (or defined one explicitly)
        if (!classObj.superclass) {
            classObj.superclass = function() {
                if (!this._super) {
                    this._super = new this._baseWrapper(this);
                }
                return this._super;
            }
        }
    }
Now we can change our derived class to look like this:
    var Derived = makeClass(Base, function() {
        this.init = function() {
            this.superclass().init();
            ... my construction ...
        }
    });
This code merits some illustration beyond the comments. The thing that we are automating here is the association of a class method with a derived class instance, something we would have otherwise had to do with an explicit "apply".
We are doing this by creating an instance of SuperWrapper. SuperWrapper is an object full of getters (constructed attributes) - one for each method in the base class. Each of these getters, in turn, creates and returns a function object (actually, a closure) which will call the base class method on the derived class instance, giving us the illusion that we are calling the base class function directly.
The SuperWrapper instance is created for each class. But it needs to operate on an instance, so calling the superclass() method the first time for an instance creates an "instance wrapper" from a function whose prototype is the SuperWrapper instance for the class:
       +------------+
       | BaseClass  |          Each class has a single
       |class object|          (heavy-weight) SuperWrapper,
       +------------+          instances each have a light-weight
             ^                 front-end to it.
             | prototype
             |
       +------------+                   +------------+
       |DerivedClass|-------------------|SuperWrapper|
       |class object|                   |  instance  |
       +------------+                   +------------+
             ^                                ^
             | prototype                      | prototype
             |                                |
    +---------------------+           +----------------+
    |DerivedClass instance|-----------|instance wrapper|
    +---------------------+           +----------------+
This kind of structure really pushes Javascript to its limits: I have only found it to work on Mozilla/SpiderMonkey. Information on how to get the same effect on IE would be welcome.
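On engines that implement ECMAScript 5, the getter part of this design can be expressed with the standard Object.defineProperty() instead of the engine-specific getter assignment. The following is a sketch under that assumption, not the code of the module described here: it mirrors the structure above, drops the debugging print(), and has only been reasoned through for a two-level hierarchy. Base, Derived and describe() are illustrative names.

```javascript
function SuperWrapper(base) {
    // bind a base class function to a derived class instance
    function bind(func, inst) {
        return function () { return func.apply(inst, arguments); };
    }
    // define a getter on the wrapper that returns the bound method;
    // inside the getter, "this" is the per-instance wrapper
    function defineBoundGetter(wrapper, key) {
        Object.defineProperty(wrapper, key, {
            get: function () { return bind(base[key], this._inst); },
            enumerable: true
        });
    }
    for (var key in base) {
        if (typeof base[key] === 'function') defineBoundGetter(this, key);
    }
}

function makeClass(base, body) {
    if (base) body.prototype = base.prototype;
    var classObj = new body;
    if (base) {
        classObj._base = base.prototype;
        var superWrapper = new SuperWrapper(base.prototype);
        classObj._baseWrapper = function (inst) { this._inst = inst; };
        classObj._baseWrapper.prototype = superWrapper;
        if (!classObj.superclass) {
            classObj.superclass = function () {
                if (!this._super) this._super = new this._baseWrapper(this);
                return this._super;
            };
        }
    }
    function constructor() {
        this._class = classObj;
        this.init.apply(this, arguments);
    }
    constructor.prototype = classObj;
    return constructor;
}

var Base = makeClass(null, function () {
    this.init = function (name) { this.name = name; };
    this.describe = function () { return 'Base ' + this.name; };
});

var Derived = makeClass(Base, function () {
    this.init = function (name, rank) {
        this.superclass().init(name);
        this.rank = rank;
    };
    this.describe = function () {
        return this.superclass().describe() + ' rank ' + this.rank;
    };
});

var d = new Derived('x', 3);
var description = d.describe();
```

A new bound function is still created on every superclass().method access, so the cost characteristics of the original design carry over to this variant.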
The good news is that our makeClass() function now manages to contort the language to provide an aesthetic and uniform solution to calling base class functions. Since it demands nothing of the base class, it can even be used with base classes that were not created with makeClass(). The bad news is that we have done so at the expense of performance: all of those closures and getters and shadow instances have a cost. I believe that the particular design chosen has the minimum cost attainable for this kind of functionality: very little data is associated with the instance, bound methods need not persist beyond their use.
On the other hand, this design does require that we instantiate a bound method for every superclass method invocation. Also, since Javascript memory management is done with garbage collection (at least in the more modern interpreters), this business of releasing them as soon as we're done with them is likely to increase the amount of work needed to reclaim free space.
I would consider caching the created bound methods (or at least providing the option to do so) except that the "getter" mechanism which makes these method invocations so transparent does not seem to be overridable: once an attribute is declared a "getter", you cannot set it back to a value. It would be possible to assign the bound method to a shadow attribute, but we would still need to call the getter every time, and we risk accumulation of a large number of bound method objects as they would be stored one per instance - even if only invoked once, which is the case for the most common use in a constructor.
The redeeming virtue of this approach is that it imposes very little overhead if you don't actually use it: there is a small startup cost of constructing the SuperWrapper for each class, but an instance wrapper is not created until someone asks for it by calling superclass().
For performance, the best approach is that of reassigning the base class methods
prior to overriding them as described earlier. For maintainability and
ease-of-use, the superclass() method is at least marginally
preferable. Nothing about the system's current design precludes the
reassignment approach, and there is nothing to stop you from mixing both
approaches.
Putting it all Together
A major goal of all of the techniques that I've described is to promote modularity, allowing the construction of much larger systems from components deployed on the enterprise or global scale. In keeping with this ideal, and as the reader might suspect, I have packaged the tools that we have examined into a file and a corresponding namespace. The file is distributed under the BSD license: please feel free to incorporate it into your own works.
A small usage example follows:
    // import the good stuff
    ispect.importNames(ispect, ['makeClass', 'importNames']);
    // create a class
    var Foo = makeClass(null, function () {
        this.init = function (name) { this.name = name; }
    });
There are some minor differences between this module and the final forms of
the examples shown: the "ispect" namespace function accepts an argument - this
is used to pass in the global "this" variable so that names can be imported into
the global context. Also, the names of the special variables (preceded here
with an underscore) are preceded with "_ispect_" instead to further decrease
the likelihood of a name clash with user variables.
Conclusion
Javascript has some warts that make it a less than ideal object-oriented programming language. We have covered some techniques for using Javascript's own simplicity to alleviate some of these weaknesses, culminating in the description of a complete module which attempts to provide a fairly comprehensive solution to these problems.
The reader is encouraged to explore and elaborate on these techniques, as I
have tried to elaborate on the techniques of those cited in the References. And by all means, feel free to use, modify
and redistribute the ispect package, if you find it suitable.
References
Douglas Crockford's JavaScript Page - lots of very illuminating ideas on the language.
KevLinDev - calling superclass functions.