Motivation

When the return value is not enough, you have to mutate an object passed in by reference to return “out” parameters. This requires a new object allocation per call, if not per “out” parameter, in generalized bindings of ES and JS to component models. It makes scripted functions bulkier and costlier than they should be given that you can pass N > 1 arguments “in”.

Similarly, assigning multiple variables, possibly permuting them, requires explicit temporary variables. Perl, Python and other such languages support multiple assignment. Convenience matters enough for many domains where ES/JS (AS too I am sure) are used alongside, or in competition with, such languages, that this is a small but real “competitive” issue.

See iterators and generators and date and time for specific use-cases (both happen to have fairly strong requirements for something like group return).

Sketch

Group assignment:

(x, y, z) = (3, 4, 6)
(a, b) = (b, a)         // temp-free swap
(p, q, r) = (r, p, q)   // rotate right 1
 
function fib() {
  var i = 0, j = 1
  while (true) {
    yield i
    (i, j) = (j, i + j)
  }
}

It is an error for group size to differ between left- and right-hand sides.

Group return:

function arraylike_iterator_next(arraylike) : (uint, *) {
  var index = this.index
  if (index >= arraylike.length)
    throw StopIteration
  this.index = index + 1
  return (index, arraylike[index])
}
 
class TripleIterator(S, P, O) {
  function next() : (S, P, O) {
    if (... /* iterator exhausted */)
      throw StopIteration
    var t : TripleRep = ... /* next triple (TripleRep is internal) */
    return (t.subject, t.predicate, t.object)
  }
  ...
}

This sketch proposes obvious syntax for group-return type annotation. Group size is fixed by the function’s type signature. As with assignment, the signature and all return statements must use the same group size.

A call to a group-returning function not occupying the entire right-hand side of group-assignment evaluates to the first value in the returned group.

Issues

  • The intent here is that groups are not lists or tuples, they are special forms of assignment and return. So no new type is required, only extra annotation syntax for function return type. Is this no-new-type assertion bogus? We don’t have types for control structures, so it’s plausible, but function return value is typed.
  • Incompatibly changes Edition 3 syntax: disallow comma-operator expression in return operand position.
  • Compatibly extends assignment to allow parenthesized comma expression on left-hand side, but what about the right-hand side? Need it be parenthesized?
  • Examples above show parenthesized group return, but again if we are incompatibly changing the grammar for the return statement, why require parens?
  • Reasonable expectation based on other languages may want a way to assign from an array to a group, and possibly vice versa.

Brendan Eich 2006/02/28 23:49

It’s somewhat awkward to say that it “doesn’t add a new type”, because functions can occur in non-assignment expressions, and typically we want every expression to have a type.

One option (the common lisp and lua option) is to make only the first return value be “the” type of a function call, and require multiple-value-bind (or your syntactic form of it) if you want to access any other stack-slot values.

The other option – what I think you need if you want to surprise the least number of pythonists – is adding tuple types for real.

I’m not really sure I like the proposal, neither stack slots nor tuple types. Personally, I’m OK with passing a dict and having the callee fill it in. I’ve never much liked language features for multiple value returns on the left hand side. But I tend to just use named “out” parameters in all my code anyways. I’m an odd duck that way.

graydon 2006/03/02 16:51

Thanks for the comments. Did I neglect to say how a call that group-returns is evaluated other than on the right-hand side of assignment? I was thinking of the Lua way, as you might have surmised.

Talking to Sjoerd Visscher in the Mozilla bug for generators and iterators, I’m persuaded that adding a List type with (a, b, c,) syntax is not a good idea, because it would be almost a twin of Array. Costs outweigh the benefit. But that conversation was in the context of making the in operator and for/in loop more Pythonic. The group-assignment/return idea did not come up.

Adding up all of

  • group-assignment/return as proposed here
  • in being anomalous for arrays because it tests/iterates keys and not values
  • the CL/Lua approach to group-return evaluation outside of group-assignment considered lame

may shift the trade-off to favor adding a list or tuple type. Anyone else have an opinion?

Brendan Eich 2006/03/02 17:11

Sorry, I did not mean to imply the CL approach to multiple value binding is lame; I don’t really see it or tuples as any better than one another. I only meant to say that I don’t often feel the need for them myself, when stuck in a language which lacks them. Argument from minimalism, not convenience.

If you are definitely arguing in favour of “some” method for multiple-value return, I think the CL technique is possibly the lighter-weight one. It doesn’t add a typing judgment throughout the language; it adds a specific rule for assignment expressions / statements. So it’s arguably lower-impact. I don’t mind it.

I must confess to not fully understanding the big problem with arrays, in that bug. Operator in tests keys; ok, so you have a different operator-in than Python. Big deal. It’s a backward-compatibility wart. You might add a prototype Array.prototype.contains() or something, which does the Python operator-in. Is the issue deeper than that?

Your iteration over keys is, fwiw, totally Pythonic:

$ python
...
>>> x={1:"foo", 2:"bar"}
>>> for i in x:
...     print i
... 
1
2

graydon 2006/03/02 18:48

Yes, objects in ECMA-262 are like dicts in Python, with the same initialiser syntax and for-in iteration over keys instead of values. This was a case of “convergent evolution”, not planning ahead.

The problem is that ECMA has no sequences or lists, but its array initializers look like Python’s list initialisers. Yet for-in iterates over keys, not values, for Python lists. So we can’t please the hard-core Pythonistas, not in all ways at any rate.

ECMA may be more consistent-and-minimal, but without a powerful macro system, minimalism is often opposed to usability or convenience. The case that looks less convenient in ECMA is something like:

var pentatones = [261.6 * Math.pow(2, i/12) for (i in [0, 2, 4, 7, 9, 12])]

or the Edition 3 form:

var pentatones = []
for (var i in [0, 2, 4, 7, 9, 12])
  pentatones.push(261.6 * Math.pow(2, i/12))

It seems surprising even to me, and I’m a JS-ista, not a Python-ista, for i to iterate over 0, 1, 2, 3, 4, and 5. And I’m the guy who made arrays be objects, way back when, in the name of ultra- (rushed-) minimalism. Others have made the same complaint, whether or not they know Python. They expect for-in, at least for arrays, to iterate over values, not keys.

Adding Array.prototype.indexOf (probably we should use the same Java-inspired name as is used for the corresponding String method that finds a value and returns its index, instead of contains; SpiderMonkey already supports both indexOf and lastIndexOf as generic Array methods) does not help for-in loops. Requiring something like the E4X for each (v in a) loop variant, which iterates v over values in a, is

  • verbose,
  • confusingly named (why does each connote values instead of keys?), and
  • not Pythonic.

Of course, adding lists or tuples to ES won’t help the Python users who expect, in an initialiser or literal form, [] to bracket the values over which the loop iterates. We can’t change either array initialiser syntax or array iteration, so anyone crossing between the two languages will have to switch initialiser syntaxes.

Given the Pythonic objects and un-Pythonic arrays with which we are stuck, do the benefits of:

  • key iteration for a list type,
  • group assignment, a special case of a list initialiser on the left-hand side of assignment, and
  • generalized list expressions that evaluate to themselves in other expression contexts including return’s operand,

justify the costs of adding lists? This is a reordered version of my earlier list, which called the CL/Lua evaluation of group expressions “lame”. Unlike the earlier list, this one favors evaluating groups (outside of the left-hand side of assignment) to list values. It appears to me to require one fewer special case: a list evaluates to itself just as an array does – a list (unlike an array or any such thing) does not evaluate to its first item. But I agree that the CL/Lua way is lighter-weight.

As Sjoerd argued, users would want lists and arrays to convert to one another freely. That adds some complexity. Yet group return would just be list return, whether of an initialiser or a computed list value.

Group assignment would need slightly special syntax for backward compatibility – no unparenthesized list on the left of = (that would be a comma expression with an assignment as one of the terms) – but would not need the trailing comma that list initialisers would need:

var list = (a, b, c,)
(x, y, z) = (1, 2, 3,)

Yet more complexity for users, if not for the spec in terms of number of special cases. And it’s not particularly pretty.

So I still agree with Sjoerd that having both arrays and lists is one two many of the same thing. Hence the desire to specialize assignment and return subgrammars. That still leaves for-in iterating over array keys, not values, but that’s an issue for iterators and generators.

Brendan Eich 2006/03/02 20:46

For what it’s worth, ECMAScript in Opera already has destructuring array assignment:

   [a,,[c,d]] = expr

This assigns to a, c, and d, pattern matching on the structure of the value. Excess elements are ignored. I was planning to extend this to objects too but I never bothered – I just wanted simple multiple return values.

Granted it’s not free to create an array, but this comes down to the implementation a little bit too. In Opera, the expression [a,b] requires just one allocation.

Lars T Hansen 2006/03/03 09:45

Neat, more minimal than this proposal. But one is greater than zero, and in particular, anything like a Date.nanotime() method should not allocate an object. Perhaps there is a way to optimize away the object, but a special form would be cheaper.

JS was over-minimized due to rushing in the old days. As I’ve argued above and elsewhere, Edition 4 will take time to reach the market, then years to supplant Edition 3. But the language will be around even longer, so it’s worth getting things right, worrying about convenience, etc.

Given browser uptake and cost of standardization, I do not expect TG1 to continue with Edition 5 next year, 6 in 2008, etc. We would rather make Edition 4 sufficient as the kernel language for meta-programming and extension, so that users and the standard library contenders can continue the evolution in the market.

Thus ends my anti-minimalism sermon. It doesn’t bear on the destructuring array assignment extension in Opera, except to the extent that we agree even one object allocation for group return is one too many. Let me know if any of this is controversial.

Brendan Eich 2006/03/03 10:07

I can’t say that I agree that one object allocation for a group return is one too many, and like you I don’t think that ought to be controversial :-) I get your point about excessive minimalism, but it’s also possible to add features that do not pay for themselves in the long run. We should look into whether the array syntax can be used for this and whether a decent implementation can optimize it away.

(There is actually a standard trick here in which a continuation that receives multiple values is marked specially so that an invocation that returns multiple values will not cons up an object, but instead return the values through a faster protocol. With proper tail recursion this works very well, but even without it will cover many cases in practice.)

Lars T Hansen 2006/03/03 11:19

The case against even one object allocation is a Date.nanotime() returning a high-resolution time representation. I admit it’s not an air-tight case – if your timestamping use-case is perturbed by array allocation, get a real profiler ;-).

I like what you did, it’s a smaller change as well as an implemented precedent.

Would it be over-specifying to require the optimization you mention, as we are specifying proper tail call optimization too?

Updated: Lars, what was your plan for object initializers as lvalues?

Brendan Eich 2006/03/04 18:19

I think it would be wrong to require the optimization. An implementation that has fast allocation and good garbage collection would not derive any significant benefit from it, and for all you know it may preclude other optimizations, so that it in some cases the optimization would be a net loss.

In general I think that a language should be designed with the implementation technology in mind but that the language spec should concern itself only with semantic matters, not prescribe optimizations. However, the spec could point to implementation precedent or literature (in this case Ashley and Dybvig, http://citeseer.ist.psu.edu/27481.html). Going out on a limb here a bit, it may in fact be useful for the spec to include an appendix that contains implementation advice for many of the features in the language.

(Incidentally, Scheme people do not think of proper tail recursion as optimization but as the semantics of space usage. Implementations that do not support proper tail recursion are considered “not Scheme”, not merely buggy or low-quality.)

Anyhow, destructuring assignment is a powerful facility in its own right, and should be allowed on object types too. One desirable syntax is this:

    {fisk:a, fnys:b} = <expr>

which means exactly this:

    (tmp = <expr>,
     a = tmp.fisk,
     b = tmp.fnys,
     tmp)

for some fresh temporary. The syntax is desirable because it mirrors object construction (and is exactly analogous to how the destructuring array assignment works). It can be extended in an obvious way to nested objects and arrays:

   { fisk: { foo: a }, fnys: [c,,d] } = <expr>

Static typing of this should be straightforward.

However the syntax is ambiguous since object literals cannot appear at the beginning of statements. The parser would have to use several tokens of lookahead, and if anything but simple variables is allowed on the right hand side of a : then unbounded lookahead is required, as above, where you don’t know if you’re parsing an object literal or a statement until you see the second :.

Modulo hacking the parser quite a lot, there are a couple of fixes for the problem: use a dedicated keyword (like match) or use a syntax that does not mirror object construction (eg, require that all fields names are quoted).

   match {fisk:a, fnys:b} = <expr>;

(I wouldn’t want to use let or var because they take us away from the multi-variable assignment form that this was intended as.)

Once we have a distinguished keyword we can overload it for case-based pattern matching:

   match (<expr>) 
   {
      case {fisk:a, fnys:b} : <stmt> ...
      ...
   }

which matches the first case that can be destructured. This will be especially useful if the a, b can be constants as well as variables. Multi-case pattern matching should be well understood now, so an ambitious implementation should be able to optimize it.

This is taking us far away from multiple return values, though.

Lars T Hansen 2006/03/06 01:07

I suppose one could always use either different brackets or a prefix:

   {{ fisk: a, fnys: b }} = <expr>
   [[ a,,c ]] = <expr>
 
   &{ fisk: a, fnys: b } = <expr>
   &[a,,c] = <expr>

Sort of Perl-ish... not very pretty, even if it solves the problem.

Lars T Hansen 2006/03/06 01:59

I would expect

    {fisk:a, fnys:b} = <expr>

to assign the variables fisk and fnys based on the values of <expr>.a and <expr>.b respectively, rather than as you describe above. This would better mirror the syntax of object initalization, in which the slot precedes the :, and the initialization expression follows it.

Mike Shaver 2006/03/08 14:49

Lars: right, I used the evil phrase “tail call optimization”, but I’m familiar with Scheme. Guy Steele was loaned from other duties at Sun to help us codify ECMA-262 Edition 1 – he was a big help, and even brought along Richard Gabriel for company, but for some reason we never did proper tail calls.

I wrote the proposal narrowly, but that trick seldom works. The parenthesized syntax wants to grow into tuples or lists. I’m now firmly in favor of at least destructuring array assignment as you’ve implemented it. The object assignment you described works as I expected, but it is odd-looking to shaver, and he will not be alone, I predict.

Shaver: the deal in both array and object cases is that on the left-hand side of assignment, the initialiser property value is an identifier to bind to the value of its key bound to the corresponding object on the right-hand side.

Lars: quoted property identifiers in object initialisers are allowed, so I don’t see how that helps disambiguate. Although you have taken us somewhat afield from group assignment/return, I like where you’ve gone and would welcome a separate proposal on destructuring assignment and match.

Brendan Eich 2006/03/08 19:02

Quoting names would help because it would limit the amount of lookahead the parser would have to do to figure out what it’s looking at. The trio of tokens <left-curly-brace> “foo” <colon> cannot be the start of a statement, so if the parser sees <left-curly-brace> in a statement context it would only have to look ahead two more tokens to know whether it’s looking at a tuple destructuring.

Mike: I agree with you that the syntax (ordering of name and value around the colon) is odd-looking and feels weird, I don’t actually care for it myself. I just figure that it’s a benefit that it matches construction syntax, and my guess is that one would become used to it. And for matchings with more structure it feels a lot better, I think.

I’ll put together a simple proposal on destructuring and link it from the proposals page.

Lars T Hansen 2006/03/09 01:04

Ah, bounded (n > 1) lookahead. Let’s not do that. I love your proposal, and I hereby retract this one in favor of it.

Brendan Eich 2006/03/10 11:01

 
proposals/group_assignment.txt · Last modified: 2006/03/10 19:21 by brendan
 
Recent changes RSS feed Creative Commons License Donate Powered by PHP Valid XHTML 1.0 Valid CSS Driven by DokuWiki