Review Board

1.0

ETCollection higher-order messaging for merging into stable

Updated 5 months, 1 week ago

Niels Grewe Reviewers
EtoileCore
None Étoilé Stable
Part of the higher-order messaging functionality has already been reviewed (http://review.etoileos.com/r/102/) This changeset includes the base functionality and further improvements:
* Many, many patches by Quentin (thanks a lot!)
* The ability to filter collections based on the attributes of their contents
* ETEachProxy to iterate over contents of collections used as arguments to a message (This depends on a receiver that knows how to handle this, currently only map and filter proxies do)
* Unit tests for the stuff that uses blocks.
Extensive unit tests included.
Posted 9 months, 3 weeks ago (November 16th, 2009, 9:15 a.m.)
Looks good and the new features are just great! I added various small comments. The main one is about to unify the naming between 'inject' and 'fold' API.

Is there a particular reason why you wrote the new 'recursive' functions that work in tandem with the each proxy in a recursive style? I don't suggest to rewrite them, but I was just curious to know because I would have tried to write them in a non-recursive style to make them easier to understand. I haven't thought really hard whether that's possible or not.

The spacing is a little bit inconsistent in some places, it doesn't really matter though. For future code, I suggest to write function calls and macros with arguments like that: f(a, b, c);
and methods as below:
- (void) blabla: (id)x bla: (id)z and [self blabla: y bla: x];
More examples are visible in the coding style page.

The test suite runs fine for me on both GNUstep and Mac OS X.
  1. Thanks very much!
    I'm not savvy enough with algorithms to know whether there is an iterative equivalent for the recursive functions (They spawn loops and will call themselves not only once but multiple times, so I don't think I can turn this into some simple tail recursion...)
    As for the indentation of method declarations/definitions, we probably have a philosophical disagreement here: "- (void)blabla" seems more semantic to me (I'm siding with David on that issue: methods have an implicit return value of type id, which is then cast to the appropriate type). We probably have to decide for one way or the other eventually (and update the coding style-guidelines, if nessecary). For now, I'd like to know whether you would be okay if the "- (void)blabla" style were used consistently. (Your other comments about indentation are all valid and sound, btw.)
    Leaving that aside, I incorporated your suggestions and committed an updated version to trunk. I'll try to update the diff here as well, but subversion is giving me a hard time again...
  2. I ran the test suite on both GNUstep and Mac OS X with the committed diff and everything is still ok :-)
    
    I just have a small comment, why not get rid of -_injectObject:intoBlock:shallInvert: private method and just use ETHOMFoldFunction in both -leftFold and -rightFold?
    
    For the indentation discussion, the cast seems like an implementation detail and I find it more readable with the space (the return value appears as a distinct element exactly as the arguments do). If I had to explain it more rationally I would say… If the method return value was named, the method could be written like that:
    - (void)result computeWithInitialValue: (id)aValue;
    We could say the return value is anonymous (like a variable without a name), then to have a space between the return type and the method name makes sense because it applies to this implicit return variable and not the first method keyword.
    I think these indentation issues are is quite subjective and I'm not convinced we should always change the coding style even when we found good reasons to do so. Because such pretty good reasons will probably be found in the future… but are they really worth a change given our relatively big code base and benefits which are not so obvious?
    
    This puts aside, I agree we need to make some small adjustments to the coding style though and may be make it a bit less strict in some special cases (to accommodate taste differences). Perhaps a good example would be NSString* vs NSString *. I think the first is a bit better in a semantic perspective but I use the second because its readability is much better. imo the best is to have consistency within each module and ensure every module respects the coding style decently. And people who contributes should adapt  themselves to the things the module doesn't do exactly as suggested by the guidelines.
    
    If the HOM code uses - (void)blabla consistently, I think it's fine. It's a minor thing :-)
  3. Indentation makes great discussions, but I'll leave those for another time ;-)
    I must admit that keeping -_injectObject:intoBlock:shallInvert: was sheer laziness - I removed it now...
The note doesn't sound really clear to me. May be add an example to illustrate the point.
A small example would be nice.
A small example would be nice.
A small example would be nice.
A small example would be nice.
For consistency purpose, I would suggest to unify the naming of all the folding methods. e.g. Rename -injectXXX: methods to:
-leftFoldWithInitialValue:intoBlock:
-rightFoldWithInitialValue:intoBlock:

The alternative is to rename the 'fold' HOM methods with 'inject'.

Then -injectXXX methods could be removed and -inject:into: added to a Smalltalk-specific framework. -inject:into: would just call -leftFoldWithInitialValue:intoBlock: or its associated HOM function.
  1. I usually used Smalltalk-derived method names whenever it made sense (-zipWithCollection: (HOM-based) and -collectBlock:withCollection: (block-based) is another case of this). But I clearly see that this might be undesirable and since -(left|right)FoldWithInitialValue:intoBlock: sounds nice to me, I'll opt for that naming scheme.
  2. ok
ETCollectionObject may be.
ETCollectionObject may be
From what I understand the first two if of the method are only executed the first time the method gets called. So why not move them in -initWithOriginal:? If that's not possible I would move this if block into the previous one and keep it separated from the rest with a blank line like that:

if (nil == contents)
{
  contents = 
  if (nil == contentsEnum)
  {
    contentsEnum = 
  }
}

id object =
  1. Actually, only the "if (nil == contents)" statement can be moved into the initializer. The enumerator needs to be recreated repeatedly, since we enumerate the contents of the collections once for every element of a receiver-collection and a new enumerator is needed for each element. (Now that I'm thinking about it, this can probably be optimized by using a counter ivar and caching the IMP of NSArray's -objectAtIndex:)
  2. I missed this point about the second if statement. It's clearer with the first if statement put in the initializer.
You can replace the two lines with DESTROY(contentEnum);
ETCollectionObject may be
I'd add a comment like:
/* No method arguments (only self and _cmd as invisible arguments) */

and/or use a boolean variable like that:

BOOL isUnaryMessage = (argCount < 3);

if (isUnaryMessage)
{
  return;
}

To also put a blank before and after this if statement would be better.
; i++
Move this comment before the 'if', otherwise we get the impression it applies to the code that follows.
Selector: @selector(
I have never commented on that since it isn't really critical but I suggest to use spaces around all binary operators (except '.' and '->') as explained in the C operators section of the coding style page.
The reasoning behind that is to make obvious that unary operators (and also '.' '->') have higher precedence than binary operators without putting parenthesis around them. 
For unary operators, putting parenthesis remains necessary when the precedence order between them isn't clear.
= (slotField.fields[index] | 1 << i%8);

and I think I would rather write:
= (slotField.fields[index] | (1 << (i % 8)));
Am I missing something or is the same than:
slotField.fields[0] = 1;
  1. It's not the same. Let fields[0] be 00000100 (i.e. there is an each-proxy as the first argument). Then "fields[0] = 1" will make it 00000001, but "fields[0] = (fields[0] | 1)" will produce 00000101, which is what I want here.
  2. ok, makes sense.
Wouldn't a comment as the one below be clearer?
// the slots to be set as the invocation arguments

otherwise we get the impression 'slots' data structure itself has to be filled.
I would turn the next two loop into two inline functions to obtained something like that:

findFirstSlotIDWithEachProxy(slotID);
NSUInteger nextSlotID = slotID + 1;
findFirstSlotIDWithEachProxy(nextSlotID);

nextSlotID to be consistent
Shouldn't this variable rather be named 'eachProxy'?
A constant such as LastSlotID should be defined for 127.
while (
Remove the trailing semicolon.
And I would rather say:

// Set the present argument

I'm not really sure what a 'slot' means precisely. Is a slot exactly the same thing than an argument or is it the value/object identified by a slot id marked with '1'? That's why I suggest this change.
Similarly to my previous comment, I suggest something as below. Although you might have a better way to express all that...

/* If there are no more arguments to be set, the invocation is ready to be invoked, otherwise more slots remains to be set as arguments. */
You could make a bit more explicit with:
BOOL isFistTime = (0 == count);

if ((NULL != ctx->elementHandler) && isFirstTime)
{
}
else
{
  if (ctxt->modifiesSelf && isFirstTime)
}
Wrap on two or three lines with a tab at the beginning or align the arguments vertically in a better way.
(ctxt->modifiesSelf && (0 == count))
'when we run for the next target' doesn't sound really clear to me. May be something like:
when we invoke the invocation on the next target provided by 'levelProxy'
Same as previously
Should be named nextSlotID I think
If this is an ETEachProxy instance, I'd rename it eachProxy otherwise we get the feeling there is another kind of proxy involved.
while (
Same as previously
result = (result || (BOOL)filterResult);
result = (result || recursiveFilterWithInvocation(inv, slots, nextSlot));
ETCollectionObject
ETMutableCollectionObject
&& NO == isArrayTarget // to be consistent within the function
if (useBlock)
Add a small comment or may be a function:

MarkSlotsAsEmpty(eachedSlots);
or
NewSlots(); that returns an initialized slotField_t structure
  1. I'm opting for a comment saying that the first byte is zeroed out to indicate that the field has not yet been filled. MarkSlotsAsEmpty(); and NewSlots(); would, to me, indicate that the whole field had been zeroed out (and someone might use it for bitwise operations and get undefined results). 
You could do:
BOOL usesEachProxy = (eachedSlots.fields[0] & 1);

if (usesEachProxy)
{
}
else
{
}

Would be nice to have one less statement nesting too, but I don't have any suggestion to eliminate a level.
  1. I removed one level of nesting by replacing
    
    if ([object respondsToSelector: sel])
    {
        complex; nested; statements;
    }
    
    with
    
    if (NO == [object respondsToSelector: sel])
    {
       continue; //Skips the rest of the loop-body.
    }
    less; nested; statements;
The alignment of the arguments is weird since they don't represent pairs or something like that. I suggest to just wrap on several lines.
if (useBlock)
FOREACHE(content, element, id, contentEnumerator)
if (NO == shallInvert) // to be consistent with the rest of the function
Is this casting really needed?
See previously in ETHOMMapCollectionWithBlockOrInvocationToTargetAsArray
See previously in ETHOMMapCollectionWithBlockOrInvocationToTargetAsArray
Indent by wrapping with a tab at the beginning.
if (useBlock)
contentsFirst, firstObject
Missing tab
ETCollectionObject?
ETMutableCollectionObject?
ETCollectionObject?
id<ETCollectionObject>?
Merge with previous line.
Missing spaces after each comma.
Merge with previous line.
Missing spaces
Better to autorelease on the line where the instance is created
ETCollectionObject?
That's really minor... It's better to use a more descriptive variable name such as 'should/shallInvert' than 'aFlag'

Apple does it a lot, so I took the habit to do it in my own code, but I don't think it's a good idea in the end. It doesn't matter that much when the methods are small, but when the method size grows a bit, it's better to get rid of it.
ETCollectionObject?
ETCollectionObject?
ETCollectionObject?
id<ETMutableCollectionObject>?
id<ETMutableCollectionObject>?
id<ETCollectionObject>?
ETCollectionObject?
ETMutableCollectionObject?
ETMutableCollectionObject?
Missing space
for (int i = 0; i < [xxx]; i++) 
Use ASSIGN
Use ASSIGN

     

  
  1. ?
  2. Oops, looks like my comment was lost.
    There are memory leaks here because -copyWithZone: returns a retained instance and then the setters retain these copies.
    
    So either you need to autorelease/release the copies like that:
    [copy setString: AUTORELEASE([[self string] copyWithZone: aZone])];
    or a raw copy which is better I think, especially when the setters depend on the new object state (partially initialized at the time the setters are called):
    copy->stringAttribute = [stringAttribute copyWithZone: aZone];
    
  3. Right, -copy makes you take ownership of the created object. I should really remember that. I fixed this and another similar mistake in ETCollection+HOM.m.
Insert blank line.
Should rather be:
UKObjectsEqual(@"letters: foobar",[[inputArray leftFold]
<tab>stringByAppendingString:
UKTrue([folded isEqual: @"foobar"] || [folded isEqual: @"barfoo"]);
UKTrue([folded isEqual: @"foofoobar"] || [folded isEqual: @"barfoofoo"] || [folded isEqual: @"foobarfoo"]);

or

UKTrue([S(@"foofoobar", @"barfoofoo", @"foobarfoo") containsObject: folded]);
This line is really long. It is not ideal, but you could move and indent stringByAppendingString: on the next line.
This line is really long. You could do:
NSMutableDictionary *first = [NSMutableDictionary dictionaryWithObjectsAndKeys: 
	@"foo",@"one",@"FOO",@"two",@"Foo",@"three",nil];
Wrap on several lines with a tab at the beginning.
Missing blank line
I would use a much simpler wrapping style:
NSMutableArray *expected = [NSMutableArray arrayWithObjects: @"fooFooFOO",
	@"fooFooBAR", @"fooBarFOO", @"fooBarBAR", @"barFooFOO", @"barFooBAR", 
	@"barBarFOO", @"barBarBAR", nil];
May be add a blank line in other test methods as you have done in this one to separate test assertions a bit.
The note would be better placed at the beginning of the file.
What do you mean by 'remaining collection classes'? May be rewrite the last sentence to be clearer.
Review request changed
Updated 9 months, 3 weeks ago (November 19th, 2009, 12:52 a.m.)
Update to reflect the results of Quentin's review.
Ship it!
Posted 5 months, 2 weeks ago (March 24th, 2010, 8:51 a.m.)
Here is the review of the last diff. Everything looks ok to me.

There is still the -[NSDictionary identifierAtIndex:] issue, but I haven't figured out how we should handle that cleanly. From an EtoileUI perspective, the problem is roughly… How do we generate a key when an object is put in a dictionary by drag and drop? Should we reuse keys when objects are moved across dictionaries? May be only if the two dictionaries serve the same purpose? This last point probably only makes sense when we have two model objects which uses a dictionary as their storage. In this case, we can check their type/class to know their purpose.

Another thing I was wondering about… Is it ok to use -mappedCollection in the same way than -do in Smalltalk. For example:
[[views mappedCollection] display];
iirc, there are cases where this doesn't work. I don't recall exactly which ones. May be when the argument message return type is a BOOL or void.
This use case is very common and -mappedCollection doesn't express the intent very well, a more explicit -do message could be nice. What do you think?
int index = 1 / 8; without parenthesis would be fine since it's a simple arithmetic expression.
to be replaced <-- missing 'd'
When you mention a method, you can write -methodName. By adding a minus or plus sign at the beginning of the method name, autogsdoc automatically generates a link when the method belongs to the same class than the documentation.
Cross-class references can be done with -[OtherClass methodName] iirc.
Insert a newline here to have the note in a separate paragraph.

Also may be it is worth to explicitly state that:
[anObject addThing: [things each]] won't do anything. I tried to use this pattern several times since it feels natural :-)
I suggest a newline here too
I suggest a newline here too
I suggest a newline here too
'as its first argument' to know the thing to which 'first' applies to?
Not really sure about that since English is not my native tongue.
Same remark… 'as its second argument'
An example would be nice
May be it's overkill, but -mapInfo could be declared in an informal protocol