Class Variables and Categories

The Google Mac blog just posted a short piece on mixing Objective-C categories with class initialization. The gist is that if you implement +initialize in a category, you replace the class's own version and have no way to call the original, and if several categories implement it, there's no guarantee which one will be called.

The post also discusses the possibility of using the +load method, which is similar to +initialize, but less widely used. Here's the conclusion the author comes to:

Luckily, if we break with tradition, we can use the "constructor" attribute. Yes, the syntax is ugly, but it does potentially solve a lot of our problems.

Constructors (which is a horrible name that must have been intentionally designed to cause confusion with C++/Java constructors) are guaranteed to be called after +load but before main.


Here's what the code from the post looks like:

void __attribute__ ((constructor)) InitializeFooBar(void) {
  static BOOL wasInitialized = NO;
  if (!wasInitialized) {
    // safety in case we get called twice.
    [Foo initializeBar];
    wasInitialized = YES;
  }
}
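
To make the "before main" guarantee concrete, here is a minimal sketch in plain C (which also compiles in a .m file). It assumes GCC or Clang, and every gm_ name is hypothetical: gm_initialize_bar stands in for whatever [Foo initializeBar] would do, and the value 42 is an arbitrary placeholder.

```c
#include <stdbool.h>

/* Hypothetical class-level state that the initializer sets up. */
static int gm_bar_value = 0;
static int gm_init_calls = 0;

/* Hypothetical stand-in for [Foo initializeBar]. */
static void gm_initialize_bar(void) {
    gm_bar_value = 42;   /* arbitrary placeholder value */
    gm_init_calls++;
}

/* With GCC or Clang, this runs automatically before main(). */
__attribute__((constructor))
void gm_initialize_foo_bar(void) {
    static bool was_initialized = false;
    if (!was_initialized) {   /* guard in case someone also calls it by hand */
        gm_initialize_bar();
        was_initialized = true;
    }
}

/* Accessors so callers can observe the result. */
int gm_bar(void) { return gm_bar_value; }
int gm_bar_init_calls(void) { return gm_init_calls; }
```

By the time main() starts, gm_bar() already returns the initialized value, and calling gm_initialize_foo_bar() again by hand does nothing because of the guard.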


Let's step back for a moment.

The post doesn't describe exactly what sort of things they need to do inside of their class-initializing category methods, but there is a brief reference to "initializing class variables." If that's the case, let's try something more tame.

Let's assume we want to add a class variable called "GMSuperSecretString" to NSString, which is used by a method called "stringByGooglifying".

NSString * makeMyString(void) {
  return @"Sunnyvale";
}

@interface NSString (GoogleMac)
+(NSString *)GMSuperSecretString;
-(NSString *)stringByGooglifying;
@end

@implementation NSString (GoogleMac)

+(NSString *)GMSuperSecretString
{
  static NSString * string = nil;
  if ( !string ) {
    // do initialization
    string = makeMyString();
    [string retain];
  }
  return string;
}

-(NSString *)stringByGooglifying
{
  NSString * secret = [NSString GMSuperSecretString];
  return [self stringByAppendingString:secret];
}

@end


This works. It's possible there's something more that the Google folks are looking to do here, but some variation of this should handle about 98% of the cases that will come up. If anybody thinks I'm missing something here, please speak up.
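
One thing the accessor above doesn't address is two threads hitting the nil check at the same time. As a rough sketch only, here is one way to keep the lazy accessor and still publish the value safely, written in plain C with C11 atomics rather than anything from the Google post. All gm_ names are hypothetical, and gm_make_string stands in for makeMyString().

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* The lazily created "class variable"; NULL until the first call. */
static _Atomic(char *) gm_secret = NULL;

/* Hypothetical builder, standing in for makeMyString(). */
static char *gm_make_string(void) {
    const char *src = "Sunnyvale";
    char *copy = malloc(strlen(src) + 1);
    if (copy != NULL) {
        strcpy(copy, src);
    }
    return copy;
}

const char *gm_super_secret_string(void) {
    char *s = atomic_load_explicit(&gm_secret, memory_order_acquire);
    if (s == NULL) {
        char *fresh = gm_make_string();
        char *expected = NULL;
        /* Only one thread wins the swap; losers discard their copy. */
        if (atomic_compare_exchange_strong(&gm_secret, &expected, fresh)) {
            s = fresh;
        } else {
            free(fresh);
            s = expected;   /* the value the winning thread published */
        }
    }
    return s;
}
```

The trade-off is that two threads racing on the first call may both build the string, and one copy is thrown away. That's fine for something cheap, but less attractive when the value is expensive to build.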

I'm assuming the constructor syntax works, but I've never used it. My belief is that the more rickety a syntax looks, the more likely Apple is to sweep it away or change its behavior. Class methods are a safe bet.

I'm also a bit unclear on the whole premise here because the post says that they want to use categories to initialize class variables for built-in classes, but categories can't directly add variables, so I'm not sure where those variables are coming from.

In any case, if you need to do class variables through categories cleanly, there you go.
Posted Nov 29, 2006 — 14 comments below




 

David Young — Nov 29, 06 2503

Yeah, I kinda don't understand these Google Mac blog posts in which they say that doing something absolutely crazy (and usually nasty) is a must-know for either performance and/or safety when Cocoa programming. What's up with that? I'm all for performance analysis and new GCC features, but the conclusions they arrive at baffle me.

There is evidence that they are abusing categories. Not only is it true that if you have two +initialize's in two categories, behavior is indeterminate, but the same is true of any two base or category methods on the same class whose selectors collide. Categories are not meant for overriding methods, but for extending base classes, which is why you see so many categories namespaced, e.g., _web_safeMakeObjectsPerformSelector:. I'm sure you know this, but since they've got comments turned off over there, it's not as though the topics are open for discussion, which somewhat irks me.

(n.b., based on their previous article, one thing I'm sure they want is thread safety, which is omitted from the above.)

Joachim Bengtsson — Nov 29, 06 2504

Plus, didn't someone recently write about swizzling selectors after the method's first call (and consequently initialization) so that the 'if' has no performance penalty? It might even have been here or on GMBlog, I don't remember...

Geoff Schmit — Nov 29, 06 2505

I'm new to the Objective-C world, so take this with a couple of grains of salt. I assume that one reason why categories are not allowed to add additional instance variables is that, if they could, the size of the object could not be known. However, class variables do not affect the size of the object and, therefore, theoretically at least, new class variables could be defined by categories.

Just as a warning, I've encountered a handful of issues related to execution order and execution in general when using the constructor attribute as well as C++ static constructors and destructors in a highly componentized system running on Mac OS X 10.3. dyld was dramatically improved and the vast majority of these issues were addressed in Mac OS X 10.4.

ken — Nov 29, 06 2506

Google's version is threadsafe. Less lazy initialization is one way to avoid the problems with double-checked locking.

ken — Nov 29, 06 2508

Also, regarding the purpose of the work, I'm gonna guess that we're seeing the first steps of a workaround for the fragile ivars problem. That is, I bet they want to be able to add ivars in categories, and add ivars to classes without recompiling subclasses.

There are some Apple docs for load, initialize and the constructor attribute at .

Dave — Nov 29, 06 2509

Hey all,

Maybe I can clear some things up. This definitely wasn't intended as a general-purpose technique; we only use it in a couple of places. We did feel that it was interesting, as we hadn't seen it used before, and we just wanted to give people something to think about, not to mention clarify a bit how +load works, as Apple's documentation is a little lax in that area.

In one particular case we are adding a category onto NSString that has a collection of methods for creating enumerators for word breaking strings in special ways. These methods all use some hand created NSCharacterSets that are reasonably expensive to build. We are expecting these methods to be called from multiple threads, and as we're parsing a lot of text, we want them to be fast.

Before I continue, quick digression: one could easily argue that we should have a word-breaker class that takes a NSString. I'm not going to argue that, and we may well change the design in the future. It was a case where it started as a simple category, that grew substantially, and now we've got a good pile of code that depends on it. End of digression.

So, we've got some static NSCharacterSets that we need to initialize inside of a category. We can't use +initialize in a category for the reasons explained in the article. We can't depend on +load because, according to Apple's docs and the code, we have no guarantee that NSCharacterSet is even going to be +load'ed when we are loaded. We want things to be thread safe, and want to avoid the call to @synchronized whenever possible, so we want to avoid the lazy initialization. The constructor attribute gave us exactly what we wanted in this special circumstance.

As far as it being rickety syntax, it's not something Apple is going to be taking away lightly, as it is built into gcc, unlike some pragmas that they had previously.

The example that you show is an example of lazy-initialization. I did mention that it works fine, except that it has the usual lazy-initialization caveats of being slow.
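
For readers following along, here is a minimal plain-C sketch of the eager approach Dave describes: build the expensive data once in a constructor, then read it from any thread without locks. It is not Google's code. The 256-entry table is a hypothetical stand-in for an NSCharacterSet, all gm_ names are made up, and it assumes GCC or Clang and that no threads are started before main().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lookup table standing in for an expensive NSCharacterSet. */
static bool gm_break_table[256];

/* Built once before main(), while the process is still single-threaded. */
__attribute__((constructor))
static void gm_build_break_table(void) {
    const char *breakers = " \t\n.,;:!?";
    for (const char *p = breakers; *p != '\0'; p++) {
        gm_break_table[(unsigned char)*p] = true;
    }
}

/* The table is read-only after startup, so no lock is taken here. */
bool gm_is_word_break(char c) {
    return gm_break_table[(unsigned char)c];
}

/* Counts words by looking for transitions from break to non-break characters. */
size_t gm_count_words(const char *text) {
    size_t count = 0;
    bool in_word = false;
    for (const char *p = text; *p != '\0'; p++) {
        if (gm_is_word_break(*p)) {
            in_word = false;
        } else if (!in_word) {
            in_word = true;
            count++;
        }
    }
    return count;
}
```

The hot path here is a plain array read with no nil check and no synchronization, which is the property the constructor approach is meant to buy.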

Dave — Nov 29, 06 2510

David Young:

Yes, we know about name collisions in categories. I'm not quite sure how you came to the conclusion that we are trying to override methods in categories. On the contrary, I was using it as a reason not to use +initialize.

Sorry about the comments; you can certainly give feedback in the discussion forum.

Finally, one of the main reasons we are using this technique is for thread safety. Lazy initialization can cause threading problems, whereas this should be safe (assuming that you aren't starting up threads in +load, +initialize, or constructors).

Dave — Nov 29, 06 2511

Joachim:

We did actually put a post up showing a technique of swizzling to avoid thread problems, but as I mentioned at the end of the Getting Loaded article, that's probably not a good idea. Ron Avitzur wrote up a good piece on why it is just DCLP (the double-checked locking pattern) in disguise.

David Young — Nov 29, 06 2512

Dave:

I got the idea that you were trying to use categories to override methods from this sentence:

The problem with +initialize is that it is virtually useless for categories, in that if I override a class's initialize method in my category, I can't call the original initialize. Also, if I have two categories on a class, and both have initialize methods, it is unclear which one will be called.

I apologize for the misunderstanding. But the paragraph as written suggests that you can't write two categories on the same class containing +initialize methods, which is correct, but for the wrong reason.

It does seem like thread safety is backing you guys into implementation corners. I wouldn't want to be dependent upon __attribute__ ((constructor)) as I agree with someone else here that it seems like the kind of thing that might break in future gcc releases.

Have you considered maybe moving your "class variables" into a class of their own? Perhaps the code you want to put in category methods might be better moved into its own class, which you can initialize early (e.g., ahead of the time at which threads are spawned) and thus avoid taking locks later on?

Chuck — Nov 29, 06 2513

I wouldn't want to be dependent upon __attribute__ ((constructor)) as I agree with someone else here that it seems like the kind of thing that might break in future gcc releases.
It's a documented part of GNU C. What makes people think they're likely to break it?

Scott Stevenson — Nov 29, 06 2515

It's a documented part of GNU C. What makes people think they're likely to break it?

In general, the more obscure a technique is, the less I think you should rely on it. Apple is free to make changes to their version of gcc at any time, and they're more likely to change things that are rarely used.

However, when I made that comment I was under the impression that the constructor attribute was an Objective-C language construct. After reading the page Dave MacLachlan linked to in the original post I realize it's more general.

Ironically, though, the docs say this about constructors and destructors:

These attributes are not currently implemented for Objective-C

For some reason, I feel Douglas Adams would have something to say here.

Jean-Daniel Dupas — Nov 30, 06 2516

Attributes are GCC features and not hacks. They are widely and commonly used.
These attributes are not currently implemented for Objective-C
This means that methods cannot (currently) have attributes, but this article proposes using it on a C function.
Do not forget that Obj-C is a true superset of C, and what is true for C in .c file, is true for C in .m files.

Scott Stevenson — Nov 30, 06 2520

Do not forget that Obj-C is a true superset of C, and what is true for C in .c file, is true for C in .m files

Yes, of course. It was basically an attempt at ironic humor on my part.

Jean-Daniel Dupas — Dec 01, 06 2522

Okay, sorry ;-)

I only posted this clarification because lots of people misinterpret this note and think that attributes do not work anywhere in Obj-C (rather than only on Obj-C methods).




 






Copyright © Scott Stevenson 2004-2015