This page will collect some non-random examples of LongFunctions in LispLanguage. The examples are chosen to be part of very robust and widely acknowledged as successful open-source projects. The criteria of "longness" is set to dispell the widely held (on Wiki) superstition that 5-9 lines is the ideal everybody should be striving for. Other examples are welcome, until we collect from all major languages and a wide diversity of projects to dispell once and for all that myth. Lisp is particularly interesting as it is maybe '''the''' most powerful and concise language around, yet Lisp hackers don't seem to shy away from writing LongFunctions when it makes sense. ''Have you considered the possibility that it is ''because'' Lisp is the most powerful and concise language around that its expert users (and whomever wrote defclass was most certainly no neophite) can get away with functions this long, when in weaker, more verbose languages they wouldn't?'' You know, you can show all the long functions you want, no one disagrees that long functions are completely normal. You seem to be missing the point entirely. Just because they are normal, doesn't mean they can't be improved. Short functions are easier to understand, period, that can't be argued. It's easier to understand 7 to 10 lines of code than it is to understand 50, you cannot deny that. All we are saying is that, given this fact, one should try and write smaller methods rather than larger ones because it helps developers grasp programs quicker, it makes bugs easier to see, it makes programming faster, etc.. GrandMasterProgrammer''''''s could probably write interpreters in assembly, that doesn't make it right or easy for others to grasp. Long methods may be absolutely brilliant, yet still communicate very little to the reader, small methods communicate much more clearly to the reader, and we think that is eXtremely important. Small methods enable evolutionary development, or at least ease it greatly. ''But you construct a StrawMan by doing the wrong arithmetics. You should not compare the understandability of 50 lines of code versus 7 lines of code. You should compare 50 lines of code versus 7 (procedure calls in the main function ) + 7 * 8 (or >8 lines of definition for the factored out of the current scope functions). So much for your "period, it can't be argued".'' No, I'm not talking about total size of the code. I'm talking about what you can fit in your head, chunk size. Small methods communicate better, require less mental gymnastics from the reader, and are generally in every way easier to work with. It may take hundreds or thousands of methods to build an application regardless of how big they are, so why not have them just the right size to make them easy to work with. Small bricks are more flexible than 12x12 slabs of concrete, and are easier to work with! ''And I am talking not about the understandability of every individual little piece, but understandability, and, dare I say provability on occasions, for a conceptual blocks. If I have to fully understand an algorithm and to make sure it's implementation is correct, I'd rather look at 20 functions 50 lines each than 100 functions ten lines each. I don't care how easy is every little tiny piece, if I have to keep 100 pieces in my head (even if only their contracts), I'm rather uneasy. '' And I'd rather not do that either. You shouldn't have to ever fit it all into your head if you have little pieces and they are properly unit tested. No one can fit the entire program in their head, I don't care how smart they think they are. The large scale picture must, by its very nature, be more abstract so the mind can grasp it. At that level, you aren't thinking about methods, but about objects/packages/modules. To paraphrase KentBeck, a programs like a giant wave, you can't tame it, so just learn to surf. ''Well, it's your preference to read KentBeck, my preferred authors are more like EwDijkstra, and dare I say that '''they are not very compatible'''. But I'm not aware of any instance where KentBeck has publicly displayed his treatment of non-trivial algorithms, nor am I aware of any instance where Kentbeck or any other Agile guru has ever tried even a sketch of ProofOfCorrectness. As a matter of fact, his latest books contain really toyish examples, so I wouldn't even remotely assume that his great advices are applicable everywhere, especially where they contradict Dijkstra. See also the discussion below. There are instances where "surfing" is just not good enough and you need a hell of a lot more than that. '' Dijkstra's point has always been that programs should be broken down into manageable units so that the correctness of each unit can be independently proven, thus allowing you to reason about the whole program without keeping all the state in your head. That was the whole reasoning behind StructuredProgramming. This is supported by LotsOfShortMethods, not hindered by it. I don't see how you jump from "ProofOfCorrectness" to "LongMethods are good" - a long method is harder to reason about or prove than a short one! -- JonathanTang ---- ''You shouldn't have to ever fit it all into your head if you have little pieces and they are properly unit tested.'' That's called a wave with your hand using a magic precondition. You don't know if the code has been properly unit tested. You don't know what proper unit tests are. You don't know what the code is supposed to do. You don't know if unit tests have been updated. You still must understand the unit tests which is admits to the same errors as any other code. And critically, unit tests mean only about 50% given integration and system tests, especially using heavily mocked tests. -- Anonymous A ''If I wrote the code, or my team did, I most certainly do know that. Expecting code to be tested isn't a magic precondition, it's a reasonable expectation that anything complex should have a few tests to prove it actually works. There's always the chance they don't cover every condition of course, but when you find one of those conditions, you'd be wise to write a test for it too, it'll save your ass later.'' Most of the time tests do not actually prove that it actually works, KentBeck should have taught you that much. They only prove that it works on the tested cases. And there's the extra-doubt that the test cases were actually good test cases and not wishy-washy tests for trivial conditions. Most of the code I could see in public from agile gurus or mere programmer, had wishy-washy tests, especially for concurrent behavior, if that was tested at all. -- CostinCozianu Most interesting problems happen in system tests, especially given an attitude where a few tests can show something works. You had damn well better read the code to solve those. -- AnonymousOnPurpose ----- From CLisp CLOS source (the overwhelming majority of definitions in there fail to be 5-9 lines of code, but this macro is particularly big): (defmacro defclass (name superclass-specs slot-specs &rest options) (unless (symbolp name) (error-of-type 'sys::source-program-error (TEXT "~S: class name ~S should be a symbol") 'defclass name)) (let* ((superclass-forms (progn (unless (listp superclass-specs) (error-of-type 'sys::source-program-error (TEXT "~S ~S: expecting list of superclasses instead of ~S") 'defclass name superclass-specs)) (mapcar #'(lambda (superclass) (unless (symbolp superclass) (error-of-type 'sys::source-program-error (TEXT "~S ~S: superclass name ~S should be a symbol") 'defclass name superclass)) `(FIND-CLASS ',superclass)) superclass-specs))) (accessor-def-forms '()) (slot-forms (let ((slot-names '())) (unless (listp slot-specs) (error-of-type 'sys::source-program-error (TEXT "~S ~S: expecting list of slot specifications instead of ~S") 'defclass name slot-specs)) (mapcar #'(lambda (slot-spec) (let ((slot-name slot-spec) (slot-options '())) (when (consp slot-spec) (setq slot-name (car slot-spec) slot-options (cdr slot-spec))) (unless (symbolp slot-name) (error-of-type 'sys::source-program-error (TEXT "~S ~S: slot name ~S should be a symbol") 'defclass name slot-name)) (if (memq slot-name slot-names) (error-of-type 'sys::source-program-error (TEXT "~S ~S: There may be only one direct slot with the name ~S.") 'defclass name slot-name) (push slot-name slot-names)) (let ((accessors '()) (readers '()) (writers '()) (allocation '()) (initargs '()) (initform nil) (initer nil) (types '()) (documentation nil)) (when (oddp (length slot-options)) (error-of-type 'sys::source-program-error (TEXT "~S ~S: slot options for slot ~S must come in pairs") 'defclass name slot-name)) (do ((optionsr slot-options (cddr optionsr))) ((atom optionsr)) (let ((optionkey (first optionsr)) (argument (second optionsr))) (case optionkey ((:READER :WRITER) (unless (function-name-p argument) (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option for slot ~S: ~S is not a function name") 'defclass name slot-name argument)) (case optionkey (:READER (push argument readers)) (:WRITER (push argument writers)))) (:ACCESSOR (unless (symbolp argument) (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option for slot ~S: ~S is not a symbol") 'defclass name slot-name argument)) (push argument accessors) (push argument readers) (push `(SETF ,argument) writers)) (:ALLOCATION (when allocation (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option ~S for slot ~S may only be given once") 'defclass name ':allocation slot-name)) (case argument ((:INSTANCE :CLASS) (setq allocation argument)) (t (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option for slot ~S must have the value ~S or ~S, not ~S") 'defclass name slot-name ':instance ':class argument)))) (:INITARG (unless (symbolp argument) (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option for slot ~S: ~S is not a symbol") 'defclass name slot-name argument)) (push argument initargs)) (:INITFORM (when initform (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option ~S for slot ~S may only be given once") 'defclass name ':initform slot-name)) (setq initform `(QUOTE ,argument) initer (make-initer argument))) (:TYPE (when types (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option ~S for slot ~S may only be given once") 'defclass name ':type slot-name)) (setq types (list argument))) (:DOCUMENTATION (when documentation (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option ~S for slot ~S may only be given once") 'defclass name ':documentation slot-name)) (unless (stringp argument) (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option for slot ~S: ~S is not a string") 'defclass name slot-name argument)) (setq documentation argument)) (t (error-of-type 'sys::source-program-error (TEXT "~S ~S, slot option for slot ~S: ~S is not a valid slot option") 'defclass name slot-name optionkey))))) (setq readers (nreverse readers)) (setq writers (nreverse writers)) (dolist (funname readers) (push `(DEFMETHOD ,funname ((OBJECT ,name)) (SLOT-VALUE OBJECT ',slot-name)) accessor-def-forms)) (dolist (funname writers) (push `(DEFMETHOD ,funname (NEW-VALUE (OBJECT ,name)) (SETF (SLOT-VALUE OBJECT ',slot-name) NEW-VALUE)) accessor-def-forms)) `(LIST :NAME ',slot-name ,@(when accessors `(:ACCESSORS ',(nreverse accessors))) ,@(when readers `(:READERS ',readers)) ,@(when writers `(:WRITERS ',writers)) ,@(when (eq allocation ':class) `(:ALLOCATION :CLASS)) ,@(when initargs `(:INITARGS ',(nreverse initargs))) ,@(when initform `(#| :INITFORM ,initform |# :INITER ,initer)) ,@(when types `(:TYPE ',(first types))) ,@(when documentation `(:DOCUMENTATION ',documentation)))))) slot-specs)))) `(LET () (EVAL-WHEN (COMPILE LOAD EVAL) (ENSURE-CLASS ',name :DIRECT-SUPERCLASSES (LIST ,@superclass-forms) :DIRECT-SLOTS (LIST ,@slot-forms) ,@(let ((metaclass nil) (direct-default-initargs nil) (documentation nil)) (dolist (option options) (block nil (when (listp option) (let ((optionkey (first option))) (when (case optionkey (:METACLASS metaclass) (:DEFAULT-INITARGS direct-default-initargs) (:DOCUMENTATION documentation)) (error-of-type 'sys::source-program-error (TEXT "~S ~S, option ~S may only be given once") 'defclass name optionkey)) (case optionkey (:METACLASS (when (eql (length option) 2) (let ((argument (second option))) (unless (symbolp argument) (error-of-type 'sys::source-program-error (TEXT "~S ~S, option ~S: ~S is not a symbol") 'defclass name option argument)) (setq metaclass `(:METACLASS (FIND-CLASS ',argument)))) (return))) (:DEFAULT-INITARGS (let ((list (rest option))) (when (and (consp list) (null (cdr list)) (listp (car list))) (setq list (car list)) (warn (TEXT "~S ~S: option ~S should be written ~S") 'defclass name option (cons ':DEFAULT-INITARGS list))) (when (oddp (length list)) (error-of-type 'sys::source-program-error (TEXT "~S ~S, option ~S: arguments must come in pairs") 'defclass name option)) (setq direct-default-initargs `(:DIRECT-DEFAULT-INITARGS (LIST ,@(let ((arglist nil) (formlist nil)) (do ((list list (cddr list))) ((atom list)) (unless (symbolp (first list)) (error-of-type 'sys::source-program-error (TEXT "~S ~S, option ~S: ~S is not a symbol") 'defclass name option (first list))) (when (member (first list) arglist) (error-of-type 'sys::source-program-error (TEXT "~S ~S, option ~S: ~S may only be given once") 'defclass name option (first list))) (push (first list) arglist) (push (second list) formlist)) (mapcan #'(lambda (arg form) `(',arg ,(make-initer form))) (nreverse arglist) (nreverse formlist))))))) (return)) (:DOCUMENTATION (when (eql (length option) 2) (let ((argument (second option))) (unless (stringp argument) (error-of-type 'sys::source-program-error (TEXT "~S ~S, option ~S: ~S is not a string") 'defclass name option argument)) (setq documentation `(:DOCUMENTATION ',argument))) (return)))))) (error-of-type 'sys::source-program-error (TEXT "~S ~S: invalid option ~S") 'defclass name option))) `(,@metaclass ,@direct-default-initargs ,@documentation)))) ,@(nreverse accessor-def-forms) ; the DEFMETHODs (FIND-CLASS ',name)))) '''Comments:''' Nice try, but no cigar. The longest run at a single level of indentation is the quasi-quote to do with slot kinds, at 10 lines. Lisp is expressive enough, and the indentation conventions well-known and universal enough, that this macro doesn't even read to me as a single thing (and I'm no lisp expert) in the sense that I can ''by eye'' identify regions of the code that I can examine in near isolation and think about separately. The injunction agasint writing long functions/methods in C-like languages is to do with the ease with which code can be written in them for which that is impossible. ''Well, you may want to pay a closer look at the block (do (let (...optionkey ...) ( case optionkey ... ))) block that surely spans tons of rows by LongFunctions discussion standards, and I almost forgot to mention it also falls under CaseStatementsConsideredHarmful, SwitchStatementSmell dogma, so would be absolutely verbotten. What do we do, do we declare that the injunction against LongFunctions makes an exception for any language more powerful than C (i.e. with support for higher order functions)? Well, let's do that, I'm all for it.'' I, as a lisper (AlainPicard), concur. In fact, I often do this sort of thing (well, maybe not to that extreme, but still) but it is basically a technique of not polluting the global function namespace. Long chains of FLETS, MACROLETS and CASE/COND analysis don't, to me, conceptually lead to difficult to understand programs if each part can be analyzed individually. The only bit you really lose on is the ability to test individually each component; often, however, the gain you get by being able to share lexical variables between these parts makes it a net win nonetheless. REALLY long functions in lisp are normally autogenerated/expanded from a macro; the classical example are FSM generators. ''I'm using this techniques in Java myself and in OCAML where it is the default style. However somebody on the LotsOfShortMethods thought police already almost accused me of incompetence in a funny page JavaStaticClassesIsaTotalMess, where I was defending one of the little feature of Java language that enables me to write more modular code and avoid polluting the package namespace. That somebody was considering a language design flaw nevertheless because would "encourage" the creation of long files (!!!). I wished more people on my team would use inner classes in Java more often (static and inner). That's why I think it is important to communicate to these guys to pay a little attention to real code out there.'' I don't mind inner functions used as a scoping mechanism - it's still possible to reason about each inner function independently, so this doesn't increase the cognitive load on my brain. But that's not what's going on in the CLisp example above. The bulk of the code is long chains of WHEN, UNLESS, IF, and COND statements, and a lot of just-plain-sequencing. There's no reason these couldn't be pulled out into inner functions, or dispatched upon a la DataDirectedProgramming. -- JonathanTang That's what packages are for. You don't need to worry about polluting the namespace. It is the namespace. And I guess if you like long methods it's clear why you don't have a problem with large dense inscrutable classes. I am now waiting for the please go read these books because I am not getting paid and it's too complex for a wiki page post. I have seen real code like yours and that is exactly why I don't like it. ''Let's not repeat the same arguments over here. At least three other folks agreed that you have not defended your position on that page, so go ahead and do it there. But you may express your undoubtedly opinionated position on the LISP snippet above. '' I haven't used lisp since college so I won't bore people with my undoubtedly irrelevant opinion on something I don't know about. As for defending my opinion, if you don't see the complexity and confusion of long methods and classes as a problem (in java), then what can say? If you think recreating another namespace mechanism other than the package is a good idea, then what can I say? Both are legitimate takes. ---- The longest method in my IbmSmalltalk V3.5 image (by character count) is '''Windows''''''Platform''''''Framework class >>#_PRAGMA_PlatformConstants''' clocking in at 663,973 characters. Eliding most of it (to spare your browser), it looks like this: _PRAGMA_PlatformConstants "%%PRAGMA DECLARE (name: PlatformConstants isPool: true) (pool: PlatformConstants declarations: ( (name: AbeBottom isConstant: true valueExpression: '3') (name: AbeLeft isConstant: true valueExpression: '0') (name: AbeRight isConstant: true valueExpression: '2') (name: AbeTop isConstant: true valueExpression: '1') '''...''' (name: WcTreeview isConstant: true valueExpression: '''SysTreeView32'' nullTerminated') (name: WcTreeviewa isConstant: true valueExpression: '''SysTreeView32'' nullTerminated') )) " What conclusions can we gain from looking at this "particularly big" method in Smalltalk? None. Could it be written any other way? Who knows. Who cares. It doesn't help answer any questions about coding style, the relationship between method length and anything else, or anything else. The defclass macro from the CLisp CLOS source offers approximately as much insight into these questions in Lisp. By the way, the following Smalltalk fragment extracts the longest method in Smalltalk. I think it's easily ported to all dialects. | aSize aSelector | aSize := 0. Object allMethodsDo: [:each | (each sourceString size > aSize) ifTrue: [aSize := each sourceString size. aSelector := each selector]]. aSelector ''How about getting an average method line count in smalltalk? I don't know how, but I'd be interested in the result. Of course long methods exist, no one deny's that, but I'd bet on average, smalltalks line count is pretty low compared to other languages, C especially.'' Yeah, I was just thinking about that myself. I can certainly collect the average line count. Even more interesting, but also more time consuming, is a histogram. I think the distribution will be interesting, and I'd like a tool that would let me click through the histogram to see the resulting code. I'll do the average first. ''Make sure to get method count stats as well. An object with 100 methods is not easy to understand either.'' [Voice 1: methods don't belong to objects in lisp] [Voice 2: Method count stats address a different (though perhaps related) topic. This topic concerns the length of a '''method'''.] ---- Umm, I can only claim to have skimmed the CLisp code above, but IMHO it looks gross. The average functions in TheArtOfTheMetaObjectProtocol were about 5-9 lines. The parts of CLOCC (mostly networking) I've looked at have had mostly 1-liners. Corman Lisp sources tend to run at a line or two, with rare functions being about 10-20. I think those may be better examples of Lisp code than the CLisp junk above. -- JonathanTang ''And why would you qualify the code above as junk ?'' I don't know why he would qualify it as such, as CLISP is a very high quality implementation. For reference, I checked how CMU does it, and the sources for defclass are embedded in an entire file (429 lines long) of which 101 lines is the DEFCLASS macro (the rest being helper expander functions used in that macro). I still don't think either of these are particularly hard to understand (due to length). After all, DEFCLASS does a helluva lot, in Lisp. -- AlainPicard I wish I had TheArtOfTheMetaObjectProtocol with me, as I could swear that defclass was no more than a printed page in that. It's 11 lines in the TinyClos implementation that came with Chicken Scheme, though that is perhaps apples-and-oranges. I ''think'' it was about a screenful in CormanLisp, but I don't have the source handy and I don't really want to trust my memory on that. Defclass does a lot, but what it does can generally be broken up into subtasks. I think that's the point that many people arguing for LotsOfShortMethods are trying to make. Would it harm readability to pull out a 100-line lambda and give it a name with a flet or labels? Or use a macro to take a table of forms and expand that into the COND? There ''is'' a lot of shared context in this example, but it seems like you could make the code a lot more readable by naming the various parts, and not having to trace through the execution and skip over all the when/unless blocks. I should probably retract my comment about it being "junk" above too, now that I've had a chance to read (as opposed to skim) it. It's readable. I just think it could be ''more'' readable if it was broken up a bit instead of being one monolithic piece. -- JonathanTang ''Here's Gregory Kicczales' Scheme implementation of a '''subset''' of the functionality of the code above. I refrained to bring it into discussion, because unlike CLisp it is not a professional-level maintained code, but since you mentioned Tiny Clos, here it is:'' ; ; Now we can get down to business. First, we initialize the braid. ; ; For Bootstrapping, we define an early version of MAKE. It will be ; changed to the real version later on. String search for ``set! make''. ; (define make (lambda (class . initargs) (cond ((or (eq? class ) (eq? class )) (let* ((new (%allocate-instance class (length the-slots-of-a-class))) (dsupers (getl initargs 'direct-supers '())) (dslots (map list (getl initargs 'direct-slots '()))) (cpl (let loop ((sups dsupers) (so-far (list new))) (if (null? sups) (reverse so-far) (loop (class-direct-supers (car sups)) (cons (car sups) so-far))))) (slots (apply append (cons dslots (map class-direct-slots (cdr cpl))))) (nfields 0) (field-initializers '()) (allocator (lambda (init) (let ((f nfields)) (set! nfields (+ nfields 1)) (set! field-initializers (cons init field-initializers)) (list (lambda (o) (get-field o f)) (lambda (o n) (set-field! o f n)))))) (getters-n-setters (map (lambda (s) (cons (car s) (allocator (lambda () '())))) slots))) (slot-set! new 'direct-supers dsupers) (slot-set! new 'direct-slots dslots) (slot-set! new 'cpl cpl) (slot-set! new 'slots slots) (slot-set! new 'nfields nfields) (slot-set! new 'field-initializers (reverse field-initializers)) (slot-set! new 'getters-n-setters getters-n-setters) new)) ((eq? class ) (let ((new (%allocate-entity class (length (class-slots class))))) (slot-set! new 'methods ()) new)) ((eq? class ) (let ((new (%allocate-instance class (length (class-slots class))))) (slot-set! new 'specializers (getl initargs 'specializers)) (slot-set! new 'procedure (getl initargs 'procedure)) new))))) ''Some 60 lines of code, again for a subset of functionality. Apparently yet another GrandMasterProgrammer was not aware that his code may be subsequently victim of the public outrage on wiki about LongFunctions.'' I saw that code (it's in the TinyClos implementation I'm looking at). I don't find it quite as hard to read as the defclass example above, because about 45 of those 60 (I counted 70ish in my implementation, but eh) are internal definitions that bind a variable to the result of a computation. With a let or flet or define, you can look at the code and then forget about it once it's been bound. With an if, when, unless, or cond, you have to trace through the branches, because the code within them can have multiple effects on the following code. Again, internal functions are not a problem (you won't see me arguing that JavaStaticClassesIsaTotalMess). Long chains of conditional logic are a potential problem. Short methods (where method is a self-contained block, not necessarily at top-level) are more readable, more maintainable, and dare I say, more provable than long ones. -- JonathanTang ---- As requested above, the following Smalltalk code finds the method with the highest linecount, along with the linecount, and also calculates the average linecount. Note that methods with no sourcecode (there are a few) are excluded from the statistics. | aMaxLineCount aLineCountSum aMethodCount aLineCount aSelector aString | aMaxLineCount := 0. aLineCountSum := 0. aMethodCount := 0. Object allMethodsDo: [:each | (aString := each sourceString). aString ifNotNilObject: [aMethodCount := aMethodCount + 1. aLineCount := aString occurrencesOf: (String platformLineDelimiter at: 1). aLineCountSum := aLineCountSum + aLineCount. (aLineCount > aMaxLineCount) ifTrue: [aMaxLineCount := aLineCount. aSelector := each selector]]]. 'Longest method selector is #%1 with lineCount: %2, average line count: %3' bindWith: aSelector with: aMaxLineCount printString with: (aLineCountSum/aMethodCount) asFloat printString The method answered the following, from the IBM Smalltalk V3.5 image: * Longest method selector is #_PRAGMA_PlatformConstants with lineCount: 9955, average line count: 8.91135178499799 ''Average 8.9, nice average!''