A position paper presented at the OOPSLA 2001 workshop: ''Software Archaeology: Understanding Large Systems'' * http://c2.com/doc/SignatureSurvey * http://www.okisoft.co.jp/esc/SignatureSurvey/SignatureSurvey.htm (in Japanese) ''from the introduction ...'' Typically, a browsing mechanism for large, structured programs will expose a small part of the program in response to some form of query, possibly as simple as picking from a list. However, when examining an unfamiliar program one needs to get a feel for the whole program at once. To meet this need, one is better off writing a custom tool for viewing the whole program. Such a tool is easily written as a cgi script which runs on a server, reads all of the code base and then reports a summary of that code as a single html document. See also TipsForReadingCode. --------- I recently started bringing the idea of using the signature survey method into the German .NET Community. I wrote a short summary about it, referencing your original article. I summed it up and provided it here: * http://bitbucket.org/cessor/signaturesurvey I find it an extremely useful tool, although my understanding of why it is useful differs a little from your original ideas. Rather than actually browsing code I reduce it to creating a very flat list. I just wanted to say thank you for a very great idea that today, at a conference, inspired a lot of people. -- Johannes from Heidelberg, Germany ------- A version of SignatureSurvey exists for wiki. Try browsing with it and compare it with VisualTour and LikePages ... * http:wikiSig?StartingPoints * http:wikiSig?ManualTopTen * http:wikiSig?RandomPages The wikiSig script finds all of the pages cited on a root page and summarizes their structure. It is a shorthand for the structure of the page. It currently uses the following code: * p Page - a link to another Wiki page * C Category - special case of the above, a link to a Wiki Category (eg: CategoryHomePage) * S Signature - the canonical signature line (-- WikiName) * H http Link - an external link or an image link (eg: http://example.com) * | Rule - a horizontal rule. * * Bullet - A bullet list (only tabbed unordered lists; asterisk-only, ordered lists and definition lists are not yet implemented) * sp Paragraph - a space in the page signature indicated a plain old ordinary paragraph. * .,?(){} Punctuation A very simple example is the PietHein page. It's sig reads as follows: . . | H | C The best way to grok the SignatureSurvey is to mentally turn it on its side. ------ BrianDorsey suggested I try applying SignatureSurvey to wiki after he and AsimJalis had done the same to the SeaPig MoinMoin wiki. See http://www.seapig.org/PageSignatureCode for their report. -- WardCunningham ---- I'm looking at this and thinking it could be very useful for refactoring. At a glance, you can see all the long pages, threadmess, external links, etc. But the characters it picks out could be better suited for this: * Don't include (){}. They result in too much noise when you have, for example, Lisp or C code on the page. * Do include []. This is commonly used to delimit ThreadMode. * Maybe include indented text. This shows the code examples more reliably than (){}. Might be non-trivial to implement though. * Include formatting changes such as bold/italics. They're either used as headers (useful for seeing the structure) or to delimit ThreadMode comments (useful for refactoring targets). * Possibly include '-- lowercase' as a signature, instead of just '-- WikiWord'. Many people use initials instead of a full signature, particularly for long ThreadMode discussions. This'll result in some false positives with em-dashes, though. -- JonathanTang ---- This reminds me of the people in the movie TheMatrix ''watching'' characters "rain" down the monitor and ''seeing'' reality. -- DaveParker ---- As an alternative to using html to generate the summary and detailed view, you could use an IDE's cross-referencing ability to move around the code quickly. All you need is a high-level signature view, which can be easily provided through a plugin. However, it might not be enough to generate a fixed signature - since the important thing is the active exploring... "forming theories" about the code and "refuting them". So there should be a way to define the signature pattern yourself. This is a first attempt at such a plugin for IntellijIdea - http://www.intellij.org/twiki/bin/view/Main/JavaSig. It's a simple plugin, that generates a signature for each java source file in the project by replacing a given regex with an empty string. The regex can be modified. Using an IDE has the additional benefit of providing an intermediate view - the fully-folded snapshot of the source file (Ctrl Shift Minus - in IntellijIdea). -- YogiKulkarni ---- It's not a big deal to me, but I think (non-signatory) double hyphens fool the SignatureSurvey script. -- JoeWeaver ''You are partially correct. Double hyphens trigger an 'S' result in the SignatureSurvey only if the hyphens (surrounded by spaces or not) are followed by a WikiWord. Double hyphens are ignored as long as they are not followed by a WikiWord (with or without spaces). See EmDashInAscii. -- ElizabethWiethoff'' How does [the script] work, anyway? It shows no results at all on my homepage, but does on, e.g., TopMind. -- DougMerritt The vertical axis lists all the pages linked to from that page. The horizontal axis displays the summaries for those pages. For example, take http://c2.com/cgi/wikiSig?PainfulLanguage. You get a list of all links from PainfulLanguage: AbapLanguage, AlistairYoung, BackWardLanguage, etc. Besides each link is the summary for that page. Take MalbolgeLanguage as an example, because that's a fairly short page. The . indicates the end of the sentence. Then there are 2 newlines, represented as spaces, an H for the external link, and then 2 more newlines. Then there's a P (page-link) for TuringComplete, and another period. Then ,. for the sentence about its Turing Completeness. A , after "However", and then Ps for GeneticAlgorithm and HelloWorld. The next sentence just has a single P, for HaltingProblem, and then there's the rest of the punctuation in the sentence. Then more whitespace, and the final P for BefungeLanguage. Basically, long strings of punctuation mean dense text. Lots of Ps means someone likes AccidentalLinking, like me. Punctuation + S means ThreadMode. Lots of | (horizontal rules) is also a good indication of ThreadMode, or lots of disjoint comments. The length of the sig corresponds roughly to the length of the page. This could be a good guide to refactoring. At a glance, you can see long pages that might require splitting, dense ThreadMode, unrelated comments, etc. I wish you could get a summary for a page without having to search for one of its backlinks though. And I have no clue why DougMerritt's page doesn't show up. Maybe it's too long, or maybe some character in it is throwing off the script. -- JonathanTang ---- CategoryWiki CategoryWikiMaintenance CategoryWikiStructure