* The VisualBasic page and everything that hangs under it. The rant pages especially need RefactorMercilessly.
* Almost all of the X vs Y pages. Do a title search on "vs": JavaVsCpp, JavaVsSmalltalk, EmacsVsVi, ComVsCorba.
* Pages containing the word "Microsoft", like EveryoneButMicrosoftConsortium. Screen after screen, page after page, the same arguments are iterated and reiterated.
* SelfOrganizingSoftware - I find the interaction hard to follow
* RestArchitecturalStyle - likewise
* LiskovSubstitutionPrinciple
* Very large pages (listed at http:longs.cgi ) are usually the result of rampant thread mode.

Many other pages are listed in RefactorByMerging and RefactorByRenaming.
----
I was considering doing a one-off crawl of Wiki to put together a list of the pages which most need refactoring. I know the whole thing is in need of this, but there are 8000+ pages - it's the Augean Stables problem, and some prioritising needs doing.

Things I was thinking of doing:
* identifying very small pages - usually synonyms that can be refactored so there are no links to them. Maybe we could get Ward to 'really' delete a list of these.
* identifying coherent communities (using IBM's HITS algorithm). Basically, stuff which does not 'belong' to any coherent group of pages is highly likely to be drivel, and the communities themselves are interesting because they provide a guide to look for duplication.

If you think this is a really BAD idea, say so. If I do it, results will be posted so they can be used to guide refactoring. The HITS results may even prove useful in constructing new Roadmaps. -- BrianEwins.

''It's worth trying HITS and anything else like it, yes please. Let's amass as much knowledge about the WikiPage network as we can. Let's then take time to draw the right conclusions. A five year project? -- RichardDrake''

Before I dive off the deep end, let me tell you what I already know:
* we cannot trawl all of Wiki easily.
For one thing, Ward's server will not cooperate if you pull too fast, and I don't blame him. If I do this I'd have to bleed pages off over a number of days. So please, don't anyone else do it as well. If you already have code written to look for clusters then maybe you could volunteer here instead, otherwise I'll get on with it.
* The 'Pages containing...' suggestions, in general. Well, I'd already looked at that, and there are only 2000 or so (of the 8000+ pages) which do not share a common prefix/suffix with other pages. The rest of the pages fall into under 1000 bins. You could look at the most common prefix/suffix groups (XPattern, ExtremeX, WikiX) as being fairly coherent, and so a 'quick win' task for refactorers. (I stemmed plurals and ignored 'the's here; it didn't make much difference.)
* There are only 45 pages containing the string 'vs'. Sure, go for it, but I doubt that will help much. Worse, editing those now will put them to the top of the quickChanges list and people might add more.

In case you're wondering how I know this stuff: the titles of all pages can be got by searching for '$'. --BrianEwins

What's the point of finding the pages that most need refactoring, other than putting off the difficult work of refactoring one page that needs it, whether it's the worst or not? TheBestIsTheEnemyOfTheGood. -- francis
----
'''There are only 45 pages containing the string 'vs'. Sure, go for it, but I doubt that will help much. Worse, editing those now will put them to the top of the quickChanges list and people might add more.'''

An interesting defense mechanism. Try to fix the wiki and it opens old wounds. I don't think it's built for that. Fixes that irritate even small numbers of QuickChanges junkies seem to grow into wars of lengthy, repetitive, redundant, repetitive debate. The victors are the most stubborn, so only stubbornly defended information survives. It might be better to add stuff to the parts you want to grow. Open new wounds.
Perhaps wiki should forget. Drop the idea that any of this is archival and age out content that doesn't get looked at or changed. Has anyone tried that with a Wiki? -- EricHodges

Yes, it is called RecentChanges. 3 to 5 days of Wiki. It shortly forgets anything else exists. I don't know how the user can know when a page is looked at, only when it has been changed. Some of the best stuff on wiki hasn't changed for several years. I think it has been looked at, but I have no way of knowing for sure.
----
It seems to me that the best time to go back to a page is sometime after you've forgotten about it. PagesToRefactor should make that possible because I expect it to appear regularly on RecentChanges.

''You expected it to but it didn't. Today we re-started it on a slightly different tack with a suggestion from BrianEwins. If it fails to stay up to date it becomes a deleted page.''
----
See PairRefactoring, WikiMindWipeRepair, RefactoringNotes, RefactorByMerging
----
CategoryWikiMaintenance CategoryWikiRefactoring