Measuring the impact of user/customer requirement changes upon code
----
* Collecting a list of typical change scenarios (requirements changes)
* Assigning a percentage (likelihood/frequency) estimate to each scenario
* Taking different paradigms, methodologies, or languages through each scenario
* Calculating the quantity of changes. Various counting metrics include:
** LinesOfCode changed
** Number of unnamed blocks changed (such as IF and WHILE blocks)
** Number of named units changed (methods, functions). ''Perhaps distinguish/count between inner-most named units and all named units.''
** Number of statements changed
** Number of characters changed
** Number of contiguous groups of code changed
** LinesOfCode suggests other counting techniques involving tokens, punctuation, etc.

No single metric is likely sufficient to get a good profile; otherwise, somebody could "game the system" by focusing on one narrow metric alone. It may ultimately be left to the study reader to apply the metric weighting factors they deem most appropriate. The general formula is:

 R = w1*m1 + w2*m2 + w3*m3 ...etc...

where wN is a weight and mN is a metric score. A zero weight renders a given metric "ignored".

More on this can be found at:

* http://geocities.com/tablizer/goals.htm#base (NoteAboutGeocities)
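To make the counting-plus-weighting idea concrete, here is a minimal sketch in Python using the standard difflib module. It counts two of the metrics listed above (lines changed and contiguous changed groups) for a pair of code versions and combines them with the weighted formula. The metric names, weights, and sample snippets are only illustrative assumptions, not part of the model.

 # A minimal sketch of counting change metrics and combining them with weights.
 import difflib

 def change_metrics(before, after):
     """Count lines changed and contiguous changed regions between two versions."""
     matcher = difflib.SequenceMatcher(None, before.splitlines(), after.splitlines())
     lines_changed = 0
     contiguous_groups = 0
     for tag, i1, i2, j1, j2 in matcher.get_opcodes():
         if tag != 'equal':              # 'replace', 'delete', or 'insert'
             contiguous_groups += 1
             lines_changed += max(i2 - i1, j2 - j1)
     return {'linesChanged': lines_changed, 'contiguousGroups': contiguous_groups}

 def composite_score(metrics, weights):
     """R = w1*m1 + w2*m2 + ...  A zero (or missing) weight ignores that metric."""
     return sum(weights.get(name, 0) * value for name, value in metrics.items())

 before = "x = cell[1]\ny = cell[2]\nprint(x + y)\n"
 after  = "x = row.price\ny = row.qty\nprint(x + y)\n"
 m = change_metrics(before, after)
 print(m, composite_score(m, {'linesChanged': 1.0, 'contiguousGroups': 2.0}))

A study would of course use more metrics and more scenarios; the point of the sketch is only that the raw counts and the weighting step are kept separate, so a reader can plug in their own weights.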
-----
Possibly, how many UnitTest''''''s are broken?

Isn't the cost of a proposed change covered as part of the PlanningGame?
----
It's really a bottom-up approach to requirements analysis on a deployed and possibly mature product. The client observes behavior A, which seems wrong, and proposes behavior B. The developer finds the code section eligible for making B replace A. Now the fun begins. If that change is made, how does it impact the long-standing, undocumented requirements of the system in subtle ways? It's all about the dense interconnectedness of requirements, a topic young software pundits love to avoid. Just run the unit tests? Maybe, but what if it's not a unit that breaks? --AnonymousDonor

If something needs to change, the first thing that actually ''does'' get changed are the unit tests. So, yes, you really can "just run the unit tests," and when everything passes again, you're good to go. Secondly, you should be working in a branch of the project, so as not to affect others. You'll want to periodically pull from the master branch to keep your work maximally synchronized with the rest of the project, of course, to minimize integration issues. At any rate, it's very doable, and has been done before where I work. Your presumption is that you change production code without touching the unit tests in the process (preferably, ''first''). If you do that, ''you get what you deserve.'' --SamuelFalvo

{How do you prove your unit-tests?}

* Write unit tests to prove more unit tests, which prove more unit tests. This is one problem I also have with unit testing (in addition to the points below). Tests are infinite upon infinite!
** False. You prove unit tests through any number of different methods. But the simplest thing that could possibly work is to just admit that unit tests are a double-check on your code. They do not prove the correctness of your code. Nor are unit tests provable. They are ''discrepancy detectors.'' If they fail, then ''something'' is wrong. It's your job to determine what it is. OTOH, if you want to get all formal, you could always attempt to prove your code by induction, whereby the inductive process yields invariants, which are themselves codified as unit tests, etc. --SamuelFalvo
** ''False? or NULL? You can write more tests to check more tests... I was being tongue-in-cheek when I said ''prove more unit tests'', because my point was that it is nearly impossible and leads to infinite testing :-) Unit testing can almost be considered TwiceAndOnlyTwice, but it could go further and be ThriceAndOnlyThrice, etc. See also a domino unit test problem: http://www.cs.utexas.edu/users/EWD/transcriptions/EWD10xx/EWD1036.html For what it's worth, I am a fan of testing code, but I do not fall for the hype of eliminating type systems and other integrity checks (which many extreme programmers advocate), and I do not even fall for the hype that there really is a true ''unit test'', since nothing really exists as a true unit (due to reliance on other systems... procedures and classes rely on other units!). I think a lot of unit-testing info written on the net and in books is absurd, but ''testing'' itself is useful, make no mistake. Whether one calls it unit testing, system testing, integration testing, etc., I find buzzy and vague. In the end, all of it is system testing, no matter how many try to argue that we can test a unit by itself. Some consider a class a unit. Some consider a method a unit. Some consider compilation a partial test (others violently disagree and say that compilation and type checking have nothing to do with testing; my view is that unit-test advocates who are against strong typing are simply avoiding one type of unit test, strong typing, and this is hypocrisy). It reminds me of OOP - everyone has a different view on what testing is. Back to my dictionary, I guess.''

[''Well, if one were refactoring (even just for aesthetics, changing variable names and moving code into smaller methods/procedures) and/or replacing algorithms with better ones (say, more reliable or faster ones), and the same behavior were expected out of a unit, then many tests '''may not''' change (especially since units always wrap other units, no matter how hard you try to dissect them all into individual units - that is impossible, since there are library calls and other units relying on other units, etc.). As for just running the unit tests and assuming all is okay if everything passes... '''bleh''', I'll admit that I think unit testing creates an egotistical false sense of security (just like "if it compiles, it's good!"). Hey, the tests pass! That means my program is perfect - ''neener neener neener!'' (Not.)'']

[''The issue above has bitten me several times where I've reported a bug to unit-testing advocates in charge of a code base - they simply write some dumb unit tests to "prove" the bug doesn't exist, using the simplest code possible - but their simple code doesn't allow the '''more complex problem''' to show up, which involves more than one unit integrating with another (missing factors). Only I, the user, could write a more complex test and beat it into them that their overly simplified tests were not showing the '''real problem''' in the working system. A problem, therefore, with unit testing is that '''sometimes''' the unit tests are too dumb and simple and don't take the bigger picture into account.'']

* This is why you use integration tests. Many of my tests exist ''solely'' to test interactions between two or more other modules. --SamuelFalvo (''see the sketch after this thread for the distinction'')
* ''Integration tests, system tests, unit tests... lots of tests, yes - much further beyond unit testing. Testing to me is common sense: we need to test cars before putting them on the road. But one can never truly have a "true unit test", because all units rely on something else, such as a variable or another library procedure (consider strCat or some function in a unit). So a unit test is a system test, in the sense that one can never take out a pure unit and test just the unit (unless you use bits... 1 and 0). Again, I think testing is useful in a common-sense kind of way (test the car before putting it on the road). I just don't agree with all the info about unit testing that is advocated and propagated in books and on the internet. Much of the unit-testing material contains hypocritical flaws (such as some claiming tests help you debug faster than formal methods because testing doesn't take as much time... the author of BitTorrent (Bram) makes these absurd claims). FWIW again, I think testing is useful! Unit testing? BuzzPhrase.''

They are certainly at fault here. Don't expect your unit tests to do the job of SystemTest''''''s. Those are the sorts of small programs attached to BugZilla bug reports, which treat the system as a whole as a BlackBox, encapsulating overall desired behavior.

See also: http://www.cs.utexas.edu/users/EWD/ewd10xx/EWD1012.PDF
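The unit-versus-integration distinction argued above can be shown with a short sketch. Everything here is hypothetical and made up for illustration (parse_price, apply_discount, the price data); the point is only that each unit test can pass while a test exercising two units together still catches a discrepancy in how they are combined.

 # Hypothetical units: each is trivially "correct" in isolation.
 import unittest

 def parse_price(text):
     # Unit A: parse a price string like "19.99" into integer cents.
     return int(round(float(text) * 100))

 def apply_discount(cents, percent):
     # Unit B: expects a percentage such as 10, not a fraction such as 0.10.
     return cents - cents * percent // 100

 class UnitTests(unittest.TestCase):
     def test_parse_price(self):
         self.assertEqual(parse_price("19.99"), 1999)   # passes

     def test_apply_discount(self):
         self.assertEqual(apply_discount(1000, 10), 900)   # passes

 class IntegrationTest(unittest.TestCase):
     def test_discounted_checkout(self):
         # Exercises both units together; catches the caller passing a
         # fraction (0.10) where apply_discount expects a percent (10).
         total = apply_discount(parse_price("19.99"), 0.10)
         self.assertEqual(total, 1799)   # fails: the discount comes out to 1 cent, not ~200

 if __name__ == "__main__":
     unittest.main()

Whether one calls the second test "integration", "interaction", or "system" testing, its job is different from the per-unit double-checks: it exposes a wrong assumption that neither unit test can see on its own.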
------
If something is broken, it is eventually going to have to be fixed; IOW, a code change. Over the long run, widespread or unintentional impacts will be counted. However, what is not measured is the "2:00 AM" factor: some side effects cause minor disruptions and some major. Nor is the cost of finding the change spot counted. It would be interesting to see designs that are easy to change but hard to find the spots to change, and vice versa (MentalIndexability).

''I would suggest you add a "beeper-during-vacation" multiplier for changes you feel would cost more from that perspective. The reader may or may not agree with your markups, but at least you document your interpretation of the ranking. People will not always agree with your weightings because their experience may paint a different picture from yours. However, the important thing is to '''document your assumptions'''. A future reader may still find your analysis useful even if they plug in different weights. Think of the analysis as a framework and not necessarily a final product, at least outside of your own needs. Plus, it can narrow the points of contention so that people can focus on those specifics.''

-------
'''Scoring Technique'''

Here's a rough model of how to put the above suggestions together from a relational perspective. Note that this is not promoting relational techniques; it only uses them as a way to show the relationships between the different factors. "Ref" suffixes denote foreign keys.

 scenarios table
 ---------
 scenID
 scenDescript
 codeSnippet    // if applicable
 styleGroup     // paradigm/technique identifier

 metrics table
 --------
 metricID
 metDescript    // example: "Number of statements changed"

 scores table
 --------
 scenRef
 metricRef
 score          // example: number value of statements changed

 scenarioLikelihood table
 --------
 scenRef
 proponentRef   // person or group making the estimate
 probability    // 1.0 = 100%

 metricsRelevancy table
 -------
 metricRef
 proponentRef
 relevancy      // 0 to 1.0
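As a sketch of how these tables might feed the weighted formula at the top of the page: for each scenario, sum score times relevancy over its metrics (using one proponent's relevancy weights), then weight each scenario by that proponent's probability estimate, and total per style group. The Python fragment below assumes the tables have been loaded into simple lists of dicts whose field names mirror the schema above; the sample rows are made up.

 # A sketch, not a prescription: composite change-impact score per style group,
 # using one proponent's relevancy weights and probability estimates.
 def composite_impact(scenarios, scores, likelihood, relevancy, proponent):
     prob = {r['scenRef']: r['probability']
             for r in likelihood if r['proponentRef'] == proponent}
     weight = {r['metricRef']: r['relevancy']
               for r in relevancy if r['proponentRef'] == proponent}
     totals = {}
     for scen in scenarios:
         scen_score = sum(s['score'] * weight.get(s['metricRef'], 0)
                          for s in scores if s['scenRef'] == scen['scenID'])
         group = scen['styleGroup']
         totals[group] = totals.get(group, 0) + prob.get(scen['scenID'], 0) * scen_score
     return totals

 # Made-up sample rows, mirroring the schema above.
 scenarios = [{'scenID': 1, 'styleGroup': 'procedural'},
              {'scenID': 2, 'styleGroup': 'OO'}]
 scores = [{'scenRef': 1, 'metricRef': 'stmtsChanged', 'score': 12},
           {'scenRef': 2, 'metricRef': 'stmtsChanged', 'score': 7}]
 likelihood = [{'scenRef': 1, 'proponentRef': 'top', 'probability': 0.3},
               {'scenRef': 2, 'proponentRef': 'top', 'probability': 0.3}]
 relevancy = [{'metricRef': 'stmtsChanged', 'proponentRef': 'top', 'relevancy': 1.0}]
 print(composite_impact(scenarios, scores, likelihood, relevancy, 'top'))

Keeping the probability and relevancy tables keyed by proponent is what lets different readers plug in their own estimates without recomputing the raw scores.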
In the past, with measurements roughly based on this model, different proponents have given widely different weights for relevancy and frequency. Each side usually accuses the other of biases such as selective memory. Another explanation is CompaniesHireLikeMinded: people tend to end up in companies or teams with somewhat similar techniques, habits, and preferences.

For example, when we were studying the change impact of asterisks in SQL SELECT statements on this wiki, somebody brought up the situation of a fixed-position array being used to store specific "cells" of rows (cell[1], cell[2], etc.). If the position of columns in the schema changed, obviously it could break a lot of code. But that practice is not something I see very often in production code, and so I would rank it with a low probability. However, some shops or some languages may tolerate or even encourage such a practice. The model above allows each shop to plug in its own frequency estimates if it doesn't like those of the original "judges". It thus allows one to pick and choose which givens they accept and reject. --top

''That's an awful lot of '''thin''' tables.''

Some designs call for it, some don't. The entities just happen to be skinny in this case.

----
See also: ChangePattern, WhyNoChangeShootout, SoftwareDevelopmentAsInvesting, DecisionMathAndYagni
----
AprilZeroEight

CategoryMetrics, CategoryChange, CategoryDecisionMaking