''Continued from TypeSystemCategoriesInImperativeLanguages...''

{See section 2 in http://lucacardelli.name/Papers/TypeSystems.pdf (Note: This describes a formalism for S languages. D1 languages would need to delay the type judgments until run-time. D2 languages would only make type judgments if an explicit request is made, e.g. cfArgument.)}

That is not a very approachable document. It sounds too much like you guys; which is the last way I'd document anything for typical programmers, the target audience. The tag model is more "mechanical". It essentially uses a kind of semi-abstracted machine language in which we empirically test one of two options for a given operator: the op "uses" (examines) the variable's tag, or it only uses the value, such as for parsing. (There may be ops that use both, but I cannot think of any at the moment.) 

We "run" the abstract machine language for both assumptions and see which one empirically matches the actual language output. For languages that never use the tag based on experiments (have no detectable tag), we can know to ignore the idea of a tag and that any "type determination" is done by examining the value and only the value.

TypeTagDifferenceDiscussion provides a starting catalog of experiments to try, if applicable to a given operator. This catalog can be expanded to make it gradually more thorough.
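
The two-option test described above can be sketched roughly as follows. This is only an illustration under stated assumptions: Python is chosen arbitrarily, and the names Var, is_number_uses_tag, is_number_uses_value, and matching_models are hypothetical, not part of any existing write-up of the tag model. The operator being modelled is an imagined "is this a number?" check.

```python
# Hypothetical sketch: compare two candidate models of one operator against
# output observed from the real language. All names here are illustrative.
from dataclasses import dataclass


@dataclass
class Var:
    tag: str    # type name recorded at assignment, e.g. "string" or "number"
    value: str  # the value, kept as text in both models


def looks_numeric(text):
    """Value-only check: try to parse the text as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def is_number_uses_tag(v):
    # Option 1: the operator examines the tag and ignores the value.
    return v.tag == "number"


def is_number_uses_value(v):
    # Option 2: the operator parses the value and ignores the tag.
    return looks_numeric(v.value)


def matching_models(cases, observed):
    """Return the names of the options whose predictions equal `observed`."""
    candidates = {"uses tag": is_number_uses_tag,
                  "uses value only": is_number_uses_value}
    return [name for name, op in candidates.items()
            if [op(v) for v in cases] == observed]


# Models the assignments a = 123; b = "123"; c = "abc"
cases = [Var("number", "123"), Var("string", "123"), Var("string", "abc")]
```

If the real language reports true, true, false for these three variables, matching_models(cases, [True, True, False]) leaves only "uses value only"; if it reports true, false, false, only "uses tag" survives. An empty result means neither option as configured mirrors the language and the model needs more tuning.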

I am not claiming it's necessarily the most perfect or most thorough way to test/model languages, but it's the best known optimization of simplicity and forecasting power for common dynamic languages. Or at least it's an alternative if one finds your model hard to digest. Like I said, different model may fit different WetWare differently. 

If the abstract machine language is too abstract for your taste (the abstraction leaves fuzziness), we can turn it into a more concrete (virtual) machine language by essentially creating a chip emulator or writing a full interpreter, but I hope we don't have to go that far.

So what's missing that makes it not sufficiently clear to you?

''A formal description of your model would help us to understand it, even if we're not the ultimate audience.  A "user friendly" or target-audience-oriented document can certainly be derived from the formal description.''

''Some things that are missing or not clear:''
* ''How does your model account for the fact that writeln("34" + 34) is -- depending on the language -- "3434", 68, or an error?''
** I already told you it's language dependent. We'd have to run multiple tests per language to see what its rules are. For example, one language may only look at the tag or parsed type of the left side, some only the right side, some both with a more complex rule set. The number of possible combinations of tag analysis, value analysis, and side (left/right) analysis rules can be fairly large. We either have to do per language experiments to find the right combo (to config a matching model), or read the manual and hope it's clear and accurate.
** ''The descriptions at the top of TypeSystemCategoriesInImperativeLanguages easily account for these, based on operator dispatch being determined by value types.  It is a limitation of your model that you're forced to defer to it being "language dependent" and having to "run multiple tests per language."'' 
** Please illustrate this alleged easiness. And how do you empirically know it's "being determined by value types"?
** ''It's described -- in simple terms -- in the literature of ComputerScience and SoftwareEngineering, and almost every programming language reference manual.''
** Claiming it's "clear" 46 times does not make it so.
** ''Look in virtually any programming language reference manual, and you'll see descriptions of operator dispatch based on operand type, whether the phrase "operator dispatch" is used or not.''
** And they are often vague, like your description of cfArgument.
* ''How does your model account for the fact that variable assignments can fail in C, C++ and Java but not in Ruby, Python, and PHP?''
** I already told you I'm not addressing static languages. (As a preview, static languages can be modeled with "read-only" tags.) 
** ''The descriptions at the top of TypeSystemCategoriesInImperativeLanguages easily account for these, yet need not make overt mention of "static languages".  It is a limitation of your model that a significant language category is not supported.''
** Claiming it's "clear" 47 times does not make it so.
** ''That does not address my point, which is that your model is missing a significant language category.''
** Like I said before, it's because it simplified the model for dynamic languages: I made a design tradeoff decision.
** ''The model would be even simpler if you eliminated dynamic languages, too.  I find your claim of "simplified" rather dubious, given that your model seems to demand a "'semi-abstracted' machine language" to fully understand it, a series of experiments with no clear end point, and some amorphous and previously-unknown "tag" that you refuse to confirm or deny whether it is (or is not) the same as "float, boolean, integer, date, string, etc."''
** Nothing empirical has 100% certainty. Nothing. It's a matter of finding the least flawed or most UsefulLie.
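
The writeln("34" + 34) question above can be made concrete with a rough sketch of three candidate rule sets for "+", one per reported outcome ("3434", 68, or an error). Everything here is an assumption made for illustration: the Python rendering, the (tag, value) pair representation, and the function names. No particular language is claimed to follow any one of these rule sets.

```python
# Hypothetical sketch: three configurable rule sets for a "+" operator.
# Operands are (tag, value) pairs, with the value kept as text.


class ModelTypeError(Exception):
    """Raised by the rule set that rejects mixed operands."""


def parse_number(text):
    # Return an int or float parsed from text, or None if it does not parse.
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return None


def plus_concat_if_any_string(left, right):
    # Rule set A: if either operand's tag is "string", concatenate.
    (ltag, lval), (rtag, rval) = left, right
    if "string" in (ltag, rtag):
        return lval + rval
    return str(parse_number(lval) + parse_number(rval))


def plus_add_if_parsable(left, right):
    # Rule set B: ignore tags; add when both values parse as numbers.
    lnum, rnum = parse_number(left[1]), parse_number(right[1])
    if lnum is not None and rnum is not None:
        return str(lnum + rnum)
    return left[1] + right[1]


def plus_strict(left, right):
    # Rule set C: operand tags must agree, otherwise it is an error.
    if left[0] != right[0]:
        raise ModelTypeError("mixed operand tags")
    return plus_concat_if_any_string(left, right)


# "34" is a quoted literal, 34 is not.
quoted, bare = ("string", "34"), ("number", "34")
```

With these operands, rule set A yields "3434", rule set B yields "68", and rule set C raises ModelTypeError. Whichever rule set reproduces the real language's output for a batch of such tests is the one kept for that language.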

* ''How does "tag" differ from "float, boolean, integer, date, string, etc."?''
** The tag holds the type name or ID. Each language may have a different set. If a lang has a feature similar to typeName() that gives the name of the type tag, then our experiments will usually be a lot simpler. We could start by testing things like this: 
      a = "123"; 
      writeLn(typeName(a)); 
      a = 123; 
      writeLn(typeName(a)); 
      a = 1.23; 
      writeLn(typeName(a)); 
      a = {1/2/2003}; // or whatever the date syntax is for given lang 
      writeLn(typeName(a)); 
.
** ''So it does not differ at all from "float, boolean, integer, date, string, etc."?'' 
** What is "it"? In the XML part of the model, it would have "tag='float'" if the language supported floats. 
** ''So "tag" does not differ at all from "float, boolean, integer, date, string, etc."?''
** What are you calling a "tag" in the XML version of the model?
** ''The attribute you named "tag", and whatever you've been calling a "tag".''
** The attribute and its value in the XML is what it is. I'm not going to try to classify it and be pulled into your LaynesLaw trap. Call it a "snarfleglox" if that makes you happy.
** The pair of the attribute name and its value together are what I informally call the "tag".
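
For readers who want to run the typeName() probes above in a real language, here is a minimal sketch using Python as a stand-in. The helper type_name is hypothetical (it simply reports Python's built-in type name), and reading the date 1/2/2003 as 2 January 2003 is an assumption, since the original leaves the date syntax open.

```python
# Hypothetical sketch: the typeName() probes, run against Python itself.
from datetime import date


def type_name(x):
    # Stand-in for typeName(): report the name of Python's own type for x.
    return type(x).__name__


def run_probe():
    """Reassign one variable with each kind of literal and record the name."""
    results = []
    a = "123"
    results.append(type_name(a))
    a = 123
    results.append(type_name(a))
    a = 1.23
    results.append(type_name(a))
    a = date(2003, 1, 2)  # assumed reading of {1/2/2003}
    results.append(type_name(a))
    return results
```

Calling run_probe() returns ['str', 'int', 'float', 'date']: the quoted and unquoted forms of 123 report different names, which is the kind of difference these probes are meant to surface.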

* ''Given that the presence or absence of tags is implicit rather than explicit, how do we definitively determine whether a language uses tags or not?''
** To repeat: TypeTagDifferenceDiscussion provides a starting catalog of experiments to try, if applicable to a given operator. This catalog can be expanded to make it gradually more thorough. 
** ''But how do you distinguish tags (or not) from the characteristics of "output" of certain expressions?''
** More experiments.
** ''How do you know when you've successfully identified "no tags", as opposed to needing to do more experiments?''
** Somehow my "100%" statement got deleted. (I swear it was there.) I'll repeat: there is no 100% way to know for sure; that's life. Manuals can be wrong too. The more experiments we do, the more assurance we have. We can approach 100% by doing yet more experiments, but never touch that asymptote. As an example, in all my years of using ColdFusion, it's never acted like it uses tags (for scalars): any "type comparing" done in the language is via parsing the value only. Someday I may find a case that proves that wrong; it may indeed happen. But I model based on the known info I have.
** ''Or you could read the language reference manual, ask fellow developers and/or the language developers, and thus get a 100% reliable answer (especially if you ask the language developers) without having to do unnecessary experiments, observations and speculation.''
** I already told you what happens when I ask typical developers. Do you remember my description of the results?
** ''Of course, but I'm assuming you have access to some developers with better understanding than bottom-feeding CodeMonkeys.  What happened when you asked the language developers?  What happened when you read the language reference manual?''
** Don't you understand? I want a model that can help "bottom-feeding CodeMonkeys" ALSO. I'm not assuming AynRandDesignPhilosophy for manuals.


''These are questions your target audience will inevitably ask.''

If they are as forgetful as you.

''I'm not "forgetful" at all.  I'm summarising the limitations of your model, and pointing out the fact that these limitations will inevitably be recognised and probed by your target audience.''

You have not shown an objective limitation, only word games.

''The above are objective limitations.''

Not modelling static languages, fine, I'll give you that one, but that's your goal, not mine. UselessTruth. No 100% guarantees, yours has the same limitation from an empirical perspective.

{Additional questions raised by your most recent description.}
* What's the syntax of this "semi-abstracted" machine language? (Note: We need the rules for how it's put together, not code snippets.)
** It models type-related interactions, not every part of the language. That would bloat up the model. Technically possible, but outside of the goal.
** {If you don't give us the syntax of the machine language, how can we perform your experiments using it?}
** It's a model, not a syntax. For example, epicycles are a model, not a syntax.
** {Your "semi-abstracted" machine language is a language. Like all languages, it has a syntax. What is it?}
* What are the restrictions on how the target language is compiled to the "semi-abstracted" machine language? (You'll need this one twice, once for each option.)
** It would be more like an interpreter, not a compiler.
** {Translating the language being tested into the machine language that performs the test is compilation.}
* What are the semantics of this "semi-abstracted" machine language? (Again, we need the rules so we can run it. Examples are not sufficient.)
** Example? Semantics are "in the head" such that I don't wish to rely heavily on such if possible because I want to model I/O, not head models.
** ''Semantics are not "in the head".  Informally, they are what languages '''do'''; they describe the effect that statements have on the machine.''
** I disagree that "machine" is the proper perspective for intended use of the model, per below.
* What are the rules for determining which one "empirically" matches the actual language output? (This comes up because both compiles should result in the same output as the actual language output. There shouldn't be any differences at all between any of the three.)
** I'm not following. Please restate.
** {When one compiles one language into another, the output stays the same. Since equality is transitive, that makes the output of all three programs the same. Yet somehow, this output is supposed to be able to identify "tags". How do you do that?}
** I still am not seeing what you are trying to get at. Perhaps an example would help. We tune/config the model so that the input matches the output of the real deal. It's as simple as that.
** {You have three programs in each of your experiments. You have the program in the language you are testing. You have the program in the abstract machine language using your first option. You have the program in the abstract machine language using your second option. Since your second and third programs are compilations of the first, all three will produce the exact same output. You've stated that you use the output to determine if the language has "tags". What rules are you using to determine if the language has "tags"? (And I can't give you an example until you answer the three previous questions.)}
** I'm still not seeing what you are getting at. You seem to be focusing on the phrase "has tags". It's a prediction model and is judged by its accuracy in prediction, not on vocabulary.
** {You start your description by saying "The tag model is more 'mechanical'." You describe the two options for your abstract machine language as "the op "uses" (examines) the variable's tag, or it only uses the value." The purpose of your experiments is to determine whether or not the language in question is one of the "languages that never use the tag." The only given reason to care about the results of the experiment is whether "we can know to ignore the idea of a tag and that any "type determination" is done by examining the value and only the value." Of course I'm focusing on the "tag" since that's what your description focuses on. I'm not focusing on output, except where it's used to detect tags, since that exception is the only time you mention output.}
                     <var name="a" tag="number" value="123"/>
                     ..............TTTTTTTTTTTT.VVVVVVVVVVV
.
** The section marked with T's is what the model examines. Now whether the language itself "has tags" or not is not meant as an implementation diagnosis; it only means the language matches the model that uses tags to mirror it. In other words, such a language ACTS LIKE it has tags, like the model's tags. I thought we cleared up the virtualization issue earlier. Remember the discussions about a non-tag language using tags under the hood to speed up implementation and/or save RAM space, but otherwise not acting differently from a version without, other than being faster?

** {Where did this come from? Is it from the source language, that abstract machine language, or the output? What rules do you use to determine if it "matches the model that uses tags to mirror it?"}

** I1 = I2 and O1 = O2 where "I" is input and "O" is output. (Remember, input includes the source code of the app)

** {From your description above, I1=I2 and O1=O2 in every experiment. So what rules do you use to determine if it "matches the model that uses tags to mirror it?" (And yes, I remembered that the source code is included in the input. There's only one piece of source code in each experiment described above. It thereby follows, that I1=I2 always.)}

** If we assume a language parses to determine if X can be interpreted as a number for operator Y, and that assumption is wrong, it won't match actual language output.

** {Your description above makes no mention of this step. Where does it fit in? What are the effects on the rest of the model? Since any value that is usable by a computer can be encoded in a string, the assumption that a language uses parsing cannot ever be inconsistent with any input (including source code) and output. So what are the rules you use to determine that the assumption was wrong?}

** I don't know how or where you misinterpreted it. I'm not sure what you are asking regarding parsing.  If an operator produces a different result based on whether a variable was assigned with quotes versus without, then obviously the operator is "looking at" something else besides the value portion. The "quote-ness" is being "stored" in/with the variable somehow such that there is something besides the value that the operator has access to. You may not want to call that "something else" a "tag", but it's there (or at least its side-effects). '''Even your model separates "type" from "value" (XML)''', so why are you complaining when my model does the same thing? It appears to be hypocrisy. 

** {Quotes? You've made no mention of quotes above. Where do they fit in? (BTW, I didn't complain. I asked you where a step not mentioned before fit in.)}

** Quotes are language-specific. I'm not going to hard-wire language-specific stuff into the model's base; that would be silly and limiting. One does experiments per language to see if and how quotes affect the results: SCIENCE. (Quotes are a common thing to test, but not the only.) And I am not against the idea of using manuals to get ideas for what to test. But ultimately the empirical results are valued over interpretation of manuals.

** {So how do we determine that quotes (or whatever) affected the results? It won't change the output of either of the above compiled options.}

** '''Science!''' Which specific experiments are you referring to?

** {The ones you describe at the top of this page. "It essentially uses a kind of semi-abstracted machine language in which we empirically test one of two options for a given operator: the op "uses" (examines) the variable's tag, or it only uses the value, such as for parsing." BTW, scientists include how they are detecting the differences they seek when they plan their experiments. You appear to want to avoid doing so. Why?}

** Like I said, the experiments have to be planned per language. I give examples of the kinds of experiments one can do based on '''common language patterns''', but the exact experiments would be shaped by the language itself. I cannot make a catalog up-front of all possible dynamic languages. A dynamic language can be whatever the hell the author wants it to be. Captain Kirk can make a Planet Exploration Guide based on experience from prior planets, but each planet can be different with new interesting features and new villains and 3-breasted green babes. 

** {So, to use your model, we have to perform experiments, but we can't know what they are. Can't say I see the point in your model then.}

*** What's the alternative? As far as "don't know what they are", it's not realistic at all to know all questions and tests of a given language up front. I've already given suggested tests and samples for commonly-found patterns such that one is '''NOT starting from scratch'''; but those may not be enough to finish on alone.

*** {One can do what we did on TypeSystemCategoriesInImperativeLanguages. There, languages are classified based on how they associate types with variables and values. This is how it's done in mathematics. One can build a model by listing its parts and the rules for manipulating it and tying it to reality. This is how it's done in science. Just presenting code-snippets and making pronouncements can only help if we are limiting ourselves to those '''exact''' cases you've made pronouncements on.}

*** For one, the type of math used may not be approachable to the average developer. As far as "This is how it's done in science", I'm skeptical. Show it. And, any general "pronouncements" should be empirically tested or at least testable, which has to be done on a case-by-case basis anyhow if we want to be thorough.

*** ''Your correspondent was pointing out that classifications based on associations between parts is fundamental to building models, and that this is the process used in mathematics and science.  He's not claiming that TypeSystemCategoriesInImperativeLanguages is mathematical.  If you're "skeptical" that it's how it's done in science, then you are clearly unaware of scientific methods and therefore have no basis upon which to comment on what is, or is not, done in science.''

*** It's a poor write-up either way.

*** {What write-up? We were talking about alternatives to performing experiments when there is no way to know what experiments to perform. As is done with your model.}

*** I gave plenty of example experiments. They won't necessarily answer ALL questions, as each lang can have different features never encountered before, but they are a good start. Yes, they could be regimented more with step lists and check-boxes on forms to fill out etc., but the principle is still the same. The Trek planet survey analogy started somewhere around here still applies: There will be general similarities in planet "families" to guide and cull analysis steps, but still each planet is unique and some degree of detective-like work will be needed to fine tune the analysis and the physical/chemical model of it. Thus, your "no way to know what experiments to perform" is either bogus or misleading. If you know a magic shortcut, I'm all ears.

*** {Yes the Trek analogy still applies. In the trek analogy, they know what experiments they need to run to determine if a planet is of a particular type before they know that the planet even exists. There's nothing magic about it. They simply know what the parts of their model are, and the rules for how they interact. You know, those things you refuse to tell anyone about. (And I'd still like to know what you were referring to when you said "write-up".)}

*** That applies to both sides. Your write-up near the top of TypeSystemCategoriesInImperativeLanguages does not clearly define those things.

**** {So that's the write-up you're referring to. I certainly had no problem understanding what the original author wrote (who wasn't me). In particular, it clearly defines what is meant by S, D1, and D2 languages. From those definitions, one can devise "experiments" that tell you which category a given language falls into. They'll likely have this form: Search the language definition for the relationship between types, values, and variables. See if there is more than one type associated with variables. If there is, it's an S language. If not, see if there is more than one type associated with values. If there is, it's a D1 language. If not, it's a D2 language. If that's not good enough for you, create a template (like I did in TopsTagModel) and I'll see if I can put it in a format you will find clearer.}

**** "Associated with" is too fuzzy as given. Due to the ButterflyEffect, everything we see is potentially "associated with". It needs rigorous boundaries for a non-vague model, such as the scope, duration, empirical verification method, etc. And your template is vague to me. 

**** {So instead of taking me up on my offer, you're going to say something false and pretend you have a point. The ButterflyEffect is a statement about chaotic (in the mathematical sense) systems. In those systems, the input (which in the anecdote is the air movement created by the butterfly) is indeed associated with the output (the storm on the other side of the world). In fact, the relationship is even stronger, it's causal. BTW, I asked you to create the template, it should only be my template if you find that clear.}

**** Nothing false. Take up your own "template" offer first on your model so I can see it "done right". Your existing write-up is chaotic (in every sense).

**** {You were given a link to it done right. (I also notice no actual defense of the statement I tagged as false. Just claiming it's true is hardly a counter.)}

**** The "Cardi" link sucks, for reasons already given.

**** {None of which stood up under scrutiny. (And still no support of the false statement.)}

*** ''It doesn't have to.  All terminology in said write-up is used in a strictly conventional fashion.  Use any recognised source you like for the definitions of "variable", "value", "type", and so forth.''

*** By now you should either be able to predict my reply to this, or you have not been paying attention.

** And I'm still not clear on what you mean by "It won't change the output of either of the above compiled options". My experiments show quotes can affect the output, including in some D1 languages, such as PHP's is_bool(true); versus is_bool("true"); (Which strangely acts differently from its is_numeric function, which ignores quotes.) In fact, typeName() kind of functions found in D1 languages will almost always produce different output depending on quotes.

** {You said above that you were comparing the output of the original program with the output of the two semi-abstract machine language programs. The output of all three will be the same, since the latter two are compilations of the first. Whether or not a different source program (which would be a different experiment in the above description) results in a different output doesn't appear to matter. Are you changing your model again? If you are, please put it in TopsTagModel.}

** There appears to be confusion (what's new?). One simulation assumes/uses tags and one doesn't (generally per operator test). If they both have the same output, it means tags don't matter, and could be removed from the language's model entirely if consistent across all tests for that language. If it does make a diff, then we know that op is affected by the tag. (for D1 languages, generally it's a mix.)

** {What simulation? That's the first time you've mentioned it on this page. Where does it fit into the above description? As a guess, I would say it's where you run the two sets of compiled code. Still, I don't see how the output would change between them since they are both compiled from the same source.}

** The model predicts/mirrors output of the actual language. This is "simulation", is it not? That's not new. I'm not sure what you are calling "source". One model version uses a tag and one doesn't. There may be more than 2 model variations in some cases. Example: 

     Test app: a=7;b="7";print(typeName(a),typeName(b));
     Text input: None in this case
     Actual language run results: Number String
     Tag model results: Number String
     Non-tag model N1 results: Number Number
     Non-tag model N2 results: String String
.

** In this case we cannot get any non-tag model variation to match actual language output, so we surmise "typeName" uses a tag, which is affected by quotes in assignments. In other words, something "in the variable" tracks "quote-ness"; and we informally call that something "tag". We cannot make any model that ignores the tag and uses just the value to produce "correct" output (matching actual language implementation). We can also surmise that the language being tested is overall tag-based (at least as a classification), and that at least one operator uses such tags. (Whether other ops do is a question that requires more testing.)

** Note that non-tag languages typically don't have a direct equivalent of "typeName", which is a clue that they are non-tag languages. But sometimes it's difficult to know what "type-ish" ops actually do or are comparable to without testing.

** {In general, predicting output is not simulation. One can certainly use simulation to predict output, but there are other ways. The source is what you refer to as "test app" above. As for the rest, how do you build the tag model, N1, and N2 from the test app?}

** N1 would be a model of typeName that only looks at (uses) the value; such as parsing-based determination. N2 could be a model of typeName that also only looks at the value, but since all values may be "strings", it would always return "string" (making it essentially a useless function when comparing just scalars, although in some languages non-scalars may return a different answer).

** {It appears to contradict what you said earlier. Why would the '''non-tag''' model N2 only look at a tag? I thought it wasn't supposed to have any to look at. In addition, it doesn't answer the question. You were asked how to build those models. Please answer that question as well.}

** I have reworked the statement. I have already described how to build the models. I have no idea why it's not clear to you; I cannot x-ray your head. I agree it needs more regimentation to make sure it's as thorough as possible, but the basic steps have been covered many times.

** {Where have you described how to build the models?}

** Various places, especially in terms of examples for specific operators. TypeTagDifferenceDiscussion has probably the most thorough examples so far.

** {I'm afraid you'll have to be more specific. I don't see anything that looks like information about how to build the models.}

** Well, it's clear to me. I don't know why it's not clear to you. I'm completely at a loss as to why.
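
One possible reading of the typeName example above (tag model versus non-tag variations N1 and N2) is sketched below. It is a guess at the intended procedure, not a description confirmed by either party: the Python rendering, the assign step that records "quote-ness", and all function names are assumptions made for illustration.

```python
# Hypothetical sketch of the typeName experiment: three candidate models of
# typeName() are run on the same test app and compared with actual output.


def assign(literal):
    """Model an assignment: quoted literals get tag "String", others "Number"."""
    if literal.startswith('"') and literal.endswith('"'):
        return {"tag": "String", "value": literal[1:-1]}
    return {"tag": "Number", "value": literal}


def type_name_tag(var):
    # Tag model: report the tag recorded at assignment.
    return var["tag"]


def type_name_n1(var):
    # Non-tag model N1: ignore the tag and parse the value.
    try:
        float(var["value"])
        return "Number"
    except ValueError:
        return "String"


def type_name_n2(var):
    # Non-tag model N2: every scalar value is treated as a string.
    return "String"


MODELS = {"tag model": type_name_tag, "N1": type_name_n1, "N2": type_name_n2}


def predict(model, literals):
    """Output of print(typeName(a), typeName(b), ...) under one model."""
    return " ".join(model(assign(lit)) for lit in literals)


def surviving_models(literals, actual_output):
    """Names of the models whose prediction equals the real language's output."""
    return [name for name, model in MODELS.items()
            if predict(model, literals) == actual_output]


# Test app: a=7; b="7"; print(typeName(a), typeName(b));
literals = ['7', '"7"']
```

Here predict gives "Number String" for the tag model, "Number Number" for N1, and "String String" for N2, so surviving_models(literals, "Number String") returns only ['tag model']. The three models differ only in what typeName is allowed to look at; the test app and its input are the same for all of them.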


You haven't provided such for your model.

''There's no need, because our "model" describes programming languages using standard terminology and concepts.  The above points are based on your suggestion that we "'run' the abstract machine language" (and the like), none of which is suggested for or required by our "model".''

Standard terminology is vague. We've been over this already. Claiming it's "clear" 48 times does not make it so.

''Beware of conflating "it's vague" with "I find it vague."  The former requires evidence.  Do you have evidence?  More importantly, however, you are suggesting a regime involving an "abstract machine language" that adds considerable complexity to your supposedly "simpler" model.  That seems unreasonable, given that the observational speculation that your model is based on is completely unnecessary.  To understand how languages work, we need only look to multiple sources (including their authors!) that describe how they '''actually''' work.  We don't need to construct models based on speculation about I/O observations.''

"Type" has been a '''vague, abused, overlapping, and overloaded concept/word'''. I don't know why it has been that way, it just has. Nobody has figured out how to write clearly about types yet. Until somebody figures it out, I'll use a "mechanical" style model instead.

{Any evidence of that? (And it's clear enough for a computer to figure it out.)}

Exhibit 1: The "type=" in cfArgument that never touches the "type" in your model/description.

Computers process machine instructions, not fuzzy English.

{I don't see anything vague, abused, overlapped, or overloaded there. Try again.}

Both call different "things" types.

{No, that's a falsehood on your part. It's been explained to you many times that the types in cfArgument and the types in the rest of our model are exactly the same thing.}

How do we objectively know that?

{Since, this is a model designed by people, you ask the people who made the model. They have complete control over such things.}

I think they died.

{So check their writings on the subject.}

It's too convoluted and round-about. Makes no sense to me.

''You mean you've never found an introductory textbook, on-line tutorial, language reference manual, language implementer's 'blog post, course instructional materials, or face-to-face discussion with a language developer that made sense to you?''

* No, they are usually vague. "Type" is an abused word, like I said.

* ''Yet millions of programmers successfully write programs -- some even write interpreters and compilers -- which would be pretty much impossible if such sources were vague.  I suggest that doesn't mean they're vague.  It only means '''you''' find them vague.''

* I've already described how typical programmers "work with types" based on my experience with them. If you disagree with my stated assessment, there's nothing I can do further. I observe and I report what I see. If it differs from your observations, then we just have to leave it at that. We all only live one life. Why do you keep dredging that up? You know my response will be the same each time. I suspect your mother eventually caved in if you asked for something enough times, and you hope everybody else will do the same to make you feel comfy inside.

* {Can you provide any evidence that there are other programmers that work that way?}

* No. Can you provide evidence that many or most don't?

* {Sure. Look at all the tutorials, textbooks, etc. that describe it in other ways. (BTW, you're ShiftingTheBurdenOfProof. You made the claim, you need to support it. Challenging others to prove you wrong doesn't count as support.)}

* YOU are the one who keeps bringing up what programmers think, not me. Thus, your burden claims are full of horse stuff; I'm just responding to YOUR question based on my experience in the field. Maybe if I search gijillion type books maybe I'd eventually find a good description. Until then, the tag model works decently.

* {Really? You didn't say, "which is the last way I'd document anything for typical programmers, the target audience," "it's an alternative if one finds your model hard to digest," and "Like I said, different model may fit different WetWare differently," when you originally created the page? In addition, this particular instance of bringing up what programmers think was started by you. You made the claim that the current descriptions are vague. It appears that what you mean by that is "typical programmers won't understand it." It's perfectly reasonable to provide evidence that a large number of people do understand it in response to such a claim.} 

* The page is a continuation due to size, not a new topic. Either way, neither party has anything except personal anecdotes. The burden is on both sides to justify their anecdotes above the other side. 

* {You've described your model in a very different way on this page. I thought it was a replacement for your original. If it's supposed to be an addition, I'm going to need to have a clean copy of your complete model somewhere. It's gotten to the point where I can no longer tell what is or isn't in your model anymore. I'd suggest TopsTagModel. And no, we haven't given you any anecdotal evidence.}

* {Why do you think the vagueness of our model (this is not an admission of vagueness) has any bearing on how well the tag model works?}

* Less chance of mis-applying the model or misinterpreting it.

* {Why would the vagueness of one model have any effect on the chance of mis-applying or misinterpreting another model?}

* No, it's a matter of competition. If model A is too fuzzy, then model B looks more inviting.

* {OK. Fix what you originally said, and I'll delete the stuff that came from it.}

* What statement specifically is broken? Never mind, we'll revisit immaterial squabbling later.

* {"Maybe if I search gijillion type books maybe I'd eventually find a good description. Until then, the tag model works decently." This appears to say that how well the tag model works is dependent on how vague our model is.}

* You lost me. I'll take a non-vague model over a vague model.

* {Where'd you get lost? Your own words? (I've certainly felt that way about stuff you write.) My explanation of what's wrong with it? You said that your tag model works decently until you find a good description of ours. You've since said that you meant you preferred your model until you find a good description of ours. I'm asking you to fix your original to match your current since there really is no value to the subsequent conversation.}

* I didn't change the tag model. You mean the CSR definition thing? I'm abandoning those defs for now. The tag model does not rely on those definitions. It's a model, not a definition anyhow.

* {You most certainly have changed your model. None of your original formulations included semi-abstract machine languages. Your most recent one does. But I'm willing to take whatever version you want to use. Just put it on TopsTagModel so I don't have to try to piece it together from the multiple pages it's scattered across now. BTW, I don't think you put your response in the right spot. The paragraph you responded to is talking about how you misspoke in an earlier sentence, not about the tag-model per se.}

* Like I said before, I use StepwiseRefinement for areas that seem to be confusing others. Hopefully we don't have to get down into individual bits, but if that's what it takes. I'll try to find the right home for this reply.

* {Stepwise changes are changes. But I'm not asking you to get to individual bits. I just want you to tell us about the parts you already have. In particular, this semi-abstract machine language is central to your current formulation. You've avoided answering the questions about it. I've asked you to put the current version of your model in a single spot since, after all your stepwise refinements, your model is scattered over several pages. You've refused twice.}

* Stepwise refinement is NOT change. And I've asked you to do it to your model first so I know the "right" way since you seemingly keep moving the goalposts.

* {It is, but it's not really worth arguing about. Anyway, we haven't moved the goal posts. What we need, and have always needed, is for you to present your model in enough detail that we can run it without having to ask you to make a pronouncement. Why do you refuse to do so?}

* See TopsTagModel.

But back to the question, your model's XML representation of a variable has something called a "type". cfArgument has something called a "type". How EXACTLY are these two related or not related in a way that we can objectively test the relationship? I know the fuzzy "head model" that types are category-like things, but that's in the human head, not necessarily in the interpreter.

''It doesn't matter.  There's nothing in the descriptions at the top of TypeSystemCategoriesInImperativeLanguages that relies on definitions of "type".  We simply use the familiar term "type" in precisely the way it's used in every introductory textbook, on-line tutorial, language reference manual, language implementer's 'blog post, course instructional materials, or face-to-face discussion with a language developer.''

So the ColdFusion creators didn't read the correct holy books because your model doesn't touch their cfArgument "type=" thingy despite using the word "type" also. You claim it's covered in the Holy Type Books, but YOU don't cover it.

{We never said that. And why do you think it's not covered? It's been covered repeatedly, it's used to determine which values are valid and which aren't. See? Covered.}

Sorry, it didn't appear that way to me. It touched nothing called "types" in your model. Your typeness doesn't match up with theirs.

{Where does it differ?}

I don't know if it's a matter of "differ", it's at least a matter of confusion. You have part A called "type" in your model, and you use it to model language X which has part B also called "type". Yet there appears to be no clear connection between them. Unless a rational and objective connection can be found between the two, it seems logical to avoid such overlap and give part A a different name. Anybody who spent a fair amount of time writing and had feedback on their technical documentation should know this rule of thumb: '''Don't give two different things the same name without a very good reason.'''

''Of course.  But there aren't "two different things".  The two "type" things are the same thing.  The connection between them is clearly indicated by using the same name for both.''

So you say. I don't see the connection in your model; only in name.

''The use of the same name '''is''' the connection.  Every time "variable" appears, we mean the same concept of "variable".  Every time "value" appears, we mean the same concept of "value".  Every time "tag" appears in your model, you mean the same concept of "tag", right?  So why should "type" be any different?  We don't need to draw arrows between every instance of "type".  In scientific and technical writing, unless explicitly stated otherwise, it's safe to assume that every use of a significant term refers to the same thing.''

But they are NOT the same thing. There are (at least) '''two very different "kinds" of types''' in dynamic languages: tag-based typing and parse-based typing. If you want to put them under the umbrella "type", that's fine, but there needs to be a clear '''sub'''-division in the models and vocab that's often lacking or nebulous or downplayed. -t

''Parsing to determine type is not the same as a LexicalAnalysis to determine type, obviously.  The former occurs at run-time, the latter occurs prior to run-time.  However, the "type" is exactly the same in both.''

You haven't given a metric or clear explanation of how we know they are "exactly the same". Either way, in the tag model, they are not "exactly the same" such that I give them different names. Perhaps you are arguing I abandon the tag model to use a diff model where they are the same. But that probably violates my priorities as previously listed because your model is more complicated (and confusing).

''My metric for "the same" is sameness.  Whether our model is more complicated or not is for the reader to decide, but there's a trivial reason why it's more complicated: Compared to your tag model, it explains more behaviour in more kinds of languages.  Indeed, it is the very ''basis'' for TypeSystem behaviour in all popular imperative programming languages.''

Re: "in more kinds of languages" -- No argument from me there. But I am NOT looking for a grand god-model of all languages. I want a model that predicts type-related behavior of common/typical dynamic languages and will tune/trim my model for that scope and that scope alone.

''Fine, but then you can't claim your model is simpler -- or even less confusing -- unless your model covers precisely the same elements as ours.''

It's less confusing for the stated purpose/scope. Static languages have less confusion associated with "types" in my experience because much more of such apps are explicit: it's one of the very advantages of static languages.

''Apparently it's less confusing to you, but that's hardly surprising given it's your model.  That doesn't mean it's less confusing to anyone else.  Perhaps you'd like to recount your personal experiences of showing it to your fellow programmers?  How do they react to it?''

Very incidentally, the tag model ''could'' be extended to more language flavors using a variable model similar to:

   <variable name="foo" static-type-tag="..." dynamic-type-tag="..." value="..." readonly="false"/>

''Can you give an example of a language where a variable can be declared to have both a "static-type-tag" and a "dynamic-type-tag"?''

C-sharp. Note that the dynamic type is generally ignored if using one of the static types. We could also use nested XML to split variables into parts and simply not nest if using one of the static types. But that's like a data-modeling fight over nulls versus "skinny tables", which I'll avoid here by saying I'm showing the "widest" variable structure here (at least for scalars).

{Can you show us the code with both a static type and a dynamic type associated with a single variable?}

 object x = 2.7;

{I see the static type associated with it is object. What's the dynamic type associated with it?}

Here's how I'd model it:

 <variable name="x" static-type-tag="object" dynamic-type-tag="double" value="2.7" readonly="false"/>

 object x = "2.7";  // would instead give:

 <variable name="x" static-type-tag="object" dynamic-type-tag="'''string'''" value="2.7" readonly="false"/>

 double x = 2.7;  // would give:

 <variable name="x" static-type-tag="double" dynamic-type-tag="N/A" value="2.7" readonly="false"/>

(As described in a prior topic, C-sharp provides two different operators to directly examine (output) these two different tags. It's possible the dynamic tag will also be "double" instead of null or "N/A" in this case. I'd need to run tests to see. But, it's not change-able for most "base" types, and thus "dynamic" may be misleading, but I've yet to find a better term. Perhaps "run-time-tag"? "secondary-tag"?)

{I suppose you could make that work. It's unnecessarily complex since you would have to add the dynamic-type-tag to everything that has a value, and special case your rules when the dynamic-type equals the static-type. Probably better to just go with the simpler}

 object x = 2.7;

 <variable name="x" type="object">
  <value type="double">2.7</value>
 </variable>

 object x = "2.7";

 <variable name="x" type="object">
  <value type="string">2.7</value>
 </variable>

 double x = 2.7;

 <variable name="x" type="double">
  <value type="double">2.7</value>
 </variable>

 dynamic x = 2.7;

 <variable name="x">
  <value type="double">2.7</value>
 </variable>

{which has neither of those problems. BTW, what's wrong with looking it up in the language definition?}

Better to verify; Microsoft has made mistakes in their writing.

Again, that's akin to the heated "thin table versus nulls" debate in table design. I see no reason to rekindle that here.
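
As a rough illustration of why this resembles the "wide table versus skinny table" question, the two variable layouts shown above can be written as plain JavaScript objects and converted mechanically. This is only a sketch: the names ''wide'' and ''toNested'' are hypothetical, and it assumes that a dynamic tag of "N/A" corresponds to a value whose type equals the declared type, as in the nested example for "double x = 2.7".

```javascript
// The "wide" layout: one flat record per variable, with both tags as attributes.
const wide = [
  { name: "x", staticTag: "object", dynamicTag: "double", value: "2.7" }, // object x = 2.7;
  { name: "x", staticTag: "object", dynamicTag: "string", value: "2.7" }, // object x = "2.7";
  { name: "x", staticTag: "double", dynamicTag: "N/A", value: "2.7" },    // double x = 2.7;
];

// Convert a wide record into the nested variable/value layout.
// Assumption: "N/A" means the value's type is the same as the declared type.
function toNested(v) {
  const valueType = v.dynamicTag === "N/A" ? v.staticTag : v.dynamicTag;
  return { name: v.name, type: v.staticTag, value: { type: valueType, text: v.value } };
}

const nested = wide.map(toNested);
console.log(JSON.stringify(nested, null, 1));
```

Under these assumptions both layouts carry the same information for the three examples; they differ in where the "not applicable" case is stored.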

{If there are any differences between what an implementation does and what the language definition says it does, it's the implementation that's wrong. This has nothing to do with the "thin tables versus nulls" debate where you wish to complicate things in the name of simplicity.}

That's a false statement. Both ends can make mistakes. If you want to claim X is "simpler" than Y, then please be clear how you are measuring.

{Sure both ends can make mistakes, but when you correct a mistake in the language definition, you get a different language. That doesn't happen when you correct mistakes at the implementation end. }

"Different language" is relative. Every minor bug fix could technically result in a "different language". 

{No, different language isn't relative. It simply means that there is a difference between the languages. And yes, every correction to the language definition could result in a different language.}

This appears to be a LaynesLaw loop over what constitutes a "language": the "official" documentation or the interpreter EXE or both. It's probably a pointless debate path and will probably boil down to WhatIsIntent and/or EverythingIsRelative since one can "declare" one or the other the official determination standard but which won't change day-to-day issues for app developers either way, becoming a UselessTruth from their perspective because they just want to finish and ship their damned project regardless of what part of the tool stack is declared "official".

But it's mostly moot because one should check their interpretation of the "official" document even if the document was deemed technically "perfect". If you are comfortable with your own reliability in reading and interpreting of such a document, then the tag model is probably not for you: it's for people like me who find the documentation of "types" vague or contradictory. If it's because you are an elite mind and I'm a big dummy head, so be it. Us dummy-heads want docs useful to ''us'' also. So put the big-ass gold star sticker on ''your'' forehead and get out of ''our'' way, and stick your PersonalChoiceElevatedToMoralImperative where gold stars don't shine.

{As far as I can tell, the tag model is only for Top, since he won't share.}

I tried.

----

Re: ''Semantics are not "in the head".''

Oh really. 

''Really.  Syntax and grammar are about formation of correct sentences in a language.  Semantics are about what the language does to the machine, not what it means to us.''

That's an algorithm, not semantics.

''No, algorithms are the precise steps used to implement semantics.  You appear to be confusing the academic field called "Semantics" with what ComputerScience calls "semantics".  Syntax is about how a computer language is written, semantics are about what a computer language does to the machine.  Algorithms are how semantics are implemented.''

So you are admitting you are using overloaded words.

''Not at all.  Where did I admit that?  In general, "semantics" is about meaning whether we're talking about the academic field or ComputerScience.  However, in ComputerScience, what language elements "mean" is defined in terms of what they do to the machine.  Thus, they're not "in the head" but "in the machine."''

Then you are defining a language by implementation, which ideally shouldn't be the case.

{Why would you think that? (That we've defined a language by implementation.) }

"...what language elements "mean" is defined in terms of what they do to the machine"

Ideally, they are defined in terms of I/O, not processing. Otherwise, you are dictating implementation without a reason.

{Ah. You think that any restriction on what the machine does is "defining a language by implementation". In that case, yes we are, but so what?}

''Indeed.  Furthermore, by "what they do to the machine", I mean that what a given language statement does, i.e., its semantics, are observable in terms of state changes or other actions in a machine.  However, this does not mean that state changes or other actions in a machine are the sole or even a significant determinant of how we design programming languages.''

We are again in a LaynesLaw loop over "does". Again, teaching programmers about how interpreters "actually work" is not my main goal. I'm building an I/O forecaster model with simplicity as the primary goal over implementation mirroring (per rank chart). If you wish to focus on implementation, that's fine, but that is not my main focus for reasons already given. I'll give them the Newtonian Model which they can absorb in a few hours instead of the more accurate but more involved Einstein model that may take months to absorb. (Besides, you appear to be using the tag model also, but just label it differently and wrap it in fuzzy wording.)

''No, we're not "using the tag model".  We're describing what programming languages do.  Your "tag model" appears to be trying to do the same, but you label it with a PrivateLanguage and leave out significant parts.''

"Do" is not defined in terms of programming languages. If you mean leaving out static languages, that's not a flaw but a trade-off decision. The equivalent XML data structure has about 3 times as many parts. All else being equal, less parts is better than more parts. I chose to narrow the scope rather than adapt a more complex structure.

''The "leave out" I'm thinking of is expressions.''

When we "solve" it for variables in both models, I'll revisit that.

* For the time being, let's stick with the working assumption that "a=34;print(a);" is the same as "print(34);" for small experiments. If it makes a diff, we'll revisit it later. We have enough thread-messes on our plate already.

''What does that mean?  Expressions -- which include variable references -- evaluate to values, which are covered in the description at the top of TypeSystemCategoriesInImperativeLanguages.''

''What do you mean by, "'Do' is not defined in terms of programming languages"?''

How is do-ness measured? If you mean observable input and output, then the tag model does the same.

''The debate over "do" came about as a result of discussing whether "semantics" is "in the head" or (presumably) externally observable, not from debate over the "tag model", but if "do-ness" is "observable input and output" (which is certainly an aspect of "do-ness") then it's clearly not just "in the head".''

By "observable state changes" do you also mean X-raying RAM during runtime?

''You could do that with a debugger -- which would give you output (at least) in places where it might not be explicitly specified -- but you can also do it by simply examining the effects of statements.  E.g., what does this statement do to this variable?  To the screen display?  To the printer?  Etc.''

Screen and printer? Isn't that called "output"? So how is do-ness materially (objectively) different from output-ness (I/O)?

{I see you've ignored the first one.}

I'm not sure I'd consider a debugger "official" output because often what you see is shaped by implementation. It's a courtesy view. If a different vendor re-implements the language, what you see or don't see in the debugger may be different even if the language's usual output is the same. If vendor B's debugger showed different stuff than vendor A's, that alone wouldn't be a reason to call B's interpreter/debugger "broken" or "wrong". Anyhow, back to screens and printers for now. Please finish your answer.

{I would have thought it was obvious, but ok. What shows up on the screen and printer would indeed be considered output. What happens to a variable would not. Since what happens to a variable is part of "do-ness" but not "output-ness", there is, objectively, a difference between "do-ness" and "output-ness".}

I focus on output-ness, not do-ness. Ideally a language should be defined by its "interface" to the world, not its implementation. Commodore greatly simplified the hardware of their C-64 machine throughout the 80's in order to make them ever cheaper, but they are all considered C-64's, and with occasional relatively small exceptions, were considered the "same model" of computer. Similarly, a programming language interpreter may be re-worked for efficiency or to trade space for speed or the like.

{Making up fictional anecdotes does not help your case any. The only non-cosmetic, significant change made to the C-64 was done to reduce power consumption. Most of the cost savings came from the general downward trend in costs of parts in that time period. But back to the matter at hand. You asked for an objective difference between "do-ness" and "output-ness". It doesn't matter one bit what you focus on. If there's a difference, there's a difference. Furthermore, the purpose of programming languages is to tell the computer what to do. Output is only a small part of that. If memory serves, there are even some languages that don't have output.}

A language with no output has no use. The purpose of a programming language is to serve humans, not computers. Are you Ceylon or something? That would explain your attitude. If stable control of RAM is your goal, then you'd use assembly or the like.

Regarding C64 changes, "...the original 1982 board had about 40 chips on it while the final 1992 board had only about 15."
http://www.commodore.ca/products/c64/commodore_64.htm

{Sure there are programming languages with no output; such languages rely on the environment to communicate rather than providing output themselves. Yes, the Commodore 64 reduced the chip count. They were able to do so by making each chip more complex.}

Example of using the environment? Programmers are going to want some kind of representation of output when testing anyhow.

{SQL is a language that uses the environment to communicate.}

And that means that different database I/O API's create their own artifacts or have their own oddities, which has drawbacks and can create inconsistencies across them. Still, one can select a representative API or two and use that as an I/O testing reference. Thus, one could say, "Based on Oracle 10g SQL and ODBC used with C, here are the results of...".

{Yes, but it's still a language without output that isn't useless.}

Essentially it needs extra parts to be a complete tool. It's comparable to a car engine without wheels (at least), and the choice of wheels does affect some of the resulting "output" characteristics.

{And it's still a language without output that isn't useless.}

Anymore than an engine is "useless" per se. It just needs some way to make contact with/on the outside world for its use to be felt.

{That's true enough, and there's an advantage to doing it that way. By not including the wheels with the engine, the engine can be used to drive a car by attaching it to some wheels or to provide electrical power by attaching it to a generator. Similarly, by not including output in the language, the environment can do what's appropriate for it instead of having to conform to the language. In conclusion, your statement that languages with no output have no use is clearly false.}

I meant as stated with no explicit extra parts. Empirical testing requires SOMETHING that generates output be included. Otherwise, it's almost like a socket wrench without the end-pieces.

{Claiming something is useless without extra parts is a far different claim than claiming something is useless. Please be more careful in how you say things. So, do you now agree that there are languages that are useful (in the ordinary sense) without output? Do you now agree that there is a difference between "output-ness" and "do-ness"? Do you now agree that semantics of computer languages are about "do-ness"?}

This is getting unnecessarily quibbly. We need to have an "output port" to do sufficient empirical analysis. SQL leaves many "output" issues to drivers and API's, and if we wanted to experiment and compare, we'd have to select a reference output mechanism. In some cases such glue-parts may affect the experiments such that we should call what we are testing "SQL plus output tool X" to be thorough. Our interaction with tools being compared still needs an "output port" of some kind.

{How can a counter-example to the claim under dispute be unnecessarily quibbly? Or was that a warning about what you were about to say?}

"Useful" depends on the context, which doesn't appear to be material to the main discussion that I can see anywhere. If we are comparing two or more models and/or tools, we need some reference "output" to objectively compare with, such as to see if Model X's output is "equivalent to" Language Y's output. A byte stream is probably the simplest and most common and thus I elect it as our de-facto comparison format for this discussion. If you can give a good reason to use some other comparison format, please describe the reasoning behind such. We are not comparing query languages anyhow such that I see no reason to drag SQL's into this and muck things up over it. Debugger's can give us a nice view into some of the guts, but for reasons already given, they should not serve as acceptable reference output. Debugger I/O would '''not be a "safe" source to build a production app around'''; there is no guarantee or expectation of cross-version I/O stability with debuggers, especially if a different vendor implements a debugger and/or the language. I'm not even sure the OSS version of C-sharp has a debugger such that if you rely on debuggers for comparison, the OSS version will be considered 100% different from MS's version since it always produces zilch.

'''If a language/tool comes with output operators or mechanisms out of the box, that's the low-hanging-fruit of the "comparison port". At least that's what I am going to use for my descriptions'''. If you want to select something else, that's fine but I will not recognize it as "official" in my book and ignore it unless a good reason is given. I will agree that "intermediate" I/O-like info is useful in providing ''clues'' to the actual behavior (I/O) of a language, but should not be taken as-is as final information.
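
A minimal sketch of such a "comparison port", assuming a byte stream is the agreed comparison format. The helper ''captureOutput'' and the ''print'' callback are hypothetical names used only for illustration; Node's built-in Buffer stands in for the byte stream.

```javascript
// Collect everything a program "prints" into a single byte stream.
function captureOutput(program) {
  const chunks = [];
  program((text) => chunks.push(String(text) + "\n"));
  return Buffer.from(chunks.join(""), "utf8");
}

// What the actual language produces through its own output operator.
const languageRun = captureOutput((print) => {
  const a = 123;
  const b = "123";
  print(a + a);
  print(b + b);
});

// What some candidate model forecasts for the same program.
const modelPrediction = captureOutput((print) => {
  print("246");
  print("123123");
});

// The model "fits" this experiment if the two byte streams are identical.
const equivalent = languageRun.equals(modelPrediction);
console.log(equivalent); // true
```

Nothing here inspects intermediate state; only the bytes that reach the output port are compared.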

{This section of the discussion is about the semantics of programming languages. You made a claim that semantics were about "output-ness". SQL is a perfectly good counter example. This came about because I wanted to know the semantics of the semi-abstract machine language that is part of your model. You claimed that semantics are "in the head" as an excuse not to answer it. In response we told you that semantics were about "do-ness". Now I don't really care what you use to describe the output. But I do need to know how the statements in your semi-abstract machine language interact with each other, the input, and the output (however you define it). Otherwise, I can't make any use of your model.}

I believe the problem is that you've been "in the guts" of languages for so long, working on compilers etc., that you cannot bring yourself to consider them a black-box from a scientist's perspective. I use output-ness as the reference standard for testing for practical reasons: app developers generally think of "the language" in terms of the I/O, not in terms of actual implementation. They don't care if the interpreters are implemented via caffeinated gerbils on Tinker Toy treadmills, as long as the I/O is as expected, and in theory it could. My model may ignore actual implementation, but as long as it provides forecasting ability, it does its stated job. It's somewhat comparable to math regression and epicycles (done right) in that it makes no claim to mirror the underlying mechanism: it only fits curves. I don't know what you call "semantics"; I cannot read your mind. I try my best to explain the model, and if it fails for you, then I'm currently stumped. You are probably not the best specimen anyhow due to your "guts exposure" per above.

{I don't work on compilers, so that can't be it. I've never met an app developer who thinks of language in terms of I/O. I'm not talking about actual implementation either. What I mean by "semantics" is the usual definition of the term for programming languages. I.e. what the language requires the computer to do. For example, "x = 10" in many computer languages would require that the computer take value "10" and store it in variable "x". That is what is meant by semantics, and that is one of the things necessary to "run" your semi-abstract machine language.}

My model (as given) uses an XML representation of a variable and the examples precisely show where the hypothetical interpreter (candidate models of ops) looks and/or changes with arrows pointing to the very specific corresponding elements of that XML representation. I don't know how to make it any more explicit than that on a wiki. (The specific steps a hypothetical interpreter(s) takes for a given operator depends on the specific language being modeled. I give suggestions based on typical/common patterns found in the wild. I agree these suggestions ideally need better cataloging and regimentation, but that shouldn't be a show-stopper to seeing the general usage of the model.) -t

{Yes, we've seen the XML representation of a variable. But an XML representation of a variable is hardly a semi-abstract machine language. Where's the rest of it? You only have one example I could find with arrows, and it only gives one rule. There you say it looks for quotes in the source language. You later said that quotes weren't necessarily important, that it would depend on the source language. How can I tell if it does? (And before you say "experiment" keep in mind that I need to know this to set up the experiment in the first place.)}

That was a specific sample language in which quotes did matter. One does experiments to see if quotes are important (affect results) in a given lang/op. That's Science 101, I shouldn't have to re-state such.

{Show me the experiment you would use to show that the quotes matter.}

Observation 1 in TypeTagDifferenceDiscussion is a simple one. There are more involved ones, but they are language-dependent. Here's a JavaScript example:

 // Example quote03, numbering for reference only
 1. a = 123;
 2. b = "123";
 3. alert(a + a);  // result: 246
 4. alert(b + b);  // result: 123123
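
The same experiment can be written as a self-contained script that also runs the two probes in swapped order. This is only a sketch: console.log replaces alert so it runs outside a browser, and the names ''probes'' and ''run'' are hypothetical.

```javascript
const a = 123;    // literal written without quotes
const b = "123";  // literal written with quotes

// Each probe applies "+" to one of the variables.
const probes = {
  "a + a": () => a + a,
  "b + b": () => b + b,
};

// Run the probes in the given order and record what each would display.
function run(order) {
  const results = {};
  for (const label of order) {
    results[label] = String(probes[label]());
  }
  return results;
}

const forward = run(["a + a", "b + b"]);
const swapped = run(["b + b", "a + a"]);

console.log(forward["a + a"], forward["b + b"]); // 246 123123
console.log(swapped["a + a"], swapped["b + b"]); // 246 123123
```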
 
{Your experiment to determine that quotes matter appears to be the same as your experiment to determine if the language has tags. Why the different conclusions?}

What's different? The tag model can "explain" how the quotes affect the results. "123" applied with quotes makes variables behave different than those with 123 applied without quotes, per experiments (not all of which are shown here). This behavior can be modeled in the tag model by having the quotes "set" (affect) the type tag, which then later affects how "+" behaves. The "quoteness" in statements 1 and 2 appear to affect the state/behavior of the variables that carry over to statements 3 and 4. (We could swap lines 3 and 4 to verify the ordering doesn't matter.) Thus, whatever model is used should have/show a mechanism to "save" this state (quote-ness) with or associated with the variable. I choose an XML representation that uses a "tag=" attribute to explicitly carry this state along with a given variable (a data structure that represents the state of a variable).

Keep in mind it's not the only way to model this phenomenon, but it usually works and it's relatively simple.
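
One way to sketch this tag-carrying behavior, assuming a toy language with only number and string literals. The function names (''assignLiteral'', ''plus'', ''toXml'') are hypothetical, and the "+" rule is one candidate guess at an operator's behavior, not a description of any real interpreter.

```javascript
// Assignment from a literal: quotes in the source text set the tag.
function assignLiteral(name, sourceText) {
  const quoted = /^"(.*)"$/s.exec(sourceText);
  if (quoted) {
    return { name, tag: "string", value: quoted[1] };
  }
  return { name, tag: "number", value: sourceText };
}

// Candidate "+": examines only the tags to choose between adding and concatenating.
function plus(x, y) {
  if (x.tag === "number" && y.tag === "number") {
    return { tag: "number", value: String(Number(x.value) + Number(y.value)) };
  }
  return { tag: "string", value: x.value + y.value };
}

// Render a variable in the XML form used on this page.
function toXml(v) {
  return `<variable name="${v.name}" tag="${v.tag}" value="${v.value}"/>`;
}

const a = assignLiteral("a", '123');
const b = assignLiteral("b", '"123"');
console.log(toXml(a));         // <variable name="a" tag="number" value="123"/>
console.log(toXml(b));         // <variable name="b" tag="string" value="123"/>
console.log(plus(a, a).value); // 246
console.log(plus(b, b).value); // 123123
```

Both variables hold the same value characters; only the tag differs, and that difference is what carries the "quote-ness" from the assignments to the later "+".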

{What a round-about way to say that values have types.}

"Have" is a vague word. I model the have-ness explicitly. There is an explicit data structure to illustrate this have-ness and one can see the data structure explicitly change its type as we run through a hypothetical interpreter. It can also illustrate how parse-based "typing" '''ignores the tag''' (either because of a given operator's implementation, or because the language has no tags). Your verbal approach does not clearly distinguish between these two "kinds" of typing techniques, and that is a big failing of it. 

And in '''colloquial "type"''' discussions, something that can be parsed (interpreted) as a given type is often said to "be associated with" and/or "is" that type such that the colloquial approach also fails to distinguish between them. (Typical implementations of isNumeric() are an example.) '''Parse-based typing does not exclude "associated with" (have) and thus associated-with applies to both typing approaches'''. I'm looking for a model that makes the distinction clear as night and day. You seem to value fitting existing spoken language usage ABOVE clarity of this point, and that is a big mistake in terms of having a clear model. Parse-based typing does not change state in my model because parse-based typing does not use state (or at least acts as if it doesn't). --top
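
The distinction can be sketched with two checks on the same variable record: one that reads only the tag and one that reads only the value. Both functions are hypothetical illustrations; the regular expression is a simplified guess at what "looks numeric" means and is not taken from any particular language's isNumeric().

```javascript
// Tag-based check: looks at the tag and never at the value.
function hasNumberTag(variable) {
  return variable.tag === "number";
}

// Parse-based check: looks at the value and never at the tag.
function isNumeric(variable) {
  return /^\s*[+-]?(\d+\.?\d*|\.\d+)\s*$/.test(variable.value);
}

const quoted = { tag: "string", value: "123" };   // e.g. assigned from "123"
const unquoted = { tag: "number", value: "123" }; // e.g. assigned from 123
const tagless = { value: "123" };                 // a language with no tags at all
const word = { tag: "string", value: "abc" };

console.log(hasNumberTag(quoted), isNumeric(quoted));     // false true
console.log(hasNumberTag(unquoted), isNumeric(unquoted)); // true true
console.log(isNumeric(tagless));                          // true
console.log(isNumeric(word));                             // false
```

The first record is where the two techniques disagree, which is the case an experiment would need in order to tell them apart.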

{How can it make the distinction clear when you can use parsing to explain the above code as well?}

That's why "has" is not good enough. It doesn't distinguish between parsing and non-parsing. Remember, this is in response to "What a round-about way to say that values have types".

{How does your model distinguish between parsing and non-parsing? The code above can be explained by setting your tag. It can also be explained by parsing. (Our model finds the distinction between parsing and non-parsing irrelevant. So why bother explaining it?)}

How can it be explained by parsing? (Note I am talking about the processing of "+", not assignment statements 1 and 2.)

{Line one sets a to "123". Line two sets b to ""123"". Line three parses the value of a and sees only digits. + therefore adds the values numerically to return "246" and alert outputs "246". Line four parses the value of b and sees a non-digit. + therefore concatenates what's inside to return ""123123"" and alert outputs "123123". See, just parsing.}

Are you saying the variable "keeps" the quotes along with the value (digits)? That's indeed one way to model it, but creates a lot of confusion, especially with embedded quotes. Plus, it can be argued that "keeping" the quotes is just '''another form of tagging'''. Further, it doesn't work so well for Boolean values and other types since there may not be an equivalent to quotes for them. For example, 'd=date("12/31/2013");' may be the way to generate variables having the explicit type of "date" in some langs. And, you have to do "quote diddling" in your model when you concatenate strings. I find that a clearly-separated "tag" makes modeling smoother. 

{Yes, the string literals in that code snippet are stored exactly as they appear in the source. Yes, you could argue that it's another form of tagging, but that's the whole point. Your model can't differentiate between parsing and tagging since any piece of code can be explained either way. Since we can encode any value into a string, there is absolutely no problem at all handling booleans, dates, or even complex structures in a similar manner. Yes I do "quote diddling", but that's just something + has to parse the values for. It's not otherwise special.}

Please explain "can't differentiate". You seem to be viewing it all wrong. (There are specific cases where the result is the same either way, but then it doesn't matter which path you choose to keep.) And yes, I already agreed one can model other types in a similar way, but it's essentially an ugly form of tagging, almost like old-style BASIC's type markers for variables. In fact, BASIC did it better than you because BASIC only needed one character and it's always in the same place.

{What I mean by "can't differentiate" is that any combination of source code, input, and output can be explained by using tags and it can be explained by using parsing. There is no way, using just source code, input, and output to tell if it's one or the other. (Note: It's not just specific cases, it's every case.)}

No, not unless you go to a different model with a different vocab and conventions to force it one way or another, cherry-picking the model per op.

{Nope, using what little of the model that you've been willing to articulate so far and the exact same vocab.}

Please demonstrate. I don't see it. Specifically, how can line 3 and 4 produce different results if the tag is not inspected by the interpreter?

{The values are parsed. a was set to "123" and b was set to ""123"". Since a contained only digits, + used numeric addition. Since b contained something other than digits (in particular the first and fifth characters are '"'s), + used concatenation.}

If you put the tag ''inside'' the value, then yes you have to "parse" to get at it. But that's just silly word-game playing. My model doesn't put it inside the value, which arguably is a value plus a tag and not just a "value" anyhow. I'd challenge calling it just a value if you did that. It's a value with a tag(s) embedded.

{No. It's just a string value. There's nothing special about the '"'s. In fact, this language has a concatenate function that does just what it sounds like it does, without parsing. In this case, alert(concatenate(a, a)) would display "123123". alert(concatenate(b, b)) would display "123""123". You could also do alert(a + concatenate(a, a)) and it would display "123246".}
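
For comparison, here is a rough Python sketch of the parse-only toy language described in the braces above, where the quotes are stored inside the value and "+" decides what to do by scanning the characters. The helper names are hypothetical, and two details are assumptions rather than anything stated in the discussion: that "+" re-wraps a concatenated result in quotes, and that alert drops one outer pair of quotes before displaying.

```python
def is_all_digits(s: str) -> bool:
    """True when the stored value consists only of ASCII digits."""
    return s != "" and all(c in "0123456789" for c in s)


def strip_outer_quotes(s: str) -> str:
    """Remove one surrounding pair of double quotes, if present."""
    if len(s) >= 2 and s[0] == '"' and s[-1] == '"':
        return s[1:-1]
    return s


def plus(x: str, y: str) -> str:
    """Parse-based "+": add when both values are all digits, else concatenate."""
    if is_all_digits(x) and is_all_digits(y):
        return str(int(x) + int(y))
    # Assumption: concatenate "what's inside" and keep the result quoted.
    return '"' + strip_outer_quotes(x) + strip_outer_quotes(y) + '"'


def concatenate(x: str, y: str) -> str:
    """Joins the raw stored values without any parsing."""
    return x + y


def alert(v: str) -> str:
    """Returns the text that would be displayed (assumed: outer quotes dropped)."""
    return strip_outer_quotes(v)


a = '123'     # stored exactly as written in the source: no quotes
b = '"123"'   # stored exactly as written in the source: quotes included
print(alert(plus(a, a)))                  # 246
print(alert(plus(b, b)))                  # 123123
print(alert(plus(a, concatenate(a, a))))  # 123246
print(repr(concatenate(b, b)))            # '"123""123"'
```

Under these assumptions the sketch gives the same output for the four statements as a tag-based reading would, which is the point being argued in the braces.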

What is "this language"?

{The one I'm using to show that your model can't differentiate between languages that parse vs. languages that use tags.}

Do you mean actual implementation? Actual doesn't matter; the primary purpose of my model is NOT about modelling actual implementation. I ranked the priorities in a list. Did you forget the list? A programmer is not going to know by I/O whether a given language actually puts the type marker(s) inside the value or not. I would like to point out that your toy language still has '''two different ways''' to "calculate" types: search for the type marker(s), or ignore the marker and look only at the value's characteristics. The first would be used for typeName-like ops, and the second for isTypeX-like ops, for example.

{No, I mean in your model. In your model, I can explain any combination of input, source, and output in both ways. (And it wouldn't matter if my toy language had ten million different ways to calculate types.)}

I don't believe you. Show it.

{Just scroll up a little bit.}

MY model does NOT shove the quotes up the value. And for the sake of argument if we got drunk off our asses and did it that way, one can still see it's a very different process to scan for the type markers versus analyze only the value bytes, ignoring the type markers, per diff op modelling.

{As presented, it doesn't violate any of the rules of your model to "shove the quotes up the value". Stuff like this is why I asked you about restrictions when translating to your semi-abstract machine language. You refused to answer. And yes, it's a different process, but that's an implementation detail you wanted to ignore. You only want to use the source, input, and output to differentiate between "parsed" and "tagged". Since I can map any combination of source, input, and output to both the tagged and the parsed models, you simply can't differentiate between them that way.}

None of the many examples do it, yet you go right ahead and drive off the road and into a creek. What keeps somebody from pulling the same trick in your model? And my use of "parsed" versus "tagged" was within the model, not general.

{So what? They're just examples, and can only tell you about that particular combination of source, input, output, and mapping to your model. The thing that prevents them from pulling the same trick in our model is that there are rules against it.}

What rules? If you had good, clear rules, I would have stolen them already for the tag model.

{The ones that say "Every value has a type..." and "Every value is represented by...". You check the language definition and see which one the language says it does.}

"Has a type" is vague for reasons already given multiple times. And I doubt the "language definition" tells you that quotes are "kept with" the value in most languages we are interested in. And even IF they were, that doesn't mean we should necessarily model the language that way unless it has a clear advantage in the deviation from the norm.

''If you're going to allege that "has a type" is vague, you need to provide evidence that it's vague.  I find it difficult to believe that it's vague, given that "value has a type" and "variable has a type" are familiar descriptive phrases used in both technical documents and formal treatises with no apparent confusion.  Without some compelling evidence to the contrary, it's simplest to assume that rather than being vague in general, you simply don't understand it.  I.e., the problem is yours and yours alone -- or perhaps one shared with or only found in very poor programmers -- rather than a characteristic of the phrase "has a type".  If it's a misunderstanding among very poor programmers, then I doubt your "tag model" is going to help, but if you have evidence otherwise -- like you've tried your model on programmers in an experimental setting to observe their reactions (you know, that "science" stuff) -- then I look forward to reading about it.''

See above near "colloquial". The apparently contradictory responses between typeName()-like functions and isTypeX()-like functions, for example, are not fully addressed in colloquial-land; they are papered over. I want a model that makes the distinction as clear as possible. There are definitely (at least) two kinds of "type" detection processes in dynamic languages, regardless of what we call them or how we model them.

{What apparently contradictory responses between typeName()-like functions and isTypeX()-like functions? If you're talking about ColdFusion's cfArgument, then we did explain it without contradiction. So regardless of how contradictory it appears to you, it's not. As for typeName() and isTypeX(), the language defines what they do in terms of the type system used by the language. How is that not fully addressed (it can't be any more specific without being more specific about the language(s) in question.) or papered over?}

No, I'm thinking more like Php's getType() versus is_numeric():
  // Example Php04
  $a = "123";
  print(getType($a));  // result: string
  print(is_numeric($a));  // result: true (print displays it as 1)
Thus, in Php, ''a'' could be said to be "string" and "numeric" at the same time. And you are right, the behavior is "per language", but we can model such behavior using the "tag modeling kit" for a good many dynamic languages. 

A curious programmer may ask, "how can it be both at the same time?" The answer, using the tag model, is that getType looks at only the tag, while is_numeric looks at the characteristics of the value, not of the tag, and the value "can be interpreted as" a number (based on parsing the value) because it's all digits. 

There is only one IS-A because there is only one tag "slot" in the model. There can be many "be interpreted as" because a given set of characters can successfully be interpreted as different "types". (Although Php is inconsistent in that some isX functions only look at the tag, and programmers have complained about this inconsistency.)
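
The getType/is_numeric explanation above can be sketched in Python as two operators that read different "compartments" of the same variable record. This is a hedged illustration of the tag model only: Var, get_type, and is_numeric here are hypothetical model functions, and the numeric pattern is a deliberate simplification, not PHP's actual is_numeric rules.

```python
import re
from dataclasses import dataclass


@dataclass
class Var:
    """Hypothetical stand-in for <var tag="..." value="..."/>."""
    tag: str    # the single type-tag "slot"
    value: str  # the raw characters of the value


def get_type(v: Var) -> str:
    """Modelled as reading only the tag; the value is never examined."""
    return v.tag


# Simplified notion of "can be interpreted as a number" (an assumption):
# optional sign, digits, optional decimal part.
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_numeric(v: Var) -> bool:
    """Modelled as parsing only the value; the tag is never examined."""
    return _NUMERIC.fullmatch(v.value) is not None


a = Var(tag="string", value="123")
print(get_type(a))    # string
print(is_numeric(a))  # True
```

In this reading there is no contradiction to explain: the two answers come from two different attributes of the same record.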

{Or, they could read the language definition. They would then see that the value "123" has a type of string, and getType() returns that. They would also see that is_numeric() returns true if and only if the value passed in has a numeric type (as returned by getType()) or it's a string that can be converted to a numeric type. Voilà, we've simplified things so that there's no need for your tags at all. The fact that it's simpler this way becomes especially clear once you include in your description how PHP decides how to set your tag, something you've been leaving out.}

You've given no explanation/model as to why getType() returns what it does or how long it does it. Also note it's simpler to model is_numeric as always parsing because it's one step instead of up to two. Granted, under the hood the interpreter may check as a speed short-cut, but it's otherwise unnecessary for a prediction model. Occam. And "see that it has a type of string" implies a simple relationship. IS-A/HAS-A ain't good enough as we can see because is_numeric is also "asking" what it "is", and gives a DIFFERENT answer. Is has-ness different than is-ness?????? I'm assuming they are the same and so '''we've got two is-a's going on''' giving different results. It's just sloppy fuzzy notiony words with contradictions not explained.  Why do you have such an attachment to fuzzy language? It may "mean" something clear in YOUR head, but I only see overlap and fuzz. Bill Clinton was right about one thing: "is" is a fuzzy word. It's much, much clearer and cleaner to me if we define the var as a data structure with two clearly separate "compartments" (XML attributes) and model operators as reading one or the other compartment (depending on best fit of results). And replacing "getType" for the tag attribute doesn't simplify anything; in fact it makes it worse because you are not modelling how getType "works", it's just a function floating around in space that does magic. An attribute is a simpler part than a function. The tag model is far more "mechanical" and visual and we can '''step thru it like clock-work''': tick tick, look at attribute X, tick tick, look at attribute Y, tick tick, etc. Maybe I'm just fucking '''language-blind''', give me a damned visual. I don't "get" your overlapping is-a/has-a shit and I'm fucking giving up. You are Lewis Carroll reincarnated on LSD, which that fucker needed like a hole in the head. If Lewis Carroll and Dr. Seuss had a bastard baby and fed it LSD milk, it would grow up sounding much like you: The is-a has-a was-a fizz-a, typing madly until it is-a, but tell the mothah' it was-a fuzzah, tazing and typing and wiping and swiping tags with names of bags that have no links to values that blinks and shrinks the kinks until it is-a link to a type of hype you cannot wipe until it becomes smelly tripe.

http://laughingsquid.com/wp-content/uploads/fifty_percent__harvey_dent_by_drfaustusau-d4l1k0l-640x449.jpg 
'''
    What type am I? Asked this guy. 
    Am I number, Am I string, 
    or am I something in between? 
    Is it what I am, or what they see, 
    inside my guts, or their view of me?
    Must I be one, or can I be both?
    Or is duality, something to loath?
'''

''It's not fuzzy to us.''

''Nice rant, by the way.''

{I did too tell you why getType() returns what it does. It's defined to return the type associated with the value. In this particular case, the value is of type string, so getType() returns string. I didn't tell you how, but that's an implementation detail. Something you've repeatedly stated you wish to ignore. (I don't know why you care about how long it takes). Yes, if you read the language definition for PHP, you would find that there is a single type associated with every value. Yes, is_numeric() doesn't return true just for those types that are defined, by the language definition, to be numeric. It's defined to also return true for certain string values as well. Occam says to cut the tags entirely, since you need to know the types associated with the values to set your tags up in the first place. (I.e. the step you keep sweeping under the rug in order to get your model to appear about as simple as ours), and once you know that, you already have all you need to know what getType() will return without having to use tags. There is absolutely no doubt that "has" is different from "is". That's something you should have learned in early grade school. After that, you appear to have blown a neuron. Take a deep breath, and try to post something coherent next time.}

"Is" is vague. That's something I learned in college. Categories are in the head. "But that's an implementation detail" sweeps a big step under the rug. It's a detail that should be modeled if we want a good model. You are right, I should take a break from "types" here. It's getting very frustrating. It ''is'' frustrating.

''If "that's an implementation detail" is supposed to be part of a model, how do you reconcile your previous claims that your "tag model" is a model and not about implementation?''

We need to "explain" that part one way or another; not just say a function magically does something vague. The explanation can be virtual, as in a model that produces the right output. It has to be clear, not necessarily "real". If one uses epicycles to model planet movement, that's fine, as long as the epicycles are sufficiently described (and predict planets properly).

''The "implementation detail" we're referring to is '''how''' the type is associated with a value or a variable.  It doesn't matter whether it's a tag byte or a type name or a type ID or a pointer to a type definition or the topmost item on the "type" stack at a given point when traversing the abstract syntax tree.  "Associated with" is entirely sufficient, because "x is associated with y" -- as in a variable is associated with a type, or a value is associated with a type -- simply means that given an x we can answer questions about y.  When we say a variable x is associated with a type y, we mean that given variable x we can answer questions about its type y.  Or, given a value x, we can answer questions about its type y.  We don't need to say that x has tag byte y, or x has type name y, or x has type ID y, or x has a pointer to a type definition y, or when we encounter node x when traversing the abstract syntax tree we can find y on the top of the "type" stack, because all of these mean precisely the same thing.  "X is associated with y" gives us all the information we need with no extraneous detail.''

Well, okay, but give the "association" a name, and make it clearly separate (named differently) from OTHER associations or type-association-like processes or artifacts, such as is_numeric()-like results. The best way I've found so far to do this is with XML because it's familiar and has relatively clear rules. If '''English were good enough''' by itself, we'd never need XML and computer languages and logic notation systems. If you want, you can use circle-and-stick graphs, but just label the lines and the nodes so we can write rules and instructions with clear references to the parts. In short, '''avoid anonymous associations''' in models if you want to make sure they are clear to the reader. I called the association in my model the "type tag". It has a name, and there's only one per variable per XML attribute rules. I don't see the one-ness limiter in your model. If somebody is running a "by-hand" interpreter in their head or on paper, then they have a clear choice, model a given op as using the "tag=" attribute or the "value=" attribute from the variable's representation. The choices are clearly distinct and their association with "variable" is clear because of the XML. Note that a stack may be overkill for the intended use.
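
The "clear choice" between the tag= attribute and the value= attribute can be sketched as a small experiment in Python. This is only an illustration under stated assumptions: the XML record layout, the attribute names, and both candidate operator models are hypothetical, and the observed result is taken from the is_numeric example earlier on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical variable record, with the two named compartments.
var = ET.fromstring('<var name="a" tag="string" value="123"/>')


def model_reads_tag(v: ET.Element) -> bool:
    """Candidate model 1: the op answers "is it numeric?" from the tag only."""
    return v.get("tag") == "number"


def model_reads_value(v: ET.Element) -> bool:
    """Candidate model 2: the op answers from the value only (all digits)."""
    return v.get("value").isdigit()


# What the real language reported for the op under test
# (is_numeric on "123" gave true in Example Php04).
observed = True

candidates = [("tag", model_reads_tag), ("value", model_reads_value)]
matching = [name for name, model in candidates if model(var) == observed]
print(matching)  # ['value']
```

Running both candidates against the observed output and keeping the one that fits is the "by-hand interpreter" step; a fuller kit would repeat this over a catalog of experiments per operator.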

* ''What do you mean by "a stack may be overkill for the intended use"?  The use of a stack in such a context is part of a common mechanism for implementing compile-time TypeChecking in StaticallyTyped languages.''

* Yes, but we don't want to model an ''entire'' interpreter if we don't have to. When I process expressions by hand, I don't use a stack. "Playing interpreter" is more "natural" other ways (realizing that "natural" depends on specific WetWare.)

* ''Who said anything about modelling?  I was describing what actual languages actually do.''

* My goal is to produce a simple and clear model that predicts output based on input (source & data) for reasons already stated. I don't know precisely what you mean by "do", but I'm going to stop caring because I've tried too many times and am giving up.

* ''This was a discussion about what "associated with" means, and I pointed out that "associated with" can be implemented -- I'm not talking about modelling here, but runnable interpreters and compilers -- using a whole host of mechanisms including stacks.  You wrote that "a stack may be overkill for the intended use".  I wrote that the intended use "is part of a common mechanism for implementing compile-time TypeChecking in StaticallyTyped languages", as part of my illustration of runnable interpreters and compilers.  What does modelling "an ''entire'' interpreter" have to do with it?''

* I stated my ranked goals in a list somewhere around here and the reasons behind the ranking. Mirroring actual implementation (the running guts) was a relatively low priority.

* ''That doesn't answer my question, at all.  It doesn't even appear to be relevant.  You appear to have misunderstood my original point, but rather than admitting it and writing something like, "Oops, yeah, my 'stack' comment wasn't appropriate," you appear to be trying to defend it.  Why?''

* I am not sure what you are getting at, then. 

* ''I explained what I was getting at in the fourth bullet point above this one.''

* I guess I misinterpreted it. Why bring stacks into the discussion even? Stacks are not an issue nor do they simplify the models of type-related issues.

* ''We were talking about (from above) "'''how''' the type is associated with a value or a variable".  I described various ways that implementations associate types with variables, including how in some language implementations, the association between a variable and its type can be physically quite disconnected.  This was to show the variety of implementation strategies that may underpin an abstraction like "associated with", and to show that the phrase "associated with" is as specific as you can get without, well, lying.  In a system modelled as "variable x is associated with type y", there is very much a connection between x and y (given 'x', we can answer questions about 'y'), but it's implemented so that the only physical connection between x and y is that when we encounter x in the abstract syntax tree, we'll find y at the top of a stack.  I wouldn't '''model''' the system using a stack, but there are languages I would '''implement''' using a stack, because there are benefits to implementing parts of a statically-typed language's compilation-phase type-checker using a stack.''

* See TypesAndAssociations

PageAnchor: Assoc02

ALL associations should have at least these:

* A name/ID that can be referenced in descriptions
** {Totally unnecessary.}
* Quantity limitations: if there can only be one X per Y, then make that clear somewhere.
** {In this case, the language definition will tell you what these are. If our model specifies it, it would restrict unnecessarily.}
* Origin: Where did it come from? What set it that way?
** {The language definition.}
* Scope: How long does it last? If X references Y and Y disappears, does X stay in the model?
** {There's no danger of Y disappearing. Y is static.}
** In a dynamic language?
** ''I know of no language where type definitions can disappear at run-time.  I know of no language where the type of a value changes at run-time.  What makes an imperative programming language dynamically typed is that any value of any type may be assigned to any variable at any time.  The type associated with a value is static.''
** Without a clear definition of "value" and "type", I cannot say I agree or disagree.
** ''A value is the result of evaluating an expression, and consists of a representation (almost invariably a string of bits) and an implicit or explicit type reference.  A type is a set of values and associated operations.''
** That doesn't tell me anything clear or measurable/testable. Everything in the observable universe is "associated" with each other to some extent. And we only have I/O to examine, not internals (if we define a language by its behavior and not implementation), and these I/O operators can choose to filter or alter the internals as they want. To say some thing "is" is misleading or a UselessTruth at best because '''we cannot directly observe its is-ness''', only the transformed versions/views of it. Under the hood it does not even have to use bytes or even binary to represent variables etc. That's purely an arbitrary implementation choice for convenience, speed, fit to existing hardware, familiarity, or habit.
** ''Why would we define a language by its behaviour and not its implementation?  "Under the hood" is what is meant by "representation".  "Associated operations" means the operations are dependent on the set of values, or are only meaningful in the context of those values.''
** Dependency alone does not tell us anything specific.
** ''What do you mean?''
** Dependency only means that changes to X may affect Y, but does not tell us when and how.
** ''When and how are not relevant.  If they were, they'd be part of the definitions.''
** Oh, those wonderful wonderful definitions. Define Tylenol.

These are not too much to ask.

{In general, they are. They might be good questions (outside the first, which is simply a matter of convenience), but none of them are necessary.}

Bull! Mind-only assumptions are arrows in the back of good documentation. And if a specific language has certain limitations, they should be included in the model for that language. It's a given that the model will need to be customized per language (unless we have a Swiss Army Model with way too many parts to switch off or ignore). If you are dead-set believing that your verbal descriptions are "good enough" and won't change your practices, then there is no need to continue here because clarity is a second class citizen to such stubborn personalities.

Re: "Totally unnecessary": even if YOU think it's clear, it does not hurt to make the above explicit.

''It is explicit.  Our model calls it "type" -- short for "type reference" -- as is done in pretty much every language reference, ever.  You call it a "tag", as is done in a handful of descriptions of particular language implementations using very specific implementation approaches.''

I have already described potential overlaps, confusion, and discussions with rank-and-file coders more than twice and I won't repeat them here. Let's LetTheReaderDecide if existing material is clear on this matter or not. If the reader thinks it is, then they have no need for the tag model. Done! 

The tag model has served '''me''' well. I hope others will also find it useful if English-centric approaches are not working for THEM. ThankYou, --Top

''Yes, over time, let's see how many visitors demonstrate preference for your "tag model".  It's clear how many support it now.''

You've done ''no'' reliable survey; anecdote against anecdote.

''How many edits have there been in support of your model?  Aside from the two of us who oppose it, I've seen at least two responses that seemed confused by it.  One thought "tag" meant a data definition, another thought "tag" was the same as C's typedef.''

I have no idea what text you are referring to. And again, the subject is a model/tool for I/O prediction such that what a tag "is" is irrelevant. I have abandoned attempts to define it at this point (for now). I have used XML as a working representation due to its familiarity, but if somebody wants to replace with something else on their own, that's fine. I would note that your model is not based on clear definitions either. (You probably think they are clear, in your head, but you often mistake your head models for universal truths.)

''One of the confused responses -- the one that thought "tag" meant a data definition -- is one you responded to not long ago on ThirtyFourThirtyFour.  The other was a few weeks ago on a different page.  Regardless, it's quite apparent there has been no articulated support for your "tag model" on this Wiki.  Have you tried it on your work colleagues or other developers?  What was their reaction?''

I still don't know what text you are referring to.

''See the text on ThirtyFourThirtyFour that begins, "Well a tag may be a data definition, though I cannot be sure of that from what I am reading, so call it a "Datad". When we work out what it really means, it may become pervasive. It may be data adder, or data addressor, or data administrator, or data dictionary or ...."''

I have yet to test the tag model thoroughly on others, but like I said, few if any have appeared happy about the current state of affairs. In general most don't seem interested in the details and spot-fix any issues and move on. Anyhow, we've been over this popularity contest talk before. Let the fucking readers decide. 

''If in "general most don't seem interested in the details", what makes you think they'll be interested in the details of your tag model?  I'm not clear why you wish to limit it to the "fucking readers", too -- wouldn't the virgins and celibates be equally interested?''

I used to be less curious about such, spot-fixing any odd issues and lathering up on defensive wrapping. Later I decided to poke around some more rather than live with fuzz. Most indeed don't care, but it's nice to have a model and testing kit for those who do.

''Where is your testing kit?''

I already explained it multiple times, but for some reason you don't seem to get it. I don't know where the communication break-down is; I cannot read your mind and your feedback is too vague for me to process.

''On what page is it documented?''

''Are you sure the problem is in understanding TypeSystem categories defined by your model, as opposed to simply developing a better understanding of the language-specific peculiarities of (I presume) PHP, ColdFusion and JavaScript?  In other words, does your model do a better job of explaining PHP peculiarities than the PHP manual?''

Yes!

''Can you give an example of where your tag model explains something that the PHP manual does not?''

See PhpTypeSystemDiscussion.

I'm going to say it's vague and you are going to say it's clear and we are going to re-argue the same points all over again.

''Perhaps you could point out specifically what you feel is vague?  It's relatively easy to identify vagueness by highlighting absent definitions, dangling references or WoodenLanguage, but not so easy to prove clarity as it's inevitably subjective.''

I tried that before, and it just seems to lead to fractal vagueness.

''There are things in ComputerScience that are axiomatic and/or abstract and either have to be taken at face value, or you have to see what the code looks like that implements them.  Your comment reminds me of someone I once met who couldn't come to grips with set theory because given a set like X {a, b, c}, he had to know what a, b, and c were.  If you told him they were apples, he had to know what kind of apples.  If you played along and said they were Mackintosh apples, he wanted to know whether they came from the same tree or different trees.  If you told him they came from the same tree, he had to know where the tree was.  And so on.''

There are some unpleasant people with dysfunctional behaviors you remind me of, but I'll save my mud slinging for a time when I'm pissed instead of just irritated by you.

''Why are you irritated by me?  That's an oddly emotional reaction to what is no more than text on your computer screen.''

Part of the problem appears to be that there are multiple ways to model the behavior (I/O) of a given language, and you wish to limit such models to a traditional standard (or what you believe to be a traditional standard), whereas I'll happily blow up tradition if it gets in the way of stated goals. YOU want to limit models to only Mackintosh apples.

''It's not even a question of "a traditional standard".  What I and others have described is how popular imperative languages are actually constructed in terms of values and variables and their relationship to types.  We have explained not only why the language behaviour or "I/O" is the way it is, we add a bonus of describing how languages are actually built.  How is your model simpler or superior to that?  Can you match the parts of your model with the parts of our description, and show how and where your model improves on it without loss of explanatory power?''

I cannot quite figure out your model because there is too much English and not enough data structures. But anyhow, I place simplicity of the model (for its target purpose) above fitting actual implementation. If epicycles created a simpler model than Newton and it accurately predicted the motion of the planets for a sufficient time-frame, I'd go with epicycles over Newton.

''I've tweaked the descriptions at the top of TypeSystemCategoriesInImperativeLanguages in an attempt to make them more readable.  I've also brought in my descriptions of operator invocation from ThirtyFourThirtyFour.  The "model" hasn't changed, but hopefully the descriptions are clearer.  Again, I'd be curious to see -- if you still feel your model is simpler or superior -- how and where specifically you believe it to be simpler or superior.  In particular, can you match the parts of your model with the parts of our description, and show how and where your model improves on it without loss of explanatory power?''

----
SeptemberThirteen