TypesServeTwoPurposes.

* TypeSafety. By checking types at compile time, certain classes of bugs are excluded, such as inadvertently using an int as a string or calling a non-existent method.
** ''There are many topics that debate this contentious issue. In my opinion it's a matter of trade-offs, and one should use TheRightToolForTheJob.''
* Efficiency. By using optimized representations for specific types, significant performance advantages can be realized. The simplest example is using native ints (32/64 bit, whatever) for strings which have been discovered to be numbers. Lisp does this.

Historically these have been lumped together. Lisp, for example, provides a test for "numberness" which just checks the internally discovered optimization. This allows programmers and optimizers to specialize their code.

''I believe part of this is that our CPU's are biased toward explicit typing, C/C++ in particular. If they had built-in dynamic typing operations, the efficiency advantage may be less.''

{What's "explicit typing"? Do you mean manifest typing? Static typing? In what way are "CPU's" [sic] biased toward "explicit typing"?}

For one, they don't natively process strings, including doing math on numbers stored as strings.

{Some processors do natively process strings, such as the VAX processors. I'm sure there are others. Processors do tend to be oriented toward their primary uses, though, so doing math on numbers stored as strings (which is highly questionable at a processor level) is unlikely to be a priority, especially since the performance gains are likely to be minimal. Determining whether a string is numeric or not is typically O(n) wherever you do it.}

''This seems to be dismissing the "efficiency" claim above. However, O(n) still "matters" if there is a constant associated with actual performance.''

{Why do you think my comment dismisses the "efficiency" claim above? I have not contradicted the fact that using optimised representations for specific types does result in significant performance advantages.} {Your comment regarding O(n) does not make sense to me.}

''When one does BigOh comparisons, one discards constants. However, there may be a big constant influencing the practical application of the above. Maybe it's, say, 5 times slower to do math on string-based numbers. That could have a practical impact if there are a lot of such operations in a given program.''

{You apparently do not understand BigOh. One discards the constant not because it is significant -- i.e., a "big constant" -- but because it is insignificant. What is significant is 'n'; in this case the length of the string.}

But we are not comparing general string processing. We are comparing the processing of "numeric" strings to the processing of dedicated "types" (such as "double"). A typical numeric string will be about 7 characters. For discussion purposes, let's look at the case where the average is 7, and then the case where it is 12. There's no reason why a CPU has to process such strings one character at a time. I realize there may be a DiscontinuitySpike if the string size goes past a certain point, but the frequency of such will vary, or can be mitigated by balancing the various trade-offs. Let's say there's a single machine operation to read two strings of up to 16 characters each from given RAM addresses, do addition, subtraction, division, or multiplication on them, and put the result (up to 16 characters) in a register. (If it's longer than 16 characters, then other techniques would perhaps be used. Division may need a second result register.)
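To make the proposal concrete, here is a minimal software emulation of such a string-arithmetic operation, in C. This is only a sketch of the intended semantics -- the name strnum_add, the error convention, and the use of 64-bit intermediate math are my own assumptions; actual hardware support, if it existed, would presumably do the parsing, carry handling, and formatting in silicon rather than in code like this.

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>

 /* Software emulation of the hypothetical "add two numeric strings"
    operation described above. Real hardware would do this in one
    instruction; this sketch only shows the intended semantics.
    Operands and result are ASCII decimal strings of up to 16 characters. */
 #define STRNUM_MAX 16

 /* Returns 0 on success; -1 if either operand is not numeric or the
    result does not fit in STRNUM_MAX characters. */
 int strnum_add(const char *a, const char *b, char result[STRNUM_MAX + 1])
 {
     char *end;
     long long x, y;

     if (strlen(a) > STRNUM_MAX || strlen(b) > STRNUM_MAX)
         return -1;
     x = strtoll(a, &end, 10);
     if (end == a || *end != '\0') return -1;   /* e.g. "123456a" is rejected */
     y = strtoll(b, &end, 10);
     if (end == b || *end != '\0') return -1;

     if (snprintf(result, STRNUM_MAX + 1, "%lld", x + y) > STRNUM_MAX)
         return -1;                             /* result exceeds 16 characters */
     return 0;
 }

 int main(void)
 {
     char result[STRNUM_MAX + 1];
     if (strnum_add("1234567", "7654321", result) == 0)
         printf("%s\n", result);                /* prints 8888888 */
     return 0;
 }

Subtraction, multiplication, and division would follow the same pattern; the point of the hypothetical instruction is that the whole parse/compute/format round trip would happen inside the CPU.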
{Determination of whether a given string is a number or not is essentially one character at a time. For example, to determine that "1234567" is numeric but "123456a" is not requires iterating over seven characters. Some clever optimisations may be possible, but these will still require n iterations, where n is proportional to the length of the string.}

* No. If we assume a 64-bit CPU, then eight bytes and thus eight digits can be read from memory at the same time (assume ASCII for now). Each byte can be analyzed '''in parallel''' in a register. The CPU does not have to analyze each byte one at a time. Plus, it may be able to do preliminary mathematical processing while analyzing each digit. I'm not an expert in that area (and perhaps there are none because it doesn't get enough academic attention), but I suspect that some of the steps could be combined. (A byte-parallel sketch of this kind of check appears after this thread.) -t
* [That's an implementation optimisation.]
* The sub-topic is "efficiency", is it not?
* [Yes, it is. That does not alter the truth of my statement.]
* Please clarify. Efficiency (machine speed/performance) is mostly about implementation (at least at the scalar variable level). Your statement, "but these will still require n iterations where n is proportional to the length of the string", is wrong when comparing to the alternative. There is no law that says CPUs must process only one byte/digit at a time. Now it may be the case that nobody has found a way YET to do it in parallel, but that's not the same as "cannot be done".
* (If we are looking at "long" numbers, of say 50 digits, then the existing types built into most CPUs cannot handle that natively, period.)
* [Most CPUs are general purpose, representing a variety of compromises. Speciality computation that requires exceptional speed is handled by specialised hardware and/or parallel computation. Demand drives this; there isn't much real (i.e., serious money behind it) demand for dynamic scripting languages running on raw CPU hardware.]
* If they caught on, perhaps we'd see more of such. ArgumentFromPopularity has limits. GUIs used to be considered a "waste of resources"; however, now that they are ubiquitous, special graphics chips often have GUI-friendly operations. The future may laugh at us, saying "they were stuck in the FORTRAN types mentality" back then.
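Regarding the claim above that eight ASCII bytes can be checked '''in parallel''', here is one way to sketch it in plain C using only a handful of whole-register operations (a so-called SWAR bit trick). The function name and the fixed width of eight bytes are illustrative assumptions, not a description of any existing instruction.

 #include <stdint.h>
 #include <stdio.h>
 #include <string.h>

 /* Check whether eight ASCII bytes are all decimal digits ('0'..'9')
    using a few 64-bit register operations instead of a byte-at-a-time
    loop. XORing with 0x30 maps digit bytes to 0x00..0x09; the add/or/mask
    step then flags (in bit 7 of each byte) every byte greater than 9. */
 int is_eight_digits(const char *s)
 {
     uint64_t x, y, flags;
     memcpy(&x, s, 8);                        /* load 8 bytes at once */
     y = x ^ 0x3030303030303030ULL;           /* digits become 0..9   */
     flags = ((y + 0x7676767676767676ULL) | y) & 0x8080808080808080ULL;
     return flags == 0;                       /* no flags => all digits */
 }

 int main(void)
 {
     printf("%d\n", is_eight_digits("12345678"));   /* 1 */
     printf("%d\n", is_eight_digits("1234567a"));   /* 0 */
     return 0;
 }

Wider versions of the same idea are possible with SIMD registers; whether dedicating instructions to numeric strings would be worth the silicon is exactly the trade-off being argued here.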
''BigOh tells you the efficiency ''in the limit''. But this is completely irrelevant here, because all numbers are small and the constant factors surely dominate in this range. Anyway, you seem to be in ViolentAgreement that the CPU could do numeric string processing. Note that x86 still has BCD support, but nobody uses it, therefore it is not optimized -- quite different from the vector operations of limited precision, which are heavily used...''

[Actually, BigOh is quite relevant. The '''overall''' average computational efficiency difference between a scheme that internally represents values as their actual type, vs a scheme that requires string-to-numeric conversions on each operator invocation because all values are strings, is O(1) vs O(n). Whether "all numbers are small" (?!) or not is irrelevant. Optimisations of various sorts may improve the real efficiency, of course. One obvious optimisation is simply to -- at least at the machine level -- internally represent types, where appropriate, as native values rather than represent all values as character strings. High level languages can hide this, and Top's desire for a language in which all types are strings -- or at least can be viewed that way -- is not unreasonable. ToolCommandLanguage does this. (A small sketch contrasting the two cost models appears at the bottom of this page.)]

BigOh notation mostly matters when comparing multiple orders of magnitude of difference. I will agree that "native" types such as doubles will probably always be faster, but a chip that has number-string-friendly operations can greatly reduce the difference. Most CPUs don't, making the difference fairly large. For example, maybe the difference now is 100-to-1 on average, whereas it might be, say, 5-to-1 if the chip had ''native'' support for numeric strings. Thus, my statement that current chips are biased toward certain value models still stands. -t

[The biases are entirely reasonable, but feel free to buy a cheap FPGA board and implement your dynamic-typing-favourable CPU to show how it would be better.]

* Better than what? I'm just pointing out that support for string-oriented numbers COULD be better/faster in commercial CPUs. -t
* Better than existing CPU architectures? Dynamic scripting languages are not considered viable for performance computing, for numerous good reasons. There simply is no market for this. It would be a waste of silicon.
* They are viable for custom and small-market applications. -t

Why are they "reasonable"? I see it as a case of LeftHandedSyndrome. If many languages used the dynamic model, then chip makers might give them more attention. And no, I'm not going to build a fucking CPU in my garage.

[They are reasonable because TypeSafety is generally critical at the OperatingSystem level, even if not critical at various user levels. Building an experimental CPU using an FPGA is not unreasonable. The experimenter boards are cheap -- usually less than the cost of a mediocre laptop -- and it's a popular activity amongst hobbyists and in university-level courses. They're programmed using languages like VHDL.]

Dynamic languages are not intended for SystemsSoftware. Thus, your example is a straw man. And it's not Intel's job to dictate what programming languages people should use. If BrainFsck becomes common, then the chip makers should take that into consideration. They are hardware experts, not SE experts. Anyhow, I don't wish to mix up efficiency issues with language design and software engineering (SE) issues if we don't have to.
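As referenced above, here is a small sketch contrasting the two cost models: native machine values versus values carried as decimal strings and converted on every operator invocation. The function names and the toy workload (summing 1..n) are mine; it is meant only to make the per-operation overhead visible, not to establish any particular ratio.

 #include <stdio.h>
 #include <stdlib.h>

 /* Native representation: each addition is a single O(1) machine operation. */
 long long sum_native(long long n)
 {
     long long total = 0;
     for (long long i = 1; i <= n; i++)
         total += i;
     return total;
 }

 /* "Everything is a string": each addition first parses both operands
    (cost proportional to their digit count) and then formats the result
    back into a string, mimicking per-operator conversion. */
 long long sum_as_strings(long long n)
 {
     char total[32] = "0";
     char operand[32];
     for (long long i = 1; i <= n; i++) {
         snprintf(operand, sizeof operand, "%lld", i);
         long long t = strtoll(total, NULL, 10);       /* parse accumulator */
         long long o = strtoll(operand, NULL, 10);     /* parse operand     */
         snprintf(total, sizeof total, "%lld", t + o); /* format result     */
     }
     return strtoll(total, NULL, 10);
 }

 int main(void)
 {
     long long n = 1000000;
     printf("%lld %lld\n", sum_native(n), sum_as_strings(n));
     return 0;
 }

Timing the two functions shows the constant factor the thread is arguing about; how much of that gap dedicated string-number hardware could close is the open question.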