CeeSyntax is very low-level. Today, in this era of CPUs orders of magnitude faster than the ones in use when C was invented, we could make programming accessible to a lot more people by replacing C syntax with a much higher-level syntax or apply the syntax lessons learned over time. ''Well, Pascal is easy to read (on paper too) and it's blazingly fast (see freepascal compiler speeds versus GNU C speeds). We don't necessarily need to increase the CPU and increase the compilation time in order to dump CeeSyntax.'' See also AlternateHardAndSoftLayers, BetterSyntacticSugar, ObfuscatedCee, ItsNotTimeToDumpCeeSyntax ''Big refactoring done by BenKovitz. I have saved the previous copy if you want to retrieve some of the old text.'' ---- Summary of '''Suggestions''': * Add For-each syntax * Rework switch statements to eliminate need for "break" (see IsBreakStatementArchaic) * Indexing (see ZeroAndOneBasedIndexes) ** Zero-based indexing (which we already have; why was this listed at all?), or, ** Programmer-specified ranges (e.g., int taxes[1997...2007];) (preferred) * Make variable declarations more explicit and less context dependent. ** Put types and modifiers on right side of variable or declaration (like Pascal and ScalaLanguage) instead of left. Readability suggests having MostImportantWordsOnLeft. * Toss SemiColon, like JavaScript (controversial) * Rid need for parenthesis around IF expressions * Eliminate confusion between "=" and "==" (AssignmentVsEqualityOperator) ** Single "=" is a syntax error; for assignment, use ":=" and for equality, "==". * Use English for Boolean operators instead of symbols (And, Not, Or, etc. Could be argued symbols are more international-friendly) * AND (&&) having precidence over OR (||). About 90% of the time one wants ORs to be evaluated first in my observation. * "print" makes code seem paper-centric. Perhaps "show" instead.. * Invert C's syntactic control of indirection ** indirect (i.e. follow pointers) implicitly ** utilize explicit syntax to reference a pointer directly and control exact degree of dereferencing. (I.e. use mutable references instead of pointers, but still allowing structures to store 'flat' values should low-level memory management be required in the language.) Doesn't really apply to CeeSyntax languages that don't have pointers. * Add a "function" keyword to make function declaration more specific. Not controversial. ** ''Please clarify. Already in there.'' * Allow NestedFunctions. * Optional named parameters with defaults, if not required * Fixed-sized (by default) numeric types (arguably not a syntax issue) * TossMathCentrism, at least for some languages ** ''C and C-like languages are "math-centric"? How? Mathematica, Maple, MATLAB, AMPL, maybe APL are clearly "math-centric" languages. C and C-like languages, if anything, are general-purpose languages with nominal (but reasonable) support for infix algebraic expressions.'' ** Math-centric may be relative. Perhaps a better question is whether to support in-fix. However, a language requiring a fixed number of parameters may not have enough alternatives for say large volumes of addition or concatenation. It's a form of WaterbedTheory. Several languages with CeeSyntax have already done much of this. CSharp, Java, JavaScript, all have CeeSyntax, so we need to separate the C complaints out. ''In fact, only "put types and modifiers on right side of variables" seems to be an actual issue with Cee '''syntax''' - the others are just gripes with CeeLanguage. And even that goes away when you use a DynamicTyping language like JavaScript, which most would say has CeeSyntax.'' ---- '''Examples of unnecessarily low-level C syntax''' K&R C does not have a "for each" construct. To iterate all elements in a typical, real-world composite data structure, you have to understand how to construct the loop and dereference the correct number of pointers to pointers. Most C programs use null-terminated strings, or strings whose length must be explicitly specified. ''Does this page object to just some specifics of the syntax of C, or to the syntax of all the languages that use a C-like syntax, including C++ and Java? Java doesn't have this problem with strings, but it has a "C syntax" in a broadly understood sense. Also, I don't think anyone is a fan of C's lack of strings, except for super-low-level applications.'' EeLanguage has a syntax similar to C, but it also has a "for each" construct, built-in strings, and relatively little unnecessary surface complexity. Users of E don't seem to complain about the syntax. So it seems that the main objections on this page are to other aspects of C's design than its syntax. ''With C++, it's reasonably easy to create a macro that uses standard STL iterators. Macros aren't terribly clean, but this is a pretty trivial expansion. Of course the C++ iterator idiom is so ingrained that most programmers always look for for(Foo::iterator i=c.begin();i!=c.end();i++) and wouldn't visually recognize a foreach loop. The worse offense is the lack of easy map/fold syntax.'' C has been described as a "super assembler," meaning, it is as low level as you can go without getting into the specific instruction set of the underlying hardware device. This has a great deal of value for those of us who work with raw machine control and highly specialized and optimized computing tasks. Many of the components of C syntax relate directly to what the processor is doing, which is part of the reason so many good optimizing compilers are available for every microprocessor and microcontroller you can shake a stick at. Why would we want to give this kind of execution control up? {Maybe our chips suck and need a more modernized instruction set, such as native string handling.} [Modern CPUs, and even some pretty old ones (I'm looking at you, PDP-11) support native string handling primitives. See, for example, http://www.plantation-productions.com/Webster/www.artofasm.com/Linux/HTML/StringInstructions.html] ---- Before reading the following (up until hrule), better read ZeroAndOneBasedIndexes. EditHint: can probably be merged. Indexing from 0. This reflects the underlying addressing scheme of the CPU. Intuitively, we number a sequence of elements with ordinal numbers starting at 1. ''In C syntax the index is the offset from the beginning of the array. It has nothing to do with indexing in real life.'' ''If you don't like indexing from 0 you could always... (Warning: Evil)'' main() { int storage[10]; int *my_array = &storage - 1; ... } --BruceIde (No I don't advocate this...) Evil isn't the right word for it. Illegal is closer. UndefinedBehavior is accurate. C doesn't permit you to construct a pointer that points outside of an object (such as a variable, struct, or array), except for a pointer to one place past the end of an array, and even then it doesn't permit you to dereference that pointer. See the comp.lang.c FAQ for more information. (The above code will probably work on many systems, but if the array is located at the start of the memory arena then a segfault would result as soon as the address '''&storage - 1''' is loaded into an address register on certain computer architectures.) Am I the only one who gets annoyed by sequences that start with 1? Maybe it's because I work with mathematical sequences so much. -- JoshuaGrosse No. you are not the only one. It is ridiculous to claim that "we" intuitively start sequences from 1. Many of us will intuitively start from 0. This point is similar to the column major vs row major array storage issue. It certainly isn't something you can claim is empirically better one way or the other. ''Most of us use both 0 and 1 based offsets intuitively. Measurements start at 0, while counts start at 1. A baby is in his first year before he is one year old.'' ''More examples, please?'' On a trip, when you've travelled 0 miles, your are in the first mile of the trip. ''Most people don't think, "What mile of the trip am I on?" They think, "How many miles have I traveled?"'' Most programmers have 0 millions of dollars; we are all working on our first million. More generally, bank accounts start from 0 dollars, not from 1. If you're flat broke, you don't go to the bank to take out the 1 dollar that must, at a minimum, be there. (OK, you might, but they won't give it to you, and will probably look at you funny.) ''Bank account value is not a sequence.'' The first floor of a building is (usually) offset 0 floors from the ground. (AmericanCulturalAssumption! In the UK, the floor at offset 0 is the ground floor; the first floor is the one at offset 1.) ''In France, ground floor is 0st; in Latvia, 1st (guess in all ex-USSR, because there everything was made after some state standards. People are strange ...'' * In Hong Kong, the first floor is one level above the ground floor; across the border in mainland China they are generally the same. (In both locales, the 4th and 14th floors are routinely skipped, for a better reason than many buildings omit the 13th floor in the US; the ideograph for '4' resembles the one for 'death'.) ''But, "13" resembles a mooning in some fonts.'' 四 looks nothing like 死 - it's the ''sound''. * {Did I read that right? do you mean mooning in this sense?:} http://www.tiptopsigns.com/images/D/BAD_boymooning.jpg It seems that it's more natural to start with zero when dealing with real numbers, but when dealing with integers it's more natural to start with one. ''Aaaaargh. Zero-based indexing is the solution, not the problem. Just read EWD 831 (http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF).'' -- AnAspirant I've realised that the problem isn't 0 based or 1 based, it's ''half-open intervals''. We naturally say 1 to 10, Monday to Friday, and mean that inclusively. Thanks Djikstra for making things difficult because he thought saying 0 to 0 should mean no items instead of one (how often do we seriously worry about empty sets and saying 0 to -1?). In VHDL, array types may start at any number and number in any order e.g. range (1 to 10) or range (0 to 9) or range (9 downto 0) or range (10 downto 1) are all arrays of length 10. It turns out that this freedom is problematic if actually used as you have to write lots of code to deal with interfacing the various cases - the solution, by convention, uses only one. * An array is a ''function,'' mapping some input domain to some output range. It just so happens to be a discontinuous function, but it ''is'' a function nonetheless. Therefore, whether an array is zero or one indexed depends on the natural problem the function is trying to solve. It is trivially easy to think of functions that take values of, say, years: '''if (taxesPaid[1997] < taxesPaid[2007]) panic();''' Ideally, we'd even allow for non-integer indexes too. I really like Pascal's solution here, allowing the programmer to determine the starting and ending indices. ''It just makes sense.'' ** Perhaps C didn't do that because it requires more internal math to access array cells. But, this is less of a problem than it was in the 70s. A smart compiler can still eliminate the offset calculation if the programmer specifies zero as the start. * Another gripe of C-style syntax is the need for ''distinct type classes'' for what ought to be algorithmic primitives. I'm not talking about HaskellLanguage type classes, but in the OO sense. For example, I'm running into a problem with a C program I'm maintaining. There exists a file "config.h" which includes a bunch of preprocessor definitions containing the sizes of various types. This allows the "platform independent" implementations of types WORD16 and UWORD32 to actually have their stated size. It's ''important'' that this be true, because it's used for both saving and loading stuff from files, and also because it's part of a virtual machine implementation. How bloody nice it would be if only I could say this: '''typedef bigendian integer(-32768...32767) WORD16;'''. That would probably eliminate 1/4th of the whole purpose for GNU autotools, and would suddenly make programs ''that much more portable.'' Under ''no circumstances'' would it affect the high- or low-levelness of C either, since tags like "bigendian" needn't be specified if you don't care, and the compiler would be able to choose the most suitable native representation to fulfill your list of bounded requirements. C has existed for, what, 30 to 40 years now? And it ''still'' doesn't have this? -- SamuelFalvo That suggestion does not meet your stated requirements. Suppose 'word16' is defined as a 'bigendian integer' ranging from -(2^15) to 2^15-1 as suggested. Suppose this code is compiled on a 36-bit word machine (without 2's complement signs, by the way), whose C-compiler (naturally) handles this type as an 18-bit signed half-word. What does this do to your file formats? Yes, I'm picking nits. Yes, I'll stop now. ''Eh? There's no such thing as 'without 2's complement signs'. Anyhow, it would be more proper for 16-bit words to be 16 bits regardless of the host machine's natural word-size. This is true even if it has the cost of making things less efficient. C language doesn't make it easy, but there isn't any good reason why a person shouldn't be able to read or write a file in 23-bit chunks if that is how they wish to do so. The inadequate support for absolute control over word-size, bit-ordering within representation, and even word-alignment, as part of the C standard is a gripe I share with SamuelFalvo. But it isn't really a C-specific issue... it just comes up in C more often because C is often used more often for the low-level IO stuff.'' * Canadian, Eh? Or, is that British? ---- '''Teaching newbies''' Introductory programming classes could use extremely high-level systems such as ASP or php based Web scripting, to gently introduce programming-track and non-programmers alike to the basic possibilities of programming. Students interested in becoming software experts would of course take additional classes that would cover material such as assembly language, C, and operating system internals. However, for a first class to use C or a C-like language such as Java, means introducing surface complexities (e.g. "public static void Main (args[])" that just needlessly scare off people who could benefit from JustaLittleScripting in their life, if only they weren't taught from the beginning that programming was mainly about finding where you forgot to put the correct semicolon or closing parenthesis. ''If folks don't care enough to get past syntax (and understand semantics), maybe they should do something else?'' ''Well PHP may be "extremely high-level", but for a newbie it sure isn't "easy to read". I sure didn't find it easy to learn. I started wondering how an integer could magically just be a string, and if an apple could be an orange? And why was I using all these dollar symbol signs and curly braces? Gee, sounds normal to a C programmer.. but for a non-cprogrammer, they start wondering "why all this use of symbols every second key?"'' I think a better language to learn with would be one which is more English based, and less sloppy with a neater block syntax - i.e. not one that is C/Perl based. Maybe Smalltalk, Pascal or God Forbid even VB? Maybe it's easy to create a quick and dirty web page in php -if you already know php- or you already know C. Personally, I think most PHP programmers come from a C or Perl background, and already know C or Perl. Most ASP programmers probably come from a VB background. So then when they jump into PHP or ASP it seems easy, familiar, and hackish. But for newbies.. I think web scripting is too html/document and symbol based, and they should learn something more english and application based. They're first project could be a program, or a web page. A web page is just a word document, though, remember. "all this programming just to make a crumby looking html page which I could have done in Word or Excel?" A working program at least gives them some satisfaction ;-) '' ---- It's my understanding that the C language was created by and for computer programming researchers whose main purpose at the time was developing the UNIX operating system. Further, I understand that although the minicomputer they used was one of the least expensive machines available at the time, with a 1971 price tag in the neighborhood of $10,000 it cost more than a programmer's annual salary. I seem to recall reading an article by one of the principals - could have been Ritchie, I'm not sure - about how their interaction with the machine at the dawn of the C/UNIX era was via teletype. C was certainly a remarkable breakthrough in their ability to conduct their research and development efforts; just look at how quickly they were able to get UNIX started. Today, the least expensive computer systems available are embedded chips which cost a few dollars in quantity, are generally more powerful than the system available to the C designers, are built into a huge range of products from microwaves to automobiles, and which don't need to receive text or commands from the user or provide text back to the user at all. Computers which are designed to process text are now literally thousands of times more powerful than the minicomputers of the early 1970s; handheld systems dozens of times more powerful than the C-era mini's can be purchased on a whim for less than a programmer's day rate; and even the most hot-rodded personal computer today can easily be bought for less than a single programmer makes in a month. Computers as expensive as a year of programming are only needed for the largest of commercial and scientific enterprises, such as generating animation for feature films, handling thousands of customer orders a day, simulating weather or global climate, analyzing the stock exchange, or running high-energy physics experiments (either experimental or numerical). It is impossible to saturate the capacity of most of today's personal computers without doing either audio-visual editing, interactive simulations with realtime 3D graphics, or the largest of software engineering projects such as compiling the kernel; and these software tasks hardly take enough time on today's average PCs to justify a break for coffee or tea. Using a language like C, which involves explicitly allocating memory and juggling pointers, is a waste of effort when the end result of the programming effort does not justify the effort involved in having a person keep track of those implementation details rather than letting the computer do it. It bothers me that many open-source projects use the C language rather than a more appropriate higher level language, because C is comfortable and familiar to the developers based on their education and work background rather than because a conscious choice was made to do development at C's low level. -- ChrisBaugh http://cm.bell-labs.com/cm/cs/who/dmr/chist.html is an excellent history of the CeeLanguage by one of its inventors, DennisRitchie. ''Unix had already existed by the time C was conceived, but was (like most other OSes at the time) written in assembly language. The whole purpose of implementing C was to ''port'' it from the PDP-7 to the PDP-11. This is why C has post- and pre-increment operators; *x++ and *--x maps 1:1 to addressing modes used in the PDP-11 instruction set (this is also why the 680x0 series microprocessors were such a good match for C too, and is one of the reasons why anyone who was anyone was using the 68K line for Unix workstations).'' --SamuelFalvo ---- '''Discussion''' ''Much rudeness has been removed from this page. Please don't add it back. Thanks.'' ---- This is an old argument. Good quote: "Why didn't Smalltalk become popular? It had square braces." Really, the sticking point is the "I can learn a new programming language in a day" clan. I used to be one of them, but now I think otherwise. If all the languages are ALGOL descendants, sure, I can learn them fast, but that's just learning how they differ from C/C++. "Sure I can remember everyone's name, as long as everyone is called Fred." -- SunirShah ---- C syntax is perhaps obtuse, but at least elegant. It maps well to the underlying hardware. ''How can the '''syntax''' of a language map well to the underlying hardware? The semantics of C might map well to hardware (this is debatable, and arguably more true of a PDP-11 than, say, a Pentium 4), but the syntax has little or nothing to do with that.'' So when you are concerned with the underlying hardware, C is a good choice. At the opposite end of the scale, languages like Lisp map well to the underlying realities of computation. [DeleteMe: this bugs me, it was originally one statement, but got refactored into two people, in a sense] ''I was always under the impression that the first versions of LispLanguage had primitive functions that mapped almost directly to the underlying hardware. That's where we got stupid function names like CAR and CDR. LispLanguage would not be my choice of an example of a high-level language. -- IainLowe'' Yes, but now they've been aliased to FIRST and REST, to pacify Iain. Please remember that Lisp is forty years old, and ''has'' evolved in that period. It is in fact a testament to its elegance and flexibility that it has evolved into the best high-level language out there from such ... well... ''primordial'' beginnings. Indeed, the terms '''car''' and '''cdr''' are derived from the ''contents of the address register'' and the ''contents of the decrement register'', respectively, which referred to features of the hardware architecture the primordial LISP interpreter was implemented on. {But that's just bad naming, not necessarily bad language features. They could have called them "fel" and "roe" for "first element" and "rest of elements", for example.} Source code with "fel" all over it may make management nervous :-) ---- If you don't like a language's syntax or even its semantics, that's simple to fix. See ObjectiveCee for a major example, the java preprocessor at http://virtualschool.edu/jwaa for a minor one. -- BradCox ---- I used a database called CornerStone back in the early 1980s that kept the names that you used for object such as tables and fields separate from their internal IDs so that if you changed a name all screens reports and view knew about it. A small thing, it made refactoring databases a lot easier. It also allowed you to define calculated fields and lookup fields in the tables' definitions, there was no distinction between tables and views. -- AndyMorris What happened to Cornerstone? - sounds like they were a little ahead of their time. See CornerStone. ---- About affordable computer systems more powerful than the pdp-11 used to create C and Unix: ''Holy crap! How much do you make? Where do you get those deals? Hook me up, man. :-)'' I'm not sure which model of pdp-11 was used by the C and UNIX developers. Picking at random a typical model available at the time, we find that with 4kwords of 16 bit memory, a pdp-11-20 had a 1971 price of $10,800. Source is http://www.village.org/pdp11/faq.pages/pricing.pdp11-20-aug-1971.html. Handheld: http://www.handspring.com/ - Visor Platinum, 33MHz Dragonball processor, 8 MB RAM, $249. One of these was recently given to me by some relatives who split the cost of it. (I didn't ask for it, they came up with the idea and the money and asked if I'd like to receive it. I'm enjoying it very much and would recommend it as a swell PDA.) Hot-rod PC: http://arstechnica.com/guide/system/hotrod.html - The ArsTechnica recommended "hot rod" box, $1527.78. Athlon/1.4 GHz, 512 MB RAM, 30 GB hard drive, etc. I agree that's not the most hot-rodded computer available today, but it can compile the kernel in less time than most people like to have in a coffee break, and I think does qualify as orders-of-magnitude more powerful than the PDP-11-20. My personal jalopy computer is an antique compared to this thing. Jobs that pay $250 a day or $1600/month are often listed on http://www.dice.com and http://www.monster.com. If you're an expert at making programs that are super-efficient at runtime, using C without any help from BetterSyntacticSugar, then you certainly should be able to make enough to buy a computer that would pour the sugar. But you wouldn't need it! -- ChrisBaugh Actually, as the apparently rather underpaid interjector here, I am an expert (well, I'm proficient anyway) at making programs that are superefficient at runtime in C and C++, for which I use magical syntax (but not syntactic sugar) and magical optimizing Perl generator scripts to create my C/C++ code for me. Only weak minds are trapped by their languages. The strong ones can invent their own. -- SunirShah ''True. But some languages make that process natural, others, almost impossible.'' ---- Are people getting confused, or am I? Surely the argument is to dump C ''syntax'', not C itself. Of ''course'' you can't dump C -- Unix is written in it! But that doesn't mean new languages have to use the same (or, to be more accurate ''similar looking'' syntax forever! I think the reasonable success of PythonLanguage shows that a new syntax is not necessarily the death of a language. ---- You can dump C syntax in C language: #define begin { #define end } #define then etc. ''Don't forget'' '''as''', '''with''', '''of''', '''and''', '''uses''', '''implementation''', '''program''', '''library''', '''function''', '''procedure'''. ''Or you could just use a Pascal compiler in the first place.'' ---- It is very difficult to write a CodeWalker for C programs. That's mostly due to the following two features: * C syntax is not LL(1), not LALR, and not even context-free. In particular, it is necessary to know whether an identifier denotes a type name or something else in order to correctly parse C code. * C's preprocessor directives are extremely nasty. In theory, the solution is simple: run your CodeWalker on the already pre-processed source. In practice, this doesn't work. The reason has mostly to do with #include, which might pull in thousands of lines which 1 have nothing to do with the code at hand and 2 might not even be legal C. The sad fact is that these aspects of C syntax could be replaced by sane constructs with no lack of expressive power, and with only minimal changes to the syntax. -- StephanHouben ''This is preposterous. Syntaxes aren't context-free, grammars are. Grammars describe syntax. If you want to program in a context-free syntax, you're going to have to write everything in BNF. There are tons of CFGs for C, not the least of which is the one published in the ANSI C standard.'' While obscuring the syntax with the grammar is an all-to-common mistake, the point above stands. One can certainly write a CFG for C, but there are numerous forms which look alike to the parser, and cannot be disambiguated strictly with a CFG. For example, consider the following. makePie (apple, cherry, orange); If "apple", "cherry", and "orange" all refer to ''variables'', then this is a function call. If they all refer to ''type names'', then it's a function prototype. If there is a mix, or if any of apple, cherry, and orange aren't previously defined symbols, then this is an error. ''Or a macro invokation. I mention this because some C compiler (like Borland?) tightly integrate the C preprocessor into the lexer and/or the parser. -- SamuelFalvo'' Such symbol lookup is well above and beyond the capability of a CFG. Of course, for implementing a compiler that matters little - the semantic analyzer will have to run anyway, so deferring disambiguation of forms until then isn't a big problem. ---- Is this page lamenting excessive capability or lack of guidance on how to use it? I will admit C syntax will allow you to do silly things, but it also rarely blocks you from doing the things you need to do. It is also true that the preprocessor was used to patch deficiencies in the language and compiler technologies, but the former shortcomings are slowly evolving out of the language. In general, I prefer the less restrictive nature of C, but in return I must agree not to push the envelope. -- WayneMack ---- On the subject of choosing C for open-source projects, I'd imagine it would be for the same reason that the NetPbm suite of programs was created, because (in the case of libraries) it is possible for any other language to use C libraries, but if you were to pick, for argument's sake, let's say, PerlLanguage, how would you go about calling a library written in that from, let's say, LispLanguage? If C were used instead to create the library, there exist for most languages automated systems for creating shims to allow access to C libraries. On the other hand, for applications, I'd imagine it is because it's easier to attract the ArmyOfProgrammers that make projects work if you use a commonly known language. -- MartinRudat, who likes his C syntax just fine... in the form of PerlLanguage, which doesn't suffer from all of C's semantics... ''all of C's semantics'' Funny, I didn't know C had semantics. ---- As a long-time C / Java coder, I see two main disadvantages to C syntax. It's verbose (compared to Python) and it's difficult to express certain concepts (compared to Lisp). Are there any other disadvantages? ''Just curious, but why do you state the C is verbose? The usual complaint I hear is that it is too terse?'' for(int index = 0;index LOG("Error occurred.")) So does Perl6, with its blocks and pointy blocks. Ruby has a similar syntax; unfortunately, it doesn't use curly braces, so the syntax ''isn't'' consistent. aCollection.map {|f| f.factorial} is fairly concise, though. ---- CategorySyntax, CategoryCee, CategoryAlgol, CategoryPascal, AlternativesToCeeSyntax, CategoryLanguageDesign, PhpProsAndCons (lessons from another language)