I have been kicking around ideas for multi-paradigm database system organizations from a logical perspective. Although I am a RelationalWeenie and think that OODBMS are ModernDinosaur''''''s, I realize that too many people like OO to outright ignore it, and it may be better to find a way to get along rather than fight them in a never-ending HolyWar. Thus, I have created a topic where we can kick around ideas for something that works in (at least) both paradigms. Plus, we may get some "dynamic" benefits in the process by moving away from the "static relational" patterns of the current generation, and may even satisfy the NoSql crowd.

Features include:

* Essentially it's a giant '''map-of-maps''', where the first-level map has an automatically-generated rowID as its key.
* Dynamic or type-free columns.
* Each row is a relatively independent map. There is '''no need for a formal schema''' edit to add an attribute (key/value pair) to a row.
** See PageAnchor "Tuple-Spaces" for related comments.
* Missing cells return an "empty" value or answer rather than triggering an error. ("Empty" will mean a zero-length string here by default, although it may depend on system, app, or query-specific settings, being that "null handling" is a contentious topic.)
* There is no formal distinction between missing and blank/null cells. (Perhaps an optional Null character or marker string such as "[null]" could be permitted/used in case one does not want to stick with the default convention, but this is otherwise considered "against the spirit" of this architecture; WaterbedTheory.)
* '''Tables are optional''' (using an "entity" attribute). Contrast with DynamicRelational.
* Every row automatically gets a "rowID" (auto-gen), and it is addressable like any other attribute, except that it's read-only (to non-admin DB users).
* Constraints or rules can be added as needed, such as to make columns required, etc.
* "Type" enforcement is generally through trigger-like constraints and validation. For example, "integer" could be enforced using a rule that only allows digits and no decimals. (Pre-defined types generally tend to favor specific languages, which we want to avoid. But for optimization purposes, how stuff is stored internally does not change the basic qualities of a MultiParadigmDatabase. If a column is designated an "integer", then it may indeed be stored as binary. This is just an implementation/optimization issue and does not change what the DB user sees. Essentially such a transformation is considered "hidden compression".) Related "type" info: TagFreeTypingRoadMap.
** ''In other words, your "types" are going to work just like every popular programming language's types? Or are you planning to make the type constraints and validation unusually awkward?''
** No. More like Perl's and CF's type system, not Php's and JS's.
** Continued at MultiParadigmDatabaseCriticism.
* (See below for indexing discussion.)

I am not sure such is practical without large compromises, for the differences between the two paradigms are significant in my opinion (TablesAndObjectsAreTooDifferent), but it does not hurt to try. One approach is one big (conceptual) table with at least the following attributes:

 Table: objects
 ---------------
 objectID   // auto-gen integer
 entity     // optional table name
 parentID   // to implement object inheritance (see note)
 ...

''See below for a critique of this approach.''

"Entity" would be the relational table name of a given record (object).
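For illustration, here is a rough sketch of how a few such rows might be created and queried, using a hypothetical SQL-like MPD dialect (the syntax and names are made up for this example; nothing here is meant as a spec):

 -- Attributes are created per-row on the fly; no schema edit is needed first.
 insert into objects (entity, name, salary)           values ('employee', 'Kiri', 52000)   -- assume this row is auto-assigned objectID 101
 insert into objects (entity, name, salary, nickname) values ('employee', 'Nile', 48000, 'Ny')
 insert into objects (name, parentID)                 values ('scratch note', 101)         -- no entity: belongs to no "table"

 -- Relational-style query over the "entitied" rows:
 select name, salary from objects where entity = 'employee'

 -- Asking for an attribute a row was never given is not an error;
 -- Kiri's nickname simply comes back as "" (empty):
 select name, nickname from objects where entity = 'employee'

The point being that "nickname" exists only on the rows that were actually given one; every other row just reports empty for it.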
Additional requirements may be put upon those records with entities, such as having at least the same required columns as their "master" schema. But since these are things normally expected in the relational world, such as a valid unique key of some sort, I will not address them much here. Those with Entity defined (and that follow the relational rules) can be queried as regular relational data.

The "objects" table would have (assume) '''open-ended attributes (columns)'''. They would not have to correspond to a pre-designated table schema. This allows objects to have whatever attributes they need and not be bound by relational's traditional "table shape" rules. If you request a given attribute that has not been assigned to a given object/record, then a blank, null, or zero is assumed in its place. (Again, those with "entity" defined might be made more picky to correspond to relational's requirements.) Note that we have to keep our rules for "object" rather open and dynamic so that a wide variety of OOP languages can use such a database. Proponents of static typing or ExpensiveAdministrator''''''s to enforce consistency rules may not like this.

This approach also allows one to query objects using relational query languages. (Perhaps some tree-friendly operations could be added, similar to Oracle's tree extensions.) A typical object query in SQL may resemble:

 select a,b,c from objects where foo="bar" and zarg="blog"

Perhaps this would be possible also:

 select * from objects where foo="bar" and zarg="blog"

But your language or API would have to support some kind of open-ended structure (a list of variable-celled maps) to accept what may be an unknown number of result columns. The ODBC standard may not work well for this.

Note that multiple inheritance is not addressed here. Another table or "side structure" may be needed to support such.

Records that have an "entity" defined can be queried in the usual relational fashion. Those that don't would just have to be queried as a single big table of objects if we want to do relational queries on non-entitied objects. You could perhaps narrow it down by parent, however. The ratio of entitied to non-entitied records would vary per shop, depending on how OO-minded or relational-minded their tech culture is, or what they had in the past that they are moving away from. Note that a given record can be part of both relational and object queries.

----

The above could be implemented with an existing RDBMS, perhaps, although I could not vouch for performance; and queries would be more complicated in many cases. We add another table:

 Table: attributes
 ----------------------
 objectRef    // foreign key to object table
 attribName
 attribValue  // type: open-ended string or "blob"

This is essentially an AttributeTable. This approach may make it harder to convert an object to a relational record than the original approach, though.

---------

Example constraint/type dialog:

 COLUMN CONSTRAINT
 . Column Name: _________
 . Pre-Defined Constraints: [Integer|v]  [create new]   (the first item is a drop-down list)
 . Column can be empty: [yes/no]
 . Applies only to Entity: _______  (blank for all)

(dots to prevent Mozilla/Firefox-related wiki bug)

Newly-created constraints would be a function or SP-like thing:

 constraint myConstraint(columnName, colValue, row) {
   var result = "";
   if (columnName=="foo" && getArrayItem(row,"entity","")=="bar") {
     if (contains(colValue, "$")) {
       result="Sorry, no dollar signs allowed in bar rows"
     }
   }
   return(result)
 }

----

Classes tend to have 'shape' that corresponds, approximately, to the idea of a table.

''"Approximate" is often not good enough when it has to be implemented.''

I would approach the problem by embedding the idea of a view into the database as a first-class element, rather than a derived element. This may be sufficient to cover variant types, particularly if the view is 'writable', i.e. has a one-to-one relationship between a view-tuple and an underlying object. I'm not sure much else would need to change. I have done this sort of thing, abstracting the type using views, but one cannot treat a view as a table, so the foreign-key problem remains. -- RichardHenderson

''But OO philosophy tends to divide things differently. If you bend a paradigm enough to fit another paradigm, you might as well go with that other one. If you want a vehicle that is 50 percent SUV and 50 percent road car, but there is no such thing, then you are faced with getting an SUV, a car, or both. But if you wanted something that was 90 percent SUV and 10 percent car, then you might as well get the SUV rather than buy both just to get that 10 percent. IOW, why use objects if you only or mostly use them in a relational "shape"? I suppose we should work with an example. How about CampusExample.''

I'm not sure OO is that much different beyond inheritance, once the functional stuff is stripped away. I'll wait for you to complete the example so we can argue over something solid. -- RichardHenderson

''The campus example is all relational at this point. How would you OO-ify it?''

In many dynamic OOP languages, you can add attributes or even methods willy-nilly. There is no "master class" to match to a table schema. I realize that your development style might not use such dynamism, but can we just dismiss dynamic OO developers? It would probably degenerate into a dynamic-vs-static typing HolyWar. Dynamic can cover both better than static. Perhaps '''basic typing''' can be added by allowing a "class" attribute that references a kind of OO DataDictionary that describes the class's attribute types. But things get complicated if you allow user-defined types. It risks run-away tree or DAG type substitution checks. Going all out risks turning this thing into a programming language, almost like a SmallTalk engine.

----

Relational tables and object "shape" are ''almost'' identical, with the exception of inheritance and forward-referencing "keys", also known as the ''evil pointer''. What I find amazing is that everyone runs around and argues about how "low-level" and "evil" pointers are compared to "keys". But, what's a key? Isn't a key, composite or otherwise, a pointer? What it ultimately comes down to is that some folks here are saying pointers have no intrinsic value. As Costin frequently says, 0xaaabbbccc is less descriptive than "Panici, Jeff" or whatever you might be using. True, it is ... so what? Not all keys must store meaningful data in and of themselves. It's enough to know that it's a "key" that opens a door somewhere. Anyway, back to my original train of thought.
Say we have this class shape:

 public class DriversLicense {
   string Number;
   string FirstName;
   string LastName;
   string Address;
   string City;
   string State;
   string PostalCode;
 }

This class "shape" describes a table schema. Now, what does an RDBMS do that the majority of object-oriented programming languages don't currently do? It creates an extent that matches the shape defined by this schema. An extent is a collection of the same type. Unless we're using an OODBMS-bound language, we don't get this kind of behaviour out of an object-oriented heap. Each instance of DriversLicense is allocated where there's a free block of memory and a global object-pointer (evil) table is updated. This table is only used for reference management and eventual garbage collection. But, what if we had an Extent object instance that ''transparently'' "collected" all of the instances of DriversLicense? We'd have a "table". Now, if we have enough of these Extent objects in memory, we can perform ... relational algebra on them and get derived sets (tables). This exact functionality exists in Poet FastObjects.

Now, the utility of such arbitrary sets is questionable in an object-oriented language. Usually, we want to find some root objects and we're not, per se, interested in the results of the intersection itself. Thus, the query syntax should be (and is) slightly different from that of standard SQL; hence, Poet FastObjects supports ODMG's OQL. In addition, we have a concept of a "root object" extent that acts like a dictionary. This allows object-oriented programmers to create "shortcuts" to the parts of an object graph they're most interested in. One ''can'' treat objects-in-extents as relational tables. In fact, Poet FastObjects provides an SQL ODBC driver that does exactly this. GemStone/S doesn't have any high-level query capabilities built in (it does have extended block syntax, but that's still at the Smalltalk level), but it wouldn't be impossible to write such wrappers on top of it.

Now, here's the rub to this whole argument: Is it appropriate to use an OODBMS as a "shared databank"? What I've tried to articulate in the past is that the answer is probably no. For some small projects, it might be desirable to use ''only'' an OODBMS to store data. Then it takes on the role of both a persistent object store ''and'' a "shared databank". However, I'm of the mind that ''both'' products serve an architectural role. The OODBMS is useful as a transactional object persistence engine. Almost all OODBMS products support ACID transactions, but there are variations in the level of query support. Poet FastObjects is an example of a product that has some nice, high-level query capabilities - if you want them. GemStone/S is geared more toward a centralized object instance repository. An RDBMS would be used as the "shared databank": reports would be generated from here, analysis would be done from this data, etc. Most of the major OODBMS products have data transfer systems that will populate a relational schema. While I personally believe more time and energy should be put into advancing OODBMS products, I'm in the minority. Oh well, I ''know'' I'll get harpooned for this one: There already ''is'' a MultiParadigmDatabase - it's called an OODBMS. -- JeffPanici

The only time I've ever seen GemStone act as a MultiParadigmDatabase was when many man-years of effort were applied to building a large, nearly-relational framework on top of the core engine. Otherwise, it's really a DBMS toolkit. A wonderful toolkit, but not for everyone.
-- StuCharlton

''The problem of keys is that in an RDB foreign keys can only reference a single table (type). My abstraction of views fixes this. Keys are abstracted from the physical address, much as IP addresses are abstracted from MAC addresses, therefore a reference will be valid for as long as the referenced object exists, wherever it exists. I'll have a look at Poet, thanks :). -- RH''

Issue perhaps belongs in ForeignKeysCanOnlyReferenceOneTable.

The given example is a struct, not an object. The issue gets more interesting with a real OO design:

 class DriversLicense {
   string Number;
   Driver driver;
   Date expires;
   MimeItem picture;
   DriversLicense priorLicense;
 }

 class Driver {
   string FirstName;
   string LastName;
   Date dateOfBirth;
   Address address;
   DriversLicense currentLicense;
   Vehicles vehiclesRegistered;  // persistent collection of Vehicle
 }

 class Drivers {...}  // persistent collection of Driver

 class Address {
   string numStreet;
   City city;
   State state;
   PostalCode postalCode;
   Drivers driversAtAddress;
   Vehicles vehiclesAtAddress;
 }

 etc...

-- MarcThibault

----

Re: "There already is a MultiParadigmDatabase - it's called an OODBMS"

Being an OODBMS alone does not make it automatically relational, so how could this be? Even if some are relational, some suggest that they are tuned for OO and not relational as far as performance. I agree that each one '''could''' emulate the other, but it is hard to optimize performance for both. The biggest difference between RDBMS and OODBMS is that OO does not have to follow certain rules that relational does. If data is created when the relational rules are switched off, there may not be anything usable as relational when the rules are switched back on for a relational viewpoint. Further, in this system we have to bust pure OO-style encapsulation. It is an attempt to find the best compromise. Every multi-paradigm solution is probably going to have compromises.

----

'''Language-Space'''

What I am looking for here is a "database" that can (at least) share data with multiple languages and/or applications and allow for relational queries. For the sake of argument, let's assume that objects and instances do match the "shape" of tables (I may not agree, but will leave that issue to another section). Some of the tools mentioned above appear to be an attempt to manage the data inside "application space" or "language space". For example, it may be trying to use Java's objects to directly manage and query the data. For most applications I deal with, sharing data with other languages or apps is important, even if not an immediate requirement. It seems to me that a language-space solution would hinder this, no? Are you assuming that every app in a shop will be written in the same language to increase sharing and querying across apps? If so, that assumption makes me itchy. It also seems that the very spirit of "multi-paradigm" is to divorce data from particular application languages. Perhaps this very idea is counter to the OO philosophy (in some ideologies) of encapsulating data behind behavior. (SeparationOfDataAndCode) Further, if the schema is in the multi-language database, isn't echoing it in the language a form of OnceAndOnlyOnce violation?

----

I think a shop's philosophy is going to be geared toward either relational or the NavigationalDatabase "web of dictionaries". It makes little sense to have the same data be both an object and a relational record.
Perhaps some types of data are best as objects and others are best as table rows, but there is little agreement about how to assign what to what. The philosophical underpinnings to solve this division are still lacking.

----

Such a feature could be added to an existing RDBMS by having an optional "allow dynamic columns" setting for a given table. It may not make sense to add it to all tables, since dynamic columns are probably slower than static columns, in a similar way that statically-typed languages run quicker than dynamic languages for the most part. My suggestion is sort of the "dynamic vision of relational". IOW, the SmallTalk of relational, in a rough sense. -- top

http://www.geocities.com/tablizer/dynrelat.htm

----

'''Support for ObjectCapabilityModel Methodology'''

One approach is one big (conceptual) table with at least the following attributes:

 Table: objects
 ---------------
 objectID   // auto-gen integer
 entity     // optional table name
 parentID   // to implement object inheritance (see note)
 ...

''A false start IMHO. This is a relational model of a possible implementation of an OO language; whereas what we want is a way of integrating support for relational and object-based modelling in the same language. Note that well-designed OO languages generally prohibit queries on "all objects", requiring that an object can only be accessed if a reference to it is held (certainly this is true of ObjectCapabilityLanguage''''s).''

Well, this might be a fundamental philosophical difference between OO thinking and database-centric thinking.

''I don't think it's a fundamental difference. At least, it is not a difference between ObjectFunctional thinking and '''relational''' (as perhaps opposed to database-centric) thinking. The pure RelationalModel is after all a mathematical model that ensures ReferentialTransparency, which makes it almost automatically capability-secure. To the extent that impure versions of relational programming introduce insecurities, it is because they diverge from the RelationalModel.''

There might be a way to bolt or wrap such restrictions onto a system like this. The approach described here focuses more on the ability to represent "things" from diverse paradigms and philosophies. Thus, it has to be "wide" instead of restrictive. Different paradigms might need different restriction rules. In a multi-language or multi-paradigm shop, perhaps capability security techniques are not the right tool.

''I couldn't disagree more. Consider languages like OzLanguage, which is almost capability-secure, and demonstrates that there's no significant conflict between multi-paradigm and capabilities. It may not be ''a priori'' obvious that different paradigms actually need the '''same''' restriction techniques, but in fact this seems to work almost unreasonably well.''

Tying a database to the internal pointers of a specific language has always been a tricky matter.

''"Tying a database to the internal pointers of a specific language" has tended to be done for existing languages that were designed without support for relational programming as a goal. We can hope for better in new languages (or language variants) where relational programming was taken into account when designing the DataModel/InfoSet and libraries.''

A "generic" or "multiparadigm" database cannot assume use with only fancy/new languages, by definition.

''That's a strange definition; the term MultiParadigmDatabase doesn't imply any assumptions about whether it can or cannot be used from existing languages, AFAICS.
It's very plausible that a MultiParadigmDatabase would be significantly easier to use from a MultiParadigmProgrammingLanguage, and that does not include most "popular" languages.''

''In any case, the objectID-as-integer approach doesn't solve any of the trickiness involved with "tying a database to the internal pointers of a specific language". What advantage would be gained by having object IDs be integers rather than opaque values? Something like the SemanticBinaryModel seems to be a more promising approach.''

Reason: To help coordinate information outside the system with information in the system.

---------

If we remove "entities" as a requirement, then perhaps this can no longer be called "relational". Maybe call it a "predicate database" (PredicateDispatching) if there is a terminology dispute.

''Relational databases (and the relational model) are NOT (contrary to apparently common misconception) named after EntityRelationshipModelling or because they "store" "entities" and "relationships" or even because a JOIN operation "relates" two tables together. Relational refers to the fact that all the data is (logically) represented as _relations_ (see RdbRelation). For that matter, there is no requirement that all the facts about a conceptual entity be gathered into the same table/relation - that's just a useful pattern to keep related stuff together (often, though certainly not always, we deal with a whole bunch of facts about the same entity together - getting the user to input them, reporting on them, etc.). Having said that, perhaps the term "predicate database" would help remind us that DatabaseIsRepresenterOfFacts, but there is no reason to ditch Codd's preferred term just because we ditch E/R modelling, say.''

Somebody suggested that something that is a single big table is hardly about "tables" anymore. I wanted to sidestep that issue by suggesting a different name, to head off a battle over what DrCodd "really" meant. Perhaps I wimped out too early. Plus, "Farfegnugen" was already taken. The goal is flexibility, not necessarily conforming to "relational".

----

'''Indexing'''

I imagine that indexing of a full MultiParadigmDatabase would be quite interesting. Obviously one could index on every column, if desired, but when there are N columns in use, there are 2^N possible queries on identity alone, and considerably more when working with less-than and greater-than relationships, and more still when dealing with queries on arbitrary patterns. I imagine that, in practice, the super-large MultiParadigmDatabase is out of reach until it can build its own set of indexes based on both profiling of queries and manipulations to it, and upon initial estimates of this profile performed by the programmers. (See also AdaptiveCollection.)

''Indexing every column? If one really wants to do that, then perhaps a column-wise implementation should be done. Of course, this would slow row-centric queries.''

I think you've misunderstood my intent. When I say that one indexes 'on' a column or 'by' a column, I mean that the rows are the things indexed, but they are indexed such that you can find them -by- the attributes specified in the columns they're indexed on. E.g. if you index on the name, then you can find all rows that have the same 'name' field very rapidly. I didn't mean to imply indexing the columns within the rows. That isn't nearly so useful... or at least I can't imagine as much use from it.
If the set of columns is wide enough, I'd certainly expect values in each row to possess some sort of indexing or sorted order to avoid a linear search, but I doubt that will ever be much of an issue. The greater savings would come simply from using some sort of fast identifier for column-name (like a small number identifying an interned string) and performing simple compares on that.

{By default, it should probably be cluster-indexed on rowID, with the conventional assumption that most non-aggregate cross-references will be on rowID. There are at least two approaches to other indexes: no attribute name (column) will be indexed unless explicitly requested (other than rowID); or every attribute name will be indexed unless explicitly excluded. These choices would be a DBA decision. -t}

{Whether it's actually stored row-wise or column-wise under the hood should be a configuration choice that shouldn't affect the interface (queries used). The idea is that one could change it without having to change existing queries. The NoSql crowd is making the mistake of tying query language to implementation choice, which is poor future-proofing, even if it does bring short-term benefits by closely tailoring query language to implementation. But MultiParadigmDatabase places flexibility over out-of-the-box speed/performance in terms of priorities.}

--------

(Moved from FearOfAddingTables)

For ultimate flexibility, may I suggest the MultiParadigmDatabase, which is essentially one big dynamic-row GodTable. (If you want entities, you add an Entity attribute to a "row".)

''There's something to be said for dumping all the data into one, massive DataSpace upon which arbitrary manipulations and self-joins can be performed. Any Relational database can be represented this way by designating one 'column' the 'table-name' column, and simply dropping every row of every table into the single GodTable. Any Logic-programming predicate-database can be represented this way, too, including all memoizations. Any set of objects can be represented this way, complete with descriptions of their behaviors. The MultiParadigmDatabase certainly makes ''representation'' very easy.''

''However, there are costs... especially regarding the semantics of these representations. It becomes quite difficult to understand: 'what means this row'? There is ambiguity in representation, largely because the set of columns can overlap BOTH intentionally AND unintentionally (because a single column-name can possess many different semantics based upon which 'row' it is participating in). (If one could guarantee the uses of the columns are disjoint, one sort of loses the 'advantages' of having a MultiParadigmDatabase in the first place... one could simply break it back up into a regular Relational database.) The semantic cost becomes reflected in query results and (therefore) consistency checks; that 'c' property is quite difficult to maintain if there is no explicit semantics for DataSpace. A row might represent a single predicate truth, or a whole bunch of predicate truths. There can be two different rows, one saying: {State: California, Anthem: "I love you, California"} and another saying: {State: California, Anthem: "I love you, California", Gemstone: Benitoite}. What mean these rows? One will need to include the semantics, somehow, directly in the row ({Predicate: PlaceCreated, State: California, Anthem: "I love you, California"}, {Predicate: FactsAboutState, State: California, Anthem: "I love you, California", Gemstone: Benitoite}).
At this point, one might ask: 'what means "Predicate"?'. It invites a sort of arbitrary regression on these questions, at least until some sort of artificial constraints are enforced upon the table structure. It doesn't seem right that it need be artificial. Relational, for example, has non-artificial places to attach semantics: table-name and column-name.'' ''Besides, in addition to being a fan of flexibility, I'm a huge fan of correctness(-proofs). I want my databases (and my queries, and their results) to all be strongly, statically typed. That 'C' property is quite important to me. While I'm quite willing to consider a well optimized MultiParadigmDatabase DBMS for ''representation'' purposes (including storage, management, concurrency, AcID transactions, etc.), I'd probably reject it as the overlying database 'concept'. I.e. I'd use it with an overlying 'wrapper'.'' ''(Perhaps move the above to MultiParadigmDatabase and leave a reference?)'' Strong typing can be optionally applied to columns, as described above. ''Don't just slap on options without first exploring their consequences. For example, you can't have strongly typed columns AND have an open or unbounded number of columns ('''open-ended attributes (columns)'''). You'll need to close it - to have exactly one arbiter who determines who gets which column-name - and, thus, who gets to add new entities (since they'll not often be able to share strongly-typed columns). You seem so excited by all the hand-waving happening in the above discussion that you're doing it yourself.'' I don't see a problem with it. Before accusing me of hand-waving and pants wetting, please demonstrate at least one failure scenario. Once a type/constraint is put on a given column, it must follow that constraint. It does not mean that the other columns also have to have that constraint. And the constraint may optionally still allow empty/null/non-existence if need be. The constraint may say "if there is a value for column X, then it must conform to POSITIVE-INTEGER". This would still allow an empty row. * The failures described above do not imply the existence of impossible-to-resolve scenarios. The real problem here is you're attempting to claim a bunch of features ("use MultiParadigmDatabase where you can do X '''AND''' Y"), where fulfilling certain of them means chopping off the others. You can't in practice have both "Each row is a relatively independent map. No need for a formal schema edit" and have strong typing. Worse still if you need to specify that every entity of a particular type must have a particular attribute combined with constraints between various entity-types represented either in that table or across tables. At this point one must ask: is it really a MultiParadigmDatabase we're talking about? I do not believe so. By adding strong typing, you'll end up gutting the 'MultiParadigm' aspect of 'MultiParadigmDatabase'. ''Take some time to understand how each and every possible 'option' interacts with each and every other purported feature, for you'll very often be paying for a new feature at a cost to the others.'' There indeed may be trade-offs, but the idea is to allow such trade-offs. This is why it is flexible. The trade-offs are probably inherent to any data management system. * I agree there are always trade-offs. However, I reject your notion that ''the idea of'' M''''''ultiParadigm''Anything'' is ''"to allow such trade-offs"''. 
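To make that concrete, here is roughly what such an optional, column-level rule might look like using the same constraint-function convention sketched earlier on this page (the "matches" and "toNumber" helpers are assumed for illustration, like the getArrayItem/contains helpers in the earlier example; none of this is settled syntax):

 constraint positiveIntegerIfPresent(columnName, colValue, row) {
   var result = "";                                 // empty result means "no complaint"
   if (columnName=="qty" && colValue != "") {       // missing/blank cells still pass
     if (!matches(colValue, "^[0-9]+$") || toNumber(colValue) <= 0) {
       result = "qty must be a positive whole number when supplied";
     }
   }
   return(result)
 }

Rows that omit "qty" entirely are unaffected; only rows that actually supply a value get checked.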
The primary motivation for MultiParadigmProgrammingLanguage''''''s is to gain the features and benefits of two or more programming models (or approaches), to allow them ''simultaneously'', to provide for them to ''interact''. It isn't M''''''ultiParadigm if getting the benefits of one essentially means entirely ''disposing of'' the other. Perhaps we can call this the '''C''''''hoiceOfParadigmDatabase'''. That'd at least be consistent with your notion that its purpose is to allow you to flexibly make a few design trade-offs (then be stuck with them). ("use '''C''''''hoiceOfParadigmDatabase''' where you can do X '''OR''' Y (but maybe not both)") * ''You seem to be complaining that one cannot be tight and loose at the same time, which goes without saying. It is hard to beat incremental and optional tightening of conditions/types/constraints. You seem to want to violate physics. You want a universe where the inhabitants are limited to 3 dimensions but also are not limited.'' * Sigh. Recall carefully: you directed me to MultiParadigmDatabase, mentioning that it is "the ultimate in flexibility". I discussed several of its nice, flexible features, but added that I prefer strongly-typed systems. To this, you essentially replied: "strong typing can be optionally applied". Sure. Maybe so. But if I do so, what happened to my flexibility? What happens to the features that led me to even somewhat consider MultiParadigmDatabase in the first place? When I ask this, you get all grumpy and say: "You're asking for the world! The universe even!". Now, as to whether I ''want'' the impossible - of course I do. But I don't ''expect'' it. I'm not ''complaining'' that MultiParadigmDatabase doesn't provide both flexibility and strong typing. If I have any complaint, it's that Mr Salesperson-of-this-idea has been misleading in its presentation. One should NOT say that a MultiParadigmDatabase has features or provisions for paradigms X, Y, Z ''unless it has them all at the same time''. One should NOT answer ''or even imply'' that I can simply add strong types via constraints ''and somehow keep the properties that make MultiParadigmDatabase tempting in the first place'' - not unless you can explain to me how this works. ** Re: "I prefer strongly-typed systems" - ''If that is your or your shop's ''primary'' goal, then MPD is not for you. Done. As far as balancing design trade-offs, it does NOT place a premium on that above other features.'' * As to whether it is "hard to beat '''incremental''' and optional tightening of conditions/types/constraints": I disagree. Incremental loosening or tightening of constraints is a bad idea to start with due to problems of ''coupling''. Supposing you already have a user-base, any tightening of constraints will break applications... and you'll be fighting the data already extant within the database. Similarly, any loosening of constraints can allow addition of data that violates assumptions made by those applications that access your database, which can also break applications. The only benefit incremental manipulation of the types/constraints in the database provides is during the pre-beta development phase where you haven't nailed anything down and you possess full ability to redesign BOTH database AND the applications. At this point, you will not have made many solid decisions as to entity-representation, anyway... no more than you would over any Relational schema... 
so whatever data you do have will be going through transformation after transformation ''even if'' you had the MultiParadigmDatabase. This so-claimed "feature" is easily beaten by focusing on constraints-spec-languages that provide expressive power and type-flexibility and/or runtime speed and optimization ''at the cost of temporal flexibility''.

''The use of constraints goes part of the way towards strong typing, at least if the constraints language is sufficiently powerful. One then needs well-typed operators if these types are to have any real meaning.''

The query language operators would only need to address 3 basic types: text, number, and date/time to fit usual query conventions. (Date/time perhaps can be omitted as a formal type.) I am not a fan of operator overloading, I would note. If you are thinking of user-defined types, that is not the goal of MPD, and they are not available in most existing RDBMS anyhow. It is not a feature I'd miss, and it conflicts with the goal of being app-language neutral.

* Any real 'MultiParadigmDatabase' is definitely going to need user-defined types. A great many entities cannot naturally be represented without them, and lack of complex value-types prevents representation of or operation over the vast majority of real data (it forces either unnatural simplification or even far more artificial representation as trees).
* That said, I can see how you'd imagine everything easily resolvable if you narrow your world-view to only allowing numbers and unstructured text.
** ''That may be a query language library issue rather than a database design issue. There is another topic on such somewhere on this wiki.''

I would like to explore specific "difficulty" scenarios. If you are looking for a complex type management system with DAG-based lookup and substitution, and that is required for you before you qualify it as "multiparadigm", then I suppose it may fail. Nothing will make everybody happy. The above suggestion leans toward a "lite" typing viewpoint, which perhaps could be seen as a bias. But my experience is that type-lite tends to allow more inter-language usage. Text atoms are more sharable than binary atoms, and this is part of the reason why HTML and other web standards have taken off. Type-heavy tends to bind/limit one to specific languages, or becomes a language itself; and reinventing an app language is generally beyond the goal of a database.

''I'm a firm believer in the idea that DBMS features ought to just be pulled into generic programming language runtimes: transparent optional persistence, multi-cell transactions, ACI(D?) transaction semantics, translucent distribution and replication, etc. If one wishes an explicit data-service used by a great many processes, it can be provided as a service written in the language atop these built-in features. If other languages wish to interface with your DBMS, they can do so much as they do today: via communications ports in a common text-language, through ForeignFunctionInterface''''''s with the OperatingSystem, or perhaps via ForeignFunctionInterface''''''s in a library. You cannot truly escape such language issues.
Your avoidance of specific languages in the implementation of DBMS and DBs is probably wise in the short-term, but long-term the world would be better off with DBMS support simply being standard, allowing for databases of true complex data and objects.''

''Anyhow, when you said you described as an example type: "a pattern representing a positive integer", my first thought was: "okay, so this person means that any text-pattern is a valid type. This means I could write an XML-schema as a text-pattern for column X, and write a C++ grammar for column Y, and (etc.)". If you were thinking this, it'd at least be halfway compatible with true semantic types (because you could guarantee that your decoder/parser always will run without errors - the 'only' (rather large) problem from there would be writing any sort of query that leverages the value-data within these fields ("find me all rows that have column Y specify function just_do_it(int,int,double)")). However, obviously you weren't thinking this.''

''As for difficult scenarios: back to my favorite. Knowledge Representation in a Learning system, with an ontology (vocabulary, set of words or meanings or 'entity-classes') that increases during runtime.''

(replies below)

------

Re: ''I discussed several of its nice, flexible features, but added that I prefer strongly typed systems. To this, you essentially replied: "strong typing can be optionally applied". Sure. Maybe so. But if I do so, what happened to my flexibility? What happens to the features that led me to even somewhat consider MultiParadigmDatabase in the first place?... One should NOT answer or even imply that I can simply add strong types via constraints and somehow keep the properties that make MultiParadigmDatabase tempting in the first place''

"Types" are generally orthogonal to paradigms. In my opinion "types" and flexibility are at odds anyhow. Set theory is a more powerful classification and better fits real changes than DAGs, in my observation. A flexible classification system would thus use sets, not types. "Flexible types" is like a "strait jacket that provides freedom of movement". (Some define "types" via sets, in which case we may just be talking about different ways to say or present the same thing.)

* Types indeed provide constraints. But a "flexible type" system isn't fixed in the form of "a strait jacket". It isn't flexible unless it can take the form you require of it. And I consider 'types' at a far higher level than DAGs or even sets, though you're free to use a set to represent a type (or vice versa; set-comprehension is often defined in terms of types). Avoiding a discussion on WhatAreTypes, assume I'm using the word in this context for flexible classification of immutable value-objects... similar to or representable as sets (placement in a set is an extrinsic property one can use to describe a type). One might add mutable objects if the types or constraints are specified temporally (e.g. specify that the database is only in a consistent state if integers identified with XYZ only increase or remain the same in any delta-time).
* ''I don't consider sets and types the same thing. Sets are a broader concept than "types". Types are not about general classification in the minds of most people, but are context-specific. A classification does not need to assume context because a decent classification system can handle multiple classifications.
(I suppose we are drifting back to the ever-looming definition of "types" issue.)''

* Sets are not a broader concept than types, though they are a different concept. There are many type-systems that abandon the notion of 'most-specific-type' for objects and values. In these systems, program concepts like variables (and protocols, and procedures) can have types (describing their assumed and necessary properties for interfacing with them) but you cannot ask for the 'typeof' a value or object directly (the most you can ask is: is this value compatible with that type? can you prove that no typing constraints are violated? if not, can you prove that any typing constraints are violated?). You, I think, haven't been much exposed to the broader possibilities for type systems. I've seen languages where types are represented by arbitrary predicate-functions (like 'odd?' or 'prime?' or 'abstract-C++-syntax-tree?'). These are isomorphic to computable set-membership.
* ''We had this discussion already with regard to ColdFusion's "isNumeric" kind of operations where scalars have no side flag. I don't consider that "types", but parsing or validation. The flag model makes a fairly clear line between parsing/validation and "types". Yours does not (or is so wide that it covers everything, making it nearly useless).''
* Yes. We had this discussion already, so let's not have it again here. Your mind is obviously quite inflexible when it comes to any sort of "I consider" or "I don't consider" statements, and you'll continue to spout completely unsubstantiated claims or even outright falsehoods as facts no matter what I do (including this very recent: "the flag model makes a clear line between parsing/validation and types". Unsubstantiated: your implicit assumption that there SHOULD be a clear line between the two. Falsehood: Flag model MAKES a clear line - you quite clearly backed straight into Parsing when you made your one and only attempt to figure out where those flags come from in the first place.) If other people want to hear the whole of this argument, they can suffer through it on 'TypesAreSideFlags'.
* ''I found YOU the inflexible one. Your evidence is purely authoritarian. As far as an alleged logical flaw I've made, please point it out.''
* You act the damn child. If you aren't willing to accept a little education because you think it 'authoritarian', go read some very thick books on your own and come back when you understand them. I've already pointed out plenty of logical flaws to you back on the other page, and yet again just above (see the 'Unsubstantiated:' and 'Falsehood:' with descriptions? Those are also ''pointing flaws out''). I can't help it that you close your eyes whenever I point.
* You mistake personal head notions and tradition for "logic".

''Now, as to whether I want the impossible? of course I do.''

I am selling vacuum cleaners, not genies.

* ''Hmmm. So you're saying that MultiParadigmDatabase... sucks? :)''
* If "lacking supernatural ability" means "sucks" to you, then I guess so. If you invent magic, I'll be the first to invest.
* ''My vacuum cleaner sucks, but I don't believe it has any supernatural abilities. You said you're selling vacuum cleaners.... How did this pun get past you?''
* Guess I missed it.

''As to whether it is "hard to beat incremental and optional tightening of conditions/types/constraints": I disagree. Incremental loosening or tightening of constraints is a bad idea to start with due to problems of coupling.
Supposing you already have a user-base, any tightening of constraints will break applications... and you'll be fighting the data already extant within the database. Similarly, any loosening of constraints can allow addition of data that violates assumptions made by those applications that access your database, which can also break applications.''

As a project gets further along in testing and construction, one can crank up the constraints. For example, when the project passes the first test, add the constraints and then deal with the constraint issues that may come up. Another possible feature is a '''constraint monitor'''. This would simply log problems (deviations from expected) rather than trigger formal errors. One can then investigate the problems without shutting down a running system, before making the constraints formal.

* ''Use in the development phase I can see. Unfortunately, "as the project gets further along in construction" you'll probably be finding yourself needing to do more than merely "crank up constraints". E.g. deciding that a single entity needs to be split into two entities with a one->many relationship. With these sorts of refactorings already in that development-phase mix, the benefits from the incremental modification of constraints are severely diminished. You'll break things either way.''
** I disagree, but until it is tried in practice, we won't know for sure.
** ''What? you've never redesigned a schema or refactored entities and relationships while in development? I've done it myself, and I've seen it done by others. I feel quite confident that both myself and those others would suffer the same problems even if we represented entities in a MultiParadigmDatabase. Perhaps you're perfect and you can get everything BUT the type-constraints right on the first try? Uh huh, yeah. Even if so, it won't help everyone else.''
** Testing can be done either way. I'm not sure what your point is.
** ''Can you '''please''' start re-reading when you've forgotten the conversational context? Above with the pun and now this? YES, testing will need to be done either way. THE POINT is that you GAIN VERY LITTLE AT ALL from simply being able to incrementally modify constraints on the database. THE REASON is that CHANGES you need to make WILL OFTEN BE STRUCTURAL, which INVALIDATES INCREMENTAL CHANGES TO CONSTRAINTS anyway. Can you even remember what you said to the contrary? oh, yes: "until it is tried in practice". Well, I'll tell you something: the need to make structural changes is ALREADY EXTANT IN PRACTICE, and nothing in MultiParadigmDatabase reduces the need for it.''
** What do you mean "structural"? You're making up random stuff. Suppose there is an ID value that is supposed to be only digits and the client input screen allegedly validates it as digits, but we've never enforced it at the DB level. If we turn on such validation, that is hardly "structural". Or a column that has always been populated in the past (we can check), but has yet to be set to "required" in the DB. Informal rules slowly become formal rules.
** ''What I mean by "structural" is equivalent to changes in the schema in an RDB, but in an MPDB it would mean restructuring the data itself... a didactic example being, for instance, switching from: {entity: X, name: Bob, child1: Joan, child2: Billy} to {entity: X, name: Bob, parent: Anne}. This might happen even if it is guaranteed that 'entity X' things have at most two children.
Another example would be splitting 'entity X' into two entities: 'entity X' and 'entity Y', where 'X' is some smaller subset of the original properties. These sorts of changes in how the data itself is structured are likely to force a partial (possibly substantial) rewrite of the constraints specifications (if they exist). They are also just as common a target of change in early development as any set of constraints.''

** ''Now, turning this around on you: Suppose there is an ID column specified nowhere but your head as "supposed to be only digits". After you build a large userbase, what are your chances of being able to simply 'turn on' this new intended constraint? Answer: Quite slim - chances are that many IDs have been represented as arbitrary strings. Vice versa: Suppose there is an ID column that was clearly specified as being 'only-integers'. Later, you decide you want text IDs, too, so you loosen the constraint and add a few text IDs. What are the chances at least one application using the DB is optimized for representing IDs as integers, internally, and will now be broken because the database carries strings in the ID field? Answer: Fairly high, especially if you're not the primary person controlling the client-application software.''
** Both of the above are going to be issues of at least similar magnitude in an up-front-tight DB also. I never claimed MPDB would simplify *every* scenario. I also agree it may delay inevitable pain to favor the short-term. But at least it gives one such options. If a manager/customer favors RAD over longer-term costs, it is not our job to set their priorities (only to make sure they understand the trade-offs).
** ''Oh, I agree that this is also an issue in an up-front-tight DB. However, a good "up-front-tight DB" won't be paying for this 'incremental constraint' feature you promoted that is, essentially, rendered 'mostly useless' by the realities of the development cycle. As far as: "but at least it gives one such options" - you seem to think you can buy a feature, even an ''optional'' feature, for free. You're wrong; even offering "such options" has costs - development costs, feature costs (e.g. loss of certain optimizations one might have otherwise been able to perform, greater 'code-bloat', inability to include another option that is subtly incompatible), complexity costs: maintenance costs, design costs, etc. Whether you know it or not, your design is paying for these 'mostly-useless' options. A more experienced designer would say: "YouAintGonnaNeedIt! Cut the option. It was worth contemplating, but just because I contemplated it doesn't mean it's going into the final design. Having that option makes features 'Fast' and 'Small' harder to obtain even before I start looking at how it impacts everything else."''
** We'll have to AgreeToDisagree on this. I've been in situations where such dynamism would have been nice. BigDesignUpFront is a nice luxury not always available for technical, political, or situational reasons. In organic situations, an organic approach is often the best match.
** ''I agree that organic approaches are sometimes better. Don't confuse the immediate issue with BigDesignUpFront. The feature you're advocating is '''mostly useless EVEN IN ORGANIC SITUATIONS'''. With an "up-front-tight DB", you'll handle structure and type changes by simply creating a slightly modified schema, and writing a few manipulations to translate the data from the old structure to the new one.
And, with the 'incremental constraint' feature, you'll quite often end up doing ''the exact same thing''. Thus the "mostly useless" name. Now, if it cost ''nothing'', I'd agree that it's also ''harmless''. But nothing in this world is free, including options.''

* ''The logging idea and constraint monitoring are a good idea if your applications can handle a lack of AcID guarantees, either without buggy behavior or because the buggy behavior is acceptable.''
** Switching ACID off is often not a show-stopper (except maybe at a bank, which probably should use more formal approaches up front anyhow). ACID is just added insurance.
** ''I agree. Sometimes buggy behavior is acceptable. However, ACID is very ''nice'' insurance to have... it simplifies a great many problems.''

''but long-term the world would be better off with DBMS support simply being standard, allowing for databases of true complex data and objects.''

"Complex data" meaning DAG-based classification systems? I'll pass. Shag the DAG.

* ''Who told you that types mean DAG or hierarchical classification? I certainly did not. DAGs aren't even a primary representation of types (the main representations being sets, categories, and predicates, of which I prefer the latter). I'm tired of arguing about silly typing ideas (TypesAreSideFlags not too long ago).''
* See above.

''As for difficult scenarios: back to my favorite. Knowledge Representation in a Learning system, with an ontology (vocabulary, set of words or meanings or 'entity-classes') that increases during runtime.''

You mean BeliefDatabaseExample-like gizmos? I'm sure there's a set-oriented way to do it if there is a decent DAG way. Sets are a superset of DAGs.

* ''The challenge here doesn't '''just''' call for types. It calls for both strong complex types and a great deal of runtime flexibility. It's telling that most existing representations are quite weak, like RDF, relying upon an extrinsic ontology (not part of the system) to give them meaning and constraint. And yes, this is similar to the BeliefDatabaseExample-like-gizmos; knowledge and belief are strongly related (if you know that you know something, you believe it).''
* I think you just personally like types, not that they are necessary for this. TuringEquivalency could probably be established between a DB-centric approach and a type-heavy approach anyhow.
* ''The types for knowledge representation are more necessary for creating reasonable and efficient operators over complex values stored in a database than they are for constraint-proofs, though they do serve a dual purpose. I do personally like types, too. And there is no fundamental gulf between type-heavy and DB-centric (you can have a type-heavy AND DB-centric approach... you just need to toss SQL aside).''
* Perhaps. I never said SQL was the end-all-be-all, and it may need additions, such as traversal operators, to work well in some niches.
** Sure. But among the additions you will need are complex value/data types and support for describing operations over them.

''And TuringEquivalency seems to be your version of 'Abracadabra'... I can almost see you waving your hands again as you type the words; sure, one can make do with some form of TuringTarpit''''''D''''''ataManipulationAndQueryLanguage if one really absolutely needed to do so, but that doesn't mean the domain itself doesn't "call for" something higher level.
One ought to be able to directly express what is desired from a query, and "making it work" also includes "make it correct" and "make it fast" - things that become remarkably difficult to do while wallowing in a TuringTarpit. Keep in mind that one TuringTarpit can drown a lot of ModernDinosaur''''''s.''

You have not identified any inherent tarpit. If you can make your own custom knowledge-base system, then one could in fairness make a custom relational version to compete. You are comparing roll-your-own to off-the-shelf, which is not a fair comparison. If you roll, the other side can also roll.

''I don't need to identify an inherent tarpit. I need only tell you to shut up about 'TuringEquivalency' because it doesn't prove anything at all - not when you can create a D''''''ataManipulationAndQueryLanguage based on something like BrainfuckLanguage and still have your silly excuse for 'TuringEquivalency'. This is a truth: Turing equivalent or not, a good query language for a domain must both be expressive over the entities in that domain and allow for optimizations (preferably at a high level). And I don't mind if both sides 'roll'; I wasn't aware that there were 'sides' in this thing at all. I'm only saying that, to 'roll', you will need types to support operations over complex values. Why? Because the domain requires representation of complex values that are not, by themselves, meaningful 'data'.''

Bullshit! It only "requires" it because you are used to thinking about it in that way. You are mistaking your personal mental model for reality. When you make your "types" flexible, query-able, and meta-able, you will have '''invented a database without even realizing it'''. Interpreters and compilers are databases of sorts; it's just that language-centric thinking tends to hide them away, forcing the reader to think in syntax instead of data structures. "Complex types" are turned into nothing but look-up tables, trees/dags, and ID numbers. I would rather approach such "database building" knowing that we are building a database instead of backing into it accidentally while thinking in terms of syntax. When you know where you are going, it's easier to plan. (Another advantage of a DB is that the presentation is controllable. I'm not stuck with your ugly Erlang conventions or what-not because I can sift, sort, and present the info myself any damned way I please. Now that is high abstraction in my book. I can Frank my Sinatra.)

''Values '''are not data''' unless they represent a proposition. You '''don't have information''' simply by possessing a value, therefore you cannot "sift, sort, or present '''the info'''" to anyone. You cannot, no matter how you work with it, gain truths from a collection of values that never represented propositional truth in the first place. This seems to be a fundamental gap in your own understanding of computation. Go fix it.''

Any digital informational construct you can envision can be (re)represented as a data structure. Thus, you are technically not correct. Any differences will be about machine efficiency and/or human relatability.

''Your logic is flawed - non sequitur from statement to conclusion. It is true that digital information constructs are represented as data-structures. However, that does not imply that every data-structure is a digital information construct. The '''information''' constraint is the relevant one here, not efficiency and human comprehension.''

You lost me. Please restate. Constraints can be put on data structures also.
''Not all values are information... 'data' in the true sense of the word. Example: Just having the value [7,23,42,108] doesn't tell you anything about anyone, anyplace, anywhere, anywhen. It is not a ''datum'', and it is not ''information''. Not, at least, by itself. To be data, this value needs to represent a proposition... e.g. "the numbers on the bus routes that drive by my apartment" or "auspicious numbers in various cultural media". This is fundamental. Which proposition the value represents might be implicit in the context in which you find the value (e.g. you ask which bus routes stop near my apartment, I give [7,23,42,108] as answer, implicitly representing "bus-routes 7, 23, 42, and 108 have stops near my apartment"). Now, DatabaseIsRepresenterOfFacts - they are intended to represent propositions (sentences that are ''true''). You can force a database to represent complex values by breaking it down: parts of value X are Y and Z, parts of value Y are A and B, etc. However, X is not information. Y and Z are not information. A and B are not information. None of them have any meaning... absolutely none at all. Thus, no matter how you use the database queries on a value, no matter how you sift or sort or join or decompose, you'll never, never, never learn anything new. No '''information''' was ever there to begin with.''

That there is context that gives something meaning is understood, I thought. In MPDB a value naturally has a column name and the record that it is part of as the minimum context, and of course one adds domain-specific contexts/relationships via references (keys).

''Values, by themselves, don't inherently possess any extrinsic context... which is why they don't provide information. In a database, values are given meaning only by their placement within a proposition. One can represent complex values in a database, but this never serves any purpose beyond mere representation (since values by themselves do not have meaning), and doing so comes at a cost of artificial complexity, a reduction in expressiveness, and inefficiency. You know you've gone wrong (or just ran out of options due to a sucky DBMS) when you have any table that essentially is representing 'is a value', or if you have tuples whose whole existence is dedicated to: 'is part of this other value'. Going for support of complex types is truly the 'simple as possible but no simpler' approach. Pay once up front or pay forever out the back.''

''Imagine the horrors you'd deal with if a database could represent only integers, and representing a string of 50 characters required that you add 50 tuples to a table, and getting a table of names required deep recursive queries of arbitrary depth, and simply querying an entity by string identifier required deep query traversals first on the strings-table. That's essentially the hell you send people to every time you damn complex value types. Stop doing it!''

One could likely put "view wrappers" around such a highly "atomized" DB, if one did exist, so that users are not usually dealing with stuff at such a low level. Going to the sub-value level like that is an extreme case. As far as performance, you are right that a fully atomized DB would probably be unpleasant. In practice there is usually a compromise between fully atomized and fully hard-grouped. MPDB is more atomized than an RDBMS, but one could get even more atomized, such as a "single-value graph" (a single value with multiple potential pointers), but still be at the value level.
In general though, the more flexibility or varied requirements (less "structured") you need, the lower you go on the atomic level.

''In an ideal database, you'd '''never''' have a tuple dedicated to describing a sub-value. Those are pointless on a fundamental, philosophical level. Beyond that, the reasons include: (1) you can't learn much at all from a table that essentially carries data of the sort 'this-is-part-of-a-value-used-elsewhere-in-database', (2) what little you can learn is the sort of security-violating data-mining that should be avoided anyway.''

* "Sub" is relative.
* ''I'm rather curious as to your point. Mine is that each field ought to carry exactly one, complete value... and that each 'row' ought to carry exactly one, complete fact. As such, there should be no tuples dedicated to carrying parts of a value. (Similarly, there should be no tuples dedicated to simply carrying a named value - not unless naming the value '''directly''' reflects or projects a fact about the world.)''
* Many items have a clear value level, but things like Social Security Numbers, phone numbers, addresses, etc. often do not.
* ''Addresses, phone numbers, and SSNs are all extrinsically specified identifiers for a single object or entity. For these, the situation is quite clear: the full identifier is a single value. This is the only approach that actually works between different nations and across all domains. If you need information on how some addresses relate to others, create appropriate relations in the database (e.g. identifying that "12345 College Drive JOCO, KS" is in "JOCO, KS"). That is what this normalization form would call for.''
* Often that is not practical. In many places, if they want to index on, say, street name alone for quick searches, some parsing or retyping has to be done to isolate the street name from the fuller address.
* ''You could have a relation between addresses and city/street names, if you wished it, though that wouldn't apply to all sorts of addresses. I'm failing to see how this would be impractical (even without resorting to complex types).''

''However, I understand the concern that complex values allow for storage of complex data ''within'' the value... essentially creating a navigational database. Of course, who are you to be arguing that people shouldn't have the option because they might abuse it? That runs contrary to the philosophies you've described at earlier points in this discussion.''

* Any hierarchies that form from such are "accidental" and not part of the inherent DB rules. Further, the goal of a MultiParadigmDatabase is to fit multiple paradigms, even ones that may make you or me cringe. I see it as being "organic" in that it can bend with the wind. It may be useful for experimental or unsettled applications where both the problems and solutions are organic by their nature. I wouldn't want my banking info on it, but it may be useful for AI research, or simply an app with tricky requirements in which nobody can agree on the direction or tool.
* ''Oops! By 'hierarchical', I actually meant 'navigational'; I've corrected it above. Anyhow, if you want MPDB to "bend with the wind", embrace the option of complex types. It can only give you more flexibility to have the option to choose between complex types and, say, primitive strings. Don't just embrace it; promote it.''
* Complex types tend to lean toward imperative designs though. Nor do I think they are provably "better". It may just be a personal preference, and people choose a model that best fits their personal psychology. But I welcome a demonstration of their power.
* Re: "Complex types tend to lean toward imperative designs" - ''Now '''that''' is an unsubstantiated and completely false comment if I've ever seen one. I take it you haven't actually used many typeful languages? I'll tell you something relevant: the vast majority of them are '''functional programming languages'''. A few of them are even '''pure'''. Complex types are useful for imperative languages too, but they very much do not "tend to lean toward imperative designs".''
* ''And as to your thinking that complex types aren't provably "better": I think you're ignoring the fact that '''strings''' are complex types relative to '''integers''', and that '''integers''' are complex types relative to '''booleans'''. You keep rejecting complex types simply because you aren't comfortable changing the level of abstraction with which you're currently familiar. If you wish a '''demonstration of their power''', simply rewrite a database schema you've already used, but '''without any strings'''. Represent every single string you're currently using as a list of integers, represented in a table. Then compare the two. Look at what the difference cost you, and try to find a single thing you gained by doing so. Can you now better '''sort, filter, and join''' parts of strings? Sure, but to what end? None whatsoever - there is no information or data-content in the represented values! Did it cost you efficiency, time, frustration, etc.? Yep. THAT is the difference in power, the proof that they're "better". I face these exact same frustrations and inefficiencies when dealing with the slightly more complex type structures that carry such things as propositions into a database. You're familiar with strings, so you don't think of them as 'complex'; consider that I'm familiar with propositions, and I don't think of them as complex. Nonetheless, they're both "complex types" in the sense that they're each formed of a structure of smaller, more primitive value-components. Give it some thought.'' (A rough sketch of this exercise appears after this list.)
* You are talking about machine efficiency, not human efficiency. I agree it would be faster if the base types were better restrained. However, that has its own downsides. For example, even though ID numbers for a current project I'm working on are currently always integers, I made them strings because the project may someday hook into other known systems that don't use only digits. I didn't want to commit to a narrower type up front. Strings are also easier to interface between different languages and tools, reducing the conversion effort. This sounds like it is shaping up to be yet another battle between strong and weak/no typing. No need to repeat that here; there are already topics such as BenefitsOfDynamicTyping. As far as the "imperative" issue, I should have qualified my original statement a bit more. However, let's table that for now because it is not important.
* ''Machine efficiency, but human frustration, top. Complex types (like strings) are easier for humans to comprehend than artificial representations of those types (like strings represented as lists of numbers). And on the language support front, I agree: strings better support a variety of languages with fewer problems of representation (due to common use of ASCII or UTF-8 or UTF-16), whereas complex types tend to require complex structures. However, I don't believe that supporting a slightly wider variety of languages is worth the price we're paying for it. (Also, please remember that BenefitsOfDynamicTyping is on the dynamic/static axis, not the strong/weak/none axis.)''
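To make the "schema without strings" exercise above concrete, here is a rough sketch of the two forms. The table and column names (''persons'', ''person_name_chars'', and so on) are purely illustrative and not from any schema discussed on this page; the point is only the shape of the queries, not a recommended design.

 -- Ordinary form: the name is a single complex value (a string).
 CREATE TABLE persons (
   personID   INTEGER PRIMARY KEY,
   personName VARCHAR(50)
 );
 SELECT personID FROM persons WHERE personName = 'Smith';

 -- "String-less" form: the same name decomposed into one row per character.
 CREATE TABLE persons2 (personID INTEGER PRIMARY KEY);
 CREATE TABLE person_name_chars (
   personID INTEGER,   -- which person the character belongs to
   pos      INTEGER,   -- position of the character within the name
   charCode INTEGER    -- character code, e.g. ASCII/Unicode code point
 );
 -- Finding 'Smith' now takes one join per character, plus a check that nothing follows:
 SELECT p.personID
 FROM persons2 p
 JOIN person_name_chars c1 ON c1.personID = p.personID AND c1.pos = 1 AND c1.charCode = 83   -- 'S'
 JOIN person_name_chars c2 ON c2.personID = p.personID AND c2.pos = 2 AND c2.charCode = 109  -- 'm'
 JOIN person_name_chars c3 ON c3.personID = p.personID AND c3.pos = 3 AND c3.charCode = 105  -- 'i'
 JOIN person_name_chars c4 ON c4.personID = p.personID AND c4.pos = 4 AND c4.charCode = 116  -- 't'
 JOIN person_name_chars c5 ON c5.personID = p.personID AND c5.pos = 5 AND c5.charCode = 104  -- 'h'
 WHERE NOT EXISTS (SELECT 1 FROM person_name_chars cx
                   WHERE cx.personID = p.personID AND cx.pos > 5);

Sorting by name, prefix matches (the equivalent of LIKE 'Smi%'), and collation all get worse still; that gap is roughly the kind of frustration the string-less exercise is meant to expose.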
''What you might wish to do, however, is encourage some extreme normalization forms - essentially: '''exactly one''' fact per row per table. That is, you'd reject a user-entity table that had user-id AND name AND ssn AND birthday; instead, for this same information, you'd have an autonum 'IS-USER', with separate tables for 'USER-NAME', 'USER-SSN', 'USER-BDAY' - a whole four tables for the same information. Then you'd '''let the DBMS optimize the schema'''. E.g. the DBMS is perfectly capable of making the decision to transparently 'join' the IS-USER, USER-SSN, USER-NAME, and USER-BDAY tables for space- and time-efficiency purposes. With this philosophy, you'd be able to more easily spot places that can be further normalized... e.g. if someone used a collection-value to indicate a relationship to each thing in that collection, you could say: "hell, no!" and point them at the extreme-normalization form and remind them that the DBMS can optimize.''

''Of course, you'd also need to make sure that the DBMS '''can''' optimize. You could even offer it 'suggestions' on how to do so if you don't feel comfortable with fully automated optimizations.''

Most DBs tend to force a decision between row-wise and column-wise grouping of items. I wonder about DBs where it can be both or neither. Rows (and classes) are a human-friendly construct, but maybe not necessary for the problem domain. After all, the human brain doesn't use rows. Then again, it is highly fallible. Maybe there is an inherent trade-off between "clean" and "flexible", and the row-vs-column decision is a reflection of this trade-off. It may just be naturally hard to do "theorem proving" on an organic system, because organics don't care much if they commit a violation of things like OnceAndOnlyOnce, cleaning it up incrementally and gradually, if at all.

''The human brain represents data within computational structure directly; it doesn't at all separate 'data' from 'process'. Further, human brains are very much not human-friendly; brain surgeons have difficulty looking at that grey matter and understanding what's in there, and even we (the users of the brain) cannot recall or process (or otherwise remember) data stored within it without a trigger. The information in our brains is almost impossible to port to other processes (i.e. to other brains), and completely fails when it comes to concurrent usage or ACID semantics in an enterprise environment. I wouldn't want to store my banking information solely in a brain... not even my own. I suggest we avoid "the brain" as a model for data storage, or just reference it in terms of "what not to do" if we wish for portable, durable, and human-friendly access to data.''

''Nature, after all, had to make do with what it could construct by chance alone, and took millions of years (and trillions of failures) to get something more complex than a flatworm. :)''

''Organic flexibility comes from emergent properties of millions of tiny things, each of which can be super-complex (e.g. possessing 3.2 GB worth of DNA and often being specialized to use just part of that). We can do the same thing, but we seriously need to find a more efficient approach.''

''Anyhow, back to your point: The extreme normalization form I mentioned is extremely flexible...
allowing you to always tag more information to any particular entity (a bigger 'row'). Further, it is extremely clean logically; it corresponds exactly to predicates as you'd find in (say) Prolog relations, missing only the infinite sets that can be specified in the form of predicates. Instead of trading off between "clean" and "flexible", this normalization form buys both and pays in "efficiency". Fortunately, we know of ways to buy efficiency back - a process called "optimization". Optimization costs energy and time, but that's also what you gain back after the optimization... making it a fine investment.''

As stated above, I am skeptical without seeing actual examples. But it should be noted that an MPDBMS does not prevent such a construct. That's why it may make a good experimental tool.

''I didn't say MPDBMS prevents a construction of this normalization form... though I shall suggest that it DOES make considerably more difficult the optimization (performing automatic joins and such) due to the difficulty in describing relations. If you want the best place for examples regarding the 'clean-ness' and 'flexibility' (and weaker 'speed') of the purist approach, I'd suggest playing with logic languages like Prolog for a bit - you can skip the computed predicates and stick with table-based predicates if you wish a direct correspondence with this extreme form of normalized relational. I'm not sure it's something you could appreciate given a static view of the normalization form (you'd be focusing on all sorts of optimizations by joining tables, which I suggest ought to be performed by the DBMS... possibly with suggestions from an expert).''

''One thing I do like about extreme normalization: never a need for a NULL... the only thing that comes close is the need to represent an 'empty' value (the fact that nothing is there).''

[Am I correct in inferring that the "extreme normalization" you're referring to is the sixth normal form described in (for example) http://www.dcs.warwick.ac.uk/~hugh/TTM/Missing-info-without-nulls.pdf ?]

''I vaguely recall reading that, or something like it, a long, long time ago. Anyhow, that seems similar to the extreme normalization, though it seems to be a bit of an earlier paper on the subject. To fully match extreme normalization, one would need to acknowledge that even 'CALLED:(ID,NAME)' might someday have NULLs in it, and break off the user entirely (to 'IS_EMPLOYEE:(ID)', 'CALLED:(ID,NAME)'). Horizontal decomposition might also be rejected in favor of variant-typed data (e.g. changing from JOBID: String to JOBID: 'Maybe String', with Haskell-style 'Just String | Nothing', where 'Nothing' represents 'no job'). However, that could go either way; the horizontal decomposition isn't really part of the extreme normalization I was describing, but isn't contradictory to it. It was used by Chris Date in the first place to solve the problem of dealing with multiple types in a single column... perhaps he was trying to solve too much at once with 6th Normal Form.''

Not having a row for one of those is not much different from not having a given item in the row-map of MPDBMS. And it avoids repeating the ID over and over in the database. Per stated rules, the two are equivalent (a rough sketch of the comparison follows):
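Roughly, the claimed equivalence can be sketched like this. The attribute names are illustrative (they echo the EMPLOYEE predicates used a bit further down this page), and the row-map notation is just shorthand for an MPDB record:

 Slim-table (6NF-style) form:
   IS_EMPLOYEE:     (123)
   EMPLOYEE_JOB:    (123, "Clerk")
   EMPLOYEE_SALARY: (no row for 123 - nothing is recorded about 123's salary)

 MPDB row-map form:
   {rowID: 123, entity: "employee", job: "Clerk"}
   (no "salary" key - per the stated rules, asking for it simply returns the empty value)

In both cases the absence is implicit rather than marked with an explicit NULL; whether such implicit absences "count" as nulls is argued in the replies below.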
''I understand the use of NULL in MPDBMS well enough; it was well described near the very top of this page. However, I am failing to identify whatever point it is you're attempting to drive home by repeating it down here. Are you trying to say that there are no NULLs in MPDBMS? That can't be right - it looks to me like MPDBMS has [null] for an unbounded or infinite number of columns. Are you attempting to say that MPDBMS can represent entities in their un-normalized form? Why would you think that is insightful? Even relational can do that! Indeed, even Date started with an example that basically uses one big row-map - the very same thing MPDBMS is using.''

I just wanted to make sure it was understood by readers that the linked approach to avoiding explicit nulls is not the only approach.

* ''Oh? So it's okay if you have an infinite number of implicit nulls, so long as none are explicit? Hmmm...''

''One can always go about adding more columns to a big row-map. But I can't help but feel that doing this for the purpose of 'avoiding repeating the ID in the database' is very much a premature optimization (AND is one that could easily be performed 'under the hood' by a good DBMS). Anyhow, for the lesser normalization forms, the reason to normalize is to prevent '''duplication''' of facts, which is relevant for data maintenance issues. The more extreme forms are advocated for other reasons: to avoid NULLs, or for purity in adhering to the idea that 'DatabaseIsRepresenterOfFacts'. Failing these extreme forms detracts '''only''' from the purity of the data representation and introduces many meaningless NULLs. This isn't a computation problem - you can use the one-big-row-map representation to derive all the same facts as you could from the extreme normalization or Date's 6th normal form. It just isn't as 'clean'.''

''Anyhow, I think the context for this discussion on extreme normalization has been lost. If you recall, I suggested use of an extreme normalization form to help discourage the use of complex-value-types to represent multiple facts with a single value. It mostly helps keep the schema creators focused: 'one fact per row', 'one fact per row', 'one fact per row' - that mantra makes it easy to justify splitting up a [collection,of,values] across multiple rows if it was representing multiple facts (as opposed to exactly one fact that needed a value collection). Further, it helps do so '''despite the temptation''' to embrace the exact sort of PrematureOptimization you very recently advocated: why duplicate the ID a bunch? Why not just use a [collection,of,values]? This is a very human issue, one of socially engineering the acceptance of purity above that of premature optimization. '''Let the DBMS optimize''' is the paired mantra, which must be supported in truth (via good optimizing DBMSs, to which you can make all the same suggestions you were tempted to inflict by hand upon the 'pure' schema). I suggest you'd need to teach both mantras on the same principle by which we currently teach normalization forms.''

It is not premature optimization if there is no identifiable reason to denormalize into slim tables. It may simplify some queries at the expense of others. For example, queries that use multiple attributes of a single (normal) row are longer and probably slower under the slim-table approach. Ex: WHERE hireDate > 01-apr-2005 and salary < 50000. The slim-table approach probably requires a join or "in"; a sketch of both forms follows.
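To make that trade-off concrete, here is a rough sketch of the same filter in both styles. The schema is hypothetical; the slim-table names follow the IS_EMPLOYEE/ENTITY_NAME/EMPLOYEE_* predicates used in the example just below, with a made-up EMPLOYEE_HIREDATE added for the hire-date fact:

 -- Conventional wide-row form: one table, no joins.
 SELECT empID, hireDate, salary
 FROM   employees
 WHERE  hireDate > DATE '2005-04-01' AND salary < 50000;

 -- Slim-table ("one fact per row") form: each attribute lives in its own table.
 CREATE TABLE IS_EMPLOYEE       (ID INTEGER PRIMARY KEY);
 CREATE TABLE ENTITY_NAME       (ID INTEGER, NAME     VARCHAR(80));
 CREATE TABLE EMPLOYEE_JOB      (ID INTEGER, JOB      VARCHAR(40));
 CREATE TABLE EMPLOYEE_SALARY   (ID INTEGER, SALARY   DECIMAL(10,2));
 CREATE TABLE EMPLOYEE_HIREDATE (ID INTEGER, HIREDATE DATE);

 -- The same filter now needs one join per attribute it touches.
 SELECT e.ID, h.HIREDATE, s.SALARY
 FROM   IS_EMPLOYEE e
 JOIN   EMPLOYEE_HIREDATE h ON h.ID = e.ID
 JOIN   EMPLOYEE_SALARY   s ON s.ID = e.ID
 WHERE  h.HIREDATE > DATE '2005-04-01' AND s.SALARY < 50000;

Whether a DBMS can be made to execute the second form as cheaply as the first is exactly the "let the DBMS optimize" question argued in the replies below.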
* ''Actually, it's a PrematureOptimization because that's the reason you gave for doing it. If the reason you gave was: "there is no identifiable reason to denormalize[sic] into slim tables", that wouldn't have been a PrematureOptimization. Of course, you'd be wrong in the sense that there ARE identifiable reasons to normalize further (those being to avoid NULLs (Date) or to have greater purity in the representation of facts (like the Prolog guys)). "There is no reason acceptable by top" would be more accurate. And I'm not disagreeing that the slim-table approach would probably be slower or that you'd need joins. That's why you need to '''let the DBMS optimize'''... and '''fix''' the DBMS so it actually does optimize. As far as queries being longer: that really depends on the query language. If it was known that a database is to be fully normalized, it would be a bit easier to formulate queries more like browsing a Prolog statement. A simple example follows (eschewing, for a moment, the conditions to handle the NULLs):''

 QUERY(ID,NAME,JOB,SALARY) :- IS_EMPLOYEE(ID),ENTITY_NAME(ID,NAME),EMPLOYEE_JOB(ID,JOB),EMPLOYEE_SALARY(ID,SALARY)

* I generally consider the least repetition to be the default unless requirements lean toward other approaches. Then again, row-wise and column-wise can potentially just be different views on the same thing, although I haven't figured out how to make an implementation that doesn't favor row-wise performance over column-wise or vice versa.
* ''Performance-wise, compromise. Keeping in mind that the values in these rows and columns must ultimately be stored in memory (extending that concept beyond just RAM), even small things like whether you group the components of one row vs. group the components of one column can make a difference in performance. In between, you can have blocks of rows and/or blocks of columns - these being good both for the task of compromising between row and column optimization AND as 'blocks' useful for distribution (in distributed databases). Personally, I tend to prefer complete access to a row once I've identified it, so I index rows within columns. However, for me it's always: correctness is king, speed is queen. I look awfully hard for ways to have both.''

''As an aside to TopMind: what "not that different" actually means is that "there's a '''small''' difference that might or might not be '''extremely significant'''". Exempli gratia: a parachute pack with a rip cord is "not that different" from a parachute pack without a rip cord... weight, size, area of fabric, ability to support your fall, etc.; a lot of things are exactly the same. Both are even usable - it's possible to open a parachute without the rip cord. However, that rip cord makes the parachute a damn bit more usable - it reduces the average time it takes to open a parachute from several seconds to under one second, which improves safety a great deal. I bring this up because you recently used this "not that different" as part of your rhetoric, and (to my own vague recollection) you've done so quite often in the past. Doing so implies that you're failing to actually recognize the differences or comprehend their significance (or lack thereof). When you follow it by repeating yourself (rather than asking for clarification), it implies you aren't even ''interested'' in understanding or comprehension. Here is a better argument form: "idea <1> and idea <2> are only different on point <X>, which is (not?) significant because <... your reason here ...>". Please consider using it. (DeleteWhenRead)''

It sounds like you want others to pre-argue your cases for you. That's asking too much. If I identify a "problem" with the difference, I will mention it.

''Hahahaha! I've been arguing my cases all along.
That would be why my posts are often 4-5 times the size of yours. Now I'm demanding the same of you. That isn't asking too much. If your reasoning is so damn awful that stating it is like others pre-arguing their cases against you, then perhaps you ought to work on your reasoning.''

Your posts are longer because you meander and rely too much on English when examples or code samples may be more useful. It is hard to tell whether you are answering the question directly or trying to plug "in case" holes. The relevance of each paragraph to the immediate question is not clear to me. Another technique is to restate how you interpreted the other person's statement and then address that. That way you don't have to cover alternative interpretations that aren't actually needed.

''I'll believe what you just said if I stop having to put (NEEDS PROOF) in the middle of half your claims and you're still writing a fifth as much as I do (and still making claims).''

You mean like: "There is an objective definition of 'classification' (NEEDS PROOF)".

(moved discussion to MostHolyWarsTiedToPsychology)

------

'''Type-Heavy Must Be Given Up'''

I am skeptical that a "type-heavy" database can be made sufficiently language- and paradigm-neutral to be called a "multi-paradigm database". Type-heaviness probably has to be sacrificed. The success of web protocols has proved the cross-platform value of the PowerOfPlainText. Personally, I like light typing, so I wouldn't miss it, but I realize that other people or specialties may prefer heavier typing. But I welcome demonstrations of attempts at a type-heavy MPDB.

''To the contrary, such things as XML and HTML and JavaScript (structured text, not plain text) are what have proven to have cross-platform value, and these things are type-heavy within their small domain - they require strict structuring, possess schemas and standards, etc. A great deal of their value comes from the ability to automatically verify that they are correctly formed and meaningful, and to simply 'know' that they're meaningful to others across domains and platforms due to adherence to a standard structure. Those and the vast majority of web protocols gain success from their strict definition and standardization - features that allow programmers across domains and platforms to support the same protocols and guarantee they are implemented correctly. These things are natural to strict typing. Some protocols wrap other protocols, much like abstract types wrap other types (i.e. other standardized protocols or plain text) as required - a flexibility that has further expanded the success of web protocols but that is no less 'typed' for doing so. Now you state that "light typing", whatever that means, is to your preference. However, you've offered nothing at all to support your thesis that type-heaviness must be sacrificed. Burden of proof for your thesis is very clearly on you, top. And if you truly "welcome" demonstrations of type-heavy database structures, you should bother doing research and self-education on existing type-heavy database designs.''

XML is "typed"? I am not sure I agree with that. Nor do I agree that its (limited) success is because of "strict definition and standardization". People study the vendor's actual output and write extractors that fit it, for the most part. They are often not concerned with heavy schema validation (which is not necessarily "typing" anyhow). Similar constraint checkers could be put on top of the kind of MPDB mentioned here anyhow, as mentioned earlier.
If you wish to call that "type checking", be my guest. I don't want to argue that definition anymore, although the wideness of your usage makes communication difficult because it puts the "type" umbrella over validation and constraint management, which some readers may consider outside of "types". If you wish to continue to use such a wide definition, it may be helpful to create a taxonomy of "kinds" of types to produce clearer communication.

''XML is "typed" the moment you run it through a schema or DTD. And, despite your intention to put on blinders and think otherwise, these forms of constraints and validation do fall under the "type" umbrella. This isn't especially "wide" usage, either; it follows naturally from all other forms of typing. What do you think it means to check whether something is a real number vs. integer vs. unsigned integer? Or is a structure with a particular feature? Type-checking IS a form of validation of constraints - a subset of that whole field (much like birds ARE animals). If you want "kinds" of types, I invite you to read up on some works on actual TypeTheory rather than coming up with half-baked ideas like "TypesAreSideFlags"; you can learn about dependent types (wherein you produce a type based upon the actual arguments to the function), linear types (wherein you validate that variables are used under certain protocols... e.g. "exactly once"), uniqueness types (wherein you validate that a process uses only one instance of that type), predicate types (wherein you validate that a variable will necessarily meet a particular predicate), protocol types (wherein you validate that communications match particular patterns), constraint types (wherein you validate that two or more variables meet a particular predicate when taken together), and more. Give up on YOUR foolish notions of "types" and learn what is actually out there, THEN we can "produce clearer communication".''

I disagree with the "natural" claim (''on what grounds?''). That being said, the above MPDB can have a constraint system/language that is as complex and fancy as one wants to make it (although it may require interfaces to a language of the user's/shop's choice). One may argue that such should be built in, to encourage or enforce a standard way of providing it. But adding such may turn the DB into an app language, which reaches beyond the scope of a typical DB.

''I offered sufficient explanation after the "natural" claim that just saying you disagree with it (without explanation) is somewhat crass. And I haven't stated that the MPDB '''can't''' have a constraint system or language (though I discussed above why it defeats the flexibility of using MPDB). What I've asked is that you explain your thesis that "Type-heaviness probably has to be sacrificed". So, please, tell me why MPDB '''shouldn't''' have a constraint system/language as fancy as I (or any fan of strict, heavy typing) would make it.''

"Natural" is difficult to objectively measure. I consider hidden type-flags, validation, and constraints sufficiently different that I do not roll them up under "types". Your personal world view may differ because of your fondness for Reynolds' work, but you are not the reference being for every other person on the planet (nor am I, for that matter, but I suspect you'd find my view of them the more common among IT practitioners).

''I'll accept that the word "natural" perhaps entails something different for you and me in this context.
To me it means that the operations and computations and descriptions necessary to support typing will '''necessarily''' support constraints and validation the moment the type-system reaches a certain level of complexity, and vice versa. This can be objectively (deductively) determined. Anyhow, regarding your "I consider" comments: only a naive type-theorist would try to roll type-flags up under "types" (as opposed to treating them as an optional implementation detail). And only a noob type-theorist won't have already studied predicate and constraint types enough to know that one cannot describe a computable constraint (with an immutable description) that cannot be used as a type in a type-system. Your personal world view may differ because you're a naive noob in the field of type-theory... a condition that is probably common among IT practitioners.''

As far as building in a fancy constraint system goes, such would probably require including a fairly complex TuringComplete programming language. While many DBMSs do indeed include one, I feel that the DB and such languages should be separate things. I would like to be able to write Oracle triggers with Python or VB or Java or whatnot, not just Oracle's PL/SQL (although I believe Oracle is becoming more Java-friendly of late). The DB does not need to sanction a One True Language. Although perhaps it should have a default implementation for the speed resulting from tighter integration. But this is mostly an implementation consideration. It would also be interesting to draft a declarative constraint system and see how far one could take that. But I bet such would kind of look like the internals of a formal type engine.

''I'd assert on principle that any constraint system should be completely independent from "triggers", largely because "triggers" imply communications that can be extremely difficult to reverse (making 'undo', rollback, and transactional semantics tricky). Declarative, rather than reactive, is the way to go for constraints. Now, a subscription to a query (and changes to it) would be a pretty cool way of handling "triggers" that communicate as one might expect "triggers" to do.''

Perhaps. But that is another subject.

''I am a believer in One True Language for a database. Systems that try to avoid a common-tongue language are most analogous to a broken and shattered TowerOfBabel. I have difficulty finding any logic behind your approach of supporting a Broken Amalgamation Of Languages instead of One True Language.''

* Because people prefer their favorite languages. Perhaps they prefer languages that best model their minds and are more productive under them. I don't want you shoving your preferences down my throat any more than vice versa.
* ''People do have mixed thoughts on what they prefer. People also like elegance and simplicity, and the ability to learn only one language when performing maintenance, which is not something your approach is offering (or can offer).''

''However, if your primary objection to heavy typing is that it would require creating (and sanctioning) a One True Language for describing constraints, I can understand your objection - the same reasoning you provide for focusing on strings and integers: support for the lowest common denominator among client languages that might utilize your Database. I just disagree with it. I don't believe that making all languages second-class is better than making all-but-one language second-class.''

* If you can demonstrate that your alleged "first class" approach is better, be my guest.
* ''I can trivially prove it can't be worse; ensure the first-class language is capable of supporting plugins or a ForeignFunctionInterface, then design the DB to support plugins/FFIs that support other languages... and the result is the same as any other DB that supports many languages. As far as demonstrating it better, I'll make an effort to do so. I've included support for database operations over collections in a programming language I'm designing. But I wouldn't expect results for another ten years or so. Perhaps someone else will get to it before me.''
* Weak typing can improve flexibility because you don't have to be married to a long, cumbersome type DAG chain and can focus on "flatter", simpler interfaces. {''You and your silly "DAG chains"... honestly, stop pretending you know much about type-theory.''} But we are digressing into a classic strong/weak typing debate here. The main reason for separation is to give people the language choice they want.
* ''I'd argue that turing-complete extensible syntax (e.g. a macro and function system) is the way to go for that particular goal (of giving people a language choice). People are stuck using the "language" that is the interface prescribed by the DBMS implementation no matter what they do. Might as well be honest about it.''
* Type-heavy proponents will probably not like ANY database because of a DB's tendency to assume separation of attributes and app-specific behavior. ADT-centric thinking is generally against this. ADT's and DB's generally don't mix well because DB's use attributes as interfaces and ADT's use behavior.
* ''We're talking about type-heavy proponents of databases, here. Recall that many type-heavy proponents are into FunctionalProgramming, not necessarily ObjectOrientedProgramming, and so ADTs aren't important to all of us. Anyhow, should they be decided as necessary, AbstractDataType''''''s and interfaces can be handled via indirection and higher-order DataManipulationLanguage''''''s (those that allow entries to describe queries or manipulations or even table-names, and to utilize them as part of a query; SQL doesn't allow this at the moment - to perform indirection in SQL, you need to send back intermediate results with table-names so new queries can be formed). ADT's really are something your MPDB should support if it is to meet the expectations created by its name, but support for them requires support initially in the DataManipulationLanguage (which then extends to the DBMS requirements and infiltrates the optimizations (e.g. indexing and precaching strategies) in some rather significant manners). We could spend a whole page discussing the requirements implied by ADTs.''
* "Multi-paradigm" is possibly a questionable name, other than to say that it can cater to MANY paradigms/styles but not necessarily all. It's not called "All Paradigm Database", after all. It tends to be attribute-centric, not behavior-centric, and thus will not make some style proponents very happy. It could also be argued that behavior-centric databases are actually "rule-bases" and not ''data''-bases. But I don't want to dive into such a vocab battle today.

''Despite this, feelings one way or another about DB sanctioning of languages are completely insufficient to support your thesis that "Type-Heavy Must Be Given Up." Emotions are pretty low on the EvidenceTotemPole.''

Because not giving it up turns the DB into a programming language, or at least shifts the emphasis toward the features of programming languages.
Again, if you can '''propose a DB model that is considered multiple-paradigm AND type-heavy''' (and not a tangled mess), you can show my allegation to be wrong. However, until your unicorn shows up, my proposal is the most concrete MPDB on C2. I am not making an absolute claim, but rather analysing what has been shown thus far.

* ''Perhaps later I'll start a page on prolog-inspired databases that shows how a single model can be MultiParadigm and type-heavy.''

Further, if, in order to use your MPDB, one has to master a programming language that people do not like, it will not proliferate even if it is the greatest MentalMasturbation toy since Lisp.

''Ideally you shouldn't have to '''master''' a programming language in order to use the database, I'll agree. You '''master''' the language in order to '''master''' the database; there should be some natural graduation of available power, such that it's easy to create tools that perform easy or common activities. However, put it this way: almost nobody likes SQL, but it proliferates away regardless. An MPDB will succeed if it does the job people demand of it either in the absence of equivalent competition, or while providing a competitive advantage (e.g. one competitive advantage is greater speed and optimization, and another is simpler work with tree-structured and collection-structured values, both of which can be achieved much more easily in the presence of static typing or soft typing).''

I'd like to see evidence for the tree and collection claims. The main competitive advantage of MPDB is flexibility and reduced reliance on a DBA, while fitting most of the SQL-influenced idioms people are already familiar with. I haven't seen any idea that is more flexible yet does not require tossing out most existing RDBMS knowledge. There are DB ideas that are ''more'' flexible, yes, but they are too different from established tools. I realize that these optimization goals (flexibility + familiarity) may only matter in some niches; but universality is not necessarily the primary goal. I challenge anybody to find a better fit for flexibility + familiarity. -t

--------

PageAnchor Tuple-Spaces: Between this and the auto-gen row ID, it sounds like what you want is a TupleSpace... plus Join queries.

''Its structure sounds too "fixed" for the needs described here.''

You never did describe any 'needs' here. Regardless, I haven't a clue how you came to believe the structure of 'TupleSpace' is too "fixed". What gives you that impression?

''The meaning of the positions doesn't seem to be well-defined or well-tracked in TupleSpace. The approach I'm proposing uses maps, and every row (map) carries a label for each value (or at least requires a reference that identifies such). I don't see where TupleSpace guarantees the same thing.''

----

The topic DynamicRelational was created in an attempt to split OO-centric characteristics from dynamic characteristics. Perhaps a refactoring of this topic is in order.

----

See Also: ObjectsAreDictionaries, TablesCanBeObjects, SqlFlaws, TupleDefinitionDiscussion, MaspBrainstorming, TableQuantityVersusAppSize, GodTable, MultiParadigmDatabaseDiscussion, MultiParadigmDatabaseQuestions, MultiParadigmDatabaseCriticism.

SeptemberZeroSeven JuneThirteen

-------

CategoryMultiPurpose, CategoryDatabase, CategoryMultiparadigm, CategorySpeculative

----