Critique my MySQL Database Design for Unlimited DYNAMIC Fields

Looking for a scalable, flexible and fast database design for 'Build your own form' style website - e.g Wufoo.

Rules:

  1. User has only 1 Form they can build
  2. User can create their own fields or choose from 'standard' fields
  3. User's 1 Form has as many fields as the user wants
  4. Values can be the sibling of another value E.g A photo value could have name, location, width, height as sibling values

Special Rules:

  1. User can submit their form a maximum of 5 times a day
  2. Value's Date is important
  3. Flexibility to report on values (for single user, across all users, 1 field, many fields) is very important -- data visualization (most will be chronologically based e.g. all photos for July 2009 for all users).

Table "users"

uid

Table "field_user" - assign a field to a users form

fid

uid

weight - int - used to order the fields on the users form

Table "fields"

fid

creator_uid - int - the field 'creator'

label - varchar - e.g. Email

value_type - varchar - used to determine what field in the 'values' table will be filled in (e.g. if 'int' then values of this field will submit data into the values.type_int field - and all other .type_x fields will be NULL).

field_type - varchar - e.g. 'email' - used for special conditions e.g. validation rules

Table "values"

vid

parent_vid

fid

uid

date - date

date_group - int - value 1-5 (user may submit max of 5 forms per day)

type_varchar - varchar

type_text - text

type_int - int

type_float - float

type_bool - bool

type_date - date

type_timestamp - timestamp

I understand that this approach will mean records in the 'Value' table will only have 1 piece of data with other .type_x fields containing NULL's... but from my understanding this design will be the 'fastest' solution (less queries, less join tables)


ANSWERS:


At OSCON yesterday, Josh Berkus gave a good tutorial on DB design, and he spent a good fraction of it mercilessly tearing into such "EAV"il tables; you should be able to find his slides on the OSCON site soon, and eventually the audio recording of his whole tutorial online (the latter will probably take a while).

You'll need a join per attribute (multiple instances of the values table, one per attribute you're fetching or updating) so I don't know what you mean by "less join tables". Joining many instances of the same table isn't a particularly fast operation, and your design makes indices nearly unfeasible and unusable.

At least as a minor improvement use per-type separate tables for your attributes' values (maybe some indexing might be applicable in that case, though with MySQL's limitation to one index per query per table even that's somewhat dubious).


You should really look into schema-free dbs like CouchDB, problems like this are exactly those these types of DBs want to solve.


y'know, create table, alter, add a column, etc are operations you can do at run time in many modern rdbms implementations. Why be EAVil? Especially if you are using dynamic sql.

It's not for the fainthearted. I recall an implementation at Boeing which resulted in 70,000 tables in a database.

Obviously there are pitfalls in dynamic table creation, but they also exist for EAV tables. Things like two attributes for the same key expressing the same fact. Or transitive dependencies and other normalization gotchas. So why not at least leverage the power of the RDBMS on your behalf?


I agree with john owen.

dynamically creating a query from the schema is a small price to pay compared to querying EVA tables. Especially if the tables are large.

Usually table columns are considered an "interface". A design that relies on a dynamically changing interface is usually bad, but EAV data is a special case where you don't have many options. You have to choose between slow unintuitive queries or dynamic schema.



 MORE:


 ? Store dynamic form fields in database
 ? Learning - Extensibility: Dynamic loading, and any other no-recompile software updates
 ? dynamic database columns?
 ? Learning Dynamic Allocation
 ? Dynamic Database table attributes (best practice)
 ? Designing a database for my students
 ? Java database management system to handle nested/tree data
 ? Database design; Dynamically creating tables
 ? Dynamic equivalent of Python's "from <module> import <symbol>"
 ? Dynamic equivalent of Python's "from <module> import <symbol>"