开发数据库模型的不确定性

我正在尝试为候选人,他们的注册考试和考试结果建立数据库模型。这是我到目前为止所做的。但是,不确定从检查表到检查结果表的走线是否正确。为特定候选人的检查结果总体正确编写插入sql代码有多容易

考试类型分为科学,艺术和社会科学。它们每个都有4个组成部分

在此处输入图片说明

PerformanceDBA

进度注释

Given the fact that the Question changes substantially (in clarifying the requirement, not is scope) in response to my Response and TRD, this is going to take some back-and-forth. Let's identify Steps: your Step numbers are odd, starting from 1; mine, in response, are even. Parts of previous Response Steps have become obsolete, they may no longer make sense.

I would suggest a bounty, except for the fact that you have few points.

Response Step 2 to Initial Question & Step 1 Diagram

This is what I've done so far.

You have done some good work, but it is too early for assigning PKs. Besides, assigning an ID on every file as a starting point will cripple the modelling process, the result will not be a database. You have to model the data (not the database) first, then assign Keys when the entities are clear and stable. So drop all your IDs and PKs and model the data, as data. Forget about what you want to do with the data (ie. forget the app).

how easy will it be to right write an insert sql code for examinationresult population for a particular candidate

Right now you can't. You have no relationship between Candidate and Examination[Result]. That is not a problem because the modelling is incomplete at this stage, when it is complete the code will be simple.

The entity Course is implied, but it is missing.

however im unsure if am on the right track especially from the examination table to the examination result table

You are on the right track with some of the other files, but the Examination cluster needs work. This will take a bit of back-and-forth. Once you answer the questions in the comments, we can proceed.

The main issue is this: how is Examination identified.

An ID field does not identify anything, nor does it provide uniqueness in the data, which is required if you want data integrity. IDs result in a Record Filing System with no integrity, however, it appears you want a database with data integrity. Is that correct ?

Go back to the user and discuss how courses and components are identified, what codes they use, etc. Those are the natural Keys that they use to identify their data, that they will enter into the system when they need look something up, or to enter examination results.

  • Eg. It is not reasonable to contemplate an Examination that exists independently (as you have modelled it). People do not go to a hall and sit for any old exam. The exam exists only in the context of a course, they sit for an exam for a course.
  • Then the course, and not the exam, has components, which are examined. And each course has a different number of components.
  • Eg. a Course which is identified as ENG101 for English Literature year 1
  • And then the components within that. Eg. 2b Short essay on poetry.

They may need to identify the year and semester of the course as well, in which case, you need a CourseOffering per semester.

Consider this, as a discussion point. Courier is example data, blue is Key, green is non-key:

Response Step 4

Response to Question & Description

This is what I've done so far.

My previous response still applies:

You have done some good work, but it is too early for assigning PKs. Besides, assigning an ID on every file as a starting point will cripple the modelling process, the result will not be a database. You have to model the data (not the database) first, then assign Keys when the entities are clear and stable. So drop all your IDs and PKs and model the data, as data. Forget about what you want to do with the data (ie. forget the app).

You have not addressed that issue, that I identified in your Step 1 Diagram, in your Step 3 Diagram. It appears, from the evidence, that you might be happy with IDs as "Primary Keys" (there aren't), despite the hindrance having been identified to you. That means your understanding of the data is crippled, and the progress of your diagrams will be slow.

My previous response still applies:

An ID field does not identify anything, nor does it provide uniqueness in the data, which is required if you want data integrity. IDs result in a Record Filing System with no integrity, however, it appears you want a database with data integrity. Is that correct ?

You must answer these questions, otherwise your design cannot proceed. These are severe errors that must be corrected. One cannot build on, or progress, a foundation that contains severe errors.

  1. Could you please confirm, you do want a Relational Database, with the integrity and performance that Relational Databases are capable of, that is easy to code against, as opposed to a Record Filing System, with no integrity or speed, that will be difficult to code against. Correct ?

  2. If [1] is correct. Since ID fields as "Primary Keys" do not provide row uniqueness, which is demanded for a Relational Database, how exactly, do you intend to provide the required row uniqueness ? Alternately, are you happy to have an RFS that is full of duplicate rows (each with an unique record ID) ?

how easy will it be to right write an insert sql code for examinationresult population for a particular candidate

My previous response still applies:

Right now you can't. You have no relationship between Candidate and Examination[Result]. That is not a problem because the modelling is incomplete at this stage, when it is complete the code will be simple.

Ok, in your Step 3 Diagram, you have drawn a line between Candidate file and the ExaminationResult file (as opposed to, inserting a relationship in a database).

  • In a record filing system, sure, you can just draw a line between any two files, insert the relevant ID field, and hey presto, you have "linked" or "connected" or "mapped" the two files.

  • But database design (as opposed to file design) does not progress like that, you cannot just draw a line between any two objects, insert the relevant ID field, and hey presto, create a database relationship. No. There is no basis, no integrity, in the dashed line that you have drawn. Eg. in your Step 3 Diagram, any Candidate can be related to any Examination[Result].

  • That is "normal" or "ordinary" in record filing systems, but in a database, it is something to be recognised and understood as an error, and thus prevented. Because we expect integrity in a database, and because it can be prevented, easily.

however im unsure if am on the right track especially from the examination table to the examination result table

My previous response still applies:

You are on the right track with some of the other files, but the Examination cluster needs work. This will take a bit of back-and-forth. Once you answer the questions in the comments, we can proceed.

The main issue is this: how is Examination identified.

An ID field does not identify a row (it identifies a record, which has no relevance whatsoever in a database).

The same two problems (a) lack of a valid identifier, and (b) lack of row uniqueness, exists with your Candidate, Component and ExaminationResult files.

Response to Diagram as a Diagram (as opposed to the content)

You have improved it over your Step 1 Diagram, and in response to my Response Step 2, great. But the relationships (most of them) are still incorrect. And the basis of Candidate::Examination is still not resolved.

  • It appears to me that you are not clear about the notation (notches; circles; crows feet) and precisely what they mean at the parent and child ends). So you need to learn that first, and then draw the diagram, rather than the other way round.

  • It is great that you are using a Notation that is meaningful, and many details are shown (many people don't, they draw nice-looking diagrams that lack the detail required for a full understanding of the model. That means that every notch; circle; crows foot, has specific meaning, and must be drawn correctly, in order to convey that meaning to the reader.

  • Entities do not exist in isolation, there must always be a parent first, in order for the child to be a child of the parent. There is no such thing as "equal". Dependency is always in one direction.

  • Your relationships that are 1-and-only-1 on one side, and 1-and-only-1 on the other side, are incorrect, they indicate a Normalisation error. The field in the subordinate record can be Normalised into the ordinate record.

    • Eg. AdmissionLetter is not a separate file, some form of AdmissionLetter identifier (not an ID field) should be located in Candidate.

    • Eg. Title::Candidate is a drawing error, it should be 1 at the Title end and 0-to-many at the Candidate end.

  • In a data model, bold (by convention) means a migrated Foreign Key. The Primary Key that is migrated is not bold.

Response to Diagram Content

  1. From your replies, the term Subject trumps the term Component; Category trumps various loosely-identified elements into one clear entity.

  2. It is not reasonable to contemplate an Examination that exists independently (as you have modelled it).

    • People do not go to a hall and sit for any old exam, any old Subject. The exam exists only in the context of a Subject, they sit for an exam for a Subject.

    • I accept that the Examination is one sitting, for four Subjects

    • I accept that the four Subjects are defined by a Category.

    • I accept that the Candidate is registered for a Category.

    • Thus the exam exists only in the context of a Subject, which exists only in the context of a Category, and the Candidate sits for an exam which is a Category, which contains four (the number does not matter) Subjects.

  3. Having resolved that, two questions remain:

    • Do you need to record an Examination as an event, independent of the Candidates who sit in that event. Eg. Examination(Location, DateTime) ?

    • Does the Examination event examine Candidates in one, or more than one, Category ?

  4. The notion of four Subjects that are implemented as four repeated fields in one record breaks Second Normal Form, which demands that repeating fields are Normalised into separate records in a child file.

    • Therefore, for both your Component and ExaminationResult files, that issue needs to be resolved.

    • Note that the fact that that problem is repeated in two separate files is a second alarm that it is an error.

    I have clarified the Category/Subject issues for you, and resolved the Normalisation error.

    • I have given simple identifiers for Categories and Subjects.

    • If you do not implement that, you will not have integrity between the Candidate and the Subject they are being Examined for. As well, you will suffer various problems when you get to the coding stage.

  5. I have no idea what you are trying to do with exComp, therefore I have no response. Perhaps you can say a few words about it.

  6. Thus far, there is still no reasonable way of relating Candidates to Examinations or ExaminationResults. That is, it has no basis, nothing has been defined as the basis for the relationship, and thus the relationship has no integrity.

    • On the basis of what I have been able to ascertain thus far, there must be some sort of registration for an exam. Otherwise you would not know that a Candidate is sitting for an exam.

    • When the Candidate registers, they register for an exam, and that exam is defined (and therefore constrained) by a Category. Otherwise any Candidate can sit for any exam, which I believe, you would like to prevent.

    • Further, the [four] exam Subjects that they sit for, should be constrained by the Category that they registered for.

      • You do want to ensure that you do not record an Economics exam result for a Candidate who is registered for Science, correct ?

    I have determined that the basis of an exam is the Registration. That is the event, the fact, the recording of which, establishes that a Candidate will sit for an exam.

    • The identifier virtually jumps out at you, it is CategoryCode plus CandidateID. Voila! we have row uniqueness. Magnifique! we have integrity.

    • Now the integrity of ExaminationResult can be implemented: it is constrained to the CandidateRegistration::Category and to the Category::Subject.

  7. To be Resolved: Do you need to identify the fact of a Candidate registering for an examination (RegistrationDate, AdmissionLetter of whatever) vs the fact that the Candidate sat for the examination (eg. ExaminationDate) ? A sort of roll call.

    Right now, I have modelled that as a single fact with no differentiation, and the table is called Examination because you seem to be focussed on that.

Predicate

These days, people seem to be throwing themselves at drawing a diagram, without understanding either the basics of a Relational Database, or of the exercise of modelling data. Predictably, that results in an ill-defined diagram (many relevant details are omitted) [gratefully, your diagram has some definition], and it produces a record filing system with no integrity, no relational power, no speed, instead of a Relational Database with integrity, power, and speed.

One concept that is often missing is Predicates. A competent reader can read a good data model, and ascertain the Predicates, because they are drawn in the model, in the form of notation, but a novice doesn't understand the notation, or the relevance of the various items, and therefore will miss the Predicates. In sum, the Predicates are all the constraints that are placed on the data:

  • Row Identification:

    • The basis of it existence, and how it is Identified: Independent (square corners); or Dependent (round corners)
  • Row Uniqueness: Primary and Alternate Keys (note, IDs are not Keys)

  • Relationships between rows:

    • Identifying (solid lines); or Non-identifying (dashed lines)

    • Meaning, relevance, purpose: the all-important Verb Phrase

Further, a novice cannot determine the Predicates when there is no diagram, or when the diagram is poor, or when they are designing the filing system and drawing the diagram themselves. Thus they do not identify the relevant Predicates in their diagram.

Predicates are very important during the modelling exercise, in that as well as the model expressing the Predicates, the Predicates confirm the accuracy of the model, it is a feedback loop. It is an essential part of the modelling exercise. Since I am executing the modelling task for you, I am working out the Predicates as I perform that task, they are obvious to me. But they may not be obvious to you.

When the data model is published, and ready for discussion with the users, these Predicates are incorporated into it. They come under the heading of Business Rules, they form a part of that, because that is the way the user perceives them. Consequently, during the walkthroughs and discussions, the Predicates (as well as the other stated Business Rules) are either confirmed or denied by the user. They need to be stated explicitly, because unlike the technically educated developer, the user cannot be expected to read all the relevant Predicates from the notation in a good data model.

In this situation, I am the modeller, and you are the "user". Thus I have decided to provide the Predicates for you, explicitly. So that you can confirm or deny them, and thus we can progress the modelling exercise. Once you get used to reading the Predicates from a good data model, you will not need to have them declared explicitly for you. Again, Predicates are very important because they verify (or not) the accuracy of the model. So please read them carefully and comment on any Predicates that you do not completely agree with, or that you do not understand.

Of course, it is not necessary to explicitly declare all the Predicates, there are just too many, we declare just the more relevant ones, that relate to:

  • (a) rows (tables), the basis for their existence

  • (b) their identification

  • (c) all dependencies

  • (d) relationships, both sides (one side is the Verb Phrase).

Step 4 TRD

I have implemented all the above, as detailed. Please consider this TRD as a discussion platform for the next iteration, and comment. Courier indicates example data, blue indicates Key values, green indicates non-key values:

Response Step 6 to Chat Step 5

All issues discussed have been resolved, and implemented in the model. Sorry, I do not have time right now to post details, this is simply identifies the updated models.

  • Entity-Relation and full Predicates on page 1

    All resolved issues have been implemented.

    Predicates
    Now that it is stable, I am now giving you the second side of the Relation Predicates (child-to-parent). And now that you understand them, I have deleted the repeated, annoying "Each" that is demanded for novices.

  • Entity-Relation-Key on page 2

    Now that the TRD is stable, we are ready to proceed to Determination of Keys

    (Second only to Normalisation, Key Determination is a critical part of the modelling exercise. The two tasks are normally performed side-by-side, they are inseparable, I have already determined the keys. In this case, given the limitations of the communication media, I am presenting it as a sequential step).

    Here, I use an Extension to the IDEF1X Notation that allows me to concentrate of the components that are relevant to the task, I expect that it is self-explanatory. The Key columns only, are given. Foreign Keys are not Bold (as they are in the DM). All that, is intended to make it easy on the eye.

    Most tables have one Key (Primary). Where there are two Keys (Primary and Alternate), the AK is below the line.

    根据您的要求,这是我对“钥匙”的建议,以供您查看。

步骤6 TRD和TRK 6

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章