What is Database Design?
In this article, we answer one of the most important questions for software engineering students and explore what database design is.
One thing that many users are unaware of is the philosophy of database design. Large databases are implemented on a scientific basis called ontology.
Besides knowing how to interact with databases, a software engineer sometimes has to design databases to meet the organization’s needs.
Given that databases are widely used in various fields such as machine learning, it is important to have a thorough understanding of the concept of database design.
Database design courses for associate, bachelor’s, and master’s students
Database design is one of the core courses that undergraduate and graduate software engineering students take. Students learn the basics of database design at the undergraduate level and more advanced details at the graduate level.
As a result, these students become familiar with all the important concepts in this field, although it takes considerable time to master each of them.
What is a database?
Databases are organized collections of data stored in binary or text files on computer systems, allowing companies to accurately and purposefully access business information.
Today, databases have taken on a complex and advanced form, so formal modeling is needed to design them.
The database should be designed so that the Database Management System (DBMS), the software through which end users and applications record and analyze data, can work with it without any problems.
What is database design?
Database design refers to organizing data according to a database model. The database designer specifies what data should be stored and how the data elements relate to one another, in order to simplify processing and access to information.
Database design involves the classification and identification of interrelationships, called ontology. Ontology is the theory behind the design of databases.
Database design refers to processes that simplify the design, development, implementation, and maintenance of business data management systems.
A good database design improves data consistency and does not waste valuable storage space.
For this reason, the main task of the database designer is to determine which data should be stored and how the components of the database interact with each other.
Database design rests on two models: the logical design model and the physical design model.
The logical model focuses on data requirements but does not include how data is stored. The physical model focuses on translating a logical database model into a physical medium (local or cloud) using hardware and software resources (DBMS).
What is database design modeling?
The first responsibility of the database designer is to build a conceptual data model that shows the structure of the information to be stored in the database.
Typically, database designers use the entity-relationship approach for conceptual modeling, along with notations such as the Unified Modeling Language (UML).
A successful data model accurately reflects the modeled state of the external world. For example, if people can have more than one phone number in the real world, a conceptual model should show that a database can record more than one phone number for one person.
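The phone number example can be sketched as a small schema. This is only an illustration using Python's built-in sqlite3 module; the table names, column names, and sample values are assumptions and do not come from any particular system.

```python
import sqlite3

# In-memory database; all names and values below are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        full_name TEXT NOT NULL
    );
    -- One row per phone number, so a person can have zero, one, or many.
    CREATE TABLE phone_number (
        phone_id  INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        number    TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO person (person_id, full_name) VALUES (1, 'Sara Ahmadi')")
conn.executemany(
    "INSERT INTO phone_number (person_id, number) VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)

# Collect every number recorded for person 1.
phones = [row[0] for row in conn.execute(
    "SELECT number FROM phone_number WHERE person_id = ? ORDER BY number", (1,)
)]
print(phones)
```

Putting the numbers in their own table, rather than in a single column of the person table, is what lets the model match the real-world rule.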
Accordingly, before designing a conceptual model, the business needs must be properly understood so that the model is designed based on factual information and not assumptions.
To receive such information, the database designer must prepare in-depth questions that align with the goals of the organization and the database to be implemented.
This helps the database designer properly assess the database requirements and know what issues are important and what issues should be ignored.
Once the process of building a conceptual data model is complete and the user is satisfied with it, the next step is to translate the model into a Database Schema, which some sources call the logical database design process.
The output of this step, called the logical data model, is presented in the form of a schema. Whereas the conceptual data model is technology-independent, the logical data model is expressed in terms of a specific database model supported by a database management system.
The most common model used by designers is the relational model, which is defined and queried with the SQL language.
The logical database design process using this model is based on the Normalization method.
The purpose of normalization is to ensure that each piece of information is stored in only one place, so that consistency is maintained in the database when operations such as inserting, updating, and deleting are performed, whether automatically or manually.
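A minimal sketch of what "stored in only one place" can look like in practice, again using Python's sqlite3 module. The department and employee tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The department location is stored once, not on every employee row.
    CREATE TABLE department (
        dept_id  INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        location TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (10, 'Finance', 'Building A');
    INSERT INTO employee VALUES (1, 'Ali', 10), (2, 'Maryam', 10);
""")

# A single UPDATE changes one row, yet every employee sees the new location.
cur = conn.execute("UPDATE department SET location = 'Building B' WHERE dept_id = 10")
rows_changed = cur.rowcount

locations = conn.execute("""
    SELECT e.name, d.location
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
print(rows_changed, locations)
```

If the location were repeated on each employee row instead, the same change would need one update per employee, and missing one of them would leave the data inconsistent.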
The final step in the database design process is making decisions about performance, scalability, disaster recovery, and security. These decisions directly affect how the database is built and depend on the database management system that will host it.
These decisions are referred to as physical database design, whose output is the physical data model. An important goal of this step is data independence: the decisions made for performance optimization should be invisible to the end user and to the applications that use the database.
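The relationship between a performance decision and the applications that query the database can be sketched as follows. This example assumes SQLite through Python's sqlite3 module, and the orders table is hypothetical. It runs the same query before and after an index is added and compares the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("acme", 120.0), ("globex", 75.5), ("acme", 30.0)],
)

# The application's query; its text never changes in this example.
QUERY = "SELECT order_id, total FROM orders WHERE customer = ? ORDER BY order_id"

before = conn.execute(QUERY, ("acme",)).fetchall()

# Physical design decision: add an index on the searched column.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

after = conn.execute(QUERY, ("acme",)).fetchall()
print(before == after)
```

The index changes how the rows are located, not which rows the query returns, so the application code does not need to change.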
What is physical database design?
Physical database design specifies the physical configuration of the database on the storage medium. This process examines data types, indexing options, details about data elements, storage hardware specifications, application software, and details of other components of the database and the database management system.
The most important aspects that physical database design addresses are as follows:
Security: What level of security applies to end users and to the database administrator account?
Replication: Which parts of the data should be copied to a secondary database, and at what intervals?
High availability: Regardless of whether the configuration is active or passive, do the connection and the schema meet all of the customer’s business requirements? The designer must answer an important question: is the database available at the highest possible level?
Partitioning: If the database is to be implemented in a distributed manner, is a single entity defined, and is the data distributed correctly across all parts of the database? In addition, what arrangements are needed for more advanced cases such as failures in different parts of the database?
Backup and recovery: Is the ability to back up and restore the schema considered?
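The backup and restore question can be tried out on a small scale. The sketch below assumes SQLite and Python 3.7 or later, where the sqlite3 module provides Connection.backup; the customer table is hypothetical, and a production DBMS would use its own backup tooling.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO customer (name) VALUES ('Acme'), ('Globex');
""")

# Copy schema and data into a second database using SQLite's backup API.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Simulate a loss on the source, then check that the copy is still intact.
source.execute("DROP TABLE customer")
restored_names = [
    row[0]
    for row in backup_copy.execute("SELECT name FROM customer ORDER BY customer_id")
]
print(restored_names)
```

Checking that a copy can actually be read back, as the last query does, is the part of a backup plan that is easiest to forget.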
Why should we pay special attention to designing a database before building it?
The purpose of designing a database is to accurately and purposefully build information repositories that are supposed to meet the organization’s needs and provide the highest level of performance.
For this reason, when building databases, their development life cycle must be carefully considered. The database development life cycle refers to the steps that are performed sequentially when developing a database.
This life cycle is as follows.
Needs analysis
The needs analysis phase involves planning and defining the system. The planning stage is related to the database life cycle and defines the organization’s business vision and goals.
Database design
In database design, two models are developed: the logical model and the physical model. The logical model is a database model developed from the requirements.
At this stage, the database is designed without a physical implementation and without committing to a specific database management system.
The physical model takes the schema prepared in the logical model stage and implements it according to the characteristics of the chosen database management system.
Implementation
In the implementation phase, two important processes are performed: data conversion and loading, and testing. In the data conversion and loading step, the database is created in code. New data is then entered into the database, or data from the old systems is converted, if necessary, and imported into the new database.
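The conversion and loading step might look like the following sketch. Everything here is assumed for illustration: the legacy rows, their DD/MM/YYYY date format, and the employee table are hypothetical, and SQLite stands in for whichever DBMS is actually used.

```python
import sqlite3
from datetime import datetime

# Hypothetical rows exported from an old system: dates are DD/MM/YYYY strings.
legacy_rows = [
    ("Ali Roghani ", "03/02/2019"),
    ("Reza Roghani", "17/11/2001"),
]


def convert(row):
    """Trim the name and convert the date to ISO 8601 (YYYY-MM-DD)."""
    name, hired = row
    iso_date = datetime.strptime(hired, "%d/%m/%Y").date().isoformat()
    return (name.strip(), iso_date)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, hired_on TEXT NOT NULL)"
)

# Load inside one transaction: either every converted row is stored or none is.
with conn:
    conn.executemany(
        "INSERT INTO employee (name, hired_on) VALUES (?, ?)",
        [convert(r) for r in legacy_rows],
    )

loaded = conn.execute(
    "SELECT name, hired_on FROM employee ORDER BY emp_id"
).fetchall()
print(loaded)
```

Converting every row before the transaction starts means that a malformed legacy value stops the load before anything is written.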
After completing this process, the testing phase begins, aiming to identify errors in the new system.
In the testing phase, the database is evaluated for coding errors or security issues. In most cases, the person responsible for designing the database is an expert in database design and implementation and is not expected to be a specialist in the data that will be entered into it.
For example, a database designer is not expected to be proficient in financial information, scientific data, or medicine. For this reason, when data is entered into the database, a domain expert must help the designer make sure the correct data is stored.
Once the data to be stored in the database is known, the interdependencies between the data must be determined.
For example, only one real address is assumed for an employee (name field), and two different addresses for the same employee should not be registered simultaneously. In addition, the inverse dependency must be considered, as several employees may reside at one physical address.
For example, several employees may live in the same complex. In this case, the address field is considered dependent on the name. Another point to be careful about, which unfortunately most database designers ignore, is the hidden relationship between different fields.
For example, sometimes the first and last names of employees working in an organization are the same or similar because they are related; a grandson and his grandfather, for instance, may share both first and last names.
(For example, Ali Roghani, the son of Reza Roghani, works in an organization, and Reza Roghani’s father, who is also called Ali Roghani, is employed in the same organization!) If the database designer does not know about this issue, the organization will face many problems…
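One common way to keep two employees with identical names apart is to identify each employee by a generated key rather than by name. The sketch below assumes SQLite through Python's sqlite3 module; the tables, the address, and the use of a surrogate key are illustrative choices, not something the scenario above prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (
        address_id INTEGER PRIMARY KEY,
        street     TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,   -- surrogate key, not the name
        full_name   TEXT NOT NULL,
        -- A single NOT NULL column: each employee has exactly one address.
        address_id  INTEGER NOT NULL REFERENCES address(address_id)
    );
    INSERT INTO address VALUES (1, '12 Garden Complex');
    -- Two different people who share a name and live at the same address.
    INSERT INTO employee (full_name, address_id)
    VALUES ('Ali Roghani', 1), ('Ali Roghani', 1);
""")

same_name = conn.execute(
    "SELECT employee_id FROM employee WHERE full_name = 'Ali Roghani' "
    "ORDER BY employee_id"
).fetchall()
residents = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE address_id = 1"
).fetchone()[0]
print(same_name, residents)
```

Because the dependency runs from the employee key to the address, each employee has one address while one address can still belong to several employees.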
Final word
As you can see, a database designer has many different tasks and responsibilities, and a lack of knowledge about any of them drastically reduces the performance of the database.
As mentioned, the first task of a database designer is to design a conceptual data model that reflects the structure of the information in the database.
Database designers use specialized design tools to develop an entity-relationship model; one of the most popular notations is the Unified Modeling Language (UML).
In designing and building databases, some new experts create tables and define the relationships between tables without being thoroughly familiar with the structures and concepts related to this field.
There are several important points in database design that need to be carefully considered.
The first point is to make the right decision between normalization (avoiding repetition to save space) and denormalization (deliberately repeating data to increase speed).
Sometimes organizations store the same information in different databases for faster access to records.
The second important point is indexing, the purposeful ordering of information. Some believe that indexes are built only on a username or a numeric ID, but this rule does not apply to all databases.
The third point to note is the B-tree, which some people confuse with a binary tree. In this area, a B-tree is a self-balancing tree: the database software keeps it balanced as the number of rows in the table changes.
The index is organized into several levels so that the desired row can be reached quickly by comparing the searched value at each level.
These levels contain three types of nodes: the root node (the single starting point), branch nodes (the middle levels), and leaf nodes (the end).
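The lookup through root, branch, and leaf nodes can be sketched in a few lines of Python. This is a hand-built, read-only tree meant only to show the comparison at each level: the Node class, the keys, and the row labels are all hypothetical, and the insertion and rebalancing logic of a real B-tree is left out. Keeping row data only in the leaves, as done here, is the variant usually called a B+ tree.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list                                     # sorted index keys
    children: list = field(default_factory=list)   # child nodes; empty for a leaf
    rows: list = field(default_factory=list)       # row data; used by leaves only


def search(root, key):
    """Return (row, levels_visited); row is None when the key is absent."""
    node, levels = root, 1
    while node.children:                           # root or branch node
        # children[i] covers keys k with keys[i-1] <= k < keys[i].
        node = node.children[bisect_right(node.keys, key)]
        levels += 1
    i = bisect_left(node.keys, key)                # leaf node: look for the key
    if i < len(node.keys) and node.keys[i] == key:
        return node.rows[i], levels
    return None, levels


leaves = [
    Node(keys=[1, 4], rows=["row-1", "row-4"]),
    Node(keys=[7, 9], rows=["row-7", "row-9"]),
    Node(keys=[12, 15], rows=["row-12", "row-15"]),
    Node(keys=[20, 25], rows=["row-20", "row-25"]),
]
branch_a = Node(keys=[7], children=leaves[:2])
branch_b = Node(keys=[20], children=leaves[2:])
root = Node(keys=[12], children=[branch_a, branch_b])

print(search(root, 9))
print(search(root, 13))
```

Every lookup in this tree visits exactly three nodes, one per level, whichever key is searched for. That fixed depth is what the balancing buys.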
The fourth important point is improving the performance of the database. For this purpose, solutions such as composite (multi-column) indexes chosen to match the type of search are used; these are defined when the tables and the relationships between them are created.
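A two-column index matched to a specific search can be sketched as follows. The example assumes SQLite through Python's sqlite3 module, with a hypothetical orders table and index name. EXPLAIN QUERY PLAN is a SQLite statement; its output wording differs between SQLite versions, so the code only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, total REAL)"
)

# Column order follows the search: filter by customer first, then by status.
conn.execute(
    "CREATE INDEX idx_orders_customer_status ON orders (customer_id, status)"
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT total FROM orders WHERE customer_id = ? AND status = ?",
    (42, "shipped"),
).fetchall()

# The last column of each plan row is the human-readable description.
plan_text = " ".join(row[-1] for row in plan)
print(plan_text)
```

If the index name appears in the printed plan, the engine is using the index for this search instead of scanning the whole table.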
With some servers, such as MySQL, designers ask which storage engine should be used: MyISAM or InnoDB.
The shortest answer is that the former favors the speed at which information is recorded (typically for archives), while the latter offers a balance between recording and reading speed.
MyISAM is a non-transactional engine that locks the entire table while data is being written, whereas InnoDB is transactional and locks only the affected rows.
Typically, the database design process begins after the brainstorming phase, although the model design phase may be performed at the same time.