Database




 A collection of program that enables you to store modify, and extract information from a database. There are many different types of DBMSs, ranging from small systems that run on personal computers to huge systems that run on mainframes. The following are examples of database appication:

v  computerized library systems

v  automated teller machines

v  flight reservation systems

v  computerized parts inventory systems

From a technical standpoint, DBMSs can differ widely. The terms relationalnetworkflat, and hierarchical all refer to the way a DBMS organizes information internally. The internal organization can affect how quickly and flexibly you can extract information.Requests for information from a database are made in the form of a query, which is a stylized question. For example, the query

SELECT *FROM users; 

DBMS Stands for "Database Management System." In short, a DBMS is a database program. Technically speaking, it is a software system that uses a standard method of cataloging, retrieving, and running queries on data. The DBMS manages incoming data, organizes it, and provides ways for the data to be modified or extracted by users or other programs.
Some DBMS examples include MySQL, PostgreSQL, Microsoft Access, SQL Server, FileMaker, Oracle, RDBMS, dBASE, Clipper, and FoxPro. Since there are so many database management systems available, it is important for there to be a way for them to communicate with each other. For this reason, most database software comes with an Open Database Connectivity (ODBC) driver that allows the database to integrate with other databases. For example, common SQL statements such as SELECT and INSERT are translated from a program's proprietary syntax into a syntax other databases can understand.


What is a database management system?

 

A database management system (or DBMS) is essentially nothing more than a computerized data-keeping system. Users of the system are given facilities to perform several kinds of operations on such a system for either manipulation of the data in the database or the management of the database structure itself. Database Management Systems (DBMSs) are categorized according to their data structures or types.

 

There are several types of databases that can be used on a mainframe to exploit z/OS®: inverted list, hierarchic, network, or relational.

 

Mainframe sites tend to use a hierarchical model when the data structure (not data values) of the data needed for an application is relatively static. For example, a Bill of Material (BOM) database structure always has a high level assembly part number, and several levels of components with subcomponents. The structure usually has a component forecast, cost, and pricing data, and so on. The structure of the data for a BOM application rarely changes, and new data elements (not values) are rarely identified. An application normally starts at the top with the assembly part number, and goes down to the detail components.

Hierarchical and relational database systems have common benefits. RDBMS has the additional, significant advantage over the hierarchical DB of being non-navigational. By navigational, we mean that in a hierarchical database, the application programmer must know the structure of the database. The program must contain specific logic to navigate from the root segment to the desired child segments containing the desired attributes or elements. The program must still access the intervening segments, even though they are not needed.

The remainder of this section discusses the relational database structure.

 

What structures exist in a relational database?

Relational databases include the following structures: 

Database

A database is a logical grouping of data. It contains a set of related table spaces and index spaces. Typically, a database contains all the data that is associated with one application or with a group of related applications. You could have a payroll database or an inventory database, for example.

Table

A table is a logical structure made up of rows and columns. Rows have no fixed order, so if you retrieve data you might need to sort the data. The order of the columns is the order specified when the table was created by the database administrator. At the intersection of every column and row is a specific data item called a value, or, more precisely, an atomic value. A table is named with a high-level qualifier of the owner's user ID followed by the table name, for example TEST.DEPT or PROD.DEPT. There are three types of tables:

v  A base table that is created and holds persistent data

v  A temporary table that stores intermediate query results

v  A results table that is returned when you query tables.

Figure 1. Example of a DB2 table (department table)



In this table we use:

v  Columns–The ordered set of columns are DEPTNO, DEPTNAME, MGRNO, and ADMRDEPT. All the data in a given column must be of the same data type.

v  Rows–Each row contains data for a single department.

v  Values–At the intersection of a column and row is a value. For example, PLANNING is the value of the DEPTNAME column in the row for department B01.

Indexes

An index is an ordered set of pointers to rows of a table. Unlike the rows of a table that are not in a specific order, an index must always be maintained in order by DB2®. An index is used for two purposes:

v  For performance, to retrieve data values more quickly

v  For uniqueness.

By creating an index on an employee's name, you can retrieve data more quickly for that employee than by scanning the entire table. Also, by creating a unique index on an employee number, DB2 will enforce the uniqueness of each value. A unique index is the only way DB2 can enforce uniqueness.

Creating an index automatically creates the index space, the data set that contains the index.

Keys

A key is one or more columns that are identified as such in the creation of a table or index, or in the definition of referential integrity. 

Primary key

A table can only have one primary key because it defines the entity. There are two requirements for a primary key:

v  It must have a value, that is, it cannot be null.

v  It must be unique, that is, it must have a unique index defined on it.

Unique key

We already know that a primary key must be unique, but it is possible to have more than one unique key in a table. In our EMP table example, the employee number is defined as the primary key and is therefore unique. If we also had a social security value in our table, hopefully that value would be unique. To guarantee this, you could create a unique index on the social security column.

 

Foreign key

A foreign key is a key that is specified in a referential integrity constraint to make its existence dependent on a primary or unique key (parent key) in another table.

The example given is that of an employee's work department number relating to the primary key defined on the department number in the DEPT table. This constraint is part of the definition of the table.



 An Instance is a logical environment for DB2 Database Manager. Using instance, you can manage databases. Depending on our requirements, you can create multiple instances on one physical machine. The contents of Instance directory are:

v  Database Manager Configuration file

v  System Database Directory

v  Node Directory

v  Node Configuration File [db2nodes.cfg]

v  Debugging files, dump files

 For DB2 Database Server, the default instance is “DB2”. It is not possible to change the location of Instance directory after its creation. An instance can manage multiple databases. In an instance, each database has a unique name, its own set of catalog tables, configurations files, authorities and privileges.

 

Architecture of instance in DB2 product

 



Multiple instances

 You can create multiple instances in one DB2Server on Linux, UNIX and Windows. It is possible to install multiple DB2Servers on a physical machine.

Creating instance on Linux

You can create multiple instances on Linux and UNIX if DB2 Server is installed as root user. An instance can run simultaneously on Linux and UNIX independently. You can work within a single instance of the database manager at a time.

An Instance folder contains database configuration files and folders. The Instance directory is stored at different locations on Windows depending on the operating system versions.


Database architecture



A database is a collection of Tables, Schemas, Bufferpools, Logs, Storage groups and Tablespaces working together to handle database operations efficiently.

 Database directory

Database directory is an organized repository of databases. When you create a database, all the details about database are stored in a database directory, such as details of default storage devices, configuration files, and temporary tables list etc. Partition global directory is created in the instance folder. This directory contains all global information related to the database.


Buffer Pools



  Bufferpool is portion of a main memory space which is allocated by the database manager. The purpose of bufferpools is to cache table and index data from disk. All databases have their own bufferpools. A default bufferpool is created at the time of creation of new database. It called as “IBMDEFAULTBP”. Depending on the user requirements, it is possible to create a number of bufferpools. In the bufferpool, the database manager places the table row data as a page. This page stays in the bufferpool until the database is shutdown or until the space is written with new data. The pages in the bufferpool, which are updated with data but are not written onto the disk, are called “Dirty” pages. After the updated data pages in the bufferpool are written on the disk, the bufferpool is ready to take another data.

Relationship between tablespaces and bufferpools

Each table space is associated with a specific buffer pool in a database. One tablespace is associated with one bufferpool. The size of bufferpool and tablespace must be same. Multiple bufferpools allow you to configure the memory used by the database to increase its overall performance.

Bufferpool sizes

The size of the bufferpool page is set when you use the “CREATE DATABASE” command. If you do not specify the page size, it will take default page size, which is 4KB. Once the bufferpool is created, it is not possible to modify the page size later


Tablespaces



Table space is a storage structure, it contains tables, indexes, large objects, and long data. It can be used to organize data in a database into logical storage group which is related with where data stored on a system. This tablespaces are stored in database partition groups

Benefits of tablespaces in database

The table spaces are beneficial in database in various ways given as follows:

Recoverability: Tablespaces make backup and restore operations more convenient. Using a single command, you can make backup or restore all the database objects in tablespaces.

Automatic storage Management: Database manager creates and extends containers depending on the needs.

Memory utilization: A single bufferpool can manage multiple tablespaces. You can assign temporary tablespaces to their own bufferpool to increase the performance of activities such as sorts or joins.

Container

Tablespaces contains one or more containers. A container can be a directory name, a device name, or a filename. In a database, a single tablespace can have several containers on the same physical storage device. If the tablespace is created with automatic storage tablespace option, the creation and management of containers is handled automatically by the database manager. If it is not created with automatic storage tablespace option, you need to define and manage the containers yourself.

Default tablespaces

When you create a new database, the database manager creates some default tablespaces for database. These tablespace is used as a storage for user and temporary data. Each database must contain at least three tablespaces as given here:

v  Catalog tablespace

v  User tablespace

v  Temporary tablespace

Catalog tablespace: It contains system catalog tables for the database. It is named as SYSCATSPACE and it cannot be dropped.

User tablespace: This tablespace contains user-defined tables. In a database, we have one default user tablespace, named as USERSPACE1. If you do not specify user-defined tablespace for a table at the time you create it, then the database manager chooses default user tablespace for you.

Temporary tablespace: A temporary tablespace contains temporary table data. This tablespace contains system temporary tablespaces or user temporary tablespace.

System temporary tablespace holds temporary data required by the database manager while performing operation such as sorts or joins. A database must have at least one system temporary tablespace and it is named as TEMPSPACE1. It is created at the time of creating the database. User temporary tablespace holds temporary data from tables. It is created with DECLARE GLOBAL TEMPORARY TABLE or CREATE GLOBAL TEMPORARY TABLE statement. This temporary tablespace is not created by default at the time of database creation.

Tablespaces and storage management:

Tablespaces can be setup in different ways, depending on how you want to use them. You can setup the operating system to manage tablespace allocation, you can let the database manager allocate space or you can choose automatic allocation of tablespace for your data.

The following three types of managed spaces are available:

System Managed Space (SMS): The operating system’s file system manager allocates and manages the space where the table is stored. Storage space is allocated on demand. This model consists of files representing database objects. This tablespace type has been deprecated in Version 10.1 for user-defined tablespaces, and it is not deprecated for catalog and temporary tablespaces.

Database Managed Space (DMS): The Database Server controls the storage space. Storage space is pre- allocated on the file system based on container definition that you specify when you create the DMS table space. It is deprecated from version 10.1 fix pack 1 for user-defined tablespaces, but it is not deprecated for system tablespace and temporary tablespace.

Automatic Storage Tablespace: Database server can be managed automatically. Database server creates and extends containers depend on data on database. With automatic storage management, it is not required to provide container definitions. The database server looks after creating and extending containers to make use of the storage allocated to the database. If you add storage space to a storage group, new containers are automatically created when the existing container reach their maximum capacity. If you want to use the newly-added storage immediately, you can rebalance the tablespace.

Page, table and tablespace size:

Temporary DMS and automatic storage tablespaces, the page size you choose for your database determines the maximum limit for the tablespace size. For table SMS and temporary automatic storage tablespaces, the page size constrains the size of table itself. The page sizes can be 4kb, 8kb, 16kb or 32kb.

Tablespace type

4K page size limit

8K page size limit

16K page size limit

32K page size limit

DMS, non-temporary automatic storage tablespace regular

64G

128G

256G

512G

DMS, temporary DMS and non- temporary automatic storage table space large

1892

 

Database Storage Groups



 A set of Storage paths to store database table or objects, is a storage group. You can assign the tablespaces to the storage group. When you create a database, all the tablespaces take default storagegorup. The default storage group for a database is ‘IBMSTOGROUP’. When you create a new database, the default storage group is active, if you pass the “AUTOMATIC STOGROUP NO” parameter at the end of “CREATE DATABASE” command. The database does not have any default storage groups.


Schema

 A schema is a collection of named objects classified logically in the database. In a database, you cannot create multiple database objects with same name. To do so, the schema provides a group environment. You can create multiple schemas in a database and you can create multiple database objects with same name, with different schema groups.



A schema can contain tables, functions, indices, tablespaces, procedures, triggers etc. For example, you create two different schemas named as “Professional” and “Personal” for an “employee” database. It is possible to make two different tables with the same name “Employee”. In this environment, one table has professional information and the other has personal information of employee. In spite of having two tables with the same name, they have two different schemas “Personal” and “Professional”. Hence, the user can work with both without encountering any problem. This feature is useful when there are constraints on the naming of tables.


Data Types

  1. Datetime
    1. TIME: It represents the time of the day in hours, minutes and seconds.
    2. TIMESTAMP: It represents seven values of the date and time in the form of year, month, day, hours, minutes, seconds and microseconds.
    3. DATE: It represents date of the day in three parts in the form of year, month and day.
  2. String
    1. Character
  3. CHAR (fixed length): Fixed length of Character strings.
    1. Varying length
  4. VARCHAR: Varying length character strings.
  5. CLOB: large object strings, you use this when a character string might exceed the limits of the VARCHAR data type.
    1. Graphic
  6. GRAPHIC
    1. Fixed length: Fixed length graphic strings that contains double-byte characters
    2. Varying length
  7. VARGRAPHIC: Varying character graphic string that contains double bye characters.
  8. DBCLOB: large object type
    1. Binary
  9. BLOB (varying length): binary string in large object
  10. BOOLEAN: In the form of 0 and 1.
  11. Signed numeric
    1. Exact
  12. Binary integer
    1. SMALLINT [16BIT]: Using this you can insert small int values into columns
    2. INTEGER [32BIT]: Using this you can insert large int values into columns
    3. BIGINT [64BIT]: Using this you can insert larger int values into columns
  13. Decimal
    1. DECIMAL (packed)
    2. DECFLOAT (decimal floating point): Using this, you can insert decimal floating point numbers
    3. Approximate
  14. Floating points
    1. REAL (single precision): Using this data type, you can insert single precision floating point numbers.
    2. DOUBLE (double precision): Using this data type, you can insert double precision floating point numbers.
  15. eXtensible Mark-up Language
    1. XML: You can store XML data into this data type column.

 Tables are logical structure maintained by Database manager. In a table each vertical block called as column (Tuple) and each horizontal block called as row (Entity). The collection of data stored in the form of columns and rows is known as a table. In tables, each column has different data type. Tables are used to store persistent data.

Type of tables

  1. Base Tables: They hold persistent data. There are different kinds of base tables, including:
    1. Regular Tables: General purpose tables, Common tables with indexes are general purpose tables.
    2. Multidimensional Clustering Table (MDC): This type of table physically clustered on more than one key, and it used to maintain large database environments. These type of tables are not supported in DB2 pureScale.
    3. Insert time clustering Table (ITC): Similar to MDC tables, rows are clustered by the time they are inserted into the tables. They can be partitioned tables. They too, do not support pureScale environment.
    4. Range-Clustered tables Table (RCT): These type of tables provide fast and direct access of data. These are implemented as sequential clusters. Each record in the table has a record ID. These type of tables are used where the data is clustered tightly with one or more columns in the table. This type of tables also do not support in DB2 pureScale.
    5. Partitioned Tables: These type of tables are used in data organization schema, in which table data is divided into multiple storage objects. Data partitions can be added to, attached to and detached from a partitioned table. You can store multiple data partition from a table in one tablespace.
    6. Temporal Tables: History of a table in a database is stored in temporal tables such as details of the modifications done previously.
  2. Temporary Tables: For temporary work of different database operations, you need to use temporary tables. The temporary tables (DGTTs) do not appear in system catalog, XML columns cannot be used in created temporary tables.
  3. Materialized Query Tables: MQT can be used to improve the performance of queries. These types of tables are defined by a query, which is used to determine the data in the tables

Constraints

 To enforce database integrity, a set of rules is defined, called constraints. The constraints either permit or prohibit the values in the columns.

In a Real time database activities, the data should be added with certain restrictions. For example, in a sales database, sales-id or transaction-id should be unique. The constraints types are:

v  NOT NULL

v  Unique

v  Primary key

v  Foreign Key

v  Check

v  Informational

Constraints are only associated with tables. They are applied to only particular tables. They are defined and applied to the table at the time of table creation.

 NOT NULL

It is a rule to prohibit null values from one or more columns within the table.


Indexes

  Index is a set of pointers, which can refer to rows in a table, blocks in MDC or ITC tables, XML data in an XML storage object that are logically ordered by the values of one or more keys. It is created on DB2 table columns to speed up the data access for the queries, and to cluster and partition the data efficiently. It can also improve the performance of operation on the view. A table with a unique index can have rows with unique keys. Depending on the table requirements, you can take different types of indexes.

Types of indexes

v  Unique and Non-Unique indexes

v  Clustered and non-clustered indexes


Trigger

A trigger is a set of actions, which are performed for responding to an INSERT, UPDATE or DELETE operation on a specified table in the database. Triggers are stored in the database at once. They handle governance of data. They can be accessed and shared among multiple applications. The advantage of using triggers is, if any change needs to be done in the application, it is done at the trigger; instead of changing each application that is accessing the trigger. Triggers are easy to maintain and they enforce faster application development. Triggers are defined using an SQL statement “CREATE TRIGGER”.

Types of triggers

There are two types of triggers:

v  BEFORE triggers

They are executed before any SQL operation.

v  AFTER triggers

They are executed after any SQL operation.


Sequences

  A sequence is a software function that generates integer numbers in either ascending or descending order, within a definite range, to generate primary key and coordinate other keys among the table. You use sequence for availing integer numbers say, for employee_id or transaction_id. A sequence can support SMALLINT, BIGINT, INTEGER, and DECIMAL data types. A sequence can be shared among multiple applications. A sequence is incremented or decremented irrespective of transactions.

A sequence is created by CREATE SEQUENCE statement.

Types of Sequences

There are two type of sequences available:

·        NEXTVAL: It returns an incremented value for a sequence number.

·        PREVIOUS VALUE: It returns recently generated value.

Parameters of sequences

The following parameters are used for sequences:

Data type: This is the data type of the returned incremented value. (SMALLINT, BIGINT, INTEGER, NUMBER, DOUBLE)

START WITH: The reference value, with which the sequence starts.

MINVALUE: A minimum value for a sequence to start with.

MAXVALUE: A maximum value for a sequence.

INCREMENT BY: step value by which a sequence is incremented.

Sequence cycling: the CYCLE clause causes generation of the sequence repeatedly. The sequence generation is conducted by referring the returned value, which is stored into the database by previous sequence generation.


View

  A view is an alternative way of representing the data stored in the tables. It is not an actual table and it does not have any permanent storage. View provides a way of looking at the data in one or more tables. It is a named specification of a result table.









       














No comments:

Post a Comment