10 October 2017
This blog gives an introduction of Relational database-management system (RDBMS) and different topics which are frequently asked in interview questions.
A RDBMS is a computer-software application that interacts with end-users, other applications, and the database itself to capture and analyze data. A general-purpose DBMS allows the definition, creation, querying, update, and administration of databases.
It is a type of database which is optimized for data warehouse and online analytical processing(OLAP) applications. This database are normally created from the inputs obtained from existing relational databases.
Relational database can be accessed using a SQL query whereas a multidimensional database allows a user to answer analytical question from those relational database.Multidimensional database is mainly used to summarize business operations and trends.
OLAP tool allows user to analyze multidimensional data interactively from multiple perspectives. OLAP is part of the broader category of business intelligence, which also encompasses relational database, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications coming up, such as agriculture.
It is used to manage transaction-oriented applications typically used for data entry and retrieving transactional processing on a relational database management system.
A Natural (Business) Key is the column or column(s) in the source data that identify a unique row.It can be exposed on a business report and have business value in the company.
A Surrogate Key is an artificially produced value, most often a system-managed, incrementing counter whose values can range from one to n. It has no business meaning and should not be exposed in reports.
It is recommended to use the natural keys of the table throughout the ETL process.Assigning and maintaining a surrogate key is complex and leads to lot of errors.If invalid surrogate keys are assigned ,it might cause severe data integrity issue in the data ware house .Upon re-loading a table ,it is extremely difficult to guarantee the same Surrogate Key for a particular row.This complexity and risk can be eliminated by using natural keys and keeping the surrogate key from source system as data attribute .
It is a process of organizing the data in database to avoid data redundancy, insertion anomaly, update anomaly & deletion anomaly.
Most commonly used normal forms:
It is a normal form that is used in normalizing a database design to reduce the duplication of data and ensure referential integrity by ensuring that (1) the entity is in second normal form, and (2) all the attributes in a table are determined only by the candidate keys of that relation and not by any non-prime attributes.
There are four isolation levels
A synonym is an alias or alternate name for a table, view, sequence, or other schema object. They are used mainly to make it easy for users to access database objects owned by other users. They hide the underlying object’s identity and make it harder for a malicious program or user to target the underlying object. Because a synonym is just an alternate name for an object, it requires no storage other than its definition There are two major uses of synonyms:
1.Statement -It is Used for SQL Queries
It is mainly used for repetitive SQL queries .We cannot use Stored procedure in Prepared Statement . Instances of Prepared Statement contain an SQL statement that has already been compiled. This is what makes a statement “prepared”
Because Prepared Statement objects are precompiled, their execution can be faster than that of Statement objects. The prepared statement is used to execute sql queries
A Callable Statement object provides a way to call stored procedures in a standard way for all RDBMSs. A stored procedure is stored in a database; the call to the stored procedure is what a Callable Statement object contains. It is Used for Stored Procedures and functions
Junction table is a place to store attributes of the relationships between two lists of entities.
A junction table allows us to create the many-to-many relationship and prevents us from adding duplicate entries
It outstrips traditional data integration (DI), which tends to focus primarily on the selection, movement, and transformation of strictly-structured data from OLTP or other, similar sources.
Database cursor is a control structure that enables traversal over the records in a database. Cursors facilitate subsequent processing in conjunction with the traversal, such as retrieval, addition and removal of database records. A cursor can be viewed as a pointer to one row in a set of rows.
An index in database creates an entry for each value which makes it faster to retrieve data. It is one of the performance tuning method in database that allows faster retrieval of records from the table.
There are 3 types of Indexes in Database
Relationship in database defines how different tables are connected together.
There are mainly 4 types of relationship in database
User defined functions are the custom functions that can be used in queries.
An aggregate function is used to perform a calculation on a set of values to return a single scalar value. We often use aggregate functions with the GROUP BY and HAVING clauses of the SELECT statement.
Below are the most commonly used SQL aggregate functions:
AVG – calculates the average of a set of values. COUNT – counts rows in a specified table or view. MIN – gets the minimum value in a set of values. MAX – gets the maximum value in a set of values. SUM – calculates the sum of values.
View is like a subset of table which are stored logically in a database. A view is a virtual table. It contains rows and columns similar to a real table. The fields in the view are fields from one or more real tables. Views do not contain data of their own.
CREATE VIEW <view-name> AS SELECT <column_name1,column_name2> FROM <table_name> WHERE <condition>
Constraints are rules you can place on columns or tables to limit the type of data that can be entered into a table. This prevents errors and can improve the accuracy and reliability of the database as a whole.
Common constraints include:
|NOT NULL||It prevents a column from having a NULL value|
|DEFAULT||It specifies a default value for a column where none is specified|
|PRIMARY KEY||It Uniquely identifies rows/records within a database table|
|FOREIGN KEY||Uniquely identifies rows/records from external database tables|
|UNIQUE||It makes sure that all values are unique|
|CHECK||Checks values within a column against certain conditions|
|INDEX||Used to quickly create and retrieve data from a database|
WHERE clause is used when there is a need of condition on individual rows .Where clause is used to fetch data from database that specifies a particular criteria .It cannot be used with aggregrate function.
HAVING clause is used when we need to introduce conditions on aggregrate results .
Having clause is used along with ‘GROUP BY’ to fetch data that meets a particular criteria specified by the Aggregate functions. It can be used with aggregrate function
DELETE command is used to remove rows from the table, and WHERE clause can be used for conditional set of parameters. With the execution of DELETE command it is possible to commit and Rollback
TRUNCATE removes all rows from the table such that it cannot be rolled back.
The basic SQL commands can be organized into the following categories:
1. Data Manipulation Language (DML)
Data manipulation Language or DML is used to access or manipulate data in the database.
2. Data Definition Language (DDL)
3. Data Control Language: (DCL)
Even though both of these data types are used for characters ,
VARCHAR2 is used for character strings of variable length whereas
CHAR is used for character strings of fixed length
A subquery is a select statement that is embedded in a clause of another select statement. An Example
SELECT employeename ,salary FROM employeeTable WHERE salary > (SELECT salary FROM employeeTable WHERE employeename = 'Nitendra');
A Correlated subquery is a subquery that is evaluated once for each row processed by the outer query or main query. Execute the Inner query based on the value fetched by the Outer query all the values returned by the main query are matched. The INNER Query is driven by the OUTER Query.
An Correlated Query Example:
SELECT employeeNumber ,salary from branchid FROM employeeTable et WHERE salary =(SELECT AVG(salary) FROM employeeTable WHERE branchid = et.brachnid);
First inner query executes and finds a value which is again used by outer query to execute .
A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore table can have only one clustered index. The leaf nodes of a clustered index contain the data pages.
A non clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf node of a non clustered index does not consist of the data pages. Instead, the leaf nodes contain index rows.
In in SQL we can use CASE statement or DECODE statement to transpose a table
 Data Warehouse
 Union vs Join