YuWebdesign



SQL Interview questions (Theoretical)

By YuwebDesign


SQL stands for Structured Query Language. It is the standard language to operate a relational database management system (RDBMS, e.g., MYSQL). SQL is used to access and manipulate data in a database.
Standard SQL Commands are Select.

    SQL can:

  1. execute queries against a database
  2. retrieve data from a database
  3. insert records in a database
  4. update records in a database
  5. delete records from a database
  6. create new databases
  7. create new tables in a database
  8. create stored procedures in a database
  9. create views in a database
  10. set permissions on tables, procedures, and views

Although SQL is an ANSI/ISO standard, there are different versions of the SQL language. Most of the SQL database programs also have their own proprietary extensions in addition to the SQL standard.
However, to be compliant with the ANSI standard, they all support at least the major commands (such as SELECT, UPDATE, DELETE, INSERT, WHERE) in a similar manner.

*ANSI – American National Standards Institute.
*ISO – International Organization for Standardization.

Although SQL is a language, it does not support programming since it is not a programming language, it is a command language.

We do not have conditional statements in SQL (e.g., “for loops” or “if..else”), we only have commands that can be used to access and/or manipulate data in a Database: SELECT, INSERT, DELETE etc.

SQL stands for Structured Query Language. It is the standard language to operate a relational database management system (RDBMS), such as MYSQL.
MySQL is database software that uses the SQL language to query the database.

Developed in the mid-90s., MySQL was one of the first open-source database available in the market. Today there are many alternatives variants of MySQL (e.g., MS SQL Server, IBM DB2, Oracle, and Microsoft Access). However, the differences between the variants are not significant as they use the same syntax, and basic functionality also remains same.


SQL vs PL/SQL?

SQL is a query language that allows you to issue a single query or execute a single insert/update/delete whereas PL/SQL is Oracle’s “Procedural Language” SQL, which allows you to write a full program (loops, variables, etc.) to accomplish multiple operations such as selects/inserts/updates/deletes.

What is a Database?

Database is nothing but an organized collection of data.
Databases allow to easily access, store, and manipulate the data.

A user can interact with the database through a variety of the Database Management Systems (DBMS).

Relational Databases (e.g., MySQL) based on a branch of algebraic set theory known as Relational Algebra. Relational databases are a collection of schemas, tables, queries, views etc. Data is stored in tables that are related.
Non-relational Databases (e.g., MongoDB) represent data in collections of JSON documents.

What is DBMS? What are its different types?

A Database Management System (DBMS) is essentially nothing more than a computerized data-keeping system.

DBMS interacts with the user, applications and the database itself.

It gives a user facilities to interact with the database so that

  1. the data stored in the database can be manipulated (modified, retrieved, added or deleted)
  2. the database structure itself can be manipulated.
  1. Relational DBMS (RDBMS):
    The data is stored in relations (tables), uses SQL.
    E.g., MySQL, SQL Server, IBM DB2, Oracle
  2. NoSQL or Non-Relational Database Management System:
    NoSQL commonly referred to as “Not Only SQL”.
    There is no concept of relations (tables), tuples (rows) and attributes (columns).Data is stored in many ways: columns, key-value store (associative array/map/dictionary), document store (JSON, XML, YAML, or binary forms like BSON, values are nested hierarchically), graph (elements interconnected with a finite number of relations between them, e.g., social relations, public transport links, road maps).
    E.g., MarkLogic, Couchbase, CloudDB, and Amazon’s Dynamo DB.
  3. Object-oriented DBMS (OODBMS):
    information is represented as objects, uses an object-oriented programming language (e.g., Java, Python, C#, C++, or JavaScript) for development.
    E.g., Versant Object Database, Objectivity/DB, ObjectStore, Caché and ZODB
  4. Hierarchical DBMS:
    organizes data into a tree-like structure, where each record has a single parent or root (one-to-many relationship). This is similar to a folder architecture in a computer system and requires the user to pass a hierarchy in order to access needed data.
    This model was primarily used by IBM’s Information Management Systems in the 60s and 70s, but they are rarely seen today due to certain operational inefficiencies..
  5. Network DBMS:
    builds on the hierarchical model by allowing many-to-many relationships between linked records, so each child record can have multiple parents. Based on mathematical set theory, the model is constructed with sets of related records. Each set consists of one owner/parent record and one or more member/child records.

    It was most popular in the 70s after it was formally defined by the Conference on Data Systems Languages (CODASYL).

What is RDBMS?

RDBMS stands for Relational Database Management System.

A relational database management system is a database management system based on the relational model of data. It also provides relational operators to manipulate the data stored in the database.

Most relational database management systems use the SQL language to access the database.

The data in RDBMS is stored in database objects called tables.
Tables are related by common fields between the columns of the table (thus the name “relational”).

Examples are MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access.
NoSQL databases have recently become popular as an alternative to RDBMS databases.

State some properties of Relational databases?

  1. In relational databases, each column should have a unique name
  2. The sequence of rows and columns in relational databases are insignificant
  3. All values are atomic and each row is unique

What is a DB schema?

Schema is the structure of the database that defines the objects in the database.

RDBMS schemas are fixed: Table structure and data types are defined at creation. To change the schema, the entire database must be altered, during which time the database must be taken offline.

Schema objects can be created and manipulated with SQL.
Although a schema is defined in SQL, the term is often used to refer to a graphical depiction of the database structure.

RDBMS vs NoSQL

  1. RDBMS stores data in structured normalized relations.
    NoSQL stores denormalized unstructured and semi-structured data.
    Data is stored in many ways: columns, key-value store (associative array/map/dictionary), document store (JSON, XML, YAML, or binary forms like BSON, values are nested hierarchically), graph (elements interconnected with a finite number of relations between them, e.g., social relations, public transport links, road maps).
  2. RDBMS schemas are fixed: Table structure and data types are defined at creation. To change the schema, the entire database must be altered, during which time the database must be taken offline.
    NoSQL are defined as schema-less or dynamic: documents can be created without having to first define their structure, each document can have its own unique structure, and you can add fields as you go.
  3. RDBMS uses SQ: for queries.
    In NoSQL the syntax can vary from database to database, e.g. Mongo DB allows data manipulation through object-oriented APIs
  4. One of the main differences between NoSQL and SQL is that NoSQL databases are considered to be more scalable than SQL databases.
    RDBMS are vertically scalable (increasing server hardware power).
    NoSQL are horizontally scalable (adding more machines into pool of resources)
  5. NoSQL does not support multi-record ACID transactions
  6. RDBMS can be configured for strong consistency.
    NoSQL: Some DB provide relatively strong consistency (e.g., MongoDB, with tunable consistency for reads) whereas others offer eventual consistency (e.g., Cassandra).
  7. No Stored Procedures in NoSql.

What are Tables, Rows and Fields?

Each database consists of a structure to hold the data and the data itself.
Data in a database is stored in one or more tables.

Tables (also referred to as “relations”) are a collection of related data entries.

Database tables are a grid organized in horizontal rows and vertical columns.
Rows represents records and columns represent the attributes.
A table has a specified number of columns but can have any number of rows.

A row (record/tuple) – each single individual, implicitly structured data item/entry that exists in a table (e.g., customer, book, project; 91 records in the Customers table).

Each column (attribute) contains specific information about the entity (e.g., customer name, book author, project start date).

The place where a particular row and column intersect, the individual cell of the table, is called a field. A field is a column in a table that is designed to maintain specific information about every record in the table.

What is Identity?

Identity (or AutoNumber) is a column that automatically generates numeric values.

A start and increment value can be set, but most DBA leave these at 1.

A GUID column also generates numbers; the value of this cannot be controlled.

Identity/GUID columns do not need to be indexed.

Row vs tuple vs column

A row is also often referred to as tuple.

Concept of tuples is widely used in programming languages where tuple is any list of heterogeneous values.

The term “tuple” comes from Set Theory, where n-tuple is an ordered list (collection, sequence) of “n” elements/records.
Given this, it might be more properly said that tuples are implemented as records even though the terms are commonly used interchangeably.

A tuple in RDBMS is commonly considered to be a row in a table or result set.
This means that all rows are tuples, but not all tuples are rows.

Any result set of a single query is a grid, so in DB terms, it consists of tuples.

Tuples are just a representation; it means that you can potentially query multiple rows/columns (as tuples) and return them as a new tuple.

E.g., if you create a view from multiple tables or do the join, each new row of a new grid will be a new tuple.

SELECT
one.fieldA AS fieldAOne,
one.fieldB AS fieldBOne,
two.fieldA AS fieldATwo
FROM tableA one
INNER JOIN tableA two

/*let's say rows of tableA are represented as tuple (Double, Text)
then the rows of the result of the query would be (Double, Text, Double)
which contains values from multiple rows*/

What are Database Entities and Relationships?

Entities: A person, place, or thing in the real world about which data can be stored in a database. Tables store data that represents one type of entity.
E.g., a bank database has a customer table to store customer information. Customer table stores this information as a set of attributes (columns within the table) for each customer.

Relation instance − A finite set of tuples in the relational database system represents relation instance. Relation instances do not have duplicate tuples.

Relation schema − A relation schema describes the relation name (table name), attributes, and their names.

Relationships: relation or links between entities that have something to do with each other.
E.g, The customer name is related to the customer account number and contact information, which might be in the same table. There can also be relationships between separate tables (for example, customer to accounts).

Name different kinds of Database Relationships

Database Relationship is defined as the connection between the tables in a database. There are three types of relationship in DBMS.

  • One-to-one: Both tables can have only one record on either side of the relationship. Each primary key value relates to only one (or no) record in the related table. They’re like spouses—you may or may not be married, but if you are, both you and your spouse have only one spouse.
  • One-to-many (or many-to-one): The primary key table contains only one record that relates to none, one, or many records in the related table. This relationship is similar to the one between you and a parent. You have only one mother, but your mother may have several children. (E.g, Hierarchical DBMS)
  • Many-to-many: Each record in both tables can relate to any number of records (or no records) in the other table. For instance, if you have several siblings, so do your siblings (have many siblings).
    E.g., Network DBMS.
  • Self-Referencing: One table with a self referencing parent. It is the common database design pattern for deeply nested hierarchies.

What are the different tables/database ENGINEs present in MySQL?

MySQL does not come with a single engine. Rather, it ships with several engines, all buried within the MySQL server.

MySQL uses internal engines to manage and manipulate the data. When you use the CREATE TABLE statement, some engine is used to actually create the tables, and when you use the SELECT statement or perform any other database processing, that engine is used internally to process your request.

If you omit the “ENGINE=” statement, the default engine is used (most likely MyISAM), and most of your SQL statements will work as is. But not all of them will, and that is why this is important

Foreign Keys Can’t Span Engines
A table using one engine cannot have a foreign key referring to a table that uses another engine.

    ENGINES:

  1. MyISAM – the default database engine used in MySQL. Very high-performance engine that supports full-text searching, but does not support transactional processing.
  2. MEMORY – functionally equivalent to MyISAM, but as data is stored in memory (instead of on disk) it is extremely fast (and ideally suited for temporary tables).
  3. INNO DB – transaction-safe engine, does not support full-text searching.
  4. Heap
  5. Merge
  6. ISAM

Define Data Integrity

Data Integrity defines the accuracy and consistency, and reliability of data stored in a database.

Maintaining data integrity means making sure the data remains intact and unchanged throughout its entire life cycle. This includes the capture of the data, storage, updates, transfers, backups, etc.

    4 Types of Data Integrity

  1. Entity (Row) integrity – defines each row to be unique within its table. No two rows can be the same. To achieve this, a primary key can be defined.
  2. Referential integrity – When two or more tables have a relationship, we have to ensure that the foreign key value matches the primary key value at all times and there are no orphaned records.
  3. Domain (Column) integrity – concerns the validity of entries for a given column. Selecting the appropriate data type for a column is the first step in maintaining domain integrity.
    Other steps could include, setting up appropriate constraints (e.g., Check and Default constraints) and rules to define the data format and/or restricting the range of possible values.
  4. User-defined integrity – allows the user to apply business rules to the database that aren’t covered by any of the other three data integrity types.

Every time data is processed there’s a risk that it could get corrupted (whether accidentally or maliciously). E.g., a user could accidentally try to enter a phone number into a date field. If the system enforces data integrity, it will prevent the user from making these mistakes.

    Some more examples of where data integrity is at risk:

  • A user tries to enter a date outside an acceptable range.
  • A user tries to enter a phone number in the wrong format.
  • A bug in an application attempts to delete the wrong record.
  • While transferring data between two databases, the developer accidentally tries to insert the data into the wrong table.
  • While transferring data between two databases, the network went down.
    A user tries to delete a record in a table, but another table is referencing that record as part of a relationship.
  • A user tries to update a primary key value when there’s already a foreign key in a related table pointing to that value.
  • A developer forgets that he’s on a production system and starts entering test data directly into the database.
  • A hacker manages to steal all user passwords from the database.
  • A hacker hacks into the network and drops the database (i.e. deletes it and all its data).

Explain SQL Data Types

Each column in a database table is required to have a name and a data type.

An SQL developer must decide what type of data that will be stored inside each column when creating a table. The data type is a guideline for SQL to understand what type of data is expected inside of each column, and it also identifies how SQL will interact with the stored data.

E.g., integer, character, money, date and time, binary etc.

What is the largest value that can be stored in a BYTE data field?

The largest number that can be represented in a single byte is 11111111 or 255.
The number of possible values is 256
(i.e. 255 (the largest possible value) plus 1 (zero), or 28).

What are the possible values for the BOOLEAN data field?

For a BOOLEAN data field, two values are possible: -1(true) and 0(false).

CHAR vs VARCHAR datatype in SQL

Although both CHAR and VARCHAR are used for characters datatype, they differ in storage and retrieval.

  1. CHAR column length is fixed while VARCHAR length is variable.
  2. The maximum no. of characters CHAR data type can hold is 255.
    VARCHAR can hold up to 4000 characters.
  3. CHAR is 50% faster than VARCHAR.
  4. CHAR uses static memory allocation while VARCHAR uses dynamic memory allocation.

CHAR vs VARCHAR2 datatype in SQL

Both CHAR and VARCHAR2 are used for characters datatype.

  1. VARCHAR2 is used for character strings of variable length.
  2. CHAR is used for strings of fixed length.

E.g., CHAR(7) can only store 7 characters, a string of any other length will not be allowed.
VARCHAR2(7) can store string of any length, e.g., 6,5, or 2 will be allowed in this variable.

Can you elaborate on BLOB and TEXT in MySQL?

BLOB(Binary Large Object) is used to hold a variable amount of data and holds up to 65,535 bytes of data.

  1. TINYBLOB
  2. BLOB
  3. MEDIUMBLOB
  4. LONGBLOB

TEXT is used to store string values and holds up to a maximum length of 65,535 characters.

  1. TINYTEXT
  2. TEXT
  3. MEDIUMTEXT
  4. LONGTEXT

NULL vs Zero vs Blank Space/Empty String

A NULL value is not at all same as that of zero or a blank space.

Every table column is either a NULL column or a NOT NULL column, and that state is specified in the table definition at creation time.
NULL value represents a value which is nothing, empty, the lack of a value, no value at all, unavailable, a missing unknown value, assigned or not applicable. It indicates data that does not exist.

A column that allows NULL values also allows rows to be inserted with no value at all in that column.
If a field in a table is optional, it is possible to insert a new record or update a record without providing a value for the field. The field will be saved with a NULL value.

Blank space/empty string is a character. If you were to specify ” (two single quotes with nothing in between them), that would be allowed in a NOT NULL column. An empty string is a valid value; it is not no value. NULL values are specified with the keyword NULL, not with an empty string.

Zero is a number.

What is a CONSTRAINT?

Constraints are used to limit the type of data that can go into a table. This ensures the accuracy and reliability of the data in the table. If there is any violation between the constraint and the data action, the action is aborted.

  1. NOT NULL – Ensures that a column cannot have a NULL value
  2. CHECK – Ensures that all values in a column are different
  3. DEFAULT – Sets a default value for a column when no value is specified
  4. UNIQUE – Ensures that all values in a column are different
  5. PRIMARY KEY – A combination of a NOT NULL and UNIQUE. Uniquely identifies each row in a table
  6. FOREIGN KEY – Uniquely identifies a row/record in another table
  7. INDEX – Used to create and retrieve data from the database very quickly

What are the various levels of CONSTRAINTs?

Constraints can be column level or table level. Column level constraints apply to a column, and table level constraints apply to the whole table.

What is a UNIQUE key?

A unique index does not allow any duplicate values to be inserted into the table.
There is no difference between Unique Index and Unique Constraint. Unique Constraint creates Unique Index to maintain the constraint to prevent duplicate keys.
Unique indexes are used not only for performance, but also for data integrity.

What is a PRIMARY KEY?

The PRIMARY KEY constraint uniquely identifies each record in a table.
It is a combination of a NOT NULL and UNIQUE: Primary keys must contain UNIQUE values, and cannot contain NULL values.
A table can have only one primary key, which may consist of single or multiple fields. If a single column is used for the primary key, it must be unique; if multiple columns are used, the combination of them must be unique.

PRIMARY KEY vs UNIQUE

Both the UNIQUE and PRIMARY KEY constraints provide a guarantee for uniqueness for a column or set of columns.

A PRIMARY KEY constraint automatically has a UNIQUE constraint.

  1. You can have many UNIQUE constraints per table, but only one PRIMARY KEY per table.
  2. PRIMARY KEY does not allow NULL value, UNIQUE constraints can have NULL values.
  3. PRIMARY KEY creates the clustered index automatically, UNIQUE key creates non-clustered index.

What are Keys of a relation?

  1. Candidate key – minimal set of attributes which can uniquely identify a tuple.
    The value of Candidate Key is unique and non-null for every tuple.
    E.g., STUD_NO as well as STUD_PHONE both are candidate keys for relation STUDENT.

    The candidate key can be simple (having only one attribute) or composite.
    E.g., {STUD_NO, COURSE_NO} is a composite candidate key for relation STUDENT_COURSE.

    There can be more than 1 Candidate key of a relation and its proper subset can’t determine tuple uniquely.
    Candidate key is a minimal superkey, i.e., no proper subset of Candidate key attributes can be a superkey.


  2. Super key – set of attributes which can uniquely identify a tuple.
    Adding zero or more attributes to candidate key generates super key.
    No two rows can have the same value of super key attributes.
    Candidate key is always a Super key but vice versa is not true.
    E.g., STUD_NO, (STUD_NO, STUD_NAME) etc.

  3. A Primary Key and Alternate Key
    Among various candidate keys, one is selected as most important and becomes the primary key and others are alternate keys.
    There cannot be more that one primary keys in a table.
    E.g., STUD_NO and STUD_PHONE both are candidate keys for relation STUDENT but STUD_NO can be chosen as Primary key (only one out of many Candidate keys).

  4. Foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table.
    Referenced attribute of referenced relation should be primary key for it.
    E.g., STUD_NO in STUDENT_COURSE is a foreign key to STUD_NO in STUDENT relation.

    A table can have many foreign keys and only one primary key.
    Unlike, Primary Key, Foreign Key can be NULL
    as well as may contain duplicate tuples i.e. it need not follow uniqueness constraint.

    For Example, STUD_NO in STUDENT_COURSE relation is not unique. It has been repeated for the first and third tuple. However, the STUD_NO in STUDENT relation is a primary key and it needs to be always unique and it cannot be null.

What is a FOREIGN KEY?

FOREIGN KEY (also referred to as “Referential Integrity Constraint”) links two tables together.
It is a field (or collection of fields) in one table
that points to the PRIMARY KEY in another table.

The table containing the FOREIGN KEY is called the child table,
and the table containing the PRIMARY KEY is called the referenced or parent table.

Syntax

CREATE TABLE child_table(
[columns]
PRIMARY KEY (column_name)
FOREIGN KEY (column_name) REFERENCES parent_table(column_name)
);

// for multiple columns as a FOREIGN KEY
CREATE TABLE child_table(
[columns]
PRIMARY KEY (column_name)
CONSTRAINT FK_name FOREIGN KEY (column_name1, column_name2)
REFERENCES parent_table(column_name1, column_name2)
);

The FOREIGN KEY constraint is used to prevent:

  1. actions that would destroy links between tables.
  2. invalid data from being inserted into the foreign key column,
    because it has to be one of the values contained in the table it points to.

    The FOREIGN KEY constraint could be described as “Allow No Orphans!”
    NoChild should ever be allowed in the child table if it hasn’t a Parent (a row in the parent table).

Several relational behavior options are available.
Unless otherwise specified, the default behavior is:
ON DELETE RESTRICT ON UPDATE CASCADE

  1. CASCADE = update child keys when parent is updated,
    this is dangerous on delete = “Leave No Orphans!” (kill children too).
  2. RESTRICT = no action (on delete = “You Can’t Kill Parents!”)
  3. SET DEFAULT = sets the child keys to the specified default value
  4. SET NULL = set all the child keys to null values

Is it possible for a table to have more than one foreign key?

A table can have many foreign keys and only one primary key.

What is AUTO-INCREMENT?

Auto increment keyword automatically generates a unique number when a new record is inserted into the table. MySQL automatically assigns the next available number each time a row is added to a table.

Only one AUTO_INCREMENT column is allowed per table, and it must be indexed (for example, by making it a primary key).

What is an INDEX?

INDEX is a performance optimization technique
that speeds up the data retrieval process.

Users cannot see the INDEXes, they are just used to speed up searches/queries.

Data can be stored only in one order on a disk.
To support faster access according to different values,
faster search like binary search for different values is desired.
For this purpose, indexes are created on tables.
They allow faster search according to different frequently searched values.

Searching whole table without INDEX is called full table scan.

A very good analogy is to think of a database index as an index in a book.
If you have a book regarding countries and you are looking for Australia,
instead of flipping through the entire book
(which is the equivalent of a full table scan in database terminology)
you can just go to the index at the back of the book
(index object in a database),
which will point to the exact pages that contain information about Australia.

Similarly to a book index containing a page number,
a database index contains a pointer to the row
that has the value that you are searching for
in your SQL.

Name different types of INDEXes

  1. Clustered Index reorders the physical order of how the data is physically stored on disk. Example is a PRIMARY KEY.

    When a table has a clustered index, the table is called a clustered table.
    If a table has no clustered index, data rows are stored in an unordered structure called a heap.

  2. Nonclustered Index does not alter the physical order of the table. It creates a complete separate index object within the table, which contains key values, each of which has a pointer to the original data row that has the value that you are searching for in your SQL.

    Updating a table with nonclustered index takes more time than updating a table without it (because the indexes also need to update). So, indexes need to be created only on columns that are frequently searched.

Clustered vs Nonclustered INDEX?

  1. Clustered index reorders the physical order of how the data is physically stored on disk. Database sorts out rows by the column which is set to be clustered index.

    A nonclustered index does not alter the way the data is stored. It creates a complete separate index object within the table, which contains key values, each of which has a pointer to the original data row that has the value that you are searching for in your SQL.

  2. There can be only one clustered index per table, because the data can be physically stored in only one order. In fact, the only time the data rows in a table are stored in sorted order is when the table contains a clustered index.
    Each table can have 999 nonclustered indexes.
  3. Unlike clustered indexes, nonclustered indexes need extra space on disk to maintain the extra copy of data.

  4. Both clustered and nonclustered indexes can be UNIQUE.
    When you create a table with a UNIQUE constraint, Database Engine automatically creates a non-clustered index.

    If you configure a PRIMARY KEY, Database Engine automatically creates a clustered index, unless a clustered index already exists.

  5. Clustered index search is relatively faster whereas reading from non clustered index.

What are the different operators available in SQL?

    There are three operators available in SQL, namely:

  1. Arithmetic Operators
    + Add
    – Subtract
    * Multiply
    / Divide
    % Modulo
  2. Comparison Operators
    = Equal to
    > Greater than
    < Less than >= Greater than or equal to
    <= Less than or equal to <> Not equal to
  3. Logical Operators
    ALL TRUE if all of the subquery values meet the condition
    AND TRUE if all the conditions separated by AND is TRUE
    ANY TRUE if any of the subquery values meet the condition
    BETWEEN TRUE if the operand is within the range of comparisons
    EXISTS TRUE if the subquery returns one or more records
    IN TRUE if the operand is equal to one of a list of expressions
    LIKE TRUE if the operand matches a pattern
    NOT Displays a record if the condition(s) is NOT TRUE
    OR TRUE if any of the conditions separated by OR is TRUE
    SOME TRUE if any of the subquery values meet the condition
  4. Bitwise Operators
    & Bitwise AND
    | Bitwise OR
    ^ Bitwise exclusive OR
  5. Compound Operators
    += Add equals
    -= Subtract equals
    *= Multiply equals
    /= Divide equals
    %= Modulo equals
    &= Bitwise AND equals
    ^-= Bitwise exclusive equals
    |*= Bitwise OR equals

How many row comparison operators are used while working with a subquery?

There are three row comparison operators
which are used in subqueries:
IN/NOT IN, ANY and ALL.

BETWEEN vs IN conditional operators

  1. BETWEEN displays rows based on a range of values.

    SELECT * FROM Students where ROLL_NO BETWEEN 10 AND 30;
    // select all rows from table Students where field value of ROLL_NO is between 10 and 30

  2. IN checks for values contained in a specific set.

    SELECT * FROM students where ROLL_NO IN (5,10,15);
    //select all rows from table Students where field value of ROLL_NO is either 5 or 10 or 15.

What is a query?

A database query is a request for data from a database. Usually the request is to retrieve data; however, data can also be manipulated using queries. The data can come from one or more tables, or even other queries. A query must be written in the language the database requires; usually, that language is Structured Query Language (SQL).

Types/categories/subsets of queries/statements/languages in SQL

  1. Data definition language (DDL) – define the structure of database or table.

    These statements are used to:

    • create/delete tables and databases (CREATE/DROP),
    • adjust the structure of a database table (ALTER TABLE)
    • and define field properties or table properties.

    Statements: CREATE, ALTER, DROP, TRUNCATE, RENAME.


  2. Data manipulation language (DML) is utilized most often. It is used for managing data within table object. These statements are used to recover data from a database (SELECT) order, add new data to a database (INSERT order), adjust data (UPDATE) and expel data from a database (DELETE).
    Statements: SELECT, INSERT, UPDATE, DELETE; also MERGE, LOCK TABLE, CALL, EXPLAIN PLAN.
    Sometimes the SELECT command is separated into a separate group of Data Query Language (DQL) used to retrieve information from a database.

  3. Data Control Language (DCL) – defines privileges to control the data in the database. It is used to control who can access the data. These directions structure the center of the social database security.
    Statements: GRANT, REVOKE, ANALYZE, AUDIT, COMMENT

  4. Transaction Control Language (TCL) – commands are used to manage the transactions in the database. These are used to manage the changes made by DML statements. It also allows the statements to be grouped together into logical transactions.
    COMMIT, ROLLBACK, SAVEPOINT,
    SET TRANSACTION.

  5. Session Control Statements (SCS) – dynamically manage properties of a user session. These statements do not implicitly commit the current transaction.
    Statements: ALTER SESSION, SET ROLE.

What is a subquery?

A subquery (inner/nested query) is a query within another query.
The outer query is called as main query,
and inner query is called subquery or nested query.

Subquery is embedded within the WHERE clause and is always executed first.
The result of subquery is passed on to the main query to further filter the data to be retrieved.
Subqueries can be used with the SELECT, INSERT, UPDATE, and DELETE statements along with the operators like =, <, >, >=, <=, IN, BETWEEN, etc.

Name different types of subqueries

  1. Regular (non-correlated) subquery is run just once and returns a value or a set of values to be used by the outer/main query.
      Regular subqueries can be further characterized as Single-row vs Multiple-row

    1. Single-row subquery is used when the outer query’s results are based on a single, unknown value. Single-row subquery can return zero or only one row of results consisting of only one column to the outer query. The following comparison operators can be used: [=], [>], [>=], [<=], [<>] or [!=]
    2. Multiple-row subquery – Subqueries that can return more than one row (but only one column) to the outer statement. Multiple-row subqueries can be used with IN/NOT IN, ANY, or ALL comparison operators. Multiple-row subqueries are used most commonly in WHERE and HAVING clauses.

    These queries are commonly used to generate result sets that will be passed to a DML or SELECT statement for further processing. Both single-row and multiple-row subqueries will be evaluated once, before the parent query is run.

  2. Correlated (synchronized) subquery executes multiple times, once per each row returned by the outer query.
    Because the subquery is correlated with a column of the outer query, it must be re-executed for each row of the result.

    A correlated subquery depends upon the outer query and cannot execute in isolation, but a regular or non-correlated subquery doesn’t depend on the outer query and can execute in isolation (or independently of the outer query).

    One of the most popular examples of the correlated subquery is about finding Nth highest salary of an employee in a given table.

What is a Synchronized Subquery?

Refer to Correlated Subquery.

Can you tell the order of SQL SELECT statement?

SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY.

What is a clause?

SQL clause helps to limit the result set by providing a condition to the query.
This filters certain rows from the whole set of records.
E.g., a query that has WHERE condition or HAVING condition.

What are different Clauses used in SQL?

  1. WHERE Clause
    filters only those records that fulfill the condition.

    Syntax:

    SELECT column_name(s)
    FROM table_name
    WHERE condition;

  2. GROUP BY Clause
    is used with SELECT statement to group the result of the executed query using the value specified in it.
    It matches the value with the column name in tables and groups the end result accordingly.

    Syntax:

    SELECT column_name(s)
    FROM table_name
    GROUP BY column_name;

  3. HAVING clause
    is used in association with the GROUP BY clause.
    It filters groups after data is grouped.

    Syntax:

    SELECT column_name(s)
    FROM table_name
    GROUP BY column_name
    HAVING condition;

  4. ORDER BY clause
    defines the order of the query output.
    Ascending (ASC) order is the default, descending (DESC) can be set up explicitly.

    Syntax:
    SELECT column_name(s)
    FROM table_name
    WHERE condition
    ORDER BY column_name ASC|DESC;

  5. ON clause
    provides the condition when tables are JOINed.

    Syntax:
    SELECT table1.column_name(s), table2.column_name(s)
    FROM table1
    JOIN table2
    ON table1.column_name = table2.column_name;

  6. USING clause
    is used to check equality of columns when tables are JOINed.
    It can be used instead ON clause in Joins.

    Syntax:

    SELECT column_name(s)
    FROM table_name
    JOIN table_name
    USING (column_name);

WHERE vs HAVING clause

HAVING clause can be used only with SELECT statement.
It is usually used in a GROUP BY clause.
When GROUP BY is not used, HAVING and WHERE clauses are essentially equivalent.

However, when GROUP BY is used:

  1. WHERE filters rows and HAVING filters groups.
  2. WHERE filters before data is grouped, and HAVING filters after data is grouped.
    This is an important distinction; rows that are eliminated by a WHERE clause are not included in the group. This could change the calculated values, which in turn could affect which groups are filtered based on the use of those values in the HAVING clause.
  3. All types of WHERE clauses can be used with HAVING.
  4. The HAVING clause is used to filter values from a group (i.e., to check conditions after aggregation into groups has been performed).
    WHERE clause does not work if filtering is based on the group aggregate value, it can only filter values of specific rows.

ORDER BY vs GROUP BY clause

  1. ORDER BY sorts generated output.
    GROUP BY groups rows.
  2. ORDER BY – any columns (even columns not selected) may be used.
    GROUP BY – only selected columns or expressions columns may be used, and every selected column expression must be used.
  3. ORDER BY is never required.
    GROUP BY is required with aggregate functions.

What do you mean by query optimization?

Query optimization is a process in which a database system compares different query strategies and select the query with the least cost.

How can you increase the performance of MySQL SELECT query?

The SELECT statement is used to select data from a database and the data returned is stored in a result table, called the result-set. The SELECT statement can be either individually used or can be used with other statements such as ORDER BY, GROUP BY, and HAVING clause.

To increase the performance of a MySQL SELECT query, you can use the LIMIT clause to limit MySQL from further search in a table, after collecting the required number of records.

What is the difference between DROP, DELETE and TRUNCATE commands?

Statement type: Delete and Drop are DML, Truncate is DDL

Row scope:
DELETE can be used to remove all rows or only a subset of rows. A WHERE clause can be used to only remove some rows. If no WHERE condition is specified, all rows will be removed.
TRUNCATE removes all the rows, but not the table itself, it is similar to deleting with no WHERE clause. Ideal for cleaning out data from a temporary table. It does preserve the structure of the table for future use.
DROP removes a table from the database, remove the table definitions as well as the data. All indexes and privileges will also be removed. There are some cases when you can’t do a truncate or drop table, such as when you have a foreign key reference.

Commit and Rollback:
DELETE: Commit and Rollback can be performed after delete statement. Deleting records from a table logs every deletion and executes delete triggers for the records deleted.
TRUNCATE and DROP cannot be rolled back, empty a table without logging each row, no triggers will be fired.

Performance: TRUCATE is faster and doesn’t use as much undo space as DELETE. DELETE must read the records, check constraints, update the block, update indexes, and generate redo/undo. All of that takes time.

What is a temporary table

Temporary table is a database table that exists temporarily on the database server.
A temporary table stores a subset of data from a normal table for a certain period of time.
Temporary tables use the same selection, update, and join capabilities as typical tables.

  1. Temporary table will last as long as the session is alive and will be deleted will be deleted when the current client session terminates.
  2. If you run the code in a PHP script, the temporary table will be destroyed automatically when the script finishes executing.
  3. If created inside a stored procedure they are destroyed upon completion of the stored procedure.
  4. If you are connected to the MySQL database server through the MySQL client program, then the temporary table will exist until you close the client or manually destroy the table.

Local vs global temporary tables

  1. Local temporary table is only accessible to the connection that created that temporary table. It is not accessible to other connections.
    Global temporary tables are accessible to all the open connections.
  2. Local temporary table exists only for the duration of the connection.
    A global temporary table remains in the database permanently, but the rows exist only within a given connection. When connection is closed, the data in the global temporary table disappears.
  3. In SQL Server, the name of the temporary table starts with a hash symbol (#).
    The name of the global temporary table starts with a double hash symbol (##).

What is an alias command?

SQL aliases are used to give a table, or a column in a table, a temporary name.

Aliases are assigned with the AS keyword. Double quotation marks or square brackets are required if the alias name contains spaces.

An alias only exists for the duration of the query.

    Aliases can be useful when:

  1. There are more than one table involved in a query
  2. Functions are used in the query
  3. Column names are big or not very readable
  4. Two or more columns are combined together


SELECT o.OrderID, o.OrderDate, c.CustomerName
FROM Customers AS c, Orders AS o
WHERE c.CustomerName="Around the Horn" AND c.CustomerID=o.CustomerID;
--selects all the orders from the customer with CustomerID=4 (Around the Horn).
--"Customers" and "Orders" tables are given the table aliases of "c" and "o" respectively

Explain the difference between Rename and Alias?

Rename is a permanent name given to a table or column.
Alias is a temporary name given to a table or column.

What is a JOIN?

A JOIN clause is used to combine rows from two or more tables, based on a relationship of columns between them.

It is important to understand that a join is not a physical entity, in other words, it does not exist in the actual database tables. A join is created by MySQL as needed, and it persists for the duration of the query execution.

Name different types of JOINs

  1. (INNER) JOIN: Returns records that have matching values in both tables. If there are records in the “Orders” table that do not have matches in “Customers”, these orders will not be shown.
    SQL INNER JOIN explained
  2. LEFT (OUTER) JOIN: Return all records from the left table, and the matched records from the right table. The result is NULL from the right side, if there is no match. Include all of the rows from the table on the left even if one of the columns is NULL.
    SQL LEFT OUTER JOIN explained
  3. RIGHT (OUTER) JOIN: Return all records from the right table, and the matched records from the left table. The result is NULL from the left side, when there is no match. Include all of the rows from the table on the right even if one of the columns is NULL.
    SQL RIGHT OUTER JOIN explained
  4. FULL (OUTER) JOIN: Return all records when there is a match in either left or right table. If there are rows in table1 that do not have matches in table2, or if there are rows in table2 that do not have matches in table1, those rows will be listed as well.
    FULL OUTER JOIN explained

What is a Self-Join?

A self-join is a query in which a table is joined to itself.
This is also called Unary relationships, especially when the table has a FOREIGN KEY which references its own PRIMARY KEY.

Syntax:

SELECT
  T1.column_name(s) [AS alias_name(s)],
  t2.column_name(s) [AS alias_name(s)]
FROM
  table1 T1,
  table2 T2
WHERE condition (e.g., T1.common_field = T2.common_field);

--T1 and T2 are different table aliases for the same table
--ALIAS AS can be used for the same table comparison

The self join can be viewed as a join of two copies of the same table.

To join a table itself means that each row of the table is combined with itself and with every other row of the table.

Self join is used to compare values in a column with other values in the same column in the same table.

Self joins are often used to replace statements using subqueries that retrieve data from the same table as the outer statement. Although the end result is the same, sometimes these joins execute far more quickly than they do subqueries.

What is a CROSS JOIN?

CROSS JOIN returns a Cartesian Product.

Cartesian Product – the results returned by a table relationship without a join condition.

The number of rows retrieved is the number of rows in the first table multiplied by the number of rows in the second table (n*m).

SQL CROSS JOIN explained

Syntax:

SELECT
  */column_names
FROM
  T1
    CROSS JOIN
  T2
[WHERE condition];

An alternative way of achieving the same result is to use column names separated by commas after SELECT and mentioning the table names involved, after a FROM clause.

SELECT
  */column_names
FROM
  T1,
  T2
[WHERE condition];

Without the WHERE clause, every row in the first table is paired with every row in the second table, regardless of if they logically go together or not.

If WHERE clause was to be used for filtering, the query would work like an INNER JOIN with the ON clause.

INNER JOIN vs CROSS JOIN

In standard SQL the difference between INNER JOIN and CROSS JOIN is ON clause can be used with INNER JOIN on the other hand ON clause can’t be used with CROSS JOIN.

In CROSS JOIN without the WHERE clause, every row in the first table is paired with every row in the second table, regardless of if they logically go together or not.

If WHERE clause was to be used for filtering, the query would work like an INNER JOIN with the ON clause.

All rows from both/all tables are selected as long as there is a match between the columns.
If there are records in table1 that do not have matches table2, these records will not be shown.

JOIN vs. INNER JOIN vs. FULL OUTER JOIN

JOIN and INNER JOIN are the same, the INNER keyword is optional as all joins are considered to be inner joins unless otherwise specified.

INNER JOIN returns only matched rows on both sides, duplicate values allowed.

FULL OUTER JOIN returns INNER JOIN + unmatched rows from each table.

FULL OUTER JOIN vs CROSS JOIN

FULL OUTER JOIN returns INNER JOIN + unmatched rows from each table.

You can also visualize this as LEFT OUTER JOIN + RIGHT OUTER JOIN in the same query:

  1. every row from Table1 matching what could be matched in Table2,
  2. for T2.columns that don’t match with T1, display T2 side, with NULLs on T1/LEFT side,
  3. for T1.columns that don’t match on the T2, display T1 side with NULLs on T2/RIGHT side.

CROSS JOIN returns a Cartesian Product.

Cartesian Product – the results returned by a table relationship without a join condition.

The number of rows retrieved is the number of rows in the first table multiplied by the number of rows in the second table (n*m).

Example:

T1 T2
1 NULL
2 2
NULL 3

FULL OUTER JOIN will return 3 rows only:

  1. first row will left join T1 without matching value in T2 (NULL)
  2. second row will be matched
  3. third row will right join T2 without matching value in T1 (NULL)

CROSS JOIN will return 3*3 = 9 rows with 3 possible combinations per each row.

SQL CROSS JOIN explained

INNER JOIN vs EQUI JOIN

The main confusion about equi join and inner join differences comes from dividing comparison and equality operators into two different groups. In reality, equal sign (“=”) belongs to the group of comparison operators.

This makes the relationship inclusive, not exclusive. Animals (THETA JOINS) can be wild (NON EQUI JOIN) or domesticated (EQUI JOIN), but cats (INNER JOIN) can be either, it only depends on a particular case.

The concept of joins comes from Relational Algebra, where Conditional (Theta) Join is used when you want to join two or more relations based on certain conditions.

As opposed to CROSS JOIN (where no condition is present), Theta join allows for arbitrary comparison relationships (comparison operators, e.g. ≥, in the ON clause in SQL).

General mathematical Theta join is not present in SQL syntaxis.
However the concept of Theta join can help to understand SQL joins.

Theta join is the broader umbrella that can be further divided using
two independent/unrelated kinds of classifications:

  1. INNER JOIN vs OUTER JOIN (LEFT, RIGHT, FULL OUTER)
  2. EQUI JOIN vs NON EQUI JOIN

EQUI JOIN vs NON EQUI JOIN

  1. Both are subsets of the general Theta join.
  2. Both can belong to INNER JOINs and all three types of OUTER JOINs (LEFT, RIGHT, FULL OUTER).

EQUI JOIN is a special case of a Theta join where the operator θ (theta) is the equality operator (=) used as a join condition for matching rows.

This is the most used type of join and databases are optimized for it.

NON EQUI JOIN uses comparison operator (but not “=”) as a join condition for matching rows.

Databases are not optimized for this type of join.

INNER JOIN returns only those rows (any comparison operator, including “=”) that satisfy the join condition (ON keyword allows to specify the predicate used to filter the rows).

INNER JOIN is a THETA JOIN that can be based on equality to match rows (EQUI JOIN case), or on some other comparison (NON EQUI JOIN case).

Natural join vs Cross-Join vs Inner Join vs Outer Join vs Equi Join

Cross Join returns a Cartesian Product, where every row in the first table is paired with every row in the second table, regardless of if they logically go together or not.

Standard joins (INNER, OUTER) return all data, even multiple occurrences of the same column.

For INNER JOIN the matching row in the joined table is required for a row from the first table to be returned. Inner join uses a comparison operator.

For OUTER JOIN (LEFT, RIGHT, FULL) the matching row in the joined table is not required for a row from the first table to be returned.

A natural join eliminates multiple occurrences so only one of each column is returned and results are unique/distinct. A natural join assumes the join criteria to be where same-named columns in both table match.
Natural join does not use any comparison operator.

Syntax:
SELECT *
FROM table1
NATURAL JOIN table2;

The SQL NATURAL JOIN is a type of EQUI JOIN and is structured in such a way that, columns with the same name of associated tables will appear once only.

A NATURAL JOIN can be an INNER join, a LEFT OUTER join, or a RIGHT OUTER join. The default is INNER join.

Explain the MERGE statement

The MERGE statement is used to make changes in one table (target table) based on values matched from anther (source table).
The MERGE statement combines INSERT, DELETE, and UPDATE operations into one table.


MERGE targetTable
Using sourceTable
ON mergeCondition
WHEN MATCHED
THEN updateStatement
WHEN NOT MATCHED BY TARGET
THEN insertStatement
WHEN NOT MATCHED BY SOURCE
THEN deleteStatement

If a row exists in both tables (MATCHED) an UPDATE will be performed on the target table.
Rows found in the SourceTable, but not in the target (NOT MATCHED BY TARGET) will be INSERTed.
For rows found only in the target, but not the source table (NOT MATCHED BY SOURCE), DELETE will be performed.

What are the different set operators available in SQL?

Some of the available set operators are – UNION, MINUS and INTERSECT operators.

What are UNION, MINUS and INTERSECT commands?


Syntax:
SELECT column1, column2,... FROM table11
UNION/MINUS/INTERSECT
SELECT column1, column2,... FROM table2;
WHERE ...
ORDER BY column1, column2, …;

    The basic rules for a query that uses UNION, MINUS and INTERSECT operator are the same:

  1. The number and order of columns in both tables must be the same.
  2. The data types of the corresponding columns in both queries must be compatible.

UNIONs are also called Combined or Compound Queries.
UNION operator is used to combine the results of one ore more tables with similarly structured data, using multiple SELECT statements that return the results as a single query result set.

SQL UNION explained
UNION includes only distinct results from both tables. The result is somewhat similar to INNER JOIN minus duplicate values (but organized differently).
SQL UNION ALL explained
UNION ALL includes duplicate rows and is faster. The result is somewhat similar to INNER JOIN (but organized differently).
MINUS operator is used to returns distinct values from the query of the first table that are not found on the result of the query of the second table.
SQL MINUS explained
Some database systems e.g., Microsoft SQL Server, PostgreSQL, etc., use the EXCEPT instead of MINUS, which have the same function.

The INTERSECT operator returns the distinct rows (intersection) of both result sets.
SQL INTERSECT explained

UNION vs UNION ALL

UNION includes only distinct results from both tables.
UNION ALL includes duplicate rows and is faster.

UNION vs JOIN

JOIN combines columns from two tables while the UNION combines rows from two queries. In other words, join appends the result sets horizontally while union appends result set vertically.
SQL UNION vs JOIN explained

JOIN does not force distinct results (e.g., duplicate values in joins with many-to-many relationships) while UNION does not allow duplicate values, you need to specify the ALL keyword explicitly to retain duplicate rows.

What is a VIEW?

Views are virtual tables based on the result-set of an SQL statement.
Views are just that – views/snapshots into data stored elsewhere.
Views contain no data themselves, but they contain queries that retrieve data from other tables.

Syntax:
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name(s)
WHERE condition;

A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.

After views are created, they can be used in the same way as tables. You can perform SELECT operations, filter and sort data, join views to other views or tables, and possibly even add and update data (with some restrictions).

What are VIEWs used for, what are their benefits?

  1. A view always shows up-to-date data.
    Since views are snapshots into data stored elsewhere, when data is added or changed in those tables, the views will return that changed data.
  2. Views take very little space to store.
    The database contains only the definition of a view, not a copy of all the data which it presents.
  3. Views are reusable.
    After the query is written, it can be reused easily, without having to know the details of the underlying query itself.
  4. To ensure data independence
    Views limit the degree of exposure of the underlying tables to the outer world. Users can be given access to only parts of a table instead of complete tables.
  5. To secure data. A given user may have permission to query the view, while denied access to the rest of the base table.
  6. To change data formatting and representation.
    Views can return data formatted and presented differently from their underlying tables.
  7. To make complex queries simple.
    Views can join and simplify multiple tables into a single virtual table.
    Views can act as aggregated tables, where the database engine aggregates data (sum, average etc.) and presents the calculated results as part of the data

How we can update a VIEW?

A view can be updated with the CREATE OR REPLACE VIEW command.

Syntax:

CREATE OR REPLACE VIEW view_name AS
SELECT column_name(s)
FROM table_name(s)
WHERE condition

What’s the difference between materialized and dynamic VIEW?

  1. Materialized views are disk based and are updated periodically based upon the query definition.
    Dynamic views are virtual only and run the query definition each time they are accessed.
  2. For materialized view a materialized table is created or updated infrequently and it must be synchronized with its associated base tables.
    A dynamic view may be created every time that a specific view is requested by the user.

h4>Does View contain Data?

No, Views are virtual in structure.

Can a View based on another View?

Yes, A View is based on another View.

What is a TRIGGER?

The MySQL trigger is a database object that is associated with a table. It will be activated when a defined action is executed on the table.

A SQL trigger is a special type of stored procedure. It is special because it is not called directly like a stored procedure. The main difference between a trigger and a stored procedure is that a trigger is called automatically when a data modification event is made against a table whereas a stored procedure must be called explicitly.

A trigger is automatically executed by MySQL in response to any of these statements:
1. DELETE
2. INSERT
3. UPDATE
No other MySQL statements support triggers.
Triggers are only supported on tables, not on views (and not on temporary tables).

Some uses for triggers are to perform checks of values to be inserted into a table or to perform calculations on values involved in an update.
SQL triggers provide an alternative way to check the data integrity or run scheduled tasks, catch errors in business logic in the database layer, and audit the changes of data in tables.

What are Nested Triggers?

Triggers may implement data modification logic by using INSERT, UPDATE, and DELETE statement.
These triggers that contain data modification logic and find other triggers for data modification are called Nested Triggers.

TRIGGER vs. STORED PROCEDURE

Unlike Stored Procedures, Triggers cannot be called directly.
They can only be associated with queries.

What is a STORED PROCEDURE?

STORED PROCEDURE is a function that contains collections of one or more MySQL statements saved for future use.

You can also pass parameters to a stored procedure, so that the stored procedure can act based on the parameter value(s) that is passed.

Stored procedures help to simplify complex operations by encapsulating processes into a single easy-to-use unit. Instead of writing the same SQL query over and over again, it can be saved as a stored procedure, and then just called onto to execute it.

Advantages and Disadvantages of STORED PROCEDUREs

    Advantages

  1. Reusable and Transparent to any application (e.g., multiple client applications & environments, such as web, desktop, different OS’s).
  2. Stored procedure can be used as a modular programming – means create once, store and call for several times whenever required. This helps to simplify complex operations (complex joins, complex where clauses) by encapsulating processes into a single easy-to-use unit.
  3. It reduces network traffic. Instead of having inline SQL code spread throughout the application, now sections of SQL code can be encapsulated inside of the stored procedure. All complex data processing can now be performed on the server, allowing the client processing to focus more on presentation rather than manipulation of data. This reduces the number of calls the application makes to the database.
  4. Stored procedures improve performance, as stored procedures typically execute quicker than do multiple individual SQL statements.
  5. Stored procedures are beneficial for data integrity and security. Stored procedures limit direct access to tables, all implementation and even the data itself is shielded.This reduces the chance of data corruption (unintentional or otherwise).
  6. Because scripts are in one location “Stored Procedure” folder of the database. Stored procedures are easier to maintain and can be tested independent of the application.


    Disadvantages

  1. Stored procedures can be executed only in the Database and utilize more memory in the database server.
  2. Stored procedures tend to be more complex to write than basic SQL statements, and writing them requires a greater degree of skill and experience.
  3. Limited Coding Functionality compared to application. Not designed for developing Complex or Flexible business logic.
  4. Debugging is complex
  5. Source control can be tricky.
  6. If stored procedures become simply CRUD operations, they don’t add any value to the application, they only complexify maintenance and become inefficient.

User Defined Function vs STORED PROCEDUREs

UDF can be used in the SQL statements anywhere in the WHERE/HAVING/SELECT section where as Stored procedures cannot be. UDFs that return tables can be treated as another rowset. This can be used in JOINs with other tables. Inline UDF’s can be thought of as views that take parameters and can be used in JOINs and other Rowset operations.

Nested vs recursive STORED PROCEDUREs

A stored procedure can itself execute a stored procedure. Each time a stored procedure calls another procedure, the call is said to be nested because it occurs in the context of a previous and still active call to the first procedure. A stored procedure called by another stored procedure is known as a nested procedure.

If a procedure calls itself, it is recursive.
Recursive stored procedure calls itself until it reaches some boundary condition.
A stored procedure can call itself up to the maximum nesting level of 32.

Recursive procedures are useful for tasks that involve repetitive steps where programmers need to use the same set of code “n” number of times., e.g., when you need to expand a tree relationship.

What is a CURSOR?

A database Cursor is a control which enables traversal over the rows of records in the table.
A cursor can be viewed as a pointer to one row in a set of rows.

Cursor makes it possible to define a result set (a set of data rows) and perform complex logic on a row by row basis, instead of the typical SQL commands that operate on all the rows in the set at one time.

In order to work with a cursor we need to perform some steps in the following order:

  1. Declare cursor
  2. Open cursor
  3. Fetch row from the cursor
  4. Process fetched row
  5. Close cursor
  6. Deallocate cursor

Cursors are usually used in stored procedures to iterate a set of rows returned by a query and process each row accordingly.
It is similar to “pointer” in C/C++, or an iterator in PHP’s foreach statement.
The cursor can only reference one row at a time, but can move to other rows of the result set as needed.

Local vs global variables

The scope of a variable refers to where is a variable visible or accessible. It is ether local or global.
Global (also called Public) variables are declared outside any function, and can be accessed on any function in the whole program.
Life-Time (Active in Memory) is until total program is executed.

Local variables are declared inside a function, and can be used only inside that function. They are not known to the other functions and those variables cannot be referred or used.

Variables are stored in a stack unless specified. They live until the function is executed.
It is possible to have local variables with the same name in different functions. It’s like two people with the same name who will never meet. Even though the name is the same, the persons are not.

It is a best practice to only use global variable when you really need to share data, but each variable should always be visible in the smallest scope possible.

Why are SQL functions used?

  1. To perform calculations on the data
  2. To modify individual data items
  3. To manipulate the output
  4. To format dates and numbers
  5. To convert the data types

What are user defined functions (UDF)?

Most of the functions being used in a database system are built-in functions also called System Functions. Most of what the application can do is represented by built-in functions that the user simply has to call and provide parameters.

MySQL comes bundled with a number of built in functions that are already implemented in the MySQL server, such as string functions, numeric functions, date functions, aggregate functions (produce summarized result sets), and some other functions.

User-defined functions are special custom functions that are meant to do something not normally done by the built-in functions. User-defined functions allow developers to create their own routines and procedures that can accept parameters, perform an action, and return the result of that action as a value.

    Advantages

  1. They allow modular programming.
    You can create the function once, store it in the database, and call it any number of times in your program.
  2. They allow faster execution.
  3. They can reduce network traffic in client/server applications. SQL server Functions can be used in WHERE Clause as well. By this we can limit the number of rows sent to the client.

Name types of user defined functions (UDFs)

There are two types of SQL user defined functions

  1. Scalar Functions return a single value, defined in the return clause.
    Field values can passed as parameters into UDFs.

    Scalar-valued UDFs can be used as the default value for a column in a table.
    Scalar-valued UDFs are an easy way to define constant values to use in the database environment.

    Scalar function calls can be nested – a call to a scalar-valued function can be passed to another function or stored procedure.

    • A deterministic UDF always returns the same result with the same set of input parameters. Some examples of deterministic functions are the system functions MONTH(), YEAR(), and ISNULL().
    • A non-deterministic UDF can potentially return a different value each time it is called with the same set of input parameters. Some examples of non-deterministic functions are the system functions GETDATE(), NEWID(), and @@CONNECTIONS.

  2. Table Valued Functions return a table.
    This type of function is special since it returns a table, the results of which can be queried or joined with other tables.

    • Inline Table Valued Functions return a table data based on a single SELECT Statement.
    • Multi Statement Valued Functions return the tabular result set but, unlike the inline table valued function we can use multiple select statements inside the function body.

What are aggregate functions?

Aggregate Functions (Group Functions) perform mathematical calculations based on the values of the column in a set of rows and return a single value per group.

  1. AVG() Returns a column’s average value
  2. COUNT() Returns the number of rows in a column
  3. MAX() Returns a column’s highest value
  4. MIN() Returns a column’s lowest value
  5. SUM() Returns the sum of a column’s values

Syntax:
SELECT MIN/MAX/AVG/SUM/COUNT(column_name)
[AS column_name]
FROM table_name
WHERE condition;

COUNT() vs COUNT(*)

COUNT()
counts the number of rows that have values in a specific column,
ignoring NULL values.

COUNT(*)
counts the number of records in a table,
whether columns contain values or NULL values.

Syntax:
SELECT COUNT(column_name)/COUNT(*)
[AS column_name]
FROM table_name
WHERE condition;

What are scalar functions?

Scalar functions are based on user input, these too return a single value.

  1. UCASE() converts the value of a field to uppercase
  2. LCASE() converts the value of a field to lowercase
  3. MID() extracts text from the text field
  4. LEN() returns the length of the value in a text field
  5. ROUND() round a numeric field to the number of decimals specified
  6. NOW() returns the current system date and time.
  7. FORMAT() formats how a field is to be displayed.

What is a Data warehouse (DW/DWH)?

Data warehouse is a central repository of data from multiple sources of information within that can be analyzed to make better informed management decisions. It is considered to be a core component of business intelligence.

Definition for data warehousing includes business intelligence tools, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata.

A data warehouse architecture consists of three tiers. The bottom tier of the architecture is the database server, where data is loaded and stored. The middle tier consists of the analytics engine that is used to access and analyze the data. The top tier is the front-end client that presents results through reporting, analysis, and data mining tools.

A data warehouse is a database of a different kind: an OLAP (Online Analytical Processing) database. A data warehouse exists as a layer on top of another database or databases (usually OLTP – Online Transaction Processing Databases). The data warehouse takes the data from all these databases and creates a layer optimized for and dedicated to analytics.

A data warehouse utilizes a pre-defined schema optimized for analytics, which involves reading large amounts of data to understand relationships and trends across the data.

Data warehouses are:

  1. Subject-oriented, meaning that the data in the database is organized so that all the data elements relating to the same real-world event or object are linked together;
  2. Time-variant, meaning that the changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time;
  3. Non-volatile, meaning that data in the database is never over-written or deleted, once committed, the data is static, read-only, but retained for future reporting.
  4. Integrated, meaning that the database contains data from most or all of an organization’s operational applications, and that this data is made consistent.

Online Transaction Processing (OLTP) vs Online Analytical Processing (OLAP)

Both OLTP and OLAP systems store and manage data in the form of tables, columns, indexes, keys, views, and data types. Both use SQL to query the data.

Online Transaction Processing (OLTP) manages transaction based applications which can be used for data entry, data retrieval and data processing. OLTP makes data management simple and efficient. A good example is an ATM or a bank application.

OLAP (Online Analytical Processing) database is structured to make analytics fast and easy.
It integrates copies of transaction data from different sources and provisions them for analytical use. It extract multi dimensional data for analysis.

A data warehouse is an OLAP database. An OLAP database layers on top of OLTPs or other databases to perform analytics.

Differences
OLTP database is typically constrained to a single application. OLAP accommodates data storage for any number of applications: one data warehouse equals infinite applications and infinite databases.

OLTP is the predominant use case for an RDBMS. RDBMS design often emphasizes data normalization for the sake of efficient transaction processing. In an OLAP database structure, data is organized specifically to facilitate reporting and analysis, not for quick-hitting transactional needs. The data is denormalized to enhance analytical query response times. Because it works with such large data sets, an OLAP database is heavy on CPU and disk bandwidth.

It is designed to accommodate a large volume of simpler transactions fast. OLAP is characterized by relatively low volume of transactions. Queries are often very complex and involve aggregations and thus are slower than for OLTP.

As OLTPs frequently executes transactions in database, in case any transaction fails in middle it may harm data’s integrity and hence it must take care of data integrity. While in OLAP the transaction is less frequent hence, it does not bother much about data integrity.

What is normalization and what are the advantages of it?

Normalization is the process of organizing data.

Normalization usually involves:

  1. dividing a database into smaller tables
  2. defining relationships between the tables according to rules.

The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.

Data existing in more than one field must be changed in exactly the same way in all locations.
A customer address change is much easier to implement if that data is stored only in the Customers table and nowhere else in the database.

Benefits include:

  1. Reduction of redundant and duplicate data
    that wastes disk space and creates maintenance problems.
  2. More Compact Database
  3. Better Database organization
  4. More Tables with smaller rows
  5. Efficient data access
  6. Greater Flexibility for Queries
  7. Faster queries as it is easier to find the information
  8. Easier to implement Security
  9. Allows easy modification
  10. Ensure Consistent data after modification,
    no inconsistent dependency (Insertion, Deletion, and Update anomalies)

Name different normalization kinds (normal forms)

There are three main rules for database normalization, each rule is called a “normal form”, with increasing levels of normalization.

  1. First Normal Form (1NF): No Repeating Groups
    • Eliminate repeating groups in individual tables.
    • Create a separate table for each set of related data.
    • Identify each set of related data with a primary key.

    Do not use multiple fields in a single table to store similar data. For example, to track an inventory item that may come from two possible sources, an inventory record may contain fields for Vendor Code 1 and Vendor Code 2.

    What happens when you add a third vendor? Adding a field is not the answer; it requires program and table modifications and does not smoothly accommodate a dynamic number of vendors. Instead, place all vendor information in a separate table called Vendors, then link inventory to vendors with an item number key, or vendors to inventory with a vendor code key.

  2. Second Normal Form (2NF): Eliminate Redundant Data
    • Create separate tables for sets of values that apply to multiple records.
    • Relate these tables with a foreign key.

    Records should not depend on anything other than a table’s primary key (a compound key, if necessary). For example, consider a customer’s address in an accounting system. The address is needed by the Customers table, but also by the Orders, Shipping, Invoices, Accounts Receivable, and Collections tables. Instead of storing the customer’s address as a separate entry in each of these tables, store it in one place, either in the Customers table or in a separate Addresses table.

  3. Third Normal Form (3NF): Eliminate Data Not Dependent On Key
    • Eliminate fields that do not depend on the key.

    Values in a record that are not part of that record’s key do not belong in the table. In general, any time the contents of a group of fields may apply to more than a single record in the table, consider placing those fields in a separate table.

    For example, in an Employee Recruitment table, a candidate’s university name and address may be included. But you need a complete list of universities for group mailings. If university information is stored in the Candidates table, there is no way to list universities with no current candidates. Create a separate Universities table and link it to the Candidates table with a university code key.

What is Denormalization?

Denormalization does not mean not doing normalization. It is an optimization technique that is applied after doing normalization. By doing so database normalization levels are decrease by moving down in “normal forms”.
It helps the database managers to increase the performance of the entire infrastructure as the redundant data is added to one or more tables. This can help us avoid costly joins in a relational database.

In a traditional normalized database, we store data in separate logical tables and attempt to minimize redundant data. We may strive to have only one copy of each piece of data in database.
For example, in a normalized database, we might have a Courses table and a Teachers table.Each entry in Courses would store the teacherID for a Course but not the teacherName. When we need to retrieve a list of all Courses with the Teacher name, we would do a join between these two tables.
In some ways, this is great; if a teacher changes is or her name, we only have to update the name in one place.
The drawback is that if tables are large, we may spend an unnecessarily long time doing joins on tables.
Denormalization, then, strikes a different compromise. Under denormalization, we decide that we’re okay with some redundancy and some extra effort to update the database in order to get the efficiency advantages of fewer joins.

Collation vs Character Set vs Encoding

Words and sentences in text are created from characters.
Characters that are needed for a specific purpose are grouped into a character set (also called a repertoire) – be it those required to support Western European languages in computers, or those a Chinese child will learn at school in the third grade (nothing to do with computers).
So, character sets are collections of letters and symbols.

A coded character set is a set of characters for which a unique number has been assigned to each character. Units of a coded character set are known as code points.
A code point value represents the position of a character in the coded character set. For example, the code point for the letter á in the Unicode coded character set is 225 in decimal, or 0xE1 in hexadecimal notation. Hexadecimal notation is commonly used for referring to code points. A Unicode code point can have a value between 0x0000 and 0x10FFFF.
Coded character sets are sometimes called code pages.

The characters are stored in the computer as one or more bytes.
The character encoding reflects the way the coded character set is mapped to bytes for manipulation in a computer.
So, when you input text using a keyboard or in some other way, the character encoding maps characters you choose to specific bytes in computer memory, and then to display the text it reads the bytes back into characters.

Basically, you can visualize this by assuming that all characters are stored in computers using a special code, like the ciphers used in espionage. A character encoding provides a key to unlock (ie. crack) the code. It is a set of mappings between the bytes in the computer and the characters in the character set. Without the key, the data looks like garbage.

Collation determines encoding to be used.
Collation is a set of rules that determine how character data can be sorted and compared. Character data is sorted using rules that define the correct character sequence, with options for specifying case-sensitivity, accent marks, kana character types and character width.

A collation is a configuration setting that determines how the database engine should treat character data at the server, database, or column level.

A collation algorithm such as the Unicode collation algorithm defines an order through the process of comparing two given character strings and deciding which should come before the other.

    This is how MySQL decides on how to sort and compare text:

  1. If both CHARACTER SET and COLLATE are specified, those values are used.
  2. If only CHARACTER SET is specified, it is used along with the default collation for that character set (as specified in the SHOW CHARACTER SET results).
  3. If neither CHARACTER SET nor COLLATE are specified, the database default is used.

In practice, character sets can seldom be server-wide (or even database-wide) settings. Different tables, and even different columns, may require different character sets, and so both may be specified when a table is created.

Name different types of collation sensitivity

Collation is very often important with regards to internationalization. MySQL is used by a wide variety of people across the world. It is very important to configure MySQL to accept commands and characters for the language and alphabet needed.

It is also important to set sensitivity levels at a database level correctly as SQL server will treat certain words as different or the same depending on the setting.

In English, collation may be a pretty simple matter consisting of ordering by the ASCII code (a binary collation). Once you get into other languages with all their accents and other features, collation changes.

  1. Case Sensitivity – establishes whether it is case sensitive or insensitive. When case sensitive, a computer treats A and a differently because it uses ASCII code to differentiate the input. The ASCII value of A is 65, while a is 97.
  2. Accent Sensitivity – establishes whether it is accent sensitive or insensitive. Different accents usually mean that they need to be sorted as if they were different letters – e.g. e,ë,é,ě, and è might be distinct letters for the purposes of ordering (but possibly the same letter when searching)
  3. Kana Sensitivity – When Japanese kana characters Hiragana and Katakana are treated differently, it is called Kana sensitive. Insensitive by default.
  4. Width Sensitivity – When a single-byte character (half-width) and the same character when represented as a double-byte character (full-width) are treated differently then it is width sensitive. Insensitive by default.

    SQL Server stores character data using either one byte or two bytes per character, depending on the column’s data type and assigned collation. For example, European languages require only a single-byte character set, which supports up to 256 bit patterns. On the other hand, many Asian languages include thousands of characters and require a double-byte character set, which supports up to 65,536 bit patterns.

    Collations are intricately tied to the data types assigned to character columns. In SQL Server, you can configure a character column with a Unicode (UTF-8) data type (nchar, nvarchar, or ntext) or non-Unicode data type (char, varchar, or text).
    UTF-8 is a good choice because you can use a single character encoding to handle any character you are likely to need. This greatly simplifies things. Using Unicode throughout your system also removes the need to track and convert between various character encodings.
    For Unicode types, the character bit patterns conform to international standards that define a double-byte encoding scheme for mapping most of the world’s written languages, ensuring that the same bit pattern is always associated with the same character, regardless of the underlying environment.

What is a transaction?

A Database Transaction is a set/sequence/group of database operations/queries that must be treated as whole.
This means that either all operations are executed or none of them.

E.g., for a bank transaction from one account to another to happen,
either both debit and credit operations must be executed or none of them.

Data manipulation operations like inserting, updating, deleting records in the database comes from transactions.

ACID (Atomicity, Consistency, Isolation, Durability) properties guarantee that database transactions are processed reliably.

What are transaction controls?

  1. COMMIT is used to save all changes made through the transaction
  2. ROLLBACK is used to roll back the transaction,
    all changes made by the transaction are reverted back and database remains as before the transaction
  3. SET TRANSACTION sets the name of transaction
  4. SAVEPOINT is used to set the point from where the transaction is to be rolled back

What are ACID properties of the transaction?

ACID stands for Atomicity, Consistency, Isolation, Durability.
It is used to ensure that the data transactions are processed reliably in a database system.

Atomicity
ensures the completeness of all transactions performed.
Atomicity requires that each transaction be “all or nothing”:
If any part of the transaction fails,
the entire transaction is aborted and the database is left unchanged.

An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes.

Consistency
ensures that any transaction will bring the database from one valid state to another.

Any data written to the database must meet all defined validation rules, including constraints, cascades, triggers, and any combination thereof.

Isolation
ensures concurrency control.
All transactions are performed independently
and changes made by one transaction are not reflected on another.

The isolation property ensures that the concurrent execution of transactions
results in a system state
that would be obtained if transactions were executed serially,
i.e., one after the other.

Providing isolation is the main goal of concurrency control.
Depending on concurrency control method
(i.e. if it uses strict – as opposed to relaxed – serializability),
the effects of an incomplete transaction might not even be visible to another transaction.

Durability
means that if a transaction has been committed,
it will remain so,
all changes made in database persist
even after system failure, power loss, crash or any sort of error.

In a relational database, for instance, once a group of SQL statements execute,
the results need to be stored permanently (even if the database crashes immediately thereafter).
To defend against power loss, transactions (or their effects) must be recorded in a non-volatile memory.

What is the difference between GUI Testing and Database Testing?

  1. The tester who is performing GUI Testing doesn’t need to know SQL.
    The tester who is performing Database Testing needs to know SQL.
  2. GUI Testing is also known as User Interface Testing of Front-end Testing.
    Database Testing is also known as Back-End Testing or Data Testing.
  3. GUI Testing deals with items that interact with users (Menus, Forms).
    Database Testing deals with items that are hidden from users.
  4. GUI Testing focuses on the outlook of the application.
    Database Testing focuses on the integrity of data in the front end with the data present in the back end.
  5. GUI Testing includes invalidating the text boxes, check boxes, buttons, drop-downs, forms etc., majorly the look and feel of the overall application
    Database Testing involves in verifying the integrity of data in the front end with the data present in the back end. It validates the schema, database tables, columns, indexes, stored procedures, triggers, data duplication, orphan records, junk records. It involves in updating records in a database and verifying the same on the front end.

What do we need to check for in Database Testing?

Database Testing is also known as back-end testing or data testing.

Database testing involves verifying the integrity of data in the front end with the data present in the back end.

It validates the schema, database tables, columns, indexes, stored procedures, triggers, data duplication, orphan records, junk records.

It involves updating records in a database and verifying the same on the front end.

  1. Database Connectivity
  2. Constraint Check
  3. Required Application Field and its size
  4. Data Retrieval and Processing With DML operations
  5. Stored Procedures
  6. Functional flow

What is Database White Box Testing?

  1. Database Consistency and ACID properties
  2. Database triggers and logical views
  3. Decision Coverage, Condition Coverage, and Statement Coverage
  4. Database Tables, Data Model, and Database Schema
  5. Referential integrity rules

What is Database Black Box Testing?

  1. Data Mapping
  2. Data storing and data retrieving
  3. Use of Black Box techniques such as Equivalence Partitioning and Boundary Value Analysis (BVA)

What is CTE?

An expression which consists of a temporary set of results defined in a SQL statement is said to be a Common Table Expression(CTE).

Can you tell the reasons for selecting Lamp(Linux, Apache, MySQL, PHP) instead of any other combination of software programs, servers, and operating system?

The reason behind selecting Lamp stack is very simple. Linux, Apache, MySQL, PHP are open source software. The security of the Linux operating system is much more than Windows. The Apache server is a better server than others in the perspective of functionalities and security. MySQL is one of the most popular open source databases is used with PHP to perform various functionalities.

How can you backup and restore MySQL database?

  1. MySQL comes with a utility mysqldump to provide the database backup and restore. The command you can use for backup and restore are as follows respectively.

    //To take the backup of database
    mysqldump database > backup -file.sql;

    //To restore the database
    mysqldump database < backup -file.sql;

  2. You can also use the phpMyAdmin user interface to backup your database. If you wish to backup, the database you just have to click on the “export” link on the phpMyAdmin main page.

If you have to JOIN multiple tables with thousand rows in each,
will you perform filtering or transforming of rows first.

If you have three tables with thousands of tuples in each of them, then you are first supposed to filter the rows in those tables and then transform the table.

This would be beneficiary as if you transform the table, then the number of columns may increase reducing the performance.

Due to such performance issues, a lot of memory will be used and the output will appear on your screen after quite a long wait of time.

Static (Embedded) vs Dynamic (Interactive) SQL

  1. Static SQL is used for statements in an application that do not change at runtime and, therefore, can be hard-coded into the application.
    Dynamic SQL is a programming technique that enables you to build SQL statements dynamically at runtime. The application may allow users to enter their own queries.
  2. In static SQL how database will be accessed is predetermined in the embedded SQL statement. In dynamic SQL, how database will be accessed is determined at run time.
  3. Both EXECUTE IMMEDIATE, however, for Dynamic SQL EXECUTE and PREPARE statements are used, they are not used for Static SQL.
  4. Dynamic SQL is more flexible. You can create more general purpose, flexible applications by using dynamic SQL because the full text of a SQL statement may be unknown at compilation.
  5. Dynamic SQL is slower and less efficient than Static SQL.

List the ways in which Dynamic SQL can be executed

  1. Write a query with parameters.
  2. Using EXEC.
  3. Using sp_executesql.

Difference between Locking, Blocking and Deadlocking

  1. Locking
    occurs when a connection needs access to a piece of data in database
    and it locks it for certain use
    so that no other transaction is able to access it.
  2. Blocking
    occurs when a transaction tries to acquire an incompatible lock on a resource
    that another transaction has already locked.
    The blocked transaction remains blocked
    until the blocking transaction releases the lock.
  3. Deadlocking
    occurs when two or more transactions have a resource locked,
    and each transaction requests a lock on the resource
    that another transaction has already locked.
    Neither of the transactions here can move forward,
    as each one is waiting for the other to release the lock.

What are SQL Privileges?

SQL GRANT and REVOKE commands (Data Control Language - DCL) are used to implement privileges in SQL multiple user environments.

The administrator of the database can grant or revoke privileges to or from users of database object
on different actions, e.g., SELECT, INSERT, UPDATE, DELETE etc.

  1. GRANT Command
    provides database access to user other than administrator.

    Syntax:

    GRANT privilege_name
    ON object_name
    TO {user_name|PUBLIC|role_name}
    [WITH GRANT OPTION]; //user can grant access to another user

    GRANT SELECT
    ON database.*
    TO user_name;
    //allow select on all tables of a database

  2. REVOKE Command
    denies or removes access to database objects.

    Syntax:

    REVOKE privilege_name
    ON object_name
    FROM {user_name|PUBLIC|role_name};

    REVOKE DELETE
    ON database.table
    FROM user_name
    //revoke privileges for deleting on a certain table in a database

GRANT and REVOKE can be used to control access at several levels:

  1. Entire server, using GRANT ALL and REVOKE ALL
  2. Entire database, using ON database.*
  3. Specific tables, using ON database.table
  4. Specific columns
  5. Specific stored procedures

How many types of Privileges are available in SQL?

  1. System Privilege
    allows a user to perform administrative tasks (DDL: CREATE/ALTER/DELETE)
    on database objects.
  2. Object Privilege
    allows a user to perform actions (DML: INSERT, UPDATE, DELETE, SELECT)
    on database objects.

List privileges that a user can grant to another user

ALL, ALTER, CREATE [TEMPORARY TABLES, USER, VIEW etc.], DELETE, DROP, INSERT, SELECT, UPDATE etc.

What is SQL Injection?

SQL Injection is a type of database attack technique
where malicious code is placed in SQL statements
and once it is executed the database is opened for the attacker.

SQL injection is one of the most common web hacking techniques.
SQL injection usually occurs when you ask a user for input, e.g. username/userid,
but instead of a name/id, the user gives you an SQL statement
that you will unknowingly run on your database.

Example:

SELECT * FROM Users WHERE UserId = 105 OR 1=1;
//will return ALL rows from the "Users" table, since OR 1=1 is always TRUE

SELECT * FROM Users WHERE Name ="" or ""="" AND Pass ="" or ""=""
//will return all rows from the "Users" table, since OR ""="" is always TRUE

SELECT * FROM Users WHERE UserId = 105; DROP TABLE Suppliers;
//user input "105; DROP TABLE Suppliers" will drop the table;
//SQL Injection Based on Batched SQL Statements
//(2+ SQL statements, separated by semicolons)

Use of SQL Parameters protect a web site from SQL injection

  1. values that are added to an SQL query at execution time, in a controlled manner
  2. SQL engine checks each parameter to ensure that it is correct for its column
  3. treated literally, and not as part of the SQL to be executed

In ASP.NET @ marker is used.
Example in PHP:

$stmt = $dbh->prepare("INSERT INTO Customers (CustomerName,Address,City)
VALUES (:nam, :add, :cit)");
$stmt->bindParam(':nam', $txtNam);
$stmt->bindParam(':add', $txtAdd);
$stmt->bindParam(':cit', $txtCit);
$stmt->execute();

What is SQL Sandbox in SQL Server?

SQL Sandbox is the safe place in SQL Server Environment where untrusted scripts are executed. There are 3 types of SQL sandbox, such as

  1. Safe Access Sandbox: Here a user can perform SQL operations such as creating stored procedures, triggers etc. but cannot have access to the memory and cannot create files.
  2. External Access Sandbox: User can have access to files without having a right to manipulate the memory allocation.
  3. Unsafe Access Sandbox: This contains untrusted codes where a user can have access to memory.

Which TCP/IP port does SQL Server run?

By default, SQL Server runs on port 1433.

What are SQL comments?

  1. Single line comments start with --
  2. Multi-line comments start with /* and end with */

What do you mean by ROWID?

It’s an 18 character long pseudo column attached with each row of a table.

What is an execution plan?
When would you use it?
How would you view the execution plan?

An execution plan is basically a road map that graphically or textually shows the data retrieval methods chosen by the SQL server’s query optimizer for a stored procedure or ad hoc query.

Execution plans are very useful for helping a developer understand and analyze the performance characteristics of a query or stored procedure, since the plan is used to execute the query or stored procedure.

In many SQL systems, a textual execution plan can be obtained using a keyword such as EXPLAIN, and visual representations can often be obtained as well.

In Microsoft SQL Server, the Query Analyzer has an option called “Show Execution Plan” (located on the Query drop down menu). If this option is turned on, it will display query execution plans in a separate window when a query is run.

Leave a Reply or Comment

Your email address will not be published. Required fields are marked *