SQL Interview Questions for Freshers
1. What is the distinction between inner and outer join?
Ans: INNER JOIN: Returns only the rows where there is a match in both tables.
OUTER JOIN: Returns the matched rows as well as unmatched rows from one or both tables. There
are three types of OUTER JOIN
LEFT OUTER JOIN (or LEFT JOIN): Returns all rows from the left table, and the matched rows from the right table. If no match is found, the result is NULL on the right side.
RIGHT OUTER JOIN (or RIGHT JOIN): Returns all rows from the right table, and the matched rows from the left table. If no match is found, the result is NULL on the left side.
FULL OUTER JOIN: Returns rows when there is a match in one of the tables. This means it returns all rows from the left table and the right table, filling in NULLs when there is no match.
Examples:
INNER JOIN:
SELECT employees.name, departments.department_name FROM employees INNER JOIN departments ON employees.department_id = departments.id;
LEFT JOIN:
SELECT employees.name, departments.department_name FROM employees LEFT JOIN departments ON employees.department_id = departments.id;
RIGHT JOIN:
SELECT employees.name, departments.department_name FROM employees RIGHT JOIN departments ON employees.department_id = departments.id;
FULL OUTER JOIN:
SELECT employees.name, departments.department_name FROM employees FULL OUTER JOIN departments ON employees.department_id = departments.id;
2. Discuss subqueries in SQL?
Ans: A subquery, also known as an inner query or nested query, is a query within another SQL query. It is used to provide data to the main query (outer query). Subqueries can be used in various clauses such as SELECT, FROM, WHERE, and HAVING.
Types of Subqueries:
Single-row subquery: Returns a single row and is used with operators like =, <, >.
Example:
SELECT name FROM employees WHERE department_id = (SELECT id FROM departments WHERE department_name = ‘Sales’);
Multi-row subquery: Returns multiple rows and is used with operators like IN, ANY, ALL
Example:
SELECT name FROM employees WHERE department_id IN (SELECT id FROM departments WHERE region = ‘North’);
Correlated subquery: A subquery that references columns from the outer query. It is evaluated once for each row processed by the outer query.
Example:
SELECT e.name FROM employees e WHERE e.salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id);
3. Discuss about Normalization and its Types
Ans: Normalization is the process of organizing the data in a database to reduce redundancy and improve data integrity. The goal is to divide a database into two or more tables and define relationships between them to reduce redundancy and dependency. There are several normal forms, each with specific rules to help achieve this goal.
Normalization involves multiple steps, usually referred to as “normal forms” (NFs)
First Normal Form (1NF): Ensures that the table has a primary key and that each column contains atomic (indivisible) values.
Example:
CREATE TABLE customers (customer_id INT PRIMARY KEY, customer_name VARCHAR(100), contact_number VARCHAR(15));
Second Normal Form (2NF): Achieves 1NF and ensures that all non-key attributes are fully functionally dependent on the primary key. This means removing partial dependencies of any column on the primary key.
Example:
If a table has a composite key (e.g., order_id, product_id) and some columns depend only on part of that key, those columns should be moved to another table.
Third Normal Form (3NF): Achieves 2NF and ensures that all the attributes are functionally dependent only on the primary key. This eliminates transitive dependencies.
Example:
CREATE TABLE orders (order_id INT PRIMARY KEY, customer_id INT, order_date DATE, FOREIGN KEY (customer_id) REFERENCES customers(customer_id));
CREATE TABLE order_details (order_id INT, product_id INT, quantity INT, PRIMARY KEY (order_id, product_id), FOREIGN KEY (order_id) REFERENCES orders(order_id));
Boyce-Codd Normal Form (BCNF): A stricter version of 3NF where every determinant is a candidate key. This addresses situations where 3NF is not sufficient to eliminate all redundancies.
4. What is a self-join in SQL? Provide an example.
Ans: A self-join is a regular join but the table is joined with itself. This is useful when the data is hierarchical or when you need to compare rows within the same table.
Example:
SELECT e1.name AS Employee, e2.name AS Manager FROM employees e1 JOIN employees e2 ON e1.manager_id = e2.id;
Self-joins are particularly useful for hierarchical data, such as organizational charts, and emphasize the importance of using table aliases (e.g., e1 and e2) to differentiate between the different instances of the same table.
5. How can you detect and delete duplicate rows in a SQL table?
Ans: To detect and delete duplicate rows, you can use a Common Table Expression (CTE) along with the ROW_NUMBER() window function to identify duplicates and then delete them.
Example to detect duplicates:
WITH DuplicateCTE AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY column1, column2, … ORDER BY (SELECT NULL)) as row_num FROM table_name ) SELECT * FROM DuplicateCTE WHERE row_num > 1;
Example to delete duplicates:
WITH DuplicateCTE AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY column1, column2, … ORDER BY (SELECT NULL)) as row_num FROM table_name ) DELETE FROM DuplicateCTE WHERE row_num > 1;
6. What is the purpose of the COALESCE function in SQL? Provide an example.
Ans: The COALESCE function returns the first non-null value in a list of arguments. It is useful for handling NULL values and providing default values.
Example
SELECT name, COALESCE(phone, ‘No Phone’) as contact_number FROM employees;
if the phone column is NULL, the COALESCE function will return ‘No Phone’ instead.
COALESCE is particularly useful for dealing with potential NULL values in your data, and it can accept multiple arguments, returning the first non-null one. It is more versatile than the ISNULL function, which only handles two arguments.
7. Explain what normalization is and the different normal forms in SQL.
Ans: Normalization is the process of organizing the columns and tables of a relational database to reduce data redundancy and improve data integrity. The main goal is to divide large tables into smaller, related tables and define relationships between them. The different normal forms are:
First Normal Form (1NF): Ensures that the table has no repeating groups or arrays. Each cell contains a single value, and each record is unique.
– Example: A table where each column contains only atomic (indivisible) values.
Second Normal Form (2NF): Meets all the requirements of 1NF and ensures that all non-key columns are fully dependent on the primary key.
– Example: Remove partial dependencies; every non-primary-key attribute should depend on the
whole primary key.
Third Normal Form (3NF): Meets all the requirements of 2NF and ensures that all the columns are dependent only on the primary key.
– Example: Remove transitive dependencies; non-primary-key attributes should not depend on other non-primary-key attributes.
Boyce-Codd Normal Form (BCNF): A stricter version of 3NF where every determinant is a candidate key.
– Example: Ensure that for every functional dependency (A → B), A is a super key.
Fourth Normal Form (4NF): Meets all the requirements of BCNF and ensures that multi-valued dependencies are removed.
– Example: Remove multi-valued dependencies; a record should not have more than one independent multi-valued attribute.
Fifth Normal Form (5NF): Meets all the requirements of 4NF and ensures that join dependencies are properly handled.
– Example: Decompose tables further to eliminate redundancy and handle complex join dependencies.
8. What is a stored procedure, and what are its advantages?
Ans: A stored procedure is a prepared SQL code that you can save and reuse. It can contain multiple SQL statements and control-of-flow language. Stored procedures are stored in the database and can be executed by calling them directly.
Example:
CREATE PROCEDURE GetEmployeeDetails
@EmployeeID INT
AS
BEGIN
SELECT name, department_id, salary
FROM employees
WHERE id = @EmployeeID;
END;
Advantages:
Reusability: Once created, stored procedures can be reused multiple times in different programs or scripts.
Performance: Stored procedures are precompiled, which can result in faster execution compared to dynamically building and executing queries.
Security: Stored procedures provide an additional layer of security by controlling access to data and preventing SQL injection attacks.
Maintainability: By centralizing business logic in the database, it is easier to maintain and update the logic. Reduced Network Traffic: Executing a stored procedure can reduce network traffic because multiple SQL statements can be sent as a single call.
9. What is the difference between a cross join and an inner join in SQL?
Ans: Cross Join:
– Produces a Cartesian product of the two tables, meaning it returns all possible combinations of rows from both tables.
– Does not require a condition.
– Can result in a large number of rows, especially if both tables are large.
Example:
SELECT * FROM employees CROSS JOIN departments;
Inner Join:
– Returns only the rows that have matching values in both tables based on a specified condition.
– Requires a condition to match rows from both tables.
– Generally returns fewer rows than a cross join because it filters the results to include only the matching rows.
Example:
SELECT e.name, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;
Cross Joins can be useful for certain scenarios, they should be used with caution due to the potentially large result set. Inner joins are more commonly used to combine related data from multiple tables based on a logical relationship, ensuring more meaningful results.
10. How can you optimize a slow-running query in SQL?
Ans: Optimizing a slow-running query can involve several strategies:
Indexing:
– Ensure appropriate indexes are in place, especially on columns used in JOIN, WHERE, ORDER BY, and GROUP BY clauses.
– Avoid over-indexing, as too many indexes can slow down data modification operations.
Query Refactoring:
– Simplify complex queries by breaking them down into smaller, more manageable parts.
– Use subqueries or Common Table Expressions (CTEs) to organize and streamline complex logic.
Avoiding SELECT *:
– Select only the columns needed instead of using SELECT *, which can reduce the amount of data transferred and processed.
Optimizing Joins:
– Ensure that joins are performed on indexed columns.
– Use INNER JOINs over OUTER JOINs where possible to reduce the number of rows processed.
These Interview questions will be helpful for your Interview to crack any role. If you like our efforts and content, let us know in the comments.