SQL

Question : In Oracle, why is there no output when dbms_output.put_line is called?
Answer : Run SET SERVEROUTPUT ON first. It enables display of the DBMS_OUTPUT buffer for the current session in clients such as SQL*Plus.

Question : What are scalar datatypes?
Answer : In PL/SQL, scalar datatypes hold a single value with no internal components. They fall into four families: number, character, datetime, and Boolean.

Question : What is the difference between ROWNUM and ROWID?
Answer : ROWID has physical significance: it is the complete physical address of a row, so a row can be read directly if its ROWID is known. ROWNUM is a temporary serial number assigned to each row as the query returns it.

Question : Is TRUNCATE in Oracle a DDL command or a DML command, and why?
Answer : TRUNCATE is a DDL statement. Like other DDL, it commits implicitly, so it cannot be rolled back once it has executed.

Question : What is the point of having a foreign key?
Answer : Foreign keys are used to reference unique columns in another table. So, for example, a foreign key can be defined on one table A, and it can reference some unique column(s) in another table B. Why would you want a foreign key? Well, whenever it makes sense to have a relationship between columns in two different tables.

Question : Can a table have multiple unique, foreign, and/or primary keys?
Answer : A table can have multiple unique and foreign keys. However, a table can have only one primary key.

Question : Can a unique key have NULL values? Can a primary key have NULL values?
Answer : Unique key columns are allowed to hold NULL values. The values in a primary key column, however, can never be NULL.

Question : Can a foreign key reference a non-primary key?
Answer : Yes, a foreign key can actually reference a key that is not the primary key of a table. But, a foreign key must reference a unique key.

Question : Can a foreign key contain null values?
Answer : Yes, a foreign key can hold NULL values unless the column is also declared NOT NULL. A NULL foreign key value is simply not checked against the referenced table, so the row is treated as having no parent. This holds whether the foreign key references a primary key or a unique key.
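
The last few answers can be tried out in a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in, so details may differ from Oracle, and the department and employee tables are hypothetical names chosen for illustration. The foreign key references a unique column that is not the primary key, accepts NULL, and rejects a value that has no match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE department (
    id   INTEGER PRIMARY KEY,
    code TEXT NOT NULL UNIQUE          -- unique key, not the primary key
);
CREATE TABLE employee (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    dept_code TEXT REFERENCES department(code)   -- foreign key to the unique key
);
INSERT INTO department (id, code) VALUES (1, 'ENG');
""")

# A matching value and a NULL value are both accepted.
conn.execute("INSERT INTO employee (name, dept_code) VALUES ('Ann', 'ENG')")
conn.execute("INSERT INTO employee (name, dept_code) VALUES ('Bob', NULL)")

# A value with no matching department row is rejected.
try:
    conn.execute("INSERT INTO employee (name, dept_code) VALUES ('Cid', 'OPS')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, employee_count)
```

Only the two valid rows are stored; the row pointing at the non-existent 'OPS' department never makes it into the table.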

Question : Some other differences between foreign, primary, and unique keys?
Answer : While unique and primary keys both enforce uniqueness on the column(s) of one table, foreign keys define a relationship between two tables. A foreign key identifies a column or group of columns in one (referencing) table that refers to a column or group of columns in another (referenced) table. For example, if an Employee Salary table stores the ID of the employee each salary belongs to, Employee is the referenced table and Employee Salary is the referencing table.
Both unique and primary keys can be referenced by foreign keys.

Question : Natural keys versus surrogate keys
Answer : Natural keys are often compared to surrogate keys. What exactly is a surrogate key? Well, first consider the fact that the word surrogate literally means substitute. The reason a surrogate key is like a substitute is because it’s unnatural, in the sense that the column used for the surrogate key has no logical relationship to other columns in the table.
In other words, the surrogate key really has no business meaning – i.e., the data stored in a surrogate key has no intrinsic meaning to it.

Question : Why are surrogate keys used?
Answer : A surrogate key is an artificial key, and in most databases it is used only to act as the primary key. Surrogate keys are usually simple sequential numbers, where each number uniquely identifies a row. For example, Sybase and SQL Server both have an identity column meant to hold a unique sequential number for each row. MySQL lets you define a column with the AUTO_INCREMENT attribute, so that each new row automatically receives the next value in the sequence, by default 1 greater than the last value generated. You can also change the increment to another value. Gaps can appear after deletes or failed inserts, which is harmless because the values carry no business meaning.
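
As a rough illustration, the sketch below uses SQLite's AUTOINCREMENT through Python's sqlite3 module. This is an assumption made for runnability: identity columns and MySQL's AUTO_INCREMENT behave similarly but not identically. The customer table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        email       TEXT NOT NULL UNIQUE                -- natural key candidate
    )
""")

# The database assigns customer_id; the application never supplies it.
for email in ("a@example.com", "b@example.com", "c@example.com"):
    conn.execute("INSERT INTO customer (email) VALUES (?)", (email,))

# Deleting the newest row and inserting again leaves a gap rather than reusing 3.
conn.execute("DELETE FROM customer WHERE customer_id = 3")
conn.execute("INSERT INTO customer (email) VALUES (?)", ("d@example.com",))

ids = [row[0] for row in conn.execute(
    "SELECT customer_id FROM customer ORDER BY customer_id")]
print(ids)  # [1, 2, 4]
```

The gap at 3 is harmless because the number only identifies the row and says nothing about the customer.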

Question : How does Oracle execute a query?
Answer : Oracle first parses the statement, checking its syntax and then its semantics (for example, that the referenced tables and columns exist). It then looks for a matching parsed statement in the library cache. If one exists, the existing statement and its execution plan are reused (a soft parse). If not, the statement is hard parsed: the optimizer decides on the best execution plan, and the result is placed in the library cache. Oracle then executes the statement. Data blocks are served from the buffer cache if they are already there; otherwise they are read from disk into the buffer cache, and the rows are returned to the session that requested them.

Question : What is the difference between char and varchar2?
Answer : CHAR stores fixed-length character data: shorter values are padded with blanks up to the declared length. VARCHAR2 stores variable-length character data and uses only the space needed for the characters actually entered.

Question : What is ACID?
Answer : A cornerstone of RDBMS is adherence to ACID, a set of properties that are intended to guarantee operations on a database are processed reliably:
A is for atomicity, which requires that you apply an update in its entirety or not at all. If one part of an update operation fails, then all parts should fail.
C is for consistency, which requires that you only make an update if it results in data that is valid according to all defined rules.
I is for isolation, which requires that concurrent execution of updates has the same result as if the updates had been run one after the other.
D is for durability, which requires that after an update is committed to a database, the resulting data persists even if the database crashes, the power is shut off, or the server room is overrun by pigeons.
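
Atomicity and consistency can be seen in a minimal sketch, written with Python's sqlite3 module as a stand-in for a full RDBMS. The account table, the CHECK rule and the transfer helper are hypothetical names used only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (
    name    TEXT PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)   -- consistency rule
);
INSERT INTO account VALUES ('alice', 100), ('bob', 50);
""")

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

# The debit would overdraw alice, so the credit to bob is rolled back as well.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, balances)  # False {'alice': 100, 'bob': 50}
```

The credit to bob succeeds on its own, but because the debit violates the CHECK rule, the whole transfer is undone.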

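Question : What is Truncate?
Answer : TRUNCATE TABLE removes all rows from a table while keeping the table definition. In Oracle it is DDL, so it commits implicitly and cannot be rolled back, and it does not fire DELETE triggers.
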
Question : How can you create a trigger that fills in the primary key on every INSERT statement?
Answer : This can be achieved in two steps.
Step 1 : Create a sequence. NOCYCLE is used because a sequence that wraps around would hand out values already used as primary keys.

CREATE SEQUENCE "HR"."CONTACT_ID_SEQ" MINVALUE 1 MAXVALUE 5000 INCREMENT BY 1 START WITH 1 CACHE 20 ORDER NOCYCLE;

Step 2 : Create a trigger that uses the sequence.

CREATE OR REPLACE TRIGGER contact_pk_insert
BEFORE INSERT ON contact
FOR EACH ROW
BEGIN
  -- Only generate a key when the caller did not supply one.
  IF :new.id IS NULL THEN
    :new.id := contact_id_seq.nextval;
  END IF;
END;
/

Question : What will happen if we omit ELSE in a CASE WHEN expression?
Answer : NULL is returned when no condition is TRUE and no ELSE is specified. If multiple conditions are TRUE, the first-listed condition takes precedence.
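
Both rules can be checked with a small sketch, run here against SQLite through Python's sqlite3 module. The grade helper and the score thresholds are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# No ELSE branch; the conditions deliberately overlap for scores of 90 and up.
query = """
    SELECT CASE
               WHEN ? >= 50 THEN 'pass'
               WHEN ? >= 90 THEN 'distinction'
           END
"""

def grade(score):
    """Evaluate the CASE expression for one score."""
    return conn.execute(query, (score, score)).fetchone()[0]

results = [grade(95), grade(60), grade(10)]
print(results)  # ['pass', 'pass', None]
```

A score of 95 satisfies both conditions but gets 'pass' because that branch is listed first, and a score of 10 matches nothing, so the result is NULL (None in Python).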

Question : What is the CAST function?
Answer : CAST explicitly converts a value to a new type. For example:
SELECT * FROM upfall u
WHERE u.id = CAST('1' AS INTEGER);
When converting from text to numeric or date types, CAST offers little flexibility in dealing with different input data formats. For example, if the value you are casting is a string, the contents must conform to your database’s default text representation of the target data type.

Question : What is the difference between the TRUNC and ROUND functions in SQL?
Answer : Rounding moves a value to the nearest occurrence of the element you specify. For example, when rounding to the month, an input date of 31-Dec-2003 is closer to 1-Jan-2004 than to 1-Dec-2003, so it is rounded up to 1-Jan-2004.
Truncation simply sets every element of lesser significance than the one you specify to its minimum value. The minimum day value is 1, so 31-Dec-2003 is truncated to 1-Dec-2003.
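
The month-level behaviour described above can be modelled in a few lines of Python. This is only a sketch: trunc_month and round_month are hypothetical helpers, and the rule that rounding goes up from the 16th day onward is an assumption modelled on Oracle's documented behaviour for the month format.

```python
from datetime import date

def trunc_month(d):
    """Set the day, the only element below the month, to its minimum value."""
    return d.replace(day=1)

def round_month(d):
    """Round to the nearest first-of-month, going up from the 16th day on."""
    first = d.replace(day=1)
    if d.day < 16:
        return first
    if first.month == 12:                      # roll over into the next year
        return date(first.year + 1, 1, 1)
    return date(first.year, first.month + 1, 1)

d = date(2003, 12, 31)
print(trunc_month(d), round_month(d))  # 2003-12-01 2004-01-01
```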

Question : What is LISTAGG?
Answer : Oracle implements the LISTAGG function to aggregate detail values for a column into a single value per group. The result is a delimited list of values. The following example returns a column named stop with a comma-delimited list of falls on a given tour:
SELECT t.name AS tour_name,
       LISTAGG(u.name, ',') WITHIN GROUP (ORDER BY u.name ASC) AS stop
FROM upfall u INNER JOIN trip t
     ON u.id = t.stop
GROUP BY t.name;

TOUR_NAME | STOP
----------|---------------------------------------------------------------------------
M-28      | Agate Falls,Alger Falls,Bond Falls,Canyon Falls,Munising Falls,Scott Falls

Question : What would happen without an index on the table?
Answer : Suppose we run SELECT * FROM Employee WHERE Employee_Name = 'Jesus' against a table that has no index on Employee_Name. What exactly goes on behind the scenes to find employees who are named Jesus? The database software has to look at every single row in the Employee table to see if the Employee_Name for that row is 'Jesus'. And, because we want every row with the name 'Jesus', it cannot stop looking once it finds one such row, because there could be others. So every row up to the last one must be examined, which in a table with thousands of rows means thousands of rows are read to find the ones with the name 'Jesus'. This is what is called a full table scan.

Question : In databases, what is a full table scan? Also, what are some of the causes of full table scans?
Answer : A full table scan looks through all of the rows in a table – one by one – to find the data that a query is looking for. Obviously, this can cause very slow SQL queries if you have a table with a lot of rows – just imagine how performance-intensive a full table scan would be on a table with millions of rows. Using an index can help prevent full table scans.

Question : What are the different scenarios which cause a full table scan?
Answer : Statistics have not been updated. Normally, statistics are kept on tables and indexes. If for some reason they are missing or stale, this may result in a full table scan, because most RDBMSs have query optimizers that use those statistics to figure out whether using an index is worthwhile. Without good statistics, the RDBMS may wrongly determine that a full table scan is more efficient than using an index.
The query has no WHERE clause. If nothing filters the rows that appear in the result set, a full table scan might be performed.

Question : Even if we have indexes, is it still possible that full table scan is done?
Answer : There are some scenarios in which a full table scan will still be performed even though an index is present on that table. Let’s go through some of those scenarios.
If a query does have a WHERE clause, but none of the columns in that WHERE clause match the leading column of an index on the table, then a full table scan will usually be performed.
Even if a query does have a WHERE clause with a column that matches the first column of an index, a full table scan can still occur. This situation arises when the comparison being used by the WHERE clause prevents the use of an index. Here are some scenarios in which that could happen:
If the NOT EQUAL (the "<>") operator is used. An example is "WHERE NAME <> 'PROGRAMMERINTERVIEW'". This could still result in a full table scan, because indexes are usually used to find what is inside a table, but indexes (in general) cannot be used to find what is not inside a table.
If the NOT operator is used. An example is "WHERE NOT NAME = 'PROGRAMMERINTERVIEW'".
If the wildcard operator is used in the first position of a comparison string. An example is "WHERE NAME LIKE '%INTERVIEW%'".
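
Most databases can show which access path the optimizer picked. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in (Oracle has EXPLAIN PLAN for the same purpose); the plan helper and the index name idx_employee_name are hypothetical. The exact plan text varies between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id      INTEGER PRIMARY KEY,
    employee_name    TEXT,
    employee_address TEXT
);
CREATE INDEX idx_employee_name ON employee (employee_name);
""")

def plan(sql):
    """Return the optimizer's plan steps for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)   # column 3 is the description

# Equality on the indexed column: the index can be searched.
equality_plan = plan("SELECT * FROM employee WHERE employee_name = 'Jesus'")
# Leading wildcard: the index order is of no help, so the table is scanned.
wildcard_plan = plan("SELECT * FROM employee WHERE employee_name LIKE '%esus%'")

print(equality_plan)
print(wildcard_plan)
```

The first plan mentions idx_employee_name, while the second falls back to scanning the whole table.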

Question : How a database index can help performance.
Answer : You might be thinking that doing a full table scan sounds inefficient for something so simple – shouldn’t software be smarter? It’s almost like looking through the entire table with the human eye – very slow and not at all sleek. But, as you probably guessed by the title of this article, this is where indexes can help a great deal. The whole point of having an index is to speed up search queries by essentially cutting down the number of records/rows in a table that need to be examined.

Question : What kind of data structure is an index?
Answer : B-trees are the most commonly used data structures for indexes. B-trees are popular because they are time efficient: look-ups, deletions, and insertions can all be done in logarithmic time. Another major reason B-trees are more commonly used is that the data stored inside the B-tree can be sorted. The RDBMS typically determines which data structure is actually used for an index. But, in some scenarios with certain RDBMSs, you can specify which data structure you want your database to use when you create the index itself.

Question : How does a hash table index work?
Answer : Hash tables are another data structure that you may see being used as indexes; these indexes are commonly referred to as hash indexes. Hash indexes are used because hash tables are extremely efficient at looking up values. So, queries that compare for equality to a string can retrieve values very fast if they use a hash index. For instance, the query we discussed earlier (SELECT * FROM Employee WHERE Employee_Name = 'Jesus') could benefit from a hash index created on the Employee_Name column. In a hash index, the column value is the key into the hash table, and the value mapped to that key is a pointer to the row data in the table. Since a hash table is basically an associative array, a typical entry would look something like "Jesus => 0x28939", where 0x28939 is a reference to the table row where Jesus is stored. Looking up a value like "Jesus" in a hash index and getting back a reference to the row is a lot faster than scanning the table to find all the rows with a value of "Jesus" in the Employee_Name column.

Question : What are the disadvantages of a hash index?
Answer : Hash tables are not sorted data structures, and there are many types of queries that hash indexes cannot help with. For instance, suppose you want to find all of the employees who are less than 40 years old. How could you do that with a hash index? It is not possible, because a hash table is only good for looking up key-value pairs, which means queries that check for equality (like "WHERE name = 'Jesus'"). Implied in the key-value mapping is that the keys of a hash table are not sorted or stored in any particular order. This is why hash indexes are usually not the default data structure for database indexes: they are not as flexible as B-trees.
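
The contrast can be sketched in plain Python, with a dict standing in for a hash index and a sorted list searched by bisect standing in for a B-tree. Both are simplifications, and the rows, find_by_name and find_younger_than are hypothetical.

```python
import bisect

# Table rows; the list position plays the role of the row pointer.
rows = [("Jesus", 35), ("Maria", 52), ("Ahmed", 28), ("Jesus", 61), ("Lena", 39)]

# Hash index on name: key -> list of row positions.
hash_index = {}
for pos, (name, _age) in enumerate(rows):
    hash_index.setdefault(name, []).append(pos)

# Sorted index on age: (age, row position) pairs kept in age order.
sorted_index = sorted((age, pos) for pos, (_name, age) in enumerate(rows))

def find_by_name(name):
    """Equality lookup: one hash probe, no scanning."""
    return [rows[pos] for pos in hash_index.get(name, [])]

def find_younger_than(limit):
    """Range lookup: binary search for the cut-off, then read the prefix."""
    end = bisect.bisect_left(sorted_index, (limit,))
    return [rows[pos] for _age, pos in sorted_index[:end]]

print(find_by_name("Jesus"))   # [('Jesus', 35), ('Jesus', 61)]
print(find_younger_than(40))   # [('Ahmed', 28), ('Jesus', 35), ('Lena', 39)]
```

The dict answers the equality question directly, but answering "younger than 40" from it would mean visiting every key, which is exactly the scan an index is supposed to avoid.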

Question : What are some other types of indexes?
Answer : Indexes that use an R-tree data structure are commonly used to help with spatial problems. For instance, a query like "Find all of the Starbucks within 2 kilometers of me" is the type of query that could show enhanced performance if the database table uses an R-tree index.
Another type of index is the bitmap index, which works well on columns that contain few distinct values (such as Boolean true and false) repeated across many rows, that is, columns with low selectivity.

Question : How does an index improve performance?
Answer : Because an index is basically a data structure that is used to store column values, looking up those values becomes much faster. And, if an index uses the most common data structure type, a B-tree, then the data structure is also sorted. Having the column values sorted can be a major performance enhancement; read on to find out why.
Let's say that we create a B-tree index on the Employee_Name column. This means that when we search for employees named "Jesus" using the SQL we showed earlier, the entire Employee table does not have to be searched. Instead, the database will use the index to find employees named Jesus, because the index will presumably be sorted alphabetically by the employee's name. And, because it is sorted, searching for a name is a lot faster, since all names starting with a "J" will be right next to each other in the index. It is also important to note that the index stores pointers to the table rows so that other column values can be retrieved; read on for more details on that.

Question : What exactly is inside a database index?
Answer : So, now you know that a database index is created on a column in a table, and that the index stores the values in that specific column. But it is important to understand that a database index does not store the values in the other columns of the same table. For example, if we create an index on the Employee_Name column, the Employee_Age and Employee_Address column values are not also stored in the index. If we stored all the other columns in the index, it would be just like creating another copy of the entire table, which would take up too much space and be very inefficient.
An index also stores a pointer to the table row. So, if the value that we are looking for is found in an index (like 'Jesus'), how does the database find the other values in the same row (like the address of Jesus and his age)? It is quite simple: in addition to the column value, the index stores a pointer to the corresponding row in the table. A pointer is just a reference to the place on disk where the row data is stored. This means that one of the entries (or nodes) in the index for Employee_Name could be something like ("Jesus", 0x82829), where 0x82829 is the address on disk (the pointer) where the row data for "Jesus" is stored. Without that pointer all you would have is a single value, which would be meaningless because you would not be able to retrieve the other values in the same row, like the address and the age of an employee.

Question : How does a database know when to use an index?
Answer : When a query like “SELECT * FROM Employee WHERE Employee_Name = ‘Jesus’ ” is run, the database will check to see if there is an index on the column(s) being queried. Assuming the Employee_Name column does have an index created on it, the database will have to decide whether it actually makes sense to use the index to find the values being searched – because there are some scenarios where it is actually less efficient to use the database index, and more efficient just to scan the entire table.

Question : Can you force the database to use an index on a query?
Answer : Generally, you will not tell the database when to actually use an index; that decision is made by the database itself. It is worth noting, though, that in most databases (like Oracle and MySQL) you can request that a particular index be used, for example through optimizer hints in Oracle or index hints such as FORCE INDEX in MySQL.

Question : What is the cost of having a database index?
Answer : So, what are some of the disadvantages of having a database index? Well, for one thing it takes up space – and the larger your table, the larger your index. Another performance hit with indexes is the fact that whenever you add, delete, or update rows in the corresponding table, the same operations will have to be done to your index. Remember that an index needs to contain the same up to the minute data as whatever is in the table column(s) that the index covers.
As a general rule, an index should only be created on a table if the data in the indexed column will be queried frequently.

Question : What is normalization? What is the advantage of normalization?
Answer : Database Normalization is a technique of organizing the data in the database. Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies. It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables.

Normalization is used mainly for two purposes:

Eliminating redundant data.
Ensuring data dependencies make sense, i.e. data is logically stored.

First Normal Form (1NF)

As per First Normal Form, no row may contain a repeating group of information: each column must hold a single value per row. Each table should be organized into rows, and each row should have a primary key that distinguishes it as unique.

The Primary key is usually a single column, but sometimes more than one column can be combined to create a single primary key. For example consider a table which is not in First normal form

Student Table :

Student	Age	Subject
Adam	15	Biology, Maths
Alex	14	Maths
Stuart	17	Maths

In First Normal Form, no row may have a column that holds more than one value, such as a comma-separated list. Instead, such data must be split into multiple rows.

Student Table following 1NF will be :

Student	Age	Subject
Adam	15	Biology
Adam	15	Maths
Alex	14	Maths
Stuart	17	Maths

With First Normal Form, data redundancy increases, because the same column values now repeat across multiple rows, but each row as a whole is unique.

Second Normal Form (2NF)

As per the Second Normal Form there must not be any partial dependency of any column on primary key. It means that for a table that has concatenated primary key, each column in the table that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column depends only on one part of the concatenated key, then the table fails Second normal form.

In the First Normal Form example there are two rows for Adam, to include the multiple subjects that he has opted for. While this is searchable, and follows First Normal Form, it is an inefficient use of space. Also, in the First Normal Form table above, the candidate key is {Student, Subject}, but the Age of a student depends only on the Student column, which violates Second Normal Form. To achieve Second Normal Form, split the subjects out into an independent table, and match them up using the student names as foreign keys.

New Student Table following 2NF will be :

Student	Age
Adam	15
Alex	14
Stuart	17

In the Student table the candidate key is the Student column, because the only other column, Age, depends on it.

New Subject Table introduced for 2NF will be :

Student	Subject
Adam	Biology
Adam	Maths
Alex	Maths
Stuart	Maths

In the Subject table the candidate key is {Student, Subject}. Both tables now qualify for Second Normal Form, which removes the update anomalies caused by partial dependencies. A table in Second Normal Form can still suffer update anomalies in some cases, and Third Normal Form exists to handle those.
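
The split into Student and Subject tables can be checked with a short sketch. It loads the example rows into SQLite through Python's sqlite3 module, and the table name student_1nf is a hypothetical label for the First Normal Form version. Joining the two new tables reproduces the original rows, so no information is lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The table as it stands in First Normal Form.
CREATE TABLE student_1nf (student TEXT, age INTEGER, subject TEXT,
                          PRIMARY KEY (student, subject));
INSERT INTO student_1nf VALUES
    ('Adam', 15, 'Biology'), ('Adam', 15, 'Maths'),
    ('Alex', 14, 'Maths'),   ('Stuart', 17, 'Maths');

-- 2NF: age depends on student alone, so it moves to its own table.
CREATE TABLE student (student TEXT PRIMARY KEY, age INTEGER);
CREATE TABLE subject (student TEXT REFERENCES student(student),
                      subject TEXT,
                      PRIMARY KEY (student, subject));
INSERT INTO student SELECT DISTINCT student, age FROM student_1nf;
INSERT INTO subject SELECT student, subject FROM student_1nf;
""")

original = conn.execute(
    "SELECT student, age, subject FROM student_1nf "
    "ORDER BY student, subject").fetchall()
rejoined = conn.execute("""
    SELECT s.student, s.age, j.subject
    FROM student s JOIN subject j ON j.student = s.student
    ORDER BY s.student, j.subject""").fetchall()
student_rows = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(rejoined == original, student_rows)  # True 3
```

Adam's age is now stored once instead of twice, so an update to it can no longer leave two rows disagreeing.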

Third Normal Form (3NF)

Third Normal Form requires that every non-prime attribute of a table depend directly on the primary key; that is, no non-prime attribute may be determined by another non-prime attribute. Such a transitive functional dependency must be removed from the table, and the table must also be in Second Normal Form. For example, consider a table with the following fields.

Student_Detail Table :

Student_id	Student_name	DOB	Street	city	State	Zip

In this table Student_id is the primary key, but Street, City and State depend on Zip. The dependency of these fields on Student_id through Zip is called a transitive dependency. Hence, to apply 3NF, we need to move Street, City and State to a new table with Zip as the primary key.

New Student_Detail Table :

Student_id	Student_name	DOB	Zip

Address Table :

Zip	Street	city	state

The advantages of removing transitive dependency are:

The amount of data duplication is reduced.
Data integrity is achieved.

Query Tuning Basics

SQL Tuning/SQL Optimization Techniques:

1) Name the columns you need in the SELECT statement instead of using '*'. The query can become faster because less data is read and returned.

For Example: Write the query as

SELECT id, first_name, last_name, age, subject FROM student_details;

Instead of:

SELECT * FROM student_details;

2) The HAVING clause filters rows after they have been selected and grouped. Use it only for conditions on groups, and put row-level conditions in the WHERE clause so that rows are filtered out before grouping.

For Example: Write the query as

SELECT subject, count(subject)
FROM student_details
WHERE subject != 'Science'
AND subject != 'Maths'
GROUP BY subject;

Instead of:

SELECT subject, count(subject)
FROM student_details
GROUP BY subject
HAVING subject != 'Science' AND subject != 'Maths';

3) Sometimes you may have more than one subquery in your main query. Try to minimize the number of subquery blocks in your query.

For Example: Write the query as:

SELECT name
FROM employee
WHERE (salary, age) = (SELECT MAX(salary), MAX(age)
                       FROM employee_details)
AND dept = 'Electronics';

Instead of:

SELECT name
FROM employee
WHERE salary = (SELECT MAX(salary) FROM employee_details)
AND age = (SELECT MAX(age) FROM employee_details)
AND dept = 'Electronics';

4) Use the operators EXISTS and IN and table joins appropriately in your query.

a) IN often has the slowest performance, although modern optimizers may produce the same plan for IN and EXISTS.

b) IN is efficient when most of the filter criteria are in the sub-query.

c) EXISTS is efficient when most of the filter criteria are in the main query.

For Example: Write the query as

SELECT * FROM product p
WHERE EXISTS (SELECT * FROM order_items o
              WHERE o.product_id = p.product_id);

Instead of:

SELECT * FROM product p
WHERE product_id IN
      (SELECT product_id FROM order_items);

5) Use EXISTS instead of DISTINCT when using joins that involve tables having a one-to-many relationship.

For Example: Write the query as

SELECT d.dept_id, d.dept
FROM dept d
WHERE EXISTS (SELECT 'X' FROM employee e WHERE e.dept = d.dept);

Instead of:

SELECT DISTINCT d.dept_id, d.dept
FROM dept d, employee e
WHERE e.dept = d.dept;

6) Try to use UNION ALL in place of UNION. UNION ALL skips the duplicate-elimination step that UNION performs, so use it when duplicates cannot occur or are acceptable.

For Example: Write the query as

SELECT id, first_name
FROM student_details_class10
UNION ALL
SELECT id, first_name
FROM sports_team;

Instead of:

SELECT id, first_name
FROM student_details_class10
UNION
SELECT id, first_name
FROM sports_team;
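
The two operators do not return the same rows when the inputs overlap, which a small sketch can confirm. It runs the queries from this tip against SQLite through Python's sqlite3 module; the sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student_details_class10 (id INTEGER, first_name TEXT);
CREATE TABLE sports_team (id INTEGER, first_name TEXT);
INSERT INTO student_details_class10 VALUES (1, 'Adam'), (2, 'Alex');
INSERT INTO sports_team VALUES (2, 'Alex'), (3, 'Stuart');  -- Alex is in both
""")

union_all = conn.execute("""
    SELECT id, first_name FROM student_details_class10
    UNION ALL
    SELECT id, first_name FROM sports_team
    ORDER BY id""").fetchall()

union = conn.execute("""
    SELECT id, first_name FROM student_details_class10
    UNION
    SELECT id, first_name FROM sports_team
    ORDER BY id""").fetchall()

print(len(union_all), len(union))  # 4 3
```

UNION ALL keeps both copies of Alex and avoids the extra work of removing duplicates; UNION returns the row once.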

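7) Be careful with the conditions you use in the WHERE clause. Conditions that compare a bare column against a value or range can use an index; inequality tests and expressions on the column often cannot. The rewritten queries below are only valid when they match what the application actually needs.

For Example: Write the query as

SELECT id, first_name, age FROM student_details WHERE age > 10;

Instead of:

SELECT id, first_name, age FROM student_details WHERE age != 10;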

Write the query as

SELECT id, first_name, age
FROM student_details
WHERE first_name LIKE 'Cha%';

Instead of:

SELECT id, first_name, age
FROM student_details
WHERE SUBSTR(first_name,1,3) = 'Cha';

Write the query as

SELECT id, first_name, age
FROM student_details
WHERE first_name LIKE NVL(:name, '%');

Instead of:

SELECT id, first_name, age
FROM student_details
WHERE first_name = NVL(:name, first_name);

Write the query as (aggregate functions such as MIN and MAX are not allowed directly in a WHERE clause, so the bounds are shown here as bind variables)

SELECT product_id, product_name
FROM product
WHERE unit_price BETWEEN :min_price AND :max_price;

Instead of:

SELECT product_id, product_name
FROM product
WHERE unit_price >= :min_price
AND unit_price <= :max_price;

Write the query as

SELECT id, name, salary
FROM employee
WHERE dept = 'Electronics'
AND location = 'Bangalore';

Instead of:

SELECT id, name, salary
FROM employee
WHERE dept || location = 'ElectronicsBangalore';

Keep the column alone on one side of the comparison and put any arithmetic on the other side, so that an index on the column can be used.

Write the query as

SELECT id, name, salary
FROM employee
WHERE salary < 25000;

Instead of:

SELECT id, name, salary
FROM employee
WHERE salary + 10000 < 35000;

Write the query as

SELECT id, first_name, age
FROM student_details
WHERE age > 10;

Instead of:

SELECT id, first_name, age
FROM student_details
WHERE NOT age = 10;
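
The salary comparison in tip 7 can be checked against a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module as a stand-in; other optimizers may behave differently, and the index name idx_employee_salary and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
CREATE INDEX idx_employee_salary ON employee (salary);
""")

def plan(sql):
    """Return the optimizer's plan steps for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "; ".join(row[3] for row in rows)   # column 3 is the description

# Bare column on the left: the index on salary can bound the search.
bare_column_plan = plan(
    "SELECT id, name, salary FROM employee WHERE salary < 25000")
# Arithmetic on the column: the indexed values no longer match the expression.
expression_plan = plan(
    "SELECT id, name, salary FROM employee WHERE salary + 10000 < 35000")

print(bare_column_plan)
print(expression_plan)
```

Both queries return the same rows, but only the first lets the optimizer use the index.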

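8) Use DECODE to avoid scanning the same rows or joining the same table repeatedly, for example to compute several conditional results in one pass. DECODE can also be used inside a GROUP BY or ORDER BY clause. It is not a substitute for a WHERE filter, though: filtering in the WHERE clause returns only the matching rows and lets the database use an index.

For Example: Write the query as

SELECT id FROM employee
WHERE name LIKE 'Ramesh%'
AND location = 'Bangalore';

Instead of:

SELECT DECODE(location,'Bangalore',id,NULL) id FROM employee
WHERE name LIKE 'Ramesh%';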

9) To store large binary objects, first place them in the file system and add the file path in the database.

10) Follow consistent SQL formatting rules. Consistent formatting makes queries easier to read, and in Oracle textually identical statements can share a parsed cursor in the library cache.

a) Use a single case for all SQL verbs

b) Begin all SQL verbs on a new line

c) Separate all words with a single space

d) Right- or left-align verbs under the initial SQL verb

Question : What is the difference between stored procedures and functions?
Answer : The points below describe SQL Server (T-SQL); other databases differ in the details.
Stored procedures can be called independently using the EXEC keyword. A stored procedure cannot be used in a SELECT, WHERE or HAVING clause.
Functions can be called from a SELECT, WHERE or HAVING clause. Table-valued functions can even be joined to each other.

Stored procedures can perform all DML operations: inserting new records, updating records and deleting existing records.
Functions cannot affect the state of the database, which means they cannot perform insert, update, delete or create operations on database tables; only SELECT is allowed. The one exception is that DML can be performed on a table variable inside a user-defined function.

Stored procedures are stored in the database in compiled form.
Functions are parsed and compiled at runtime only.

In stored procedures we can use transaction statements. We cannot use them in functions.

Procedures may return no value, one value, or several values through OUTPUT and/or INOUT parameters. Because SQL Server allows up to 2,100 parameters per procedure, many values can be returned this way. A function always returns exactly one value.

Stored procedures are normally used to perform specific tasks.
Functions are normally used to compute a value.

Temporary tables can be created in stored procedures.
This is not possible in functions.

When a statement in a stored procedure raises an error, T-SQL by default continues with the next statement for many error types.
In a function, T-SQL stops executing the remaining statements.

Procedures can accept input (the default), output and inout parameters.
Functions can accept only input parameters.

A function's return type can be a scalar value or a table.
A stored procedure can create a table and return result sets, but it cannot return a table as its return value.

Question : What is a SQL Injection Attack?
Answer : A SQL Injection attack is a form of attack that comes from user input that has not been checked to see that it is valid. The objective is to fool the database system into running malicious code that will reveal sensitive information or otherwise compromise the server.
There are two main types of attacks. First-order attacks are when the attacker receives the desired result immediately, either by direct response from the application they are interacting with or some other response mechanism, such as email. Second-order attacks are when the attacker injects some data that will reside in the database, but the payload will not be immediately activated.

Question : Give An example of what an attacker might do using SQL injection?
Answer : In the following example, assume that a web site is being used to mount an attack on the database. If you think about a typical SQL statement, you might think of something like:

SELECT ProductName, QuantityPerUnit, UnitPrice
FROM Products
WHERE ProductName LIKE 'G%'

The objective of the attacker is to inject their own SQL into the statement that the application will use to query the database. If, for instance, the above query was generated from a search feature on a web site, then they user may have inserted the "G" as their query. If the server side code then inserts the user input directly into the SQL statement, it might look like this:

// Vulnerable: the user's input is concatenated directly into the SQL text.
string sql = "SELECT ProductName, QuantityPerUnit, UnitPrice " +
    "FROM Products " +
    "WHERE ProductName LIKE '" + this.search.Text + "%'";
SqlDataAdapter da = new SqlDataAdapter(sql, DbCommand);
da.Fill(productDataSet);

This is all fine if the data is valid, but what if the user types something unexpected? What happens if the user types:

' UNION SELECT name, type, id FROM sysobjects;--

Note the initial apostrophe; it closes the opening quote in the original SQL statement. Also, note the two dashes at the end; that starts a comment, which means that anything left in the original SQL statement is ignored.
Now, when the attacker views the page that was meant to list the products the user has searched for, they get a list of all the names of all the objects in the database and the type of object that they are. From this list, the attacker can see that there is a table called Users. If they take note of the id for the Users table, they could then inject the following:

' UNION SELECT name, '', length FROM syscolumns
WHERE id = 1845581613;--

This would give them a list of the column names in the Users table. Now they have enough information to retrieve a list of users, their passwords, and whether each user has admin privileges on the web site.

' UNION SELECT UserName, Password, IsAdmin FROM Users;--

Assuming that there is a table called Users which has columns called UserName and Password, it is possible to union that with the original query, and the results will be interpreted as if the UserName was the name of the product and the Password was the quantity per unit. Finally, because the attacker discovered that there is an IsAdmin column, they are likely to retrieve the information in that too.
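
To see why the initial apostrophe and the trailing dashes matter, the following Python sketch rebuilds the same string concatenation as the server-side code above and prints the SQL text the database would receive. The helper name build_search_sql is hypothetical and nothing is executed against a database; it only assembles the string.

```python
def build_search_sql(search_text: str) -> str:
    """Mimic the vulnerable concatenation: user input is pasted into the SQL text."""
    return ("SELECT ProductName, QuantityPerUnit, UnitPrice "
            "FROM Products "
            "WHERE ProductName LIKE '" + search_text + "%'")


# Expected use: the input stays inside the quoted LIKE pattern.
normal_sql = build_search_sql("G")

# Injected use: the leading apostrophe closes the pattern early and the
# trailing -- turns the leftover %' into a comment.
injected_sql = build_search_sql("' UNION SELECT name, type, id FROM sysobjects;--")

print(normal_sql)
print(injected_sql)
```

In the second printed statement the LIKE pattern is empty, the attacker's UNION SELECT follows it as real SQL, and the original closing characters sit harmlessly behind the comment marker.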

Question : How does locking down work?
Answer : Security is something that needs to be tackled on many levels because a chain is only as strong as its weakest link. When a user interacts with a piece of software, there are many links in the chain; if the user is malicious, he could attempt to attack these links to find the weak point and attempt to break the system at that point. With this in mind, it is important that the developer does not become complacent about the security of the system because one security measure is put in place, or a set of security measures are in place on only one part of the system.
An intranet website that uses Windows authentication (it takes the user's existing credentials based on who they are logged in as) and is sitting inside the corporate network and unavailable to Internet users may give the impression that only authorised users can access the intranet web application. However, it is possible for an authenticated user to gain unauthorised access if the security is not taken much beyond that level. Some statistics support the suggestion that most security breaches are insider jobs rather than people attacking the system from outside.
With this in mind, it is important that even if the application permits only valid data through that has been carefully verified and cleaned up, other security measures are put in place. This is especially important between application layers where there may be an increased opportunity for spoofing of requests or results.
For example, if a web application were to request that the user choose a date, then it would be normal that the values for the date are checked in some JavaScript function on the web page before any data was posted back to the server. This improves the user experience by reducing the wait between lots of server requests. However, the value needs to be validated again on the server as it is possible to spoof the request with a deliberately crafted invalid date.
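
As a minimal sketch of the server-side re-check described above, the Python function below validates a posted date string regardless of what the browser claims to have checked. The function name parse_submitted_date and the YYYY-MM-DD wire format are assumptions made for illustration.

```python
from datetime import date, datetime


def parse_submitted_date(raw: str) -> date:
    """Validate a date string posted by the client; raises ValueError if invalid."""
    # strptime rejects malformed text and impossible dates such as 2023-02-30,
    # so a spoofed request cannot get a bad value past this point.
    return datetime.strptime(raw, "%Y-%m-%d").date()


valid = parse_submitted_date("2023-02-28")

try:
    parse_submitted_date("2023-02-30")
    rejected = False
except ValueError:
    rejected = True
```

The client-side JavaScript check still improves the user experience; this server-side check is the one that actually protects the system.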

Question : How does encrypting data work in a DB?
Answer : Starting from the proposition that somehow an attacker has managed to break through all other defenses, what information is so sensitive that it needs to remain a secret? Candidates for encryption include user log in details or financial information such as credit card details.
For items such as passwords, the user's password can be stored as a "salted hash". When a user creates a password, a randomly generated "salt" value is created by the application and appended to the password, and the password-and-salt are then passed through a one-way hashing routine, such as the one found in the .NET Framework's helper class FormsAuthentication (HashPasswordForStoringInConfigFile method). The result is a salted hash which is stored in the database along with the clear text salt string.
The value of a salted hash is such that a dictionary attack is not going to work as each dictionary would have to be rebuilt appending the various salt values and recomputing the hash values for each item. While it is still possible to determine the password by brute force, the use of the salt (even though it is known) greatly slows down the process. The second advantage of the salt is that it masks any situations where two independent users happen to use the same password, as the salted hash value for each user would be different if given different salt values.
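
The salted-hash idea can be sketched in Python as follows. Note that this swaps in PBKDF2 from the standard library's hashlib rather than the .NET routine mentioned above, and the function names, the 16-byte salt and the iteration count are illustrative assumptions, not recommendations from the original text.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, hash). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)  # assumed salt size for this sketch
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# Two users choosing the same password end up with different stored hashes.
salt_a, hash_a = hash_password("correct horse")
salt_b, hash_b = hash_password("correct horse")
```

Both the salt and the hash would be stored in the user's row; only the password itself is never stored.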

Question : What is Least Privilege - Database account?
Answer : Running an application that connects to the database using the database's administrator account gives an attacker the potential to perform almost limitless commands against the database. Anything an administrator can do, so can an attacker.
Using the example application above, an attacker could inject commands to discover the contents of the hard disk(s) on the server.
The first injected command would create a temporary store in the database: a table with the same structure as the result set of the extended stored procedure that is going to be called. It would then populate that table with the results of the extended stored procedure.

Question : What is Least Privilege - Process account?
Answer : When an instance of SQL Server is installed on a computer, it creates a service that runs in the background and processes the commands from applications that are connected to it. By default, this service is installed to use the Local System account. This is the most powerful account on a Windows machine; it is even more powerful than the Administrator account.
If an attacker has an opportunity to break out of the confines of SQL Server itself, such as through the extended procedure xp_cmdshell, then they could gain unrestricted access to the machine that the SQL Server is on.
Microsoft recommends that during the installation of SQL Server, the service is given a domain account which has the permissions set to only the necessary resources. That way, an attacker is confined by the permission set needed to run SQL Server.

Question : How can cleaning and validating input prevent SQL injection?
Answer : In many applications, the developer has side-stepped the potential use of the apostrophe as a way to get access to the system by performing a string replace on the input given by the user, typically doubling each apostrophe. This is done for valid reasons, such as being able to enter surnames like "O'Brian" or "D'Arcy", and so the developer may not even realise that they have partly defeated a SQL injection attack.
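
A minimal sketch of that string replace, using Python's built-in sqlite3 module as a stand-in database. The helper name escape_apostrophes and the Customers table are hypothetical. As the answer says, this only partly defeats injection, so it is shown to illustrate the mechanism rather than as a complete defence.

```python
import sqlite3


def escape_apostrophes(value: str) -> str:
    """Double every apostrophe so it is read as a literal character in a SQL string."""
    return value.replace("'", "''")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Surname TEXT)")

surname = "O'Brian"
# Without the replace, the apostrophe in the surname would end the string literal early.
conn.execute(
    "INSERT INTO Customers (Surname) VALUES ('" + escape_apostrophes(surname) + "')"
)

stored = conn.execute("SELECT Surname FROM Customers").fetchone()[0]
print(stored)
```

The value comes back from the database with a single apostrophe, because the doubling is only part of the SQL text and not of the stored data.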

Question : What is a Second-Order Attack?
Answer : A second-order attack is one where the data lies dormant in the database until some future event occurs. It often happens because once data is in the database, it is often thought of as being clean and is not checked again. However, the data is frequently used in queries where it can still cause harm.
Consider an application that permits the users to set up some favourite search criteria. When the user defines the search parameters, the application escapes out all the apostrophes so that a first-order attack cannot occur when the data for the favourite is inserted into the database. However, when the user comes to perform the search, the data is taken from the database and used to form a second query which then performs the actual search. It is this second query which is the victim of the attack.
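
The favourite-search scenario can be reproduced in a few lines of Python with an in-memory SQLite database. The table names, column names and sample rows are all hypothetical; the point is only to show that the escaped insert is safe while the later, unescaped reuse of the stored value is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductName TEXT, UnitPrice REAL);
    CREATE TABLE Users (UserName TEXT, Password TEXT);
    CREATE TABLE Favourites (Criteria TEXT);
    INSERT INTO Products VALUES ('Gouda', 38.0);
    INSERT INTO Users VALUES ('alice', 's3cret');
""")

payload = "' UNION SELECT UserName, Password FROM Users --"

# Step 1: saving the favourite. Apostrophes are doubled, so this statement is
# safe and the payload is simply stored as data.
conn.execute(
    "INSERT INTO Favourites (Criteria) VALUES ('" + payload.replace("'", "''") + "')"
)

# Step 2: running the favourite. The stored value is assumed to be clean and is
# concatenated into a second query without any escaping.
criteria = conn.execute("SELECT Criteria FROM Favourites").fetchone()[0]
second_query = ("SELECT ProductName, UnitPrice FROM Products "
                "WHERE ProductName LIKE '" + criteria + "%'")
leaked = conn.execute(second_query).fetchall()
print(leaked)
```

The second query returns the row from Users instead of any product, even though the first-order insert was handled correctly.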

Question : How do parameterised queries help to prevent SQL injection attacks?
Answer : SQL Server, like many database systems, supports a concept called parameterised queries. This is where the SQL Command uses a parameter instead of injecting the values directly into the command. The particular second-order attack above would not have been possible if parameterised queries had been used.
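
A small sketch of the same idea in Python, using the built-in sqlite3 module in place of SQL Server. The search_products function and the sample rows are assumptions for illustration; the ? placeholder is sqlite3's parameter syntax, and other drivers use their own placeholder style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ProductName TEXT, QuantityPerUnit TEXT, UnitPrice REAL);
    CREATE TABLE Users (UserName TEXT, Password TEXT, IsAdmin INTEGER);
    INSERT INTO Products VALUES ('Gouda', '10 kg', 38.0);
    INSERT INTO Users VALUES ('alice', 's3cret', 1);
""")


def search_products(connection, search_text):
    """Search by name prefix; search_text is bound as data, never spliced into the SQL."""
    sql = ("SELECT ProductName, QuantityPerUnit, UnitPrice "
           "FROM Products WHERE ProductName LIKE ? || '%'")
    return connection.execute(sql, (search_text,)).fetchall()


normal = search_products(conn, "G")
attack = search_products(conn, "' UNION SELECT UserName, Password, IsAdmin FROM Users;--")
print(normal)
print(attack)
```

The injected text is treated as a literal product-name prefix, so the second search simply finds no products and nothing from Users is returned.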

Question : What is Concurrency control?
Answer : Concurrency control comprises the underlying mechanisms in a DBMS which handle isolation and guarantee related correctness. It is heavily utilized by the database and storage engines both to guarantee the correct execution of concurrent transactions, and (through different mechanisms) the correctness of other DBMS processes. The transaction-related mechanisms typically constrain the timing of database data access operations (transaction schedules) to certain orders characterized as the serializability and recoverability schedule properties. Constraining database access operation execution typically means reduced performance (rates of execution), and thus concurrency control mechanisms are typically designed to provide the best performance possible under the constraints. Often, when possible without harming correctness, the serializability property is compromised for better performance. However, recoverability cannot be compromised, since doing so typically results in a quick database integrity violation.

Two-phase locking is the most common transaction concurrency control method in DBMSs, used to provide both serializability and recoverability for correctness. In order to access a database object a transaction first needs to acquire a lock for this object. Depending on the access operation type (e.g., reading or writing an object) and on the lock type, acquiring the lock may be blocked and postponed, if another transaction is holding a lock for that object.
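
The locking rule can be illustrated with a toy lock table in Python. This is a simplified, single-threaded sketch under stated assumptions: the LockManager class is hypothetical, "S" and "X" stand for shared (read) and exclusive (write) locks, and a refused request returns False where a real DBMS would block the transaction until the lock is free.

```python
class LockManager:
    """Toy lock table: object name -> (mode, set of transaction ids holding it)."""

    def __init__(self):
        self._locks = {}

    def acquire(self, txn, obj, mode):
        """Growing phase: return True if the lock is granted, False if txn must wait."""
        held = self._locks.get(obj)
        if held is None:
            self._locks[obj] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # Sole holder: keep the lock, upgrading to exclusive if requested.
            new_mode = "X" if "X" in (held_mode, mode) else "S"
            self._locks[obj] = (new_mode, holders)
            return True
        if held_mode == "S" and mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False

    def release_all(self, txn):
        """Shrinking phase: release every lock held by txn, e.g. at commit."""
        for obj in list(self._locks):
            _, holders = self._locks[obj]
            holders.discard(txn)
            if not holders:
                del self._locks[obj]


lm = LockManager()
both_read = lm.acquire("T1", "row1", "S") and lm.acquire("T2", "row1", "S")
write_blocked = not lm.acquire("T2", "row1", "X")  # T1 still holds a read lock
lm.release_all("T1")                               # T1 commits
write_granted = lm.acquire("T2", "row1", "X")
```

Releasing everything in one step at the end of the transaction, as release_all does, corresponds to the strict variant of two-phase locking.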

Question : What are different Isolation levels?
Answer : Of the four ACID properties in a DBMS (Database Management System), the isolation property is the one most often relaxed. When attempting to maintain the highest level of isolation, a DBMS usually acquires locks on data or implements multiversion concurrency control, which may result in a loss of concurrency. This requires adding logic for the application to function correctly.

Most DBMSs offer a number of transaction isolation levels, which control the degree of locking that occurs when selecting data. For many database applications, the majority of database transactions can be constructed to avoid requiring high isolation levels (e.g. SERIALIZABLE level), thus reducing the locking overhead for the system. The programmer must carefully analyze database access code to ensure that any relaxation of isolation does not cause software bugs that are difficult to find. Conversely, if higher isolation levels are used, the possibility of deadlock is increased, which also requires careful analysis and programming techniques to avoid.

The isolation levels defined by the ANSI/ISO SQL standard are listed as follows.

Serializable
This is the highest isolation level.

With a lock-based concurrency control DBMS implementation, serializability requires read and write locks (acquired on selected data) to be released at the end of the transaction. Also, range-locks must be acquired when a SELECT query uses a ranged WHERE clause, especially to avoid the phantom reads phenomenon.

When using non-lock based concurrency control, no locks are acquired; however, if the system detects a write collision among several concurrent transactions, only one of them is allowed to commit.

Repeatable reads
In this isolation level, a lock-based concurrency control DBMS implementation keeps read and write locks (acquired on selected data) until the end of the transaction. However, range-locks are not managed, so phantom reads can occur.

Read committed
In this isolation level, a lock-based concurrency control DBMS implementation keeps write locks (acquired on selected data) until the end of the transaction, but read locks are released as soon as the SELECT operation is performed (so the non-repeatable reads phenomenon can occur in this isolation level). As in the previous level, range-locks are not managed.

Putting it in simpler words, read committed is an isolation level that guarantees that any data read is committed at the moment it is read. It simply restricts the reader from seeing any intermediate, uncommitted, 'dirty' read. It makes no promise whatsoever that if the transaction re-issues the read, it will find the same data; data is free to change after it is read.

Read uncommitted
This is the lowest isolation level. In this level, dirty reads are allowed, so one transaction may see not-yet-committed changes made by other transactions.

Since each isolation level is stronger than those below, in that no higher isolation level allows an action forbidden by a lower one, the standard permits a DBMS to run a transaction at an isolation level stronger than that requested (e.g., a "Read committed" transaction may actually be performed at a "Repeatable read" isolation level).
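
The ordering of the four levels can be summarised in a short Python sketch. The Isolation enum, the POSSIBLE table and the can_substitute function are hypothetical names; the table lists the read phenomena that each level still permits, as described in the level descriptions above.

```python
from enum import IntEnum


class Isolation(IntEnum):
    READ_UNCOMMITTED = 0
    READ_COMMITTED = 1
    REPEATABLE_READ = 2
    SERIALIZABLE = 3


# Read phenomena that can still occur at each level.
POSSIBLE = {
    Isolation.READ_UNCOMMITTED: {"dirty read", "non-repeatable read", "phantom read"},
    Isolation.READ_COMMITTED: {"non-repeatable read", "phantom read"},
    Isolation.REPEATABLE_READ: {"phantom read"},
    Isolation.SERIALIZABLE: set(),
}


def can_substitute(requested: Isolation, actual: Isolation) -> bool:
    """A DBMS may run a transaction at any level at least as strong as the one requested."""
    return actual >= requested


upgrade_ok = can_substitute(Isolation.READ_COMMITTED, Isolation.REPEATABLE_READ)
downgrade_ok = can_substitute(Isolation.REPEATABLE_READ, Isolation.READ_COMMITTED)
```

Because each stronger level permits a subset of the phenomena of the weaker ones, substituting a stronger level never exposes the transaction to anything it was not already prepared for.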

Question : What is the difference between DDL, DML and DCL commands?
Answer :

DDL
Data Definition Language (DDL) statements are used to define the database structure or schema. Some examples:
CREATE - creates objects in the database
ALTER - alters the structure of the database
DROP - deletes objects from the database
TRUNCATE - removes all records from a table and also releases the space allocated for those records
COMMENT - adds comments to the data dictionary
RENAME - renames an object

DML
Data Manipulation Language (DML) statements are used for managing data within schema objects. Some examples:
SELECT - retrieves data from a database
INSERT - inserts data into a table
UPDATE - updates existing data within a table
DELETE - deletes records from a table (all of them when no WHERE clause is given); the space allocated for the records remains
MERGE - UPSERT operation (insert or update)
CALL - calls a PL/SQL or Java subprogram
EXPLAIN PLAN - explains the access path to data
LOCK TABLE - controls concurrency

DCL
Data Control Language (DCL) statements are used to control access to the database. Some examples:
GRANT - gives users access privileges to the database
REVOKE - withdraws access privileges given with the GRANT command

TCL
Transaction Control (TCL) statements are used to manage the changes made by DML statements. They allow statements to be grouped together into logical transactions.
COMMIT - saves work done
SAVEPOINT - identifies a point in a transaction to which you can later roll back
ROLLBACK - restores the database to its state as of the last COMMIT
SET TRANSACTION - changes transaction options such as the isolation level and which rollback segment to use
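
The interplay of DDL, DML and TCL statements can be tried out with Python's built-in sqlite3 module. This is only a sketch: the Accounts table and the savepoint name are made up, SQLite's dialect differs from Oracle's in places, and SQLite has no GRANT or REVOKE, so DCL is not shown.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction below is controlled entirely by the explicit TCL statements.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: define the structure.
conn.execute("CREATE TABLE Accounts (Id INTEGER PRIMARY KEY, Balance INTEGER)")

# TCL + DML: group changes into one transaction.
conn.execute("BEGIN")
conn.execute("INSERT INTO Accounts VALUES (1, 100)")                 # DML
conn.execute("SAVEPOINT before_update")                              # TCL
conn.execute("UPDATE Accounts SET Balance = 0 WHERE Id = 1")         # DML
conn.execute("ROLLBACK TO SAVEPOINT before_update")                  # TCL: undo the UPDATE only
conn.execute("COMMIT")                                               # TCL: keep the INSERT

balance = conn.execute("SELECT Balance FROM Accounts WHERE Id = 1").fetchone()[0]
print(balance)
```

The row inserted before the savepoint survives the partial rollback, while the update made after it is discarded.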
