ARCHITECTURE:
1. Explain SQL Architecture
A: The general architecture of SQL Server has four major components: Protocols, the Relational Engine (also called the Query Processor), the Storage Engine, and SQLOS.
Every instruction and batch submitted to SQL Server for execution, from any client application, must interact with these four components.
The protocol layer receives the request from the end client and translates it into a form the SQL Server relational engine can understand and work on.
The query processor accepts T-SQL batches, then processes and executes each T-SQL batch. If data is required, the request is passed to the Storage Engine.
The Storage Engine manages data access and serves the requested data.
SQLOS takes responsibility for operating-system services and manages locks, synchronization, the buffer pool, memory, thread scheduling, I/O, etc.
A Basic SELECT Statement Life Cycle Summary
The whole life cycle of a SELECT query is described here:
1. The SQL Server Network Interface (SNI) on the client established a connection to the SNI on the SQL Server using a network protocol such as TCP/IP. It then created a connection to a TDS endpoint over the TCP/IP connection and sent the SELECT statement to SQL Server as a TDS message.
2. The SNI on the SQL Server unpacked the TDS message, read the SELECT statement, and passed a “SQL Command” to the Command Parser.
3. The Command Parser checked the plan cache in the buffer pool for an existing, usable query plan. When it didn't find one, it created a query tree based on the SELECT statement and passed it to the Optimizer to generate a query plan.
4. The Optimizer generated a “zero cost” or “trivial” plan in the pre-optimization phase because the statement was so simple. The query plan created was then passed to the Query Executor for execution.
5. At execution time, the Query Executor determined that data needed to be read to complete the query plan so it passed the request to the Access Methods in the Storage Engine via an OLE DB interface.
6. The Access Methods needed to read a page from the database to complete the request from the Query Executor and asked the Buffer Manager to provision the data page.
7. The Buffer Manager checked the data cache to see if it already had the page in cache. It wasn’t in cache so it pulled the page from disk, put it in cache, and passed it back to the Access Methods.
8. Finally, the Access Methods passed the result set back to the Relational Engine to send to the client.
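As a hedged illustration of step 3, the plan cache that the Command Parser checks can be observed through dynamic management views. A minimal sketch; the LIKE filter is only an assumption to narrow the output:

-- list cached plans together with the text of the statement that produced them
SELECT cp.usecounts,   -- how many times the cached plan was reused
       cp.objtype,     -- e.g. 'Adhoc' or 'Proc'
       st.text         -- the original statement text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE N'%SELECT%';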
2. Can you briefly explain about the component SQLOS?
A: SQLOS is an operating system layer inside SQL Server. It is the module that controls end-to-end data access, processing, and management. Below are the roles of SQLOS (see the sketch after this list):
Memory management
Scheduling
IO management
Synchronization services
Thread Management
Worker Management
Framework for locking management
Framework for Transaction management
Deadlock detection
Utilities for Dumping
Utilities for Exception handling
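A hedged sketch of how two of these roles, scheduling and memory management, can be observed from T-SQL (the DMV names are real; the pages_kb column assumes SQL Server 2012 or later):

-- scheduling: one SQLOS scheduler per visible CPU
SELECT scheduler_id, cpu_id, status,
       current_tasks_count, runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE';

-- memory management: which SQLOS memory clerks hold the most memory
SELECT TOP (5) type, SUM(pages_kb) / 1024 AS memory_mb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY memory_mb DESC;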
3. What is the difference between 32-bit and 64-bit versions of Windows?
The terms 32-bit and 64-bit refer to the way a computer's processor (also called a CPU) handles information. The 64-bit version of Windows handles large amounts of random access memory (RAM) more effectively than a 32-bit system.
Mainly it starts at the hardware level with the way the CPU handles information. For example, a 32-bit CPU can only address 2^32 bytes, a 4 GB address space of which 32-bit Windows gives each process just 2 GB of user-mode memory by default. A 64-bit CPU can address 2^64 bytes, which is a very large amount of memory. You can easily conclude that any hardware improvement positively affects every piece of software that runs on top of it, and SQL Server is no exception.
4. What is the difference between 64-bit and 32-bit SQL Server 2008?
A: It is the amount of memory that SQL Server 2008 can use. A 32-bit version is limited to only a few gigabytes, whereas the 64-bit version can use much, much more. With the 32-bit version the server has to go to the hard drive for database information far more often; a 64-bit OS with enough memory allows the server to hold much more of the database in memory, making queries faster.
The main difference in a practical sense is the memory cap. 32-bit is limited to the old 2 GB threshold without AWE, and you have to use AWE (which is reminiscent of loading device drivers high) to access physical memory beyond the OS limit. With 64-bit SQL Server you can add memory to the current environment almost limitlessly (no one is making Windows servers that can hit the 64-bit memory maximum) and SQL Server will use it natively.
5. What is AWE?
A: It can be found in the SQL Server instance-level properties. AWE is properly explained in Books Online, so here is just a simple explanation.
The Address Windowing Extensions API is commonly known as AWE. AWE is used by SQL Server when it has to support very large amounts of physical memory. The AWE feature is only available in the Enterprise, Standard, and Developer editions of the 32-bit version of SQL Server.
Microsoft Windows 2000/2003 Server supports a maximum of 64 GB of memory. If we have installed a 32-bit version of SQL Server, which can use at most 3 GB of memory on Windows 2000/2003, we can enable the AWE feature to use the available physical memory of the server and improve SQL Server performance. In simple words, AWE provides memory-management functions that let Windows give more than 3 GB of memory to a standard 32-bit application.
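A minimal sketch of enabling AWE on a 32-bit instance (the 'awe enabled' option exists only through SQL Server 2008 R2 and was removed in SQL Server 2012; the service account also needs the Windows "Lock pages in memory" privilege):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;   -- takes effect after an instance restart
RECONFIGURE;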
6. What is PAE?
A: To summarize (from the SQL Server point of view):
1) With default settings: each process is assigned memory split between physical memory (user mode, so max 2 GB, the actual max depending on physical memory and other apps using the same space) plus disk (paged).
2) Nearly the same happens with /3GB: each process is assigned memory split between physical memory (user mode, so max 3 GB, the actual max depending on physical memory and other apps using the same space) plus disk (paged).
3) With /PAE and AWE: PAE/AWE-aware processes can be assigned up to 64 GB of memory (depending on the Windows edition), split between user-mode physical memory (max 2 GB or 3 GB depending on the /3GB option; see the rule of thumb below), additional physical memory, and disk (paged).
Rule of thumb:
- If your system has < 4 GB, use only /3GB.
- If your system has > 4 GB and < 16 GB, use /3GB + /PAE + AWE.
- If your system has > 16 GB, use /PAE + AWE.
Why, then, consider the x64 architecture if we have AWE?
The use of SQL Server (32-bit) with AWE has several important limitations. The additional memory addressability is available only to the relational database engine's management of database page buffers. It is not available to other memory-consuming database operations such as caching query plans, sorting, indexing, joins, or storing user connection information. It is also not available to other engines such as Analysis Services.
7. Calculate the size of the BPool using the algorithm below.
The SQL Server buffer pool is the minimum of physical RAM and user-mode memory (2 GB or 3 GB) minus MTL, minus the BUF structures:
BPool = minimum(physical memory, user address space - MTL) - BUF structures
BUF structures are arrays maintained by SQL Server to track the status of each buffer in the BPool. SQL Server makes a maximum of 32 allocation requests to the OS to reserve BPool pages.
SQL Server maintains a contiguous array to track status information associated with each buffer (8 KB page) in the BPool. In addition, SQL Server maintains a second array to track the committed and reserved bitmap pages. Each bit can be 0 or 1: 1 indicates the buffer is committed, and 0 indicates the page is reserved.
The BUF structures take approximately 16 MB when AWE is not enabled; when AWE is enabled, they use an additional 8 MB per GB of RAM in the system.
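A worked example of the formula under assumed numbers (4 GB of physical RAM, the default 2 GB user-mode address space, a default ~384 MB MTL reservation, AWE off):

-- BPool = min(physical memory, user address space - MTL) - BUF structures
DECLARE @physical_mb int = 4096,  -- assumed physical RAM
        @user_vas_mb int = 2048,  -- 2 GB user-mode address space (no /3GB)
        @mtl_mb      int = 384,   -- assumed default MemToLeave reservation
        @buf_mb      int = 16;    -- approximate BUF array size when AWE is off
SELECT CASE WHEN @physical_mb < @user_vas_mb - @mtl_mb
            THEN @physical_mb
            ELSE @user_vas_mb - @mtl_mb
       END - @buf_mb AS bpool_mb;  -- min(4096, 1664) - 16 = 1648 MB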
8. What is the use of SQL Server Browser?
A: SQL Server Browser contributes to the following actions:
1. Browsing a list of available servers.
2. Connecting to the correct server instance.
3. Connecting to the dedicated administrator connection (DAC).
9. What is the dedicated administrator connection (DAC)?
A: SQL Server provides a special diagnostic connection for administrators when standard connections to the server are not possible. This diagnostic connection allows an administrator to access SQL Server to execute diagnostic queries and troubleshoot problems even when SQL Server is not responding to standard connection requests.
The dedicated administrator connection (DAC) supports encryption and other SQL Server security features. The DAC only allows changing the user context to another admin user.
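A hedged sketch: locally you can open a DAC from SSMS or sqlcmd by prefixing the server name with ADMIN: (or by using sqlcmd's -A switch); remote DAC must first be enabled via sp_configure:

-- allow DAC connections from remote machines (off by default)
EXEC sp_configure 'remote admin connections', 1;
RECONFIGURE;
-- then connect over the DAC, e.g.: sqlcmd -S ADMIN:MyServer\MyInstance -E
-- (MyServer\MyInstance is a placeholder instance name)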
10. What is a Job in SQL Server?
A: A job is a specified series of operations performed sequentially by SQL Server Agent. A job can perform a wide range of activities, including running Transact-SQL scripts, command-line applications, Microsoft ActiveX scripts, Integration Services packages, Analysis Services commands and queries, or replication tasks. Jobs can run repetitive or schedulable tasks, and they can automatically notify users of job status by generating alerts, thereby greatly simplifying SQL Server administration.
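A minimal sketch of creating a T-SQL job through the msdb stored procedures (the job, step, database, and table names here are hypothetical):

USE msdb;
EXEC dbo.sp_add_job @job_name = N'NightlyCleanup';
EXEC dbo.sp_add_jobstep
     @job_name      = N'NightlyCleanup',
     @step_name     = N'Purge old audit rows',
     @subsystem     = N'TSQL',
     @database_name = N'MyDb',
     @command       = N'DELETE FROM dbo.AuditLog WHERE LoggedAt < DATEADD(DAY, -30, GETDATE());';
-- register the job on the local server so Agent will run it
EXEC dbo.sp_add_jobserver @job_name = N'NightlyCleanup';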
11. What is the advantage of CLR integration?
A: SQL Server lacks certain features, such as regular expressions, and it is also not very efficient at string processing. The Microsoft .NET Framework is more efficient at processing strings, regular expressions, and so on. Through CLR integration, SQL Server can use the features of the Microsoft .NET Framework.
We can create stored procedures/functions in the Microsoft .NET Framework and call them from SQL Server.
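A hedged sketch of wiring up a CLR function (the assembly path, class, and method names are hypothetical; the .NET method itself would be written in a language such as C#):

EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
-- register a hypothetical assembly that wraps System.Text.RegularExpressions
CREATE ASSEMBLY RegexHelpers
FROM N'C:\clr\RegexHelpers.dll'
WITH PERMISSION_SET = SAFE;
-- expose one of its static methods as a T-SQL scalar function
CREATE FUNCTION dbo.RegexIsMatch (@input nvarchar(max), @pattern nvarchar(200))
RETURNS bit
AS EXTERNAL NAME RegexHelpers.[RegexHelpers.Functions].IsMatch;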
12. What is the function of SQL Server Agent Windows service?
- It is a Windows service that handles the tasks scheduled within the SQL Server environment. These tasks are also called jobs and are stored within SQL Server. Jobs may run on a trigger, on a predefined schedule, or on demand.
- This service is also very useful for determining why a particular job did not run as intended.
13. What about max_server_memory and min_server_memory?
A: By default, SQL Server can change its memory requirements dynamically based on available system resources, which means it can use any amount of memory between min_server_memory and max_server_memory.
Use min server memory to guarantee a minimum amount of memory available to the buffer pool of an instance of SQL Server. SQL Server does not immediately allocate the amount of memory specified in min server memory on startup. However, after memory usage has reached this value due to client load, SQL Server cannot free memory from the allocated buffer pool unless the value of min server memory is reduced.
Use max server memory to prevent the SQL Server buffer pool from using more than the specified amount of memory, thus leaving the remaining memory available to start other applications quickly.
Usually you should leave these settings at their default values, BUT the following considerations/exceptions apply:
o As there is a short delay between the start of a new application and the time SQL Server releases memory, setting max server memory avoids this delay and may give better performance to the other application.
o With SQL Server 2000 AWE, or SQL Server 2005/2008 AWE on Windows 2000, memory is not managed dynamically. This means min server memory is ignored and max server memory is never released, so it must be set.
min server memory and max server memory are advanced options. If you are using the sp_configure system stored procedure to change these settings, you can change them only when show advanced options is set to 1. These settings take effect immediately (without a server stop and restart).
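For example, capping an instance at 4 GB and guaranteeing 1 GB might look like this (the values are placeholders; size them for your own server):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'min server memory (MB)', 1024;
EXEC sp_configure 'max server memory (MB)', 4096;
RECONFIGURE;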
14. What is RAID and what are different types of RAID levels?
RAID stands for Redundant Array of Independent Disks, earlier called Redundant Array of Inexpensive Disks. It is a storage technology in which one logical unit consists of multiple disk drive components. It increases performance by replicating and dividing data across multiple physical drives. There are 12 RAID levels, which are as follows:
- Level 0: a "striped" disk array (provides data striping) without fault tolerance.
- Level 1: used for "mirroring" and "duplexing".
- Level 2: provides error correction.
- Level 3: provides byte-level striping, also called "bit-interleaved parity".
- Level 4: a "dedicated parity drive"; provides block-level striping.
- Level 5: "block-interleaved distributed parity".
- Level 6: "independent data disks with double parity".
- Level 0+1: "a mirror of stripes", used for replication and sharing of data among disks.
- Level 10: "a stripe of mirrors"; multiple mirrors are created and then striped over.
- Level 7: adds caching to Level 3 or 4.
- Level 50: implemented as a striped array with fault tolerance.
- RAID S: a proprietary striped-parity RAID system.
15. What is normalization? Explain different forms of normalization.
Normalization is a process of organizing data to minimize redundancy in a relational database management system (RDBMS). Normalization decomposes relations with anomalies to produce smaller, well-structured relations. There are 6 forms of normalization, as follows:
- 1NF: the relation has no repeating groups.
- 2NF: no non-prime attribute depends on a proper subset of any candidate key.
- 3NF: every non-prime attribute is non-transitively dependent on every candidate key.
- 4NF: every non-trivial multi-valued dependency in the table is implied by a superkey.
- 5NF: every non-trivial join dependency in the table is implied by the superkeys of the table.
- 6NF: the table features no non-trivial join dependencies at all.
16. What is SQL Service Broker?
Service Broker allows you to exchange messages between applications using SQL Server as the transport mechanism. A message is a piece of information that needs to be shared. Service Broker can also reject unexpected or malformed messages. It ensures messages arrive exactly once and in order, and it provides enhanced security because messages are handled internally by the database.
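A minimal one-database sketch of the moving parts (all object names are hypothetical; the database must already have Service Broker enabled, e.g. via ALTER DATABASE ... SET ENABLE_BROKER):

CREATE MESSAGE TYPE OrderMsg VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT OrderContract (OrderMsg SENT BY INITIATOR);
CREATE QUEUE dbo.OrderQueue;
CREATE SERVICE OrderService ON QUEUE dbo.OrderQueue (OrderContract);

-- send a message to the service (here the initiator and target are the same service)
DECLARE @h uniqueidentifier;
BEGIN DIALOG CONVERSATION @h
      FROM SERVICE OrderService
      TO SERVICE 'OrderService'
      ON CONTRACT OrderContract
      WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @h
     MESSAGE TYPE OrderMsg (N'<order id="1"/>');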
17. ETL - Extraction, Transformation, and Loading.
- It is the process of copying and cleaning data from heterogeneous sources.
- It is an important part of development projects for data warehousing and business intelligence.
18. What is a ghost record?
A: Ghost records are records that have been logically deleted but not yet physically removed from the leaf level of an index.
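Ghost records can be counted per index with a dynamic management function; a hedged sketch (the database and table names are placeholders, and the DETAILED scan mode is needed to populate the ghost columns):

SELECT index_id, ghost_record_count, version_ghost_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(N'MyDb'), OBJECT_ID(N'dbo.MyTable'),
         NULL, NULL, 'DETAILED');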
19. What is instant file initialization and how does it work?
On Windows systems, when SQL Server needs to write something to disk, it first verifies that the disk space is trustworthy, meaning the space is readable. The verification process is:
- SQL Server writes zeros to the file; this is known as the zeroing process.
- SQL Server uses a single thread for the zeroing process.
- The actual operation that triggered the verification has to wait until the verification process is completed.
If instant file initialization is enabled for SQL Server, it skips the zeroing process for data files and so reduces the wait time.
By default it is not enabled in SQL Server.
The database activities below benefit from instant file initialization:
- Creating a database
- Restoring a database from a backup file
- Increasing database data file size manually
- Increasing database data file size due to the auto-growth option
- TEMPDB creation at the time of SQL Server restart
Note: Instant file initialization is available only for data files; log files are always zeroed when they are created or grown.
Security risk: when instant file initialization is enabled, there is a slight security risk. When a new database is created, its pages are not zeroed out, so there is a chance that newly allocated pages contain previously deleted data, and someone could read that data using a recovery tool.
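On recent builds (SQL Server 2012 SP4 / 2016 SP1 and later) you can check whether the engine is actually using instant file initialization; a minimal sketch:

SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE servicename LIKE N'SQL Server (%';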
20. Can you explain the Transaction Log physical architecture?
The transaction log in a database maps over one or more physical files. Conceptually, the log file is a string of log records. Physically, the sequence of log records is stored efficiently in the set of physical files that implement the transaction log. There must be at least one log file for each database.
The SQL Server Database Engine divides each physical log file internally into a number of virtual log files (VLFs). Virtual log files have no fixed size, and there is no fixed number of virtual log files per physical log file. The Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files, and it tries to maintain a small number of virtual files. The size of the virtual files after a log file has been extended is the sum of the size of the existing log and the size of the new file increment. The size or number of virtual log files cannot be configured or set by administrators.
If the log files grow to a large size in many small increments, they will have many virtual log files. This can slow down database startup and also log backup and restore operations. Conversely, if the log files are set to a large size with few or just one increment, they will have few, very large virtual log files.
We recommend that you assign log files a size value close to the final size required, using the required increments to achieve optimal VLF distribution, and also use a relatively large growth_increment value.
21. What are virtual log files (VLFs)?
Each SQL Server transaction log file is made up of smaller parts called virtual log files. The number of virtual log files is neither limited nor fixed per transaction log file. There is also no fixed virtual log file size; it is not the case that if one is, for example, 512 kilobytes, all the others will be the same size.
SQL Server determines the size of a virtual log file dynamically when the transaction log file is created or extended. The goal is to maintain a small number of virtual log files in the transaction log file, because SQL Server handles a smaller number of files more easily. The size or number of virtual log files can't be configured or set by a database administrator.
By default, the SQL Server transaction log file is set to an initial size of 2 MB, and the default growth value is 10% of the current size. While creating a SQL Server database, these options can be modified to accommodate the planned needs of the database. The auto-growth option is optional and turned on by default; file growth can be specified in megabytes or as a percentage, and there is also a clause to limit the maximum file size. By default, SQL Server creates a database with unrestricted file growth.
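For example, pre-sizing a log and giving it a fixed growth increment instead of the 10% default might look like this (the database and logical file names are placeholders):

ALTER DATABASE MyDb
MODIFY FILE (NAME = MyDb_log, SIZE = 8192MB, FILEGROWTH = 512MB);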
If the auto-growth settings are not properly managed, a SQL Server database can be forced to auto-grow, which can cause serious performance issues: SQL Server stops all processing until the auto-grow event is finished. Due to the physical organization of the hard drive, the auto-grow event will take up space that is not physically close to the space previously occupied by the transaction log file. This leads to physical fragmentation of the files, which also causes slower response.
There is no general rule for determining the best values for the auto-growth option, as these vary from case to case. Having too many or too few virtual log files causes bad performance.
There isn't an option in SQL Server Management Studio that shows the number of virtual log files, but the virtual log files can be listed via a T-SQL script for each SQL Server database, as in the sketch below. Another blog post will describe this topic in detail.
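A hedged sketch of the two usual ways (DBCC LOGINFO is undocumented but long-standing; sys.dm_db_log_info requires SQL Server 2016 SP2 or later):

DBCC LOGINFO;   -- returns one row per VLF in the current database

-- newer alternative:
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID());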
The number of virtual log files can be increased by an auto-grow event. This process is common, but it needs strict rules to avoid unplanned problems with space or unresponsiveness during peak hours. The number of virtual log files can be decreased by shrinking the SQL Server transaction log file, which also requires strict rules to avoid deleting data that hasn't been backed up yet.
22. What is a Virtual Log File (VLF) in the Transaction Log File?
Every transaction log file is logically divided into smaller segments, and these segments are called Virtual Log Files, or VLFs.
23. How do Virtual Log Files work internally within the Transaction Log File in SQL Server?
The SQL Server Database Engine defines the size of virtual log files dynamically while creating or extending transaction log files. Virtual log files do not have a fixed size, and one cannot predict the number of VLFs within a transaction log file. Internally, SQL Server will always try to maintain a small number of VLFs; however, there is no way for a database administrator to configure or set a fixed size for VLFs in SQL Server.
Virtual log files are truncated only when they contain no records of an active transaction. The space released by the truncation of VLFs is made available for new transactions. If the transaction log file increments are very small, this results in a large number of small VLFs within the SQL Server transaction log file, causing performance issues.
24. What are the new changes in the SQL Server log architecture in 2014?
A: Yes, the VLF creation algorithm changed in SQL Server 2014, resulting in a smaller number of VLFs compared with the earlier (pre-2014) algorithm.
Before SQL Server 2014:
- Up to 64 MB: 4 new VLFs, each roughly 1/4 the size of the growth.
- 64 MB to 1 GB: 8 new VLFs, each roughly 1/8 the size of the growth.
- More than 1 GB: 16 new VLFs, each roughly 1/16 the size of the growth.
From SQL Server 2014: is the growth size less than 1/8 of the current log size?
- Yes: create 1 new VLF equal to the growth size.
- No: use the old formula above (e.g. 8 VLFs if the auto-growth is > 1/8 of the total log file).
Note: you can find examples and other VLF-related questions in the chapter SQL DBA - General.