What are MDF and NDF files in SQL Server?

MDF: Primary database file: This is where everything in the database is normally stored. Where there are multiple database files (see NDF files below), the .MDF file will always store the database’s internal configuration information and other internal system data. A database can have only one primary database file, and no two databases can share the same primary database file. By default, SQL Server creates one primary data file and one log file. The primary database file has a .mdf extension.

NDF: Secondary database files: Secondary database files store data that does not fit in the primary database file. They can be used to spread data across multiple disks by placing each file on a different disk drive. By default, SQL Server does not create an .ndf file. NDF files can be created when the database is created, or added later using the ALTER DATABASE statement. A secondary database file has a .ndf extension.
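As a sketch of the ALTER DATABASE route (the database name, logical file name, path, and sizes below are illustrative, not prescriptive), a secondary data file can be added to an existing database like this:

```sql
-- Add a secondary data file (.ndf) to an existing database.
-- Database name, logical name, path, and sizes are examples only.
ALTER DATABASE SalesDB
ADD FILE
(
    NAME = SalesDB_Data2,                       -- logical file name
    FILENAME = 'E:\SQLData\SalesDB_Data2.ndf',  -- physical path, ideally on another disk
    SIZE = 512MB,
    FILEGROWTH = 256MB
);
```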

LDF: Transaction log files: Every database contains one primary database file and at least one transaction log file. (Log files are written to sequentially, so there is no benefit in having more than one unless you exceed the maximum log file size (2TB) between backups.) The transaction log files hold the log information used to recover the database. The transaction log is a serial record of all modifications that have occurred in the database, together with the transaction that performed each modification. It records the start of each transaction, the changes to the data, and enough information to undo those modifications should that later become necessary. Data and transaction log information are never mixed in the same file, and every database must include at least one transaction log file. Transaction log files have a .ldf extension.
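A quick way to see all three file types for a database is to query sys.database_files, where type_desc reports ROWS for .mdf/.ndf files and LOG for .ldf files (the database name below is illustrative):

```sql
-- List every file in the current database with its type, path, and size.
USE SalesDB;  -- example database name
SELECT name,                      -- logical file name
       type_desc,                 -- ROWS (data) or LOG (transaction log)
       physical_name,             -- .mdf / .ndf / .ldf path on disk
       size * 8 / 1024 AS size_mb -- size is stored in 8KB pages
FROM sys.database_files;
```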

Tempdb is a special system database used in certain key operations, and it has a big performance impact on your overall system. Tempdb handles a lot of system tasks as well as user-created temp tables. Query operations like joins and aggregations happen there. Online index rebuilds and INSTEAD OF triggers are also done in tempdb. By default, tempdb is installed with just one data file and one log file.

If your tempdb is on a shared disk, it is probably best to move it to a local disk. There are two reasons why you might want to move tempdb from a shared disk to a local disk, and both are related to performance.

The first reason is that the relatively recent increase in cost-effective, ultra-fast solid-state storage presents an opportunity for significant performance gains on servers experiencing heavy tempdb usage. The challenge prior to SQL Server 2012 was that solid-state storage cards plug straight into a server’s motherboard to avoid the overhead of traditional storage buses*, which made them very difficult to use in failover cluster instances. From SQL Server 2012 onward, they can be used for the discrete task of running tempdb.

The second reason you might want to use a local tempdb is to take I/O requests off your shared storage to improve the performance of the shared storage.

*Until SQL Server 2012, a failover cluster instance of SQL Server required all its database files to be on shared disk resources within the cluster. This was to ensure that when the instance failed over to another node in the cluster, all its dependent disks could be moved with it. Nothing in tempdb persists after a restart and it’s effectively recreated every time. The failover process for a clustered instance involves a restart of SQL Server so nothing in tempdb needs to be moved across to the other node and there’s no technical reason why tempdb should be on a shared disk.

A filegroup in SQL Server is a named collection of one or more files that forms a single unit for data allocation or administrative purposes. For example, two files can be created on two separate disks (e.g. f1 and f2) and assigned to the filegroup filegroup1. Filegroups let you target specific operations, primarily backups and restores, at a single filegroup, and thereby at all the files it contains.

Let’s say that, for performance reasons, you’ve split your database across three .NDF files in addition to the main .MDF file. Each of those four files is located on a separate disk volume, and each volume is accessed by a different disk controller. That’s great for parallel operations, and it should help performance in the system. But now you have to back up and recover four files, making a lot more work for yourself. Instead, you can group those into a single filegroup, and then run backups against that filegroup, grabbing all four files at once.
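As a sketch of that layout (the database name, filegroup name, and drive paths are illustrative), the secondary files can be placed in one named filegroup at creation time and then backed up as a unit:

```sql
-- Create a database whose secondary files live in one named filegroup,
-- each on a different volume. Names and paths are examples only.
CREATE DATABASE SalesDB
ON PRIMARY
    (NAME = SalesDB_Primary, FILENAME = 'D:\SQLData\SalesDB.mdf'),
FILEGROUP DataFG
    (NAME = SalesDB_Data1, FILENAME = 'E:\SQLData\SalesDB_Data1.ndf'),
    (NAME = SalesDB_Data2, FILENAME = 'F:\SQLData\SalesDB_Data2.ndf'),
    (NAME = SalesDB_Data3, FILENAME = 'G:\SQLData\SalesDB_Data3.ndf')
LOG ON
    (NAME = SalesDB_Log, FILENAME = 'L:\SQLLogs\SalesDB.ldf');

-- Back up that filegroup: all of its files are captured in one operation.
BACKUP DATABASE SalesDB
    FILEGROUP = 'DataFG'
TO DISK = 'B:\Backups\SalesDB_DataFG.bak';
```

Note that the .MDF itself always belongs to the built-in PRIMARY filegroup, so a full-database backup still covers it alongside the filegroup backups.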

Another option for using filegroups is to assign tables or indexes to different FILEGROUPs and then place the files for those FILEGROUPs on different physical drives, RAID arrays, or LUNs. This way it’s possible to allow SQL Server to engage in parallel I/O operations that can help speed the execution of more complex queries. Two common techniques for the use of multiple FILEGROUPs and files are:

  • Placing tables on the PRIMARY FILEGROUP, then creating key, non-clustered, indexes on an ‘INDEX’ FILEGROUP – which has files spread across other physical locations. This, in turn, allows SQL Server to engage one set of drives (or spindles) to do bookmarking activities against the table itself, while using a completely different set of drives/spindles to seek or scan against an index – rather than waiting on the same I/O subsystem from a single set of drives/spindles to queue those requests and process them as needed.
  • Put heavily used tables (via JOINs) in a ‘SECONDARY’ FILEGROUP – which has files placed on different physical drives/spindles. This, in turn, then allows quicker completion of commonly used JOINs by allowing quicker completion of underlying I/O operations as these operations are handled by different drives or sets of drives.
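A minimal sketch of the first technique, assuming hypothetical filegroup, table, and column names (the ‘INDEX’ filegroup must already exist with files on a separate set of drives):

```sql
-- Assumes a filegroup named [INDEX] has already been added, e.g.:
--   ALTER DATABASE SalesDB ADD FILEGROUP [INDEX];
--   ALTER DATABASE SalesDB ADD FILE (...) TO FILEGROUP [INDEX];

-- The table (and its clustered index) stays on PRIMARY...
CREATE TABLE dbo.Orders
(
    OrderID   int IDENTITY PRIMARY KEY,
    Customer  int NOT NULL,
    OrderDate date NOT NULL
) ON [PRIMARY];

-- ...while a key non-clustered index lives on the INDEX filegroup,
-- so seeks/scans and bookmark lookups hit different drives/spindles.
CREATE NONCLUSTERED INDEX IX_Orders_Customer
    ON dbo.Orders (Customer)
    ON [INDEX];
```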

File placement.

It’s best practice to separate data, transaction logs, and tempdb. The point of this recommendation is to separate the different types of workload onto different physical storage, i.e. separate physical disks.

This is still a valid recommendation for environments where you can guarantee that separation, but more commonly we see SQL Server deployed in a shared storage environment, where physical separation is much harder to achieve and usually isn’t even necessary for performance reasons.

It is still a good idea however to maintain separation to help with manageability so that potential problems are easier to isolate. For example, separating tempdb onto its own logical disk means that you can pre-size it to fill the disk without worrying about space requirements for other files, and the more separation you implement the easier it is to associate logical disk performance to specific database files.

The general objectives are as follows:

  1. Optimize parallelism of IO.
  2. Isolate different types of IO from each other that may otherwise cause a bottleneck or additional latency, such as OS and page file IO from database IO, or sequential log file IO from random data file IO.
  3. Minimize management overheads by using the minimum number of drive letters or mount points required to achieve acceptable performance.

In order to achieve objectives 1 and 2, we recommend splitting out data files and tempdb files from log files onto separate drive letters or mount points. This has the effect of killing two birds with one stone. By separating log files onto their own drive or mount point, you maintain the sequential nature of their IO access pattern and can optimize this further at the hypervisor and physical storage layer later if necessary. If the log files share a drive or mount point with other files, the access pattern of that device will instantly become random, and random IO is generally harder for storage devices to service. At the same time, you increase the parallelism available for the IO patterns of the data files and tempdb files.

To achieve greater IO parallelism at the database and operating system layer, you need to allocate more drive letters or mount points. The reason for this is that each storage device (mount point or drive) in Windows has a certain queue depth, depending on the underlying IO controller type being used. Optimizing the total number of queues available to the database by using multiple drives or mount points allows more commands to be issued to the underlying storage devices in parallel. We will discuss the different IO controllers and queue depths in detail later.

As a starting point for standalone database instances, we recommend that you configure a drive letter or mount point per two data files and one tempdb file. This recommendation assumes each file will not require the maximum performance capability of the storage device at the same time. The actual number of drive letters or mount points you need will be driven by your actual database workload, but having fewer drives and mount points simplifies your design and makes it easier to manage.

The more users, connections, and queries, the higher the IO requirements will be, and the higher the queue depth and parallelism requirements will be, and the more drive letters and mount points you will need.

Number of files.

The usual recommendation from Microsoft is one tempdb data file per CPU core, but that is a generalisation; a more accurate recommendation is as follows:

[Table: recommended number of tempdb data files by logical processor core count]

Some people recommend making the number of tempdb data files equal to 1/4 to 1/2 the number of logical processor cores, and this may be perfectly fine; it all depends on your setup and workload, but the above table is a good place to start.
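As an illustrative sketch (the file count, paths, and sizes are examples, not a recommendation), extra tempdb data files are added with ALTER DATABASE; keeping them the same size matters, because SQL Server’s proportional-fill algorithm balances allocations evenly only across equally sized files:

```sql
-- Add three more tempdb data files so a 4-core server has four in total.
-- Paths and sizes are examples; keep all tempdb data files the same size.
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdb2.ndf',
          SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev3, FILENAME = 'T:\TempDB\tempdb3.ndf',
          SIZE = 1024MB, FILEGROWTH = 256MB);
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev4, FILENAME = 'T:\TempDB\tempdb4.ndf',
          SIZE = 1024MB, FILEGROWTH = 256MB);
```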

Here is an example of data files, tempdb files, and transaction log files allocated to a SQL Server 2012 database on a sample system with four CPU cores and 32GB RAM.

[Figure: example allocation of data, tempdb, and log files across separate drives]

When formatting a partition you are given the option to choose the allocation unit (cluster) size, which defaults to 4KB. The allocation unit size determines the smallest amount of disk space a file can consume: if a file is 7KB in size, it will occupy two 4KB allocation units.

SQL Server files are usually large. Microsoft recommends the use of a 64KB allocation unit size for data, logs, and tempdb. If you use allocation units greater than 4KB, Windows will disable NTFS data compression (this does not affect SQL Server compression).
