Wednesday September 08, 2010

FILESTREAM in SQL Server 2008

by Allen Berezovsky 4. March 2009 09:08

Before SQL Server 2008 came out, there was always a big debate about how we should store BLOBs such as images, PDFs, word documents, etc. Our options were very limited. Either we stored the BLOB on the file system or file server, and stored a reference path to that BLOB in the database, or we went bold, and stored the actual BLOB in the database. Tons of online articles have been written about how you should never store BLOBs in the database, but storing them on the file system had it’s own disadvantages. For example, if someone deleted an image on your file system which is referenced in the database, the database would never know that the link is now broken. Similarly, if one deleted the database record with the link reference to an image on the file system, the image would just sit there as an orphan.

SQL Server 2008 solved these problems by introducing FILESTREAM, which integrates the Database Engine with an NTFS file system by storing varbinary(max) BLOB data as files on the file system, and at the same time maintaining security and referential integrity between the BLOB and the path to that BLOB through SQL Server. Now we can insert, update, query, search, and back up FILESTREAM data through T-SQL commands. Visual Studio 2008 now also has all of the necessary name spaces and API’s to so save and retrieve BLOB’s to and from the database. In this post, I’m going to try to walk you through, from start to finish, how to create an ASP.NET front end, and SQL Server 2008 backend to deal with BLOB’s.

STEP 1: Make sure your database is set up for Filestream:

We are all familiar with the fact that when we create a new database, at least two files are created: a Data file (.MDF) and a Log file (.LDF). We also know that those files exist on one or more filegroups. SQL Server 2008 now has a special section for Filestream filegroups, and a new File Type Filesetream Data. Without setting up a specific Filegroup and File for your filestream data, you will not be able to create a table to hold your filestream.

So first, create a new filegroup as shown below:

Filegroup

Second, add a new database file of type Filestream Data, and specify the newly created Filegroup:

File 

If you scroll to the right in the screen above, you’ll be able to specify the location for your filestream, similarly to the locations you have for your data and your log files. Now, your database is ready for a table that will hold your BLOBs.

STEP 2: CREATE  YOUR TABLE FOR YOUR MEDIA AS SHOW BELOW:

   1:  CREATE TABLE [dbo].[media](
   2:      [mediaId] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
   3:      [fileName] [nvarchar](256) NOT NULL,
   4:      [contentType] [nvarchar](256) NOT NULL,
   5:      [file] [varbinary](max) NULL,
   6:  PRIMARY KEY CLUSTERED 
   7:  (
   8:      [mediaId] ASC
   9:  )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, 
  10:   ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
  11:  ) ON [PRIMARY]
  12:   
  13:  GO
  14:   
  15:  ALTER TABLE [dbo].[media] ADD  DEFAULT (0x) FOR [file]
  16:  GO

STEP 3: CREATE AN ASP.NET APPLICATION FROM WHERE WE WILL UPLOAD FILES

ASPX file: in your default page, drop a simple FileUpload control and a button:

   1:  <html xmlns="http://www.w3.org/1999/xhtml" >
   2:  <head runat="server">
   3:      <title></title>
   4:  </head>
   5:  <body>
   6:      <form id="form1" runat="server">
   7:      <div>
   8:          <h1 style="color:Red">File Stream Example</h1><br /><br />
   9:          <asp:FileUpload ID="FileUpload1" runat="server" />&nbsp;
  10:          <asp:Button ID="btnSave" runat="server" Text="Save" onclick="btnSave_Click" />
  11:      </div>
  12:      </form>
  13:  </body>
  14:  </html>

Code Behind:

   1:  using System;
   2:  using System.Collections.Generic;
   3:  using System.Linq;
   4:  using System.Web;
   5:  using System.Web.UI;
   6:  using System.Web.UI.WebControls;
   7:  using System.IO;
   8:  using System.Data;
   9:  using System.Data.SqlClient;
  10:  using System.Data.SqlTypes;
  11:  using Microsoft.Win32;
  12:  using Microsoft.Win32.SafeHandles;
  13:  using System.Security.Permissions;
  14:   
  15:   
  16:  namespace FileStreamExample
  17:  {
  18:      public partial class _Default : System.Web.UI.Page
  19:      {
  20:          protected void Page_Load(object sender, EventArgs e)
  21:          {
  22:   
  23:          }
  24:   
  25:          protected void btnSave_Click(object sender, EventArgs e)
  26:          {
  27:              #region Database Connection
  28:              //Open a connection to the database
  29:              SqlConnectionStringBuilder conString = new SqlConnectionStringBuilder();
  30:              conString.DataSource = @"YourServer";
  31:              conString.IntegratedSecurity = true;
  32:              conString.InitialCatalog = "YourDB";
  33:              SqlConnection con = new SqlConnection(conString.ConnectionString);
  34:              con.Open();
  35:              #endregion Database Connection
  36:   
  37:              //Start up a database transaction
  38:              SqlTransaction tran = con.BeginTransaction();
  39:   
  40:              //Create a Guid to be used on insert.
  41:              Guid mediaID = Guid.NewGuid();
  42:   
  43:              //Insert a row into the table to create a handle for streaming write.
  44:              SqlCommand cmd = new SqlCommand("INSERT [dbo].[media]([mediaId], [fileName], [contentType]) VALUES( @mediaId, @fileName, @contentType);", con, tran);
  45:              cmd.Parameters.Add("@mediaId", SqlDbType.UniqueIdentifier).Value = mediaID;
  46:              cmd.Parameters.Add("@fileName", SqlDbType.NVarChar, 256).Value = FileUpload1.FileName;
  47:              cmd.Parameters.Add("@contentType", SqlDbType.NVarChar, 256).Value = FileUpload1.PostedFile.ContentType;
  48:              cmd.ExecuteNonQuery();
  49:   
  50:              // Get a filestream PathName token and filestream transaction context.
  51:              // These items will be used to open up a file handle against the empty blob instance.
  52:              cmd = new SqlCommand("SELECT [file].PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() FROM [dbo].[media] WHERE [mediaId] = @mediaId;", con, tran);
  53:              cmd.Parameters.Add("@mediaId", SqlDbType.UniqueIdentifier).Value = mediaID;
  54:   
  55:              // Read in results of query.
  56:              SqlDataReader rdr;
  57:              rdr = cmd.ExecuteReader(CommandBehavior.SingleRow);
  58:              rdr.Read();
  59:              SqlString sqlFilePath = rdr.GetSqlString(0);
  60:              SqlBinary transactionToken = rdr.GetSqlBinary(1);
  61:              rdr.Close();
  62:   
  63:              SqlFileStream sqlFile = new SqlFileStream(sqlFilePath.Value,transactionToken.Value,FileAccess.Write);
  64:              byte[] buffer = new byte[512 * 1024]; //512kb
  65:              int bytesRead = FileUpload1.PostedFile.InputStream.Read(buffer, 0, buffer.Length);
  66:              while (bytesRead > 0)
  67:              {
  68:                  sqlFile.Write(buffer, 0, bytesRead);
  69:                  bytesRead = FileUpload1.PostedFile.InputStream.Read(buffer, 0, buffer.Length);
  70:              }
  71:              #region Connection cleanup
  72:              // Commit transaction, cleanup connection.
  73:              sqlFile.Close();
  74:              FileUpload1.PostedFile.InputStream.Close();            
  75:              tran.Commit();
  76:              con.Close();
  77:              #endregion Connection cleanup
  78:          }
  79:      }
  80:  }

STEP 4: GO AND CHECK WHAT IS IN YOUR TABLE!

Results

Now, you are wondering, is the file column really where your file is? Well, the answer is not really. If you browse on your computer to the folder where you placed your filestream file in step 1, you will notice that there is a folder in there there your file is. The name will be encrypted, and your media table holds an encrypted pointer to that file.

Now…do this experiment for me. Delete the record from your media table, and see if the file disappeared from your file folder.

Did you do it?

If you did, you will notice that the answer is no. Now think for a minute why not! Now realize that the file is not a regular file on your file system. It is a file stream data controlled by the database. Therefore, if you wanted to restore the deleted row from the transaction log, and the actual file was deleted, where would it get the file from? Now…if you truncate your transaction log, you will see that the file will actually disappear. Works like a charm!

CREDITS:

I would like to thank Guy Burnstein and his blog for helping me get started with file stream. He has a lot of good stuff in there on filestream, including information not included in this blog. You can check it out at this link:

http://blogs.microsoft.co.il/blogs/bursteg/archive/2008/05/12/sqlfilestream-managed-api-for-sql-server-2008-filestream.aspx

Tags: , ,

SQL Server | FILESTREAM | SQL Server 2008 | ASP.NET

Hierarchy of a self referencing table

by Allen Berezovsky 20. February 2009 09:47

A former colleague of mine called me the other day with a following problem. She needed to write some reports against a self referencing table. Her initial thought was that she needed to create a number of temp tables, and fill them with the data for each hierarchy level. Her problem also was that she had no idea how deep the hierarchy went. When she got tired of running all kinds of select statements to find out the depth of that hierarchy, she called me for advice. Here is the sample code that I sent her.

Feel free to run this against the AdventureWorks database, and check out the results. The HumanResources.Employee table has an EmployeeID column, and a ManagerID column that references the EmployeeID within the hierarchy. The only catch is, this will not work in SQL 2000 or below.

   1:  WITH CorpStructure(ManagerID,EmployeeID, Title, Level)
   2:   AS
   3:    (
   4:    SELECT Emp.ManagerID, Emp.EmployeeID, Emp.Title, 1 as Level
   5:    FROM HumanResources.Employee as Emp
   6:    WHERE ManagerID is NULL
   7:    UNION ALL
   8:    SELECT Emp.ManagerID, Emp.EmployeeID, Emp.Title, Corp.Level + 1
   9:    FROM HumanResources.Employee as Emp INNER JOIN CorpStructure as Corp
  10:    ON Emp.ManagerID = Corp.EmployeeID
  11:   )
  12:   SELECT ManagerID, EmployeeID, Title, Level
  13:   FROM CorpStructure

Tags: ,

SQL Server

Moving SQL Server Log file to a new location

by Allen Berezovsky 23. January 2009 09:01

Everyone knows, I hope, that it is usually a best practice to have your database log file on a different drive from the data file. But what if you have a situation, where originally, the database was created with both the log and the data file being on the same drive? How do you move the log file to a new location? Here is what you do:

Step 1: Make sure nobody is using the database at the moment. (This whole procedure is better done off business hours).

Step 2: Do a full database backup. (This is not an absolutely required step, but is ALWAYS a good idea before you do something that is somewhat dangerous).

Step 3: After you backup the database, I would shrink the log file so when you move it, it doesn't take forever. (This is also not a required step, and shouldn't be really done if your log file is not huge). If you do decide to perform this step, here is the syntax: DBCC Shrinkfile ('yourDB','1');    Note: the second parameter here is the size in kilobytes to which you want to shrink your log.

Step 4: Detach your database by using this syntax:

USE master
GO
EXEC sp_detach_db @dbname=N'yourDB';
GO

Step 4: However you want, copy the actual yourDB_log.ldf file to whatever location you want. Say for this example, we'll move it to the root of the D drive.

Step 5: Attach your database, and specify the new file locations.

USE master
GO
EXEC sp_attach_db @dbname = N'yourDB',
@filename1 = N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Data\yourDB_Data.mdf',
@filename2 = N'D:\yourDB_log.ldf';
GO

Tags: , , , ,

SQL Server

Updating Statistics VS Recompiling Stored Procedures

by Allen Berezovsky 16. January 2009 05:09

So I was doing a lecture on sql server query optimization at UCLA yesterday, and one of the topics I was covering was statiscs. As we know, statistics is information that the sql server collects about the distribution of data in columns and indexes, and they are used by the query optimizer to determine the best execution plan. If the database option to update statistics automatically is on, which is the default, statistics will be updated as soon as the need is discovered. In general, if your table has over 500 rows in it, and at least 500 plus 20% of the rowcount rows have been modified, statistics are out of date. Therefore, when that happens, and a user does a select, sql server will first update the statistics to get the best execution plan, and then work on returning your query results. While that sounds good in theory, in practice, it kills your query performance because wating for statistics to update when you're trying to get data returned to your application is very costly. I talked about being proactive in updating statistics with good maintenance plans, etc. I also talked about the possibility of turning on the option of updating statistics asynchronously on the database. What that will do, is alow your query to return data using the old execution plan, while the statistics are updating on the different thread.

As I was discussing these issues, a great question came up. Is updating statistics sufficient to make sure that our stored procedures use the latest and greatest execution plan? The answer is, not necessarily. Before SQL Server 2005, the execution plan for a stored procedure was saved during initial compile, and if you changed your table structure or added new indexes to a table after that compile, you would have to make sure to recompile your stored procedure in order for it to generate a new and better execution plan. Since the 2005 version, there has been a slight improvement to when the optimization of your procedure's execution plan occurs. Now, the optimization happens automatically when the first time a stored procedure is executed right after SQL Server services are restarted. It will also happen automatically if an underlying table used by the stored procedure changes. However, if you add a new index from which a stored procedure might benefit, and you do not want to wait till the SQL Server is restarted for your stored procedure to reap the benefit of your new index. In this case, you might want to force the stored procedure to recompile the next time it executes, and for that you can use the sp_recompile system stored procedure.

Tags: , , ,

SQL Server | SQL Server 2008

Calendar

<<  September 2010  >>
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
27282930123
45678910

View posts in large calendar

Credentials

MCITP


MCTS


MCP

Powered by BlogEngine.NET 1.4.5.14