Last week at my work I had to write a tool to process few hundreds of thousand email files and insert their details to the MS SQL Server 2005 database. Doing it in traditional way might require much connectivity to the database and will consume many resources. I was wondering what if I could insert those whole hundreds thousands of data at once to the db which will be much more efficient. I started googgling and I found OPENXML. This is not a whole new technology but I didn’t know about this before (strange!!).
And few more minutes of research I found out how to perform bulk operations against the databases using OPENXML and stored procedures.
What is OPENXML?
OPENXML is a function added to SQL Server 2000 that provides a row set view over an XML document. Since a row set is simply a set of rows that contain columns of data, OPENXML is the function that allows an XML document to be treated in the familiar relational database format. It allows for the passing of an XML document to a T-SQL stored procedure for updating the data.
- It extends the SQL Language
- It is used within T-SQL Stored Procedures
- XML Document passed as parameter
- It uses row and column selectors utilizing XPath
- It supports the following:
- Attribute and element-centric mappings.
- Edge table row set.
- XML annotation/overflow column.
- Hierarchy support.
OPENXML and ADO.Net
This code block explains the usage of OPENXML to insert bulk data to database. It simulates the actual process I wanted ( processing .eml files) by reading files in a folder and saving their names, created date and size in bytes.
- First create the table below:
CREATE TABLE FileDetails(
FileName varchar(50) PRIMARY KEY,
CreatedDate varchar(50) ,
Size decimal(18, 0) )
- And then create the stored procedure below:
CREATE PROC sp_bulkinsert @xmldata varchar(max)
DECLARE @hDoc int
exec sp_xml_preparedocument @hDoc OUTPUT,@xmldata
Insert Into FileDetails
SELECT FileName, CreatedDate, Size
FROM OPENXML (@hdoc, ‘/ds/fileDetails’,2)
WITH (FileName varchar(50), CreatedDate Datetime, Size decimal) XMLFileDetails
Where XMLFileDetails.FileName Not IN (Select FileName from FileDetails)
EXEC sp_xml_removedocument @hDoc
- The below sample code snippet shows how to call this stored procedure to insert bulk data to database at once.
- We can retrieve the file details using FileInfo() objects and create a Dataset of the file details and then we can convert this Dataset to xml using WriteXml() Method of the Dataset.
private static string preparexml(DataSet ds)
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
This xml representation of the data can be passed as the parameter to the stored procedure as given below :
private static void bulkInsert(string xml)
SqlConnection con = creatCon();
SqlCommand cmd = new SqlCommand(“sp_bulkinsert”, con);
cmd.CommandType = CommandType.StoredProcedure;
Console.WriteLine(“Completed inserting file informations to database”);
That’s all simple isn’t it? You can see the performance of this method by writing another code to insert the same fields using the traditional 1 at a time way. Which surprisingly will show a huge difference in performance and time.
You can use the same way to update, delete records in bulk.
The complete code can be downloaded and tested in the below link :
Complete Source Code