Archive

Posts Tagged ‘Tips & Tricks’

Getting the source table of output columns for a SQL Server 2005 stored procedure

March 25th, 2009

Today’s tip is a relatively simple (but limited) way to programmatically extract the output columns of a SQL Server stored procedure or query along with the tables they come from.

This is something that came up at work. I didn’t want to do it by hand because there were many stored procedures to check, but it was difficult to automate because they could be (and were indeed) quite complex, with many selects and joins, computed columns, aliases and so on, and to make things worse the SQL code had no standard indentation or structure, so string searching and regular expressions were not reliable options. Parsing the Transact-SQL code would have been way harder than anything else, so it was out of discussion too. Unfortunately, the limitations of this solution (that I’ll mention later) stopped us from using and we had to do the job by hand.

Lazy as I am, I wanted to make SQL Server itself give me the information I needed. After meditating about it for a while an idea came to my mind: I could try to extract the information from the execution plan.

You probably know it, but the query execution plan that you can see in Microsoft SQL Server Management Studio (SSMS from now on) comes from the SQL Server engine in XML format -SSMS just displays it in a graphical form. This XML is a real gem that provides lots of information about the query (even things that aren’t displayed in the execution plan tab), and you can see it by setting the “SET STATISTICS XML” option to ON before you run your query. For example, let’s run the following query (just a plain SQL query so you can see what it’s going on, but remember that it’s the same for stored procedures) that joins the Product and ProductSubcategory tables of the AdventureWorks sample database:

ssms 01

Notice that I got two result sets. The first is the query’s result, and the second is the execution plan in XML format (if you don’t see it, make sure the “Include Actual Execution Plan” option isn’t turned on). If you click on the link SSMS will open the XML in a new window:

ssms 02

Look at the OutputList node in there. It contains all the columns that appear in the result, each one within a ColumnReference element:

<ColumnReference Database=[AdventureWorks] Schema=[Production] Table=[Product] Alias=[p] Column=ProductID />

<ColumnReference Database=[AdventureWorks] Schema=[Production] Table=[Product] Alias=[p] Column=Name />

<ColumnReference Database=[AdventureWorks] Schema=[Production] Table=[Product] Alias=[p] Column=ProductNumber />

<ColumnReference Database=[AdventureWorks] Schema=[Production] Table=[Product] Alias=[p] Column=Color />

<ColumnReference Database=[AdventureWorks] Schema=[Production] Table=[Product] Alias=[p] Column=StandardCost />

<ColumnReference Database=[AdventureWorks] Schema=[Production] Table=[ProductSubcategory] Alias=[psc] Column=Name />

As you can see the information is quite detailed: for each column you have the database, the table and the schema it belongs to. A non-intuitive thing to keep in mind is that the Alias attribute is not the column’s alias, but the table’s.

Armed with this knowledge we can now create a tool to extract the information automatically: it’s just a matter of executing the stored procedure and retrieving the execution plan to process it. But let’s do one last thing before jumping to the code. I don’t know about you, but I really hate to work with XML -I’d rather work against an object model. It turns out that we can do just that: the ShowPlan’s schema is publicly available for download, and with it and the XML Schema Definition Tool (xsd.exe, located at C:\Program Files\Microsoft SDKs\Windows\v6.0A\bin\xsd.exe on my machine) we can create the classes needed to take the execution plan’s XML and deserialize into an object graph. Run this command to generate the classes (I renamed the showplanxml.xml.xsd file I downloaded to showplanxml.xsd):

xsd C:\showplanxml.xsd /classes /o:C:\

This will create a file in C: called showplan.cs with several classes within it (the root being ShowPlanXML) that we can now include in a new project in Visual Studio (you may want to add a namespace to the generated classes). The code required to get a ShowPlanXML instance from the execution plan is really simple:

    using (XmlReader reader = XmlReader.Create(“ExecutionPlan.xml”, new XmlReaderSettings()))

    {

        XmlSerializer serializer = new XmlSerializer(typeof(ShowPlanXML));

        ShowPlanXML plan = (ShowPlanXML)serializer.Deserialize(reader);

        Console.WriteLine(plan.Version);

    }

In the code snippet above I’m loading the XML from a local file just for demo purposes. In the real application I read it from a SqlDataReader after executing the stored procedure. I’m printing the version just to make sure the deserialization worked. But let’s do something more interesting:

    foreach (StmtBlockType[] blocks in plan.BatchSequence)

    {

        foreach (StmtBlockType block in blocks)

        {

            foreach (BaseStmtInfoType item in block.Items)

            {

                StmtSimpleType statement = item as StmtSimpleType;

                if (statement != null)

                {

                    foreach (ColumnReferenceType columnReference in statement.QueryPlan.RelOp.OutputList)

                    {

                        Console.WriteLine(columnReference.Table + ” - “ + columnReference.Column);

                    }

                }

            }

        }

    }

Here I go through all the batches and statements until I reach the column references. The BaseStmtInfoType class has 4 derived classes: StmtCursorType, StmtReceiveType, StmtSimpleType, and StmtUseDbType (all these types are generated by xsd.exe too). You can tell more or less what case each one stands for based on the names. Since I’m looking for statements I only handle the case when the item is a StmtSimpleType instance. When I compile and run the code I get the following output:

console output

Voilà! This is just what I wanted. Now some things to consider about the structure of the execution plan:

  • For each step in the execution plan you’ll see a RelOp node (with its corresponding OutputList node). These nodes are nested according to the order they were executed. In our case:
    ssms 03
    We’ll have 3 RelOp nodes: the outermost one will be a Hash Match, containing two others, an Index Scan and a Clustered Index Scan (this is probably easier to understand if you see the graphical execution plan and the XML document side by side).
    The good news is that for our purposes we only need to worry about the first RelOp node, since it corresponds to the last operation (the Hash Match).
  • If you have more than one SQL statement in your stored procedure you’ll get execution plans for each of them. If you just care about the last result set you only need to process the last execution plan:
    ssms 04
  • SET STATISTICS XML ON will execute the query. The benefit of this is that you can extract the columns aliases from the SqlDataReader and match them by position with columns in the XML execution plan, provided you don’t have computed columns (see “Limitations” later). If you don’t need the aliases you can use the SET SHOWPLAN_XML option instead, which will generate the execution plan but will not actually execute the query, and therefore should be faster.

Limitations

This technique has 3 big limitations:

  • If you have computed columns the resulting XML will be much more complex. For example, let’s add a new column computed with a CASE statement to our query:
    ssms05
    Now the execution plan will look like this:
    ssms06
    As you can see the column is referred to as “Expr1004″. To actually find the expression you will have to navigate (possibly deep) into the document:
    ssms07
    The entire CASE expression is in the XML, and you can get to the actual column used (MakeFlag) but it can be tedious (and this is intended to be a simple trick).
  • You can’t determine the column aliases. The XML document doesn’t contain the aliases. At first I thought I could match the column references in the OuputList node to the column names from the query result by position, but unfortunately the columns in the XML are not necessarily in the same order. I was quite disappointed when I saw this because it makes this technique far less powerful (I’ve only checked in SQL Server 2005, perhaps things are better in SQL Server 2005 SP2 or SP3, or SQL Server 2008, but I don’t have any of them handy to check).
  • If you have conditional logic the parameters that you pass to the stored procedure matter. This what you would expect since we are working with an execution plan anyway.

Well that’s all. I’m planning to upload a sample application for all this, but I’m too tired to do it tonight, so wait for the next post if you are interested. Hope this helps, happy coding!

Julián Hidalgo SQL Server, Tips & Tricks ,

Little trick to query a SQL Server database’s schema

March 4th, 2009

This is the easiest way to query the schema of a database I know of:

  • First go to SQL Server Management Studio, expand the databases node and go to the list of tables. In this example I’m using the AdventureWorks database:

  • Launch the SQL Profiler and start a new trace.
  • Right-click the tables node and select “Refresh”:

  • Now you can see the the query that the SSMS used in the SQL Profiler:

And that’s it, you just have to customize it to fit your needs. You can apply this to query other type of objects of course, like indexes, keys and so on (basically anything you see in SSMS). For example, the following query will list all the columns and the table they belong to:

SELECT
tbl.name as TableName,
clmns .name AS [Name],
CAST (ISNULL(cik.index_column_id, 0) AS bit) AS [InPrimaryKey],
CAST (ISNULL((select TOP 1 1 from sys.foreign_key_columns AS colfk where colfk.parent_column_id = clmns.column_id and colfk.parent_object_id = clmns.object_id), 0) AS bit) AS [IsForeignKey],
usrt.name AS [DataType],
CAST (clmns.precision AS int) AS [NumericPrecision],
CAST (clmns.scale AS int) AS [NumericScale],
clmns.is_nullable AS [Nullable],
clmns.is_computed AS [Computed]
FROM
sys.tables AS tbl
INNER JOIN sys.all_columns AS clmns ON clmns.object_id=tbl.object_id
LEFT OUTER JOIN sys.indexes AS ik ON ik.object_id = clmns.object_id and 1=ik.is_primary_key
LEFT OUTER JOIN sys.index_columns AS cik ON cik.index_id = ik.index_id and cik.column_id = clmns.column_id and cik.object_id = clmns.object_id and 0 = cik.is_included_column
LEFT OUTER JOIN sys.types AS usrt ON usrt.user_type_id = clmns.user_type_id
ORDER BY tbl.name, clmns.name

To obtain it I refreshed the list of columns in a table, and from the resulting query I removed a couple of joins and the where clause, and added a different ORDER BY clause.

An extra bonus of this trick is that by examining the queries SSMS uses you’ll learn more about SQL Server itself.

Hope it helps :)

 

Julián Hidalgo SQL Server, Tips & Tricks ,