August 13, 2019

Strangest Things, Pt. 3: Linked Servers in SQL

Jared Kirkpatrick

As a consultant, you experience different environments, with different architectures, different skill sets, and different processes. Usually there’s just one thing in common: They have serious issues that need immediate attention, such as the misuse of linked servers in SQL.

Once consultants have some experience under their belts, they can usually spot symptoms that can be validated quickly while engineering a solution to get the client back on track. Other times, even the most experienced professional encounters the strangest things and is propelled into a warped, upside down environment. Even though this realm is without logic, you still have to find the root issue and solve it.

Just The Facts

For this edition of the series, the client resembles the old “Dragnet” show, so we’ll refer to the client as “Friday.” Friday has an enterprise environment that separated their development/staging and production environments with a separate DMZ. The Friday team had been working on an internal application that they promoted to the production environment after many months of development and staging approval.

The application was functional and, while not blazing fast, did meet their committees’ threshold for responsiveness in the validation stage. Friday went through a normal change control process to propagate this app to production. But as soon as they put it in production, the application took more than 17 times longer to run. And that was when it actually completed; otherwise, it would simply stall out.

Friday re-verified that the production code was the same as the code in staging. The servers in production were the same class of servers as in dev/staging. However, the production servers had at least twice the amount of resources and usually more. In other words, if staging had two CPUs, production had at least four, if not eight. If staging had 16 GB of memory, production had at least 32, if not 64, and so on across the board for disk and network bandwidth.

Clearly, it was unusual for the same code base to perform so poorly in this environment. It should have easily outperformed the previous environment, rather than the other way around. My first thought was that production must have a lot more activity that blocked the app. Alas, I validated there was not any unusual blocking. Additionally, in staging Friday had placed a load to estimate traffic, but in production this application was not live yet. So, testing involved only a single thread/client connected to the application and the database.
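A blocking check of the kind described here can be sketched in T-SQL against the standard dynamic management views. This is a minimal sketch, not the exact query used on the engagement; it assumes you have VIEW SERVER STATE permission and run it while the slow process is active.

```sql
-- List requests that are currently blocked, worst first.
-- An empty result while the application is slow suggests blocking is not the cause.
SELECT
    r.session_id,
    r.blocking_session_id,              -- session holding the resource
    r.wait_type,
    r.wait_time       AS wait_time_ms,  -- current wait, in milliseconds
    r.wait_resource,
    t.text            AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0
ORDER BY r.wait_time DESC;
```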

Searching for the Usual Suspects

This was indeed one of the strangest things that could happen, so I went straight to eliminating everything external. My assumption was that something had to be wrong with the application server or the network. I traced the application for a few simple processes, found the stored procedures that covered those processes, and executed them locally in SQL Server Management Studio.

If the process came back OK, this wasn’t a database issue. If it wasn’t OK, we were in for some strange times indeed. Running the stored procs manually on the local server did nothing to help. I then asked if there could be a hardware issue. This too was a dead end, as the server was a virtual SQL Server. Friday was able to stand up another SQL Server on a different host in under an hour, with the same results.

So, nothing left to do but dig in and start tracing the code all the way through the process. I did this in staging, and while I would have done things differently, there was not anything flawed with the logic that would have kept me from migrating the code myself in a code review. The only red flag I had was the use of synonyms when referencing data in another database locally.

Next, I looked at the DMVs in production to find the longest waiting task and then looked up the stored procedure that corresponded to that piece of code. Once I found it, I ran a Redgate compare of the staging and production versions, and they were identical. I went through the code and saw the same thing, but when I ran just this part of the code in production, it ran forever and had a different execution plan than in staging.
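One way to do this kind of DMV lookup is sketched below. It is an illustration rather than the query actually used: it lists the active requests with the longest current waits and resolves each one to its stored procedure name where possible. As a hedged hint, time spent waiting on a linked server call often shows up under the OLEDB wait type.

```sql
-- Top waiting requests, mapped back to the procedure that issued them.
SELECT TOP (10)
    r.session_id,
    r.status,
    r.wait_type,
    r.wait_time                      AS wait_time_ms,
    r.total_elapsed_time             AS elapsed_ms,
    DB_NAME(t.dbid)                  AS database_name,
    OBJECT_NAME(t.objectid, t.dbid)  AS procedure_name,  -- NULL for ad hoc batches
    t.text                           AS batch_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID         -- ignore this diagnostic query itself
ORDER BY r.wait_time DESC;
```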

Solving the Mystery

Were there different indexes? Were statistics not up to date? Were the right indexes not being selected? On all the big datasets, production always did a table scan, even though there should have been up-to-date indexes supporting these queries.
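Questions like these can be checked directly. The sketch below reports when each statistics object on a table was last updated and how many modifications have accumulated since. The table name dbo.BigTable is a hypothetical placeholder, and sys.dm_db_stats_properties is only available on reasonably recent builds of SQL Server.

```sql
-- Statistics freshness for one table (dbo.BigTable is a placeholder name).
SELECT
    OBJECT_SCHEMA_NAME(s.object_id)  AS schema_name,
    OBJECT_NAME(s.object_id)         AS table_name,
    s.name                           AS stats_name,
    sp.last_updated,
    sp.rows,
    sp.modification_counter          -- changes since the last update
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'dbo.BigTable')
ORDER BY sp.last_updated;
```

Running the same query in staging and production makes index and statistics differences easy to spot side by side.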

When looking at every table in the query, I finally found the strange issue. In staging the client used the synonym to reference another table in another database, but, in production, the synonym made a linked server call.
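A difference like this is easy to miss because the synonym name is identical in both environments and only its target changes. A minimal sketch for surfacing it, run in the same database on each server, is to list every synonym with its target and flag the ones whose base object is a four-part (linked server) name:

```sql
-- Synonyms and where they really point.
-- PARSENAME(..., 4) returns the server part, or NULL when the target is local.
SELECT
    SCHEMA_NAME(s.schema_id)              AS synonym_schema,
    s.name                                AS synonym_name,
    s.base_object_name,
    PARSENAME(s.base_object_name, 4)      AS target_server,
    PARSENAME(s.base_object_name, 3)      AS target_database
FROM sys.synonyms AS s
ORDER BY target_server DESC, s.name;      -- linked server targets sort first
```

Any row with a non-NULL target_server in production but not in staging is a candidate for the behavior described above.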

There are right ways and wrong ways to leverage the very impressive tool of linked servers in SQL, as documented by our own Randy Knight in “Using Linked Servers the Right Way.” However, I am not sure Friday knew of Mr. Knight’s blog, because they didn’t do it the right way. When you reference data in Server B from Server A, Server A does not know the metadata of Server B. That means it does not know how much data there is coming, what indexes could help with data retrieval, or if the data could fill up the buffer cache and hurt other processing. Therefore, it had to run this data query outside normal, internal, highly optimized routines.


This single dataset, which ran in the virtual memory space and was then pushed into the buffer cache, brought this application to its knees, with just a single user signed in! The solution was a little different from what Mr. Knight showcased, but it followed the same basic principles: every time Friday needed data from the external server, they used the linked server to call a stored procedure remotely on Server B, which returned a result set. The calling proc inserted that result set into a temp table on Server A, and then Server A processed the rest of the routine as before.
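The pattern can be sketched as follows. Every name here is hypothetical (the linked server SERVERB, the database RemoteDb, the procedure dbo.usp_GetOrders, and the tables and columns), since the original does not show Friday's code. The sketch also assumes RPC Out is enabled on the linked server; depending on configuration, INSERT ... EXEC against a linked server may also involve a distributed transaction.

```sql
-- 1. Local temp table shaped like the remote procedure's result set.
CREATE TABLE #RemoteOrders
(
    OrderId     INT            NOT NULL PRIMARY KEY,
    CustomerId  INT            NOT NULL,
    OrderTotal  DECIMAL(18, 2) NOT NULL
);

-- 2. Let Server B do the filtering with its own indexes and statistics,
--    and ship back only the rows that are needed.
INSERT INTO #RemoteOrders (OrderId, CustomerId, OrderTotal)
EXEC [SERVERB].RemoteDb.dbo.usp_GetOrders @Since = '20190101';

-- 3. The rest of the routine runs entirely on Server A against local data.
SELECT c.CustomerName, SUM(o.OrderTotal) AS TotalSpent
FROM dbo.Customers AS c
JOIN #RemoteOrders AS o
    ON o.CustomerId = c.CustomerId
GROUP BY c.CustomerName;

DROP TABLE #RemoteOrders;
```

The design choice is that the remote work happens as one bounded call, and everything afterward joins against a local temp table that the optimizer on Server A can cost normally.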

Surviving the Upside Down…and linked servers in SQL

The moral of this story? Sometimes you find yourself facing the ghost in the machine that warps you directly into the strange realm of impossibility. That’s when you take a step back and start eliminating everything you can by making small, simple processes. Do this and you, too, may survive linked servers in SQL, even when it is turned upside down.


Jared Kirkpatrick

My life in the tech world started at Dell a little before the Windows 95 release. I started my MCSE there and took the Database track toward an MCDBA and never looked back. I’ve been heads down in data ever since. My first exposure to SQL Server was with SQL 6.5. In addition to schooling at Wentworth Military Academy and two years at Southwestern University in San Marcos, Texas, my certifications are MCP, MCP+I, MCSE, and MCDBA. During my SQL journey, I have been a Supporting Author of SQL Server 2000 Blackbook and a presenter at several PASS Chapters and at SQLSaturday, since it was still in the single digits! I was the IBM Principal Consultant for Honeywell and HR Logic, and Senior Developer for Match.com, where I met Randy Knight, who started SQL Solutions Group! I have an extensive background with medical systems for employers like Maxium, Broadlane, Medassets, Press Gainey, and Florida Cancer Specialists, to name a few. When I am not working, I spend as much time as I can with my family outdoors.
