Using Linked Servers the Right Way

Linked Servers are a very popular way to access distributed data in environments with lots of SQL Server instances on different serves, or even on other Database Management Systems like Oracle or MySQL. While linked servers can be very useful if used correctly, I have found that more often than not, they are not used correctly and are a source of lots of performance problems.  In this article we’ll take a look at using linked servers the right way (and a few wrong ones).

Linked Server Review

To start with, lets look at how a linked server works.  We configure a linked server by specifying the remote data source and a name for the sever.

linked1

This allows us to use the linked server name ADVENTUREWORKS to access the AdventureWorks database on the instance RKLAPTOPSS.  Note that the Linked Server name can be whatever we want it to.  It does not have to be the name of the instance. I prefer this practice because you can use the same linked server name in multiple environments (Development, QA, Production, etc.)

Authentication settings have to be configured as well.

image

In this case, I am saying to just pass through the login’s security context to the remote server.  I could also choose to always use a certain security context or to map local login’s to remote logins. The security implications of these choices are beyond the scope of this post but the most important thing is that in most cases Options 2 and 4 should not be used.


With the Remote DBA managed service from

SQL Solutions Group, we do your database dirty

work, freeing up your team for projects

that impact the bottom line.

Learn More


Four Part Naming in Linked Servers

Now that we have a linked server set up, we can access it from the local SQL Server instance.  The most common way to do this is to use four part names to refer to objects on the remote server.

<LinkedServerName>.<DatabaseName>.<Schema>.<Object>

SELECT *
FROM ADVENTUREWORKS.AdventureWorks.HumanResources.Employee
WHERE Gender = 'M'

While this is the most popular way to use Linked Servers, it is also the worst way in terms of performance.  To understand this, you need to always ask the following question: “Where will the query be optimized?”

In this case, the optimizer on the local server will determine the execution plan.  But it doesn’t know anything about indexes and statistics that exist on the Employee table on the remote server.  There may well be an index on Gender which could be taken advantage of, but the local optimizer has no way to know that.   So this query will result in a table scan, every time, regardless of remote indexes. In addition, all of the data will be sent across the network only to be filtered for Gender on the local system.

OPENQUERY()

A better way to do this is to use a function like OPENQUERY().  Essentially what we are doing is executing a remote query on the linked server and just asking for the results back.

SELECT *
FROM OPENQUERY
   (
    ADVENTUREWORKS,
    'SELECT * FROM AdventureWorks.HumanResources.Employee WHERE Gender = ''M'''
   )

The function takes two parameters.  The remote server to execute the query on, and the query itself.  Note that this is remotely executing so if the server was to another DBMS, you’d want the query written in that engine’s flavor of SQL.

This is a simple example, but imagine if the remote table had a million plus rows.  The inefficient table scan plus bringing all that over the network so that it can be filtered locally.  I have “fixed” queries that were taking 12+ hours to run and reduced the time to just minutes by making this simple change to the code.

Please share this

This Post Has 11 Comments

  1. Bruce BLACHFORD

    This works great if the full query is on the linked server.
    What if you are joining tables from the main server to the linked server.
    Is there any to get the linked server to uses indexes in this type of join?

  2. Randy Knight

    In this case I would still use OPENQUERY but use the results as a derived table that you join to. Make sure you include just the rows and columns needed. For example, if you know you are going to filter out some rows from the table on the remote server in the where clause, do this in the OPENQUERY.

  3. Yves Forget

    I tested your “worst way” between two SQL 2008 servers. The profiler on the remote server shows that it received the WHERE clause, so it did the filtering and returned only the needed records. What is your original scenario causing the problem ? The link server is to Oracle or MySQL maybe ?

    Also, what about the REMOTE join_hint ? From a performance view, is OpenQuery better than REMOTE ?
    See https://msdn.microsoft.com/en-us/library/ms173815.aspx

    1. Jason Brimhall

      I think you are confusing what was said. The article makes no claims about whether the predicate will be performed or not. It states that the optimizer evaluates the query locally even for the remote objects being fetched via the linked server. In the article, it was noted that tables that are joined have to bring the data across the wire to a common location in order to perform the join. This data transfer is unnecessary and has to be done if using the “worst” method in order to filter out rows that don’t belong in the join criteria. OpenQuery mitigates much of that.
      As far as “REMOTE”, please don’t call it a hint. It is a directive. We don’t advocate using directives (except in very special circumstances) because it limits the optimizer. And in this case, REMOTE has plenty of limitations that make it less than ideal overall. The directive also only performs the join on the site of the object on the right of the join. This does not change how the optimizer evaluates the plan (still performs that local). As for performance of that directive, I have found OpenQuery to be consistently better.

  4. Yves Forget

    Hi Jason,

    Thanks for your answer ; I have a small sub-question. You said : “The article makes no claims about whether the predicate will be performed or not”, and the article said : “In addition, all of the data will be sent across the network only to be filtered for Gender on the local system.”

  5. Mike

    Can you fix the images please? They’re all currently broken.

  6. Moize

    Thanks SSG ……………you are great

  7. DBA SQL Server

    How could I do if I use variables instead fixed values on WHERE clause

Leave a Reply

Related Articles

A toy machine and a microphone on a table Description automatically generated

The Right Tool for the Right Job

I’m looking forward to 2024. From many perspectives, I think we are going to see great advancements, including in technology, that on one hand will make our lives easier, but also make our decisions a bit harder.

Read More »