
Category: Replication

Tearing Down and Rebuilding Replication

Pamela Mooney takes us through tearing down replication, restoring a database, and rebuilding transactional replication with scripts:

If you use replication, you have doubtless been paged to restore a replicated database.  You have experienced the ineffable joy of tearing down replication-dependent indexed views (if you have them), blowing away replication, doing the restore, putting replication and indexing back together again, and finally redeploying your indexed views.  I know I have.

In fact, I’ve done it enough times that I didn’t want to do it anymore. So, you may ask, did I go to a different modality of replicating my data?  Did I go to Availability Groups or mirroring instead?  No.  I actually like replication.  It’s invaluable when you need to write code around real-time data (especially from a third-party database), but you aren’t able to index the original copy.  It’s been around for a long time and is well vetted, and pretty forgiving, once you understand how it works.  So, no need to reinvent the wheel. I decided to automate replication instead.

This is specific to transactional replication. There’s a whole ‘nother kettle of fish for merge replication.
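To give a sense of the moving parts, a minimal teardown sketch in T-SQL might look like the following. The database and publication names are hypothetical placeholders, not from Pamela's scripts, and this omits the indexed-view handling she describes:

-- Run at the publisher, in the published database. Names are placeholders.
USE MyReplicatedDB;
EXEC sp_dropsubscription @publication = N'MyPublication', @article = N'all', @subscriber = N'all';
EXEC sp_droppublication @publication = N'MyPublication';
EXEC sp_replicationdboption @dbname = N'MyReplicatedDB', @optname = N'publish', @value = N'false';
-- ...restore the database, then re-enable publishing and
-- re-create the publication, articles, and subscriptions.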


Transactional Replication Tips

Nate Johnson has a few things which might make SQL Server transactional replication easier for you:

For what seems like years, I’ve bemoaned the fact that SQL Transactional Replication doesn’t come with a “Just Trust Me” option. I’ll explain more about what I mean in a moment. The other thing I’ve complained about is that there’s no “Pause” button — which is not entirely accurate, since obviously you could just stop the distribution and subscription agents. But specifically what I mean is, it’s not easy to ‘put it on hold so you can make some schema changes to one of the tables that’s being replicated’, and then easily “Resume” it after you’re done with said changes.

Well, I’m happy to say that now I have both of these tools/methodologies in my arsenal!

Read on for those tips and a couple more.
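As a rough illustration of the “just stop the agents” workaround Nate mentions: the distribution agent runs as a SQL Server Agent job, so a pause/resume can be approximated with sp_stop_job and sp_start_job. The job name below is hypothetical (the real ones follow the Publisher-PublisherDB-Publication-Subscriber-n convention):

USE msdb;
-- Pause: stop the distribution agent job for this subscription.
EXEC dbo.sp_stop_job @job_name = N'MyPublisher-MyDB-MyPublication-MySubscriber-1';
-- ...make the schema changes...
-- Resume: start the agent back up.
EXEC dbo.sp_start_job @job_name = N'MyPublisher-MyDB-MyPublication-MySubscriber-1';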


Kafka and MirrorMaker

Renu Tewari describes what MirrorMaker does for Kafka today and what is coming with version 2:

Apache Kafka has become an essential component of enterprise data pipelines and is used for tracking clickstream event data, collecting logs, gathering metrics, and being the enterprise data bus in microservices-based architectures. Kafka is essentially a highly available and highly scalable distributed log of all the messages flowing in an enterprise data pipeline. Kafka supports internal replication to support data availability within a cluster. However, enterprises require that the data availability and durability guarantees span entire cluster and site failures.

The solution, thus far, in the Apache Kafka community was to use MirrorMaker, an external utility, that helped replicate the data between two Kafka clusters within or across data centers. MirrorMaker is essentially a Kafka high-level consumer and producer pair, efficiently moving data from the source cluster to the destination cluster and not offering much else. The initial use case that MirrorMaker was designed for was to move data from clusters to an aggregate cluster within a data center or to another data center to feed batch or streaming analytics pipelines. Enterprises have a much broader set of use cases and requirements on replication guarantees.

Read on for the list of benefits and upcoming features.


Conflict Tracking in Merge Replication

Ranga Babu shows the two different models for conflict detection with merge replication:

Conflict Detection:
Conflict detection depends on the type of tracking we configure for the article.
Row-level tracking: If data changes are made to any column on the same row at both ends, then it is considered a conflict.
Column-level tracking: If data changes are made on the same column at both ends, the change is qualified as a conflict.

Read on for a detailed demonstration of the two.
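For reference, the tracking model is chosen per article; in T-SQL it maps to the @column_tracking parameter of sp_addmergearticle. The publication and table names here are hypothetical:

-- 'false' (the default) = row-level tracking; 'true' = column-level tracking.
EXEC sp_addmergearticle
    @publication = N'MyMergePublication',
    @article = N'MyTable',
    @source_object = N'MyTable',
    @column_tracking = N'true';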


When SQL Server Replication Ignores Tables

Matt Slocum takes us through a tricky replication scenario (hint, they all are):

There are occasions when Updates, Inserts, and Deletes on a replicated table do not replicate out to the Subscriber.  You’ve verified that the table is listed in the Articles included in the Publication, and that there is at least one Subscription on the Publication. 

The strange thing is that there are likely other tables in the same Publication that are properly being replicated to the same Subscriber.

What is happening here?  Why is replication ignoring this table?

Read on to see Matt’s explanation and fix.
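One way to spot-check this situation (a hedged sketch, not necessarily Matt's exact diagnosis) is to compare articles against subscriptions in the published database; an article with no matching row in syssubscriptions sits in the publication without anything flowing:

-- Run in the published database: list articles with no subscription rows.
SELECT a.name AS article
FROM dbo.sysarticles AS a
LEFT JOIN dbo.syssubscriptions AS s ON s.artid = a.artid
WHERE s.artid IS NULL;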


The Distribution Database and AGs

Andy Mallon looks at a wrinkle Availability Groups adds to replication:

This has always worked super, and has been a go-to query for me for years. But when looking at a SQL Server 2017 distributor with the distribution database in an Availability Group, that wasn’t working. All the publications had a publisher_id of 1, but in sys.servers, server_id 1 was some random linked server, and definitely not the publisher. But replication was working great. Maybe replication was set up on the other AG replica, and server_id 1 came from there.

Nope. On the other replica, it was the same story. Server_id 1 was a random linked server, and nothing to do with replication at all, let alone the publisher. But replication was working perfectly. A teammate fooling around with it in dev confirmed that if he updated the publisher_id to match the server_id we thought it should join to, replication stopped working. So, that publisher_id of 1 was correct. Or special. But also definitely different than what I’ve seen in prior versions of SQL Server.

Read on to see what Andy learned.
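For context, the sort of go-to query Andy describes joins the distribution database's publication metadata to sys.servers, which is exactly the join that stops lining up when the distribution database lives in an AG. A sketch, assuming the default distribution database name and not Andy's exact query:

-- Run on the distributor.
SELECT p.publication, p.publisher_db, s.name AS publisher
FROM distribution.dbo.MSpublications AS p
JOIN sys.servers AS s ON s.server_id = p.publisher_id;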


Using Replication With SQL Server In Containers

Andrew Pruski shows us how we can build up snapshot replication with SQL Server in containers:

Last week I saw a thread on twitter about how to get replication set up for SQL Server running in a container. Now, I know very little about replication; it’s not an area of SQL that I’ve had a lot of exposure to, but I’m always up for figuring stuff out (especially when it comes to SQL in containers).
So let’s run through how to set it up here.
First, create a dockerfile to build an image from the SQL Server 2019 CTP 2.2 image with the SQL Server Agent enabled: –

Now that Andrew is a replication expert…
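For the curious, a minimal dockerfile along the lines Andrew describes might look like this; the image tag and environment variable are assumptions based on the standard Microsoft images of that era, not Andrew's exact file:

# Hypothetical sketch, not Andrew's dockerfile.
FROM mcr.microsoft.com/mssql/server:2019-CTP2.2-ubuntu
ENV ACCEPT_EULA=Y
ENV MSSQL_AGENT_ENABLED=true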


Configuring Snapshot Replication

Nisarg Upadhyay shows us how to configure snapshot replication:

On the next screen, configure the SQL Agent security. To configure the Agent security, click the Security Settings button. The Snapshot Agent Security dialog box opens. In the dialog box, provide the account under which the subscriber connects to the publisher. Moreover, provide the account information under which the SQL Server Agent job will be executed. For this demo, SQL Server jobs are executed under the SQL Server Agent service account, hence select the Run under the SQL Server Agent service account option. Subscribers will connect to the publisher using a SQL login, hence select the Using the following SQL Server login option and provide the SQL login and password. In this demo, connect using the sa login. Click OK to close the dialog box and click Next.

Snapshot replication is the easiest to get right, but most of the setup is the same for transactional or merge replication.
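For reference, the wizard's Security Settings dialog corresponds to sp_addpublication_snapshot; a sketch matching the choices Nisarg describes (run under the Agent service account, connect as a SQL login), with hypothetical names, might look like this:

-- Run at the publisher, in the publication database.
EXEC sp_addpublication_snapshot
    @publication = N'MySnapshotPublication',
    @job_login = NULL, @job_password = NULL,  -- run under the SQL Server Agent service account
    @publisher_security_mode = 0,             -- connect to the publisher with a SQL login
    @publisher_login = N'sa',
    @publisher_password = N'<password>';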


Index Maintenance With Replication

Ajay Dwivedi shares his rules of thumb for index maintenance on replicated databases:

Like any other DBA, I fell into the trap of using a straight maintenance solution: a Reorganize operation for indexes with average fragmentation of 30% or less, and an Index Rebuild for average fragmentation greater than 30%.

Well, the above approach works fine in common scenarios, but it can create problems for servers using transaction log-based high availability technologies, such as AlwaysOn Availability Groups, database mirroring, log shipping, and replication. Both index rebuild and reorganize operations introduce heavy transaction log activity and generate a large number of log records. This becomes an issue in cases of node failover, servers with limited storage, database files with restricted growth, wrong file auto-growth settings, or databases with high VLF counts.

The best option for servers with high availability is to identify the kind of server workload (OLTP/OLAP/mixed), fill factor (based on Page Splits/sec), fragmentation, underlying storage load (random/sequential), Index Scans vs. Index Searches, job time frame (low activity outside business hours), etc. After calculating all the above factors, all we need is a robust index maintenance solution. This is where I find Ola Hallengren’s SQL Server Maintenance Solution a perfect fit.

Ajay uses Ola Hallengren’s solution and gives us the breakdown percentages he uses.
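As an illustration, an IndexOptimize call from Ola Hallengren's solution looks like the following; the 30/50 thresholds here are the solution's common defaults, not necessarily the percentages Ajay settles on:

-- Assumes the Maintenance Solution is installed in this database.
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,              -- leave lightly fragmented indexes alone
    @FragmentationMedium = 'INDEX_REORGANIZE',
    @FragmentationHigh = 'INDEX_REBUILD_ONLINE,INDEX_REBUILD_OFFLINE',
    @FragmentationLevel1 = 30,
    @FragmentationLevel2 = 50,
    @LogToTable = 'Y';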


Replicating Data In HDFS Between Clusters

Murali Ramasami and Niru Anisetti have an article showing how to use the Hortonworks Data Lifecycle Manager to set up replication between two Hadoop clusters:

Data Lifecycle Manager (DLM) delivers on the promise of location-agnostic, secure replication by encapsulating and copying data seamlessly across physical private storage and public cloud environments. This empowers businesses to deliver the right data in the right environment to power the right use cases.

DLM v1.1 provides a complete solution to replicate data, metadata, and security policies between on-premises and cloud environments. It also supports data movement for data-at-rest and data-in-motion – whether the data is encrypted using a single key or multiple keys on both source and target clusters. DLM supports HDFS and Apache Hive dataset replication.

With DLM, infrastructure administrators can manage their data, metadata, and security management on-premises and in-cloud using a single pane of glass built on open source technology. Business users can consume their workload outputs in the cloud with data-source abstraction. DLM also enables businesses to reduce their capital expenditures and enjoy the benefits of the flexibility and elasticity that the cloud provides.

Click through for a demo.  May HDFS replication have as long a life and slightly less vitriol than SQL Server replication.
