Quantcast

Streaming replication with sync slave, but disconnects due to missing WAL segments

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Streaming replication with sync slave, but disconnects due to missing WAL segments

Mads.Tandrup

Hi all

I have a question about sync streaming replication.

I have 2 postgresql 9.1 servers set up with streaming replication. On the
master node the slave is configured as a synchronous standby. I've verified
that pg_stat_replication shows sync_state = sync for the slave node.

It all seems to work fine. But I have noticed that sometimes when I restore
backups created by pg_dump. The slave node will disconnect with the message
in the postgresql log:
2013-06-03 13:13:48 GMT 4271  FATAL:  could not receive data from WAL
stream: SSL connection has been closed unexpectedly
2013-06-03 13:13:53 GMT 4270  LOG:  invalid magic number 0000 in log file
15, segment 65, offset 11665408
2013-06-03 13:13:54 GMT 36428  LOG:  streaming replication successfully
connected to primary
2013-06-03 13:13:54 GMT 36428  FATAL:  could not receive data from WAL
stream: FATAL:  requested WAL segment 000000010000000F00000041 has already
been removed
2013-06-03 13:13:58 GMT 36458  LOG:  streaming replication successfully
connected to primary
2013-06-03 13:13:58 GMT 36458  FATAL:  could not receive data from WAL
stream: FATAL:  requested WAL segment 000000010000000F00000041 has already
been removed

On the master I get this in the log file in the same timespan:
2013-06-03 13:13:47 GMT 1471  LOG:  checkpoints are occurring too
frequently (2 seconds apart)
2013-06-03 13:13:47 GMT 1471  HINT:  Consider increasing the configuration
parameter "checkpoint_segments".
2013-06-03 13:13:48 GMT 6189 [unknown] FATAL:  requested WAL segment
000000010000000F00000041 has already been removed
2013-06-03 13:13:48 GMT 6189 [unknown] LOG:  disconnection: session time:
77:37:37.684 user=root database= host=10.216.80.38 port=56114
2013-06-03 13:13:49 GMT 1471  LOG:  checkpoints are occurring too
frequently (2 seconds apart)
2013-06-03 13:13:49 GMT 1471  HINT:  Consider increasing the configuration
parameter "checkpoint_segments".
2013-06-03 13:13:51 GMT 1471  LOG:  checkpoints are occurring too
frequently (2 seconds apart)
2013-06-03 13:13:51 GMT 1471  HINT:  Consider increasing the configuration
parameter "checkpoint_segments".
2013-06-03 13:13:51 GMT 1468  LOG:  received SIGHUP, reloading
configuration files
2013-06-03 13:13:51 GMT 1468  LOG:  parameter "synchronous_standby_names"
removed from configuration file, reset to default
2013-06-03 13:13:53 GMT 1471  LOG:  checkpoints are occurring too
frequently (2 seconds apart)
2013-06-03 13:13:53 GMT 1471  HINT:  Consider increasing the configuration
parameter "checkpoint_segments".
2013-06-03 13:13:53 GMT 44063 [unknown] LOG:  connection received:
host=10.216.80.38 port=34038
2013-06-03 13:13:54 GMT 44063 [unknown] LOG:  replication connection
authorized: user=root
2013-06-03 13:13:54 GMT 44063 [unknown] FATAL:  requested WAL segment
000000010000000F00000041 has already been removed
2013-06-03 13:13:54 GMT 44063 [unknown] LOG:  disconnection: session time:
0:00:00.090 user=root database= host=10.216.80.38 port=34038

What I don't understand is how the slave node can miss a WAL segment since
it should be sync?

Shouldn't sync prevent the server from continuing if the slave is not able
to get WAL segments fast enough?

I have only noticed it while restoring a database. But the general load on
the DB has not been that high, so I'm not sure if it can occur with other
workloads.

Best regards,
Mads



--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Streaming replication with sync slave, but disconnects due to missing WAL segments

Jeff Janes
On Tue, Jun 4, 2013 at 6:25 AM, <[hidden email]> wrote:

Hi all

I have a question about sync streaming replication.

I have 2 postgresql 9.1 servers set up with streaming replication. On the
master node the slave is configured as a synchronous standby. I've verified
that pg_stat_replication shows sync_state = sync for the slave node.

It all seems to work fine. But I have noticed that sometimes when I restore
backups created by pg_dump. The slave node will disconnect with the message
in the postgresql log:

You cannot use pg_dump to set up a database for receiving streaming replication.

For that it must be a physical copy, not a logical copy which is what pg_dump generates.

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

vacuum_cost_delay and autovacuum_cost_delay

ascot.moss@gmail.com
Hi,

if I change the value of "vacuum_cost_delay", what is the impact of it on autovacuum  side?

regards

--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: vacuum_cost_delay and autovacuum_cost_delay

Sergey Konoplev-2
On Tue, Jun 4, 2013 at 5:06 PM, [hidden email]
<[hidden email]> wrote:
> if I change the value of "vacuum_cost_delay", what is the impact of it on autovacuum  side?

If autovacuum_vacuum_cost_delay is not -1 then autovacuum will use
this value, and there will be no effect of changing vacuum_cost_delay
on autovacuum. However, if it is -1 than it will use the value of
vacuum_cost_delay.

http://www.postgresql.org/docs/9.2/static/runtime-config-autovacuum.html#GUC-AUTOVACUUM-VACUUM-COST-DELAY

>
> regards
>
> --
> Sent via pgsql-general mailing list ([hidden email])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

Profile: http://www.linkedin.com/in/grayhemp
Phone: USA +1 (415) 867-9984, Russia +7 (901) 903-0499, +7 (988) 888-1979
Skype: gray-hemp
Jabber: [hidden email]


--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Streaming replication with sync slave, but disconnects due to missing WAL segments

Albe Laurenz *EXTERN*
In reply to this post by Jeff Janes
Jeff Janes wrote:

> On Tue, Jun 4, 2013 at 6:25 AM, <[hidden email]> wrote:
>> I have a question about sync streaming replication.
>>
>> I have 2 postgresql 9.1 servers set up with streaming replication. On the
>> master node the slave is configured as a synchronous standby. I've verified
>> that pg_stat_replication shows sync_state = sync for the slave node.
>>
>> It all seems to work fine. But I have noticed that sometimes when I restore
>> backups created by pg_dump. The slave node will disconnect with the message
>> in the postgresql log:
>
> You cannot use pg_dump to set up a database for receiving streaming replication.
>
> For that it must be a physical copy, not a logical copy which is what pg_dump generates.

I think you misunderstood - he restores a dump on
the *primary* and that makes replication fall over
because it cannot catch up.

Yours,
Laurenz Albe

--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Streaming replication with sync slave, but disconnects due to missing WAL segments

Jeff Janes
On Wed, Jun 5, 2013 at 1:30 AM, Albe Laurenz <[hidden email]> wrote:
Jeff Janes wrote:
> On Tue, Jun 4, 2013 at 6:25 AM, <[hidden email]> wrote:
>> I have a question about sync streaming replication.
>>
>> I have 2 postgresql 9.1 servers set up with streaming replication. On the
>> master node the slave is configured as a synchronous standby. I've verified
>> that pg_stat_replication shows sync_state = sync for the slave node.
>>
>> It all seems to work fine. But I have noticed that sometimes when I restore
>> backups created by pg_dump. The slave node will disconnect with the message
>> in the postgresql log:
>
> You cannot use pg_dump to set up a database for receiving streaming replication.
>
> For that it must be a physical copy, not a logical copy which is what pg_dump generates.

I think you misunderstood - he restores a dump on
the *primary* and that makes replication fall over
because it cannot catch up.

Ah, I see.  And the synchronization point doesn't help, because it only occurs at commit-time and if the dump is being loaded in a single transaction, then the transaction could last for several checkpoints.  By the time it tries to synchronize, it is already too late.

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Streaming replication with sync slave, but disconnects due to missing WAL segments

Mads.Tandrup
Hi

Thanks for your reply. Do you know of any options that I could give pg_dump/psql to avoid creating one big transaction? I'm using the plain text format for pg_dump.

Best regards,
Mads


-----[hidden email] skrev: -----
Til: Albe Laurenz <[hidden email]>
Fra: Jeff Janes
Sendt af: [hidden email]
Dato: 05-06-2013 12:43
Cc: "[hidden email]" <[hidden email]>, "[hidden email]" <[hidden email]>
Emne: Re: [GENERAL] Streaming replication with sync slave, but disconnects due to missing WAL segments

On Wed, Jun 5, 2013 at 1:30 AM, Albe Laurenz <[hidden email]> wrote:
Jeff Janes wrote:
> On Tue, Jun 4, 2013 at 6:25 AM, <[hidden email]> wrote:
>> I have a question about sync streaming replication.
>>
>> I have 2 postgresql 9.1 servers set up with streaming replication. On the
>> master node the slave is configured as a synchronous standby. I've verified
>> that pg_stat_replication shows sync_state = sync for the slave node.
>>
>> It all seems to work fine. But I have noticed that sometimes when I restore
>> backups created by pg_dump. The slave node will disconnect with the message
>> in the postgresql log:
>
> You cannot use pg_dump to set up a database for receiving streaming replication.
>
> For that it must be a physical copy, not a logical copy which is what pg_dump generates.

I think you misunderstood - he restores a dump on
the *primary* and that makes replication fall over
because it cannot catch up.

Ah, I see.  And the synchronization point doesn't help, because it only occurs at commit-time and if the dump is being loaded in a single transaction, then the transaction could last for several checkpoints.  By the time it tries to synchronize, it is already too late.

Cheers,

Jeff

______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
______________________________________________________________________
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Streaming replication with sync slave, but disconnects due to missing WAL segments

Shaun Thomas-2
In reply to this post by Mads.Tandrup
On 06/04/2013 08:25 AM, [hidden email] wrote:

> It all seems to work fine. But I have noticed that sometimes when I
> restore backups created by pg_dump. The slave node will disconnect
> with the message in the postgresql log:

You need to increase wal_keep_segments on the master. Any pg_dump or
pg_restore is one giant transaction, and the dirty little secret about
PG replication, is that no slave can replay while a transaction is in
progress.

The slave keeps waiting while the master is doing all this extra work,
but without enough segments in reserve, the master may archive or delete
what it considers "excess" segments before slaves can receive or process
them.

One way to avoid keeping these around is to set your archive_command to
transmit archived WAL files to a location where the slave can read them.
Then in your recovery.conf, you can make it look something like this:

standby_mode     = 'on'
primary_conninfo = 'host=master-node user=replication'
trigger_file     = '/db/pgdata/promote'
restore_command  = 'cp -f /db/wal/pg_archived/%f %p'

So if your connection gets disrupted, or the slave falls behind, it can
always catch up again.

Honestly in sync rep, I'm surprised the master doesn't keep segments
until a waiting slave either disconnects, or can consume WAL files being
held up by a long-running transaction. Not that it matters, since you
can fake that behavior this way.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
[hidden email]

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Streaming replication with sync slave, but disconnects due to missing WAL segments

Jeff Janes
In reply to this post by Mads.Tandrup
On Wed, Jun 5, 2013 at 11:26 PM, <[hidden email]> wrote:
Hi

Thanks for your reply. Do you know of any options that I could give pg_dump/psql to avoid creating one big transaction? I'm using the plain text format for pg_dump.


For the plain text format, it is already not one big transaction, unless you specify to -1 to the psql.  However, the load of any individual table will still be a single transaction, so for a very large table it will still be a very long transaction.  

Using pg_dump for --inserts could get around this, but it would probably be better to fix the fundamental problem by increasing wal_keep_segments or something of that nature.

Cheers,

Jeff
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Streaming replication with sync slave, but disconnects due to missing WAL segments

Mads.Tandrup
Hi Jeff.

Thanks for the clarification.

I'll adjust wal_keep_segments for the expected biggest table in the backup.

Best regards,
Mads





From: Jeff Janes <[hidden email]>
To: "[hidden email]"
            <[hidden email]>,
Cc: Albe Laurenz <[hidden email]>,
            "[hidden email]" <[hidden email]>
Date: 06-06-2013 18:33
Subject: Re: [GENERAL] Streaming replication with sync slave, but
            disconnects due to missing WAL segments
Sent by: [hidden email]



On Wed, Jun 5, 2013 at 11:26 PM, <[hidden email]>
wrote:
      Hi

      Thanks for your reply. Do you know of any options that I could give
      pg_dump/psql to avoid creating one big transaction? I'm using the
      plain text format for pg_dump.


For the plain text format, it is already not one big transaction, unless
you specify to -1 to the psql.  However, the load of any individual table
will still be a single transaction, so for a very large table it will still
be a very long transaction.

Using pg_dump for --inserts could get around this, but it would probably be
better to fix the fundamental problem by increasing wal_keep_segments or
something of that nature.

Cheers,

Jeff

______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
______________________________________________________________________



--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Loading...