|
(please Cc me on replies, I am not subscribed)

Hi,

libpq currently does not use TCP keepalives. This is a problem in our case, where we have some clients waiting for notifies and the connection is then dropped on the server side. The client never gets the FIN and thinks the connection is up. The attached patch unconditionally adds keepalives. I chose unconditionally as this is what the server does. We didn't need the ability to tune the timeouts, but that could be added with reasonable ease.

--
Tollef Fog Heen
UNIX is user friendly, it's just picky about who its friends are

--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
|
On Tue, Feb 9, 2010 at 14:03, Tollef Fog Heen <[hidden email]> wrote:

> libpq currently does not use TCP keepalives. This is a problem in our
> case where we have some clients waiting for notifies and then the
> connection is dropped on the server side. The client never gets the FIN
> and thinks the connection is up. The attached patch unconditionally
> adds keepalives. I chose unconditionally as this is what the server
> does. We didn't need the ability to tune the timeouts, but that could
> be added with reasonable ease.

Seems reasonable to add this. Are there any scenarios where this can cause trouble, that would be fixed by having the ability to select non-standard behavior?

I don't recall ever changing away from the standard behavior in any of my deployments, but that might be platform dependent?

If not, I think this is small and trivial enough not to have to push back for 9.1 ;)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
|
]] Magnus Hagander
| Seems reasonable to add this. Are there any scenarios where this can
| cause trouble, that would be fixed by having the ability to select
| non-standard behavior?

Well, it might be unwanted if you're on a pay-per-bit connection such as 3G, but in this case, it just makes the problem a bit worse than the server keepalive already makes it; it doesn't introduce a new problem.

| I don't recall ever changing away from the standard behavior in any of
| my deployments, but that might be platform dependent?

If you were (ab)using postgres as an IPC mechanism, I could see it being useful, but not in the normal case.

| If not, I think this is small and trivial enough not to have to push
| back for 9.1 ;)

\o/

--
Tollef Fog Heen
|
In reply to this post by Tollef Fog Heen-8
Tollef Fog Heen wrote:
> libpq currently does not use TCP keepalives. This is a problem in our
> case where we have some clients waiting for notifies and then the
> connection is dropped on the server side. The client never gets the FIN
> and thinks the connection is up. The attached patch unconditionally
> adds keepalives. I chose unconditionally as this is what the server
> does. We didn't need the ability to tune the timeouts, but that could
> be added with reasonable ease.

ISTM that the default behavior should be keepalives disabled, as it is now, and those wanting it can just set it in their apps:

    setsockopt(PQsocket(conn), SOL_SOCKET, SO_KEEPALIVE, ...)

If you really want libpq to manage this, I think you need to expose the probe interval and timeouts. There should be some platform checks as well. Check out...

http://www.mail-archive.com/pgsql-hackers@.../msg128603.html

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/
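The two points above (switching keepalives on per socket, and exposing the probe timers with platform checks) can be sketched in Python on a plain TCP socket. This is only an illustration, not the patch under discussion: the helper name `enable_keepalive` and its parameters are made up here, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` constants are assumed to exist only on some platforms, which is why each one is looked up before use.

```python
import socket


def enable_keepalive(sock, idle=None, interval=None, count=None):
    """Turn on SO_KEEPALIVE and tune the timers where the platform allows it.

    Hypothetical helper for illustration. Returns a dict of the timer
    options that were actually applied on this platform.
    """
    # The on/off switch is portable across the common socket APIs.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    applied = {}
    wanted = (
        ("TCP_KEEPIDLE", idle),       # seconds of idle time before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before the connection is dropped
    )
    for name, value in wanted:
        option = getattr(socket, name, None)  # platform check: constant may be missing
        if value is not None and option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_keepalive(s, idle=120, interval=10, count=3))
    s.close()
```

The returned dict makes it visible to the caller which timers could not be set, which is roughly what a platform check in a client library would need to report.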
|
On Tue, Feb 9, 2010 at 11:34 PM, Andrew Chernow <[hidden email]> wrote:
> If you really want libpq to manage this, I think you need to expose the
> probe interval and timeouts.

Agreed.

I was previously working on a patch that exposes them as conninfo options, so that the standby can detect a network outage ASAP in SR (streaming replication). I have attached that WIP patch as a reference.

Hope this helps.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
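To make the "conninfo options" idea concrete, here is a small Python sketch that builds a libpq-style connection string carrying keepalive settings. The option names in the WIP patch are not shown in this thread; the sketch assumes the names that libpq documents from PostgreSQL 9.0 onward (`keepalives`, `keepalives_idle`, `keepalives_interval`, `keepalives_count`). The helper names and the host and database values are invented for the example.

```python
def quote_conninfo_value(value):
    """Quote a value for a keyword=value conninfo string.

    Empty values and values containing whitespace, single quotes or
    backslashes are wrapped in single quotes, with ' and \\ backslash-escaped.
    """
    text = str(value)
    if text == "" or any(ch.isspace() or ch in "'\\" for ch in text):
        text = text.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + text + "'"
    return text


def build_conninfo(**params):
    """Join keyword arguments into a space-separated conninfo string."""
    return " ".join(f"{key}={quote_conninfo_value(val)}" for key, val in params.items())


if __name__ == "__main__":
    # Hypothetical host and database names.
    print(build_conninfo(
        host="db.example.com",
        dbname="app",
        keepalives=1,
        keepalives_idle=120,
        keepalives_interval=10,
        keepalives_count=3,
    ))
```

Because the settings travel in the connection string, any driver that passes the string through to libpq unchanged would pick them up without needing socket access of its own.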
|
In reply to this post by Andrew Chernow-3
On Tue, Feb 09, 2010 at 09:34:10AM -0500, Andrew Chernow wrote:
> ISTM that the default behavior should be keep alives disabled, as it is
> now, and those wanting it can just set it in their apps:
>
> setsockopt(PQsocket(conn), SOL_SOCKET, SO_KEEPALIVE, ...)

I disagree. I have clients who have problems with leftover client connections due to server host failures. They do not write apps in C. For a non-default change to be effective we would need to have all the client drivers, e.g. JDBC, psycopg, DBD-DBI, and the apps like psql make changes to turn it on. Adding this option as a non-default will not really help.

-dg

--
David Gould [hidden email] 510 536 1443 510 282 0869
If simplicity worked, the world would be overrun with insects.
|
]] daveg
| I disagree. I have clients who have problems with leftover client connections
| due to server host failures. They do not write apps in C. For a non-default
| change to be effective we would need to have all the client drivers, eg JDBC,
| psycopg, DBD-DBI, and the apps like psql make changes to turn it on. Adding
| this option as a non-default will not really help.

FWIW, this is my case. My application uses psycopg, which provides no way to get access to the underlying socket. Sure, I could hack my way around this, but from the application writer's point of view, I have a connection that I expect to stay around and be reliable. Whether that connection is over a UNIX socket, a TCP socket or something else is something I would rather not have to worry about; it feels very much like an abstraction violation.

--
Tollef Fog Heen
|
In reply to this post by DavidGould
2010/2/10 daveg <[hidden email]>:
> I disagree. I have clients who have problems with leftover client connections
> due to server host failures. They do not write apps in C. For a non-default
> change to be effective we would need to have all the client drivers, eg JDBC,
> psycopg, DBD-DBI, and the apps like psql make changes to turn it on. Adding
> this option as a non-default will not really help.

Yes, that's definitely the use-case. PQsocket() will work fine for C apps only.

But it should work fine as an option, no? As long as you can specify it on the connection string - I don't think there are any interfaces that won't take a connection string?

--
Magnus Hagander
|
>> I disagree. I have clients who have problems with leftover client connections
>> due to server host failures. They do not write apps in C. For a non-default
>> change to be effective we would need to have all the client drivers, eg JDBC,
>> psycopg, DBD-DBI, and the apps like psql make changes to turn it on. Adding
>> this option as a non-default will not really help.
>
> Yes, that's definitely the use-case. PQsocket() will work fine for C apps only.
>
> But it should work fine as an option, no? As long as you can specify
> it on the connection string - I don't think there are any interfaces
> that won't take a connection string?

Perl and Python appear to have the same abilities as C. I don't use either of these drivers but I *think* the below would work:

DBD:DBI
    setsockopt($dbh->pg_socket(), ...);

psycopg
    conn.cursor().socket().setsockopt(...);

Although, I think Dave's comments have made me change my mind about this patch. Looks like it serves a good purpose. That said, there is no guarantee the driver will implement the new feature ... JDBC seems to lack the ability to get the backing Socket object, but Java can set socket options. Maybe a JDBC kung fu master knows how to do this.

--
Andrew Chernow
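For drivers that do not hand out a socket object, a possible workaround is to go through the file descriptor number instead. The Python sketch below assumes the driver can at least report that number (whether a given driver version exposes one is exactly the open question in this thread); the function name `enable_keepalive_on_fd` is hypothetical. It relies on `socket.fromfd`, which duplicates the descriptor, so closing the temporary wrapper leaves the driver's own connection open, while the option itself is set on the shared underlying socket.

```python
import socket


def enable_keepalive_on_fd(fd, family=socket.AF_INET):
    """Set SO_KEEPALIVE on the socket behind a raw file descriptor.

    Hypothetical helper: 'fd' is assumed to be the descriptor of a
    connected TCP socket owned by some database driver.
    """
    # fromfd() dup()s the descriptor; the duplicate refers to the same socket.
    wrapper = socket.fromfd(fd, family, socket.SOCK_STREAM)
    try:
        wrapper.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    finally:
        # Closes only the duplicate, not the driver's descriptor.
        wrapper.close()


if __name__ == "__main__":
    # A plain socket stands in for the driver-owned connection.
    driver_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive_on_fd(driver_socket.fileno())
    print(driver_socket.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    driver_socket.close()
```

This still requires every application to know it is talking over TCP, which is the abstraction concern raised earlier in the thread.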
|
In reply to this post by Tollef Fog Heen-8
On Thu, Feb 11, 2010 at 2:15 AM, Tollef Fog Heen <[hidden email]> wrote:

> FWIW, this is my case. My application uses psycopg, which provides no
> way to get access to the underlying socket. Sure, I could hack my way
> around this, but from the application writer's point of view, I have a
> connection that I expect to stay around and be reliable. Whether that
> connection is over a UNIX socket, a TCP socket or something else is
> something I would rather not have to worry about; it feels very much
> like an abstraction violation.

I've sometimes wondered why keepalives aren't the default for all TCP connections. They seem like they're usually a Good Thing (TM), but I wonder if we can think of any situations where someone might not want them?

...Robert
|
Robert Haas <[hidden email]> wrote:
> I've sometimes wondered why keepalives aren't the default for all
> TCP connections. They seem like they're usually a Good Thing
> (TM), but I wonder if we can think of any situations where someone
> might not want them?

I think it's insane not to use them at all, but there are valid use cases for different timings. Personally, I'd be happy to see a default of sending them if a connection is idle for two minutes, but those people who create 2000 lightly used connections to the database might feel differently.

-Kevin
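A rough back-of-the-envelope figure for the 2000-connection case, as a Python sketch. It assumes the worst case in which every connection stays idle all the time and each healthy idle connection therefore receives one probe per idle period; the function name is made up for the example.

```python
def probes_per_second(idle_connections, idle_seconds):
    """Upper bound on keepalive probes per second if every connection stays idle."""
    return idle_connections / idle_seconds


if __name__ == "__main__":
    # Two-minute idle timer, as suggested above.
    print(round(probes_per_second(2000, 120), 1))
    # Two-hour idle timer, for comparison.
    print(round(probes_per_second(2000, 7200), 2))
```

Under these assumptions the two-minute timer costs on the order of 17 small probe packets per second across all 2000 connections, each of which is answered by an ACK.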
|
From the Slony-I docs (http://www.slony.info/documentation/faq.html):

"Supposing you experience some sort of network outage, the connection between slon and database may fail, and the slon may figure this out long before the PostgreSQL instance it was connected to does. The result is that there will be some number of idle connections left on the database server, which won't be closed out until TCP/IP timeouts complete, which seems to normally take about two hours. For that two hour period, the slon will try to connect, over and over, and will get the above fatal message, over and over."

Speaking as someone who uses Slony quite a lot, this patch sounds very helpful. Why hasn't libpq had keepalives for years?

Regards,
Peter Geoghegan
|
In reply to this post by Robert Haas
Robert Haas wrote:
> I've sometimes wondered why keepalives aren't the default for all TCP
> connections. They seem like they're usually a Good Thing (TM), but I
> wonder if we can think of any situations where someone might not want
> them?

The only case I can think of is systems that send application-layer keepalive-like packets; I've worked on systems like this. The goal wasn't to reinvent keepalives but to check in every minute or two to meet a different set of requirements, thus TCP keepalives weren't needed. However, I don't think they would have caused any harm.

The more I think about this, the more I think it's a pretty non-invasive change to enable keepalives in libpq. I don't think this has any negative impact on clients written while the default was disabled.

This is really a driver setting. There is no way to ensure libpq, DBI, psycopg, JDBC, etc. all enable or disable keepalives by default. I only bring this up because it appears there are complaints from non-libpq clients. This patch wouldn't fix those cases.

--
Andrew Chernow
|
In reply to this post by Kevin Grittner
"Kevin Grittner" <[hidden email]> writes:
> those people who create 2000 lightly used connections to the
> database might feel differently.

Yeah, I still run into installations using the infamous PHP pconnect() function. You certainly don't want to add some load there, but that could push them toward using pgbouncer in transaction pooling mode (and to stop using pconnect(), damn it).

Regards,

--
dim
|
In reply to this post by Peter Geoghegan
Also, more importantly (from http://www.slony.info/documentation/slonyadmin.html):

"A WAN outage (or flakiness of the WAN in general) can leave database connections "zombied", and typical TCP/IP behaviour will allow those connections to persist, preventing a slon restart for around two hours."

Regards,
Peter Geoghegan
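The "around two hours" figure is consistent with the usual Linux keepalive defaults: 7200 seconds of idle time before the first probe, then 9 probes at 75 second intervals. A small Python sketch of that arithmetic follows; the function names are made up for the example, and the tuned values in the second call are only an illustration, not settings proposed in this thread.

```python
def worst_case_detection_seconds(idle, interval, count):
    """Approximate time for keepalives to declare an idle, dead peer gone.

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the connection is dropped
    """
    return idle + interval * count


def format_hms(seconds):
    """Render a number of seconds as e.g. '2h 11m 15s'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    # Common Linux defaults: tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes.
    print(format_hms(worst_case_detection_seconds(7200, 75, 9)))
    # Illustrative tuned values.
    print(format_hms(worst_case_detection_seconds(120, 10, 3)))
```

With the defaults this comes to a little over two hours, while the illustrative tuned values bring detection down to a few minutes.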
|
In reply to this post by Robert Haas
]] Robert Haas
| I've sometimes wondered why keepalives aren't the default for all TCP
| connections. They seem like they're usually a Good Thing (TM), but I
| wonder if we can think of any situations where someone might not want
| them?

As somebody mentioned somewhere else (I think): if you pay per byte transmitted, be it 3G/GPRS. Or if you're on a very, very high-latency link or have no bandwidth. Like, a rocket to Mars or maybe the moon. While I think they are valid use-cases, requiring people to change the defaults for that kind of thing sounds like a sensible solution to me.

--
Tollef Fog Heen
|
In reply to this post by Andrew Chernow-3
On Thu, 11 Feb 2010, Andrew Chernow wrote:

> Although, I think Dave's comments have made me change my mind about this
> patch. Looks like it serves a good purpose. That said, there is no
> guarantee the driver will implement the new feature ... JDBC seems to
> lack the ability to get the backing Socket object but Java can set
> socket options. Maybe a JDBC kung fu master knows how to do this.

Use the tcpKeepAlive connection option as described here:

http://jdbc.postgresql.org/documentation/84/connect.html#connection-parameters

Java can only enable or disable keepalives; it can't set the desired timeout.

http://java.sun.com/javase/6/docs/api/java/net/Socket.html#setKeepAlive%28boolean%29

Kris Jurka
|
In reply to this post by Peter Geoghegan
On Fri, Feb 12, 2010 at 1:33 AM, Peter Geoghegan <[hidden email]> wrote:

> Why hasn't libpq had keepalives for years?

I guess that it's because keepalive doesn't work as expected in some cases. For example, if the network outage happens before a client sends some packets, keepalive doesn't kick in, and the client then has to wait a long time before it detects the outage. This is the behavior specified for the Linux kernel. So a client should not have excessive expectations of keepalive, and should have another timeout, like QueryTimeout in JDBC.

Regards,

--
Fujii Masao
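The "another timeout" suggestion can be sketched at the socket level in Python: instead of blocking indefinitely for a reply, the client waits for the socket to become readable for a bounded time and treats silence as a failure. This is not how JDBC's QueryTimeout is implemented; the helper name `wait_for_data` is hypothetical and the timeout values are arbitrary.

```python
import select
import socket


def wait_for_data(sock, timeout_seconds):
    """Return True if 'sock' becomes readable within the timeout, else False.

    An application-level deadline like this bounds the wait regardless of
    whether TCP keepalive probes are being sent for the connection.
    """
    readable, _, _ = select.select([sock], [], [], timeout_seconds)
    return bool(readable)


if __name__ == "__main__":
    # A connected socket pair stands in for client and server.
    client, server = socket.socketpair()
    print(wait_for_data(client, 0.1))   # nothing sent yet, so the wait times out
    server.sendall(b"ready")
    print(wait_for_data(client, 1.0))   # data is pending, so this returns at once
    client.close()
    server.close()
```

A caller that gets False back can abandon the connection and reconnect, rather than relying on the kernel to notice the outage.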
|
>> Why hasn't libpq had keepalives for years?
>
> I guess that it's because keepalive doesn't work as expected
> in some cases. For example, if the network outage happens
> before a client sends some packets, keepalive doesn't work,
> then it would have to wait for a long time until it detects
> the outage. This is the specification of linux kernel. So
> a client should not have excessive expectations of keepalive,
> and should have another timeout like QueryTimeout of JDBC.

In my experience, the problems described are common when using libpq over any sort of flaky connection, which I myself regularly do (not just with Slony, but with a handheld wi-fi PDT application, where libpq is used without any wrapper). The Slony docs say it takes about 2 hours for the problem to correct itself, but I have found that it may take a lot longer, perhaps because I have a hybrid Linux/Windows Slony cluster.

> keepalive doesn't work,
> then it would have to wait for a long time until it detects
> the outage.

I'm not really sure what you mean. In this scenario, would it take as long as it would have taken had keepalives not been used?

I strongly welcome anything that can ameliorate these problems, which are probably not noticed by the majority of users, but are a real inconvenience when they do arise.

Regards,
Peter Geoghegan
|
On Fri, Feb 12, 2010 at 6:40 PM, Peter Geoghegan <[hidden email]> wrote:

>> keepalive doesn't work,
>> then it would have to wait for a long time until it detects
>> the outage.
>
> I'm not really sure what you mean. In this scenario, would it take as
> long as it would have taken had keepalives not been used?

Please see the following threads.

http://archives.postgresql.org/pgsql-bugs/2006-08/msg00098.php
http://lkml.indiana.edu/hypermail/linux/kernel/0508.2/0757.html

Regards,

--
Fujii Masao
