UniverseUniversity


Home Projects Jobs Clientele Contact

uu


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Db pool [was: My little comments to CPP code]



Alexey Parshin wrote:
> Not exactly. When we have a heavy load, it's still a finite number of
> connections. The point is to create a connection when we can, not when
> we need it.
>
> Also, I didn't mention it. We could and should keep a connection for
> the session. This way we don't need that many connections. We just
> need one per session, not one per page in session. And they can expire
> if not used, say, in 10 minutes.
So, let's take a look at this scenario.
It's high-time before state exams, and everybody is studying for them
like crazy.
We have ~500 hits from 100 unique visitors per minute. On top of that, some
retard decided it would be a good idea to launch a DDoS on our site, and we
get an extra ~1000 hits per minute from bots (difference between them and
people is that bots do not save cookies, and so open a new session with
every hit).

In all examples below, I presume that database is limited to total of
500 concurrent
connections.

Implementation 1: pre-open connections and set connection per session.
    1. Within first minute first minute all 500 connections are consumed and
       bound to some sessions. There are few lucky people who actually
got them
       rest is bound to bots.
    2. As time goes on, people get errors/timeouts, some go on break,
etc, so
       even those who had their connections, lose them, and attacker makes
       site completely unusable within few hours.

Implementation 2: pre-open connections in an external thread, use one
connection
per request. Let's presume following:
a. Bot has no login credentials, so worst it can do is failed login
attempt. No other
database access is possible for it.
b. Opening a new connection takes ~5ms
c. execution of failed login (including formatting and sending response)
    10ms
d. Non-bot hit execution time averages at 50ms
e. We pre-open 100 database connections

    1. we have 1500 hits per minute - that averages at about ~25 hits
per second.
    of them ~15 are bots, and ~10 - users
    2. thus, per second: bot hits consume 75ms, user hits consume 500ms
    3. connection opening thread thus has ~425ms of CPU time left
Thus, in this situation we can cope for a while: we can open ~8 connections
per second, and we need 25. IOW, we'll run out of pre-open connections
in first 5-10 seconds, and after that our response rate will essentially
be limited
by number of connections we can open.

Implementation 3: Pre-open connections and reuse them
numbers are same as above, except for the connection opening part - thus
we'll have only slightly over 50% CPU utilization.

Of course, all the numbers above were just randomly pulled out of my arse,
but I think you get the idea.

>
> There are are a couple of good reasons why I'm so persistently asking
> about the clean connection. If any program or procedure leaves behind
> any temporary tables, that can consume a lot of memory and create
> conflicts if another program/procedure tries to create same temporary
> table.
That would be a bug in specific procedure. No uncleaned resources should
be left behind.
In worst case, we _could_ paper over such bugs by closing connections
once in a while
(i.e. after 1000 uses, or something like that), but over all, I think
those bugs should just be
fixed in procedures.
> These reasons, actually, are serious enough (IMHO) to avoid re-using
> connections.
>
> 2007/3/27, Ilya A. Volynets-Evenbakh <ilya@total-knowledge.com
> <mailto:ilya@total-knowledge.com>>:
>
>     You missed the point. Underruns will happen at the time of
>     heaviest load.
>     If we are closing every connection after single use, underruns will
>     cause connection opening to happen during the heavy load period,
>     while whole purpose of the pool is to avoid it. Overhead is not so
>     large,
>     as to cause problems/noticeable slowdown under light load.
>
>
>     Alexey Parshin wrote:
>     > Make a pool of, say, 100 available connections. If they are
>     underunned
>     > - just raise the bar and allocate more. That isn't a problem.
>     >
>     > 2007/3/27, Ilya A. Volynets-Evenbakh < ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>
>     > <mailto:ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>>>:
>     >
>     >     It is fairly easy to underrun such scheme, and in that case
>     we will
>     >     still have overhead of establishing new connections while under
>     >     heavy load, which pretty much defeats the purpose of the pool.
>     >
>     >     Alexey Parshin wrote:
>     >     > IMHO, connection doesn't require the connection reuse. The
>     only
>     >     > requirements is the instant connection availablity upon
>     request.
>     >     That
>     >     > can be achieved by generation of necessary connections in
>     >     advance. The
>     >     > separate thread generates the connections and places 'em
>     in the
>     >     queue.
>     >     > If the number of connections is equel to predefined pool
>     size - then
>     >     > the thread sleeps until one or more connections are requested
>     >     from the
>     >     > pool. In this case we always have some number of clean
>     available
>     >     > connections.
>     >     >
>     >     > 2007/3/27, Ilya A. Volynets-Evenbakh
>     <ilya@total-knowledge.com <mailto:ilya@total-knowledge.com>
>     >     <mailto:ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com> >
>     >     > <mailto:ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>
>     >     <mailto:ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>>>>:
>     >     >
>     >     >     If we close connection every time, it will not be pool any
>     >     more :)
>     >     >     We could rename it to "connection manager", and make
>     pooling
>     >     >     optional, of course.
>     >     >
>     >     >     However, I still want for pool to be possible. We have two
>     >     options
>     >     >     here.
>     >     >     1. Database provides "cleanup" function that pool will
>     make sure
>     >     >     to call
>     >     >     whenever connection is released.
>     >     >     2. Database expects unclean environment and knows how to
>     >     deal with it
>     >     >     on its own.
>     >     >
>     >     >
>     >     >     Alexey Parshin wrote:
>     >     >     > Gentlemen,
>     >     >     >
>     >     >     > Before we finalize the connection pool, I'd like to
>     wish for
>     >     >     > something. When we finish working with the connection
>     >     obtained from
>     >     >     > the pool, can we close that connection to release
>     all the
>     >     resources
>     >     >     > (temporary tables, primarily) to the server? This would
>     >     make us to
>     >     >     > always have only freshly created connections in the
>     pool..
>     >     >     >
>     >     >     > I've almost finished with my emergency on the job,
>     so I'd
>     >     start
>     >     >     > working with UU by the end of the week.
>     >     >     >
>     >     >     > 2007/3/27, Ilya A. Volynets-Evenbakh <
>     >     ilya@total-knowledge.com <mailto:ilya@total-knowledge.com>
>     <mailto:ilya@total-knowledge.com <mailto:ilya@total-knowledge.com>>
>     >     >     <mailto:ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>
>     >     <mailto:ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>>>
>     >     >     > <mailto: ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>
>     >     <mailto: ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>>
>     >     >     <mailto:ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>
>     >     <mailto: ilya@total-knowledge.com
>     <mailto:ilya@total-knowledge.com>>>>>:
>     >     >     >
>     >     >     >     sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto: sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>
>     >     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>>
>     >     >     <mailto: sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>
>     >     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>>> wrote:
>     >     >     >     >> sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>
>     >     >     <mailto: sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>>
>     >     >     <mailto: sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>
>     >     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>
>     >     <mailto:sergey@total-knowledge.com
>     <mailto:sergey@total-knowledge.com>>>>
>     >     >     >     wrote:
>     >     >     >     >>
>     >     >     >     >>> - What pooling strategies we want to support
>     (and
>     >     why)
>     >     >     >     >>>
>     >     >     >     >>> 1. Close connections that have not been used
>     for a
>     >     specified
>     >     >     >     time.
>     >     >     >     >>> 2. Queueing connections. If there is max
>     >     connections for
>     >     >     the pool
>     >     >     >     >>> specified, then objects requesting a
>     connection can be
>     >     >     put on
>     >     >     >     hold until
>     >     >     >     >>> another object frees its connection.
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >> How about fixed number of connections open at
>     >     creation time?
>     >     >     >     >> For example, make it as large as CPPSERV's thread
>     >     pool? This
>     >     >     >     >> way we'll always have a connection ready when
>     a request
>     >     >     arrives.
>     >     >     >     >>
>     >     >     >     > I got the idea and I see that thread pool size
>     >     retrieved from
>     >     >     >     engine.xml
>     >     >     >     > or set to 15 if not presented in engine.xml.
>     >     >     >     > Looks like I'll have to parse engine.xml too
>     to get this
>     >     >     number
>     >     >     >     for DB pool.
>     >     >     >     >
>     >     >     >     Of course not. It's all parsed for you already. Just
>     >     add an
>     >     >     app-level
>     >     >     >     parameter
>     >     >     >     >>> - How will pooling and HTTP request object
>     >     lifetimes will
>     >     >     >     interoperate
>     >     >     >     >>>
>     >     >     >     >>> Pool initialized at init() time(not sure
>     about this
>     >     >     one), then
>     >     >     >     the same
>     >     >     >     >>> connection can be used for subsequent
>     requests. Since
>     >     >     servlets
>     >     >     >     can keep
>     >     >     >     >>> information between requests, a pool lives
>     longer
>     >     then HTTP
>     >     >     >     request.
>     >     >     >     >> And what happens to connection during request
>     >     processing?
>     >     >     >     >>
>     >     >     >     > It stays open. That's the whole point of
>     having DB
>     >     pool. When
>     >     >     >     many hits
>     >     >     >     > arrive to the same servlet at once, the pool will
>     >     open as many
>     >     >     >     connections
>     >     >     >     > as needed (if not exceeded DB pool size).
>     >     >     >     >
>     >     >     >     Opening and closing are not the only things that can
>     >     happen to
>     >     >     >     connections.
>     >     >     >     >>> - What will API look like
>     >     >     >     >>>
>     >     >     >     >>> 1. The uuDBPool() constructor that takes these
>     >     parameters:
>     >     >     >     >>> string login,
>     >     >     >     >>> string password,
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >> Whose login and password?
>     >     >     >     >>
>     >     >     >     >
>     >     >     >     >
>     >     >     >     > This is not current user's login/password, I
>     was wrong.
>     >     >     This is DB
>     >     >     >     > login/password. This is used for differentiating
>     >     users of the
>     >     >     >     pool in case
>     >     >     >     > of some of the operations in DB are restricted to
>     >     authorized
>     >     >     >     users only.
>     >     >     >     >
>     >     >     >     I don't remember what we ended up with for sure,
>     but I
>     >     think we
>     >     >     >     decided
>     >     >     >     on single
>     >     >     >     set of database authentication credentials.
>     Alexey would
>     >     >     have to
>     >     >     >     confirm
>     >     >     >     that.
>     >     >     >     >>> int server(this one is needed for login()
>     proc),
>     >     >     >     >>> string sessionid(needed for login() proc,
>     not really
>     >     >     used now),
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >> How does that correlate with initializing the
>     pool
>     >     at servlet
>     >     >     >     init time?
>     >     >     >     >>
>     >     >     >     >
>     >     >     >     > I was wrong here again. Current user
>     login/password,
>     >     >     server and
>     >     >     >     sessionid
>     >     >     >     > will be used for initializing a single
>     connection not at
>     >     >     init()
>     >     >     >     time, but
>     >     >     >     > the time when DB connection is actually needed.
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >>> int maxCon(max connections for this pool).
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >>> - Provide few use cases relevant for our
>     application
>     >     >     >     >>>
>     >     >     >     >>> 1. Browser sends request to the servlet.
>     >     >     >     >>> 2. No connections available in the pool.
>     >     >     >     >>> 3. Is maxCon been reached?
>     >     >     >     >>>     a. Case "NO"
>     >     >     >     >>>             1) Create new connection using
>     >     getConnection()'
>     >     >     >     login() procedure.
>     >     >     >     >>>             2) Increament connections counter in
>     >     the pool.
>     >     >     >     >>>             3) Return connection to the client.
>     >     >     >     >>>     b. Case "YES"
>     >     >     >     >>>             1) Wait 'timeout' milliseconds.
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >> UGH. No. Read up on mutexes and signals. Also
>     check out
>     >     >     ThreadPool
>     >     >     >     >> implementation
>     >     >     >     >> in cppserv.
>     >     >     >     >>
>     >     >     >     >
>     >     >     >     >
>     >     >     >     > I read up on mysterious mutexes and signals
>     and tried to
>     >     >     learn how
>     >     >     >     > ThreadPool works in cppserv.
>     >     >     >     > The pool job is pretty much similar in both
>     >     ThreadPool and
>     >     >     >     uuDBPool imho.
>     >     >     >     >
>     >     >     >     That is correct. That is why they both that
>     "Pool" in
>     >     their
>     >     >     name :-)
>     >     >     >     > Mutex object will prevent from using the same
>     connection
>     >     >     by more
>     >     >     >     than one
>     >     >     >     > user at the same time. In cthreadpool.cpp you use
>     >     >     sptk::CWaiter'
>     >     >     >     lock()
>     >     >     >     > and unlock() operations to do this kind of job for
>     >     threads.
>     >     >     >     >
>     >     >     >     Correct up to here.
>     >     >     >     > Theoretically speaking, signals are the
>     "notificators"
>     >     >     that sent
>     >     >     >     to a
>     >     >     >     > process, they interrupt whatever the process is
>     >     doing at this
>     >     >     >     moment.
>     >     >     >     You mixed up system-level signals (as in
>     signal(2)), and
>     >     >     inter-thread
>     >     >     >     signals, as implemented by CWaiter::sendSignal() &
>     >     friends.
>     >     >     >     Ideas below are close, but not quite right.
>     Re-think with
>     >     >     this note
>     >     >     >     in mind.
>     >     >     >     >  My
>     >     >     >     > understanding is that in order to free the
>     currently
>     >     used
>     >     >     >     connection and
>     >     >     >     > to return it to the pool, the process has to
>     get a
>     >     signal
>     >     >     that this
>     >     >     >     > connection is no longer used(for example, when
>     user
>     >     leaves the
>     >     >     >     servlet his
>     >     >     >     > connection stays unused for some time that
>     defined
>     >     in the
>     >     >     pool,
>     >     >     >     then is
>     >     >     >     > considered "abandoned"). We would never know about
>     >     it if we
>     >     >     >     wouldn't use
>     >     >     >     > some kind of signal to notify the pool.
>     >     >     >     > So the use case will be:
>     >     >     >     >
>     >     >     >     > 1. Browser sends request to the servlet.
>     >     >     >     > 2. No connections available in the pool.
>     >     >     >     > 3. Is maxCon been reached?
>     >     >     >     >         a. Case "NO"
>     >     >     >     >                 1) Create new connection using
>     >     >     getConnection()'
>     >     >     >     login()
>     >     >     >     > procedure.
>     >     >     >     >                 2) Increament connections
>     counter in
>     >     the pool.
>     >     >     >     >                 3) Return connection to the
>     client.
>     >     >     >     >         b. Case "YES"
>     >     >     >     >                 1) Wait for signal that
>     indicates that
>     >     >     there is an
>     >     >     >     > available connection in the pool
>     >     >     >     >                 2) Increament connections
>     counter in
>     >     the pool.
>     >     >     >     >               3) Return connection to the client.
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >
>     >     >     >     >>>             2) Return to 3.
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >>> Another case:
>     >     >     >     >>>
>     >     >     >     >>> 1. Browser sends request to the servlet.
>     >     >     >     >>> 2. Connections available in the pool.
>     >     >     >     >>> 3. Return a connection using getConnection()
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >>> One more:
>     >     >     >     >>>
>     >     >     >     >>> 1. User leaves the site.
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >> How do we know that?
>     >     >     >     >>
>     >     >     >     >
>     >     >     >     > When user leaves the site his connection stays
>     >     unused for
>     >     >     some
>     >     >     >     time that
>     >     >     >     > defined in the pool, then is considered
>     "abandoned"
>     >     >     >     >
>     >     >     >     What connection? Didn't we decide that
>     connection is
>     >     used during
>     >     >     >     single
>     >     >     >     request?
>     >     >     >     >>> 2. Connection returned to the pool using
>     >     releaseConnection()
>     >     >     >     function
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >> Why did we have a connection?
>     >     >     >     >>
>     >     >     >     >
>     >     >     >     > B/c it was established by the pool when user
>     >     initilly sent a
>     >     >     >     request to
>     >     >     >     > servlet that uses DB connection.
>     >     >     >     >
>     >     >     >     Again, see above. It can't be stressed enough
>     that we
>     >     cannot
>     >     >     open
>     >     >     >     a database
>     >     >     >     connection for each session.
>     >     >     >     >
>     >     >     >     >>> 3. Number of used connections is decremented.
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >>> I'm not sure when shutdown() should be called.
>     >     >     >     >>>
>     >     >     >     >>>
>     >     >     >     >> Why did you introduce it then?
>     >     >     >     >>
>     >     >     >     > I found plenty of articles on internet about what
>     >     typical DB
>     >     >     >     connection
>     >     >     >     > pool should do, some of them have shutdown()-like
>     >     operations,
>     >     >     >     that provide
>     >     >     >     > a method for graceful shutdown. I guess it
>     should be
>     >     used when
>     >     >     >     server is
>     >     >     >     > shut down, it's uuDBPool() responsibility to
>     close all
>     >     >     >     connections in the
>     >     >     >     > pool.
>     >     >     >     >
>     >     >     >     Don't forget about a thing called destructor...
>     >     >     >     >
>     >     >     >     > I have a question:
>     >     >     >     > Is it right idea to use sptk::CWaiter class
>     for my
>     >     DB pool? I
>     >     >     >     checked this
>     >     >     >     > class in sptk3, it has Mutex object and all the
>     >     signal methods
>     >     >     >     that I
>     >     >     >     > need.
>     >     >     >     >
>     >     >     >     Yes.
>     >     >     >     > Also I found a source on codeproject.com
>     <http://codeproject.com>
>     >     < http://codeproject.com>
>     >     >     < http://codeproject.com>
>     >     >     >     <http://codeproject.com> that has implementation of
>     >     >     >     > template based Generic Pool using C++. It does
>     what we
>     >     >     need in
>     >     >     >     the most
>     >     >     >     > generic way, you can check if you'd like to:
>     >     >     >     >
>     http://www.codeproject.com/library/Generic_Pool.asp
>     >     < http://www.codeproject.com/library/Generic_Pool.asp>
>     >     >     >     >
>     >     >     >     I glanced a while back at it. I don't remember what
>     >     license is
>     >     >     >     hooked to
>     >     >     >     that code though. Nor am I sure it'll work too well
>     >     for us, but
>     >     >     >     feel free
>     >     >     >     to study it for ideas.
>     >     >     >
>     >     >     >     --
>     >     >     >     Ilya A. Volynets-Evenbakh
>     >     >     >     Total Knowledge. CTO
>     >     >     >     http://www.total-knowledge.com
>     <http://www.total-knowledge.com>
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     >
>     >     >     > --
>     >     >     > Alexey Parshin,
>     >     >     > http://www.sptk.net
>     >     >
>     >     >     --
>     >     >     Ilya A. Volynets-Evenbakh
>     >     >     Total Knowledge. CTO
>     >     >     http://www.total-knowledge.com
>     <http://www.total-knowledge.com>
>     >     >
>     >     >
>     >     >
>     >     >
>     >     > --
>     >     > Alexey Parshin,
>     >     > http://www.sptk.net
>     >
>     >     --
>     >     Ilya A. Volynets-Evenbakh
>     >     Total Knowledge. CTO
>     >     http://www.total-knowledge.com
>     >
>     >
>     >
>     >
>     > --
>     > Alexey Parshin,
>     > http://www.sptk.net
>
>     --
>     Ilya A. Volynets-Evenbakh
>     Total Knowledge. CTO
>     http://www.total-knowledge.com
>
>
>
>
> -- 
> Alexey Parshin,
> http://www.sptk.net 

-- 
Ilya A. Volynets-Evenbakh
Total Knowledge. CTO
http://www.total-knowledge.com


Authoright © Total Knowledge: 2001-2008