程序辅导案例 > Program >

代写辅导接单-COMP9311 24T2

欢迎使用51辅导，51作业君孵化低价透明的学长辅导平台，服务保持优质，平均费用压低50%以上！ 51fudao.top

Transaction

Management

COMP9311 24T2; Week 8.1

By Zhengyi Yang, UNSW

Transaction

A transaction is a unit of program execution that accesses and possibly

updates various data items.

E.g., transaction to transfer $50 from account A to account B:

1. read(A)

2. A := A – 50

3. write(A)

4. read(B)

5. B := B + 50

6. write(B)

Two main issues to deal with:

➢ Failures of various kinds, such as hardware failures and system crashes

➢ Concurrent execution of multiple transactions

Issue (1)

Concurrent execution of multiple transactions is needed

Why?

➢

i. Multiple users/transactions may read/change the same data

ii.Allowing multiple transactions to update data concurrently can result in complications -

data inconsistency.

Therefore transaction processing systems...

➢

i. need to support multiple transactions at the same time.

ii.usually allow multiple transactions to run concurrently.

Issue (2)

Failures of various kinds

a.System failure:

i. Disk failure - e.g., head crash, media fault.

ii.System crash - e.g., unexpected failure requiring a reboot.

b.Program error:

i. e.g., divide by zero.

c.Exception conditions:

i. e.g., no seats for your reservation.

d.Concurrency control:

i. e.g., deadlock, expired locks.

Transaction Processing Systems need to be robust against failure

Example of Fund Transfer (1)

A transaction is a unit of program execution that accesses and

possibly updates various data items.

Example: A possible transaction to transfer $50 from account A

to account B:

Each transaction typically

1. read(A)

includes some database

access operations

2. A := A – 50

3. write(A)

Operations relevant to

4. read(B) transaction processing:

1. Read

5. B := B + 50

2. Write

6. write(B)

3. Computation

Example of Fund Transfer (2)

Atomicity requirement

If the transaction fails after step 3 and before step 6, money will be “lost”

➢

leading to an inconsistent database state

➢ Failure could be due to software or hardware

The system should ensure that updates of a partially executed

➢

transaction are not reflected in the database (all-or-nothing)

Example:

Transaction to transfer $50

from account A to account B:

1. read(A)

2. A := A – 50

3. write(A)

4. read(B)

5. B := B + 50

6. write(B)

Example of Fund Transfer (3)

Durability requirement

Once the user has been notified that the transaction has completed (i.e.,

➢

the transfer of the $50 has taken place), the updates to the database by

the transaction must persist even if there are software or hardware

failures.

Example:

Transaction to transfer $50

from account A to account B:

1. read(A)

2. A := A – 50

3. write(A)

4. read(B)

5. B := B + 50

6. write(B)

Example of Fund Transfer (4)

Consistency requirement in above example:

➢ The sum of A and B is unchanged by the execution of the transaction

In general, consistency requirements include

➢ Explicitly specified integrity constraints such as primary keys and foreign keys

➢ Implicit integrity constraints

➢ A transaction must see a consistent database.

During transaction execution the database may be temporarily inconsistent.

➢ When the transaction completes successfully, the database must be consistent

➢ Erroneous transaction can lead to inconsistency

Example of Fund Transfer (5)

Isolation requirement — if between steps 3 and 6, another transaction T2 is allowed to access the

partially updated database, it will see an inconsistent database (the sum A + B will be less than it

should be)

T1 T2

1. read(A)

2. A := A – 50

3. write(A)

read(A), read(B), print(A+B)

4. read(B)

5. B := B + 50

6. write(B)

Isolation can be ensured trivially by running transactions serially. That is, one after the other.

However, executing multiple transactions concurrently has significant benefits.

ACID Summary

A transaction is a unit of program execution that accesses and possibly updates various data items. To

preserve the integrity of data, the database system must ensure the ACID property.

➢ Atomicity: Either all operations of the transaction are properly reflected in the database, or none are.

➢ Consistency: Every transaction sees a consistent database.

➢ Isolation: Although multiple transactions may execute concurrently, each transaction must be

unaware of other concurrently executing transactions. Intermediate transaction results must be hidden

from other concurrently executed transactions.

➢ That is, for every pair of transactions T and T, it must appear to T that either T, finished execution

i j i j

before T started, or T started execution after T finished.

i j i

➢ Durability: After a transaction completes successfully, the changes it has made to the database

persist, even if there are system failures.

Transaction States

Active – the initial state; the transaction stays in this state while it is executing

Partially committed – after the final statement has been executed.

Failed -- after the discovery that normal execution can no longer proceed.

Aborted – after the transaction has been rolled back using log and the database restored to its state

prior to the start of the transaction. Two options after it has been aborted:

➢ Restart the transaction

➢ Kill the transaction

Committed – after successful completion.

Concurrent Executions

Multiple transactions are allowed to run concurrently in the system.

The advantages are:

➢ Increased processor and disk utilization

➢ leading to better transaction throughput

➢ E.g., one transaction can be using the CPU while another is reading from or writing to the disk

➢ Reduced average response time for transactions: short transactions need not wait

behind long ones.

Concurrency control schemes – mechanisms to achieve isolation

➢ That is, to control the interaction among the concurrent transactions to prevent them

from destroying the consistency of the database

A Simple Transaction Model

We do not consider the full set of SQL language, and ignore SQL insertion/delete operations.

Two operations:

➢

read(X) to transfer the data item X from database to a variable, also called X in a buffer in main memory.

➢

write(X) to transfer the value in the variable X in the buffer to the data item in the database.

Recall example: transaction transfers $50 from account A to account B:

1. read(A)

2. A := A – 50

3. write(A)

4. read(B)

5. B := B + 50

6. write(B)

More Concepts

Concurrency/Isolation/Schedule/Control

➢ The concurrent execution in a database system is similar to the multiprogramming in

an operating system (OS).

➢ When several transactions run concurrently, the isolation property may be violated.

➢ A schedule can help identify the executions that are guaranteed to ensure the isolation

property and thus database consistency.

➢ A concurrency-control scheme is to control the interaction among the concurrent

transactions to prevent them from destroying the consistency of the database.

Schedules

Schedule – a sequence that specifies the order in which instructions of concurrent

transactions are executed.

➢ A schedule for a set of transactions must consist of all instructions of those transactions

➢

Must preserve the order in which the instructions appear in each individual transaction.

➢ A transaction that successfully completes its execution will have a commit instruction as the

last statement

➢

By default, transaction is assumed to execute commit instruction as its last step

➢ A transaction that fails to successfully complete its execution will have an abort instruction

as the last statement

Two Transactions

Consider that the system receives two transactions

Let T transfer $50 from A to B,

Let T transfer 10% of the balance from A to B.

read (A)

temp := A * 0.1

A := A – 50

A := A - temp

write (A)

read (B)

B := B + 50

B := B + temp

write (B)

commit

Schedule 1

A schedule S is serial if for every T1 T2

read (A)

transaction T in the schedule, all (the

A := A – 50

operations of) T are executed

write (A)

read (B)

consecutively in the schedule.

B := B + 50

write (B)

Example: a serial schedule in which T is

1 commit

followed by T :

read (A)

temp :=A * 0.1

A := A - temp

write (A)

read (B)

B := B + temp

write (B)

commit

Schedule 2

Example: another valid serial schedule where T is followed by T

2 1

T1 T2

read (A)

temp :=A * 0.1

A := A - temp

write (A)

read (B)

B := B + temp

write (B)

commit

read (A)

A := A – 50

write (A)

read (B)

B := B + 50

write (B)

commit

Schedule 3

T1 T2

Let T and T be the transactions given previously.

1 2

read (A)

➢ The following schedule is not a serial schedule A := A – 50

write (A)

➢ but it is equivalent to Schedule 1 read (A)

temp :=A * 0.1

A := A - temp

write (A)

read (B)

B := B + 50

write (B)

In Schedules 1, 2 and 3, the sum A + B is preserved.

commit

read (B)

B := B + temp

write (B)

commit

Schedule 4

T1 T2

The following concurrent schedule:

read (A)

➢ does not preserve the value of (A + B). A := A – 50

read (A)

temp :=A * 0.1

A := A - temp

write (A)

read (B)

write (A)

read (B)

B := B + 50

write (B)

Not all concurrent schedules are desirable

commit

B := B + temp

write (B)

commit

How to Avoid These Problems?

If operations are interleaved arbitrarily, incorrect results may occur.

➢ Isolation can be ensured trivially by running transactions serially.

Question: why not run only serial schedules?

Answer: Because of very poor throughput due to disk latency

➢ If a transaction waits for an I/O operation to complete, we cannot switch the CPU

processor to another transaction, thus wasting valuable CPU processing time.

➢ Additionally, if some transaction T is quite long, the other transactions must wait for T

to complete all its operations before starting.

How to Avoid These Problems?

Question: why run non-serial schedules?

➢ It is desirable to interleave the operations of transactions in an appropriate way.

➢ We can fully utilise resources.

➢ For example, if one transaction is waiting for I/O to complete, another transaction can use the CPU.

Point:

➢ serial schedules are considered unacceptable in practice

➢ executing multiple transactions concurrently has significant benefits.

Motivation (1)

We need to study the notion of correctness of concurrent executions.

➢ Every transaction is executed from beginning to end in isolation from the

operations of other transactions, we get a correct end result on the database.

We first need to define types of schedules that are always considered

to be correct when concurrent transactions are executing.

Motivation (2)

Question: How do we determine if non-serial schedules are

correct?

Intuition: If we can determine which non-serial schedules are

equivalent to a serial schedule, we can allow these schedules to

occur.

Summary

➢ Every serial schedule is considered correct

➢ We can assume this because every transaction is assumed to be correct if executed

on its own (according to the consistency preservation property).

➢ For serial schedules, it does not matter which transaction execute first. They are all

correct.

Basic Assumption – Each transaction preserves database consistency

➢ Therefore, a serial execution of a set of transactions preserves database consistency.

➢ Serial executions are correct

Serializability

A (possibly concurrent) schedule S of n transactions is

serializable if

it is equivalent to some serial schedule of the same n transactions.

➢

There are many notions forms of schedule equivalence give rise

to the notion of Conflict serializability (Will Discuss Later)

A Simplified View of Transactions

We ignore operations other than read and write instructions

We assume that transactions may perform arbitrary computations on data in local buffers in

between reads and writes.

Our simplified schedules consist of only read and write instructions.

An example: For a transaction that transfers $50 from account A to account B:

1. read(A)

We simply considers the

2. A := A – 50

read/write only

3. write(A) read(A)

4. read(B) write(A)

5. B := B + 50 read(B)

write(B)

6. write(B)

Conflicting Instructions

Instructions l and l of different transactions T and T respectively, conflict if and only if there exists

i j i j

some item Q accessed by both l and l, and at least one of these instructions wrote Q.

i j

1. l = read(Q), l = read(Q). l and l don’t conflict.

i j i j

2. l = read(Q), l = write(Q). They conflict.

i j

3. l = write(Q), l = read(Q). They conflict

i j

4. l = write(Q), l = write(Q). They conflict

i j

Intuitively, a conflict between l and l forces a (logical) temporal order between them. i.e., changing

i j

their order can result in a different combined outcome.

If l and l are consecutive in a schedule and they do not conflict, their results would remain the same

i j

even if they had been interchanged in the schedule.

For example, read–read operations are are not conflicting.

Summary: Conflicting Instructions

Summary: Two operations O and O are conflicting if

1 2

➢ They are in different transactions

➢ They access the same data item,

➢ At least one of them must be a write.

Language: The transaction of the second operation in the pair is

said to be in conflict with the transaction of the first operation.

Conflict Equivalence

Two schedules are said to be conflict equivalent if:

➢ the order of any two conflicting operations is the same in both schedules.

Two schedules are conflict equivalent if:

➢ Involve the same actions of the same transactions

➢ Every pair of conflicting actions is ordered the same way

For two schedules to be conflict equivalent:

➢ the operations applied to each data item affected by the schedules should be applied to

that item in both schedules in the same order.

We define equivalence of schedules by conflict equivalence, which is the more

commonly used definition

A better definition of equivalence compared to result equivalence .

Conflict Serializability

Using the notion of conflict equivalence, we define a schedule S to be conflict serializable if it is

(conflict) equivalent to some serial schedule S.

For conflict serializable schedules:

➢ we can reorder the nonconflicting operations in S until we form the equivalent serial schedule S.

This means that if a schedule can be transformed to any serial schedule without changing orders of

conflicting operations (but changing orders of non-conflicting, while preserving operation order inside

each transaction), then the outcome of both schedules is the same, and the schedule is conflict-

serializable by definition.

If a schedule S can be transformed into a schedule S’ by a series of swaps of non-conflicting

instructions, we say that S and S’ are conflict equivalent.

We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.

Conflict Serializability (2)

Schedule 3 can be transformed into Schedule 6, a serial schedule where T follows T , by

2 1

series of swaps of non-conflicting instructions. Therefore Schedule 3 is conflict serializable.

T1 T2 T1 T2

read (A) read (A)

write (A) write (A)

read (A) read (B)

write (A) write (B)

read (B) read (A)

write (B) write (A)

read (B) read (B)

write (B) write (B)

Schedule 3 Schedule 6

Note: any conflict serializable schedule is also a serializable schedule

Conflict Serializability (3)

Example of a schedule that is not conflict serializable:

T3 T4

read (Q)

write (Q)

We are unable to swap instructions in the above schedule to obtain

either the serial schedule , or the serial schedule .

3 4 4 3

Note: Not all schedules are conflict serializable.

Allow Some Concurrent Schedules

Now we characterized the types of schedules that are always

considered to be correct when concurrent transactions are

executing.

The concept of serializability of schedules is used to identify

which schedules are correct when transaction executions have

interleaving of their operations in the schedules.

Since serial schedules are not practical, we can allow conflict

serializable schedules since they are correct!

Allow Some Concurrent Schedules

Saying that a non-serial schedule S is serializable is equivalent to

saying that it is correct

because it is equivalent to a serial schedule, which is always

➢

considered correct.

Practice:

Is a serializable schedule correct?

Is being serializable the same as being serial.

Is a non-serializable schedule correct?

Is a nonserial schedule correct?

Learning Outcomes

Transaction Concept

➢

Transaction State

➢

Concurrent Executions

➢

Serializability

➢

A nonserial schedules can be one of the following case:

➢

➢ those that are equivalent to one (or more) of the serial schedules

➢ those that are not equivalent to any serial schedule and hence are not

serializable.