Channel: Teradata Downloads

Why don't you use ALTER TABLE to alter a table?


Adding or dropping a column, or modifying the list of compressed values, of an existing table is quite an expensive operation. For a large table it might result in a huge amount of CPU and IO usage and a loooooooong runtime. This blog discusses the pros and cons of the different ways to do it.

Alter Table vs. Insert Select vs. Merge Into

As always in SQL, you've got multiple choices to reach the same goal: modify a table directly or move the data to a new table. The former is ALTER TABLE (Alter), the latter INSERT SELECT (InsSel) or its less well-known variation MERGE INTO (Merge).

Let's start with a list of pros and cons:

                                    ALTER TABLE               INSERT SELECT         MERGE INTO
Needs Transient Journal?            no                        no                    no
ABORT possible?                     no                        yes (fast)            yes (fast)
Rollback during system restart?     no                        yes (fast)            yes (fast)
LOCK on source table                exclusive                 read                  read
Spoolspace used                     no                        yes, same as source   no
Additional Permspace used           low, 2 cylinders per AMP  high, same as source  high, same as source
Works on a table copy?              no                        yes                   yes
Must Create/Drop/Rename Table?      no                        yes                   yes
Must recreate Secondary/Hash/Join   no                        yes                   yes
  Indexes, Foreign Keys/Statistics/
  Comments, Access Rights?
Supports changing Primary Index/    no                        yes                   yes
  Partitioning?

You can easily spot that InsSel and Merge are quite similar, but Alter is usually different.

The only common ground is the Transient Journal: none of the three uses it (of course there are some entries indicating that there's some work going on, but the actual rows are not journaled). Because of that, InsSel and Merge can easily be aborted and will roll back quite fast (just deleting all rows in the target table). But once Alter has started it must finish; there's no way to abort it. Even a system shutdown can't stop it, it will simply continue after the restart. Some will consider this positive, others negative :-)

The most important difference is the availability during the restructure process: both InsSel and Merge apply a read lock, allowing concurrent read access, while Alter needs an exclusive lock, blocking any access to the target table. That's the main reason why Alter is not used in most environments. Additionally, before TD13 there was a table-level write lock on dbc.AccessRights which was held throughout the whole process, easily blocking other sessions. In current releases this lock duration has been greatly reduced, so other requests will only be blocked for a short period. Some additional RowHash locks on system tables usually don't interfere with other requests, but might block backups.

Neither Alter nor Merge uses Spool: Alter moves blocks at the cylinder level and Merge directly merges the source rows into the target table. InsSel, however, always needs to spool the source data, which is especially bad for large tables when Explain shows "The result spool file will not be cached in memory".
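As a rough sketch of the Merge variant of the copy step: the statement below assumes an empty target table mydb.tab_new already exists with the new layout and a Primary Index on id. All database, table and column names are hypothetical. Teradata expects the ON clause to be an equality on the target's Primary Index (plus the partitioning columns of a PPI table), and the inserted Primary Index value to match it.

```sql
-- Hypothetical names: mydb.tab (source), mydb.tab_new (empty target, PI on id).
-- The target is empty, so every source row falls into WHEN NOT MATCHED.
MERGE INTO mydb.tab_new AS tgt
USING (
        SELECT id, existing_col
        FROM   mydb.tab
      ) AS src
   ON tgt.id = src.id                 -- equality on the target's Primary Index
WHEN NOT MATCHED THEN
   INSERT (id, existing_col)
   VALUES (src.id, src.existing_col); -- new_col is left NULL
```

Running EXPLAIN on this statement and on the equivalent INSERT SELECT is a quick way to see whether a spool step shows up on your release.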

Keeping a copy of the original table is often regarded as an advantage of Merge and InsSel ("just in case"), but when you're constrained on permspace you might prefer Alter's low overhead of a few megabytes per AMP.

However, the biggest advantage of Alter is its simplicity: just submit "ALTER TABLE tab ADD new_col int, ADD existing_col COMPRESS ('bla');" and that's it.
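Spelled out over several lines, such a statement might look like the sketch below. The names mydb.tab, new_col, existing_col and obsolete_col are hypothetical, and the DROP clause is only there to show that adding, recompressing and dropping columns can be combined in one request.

```sql
-- Hypothetical table and column names.
ALTER TABLE mydb.tab
   ADD new_col INTEGER,               -- add a new column
   ADD existing_col COMPRESS ('bla'), -- ADD on an existing column changes its compress list
   DROP obsolete_col;                 -- drop a column that is no longer needed
```

Remember that the statement takes an exclusive lock and cannot be aborted once it has started.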

Compare this to all the additional steps needed for InsSel or Merge. It's not only CREATE/DROP/RENAME: all those COMMENTs, GRANTs and COLLECT STATS must be scripted beforehand and then reapplied, too. Maintaining Referential Integrity might be complicated when the table is referenced in a Foreign Key. And to speed up processing, the target table will be created with the Primary Index only; any additional index must be recreated subsequently.
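To make the extra steps concrete, here is a minimal sketch of the InsSel route for the same change. Everything in it is an assumption for illustration: the names (mydb.tab, tab_new, tab_old, some_role), the column definitions, and the single comment, grant and statistic. A real table will need its full list of comments, access rights, statistics and indexes reapplied.

```sql
-- 1. Create the target with the new layout, Primary Index only (hypothetical DDL).
CREATE TABLE mydb.tab_new (
   id           INTEGER NOT NULL,
   existing_col CHAR(10) COMPRESS ('bla'),
   new_col      INTEGER
) PRIMARY INDEX (id);

-- 2. Copy the data (a read lock on the source is enough).
INSERT INTO mydb.tab_new (id, existing_col)
SELECT id, existing_col
FROM   mydb.tab;

-- 3. Swap the tables.
RENAME TABLE mydb.tab     TO mydb.tab_old;
RENAME TABLE mydb.tab_new TO mydb.tab;

-- 4. Reapply what was scripted beforehand.
COMMENT ON TABLE mydb.tab IS 'hypothetical table comment';
GRANT SELECT ON mydb.tab TO some_role;
COLLECT STATISTICS ON mydb.tab COLUMN (id);

-- 5. Drop the old copy once you no longer need the "just in case" backup.
DROP TABLE mydb.tab_old;
```

For the Merge route, only step 2 changes; all the surrounding housekeeping stays the same.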

Resource usage and runtime

I'm not showing exact numbers because your mileage may vary, but for tables without secondary indexes the CPU/IO ranking (lowest usage first) is usually:

  1. Alter Table
  2. Merge Into
  3. Insert Select

In my test cases Merge needed almost twice the CPU and IO of an Alter and InsSel added another 20%.

When Secondary/Hash/Join indexes exist InsSel gets closer to Merge but the gap to Alter increases drastically: Alter still needs to modify only the base rows instead of re-building all the indexes.

Runtime differences should be similar to CPU/IO, but they will vary greatly amongst systems due to different bottlenecks, so you should run some tests on your own system.
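One possible way to compare your own test runs, assuming DBQL query logging is enabled for the user running them, is to read the figures back from the query log. The user name TEST_USER is hypothetical, and the view and column names should be checked against your release.

```sql
-- Assumes DBQL logging is active for the (hypothetical) user TEST_USER.
SELECT StartTime,
       StatementType,
       AMPCPUTime,                                      -- CPU seconds summed over all AMPs
       TotalIOCount,                                    -- logical IOs
       SpoolUsage,                                      -- spool used by the request
       (FirstRespTime - StartTime) DAY(4) TO SECOND AS Runtime,
       SUBSTRING(QueryText FROM 1 FOR 60) AS QueryStart -- enough to tell the runs apart
FROM   DBC.QryLogV
WHERE  UserName = 'TEST_USER'
  AND  CAST(StartTime AS DATE) = CURRENT_DATE
ORDER BY StartTime;
```

Run all three variants against copies of the same table so the numbers are comparable.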

Conclusion

I would strongly recommend using Alter Table, or at least starting to consider it. If you're concerned about availability, bear in mind that this process will probably be scheduled outside business hours anyway.

And when you need to change the [P]PI, or you just want the safety of a copy of the old table, you should definitely prefer Merge Into over good ol' Insert Select.
