數(shù)據(jù)庫設(shè)計外文翻譯_第1頁
已閱讀1頁,還剩12頁未讀, 繼續(xù)免費閱讀

下載本文檔

版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報或認(rèn)領(lǐng)

文檔簡介

1、<p><b>  外文翻譯:</b></p><p><b>  索 引</b></p><p>  原文來源:Thomas Kyte.Expert Oracle Database Architecture .2nd Edition. </p><p><b>  譯文正文:</b><

2、;/p><p>  什么情況下使用B*樹索引?</p><p>  我并不盲目地相信“法則”(任何法則都有例外),對于什么時候該用B*索引,我沒有經(jīng)驗可以告訴你。為了證明為什么這個方面我無法提供任何經(jīng)驗,下面給出兩種等效作法:</p><p>  ? 使用B*樹索引,如果你想通過索引的方式去獲得表中所占比例很小的那些行。</p><p>  ?

3、使用B *樹索引,如果你要處理的表和索引許多可以代替表中使用的行。</p><p>  這些規(guī)則似乎提供相互矛盾的意見,但在現(xiàn)實中,他們不是這樣的,他們只是涉及兩個極為不同的情況。有兩種方式使用上述意見給予索引:</p><p>  ? 作為獲取表中某些行的手段。你將讀取索引去獲得表中的某一行。在這里你想獲得表中所占比例很小的行。</p><p>  ? 作為獲取查

4、詢結(jié)果的手段。這個索引包含足夠信息來回復(fù)整個查詢,我們將不用去查詢?nèi)?。這個索引將作為該表的一個瘦版本。</p><p>  還有其他方式—例如,我們使用索引去檢索表的所有行,包括那些沒有建索引的列。這似乎違背了剛提出的兩個規(guī)則。這種方式獲得將是一個真正的交互式應(yīng)用程序。該應(yīng)用中,其中你將獲取其中的某些行,并展示它們,等等。你想獲取的是針對初始響應(yīng)時間的查詢優(yōu)化,而不是針對整個查詢吞吐量的。</p>

5、<p>  在第一種情況(也就是你想通過索引獲得表中一小部分的行)預(yù)示著如果你有一個表T(使用與早些時候使用過的相一致的表T),然后你獲得一個像這樣查詢的執(zhí)行計劃:</p><p>  ops$tkyte%ORA11GR2> set autotrace traceonly explain</p><p>  ops$tkyte%ORA11GR2> select ow

6、ner, status</p><p><b>  2 from t</b></p><p>  3 where owner = USER;</p><p>  Execution Plan</p><p>  -----------------------------------------------------

7、-----</p><p>  Plan hash value: 1049179052</p><p>  ------------------------------------------------------------------</p><p>  | Id | Operation | Name

8、 | Rows | Bytes |</p><p>  ------------------------------------------------------------------</p><p>  | 0 | SELECT STATEMENT | | 2120 | 23320 |</p><

9、p>  | 1 | TABLE ACCESS BY INDEX ROWID |T | 2120 | 23320 |</p><p>  | *2 | INDEX RANGE SCAN | DESC_T_IDX | 8 | |</p><p>  ------------------------------

10、------------------------------------</p><p>  Predicate Information (identified by operation id):</p><p>  ---------------------------------------------------</p><p>  2 - access(SY

11、S_OP_DESCEND("OWNER")=SYS_OP_DESCEND(USER@!))</p><p>  filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("OWNER"))=USER@!)</p><p>  你應(yīng)該訪問到該表的一小部分。這個問題在這里看是INDEX (RANGE SCAN) 緊跟在TABLE AC

12、CESS BY INDEX ROWID之后。這也意味著Oracle先讀取索引,然后獲取索引項。該索引項將執(zhí)行一個數(shù)據(jù)庫塊讀(邏輯或者物理的I/O)去獲取行數(shù)據(jù)。如果你想通過索引去訪問數(shù)據(jù)表T中的大部分?jǐn)?shù)據(jù),這不是最高效的方式(我們將很快定義什么是大部分的數(shù)據(jù))。</p><p>  第二種情況,(也就是你想通過索引去代替表),你將通過索引去處理100%(事實上可以是任何比例)的行。也許你想通過索引索引去獲得一個縮

13、小版的表。接下來的查詢證明了這種方式:</p><p>  ops$tkyte%ORA11GR2> select count(*)</p><p><b>  2 from t</b></p><p>  3 where owner = user;</p><p>  Execution Plan</p&

14、gt;<p>  ----------------------------------------------------------</p><p>  Plan hash value: 293504097</p><p>  --------------------------------------------------------------------------

15、-</p><p>  |Id | Operation. | Name | Rows | Bytes .| Cost (%CPU)| Time |</p><p>  ---------------------------------------------------------------------------</p><p>

16、  | 0 | SELECT STATEMENT | | 1 | 6 | 17 (0) | 00:00:01 |</p><p>  | 1 | SORT AGGREGAT E | | 1  | 6 | .. | .|</p><p>  | * 2 | INDEX RANGE SCAN |

17、 T_IDX | 2 120 | 12720 | 17 (0) | 00: 00: 01 |</p><p>  ---------------------------------------------------------------------------</p><p>  Predicate Information (identified by operation

18、id):</p><p>  ---------------------------------------------------</p><p>  2 - access("OWNER"=USER@!)</p><p>  這里,僅僅是使用索引去作為查詢的返回集-現(xiàn)在再也不在乎我們只通過索引的方式,想訪問多少比例的行。從執(zhí)行計劃中可以看到,查

19、詢語句從未訪問過表,僅僅掃描索引結(jié)構(gòu)本身。</p><p>  理解兩種概念的區(qū)別很重要。當(dāng)執(zhí)行TABLE ACCESS BY ROWID操作時,我們必須確保僅訪問表中一小部分的塊,也就相當(dāng)于僅訪問一小部分的行或者是盡量塊地獲取第一的數(shù)據(jù)。(最終的用戶將會為了這幾行數(shù)據(jù)等得不耐煩的)。如果想通過訪問比較高比例的行(所占比例高于20%),使用B*索引的話,它將花費比全表掃描更多的時間。</p><

20、;p>  使用第二種查詢方式,那些在索引中可以找到所需結(jié)果的,情況就完全不同了。我們讀取索引塊,然后拾取其中的很多行進(jìn)行處理,如此繼續(xù)下一個索引塊,從不訪問表。某些情況下,還可以在索引上執(zhí)行一個快速全面掃描??焖偃鎾呙枋侵福瑪?shù)據(jù)庫不按特定的順序讀取索引塊,只是開始讀取它們。這里不再是將索引只當(dāng)一個索引,此時更像是一個表。如果采用全表掃描,將不會按索引項來順序獲取行。</p><p>  一般來講,B*樹索

21、引將會被放在查詢時頻繁使用的列上。而且我們希望從表中只返回少量的數(shù)據(jù)或者最終用戶的請求想立即得到反饋。在一個瘦表(也就是一個含有很少的列或者列很?。┲?,這個比例可能很小。一個查詢,使用該索引應(yīng)該可以在表中取回約2% ~ 3%或更少的行。在一個胖表(也就是含有很多列或者列很寬)中,這個比例將一直上升到該表的20%~25%。以上建議并不對每個人都有作用。這個比例并不直觀,但很精確。索引根據(jù)索引鍵進(jìn)行排序存儲。索引會按鍵的有序順序進(jìn)行訪問。索

22、引指向的塊都隨機(jī)存儲在堆中。因此,我們通過索引訪問表時,會執(zhí)行大類分散、隨機(jī)的I/O。這里的“分散”是指,索引會告訴我們讀取塊1,然后是塊1000,塊205,塊1,塊1032,塊1等等。它們不會要求我們按照塊1,塊2然后塊3的方式。我們將以一種非常隨意的方式讀取和重新讀取塊,這種塊I/O可能非常慢。</p><p>  讓我們看一下簡化版的例子,假設(shè)我們通過索引讀取一個瘦表,而且要讀取表中的20%的行。若這個表中

23、有100000行,這個表得20%就是20000行。如果行大小約為80個字節(jié),在一個塊大小為8KB的數(shù)據(jù)庫中,我們將在每個塊中獲得100行數(shù)據(jù)。這也就意味著這個表有1000個塊。了解這些,計算起來就很容易了。我們想通過索引去讀取2000行,這也就意味著幾乎相當(dāng)于20000次的TABLE ACCESS BY ROWID 操作。這將導(dǎo)致執(zhí)行這個操作要處理20000個表塊。不過,這個表總共才只有1000塊。我們將對表的每個塊要執(zhí)行讀和處理20次

24、。即時把行的大小提高到一個數(shù)量級,達(dá)到每行800字節(jié),這樣每塊有10行,那樣這個表現(xiàn)在有10000塊。要通過索引20000行,仍要求我們把每一塊平均讀取2次。在這種情況下,全表掃描就比使用索引高效得多。因為每個塊只會命中一次。如果把查詢使用這個索引來訪問數(shù)據(jù),效率都不會高,除非對應(yīng)800字節(jié)的行,平均只訪問表中不到5%的數(shù)據(jù)(這樣一來,我們訪問的大概為5000塊),如果是80字節(jié)的行,則訪問的數(shù)據(jù)應(yīng)當(dāng)只占更小的百分比(大約0.5%或更少

25、)。</p><p>  什么情況下使用位圖索引?</p><p>  位圖索引是最適合于低相異基數(shù)數(shù)據(jù)的情形(也就是說,與整個數(shù)據(jù)集得基數(shù)相比,這個數(shù)據(jù)只有很少幾個不同的值)。對此作出量化是不太可能的——換句話說,就是很難定義這個低相異基數(shù)數(shù)據(jù)有到底多么不同。在一個有幾千條記錄的數(shù)據(jù)集中,2就是一個低相異基數(shù),但是在一個只有兩行記錄的數(shù)據(jù)表中,2就不再是低相異基數(shù)了。而在一個上千萬或者上

26、億條記錄的表中,甚至100,000都能作為一個低相異基數(shù)。所以,多大才算是低相異基數(shù),這要相對于結(jié)果集得大小來說。這里是指行集中不同項的個數(shù)除以行數(shù)應(yīng)該是一個很小的數(shù)(接近于0)。例如,GENDER列可能取值為M、F和NULL。如果一個表中有20,000條員工記錄,那么你將會發(fā)現(xiàn)3/20,000=0.00015。同樣地,10,000,000中100,000個不同值得比例為0.01,——同樣,值很小。這些列就可以建立位圖索引。他們可能不合

27、適建立B*樹索引,因為每個值可能會獲取表中的大量數(shù)據(jù)。如同前面所述,B*數(shù)索引一般來講是選擇性的。位圖索引不帶有選擇性的——相反,一般是“沒有選擇性”的。</p><p>  位圖索引在有很多即時查詢的時候極其有用,尤其是在查詢涉及很多列或者會生成諸如COUNT之類的聚會。例如,假設(shè)有一個含有GENDER,LOCATION,和AGE_GROUP三個字段的大表。在這個表中,GENDER的值為M或者F,LOCATIO

28、N可以選取1到50之間的值,AGE_GROUP為代表18歲及以下,19-25,26-30,31-40,和40歲及以上的代碼?,F(xiàn)在不得不通過以下的方式執(zhí)行大量的即時查詢:</p><p>  Select count(*)</p><p><b>  from T</b></p><p>  where gender = 'M'&

29、lt;/p><p>  and location in ( 1, 10, 30 )</p><p>  and age_group = '41 and over';</p><p><b>  select *</b></p><p><b>  from t</b></p>

30、<p>  where ( ( gender = 'M' and location = 20 )</p><p>  or ( gender = 'F' and location = 22 ))</p><p>  and age_group = '18 and under';</p><p>  sel

31、ect count(*) from t where location in (11,20,30);</p><p>  select count(*) from t where age_group = '41 and over' and gender = 'F';</p><p>  你會發(fā)現(xiàn),這里用傳統(tǒng)的B*樹索引是沒用的。如果想通過索引獲取結(jié)果集,你將件

32、至少組合3~6個B*樹的索引獲取數(shù)據(jù)。從任意的3列或任何3列的子集將會出現(xiàn)。你將需要會在以下列建立大量串聯(lián)B*數(shù)索引:</p><p>  ? GENDER, LOCATION, AGE_GROUP:對應(yīng)使用三列的查詢,或者使用GENDER和LOCATION的查詢或者單獨使用GENDER的查詢</p><p>  ? LOCATION,AGE_GROUP:對應(yīng)使用LOCATION和AG

33、E_GROUP或只用LOCATION的查詢。</p><p>  ? AGE_GROUP,GENDER:對應(yīng)只用了AGE_GROUP和GENDER或者只用AGE_GROUP的查詢。</p><p>  為檢索被檢索到數(shù)據(jù)量,還可以有其他排列,以減少所掃描索引結(jié)構(gòu)的大小。這是因為在此忽略了這樣一個重要事實:對這種低基數(shù)數(shù)據(jù)建B*樹索引是不明智的。</p><p> 

34、 這里位圖索引就派上用場了。利用分別建立在各個列上的3個較小的位圖索引,就能高效地滿足前面的所有條件。Oracle只需使用函數(shù)AND,OR和NOT,利用位圖將這三個索引放在一起,就能得到引用該三列的結(jié)果集。它會得到合并后的位圖,如果必要還可以將位圖中的“1”轉(zhuǎn)換為rowid來訪問數(shù)據(jù)(如果只是統(tǒng)計與條件匹配的行數(shù),Oracle就只會統(tǒng)計“1”位的個數(shù))。下面來看一個例子。首先,生成一些測試數(shù)據(jù)(滿足我們指定的相異基數(shù)),建立索引,我們將

35、得到DBMS_RANDOM包來生成我們的分布要求的隨機(jī)數(shù)據(jù):</p><p>  ops$tkyte%ORA11GR2> create table t</p><p>  2 ( gender not null,</p><p>  3 location not null,</p><p>  4 age_group not nu

36、ll,</p><p><b>  5 data</b></p><p><b>  6 )</b></p><p><b>  7 as</b></p><p>  8 select decode( ceil(dbms_random.value(1,2)),</

37、p><p>  9 1, 'M',</p><p>  10 2, 'F' ) gender,</p><p>  11 ceil(dbms_random.value(1,50)) location,</p><p>  12 decode( ceil(dbms_random.value(1,5)),<

38、;/p><p>  13 1,'18 and under',</p><p>  14 2,'19-25',</p><p>  15 3,'26-30',</p><p>  16 4,'31-40',</p><p>  17 5,'41

39、 and over'),</p><p>  18 rpad( '*', 20, '*')</p><p>  19 from big_table.big_table</p><p>  20 where rownum <= 100000;</p><p>  Table created.&

40、lt;/p><p>  ops$tkyte%ORA11GR2> create bitmap index gender_idx on t(gender);</p><p>  Index created.</p><p>  ops$tkyte%ORA11GR2> create bitmap index location_idx on t(location);

41、</p><p>  Index created.</p><p>  ops$tkyte%ORA11GR2> create bitmap index age_group_idx on t(age_group);</p><p>  Index created.</p><p>  ops$tkyte%ORA11GR2> exec

42、 dbms_stats.gather_table_stats( user, 'T');</p><p>  PL/SQL procedure successfully completed.</p><p>  現(xiàn)在我們來看看前面各個即時查詢的相應(yīng)查詢計劃:</p><p>  ops$tkyte%ORA11GR2> Select count(*

43、)</p><p><b>  2 from T</b></p><p>  3 where gender = 'M'</p><p>  4 and location in ( 1, 10, 30 )</p><p>  5 and age_group = '41 and over

44、9;;</p><p>  Execution Plan</p><p>  ----------------------------------------------------------</p><p>  Plan hash value: 1811480857</p><p>  --------------------------

45、----------------------------------------------------------</p><p>  | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|</p><p>  ---------------------------------------------------------------

46、---------------------</p><p>  | 0 | SELECT STATEMENT | | 1 | 13 | 5 (0)|</p><p>  | 1 | SORT AGGREGATE | | 1 | 13 | |</p><p>  | 2 | BITMAP CONVERSION COUNT | | 1 | 13 | 5 (0)|<

47、/p><p>  | 3 | BITMAP AND | | | | |</p><p>  |* 4 | BITMAP INDEX SINGLE VALUE | GENDER_IDX | | | |</p><p>  | 5 | BITMAP OR | | | | |</p><p>  |* 6 | BITMAP INDEX SINGLE V

48、ALUE| LOCATION_IDX | | | |</p><p>  |* 7 | BITMAP INDEX SINGLE VALUE| LOCATION_IDX | | | |</p><p>  |* 8 | BITMAP INDEX SINGLE VALUE| LOCATION_IDX | | | |</p><p>  |* 9 | BITMAP IND

49、EX SINGLE VALUE | AGE_GROUP_IDX | | | |</p><p>  ------------------------------------------------------------------------------------</p><p>  Predicate Information (identified by operation id):

50、</p><p>  ---------------------------------------------------</p><p>  4 - access("GENDER"='M')</p><p>  6 - access("LOCATION"=1)</p><p>  7

51、 - access("LOCATION"=10)</p><p>  8 - access("LOCATION"=30)</p><p>  9 - access("AGE_GROUP"='41 and over')</p><p>  這個例子顯示了位圖索引的強(qiáng)大能力。Oracle可以看到

52、在(1,10,30)中的位置和知道讀取賦予3個值的位置的索引,且在位圖中對這些“位”進(jìn)行邏輯OR運算。然后將所得到的位圖與AGE_GROUP=’41 AND OVER’和GENDER=’M’的相應(yīng)位圖執(zhí)行邏輯AND運算。再統(tǒng)計“1”的個數(shù),這樣就得到了答案:</p><p>  ops$tkyte%ORA11GR2> select *</p><p><b>  2 fr

53、om t</b></p><p>  3 where ( ( gender = 'M' and location = 20 )</p><p>  4 or ( gender = 'F' and location = 22 ))</p><p>  5 and age_group = '18 and unde

54、r';</p><p>  Execution Plan</p><p>  ----------------------------------------------------------</p><p>  Plan hash value: 906765108</p><p>  -----------------------

55、----------------------------------------------------------</p><p>  | Id | Operation | Name | Rows | Bytes |Cost(%C)|</p><p>  -------------------------------------------------------------------

56、--------------</p><p>  | 0 | SELECT STATEMENT | | 510 | 16830 |78 (0)|</p><p>  | 1 | TABLE ACCESS BY INDEX ROWID | T | 510 | 16830 |78 (0)|</p><p>  | 2 | BITMAP CONVERSION TO ROW

57、IDS | | | | |</p><p>  | 3 | BITMAP AND | | | | |</p><p>  |* 4 | BITMAP INDEX SINGLE VALUE | AGE_GROUP_IDX | | | |</p><p>  | 5 | BITMAP OR | | | | |</p><p>  | 6 | BI

58、TMAP AND | | | | |</p><p>  |* 7 | BITMAP INDEX SINGLE VALUE| LOCATION_IDX | | | |</p><p>  |* 8 | BITMAP INDEX SINGLE VALUE| GENDER_IDX | | | |</p><p>  | 9 | BITMAP AND | | | | |&

59、lt;/p><p>  |* 10 | BITMAP INDEX SINGLE VALUE| GENDER_IDX | | | |</p><p>  |* 11 | BITMAP INDEX SINGLE VALUE| LOCATION_IDX | | | |</p><p>  --------------------------------------------

60、-------------------------------------</p><p>  Predicate Information (identified by operation id):</p><p>  ---------------------------------------------------</p><p>  4 - access(&

61、quot;AGE_GROUP"='18 and under')</p><p>  7 - access("LOCATION"=22)</p><p>  8 - access("GENDER"='F')</p><p>  10 - access("GENDER"

62、='M')</p><p>  11 - access("LOCATION"=20)</p><p>  這個邏輯與前面是類似的。由計劃顯示:這里執(zhí)行邏輯OR的兩個條件是通過AND適當(dāng)?shù)奈粓D邏輯計算得到的,然后再對這些結(jié)果進(jìn)行OR運算。再加上一個AND去滿足AGE_GROUP='18 AND UNDER',那樣就獲得了滿足條件的結(jié)果。由于

63、這一次要請求具體的行,索引Oracle將轉(zhuǎn)換每一個位圖1和0為rowid,來獲取源數(shù)據(jù)。</p><p>  在數(shù)據(jù)倉庫或者一個支持很多即時SQL查詢的大型報告系統(tǒng)中,能同時合理地使用盡可能多的索引確實很有作用。在這里將幾乎不使用傳統(tǒng)的B*樹索引,或者是根本上無用。當(dāng)被即時查詢的列數(shù)增加時,B*樹索引的組合數(shù)也就同時增加了。</p><p>  然而,有些時候使用位圖索引也是不合適的。它們

64、在讀密集型環(huán)境中是運行很好的,但是在寫密集型環(huán)境中運行效果極度差。原因是一個位圖索引鍵條目對應(yīng)多行。如果一個會話修改所索引到的數(shù)據(jù),那么所有被索引到的行將受影響,被鎖定。Oracle無法鎖一個索引項中的某一位,它鎖住整個索引項。倘若其他修改也需要更新同樣的這個位圖索引項,它將被鎖在外面。由于每次更新都會無意識地鎖住很多行,阻止它們的位圖列被及時地更新,這樣將大大影響并發(fā)性。在此不是像你所想的那樣鎖定每一行——只是鎖定部分。位圖是以大塊地

65、進(jìn)行存儲,所以使用前面的EMP就可以看到,索引鍵ANALYST在索引中出現(xiàn)了多次,每次都是指向數(shù)百行。更新一行時如果修改了JOB列,則需要獨占地訪問其中兩個索引項:對應(yīng)老值的索引項和對應(yīng)于新值得索引項。這兩個條目指向的數(shù)百行就不允許其他會話修改直到UPDATE提交。</p><p><b>  原文正文:</b></p><p>  When Should You U

66、se a B*Tree Index?</p><p>  Not being a big believer in “rules of thumb” (there are exceptions to every rule), I don’t have any rules of thumb for when to use (or not to use) a B*Tree index. To demonstrate w

67、hy I don’t have any rules of thumb for this case, I’ll present two equally valid ones:</p><p>  ? Only use B*Tree to index columns if you are going to access a very small percentage of the rows in the table

68、via the index.</p><p>  ? Use a B*Tree index if you are going to process many rows of a table and the index can be used instead of the table.</p><p>  These rules seem to offer conflicting advic

69、e, but in reality, they do not—they just cover two extremely different cases. There are two ways to use an index given the preceding advice:</p><p>  ? As the means to access rows in a table: You will read t

70、he index to get to a row in the table. Here you want to access a very small percentage of the rows in the table.</p><p>  ? As the means to answer a query: The index contains enough information to answer the

71、 entire query—we will not have to go to the table at all. The index will be used as a thinner version of the table.</p><p>  There are other ways as well—for example, we could be using an index to retrieve a

72、ll of the rows in a table, including columns that are not in the index itself. That seemingly goes counter to both rules just presented. The case in which that would be true would be an interactive application where you

73、are getting some of the rows and displaying them, then some more, and so on. You want to have the query optimized for initial response time, not overall throughput.</p><p>  The first case (i.e., use the ind

74、ex if you are going to access a small percentage of the table) says if you have a table T (using the same table T from earlier) and you have a query plan that looks like this</p><p>  ops$tkyte%ORA11GR2>

75、set autotrace traceonly explain</p><p>  ops$tkyte%ORA11GR2> select owner, status</p><p><b>  2 from t</b></p><p>  3 where owner = USER;</p><p>  Exec

76、ution Plan</p><p>  ----------------------------------------------------------</p><p>  Plan hash value: 1049179052</p><p>  --------------------------------------------------------

77、----------</p><p>  | Id | Operation | Name | Rows | Bytes |</p><p>  ------------------------------------------------------------------</p><p>  

78、| 0 | SELECT STATEMENT | | 2120 | 23320 |</p><p>  | 1 | TABLE ACCESS BY INDEX ROWID |T | 2120 | 23320 |</p><p>  | *2 | INDEX RANGE SCAN

79、 | DESC_T_IDX | 8 | |</p><p>  ------------------------------------------------------------------</p><p>  Predicate Information (identified by operation id):</p><p>  -

80、--------------------------------------------------</p><p>  2 - access(SYS_OP_DESCEND("OWNER")=SYS_OP_DESCEND(USER@!))</p><p>  filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("OWNER"

81、))=USER@!)</p><p>  you should be accessing a very small percentage of this table. The issue to look at here is the INDEX (RANGE SCAN) followed by the TABLE ACCESS BY INDEX ROWID. This means that Oracle will

82、 read the index and then, for the index entries, it will perform a database block read (logical or physical I/O) to get the row data. This is not the most efficient method if you are going to have to access a large perce

83、ntage of the rows in T via the index (we will soon define what a large percentage might be).</p><p>  In the second case (i.e., when the index can be used instead of the table), you can process 100 percent (

84、or any percentage, in fact) of the rows via the index. You might use an index just to create a thinner version of a table. The following query demonstrates this concept:</p><p>  ops$tkyte%ORA11GR2> selec

85、t count(*)</p><p><b>  2 from t</b></p><p>  3 where owner = user;</p><p>  Execution Plan</p><p>  ------------------------------------------------------

86、----</p><p>  Plan hash value: 293504097</p><p>  ---------------------------------------------------------------------------</p><p>  |Id | Operation. | Name | Rows | B

87、ytes .| Cost (%CPU)| Time |</p><p>  ---------------------------------------------------------------------------</p><p>  | 0 | SELECT STATEMENT | | 1 | 6 | 17 (0) | 00:00

88、:01 |</p><p>  | 1 | SORT AGGREGAT E | | 1  | 6 | .. | .|</p><p>  | * 2 | INDEX RANGE SCAN | T_IDX | 2 120 | 12720 | 17 (0) | 00: 00: 01 |</p><p&

89、gt;  ---------------------------------------------------------------------------</p><p>  Predicate Information (identified by operation id):</p><p>  -------------------------------------------

90、--------</p><p>  2 - access("OWNER"=USER@!)</p><p>  Here, only the index was used to answer the query—it would not matter now what percentage of rows we were accessing, as we would u

91、se the index only. We can see from the plan that the underlying table was never accessed; we simply scanned the index structure itself.</p><p>  It is important to understand the difference between the two c

92、oncepts. When we have to do a TABLE ACCESS BY INDEX ROWID, we must ensure we are accessing only a small percentage of the total blocks in the table, which typically equates to a small percentage of the rows, or that we n

93、eed the first rows to be retrieved as fast as possible (the end user is waiting for them impatiently). If we access too high a percentage of the rows (larger than somewhere between 1 and 20 percent of the rows), then i&l

94、t;/p><p>  With the second type of query, where the answer is found entirely in the index, we have a different story. We read an index block and pick up many rows to process, then we go on to the next index blo

95、ck, and so on—we never go to the table. There is also a fast full scan we can perform on indexes to make this even faster in certain cases. A fast full scan is when the database reads the index blocks in no particular or

96、der; it just starts reading them. It is no longer using the index as an index, bu</p><p>  In general, a B*Tree index would be placed on columns that we use frequently in the predicate of a query, and we wou

97、ld expect some small fraction of the data from the table to be returned or the end user demands immediate feedback. On a thin table (i.e., a table with few or small columns), this fraction may be very small. A query that

98、 uses this index should expect to retrieve 2 to 3 percent or less of the rows to be accessed in the table. On a fat table (i.e., a table with many columns or very w</p><p>  As a simplistic example of this,

99、let’s say we are reading that thin table via an index, and we are going to read 20 percent of the rows. Assume we have 100,000 rows in the table. Twenty percent of that is 20,000 rows. If the rows are about 80 bytes apie

100、ce in size, on a database with an 8KB block size, we will find about 100 rows per block. That means the table has approximately 1,000 blocks. From here, the math is very easy. We are going to read 20,000 rows via the ind

101、ex; this will mean quite li</p><p>  When Should You Use a Bitmap Index?</p><p>  Bitmap indexes are most appropriate on low distinct cardinality data (i.e., data with relatively few discrete va

102、lues when compared to the cardinality of the entire set). It is not really possible to put a value on this—in other words, it is difficult to define what low distinct cardinality is truly. In a set of a couple thousand r

103、ecords, 2 would be low distinct cardinality, but 2 would not be low distinct cardinality in a two-row table. In a table of tens or hundreds of millions records, 100,000</p><p>  Bitmap indexes are extremely

104、useful in environments where you have lots of ad hoc queries, especially queries that reference many columns in an ad hoc fashion or produce aggregations such as COUNT. For example, suppose you have a large table with th

105、ree columns: GENDER, LOCATION, and AGE_GROUP. In this table, GENDER has a value of M or F, LOCATION can take on the values 1 through 50, and AGE_GROUP is a code representing 18 and under, 19-25, 26-30, 31-40, and 41 and

106、over. You have to support a lar</p><p>  Select count(*)</p><p><b>  from T</b></p><p>  where gender = 'M'</p><p>  and location in ( 1, 10, 30 )&l

107、t;/p><p>  and age_group = '41 and over';</p><p><b>  select *</b></p><p><b>  from t</b></p><p>  where ( ( gender = 'M' and locat

108、ion = 20 )</p><p>  or ( gender = 'F' and location = 22 ))</p><p>  and age_group = '18 and under';</p><p>  select count(*) from t where location in (11,20,30);<

109、/p><p>  select count(*) from t where age_group = '41 and over' and gender = 'F';</p><p>  You would find that a conventional B*Tree indexing scheme would fail you. If you wanted to

110、 use an index to get the answer, you would need at least three and up to six combinations of possible B*Tree indexes to access the data via the index. Since any of the three columns or any subset of the three columns may

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
  • 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 眾賞文庫僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
  • 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論