B+Trees - MIT 6.830 (db.lcs.mit.edu/6.830/lectures/lec9.pdf)


B+Trees

[Figure: B+Tree structure. The root node and inner nodes hold (ptr, val) separator entries: the pointer left of val11 leads to keys < val11, the pointer between val21 and val22 leads to keys > val21 and < val22, and so on down to < valn1. Leaf nodes hold RIDs in sorted order and are chained together with link pointers.]


Properties of B+Trees

• Branching factor = B
• log_B(tuples) levels
• Logarithmic insert/delete/lookup performance
• Support for range scans
• Link pointers
• No data in internal pages
• Balanced; "rotation" algorithms (see text) rebalance the tree on insert/delete
• Fill factor: all nodes except the root are kept at least 50% full (merge when a node falls below)
• Clustered / unclustered
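Since the slides show the node layout only pictorially, here is a minimal in-memory Python sketch of how a lookup descends separator values and how a range scan follows the leaf link pointers. The InnerNode/LeafNode classes and their fields are illustrative assumptions, not part of the lecture.

class LeafNode:
    def __init__(self, keys, rids, next_leaf=None):
        self.keys = keys        # sorted key values
        self.rids = rids        # record ids, parallel to keys
        self.next = next_leaf   # link pointer to the next leaf

class InnerNode:
    def __init__(self, separators, children):
        self.separators = separators  # n separator values
        self.children = children      # n + 1 child pointers

def find_leaf(node, key):
    # Descend from the root, picking the child whose key range covers `key`.
    while isinstance(node, InnerNode):
        i = 0
        while i < len(node.separators) and key >= node.separators[i]:
            i += 1
        node = node.children[i]
    return node

def range_scan(root, lo, hi):
    # Locate the leaf containing `lo`, then follow the leaf link pointers.
    leaf = find_leaf(root, lo)
    while leaf is not None:
        for k, rid in zip(leaf.keys, leaf.rids):
            if k > hi:
                return
            if k >= lo:
                yield rid
        leaf = leaf.next

# Tiny example: two leaves chained under one root.
l1, l2 = LeafNode([1, 3], ["r1", "r3"]), LeafNode([5, 7], ["r5", "r7"])
l1.next = l2
root = InnerNode([5], [l1, l2])
print(list(range_scan(root, 2, 6)))   # ['r3', 'r5']

A real implementation keeps nodes on disk pages and binary-searches within a node, but the descend-then-follow-links shape is the same.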

Indexes

           Heap File   B+Tree            Hash File
Insert     O(1)        O(log_B n)        O(1)
Delete     O(P)        O(log_B n)        O(1)
Scan       O(P)        O(log_B n + R)    -- / O(P)
Lookup     O(P)        O(log_B n)        O(1)

n : number of tuples
P : number of pages in file
B : branching factor of B+Tree
R : number of pages in range

Can you Think of Potential Optimizations?

[Figure: the B+Tree structure from above, repeated for reference.]

B+Trees are Inappropriate For Multi-dimensional Data

• Consider points of the form (x,y) that I want to index

• Suppose I store tuples with key (x,y) in a B+Tree

• Problem: can’t look up y’s in a particular range without also reading x’s

Example of the Problem

[Figure: points plotted with X on the horizontal axis (X=1 ... X=5) and Y on the vertical axis. A query for Y in some range has to scan every X value to look for matching Ys.]

R-Trees / Spatial Indexes

[Figure: points in the x-y plane, progressively grouped into bounding rectangles across the slides; the final slide shows a query region Q.]

Study Break

• What indexes would you create for the following queries (assuming each query is the only query the database runs)?

SELECT MAX(amount) FROM order
→ B+Tree on order.amount

SELECT c_id FROM order WHERE id = 1
→ Hash index on order.id

SELECT name FROM order WHERE amount > 100k
→ B+Tree on order.amount (maybe)

SELECT * FROM order o WHERE amount > 100k AND c_id = 2
→ B+Tree on o.amount (maybe), Hash on o.c_id (maybe)

Clicker Question

Assuming disk can do 100 MB/sec I/O and 10 ms per seek, and the following schema:
grades (cid int, g_sid int, grade char(2))
students (s_id int, name char(100))

Estimate time to sequentially scan grades, assuming it contains 1M records (Consider: field sizes, headers)

(a) .22 sec (b) 1 sec (c) .23 sec (d) 0.21 sec

Int = 8 bytes; 1 MB = 1000 KB = 1,000,000 bytes

Seq Scan Grades

grades (cid int, g_sid int, grade char(2))
• 8 bytes (cid) + 8 bytes (g_sid) + 2 bytes (grade) + 4 bytes (header) = 22 bytes
• 22 bytes × 1M records = 22 MB; 22 MB / 100 MB/sec = .22 sec, plus a 10 ms seek → .23 sec
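The same arithmetic as a quick script (the record layout above is assumed: 8-byte ints, 2-byte char(2), 4-byte header):

record_bytes = 8 + 8 + 2 + 4            # cid + g_sid + grade + header = 22 bytes
scan_bytes = record_bytes * 1_000_000   # 22 MB of grades
seconds = scan_bytes / 100e6 + 0.010    # 100 MB/sec transfer plus one 10 ms seek
print(round(seconds, 2))                # 0.23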

Join Processing

• S = {S} tuples, |S| pages
• R = {R} tuples, |R| pages
• |R| < |S|

Nested Loop Join

for s in S
    for r in R
        if pred(s, r)
            output s join r

Does it matter which is inner, outer? (yes)

Suppose |S| = 4, |R| = 2, M = 3, LRU:
S inner: 2 + 4 + 4 = 10 pages read (the buffer is too small to keep S resident, so all 4 pages of S are re-read for each of R's 2 pages)
R inner: 4 + 2 = 6 pages read (R's 2 pages fit in the buffer and are read only once while S is scanned)
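The page counts above can be reproduced with a small simulation. The sketch below is an assumed illustration, not from the lecture: it runs a page-at-a-time nested loop join through an LRU buffer pool of M frames and counts page reads.

from collections import OrderedDict

def page_reads(outer_pages, inner_pages, M):
    buffer, reads = OrderedDict(), 0
    def fetch(p):
        nonlocal reads
        if p in buffer:
            buffer.move_to_end(p)       # LRU hit: mark as most recently used
            return
        reads += 1                      # miss: one page read
        if len(buffer) >= M:
            buffer.popitem(last=False)  # evict the least recently used page
        buffer[p] = True
    for op in outer_pages:
        fetch(op)
        for ip in inner_pages:
            fetch(ip)
    return reads

S = ["s1", "s2", "s3", "s4"]            # |S| = 4 pages
R = ["r1", "r2"]                        # |R| = 2 pages
print(page_reads(R, S, M=3))            # S inner: 2 + 4 + 4 = 10
print(page_reads(S, R, M=3))            # R inner: 4 + 2 = 6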

Nested Loop Join

B = block size (< M)
while (not at end of S)
    S' = read B records from S
    for r in R
        for s in S'
            if pred(s, r)
                output (s join r)
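A tuple-level Python sketch of the same idea, with assumed helper names (pred is the join predicate, B is the number of outer records buffered per block):

def block_nested_loop_join(S, R, B, pred):
    out = []
    for i in range(0, len(S), B):
        s_block = S[i:i + B]            # S' = read B records from S
        for r in R:                     # one scan of R per block of S
            for s in s_block:
                if pred(s, r):
                    out.append((s, r))
    return out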

Index Nested Loops (Index on R)

for s in S
    find matches in R

|S| + {R} i/o

Sort-Merge Join (|R| + |S| < M -- in memory)

Sort R, sort S
Merge
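For the in-memory case, a minimal sort-merge equi-join sketch might look like this (key_r and key_s are assumed key-extraction functions; the inner loops re-scan the matching run of S to handle duplicate keys):

def sort_merge_join(R, S, key_r, key_s):
    R, S = sorted(R, key=key_r), sorted(S, key=key_s)
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        kr, ks = key_r(R[i]), key_s(S[j])
        if kr < ks:
            i += 1
        elif kr > ks:
            j += 1
        else:
            j0 = j
            while i < len(R) and key_r(R[i]) == kr:
                j = j0                       # restart at the first matching S tuple
                while j < len(S) and key_s(S[j]) == kr:
                    out.append((R[i], S[j]))
                    j += 1
                i += 1
    return out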

Hash Join (simple, |R| < M -- in memory)

Build hash on R, probe with S

|R| + |S| i/os
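A minimal in-memory sketch of the build/probe idea (key-extraction functions are assumed):

from collections import defaultdict

def hash_join(R, S, key_r, key_s):
    table = defaultdict(list)
    for r in R:                              # build phase: hash the smaller relation R
        table[key_r(r)].append(r)
    out = []
    for s in S:                              # probe phase: one lookup per S tuple
        for r in table.get(key_s(s), []):
            out.append((r, s))
    return out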

Blocked hash (|R| > M)

Repeat until all of R has been read:
    Read in M pages of R
    Build hash on the pages read
    Probe with S

|R| + |S| × ceil(|R| / M) I/Os
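A rough sketch of the blocked version, treating R as a list of pages (each a list of tuples); the names are assumptions for illustration:

def blocked_hash_join(R_pages, S, M, key_r, key_s):
    out = []
    for i in range(0, len(R_pages), M):
        table = {}
        for page in R_pages[i:i + M]:        # read in M pages of R
            for r in page:
                table.setdefault(key_r(r), []).append(r)
        for s in S:                          # probe with all of S, once per block of R
            out.extend((r, s) for r in table.get(key_s(s), []))
    return out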

Grace hash

choose P partitions, with one page per partition
hash r into partitions, flushing pages as they fill
hash s into partitions, flushing pages as they fill
for each partition p
    build a hash table T on the r tuples in p
    look up each s in T, outputting matches

Grace hash - Example

R = 1, 4, 3, 6, 9, 14, 1, 7, 11
S = 2, 3, 7, 12, 9, 8, 4, 15, 6
h(x) = x mod 3

Partitioning:
R0 = 3 6 9
R1 = 1 4 1 7
R2 = 14 11

S0 = 3 12 9 15 6
S1 = 7 4
S2 = 2 8

Now, join R0 with S0, R1 with S1, and R2 with S2. Because we use the same hash function for R and S, we can guarantee that the only tuples that can join with tuples in partition Ri are those in Si.
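The example above can be reproduced with a short sketch (all in memory here; in a real system each partition's pages would be flushed to disk as they fill):

from collections import defaultdict

def grace_hash_join(R, S, P):
    h = lambda x: x % P
    r_parts, s_parts = defaultdict(list), defaultdict(list)
    for r in R:                              # partition R with h
        r_parts[h(r)].append(r)
    for s in S:                              # partition S with the same hash function
        s_parts[h(s)].append(s)
    out = []
    for p in range(P):                       # join Ri only with Si
        table = defaultdict(list)
        for r in r_parts[p]:                 # build an in-memory hash table on Ri
            table[r].append(r)
        for s in s_parts[p]:                 # probe with Si
            out.extend((r, s) for r in table.get(s, []))
    return out

R = [1, 4, 3, 6, 9, 14, 1, 7, 11]
S = [2, 3, 7, 12, 9, 8, 4, 15, 6]
print(grace_hash_join(R, S, P=3))            # matching values: 3, 9, 6, 7, 4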

Partition Size / IO-Complexity

How do I pick the partition size?

Assume a uniform distribution of tuples across partitions, and make each partition roughly M pages (minus a couple of pages for the active pages of S being read), i.e., choose about P ≈ |R| / M partitions so that each partition's hash table fits in memory.

I/O:
read R+S (sequential)
write R+S (semi-random)
read R+S (sequential)

3 × (|R| + |S|) I/Os

What's hard about this?

• Possible that some partitions will overflow -- e.g., if there are many duplicate values
• What do they say we should do?
• Leave some more slop (assign fewer values to each partition) by assuming that each record takes a few more bytes to store in the hash table
• Split partitions that overflow

Clicker Question

Assuming disk can do 100 MB/sec I/O, and 10ms / seek, and the following schema:

grades (cid int, g_sid int, grade char(2))
students (s_id int, name char(100))

Estimate time to join these two tables, using nested loops, assuming students fits in memory but grades does not, and students contains 10K records.

(a) .241 sec (b) 0.251 sec (c) .211 sec (d) 0.502 sec

Int = 8 bytes; 1 MB = 1000 KB = 1,000,000 bytes

NL Join Grades and Students

grades (cid int, g_sid int, grade char(2))
students (s_id int, name char(100))

10 K students x (100 + 8 + 4 bytes) = 1.1 MB

Students Inner (Preferred)
• Cache students in the buffer pool in memory: 1.1/100 s + 0.01 s = .021 s
• One pass over students (cached) for each grade (no additional cost beyond caching)
• Time to scan grades (previous slide) = .23 s
→ .251 s

Grades Inner
• One pass over grades for each student, at .22 sec/pass, plus one seek at 10 ms (.01 sec) → .23 sec/pass
• 10,000 students × .23 sec/pass → ~2300 seconds overall

• (Time to scan students is .011 s, so negligible)
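As a quick check of the preferred (students inner) plan, using the same assumed sizes (8-byte int, 4-byte header, 100-byte name, 100 MB/sec, 10 ms per seek):

student_bytes = 100 + 8 + 4                      # name + s_id + header = 112 bytes
students_mb = 10_000 * student_bytes / 1e6       # ≈ 1.1 MB of students
cache_students = students_mb / 100 + 0.010       # read students once + one seek ≈ 0.021 s
scan_grades = 0.22 + 0.010                       # grades scan from the earlier slide ≈ 0.23 s
print(round(cache_students + scan_grades, 3))    # ≈ 0.251 s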