(This page was last modified on 2000/07/30 20:04:43 GMT)
In order to execute an SQL statement, the SQLite library first parses the SQL, analyzes the statement, then generates a short program to execute the statement. The program is generated for a "virtual machine" implemented by the SQLite library. This document describes the operation of that virtual machine.
This document is intended as a reference, not a tutorial. A separate Virtual Machine Tutorial is available. If you are looking for a narrative description of how the virtual machine works, you should read the tutorial and not this document. Once you have a basic idea of what the virtual machine does, you can refer back to this document for the details on a particular opcode.
The source code to the virtual machine is in the vdbe.c source file. All of the opcode definitions further down in this document are contained in comments in the source file. In fact, the opcode table in this document was generated by scanning the vdbe.c source file and extracting the necessary information from comments. So the source code comments are really the canonical source of information about the virtual machine. When in doubt, refer to the source code.
Each instruction in the virtual machine consists of an opcode and up to three operands named P1, P2 and P3. P1 may be an arbitrary integer. P2 must be a non-negative integer. P2 is always the jump destination in any operation that might cause a jump. P3 is a null-terminated string or NULL. Some operators use all three operands. Some use one or two. Some operators use none of the operands.
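The instruction layout described above can be modelled in a few lines. This is only an illustrative Python sketch, not the C structure used in vdbe.c; the class name `Instruction` and its defaults are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instruction:
    """One virtual machine instruction: an opcode plus up to three operands."""
    opcode: str
    p1: int = 0                 # arbitrary integer
    p2: int = 0                 # non-negative; the jump destination when the opcode can jump
    p3: Optional[str] = None    # string operand, or None standing in for NULL

    def __post_init__(self):
        # The text requires P2 to be non-negative.
        if self.p2 < 0:
            raise ValueError("P2 must be a non-negative integer")


# "Ge 0 2" from a typical program: jump to address 2 if NOS >= TOS.
compare = Instruction("Ge", 0, 2)
```

Operators that ignore an operand simply leave it at its default value here.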
The virtual machine begins execution on instruction number 0. Execution continues until (1) a Halt instruction is seen, or (2) the program counter becomes one greater than the address of the last instruction, or (3) there is an execution error. When the virtual machine halts, all memory that it allocated is released and all database files it may have had open are closed.
The virtual machine also contains an operand stack of unlimited depth. Many of the opcodes use operands from the stack. See the individual opcode descriptions for details.
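The three halting conditions and the operand stack can be sketched as a small interpreter loop. The Python below is a simplified model that handles only a handful of opcodes; the function name `run` and the returned reason strings are assumptions for illustration, not part of SQLite.

```python
def run(program):
    """Run a list of (opcode, p1, p2, p3) tuples.

    Returns (reason, stack), where reason says why execution stopped.
    """
    stack = []
    pc = 0  # execution begins at instruction number 0
    while True:
        # Condition (2): the program counter has moved past the last instruction.
        if pc >= len(program):
            return "end of program", stack
        opcode, p1, p2, p3 = program[pc]
        pc += 1
        try:
            if opcode == "Halt":
                # Condition (1): an explicit Halt instruction.
                return "halt", stack
            elif opcode == "Integer":
                stack.append(p1)
            elif opcode == "Add":
                top = stack.pop()
                nxt = stack.pop()
                stack.append(nxt + top)
            elif opcode == "Goto":
                pc = p2
            elif opcode == "Noop":
                pass
            else:
                raise ValueError("unknown opcode: " + opcode)
        except (IndexError, ValueError) as err:
            # Condition (3): an execution error, e.g. popping an empty stack.
            return "error: %s" % err, stack
```

For example, `run([("Integer", 2, 0, None), ("Integer", 3, 0, None), ("Add", 0, 0, None)])` stops because the program counter runs off the end, leaving `[5]` on the stack.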
The virtual machine can have zero or more cursors. Each cursor is a pointer into a single GDBM file. There can be multiple cursors pointing at the same file. All cursors operate independently, even cursors pointing to the same file. The only way for the virtual machine to interact with a GDBM file is through a cursor. Instructions in the virtual machine can create a new cursor (Open), read data from a cursor (Field), advance the cursor to the next entry in the GDBM file (Next), and many other operations. All cursors are automatically closed when the virtual machine terminates.
The virtual machine contains an arbitrary number of fixed memory locations with addresses beginning at zero and growing upward. Each memory location can hold an arbitrary string. The memory cells are typically used to hold the result of a scalar SELECT that is part of a larger expression.
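A rough Python model of the memory cells, following the MemStore and MemLoad descriptions in the opcode table. The class name `Memory` and the use of `None` for a cell that has never been written are assumptions of this sketch.

```python
class Memory:
    """Fixed memory locations addressed from zero upward."""

    def __init__(self):
        self.cells = []

    def mem_store(self, stack, p1):
        """MemStore: pop one value and store it in location p1."""
        # Space is allocated for every location between 0 and p1 inclusive,
        # which is why p1 should be a small integer.
        if p1 >= len(self.cells):
            self.cells.extend([None] * (p1 + 1 - len(self.cells)))
        self.cells[p1] = stack.pop()

    def mem_load(self, stack, p1):
        """MemLoad: push a copy of the value in location p1."""
        stack.append(self.cells[p1])
```

Storing into location 3 of a fresh `Memory` allocates four cells, which illustrates why large P1 values are wasteful.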
The virtual machine contains an arbitrary number of sorters. Each sorter is able to accumulate records, sort those records, then play the records back in sorted order. Sorters are used to implement the ORDER BY clause of a SELECT statement. The fact that the virtual machine allows multiple sorters is an historical accident. In practice no more than one sorter (sorter number 0) ever gets used.
The virtual machine may contain an arbitrary number of "Lists". Each list stores a list of integers. Lists are used to hold the GDBM keys for records of a GDBM file that need to be modified. (See the file format description for more information on GDBM keys in SQLite table files.) The WHERE clause of an UPDATE or DELETE statement scans through the table and writes the GDBM key of every record to be modified into a list. Then the list is played back and the table is modified in a separate step. It is necessary to do this in two steps since making a change to a GDBM file can alter the scan order.
The virtual machine can contain an arbitrary number of "Sets". Each set holds an arbitrary number of strings. Sets are used to implement the IN operator with a constant right-hand side.
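As an illustration, an expression such as `x IN ('a','b','c')` could be evaluated with the set opcodes roughly as follows. This Python sketch follows the SetInsert, SetFound and SetClear descriptions in the opcode table. The class name `SetTable` is hypothetical, and popping the stack when P3 is NULL is an assumption, since the table does not say whether SetInsert pops.

```python
class SetTable:
    """All of the virtual machine's Sets, indexed by P1."""

    def __init__(self):
        self.sets = {}

    def set_insert(self, stack, p1, p3=None):
        """SetInsert: create set p1 if needed, then insert p3 or the top of stack."""
        value = p3 if p3 is not None else stack.pop()  # assumption: TOS is popped
        self.sets.setdefault(p1, set()).add(value)

    def set_found(self, stack, p1):
        """SetFound: pop one value; True means the VM would jump to P2."""
        return stack.pop() in self.sets.get(p1, set())

    def set_clear(self, p1):
        """SetClear: remove all elements from set p1."""
        self.sets.pop(p1, None)


def in_constant_list(value, constants):
    """Evaluate `value IN (constants...)` using set number 0."""
    table = SetTable()
    for constant in constants:
        table.set_insert([], 0, p3=constant)
    return table.set_found([value], 0)
```

Because the right-hand side is constant, the set only needs to be filled once and can then be probed for every row.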
The virtual machine can open a single external file for reading. This external read file is used to implement the COPY command.
Finally, the virtual machine can have a single set of aggregators. An aggregator is a device used to implement the GROUP BY clause of a SELECT. An aggregator has one or more slots that can hold values being extracted by the select. The number of slots is the same for all aggregators and is defined by the AggReset operation. At any point in time a single aggregator is current or "has focus". There are operations to read or write to memory slots of the aggregator in focus. There are also operations to change the focus aggregator and to scan through all aggregators.
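The aggregator life cycle (AggReset, then AggFocus, then AggNext) can be sketched in Python. This is a simplified model based on the opcode descriptions in the table below. The class `Aggregators`, the helper `group_counts`, and the choice to visit groups in creation order are assumptions of this sketch rather than documented behaviour.

```python
class Aggregators:
    """A set of aggregators keyed by GROUP BY value, one of which has focus."""

    def agg_reset(self, p2):
        """AggReset: discard all data; future aggregators get p2 slots each."""
        self.nslots = p2
        self.groups = {}        # key -> list of slots
        self.current = None
        self.scanning = False

    def agg_focus(self, key):
        """AggFocus: focus on `key`; True means it existed (the VM would jump)."""
        existed = key in self.groups
        if not existed:
            self.groups[key] = [None] * self.nslots
        self.current = key
        return existed

    def agg_incr(self, p1, p2):
        """AggIncr: add p1 to slot p2 of the aggregator in focus."""
        slots = self.groups[self.current]
        slots[p2] = (slots[p2] or 0) + p1

    def agg_set(self, p2, value):
        self.groups[self.current][p2] = value

    def agg_get(self, p2):
        return self.groups[self.current][p2]

    def agg_next(self):
        """AggNext: delete the prior aggregator, focus the next; True when done."""
        if self.scanning:
            del self.groups[self.current]
        self.scanning = True
        if not self.groups:
            self.current = None
            return True
        self.current = next(iter(self.groups))
        return False


def group_counts(keys):
    """Model of SELECT key, count(*) ... GROUP BY key."""
    agg = Aggregators()
    agg.agg_reset(2)                  # slot 0: the key, slot 1: the count
    for key in keys:
        if not agg.agg_focus(key):
            agg.agg_set(0, key)       # first row of a new group
        agg.agg_incr(1, 1)
    results = []
    while not agg.agg_next():
        results.append((agg.agg_get(0), agg.agg_get(1)))
    return results
```

Calling `group_counts(["a", "b", "a"])` returns `[("a", 2), ("b", 1)]`.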
Every SQL statement that SQLite interprets results in a program for the virtual machine. But if you precede the SQL statement with the keyword "EXPLAIN" the virtual machine will not execute the program. Instead, the instructions of the program will be returned like a query result. This feature is useful for debugging and for learning how the virtual machine operates.
You can use the sqlite command-line tool to see the instructions generated by an SQL statement. The following is an example:
    $ sqlite ex1
    sqlite> .explain
    sqlite> explain delete from tbl1 where two<20;
    addr  opcode        p1     p2     p3
    ----  ------------  -----  -----  -------------------------------------
    0     ListOpen      0      0
    1     Open          0      1      tbl1
    2     Next          0      9
    3     Field         0      1
    4     Integer       20     0
    5     Ge            0      2
    6     Key           0      0
    7     ListWrite     0      0
    8     Goto          0      2
    9     Noop          0      0
    10    ListRewind    0      0
    11    ListRead      0      14
    12    Delete        0      0
    13    Goto          0      11
    14    ListClose     0      0
All you have to do is add the "EXPLAIN" keyword to the front of the SQL statement. But if you use the ".explain" command to sqlite first, it will set up the output mode to make the program more easily viewable.
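To make the control flow of the DELETE program listed above easier to follow, here is a toy Python interpreter for exactly those fifteen instructions. It is a sketch under several assumptions: the table is an in-memory dict mapping integer keys to tuples of column values (with "two" as field 1), values compare numerically, ListWrite pops its operand, and the function name `run_delete_program` is made up for this example.

```python
def run_delete_program(table, limit=20):
    """Interpret: DELETE FROM tbl1 WHERE two < limit; return the modified table."""
    program = [
        ("ListOpen", 0, 0, None), ("Open", 0, 1, "tbl1"), ("Next", 0, 9, None),
        ("Field", 0, 1, None), ("Integer", limit, 0, None), ("Ge", 0, 2, None),
        ("Key", 0, 0, None), ("ListWrite", 0, 0, None), ("Goto", 0, 2, None),
        ("Noop", 0, 0, None), ("ListRewind", 0, 0, None), ("ListRead", 0, 14, None),
        ("Delete", 0, 0, None), ("Goto", 0, 11, None), ("ListClose", 0, 0, None),
    ]
    stack = []
    keys, pos = [], -1        # cursor 0: scan order and position (-1 = before first)
    key_list, read_pos = [], 0  # list 0 and its playback position
    pc = 0
    while pc < len(program):
        op, p1, p2, p3 = program[pc]
        pc += 1
        if op == "ListOpen":
            key_list, read_pos = [], 0
        elif op == "Open":
            keys, pos = sorted(table), -1
        elif op == "Next":
            pos += 1
            if pos >= len(keys):      # no more entries: rewind and jump
                pos = -1
                pc = p2
        elif op == "Field":
            stack.append(table[keys[pos]][p2])
        elif op == "Integer":
            stack.append(p1)
        elif op == "Ge":
            tos = stack.pop()
            nos = stack.pop()
            if nos >= tos:            # row does not match the WHERE clause
                pc = p2
        elif op == "Key":
            stack.append(keys[pos])
        elif op == "ListWrite":
            key_list.append(stack.pop())
        elif op == "Goto":
            pc = p2
        elif op == "ListRewind":
            read_pos = 0
        elif op == "ListRead":
            if read_pos < len(key_list):
                stack.append(key_list[read_pos])
                read_pos += 1
            else:                     # list exhausted: jump
                pc = p2
        elif op == "Delete":
            del table[stack.pop()]
        elif op == "ListClose":
            key_list = []
        # Noop falls through and does nothing.
    return table
```

Instructions 2 through 8 form the scan loop that collects keys, and instructions 11 through 13 form the playback loop that performs the deletions. With the table `{1: ("a", 5), 2: ("b", 25), 3: ("c", 19), 4: ("d", 20)}`, the rows with keys 1 and 3 are deleted.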
If sqlite has been compiled without the "-DNDEBUG=1" option (that is, with the NDEBUG preprocessor macro not defined) then you can put the SQLite virtual machine in a mode where it will trace its execution by writing messages to standard output. There are special comments to turn tracing on and off. Use the --vdbe-trace-on-- comment to turn tracing on and the --vdbe-trace-off-- comment to turn tracing back off.
There are currently 92 opcodes defined by the virtual machine. All currently defined opcodes are described in the table below. This table was generated automatically by scanning the source code from the file vdbe.c.
Opcode Name | Description |
---|---|
Add | Pop the top two elements from the stack, add them together, and push the result back onto the stack. If either element is a string then it is converted to a double using the atof() function before the addition. |
AddImm | Add the value P1 to whatever is on top of the stack. |
AggFocus | Pop the top of the stack and use that as an aggregator key. If an aggregator with that same key already exists, then make the aggregator the current aggregator and jump to P2. If no aggregator with the given key exists, create one and make it current but do not jump. The order of aggregator opcodes is important. The order is: AggReset AggFocus AggNext. In other words, you must execute AggReset first, then zero or more AggFocus operations, then zero or more AggNext operations. You must not execute an AggFocus in between an AggNext and an AggReset. |
AggGet | Push a new entry onto the stack which is a copy of the P2-th field of the current aggregate. Strings are not duplicated so string values will be ephemeral. |
AggIncr | Increase the integer value in the P2-th field of the aggregate element currently in focus by an amount P1. |
AggNext | Make the next aggregate value the current aggregate. The prior aggregate is deleted. If all aggregate values have been consumed, jump to P2. The order of aggregator opcodes is important. The order is: AggReset AggFocus AggNext. In other words, you must execute AggReset first, then zero or more AggFocus operations, then zero or more AggNext operations. You must not execute an AggFocus in between an AggNext and an AggReset. |
AggReset | Reset the aggregator so that it no longer contains any data. Future aggregator elements will contain P2 values each. |
AggSet | Move the top of the stack into the P2-th field of the current aggregate. String values are duplicated into new memory. |
And | Pop two values off the stack. Take the logical AND of the two values and push the resulting boolean value back onto the stack. |
Callback | Pop P1 values off the stack and form them into an array. Then invoke the callback function using the newly formed array as the 3rd parameter. |
Close | Close a cursor previously opened as P1. If P1 is not currently open, this instruction is a no-op. |
ColumnCount | Specify the number of column values that will appear in the array passed as the 4th parameter to the callback. No checking is done. If this value is wrong, a coredump can result. |
ColumnName | P3 becomes the P1-th column name (first is 0). An array of pointers to all column names is passed as the 4th parameter to the callback. The ColumnCount opcode must be executed first to allocate space to hold the column names. Failure to do this will likely result in a coredump. |
Concat | Look at the first P1 elements of the stack. Append them all together with the lowest element first. Use P3 as a separator. Put the result on the top of the stack. The original P1 elements are popped from the stack if P2==0 and retained if P2==1. If P3 is NULL, then use no separator. When P1==1, this routine makes a copy of the top stack element into memory obtained from sqliteMalloc(). |
Delete | The top of the stack is a key. Remove this key and its data from database file P1. Then pop the stack to discard the key. |
DeleteIdx | The top of the stack is a key and next on stack is an integer which is the key to a record in an SQL table. Locate the record in the cursor P1 (P1 represents an SQL index) that has the same key as the top of stack. Then look through the integer table-keys contained in the data of the P1 record. Remove the integer table-key that matches the NOS and write the revised data back to P1 with the same key. If this routine removes the very last integer table-key from the P1 data, then the corresponding P1 record is deleted. |
Destroy | Drop the disk file whose name is P3. All key/data pairs in the file are deleted and the file itself is removed from the disk. |
Distinct | Use the top of the stack as a key. If a record with that key does not exist in file P1, then jump to P2. If the record does already exist, then fall through. The record is not retrieved. The key is not popped from the stack. This operation is similar to NotFound except that this operation does not pop the key from the stack. |
Divide | Pop the top two elements from the stack, divide the second (the next on stack) by the first (what was on top of the stack), and push the result back onto the stack. If either element is a string then it is converted to a double using the atof() function before the division. Division by zero returns NULL. |
Dup | A copy of the P1-th element of the stack is made and pushed onto the top of the stack. The top of the stack is element 0. So the instruction "Dup 0 0 0" will make a copy of the top of the stack. |
Eq | Pop the top two elements from the stack. If they are equal, then jump to instruction P2. Otherwise, continue to the next instruction. |
Fcnt | Push an integer onto the stack which is the total number of OP_Fetch opcodes that have been executed by this virtual machine. This instruction is used to implement the special fcnt() function in the SQL dialect that SQLite understands. fcnt() is used for testing purposes. |
Fetch | Pop the top of the stack and use its value as a key to fetch a record from cursor P1. The key/data pair is held in the P1 cursor until needed. |
Field | Interpret the data in the most recent fetch from cursor P1 as a structure built using the MakeRecord instruction. Push onto the stack the value of the P2-th field of that structure. The value pushed is just a pointer to the data in the cursor. The value will go away the next time a record is fetched from P1, or when P1 is closed. Make a copy of the string (using "Concat 1 0 0") if it needs to persist longer than that. If the KeyAsData opcode has previously executed on this cursor, then the field might be extracted from the key rather than the data. Viewed from a higher level, this instruction retrieves the data from a single column in a particular row of an SQL table file. Perhaps the name of this instruction should be "Column" instead of "Field"... |
FileClose | Close a file previously opened using FileOpen. This is a no-op if there is no prior FileOpen call. |
FileField | Push onto the stack the P1-th field of the most recently read line from the input file. |
FileOpen | Open the file named by P3 for reading using the FileRead opcode. If P3 is "stdin" then open standard input for reading. |
FileRead | Read a single line of input from the open file (the file opened using FileOpen). If we reach end-of-file, jump immediately to P2. If we are able to get another line, split the line apart using P3 as a delimiter. There should be P1 fields. If the input line contains more than P1 fields, ignore the excess. If the input line contains fewer than P1 fields, assume the remaining fields contain an empty string. |
Found | Use the top of the stack as a key. If a record with that key does exist in file P1, then jump to P2. If the record does not exist, then fall through. The record is not retrieved. The key is popped from the stack. |
FullKey | Push a string onto the stack which is the full text key associated with the last Next operation on file P1. Compare this with the Key operator which pushes an integer key. |
Ge | Pop the top two elements from the stack. If second element (the next on stack) is greater than or equal to the first (the top of stack), then jump to instruction P2. In other words, jump if NOS>=TOS. |
Glob | Pop the top two elements from the stack. The top-most is a "glob" pattern. The lower element is the string to compare against the glob pattern. Jump to P2 if the two compare, and fall through without jumping if they do not. The '*' in the top-most element matches any sequence of zero or more characters in the lower element. The '?' character in the topmost matches any single character of the lower element. [...] matches a range of characters. [^...] matches any character not in the range. Case is significant for globs. If P1 is not zero, the sense of the test is inverted and we have a "NOT GLOB" operator. The jump is made if the two values are different. |
Goto | An unconditional jump to address P2. The next instruction executed will be the one at index P2 from the beginning of the program. |
Gt | Pop the top two elements from the stack. If second element (the next on stack) is greater than the first (the top of stack), then jump to instruction P2. In other words, jump if NOS>TOS. |
Halt | Exit immediately. All open DBs, Lists, Sorts, etc are closed automatically. |
If | Pop a single boolean from the stack. If the boolean popped is true, then jump to P2. Otherwise continue to the next instruction. An integer is false if zero and true otherwise. A string is false if it has zero length and true otherwise. |
Integer | The integer value P1 is pushed onto the stack. |
IsNull | Pop a single value from the stack. If the value popped is NULL then jump to P2. Otherwise continue to the next instruction. |
Key | Push onto the stack an integer which is the first 4 bytes of the key to the current entry in a sequential scan of the database file P1. The sequential scan should have been started using the Next opcode. |
KeyAsData | Turn the key-as-data mode for cursor P1 either on (if P2==1) or off (if P2==0). In key-as-data mode, the OP_Field opcode pulls data off of the key rather than the data. This is useful for processing compound selects. |
Le | Pop the top two elements from the stack. If second element (the next on stack) is less than or equal to the first (the top of stack), then jump to instruction P2. In other words, jump if NOS<=TOS. |
Like | Pop the top two elements from the stack. The top-most is a "like" pattern -- the right operand of the SQL "LIKE" operator. The lower element is the string to compare against the like pattern. Jump to P2 if the two compare, and fall through without jumping if they do not. The '%' in the top-most element matches any sequence of zero or more characters in the lower element. The '_' character in the topmost matches any single character of the lower element. Case is ignored for this comparison. If P1 is not zero, the sense of the test is inverted and we have a "NOT LIKE" operator. The jump is made if the two values are different. |
ListClose | Close the temporary storage buffer and discard its contents. |
ListOpen | Open a "List" structure used for temporary storage of integer table keys. P1 will serve as a handle to this list for future interactions. If another list with the P1 handle is already opened, the prior list is closed and a new one opened in its place. |
ListRead | Attempt to read an integer from temporary storage buffer P1 and push it onto the stack. If the storage buffer is empty, push nothing but instead jump to P2. |
ListRewind | Rewind the temporary buffer P1 back to the beginning. |
ListWrite | Write the integer on the top of the stack into the temporary storage list P1. |
Lt | Pop the top two elements from the stack. If second element (the next on stack) is less than the first (the top of stack), then jump to instruction P2. Otherwise, continue to the next instruction. In other words, jump if NOS<TOS. |
MakeKey | Convert the top P1 entries of the stack into a single entry suitable for use as the key in an index or a sort. The top P1 records are concatenated with a tab character (ASCII 0x09) used as a record separator. The entire concatenation is null-terminated. The lowest entry in the stack is the first field and the top of the stack becomes the last. If P2 is not zero, then the original entries remain on the stack and the new key is pushed on top. If P2 is zero, the original data is popped off the stack first then the new key is pushed back in its place. See also the SortMakeKey opcode. |
MakeRecord | Convert the top P1 entries of the stack into a single entry suitable for use as a data record in the database. To do this all entries (except NULLs) are converted to strings and concatenated. The null-terminators are preserved by the concatenation and serve as a boundary marker between fields. The lowest entry on the stack is the first in the concatenation and the top of the stack is the last. After all fields are concatenated, an index header is added. The index header consists of P1 integers which hold the offset of the beginning of each field from the beginning of the completed record including the header. The index for NULL entries is 0. |
Max | Pop the top two elements from the stack then push back the largest of the two. |
MemLoad | Push a copy of the value in memory location P1 onto the stack. |
MemStore | Pop a single value off the stack and store that value into memory location P1. P1 should be a small integer since space is allocated for all memory locations between 0 and P1 inclusive. |
Min | Pop the top two elements from the stack then push back the smaller of the two. |
Multiply | Pop the top two elements from the stack, multiply them together, and push the result back onto the stack. If either element is a string then it is converted to a double using the atof() function before the multiplication. |
Ne | Pop the top two elements from the stack. If they are not equal, then jump to instruction P2. Otherwise, continue to the next instruction. |
Negative | Treat the top of the stack as a numeric quantity. Replace it with its additive inverse. |
New | Get a new integer key not previously used by the database file associated with cursor P1 and push it onto the stack. |
Next | Advance P1 to the next key/data pair in the file. Or, if there are no more key/data pairs, rewind P1 and jump to location P2. |
NextIdx | The P1 cursor points to an SQL index. The data from the most recent fetch on that cursor consists of a bunch of integers where each integer is the key to a record in an SQL table file. This instruction grabs the next integer table key from the data of P1 and pushes that integer onto the stack. The first time this instruction is executed after a fetch, the first integer table key is pushed. Subsequent integer table keys are pushed in each subsequent execution of this instruction. If there are no more integer table keys in the data of P1 when this instruction is executed, then nothing gets pushed and there is an immediate jump to instruction P2. |
Noop | Do nothing. This instruction is often useful as a jump destination. |
Not | Interpret the top of the stack as a boolean value. Replace it with its complement. |
NotFound | Use the top of the stack as a key. If a record with that key does not exist in file P1, then jump to P2. If the record does exist, then fall through. The record is not retrieved. The key is popped from the stack. The difference between this operation and Distinct is that Distinct does not pop the key from the stack. |
NotNull | Pop a single value from the stack. If the value popped is not an empty string, then jump to P2. Otherwise continue to the next instruction. |
Null | Push a NULL value onto the stack. |
OpenIdx | Open a new cursor for the database file named P3. Give the cursor an identifier P1. The P1 values need not be contiguous but all P1 values should be small integers. It is an error for P1 to be negative. Open readonly if P2==0 and for reading and writing if P2!=0. The file is created if it does not already exist and P2!=0. If there is already another cursor opened with identifier P1, then the old cursor is closed first. All cursors are automatically closed when the VDBE finishes execution. If P3 is null or an empty string, a temporary database file is created. This temporary database file is automatically deleted when the cursor is closed. The database file opened must be able to map arbitrary length keys into arbitrary data. A similar opcode, OpenTbl, opens a database file that maps integer keys into arbitrary length data. This opcode opens database files used as SQL indices and OpenTbl opens database files used for SQL tables. |
OpenTbl | This works just like the OpenIdx operation except that the database file that is opened is one that will only accept integers as keys. Some database backends are able to operate more efficiently if keys are always integers. So if SQLite knows in advance that all keys will be integers, it uses this opcode rather than Open in order to give the backend an opportunity to run faster. This opcode opens database files used for storing SQL tables. The OpenIdx opcode opens files used for SQL indices. |
Or | Pop two values off the stack. Take the logical OR of the two values and push the resulting boolean value back onto the stack. |
Pop | P1 elements are popped off of the top of stack and discarded. |
Pull | The P1-th element is removed from its current location on the stack and pushed back on top of the stack. The top of the stack is element 0, so "Pull 0 0 0" is a no-op. |
Put | Write an entry into the database file P1. A new entry is created if it doesn't already exist, or the data for an existing entry is overwritten. The data is the value on the top of the stack. The key is the next value down on the stack. The stack is popped twice by this instruction. |
PutIdx | The top of the stack holds an SQL index key (probably made using the MakeKey instruction) and next on stack holds an integer which is the key to an SQL table entry. Locate the record in cursor P1 that has the same key as on the TOS. Create a new record if necessary. Then append the integer table key to the data for that record and write it back to the P1 file. |
Reorganize | Compress, optimize, and tidy up the GDBM file named by P3. |
ResetIdx | Begin treating the current data in cursor P1 as a bunch of integer keys to records of a (separate) SQL table file. This instruction causes the next NextIdx instruction to push the first integer table key in the data. |
Rewind | The next use of the Key or Field or Next instruction for P1 will refer to the first entry in the database file. |
SetClear | Remove all elements from the P1-th Set. |
SetFound | Pop the stack once and compare the value popped off with the contents of set P1. If the element popped exists in set P1, then jump to P2. Otherwise fall through. |
SetInsert | If Set P1 does not exist then create it. Then insert value P3 into that set. If P3 is NULL, then insert the top of the stack into the set. |
SetNotFound | Pop the stack once and compare the value popped off with the contents of set P1. If the element popped does not exist in set P1, then jump to P2. Otherwise fall through. |
Sort | Sort all elements on the given sorter. The algorithm is a mergesort. |
SortCallback | The top of the stack contains a callback record built using the SortMakeRec operation with the same P1 value as this instruction. Pop this record from the stack and invoke the callback on it. |
SortClose | Close the given sorter and remove all its elements. |
SortKey | Push the key for the topmost element of the sorter onto the stack. But don't change the sorter in any other way. |
SortMakeKey | Convert the top few entries of the stack into a sort key. The number of stack entries consumed is the number of characters in the string P3. One character from P3 is prepended to each entry. The first character of P3 is prepended to the element lowest in the stack and the last character of P3 is prepended to the top of the stack. All stack entries are separated by a \000 character in the result. The whole key is terminated by two \000 characters in a row. See also the MakeKey opcode. |
SortMakeRec | The top P1 elements are the arguments to a callback. Form these elements into a single data entry that can be stored on a sorter using SortPut and later fed to a callback using SortCallback. |
SortNext | Push the data for the topmost element in the given sorter onto the stack, then remove the element from the sorter. |
SortOpen | Create a new sorter with index P1. |
SortPut | The TOS is the key and the NOS is the data. Pop both from the stack and put them on the sorter. |
String | The string value P3 is pushed onto the stack. |
Strlen | Interpret the top of the stack as a string. Replace the top of stack with an integer which is the length of the string. |
Substr | This operation pops between 1 and 3 elements from the stack and pushes back a single element. The bottom-most element popped from the stack is a string and the element pushed back is also a string. The other two elements popped are integers. The integers are taken from the stack only if P1 and/or P2 are 0. When P1 or P2 are not zero, the value of the operand is used rather than the integer from the stack. In the sequel, we will use P1 and P2 to describe the two integers, even if those integers are really taken from the stack. The string pushed back onto the stack is a substring of the string that was popped. There are P2 characters in the substring. The first character of the substring is the P1-th character of the original string where the left-most character is 1 (not 0). If P1 is negative, then counting begins at the right instead of at the left. |
Subtract | Pop the top two elements from the stack, subtract the first (what was on top of the stack) from the second (the next on stack) and push the result back onto the stack. If either element is a string then it is converted to a double using the atof() function before the subtraction. |
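
The record layout described for MakeRecord, and read back by Field, can be illustrated with a short Python sketch. The width and byte order of the header integers are not stated in this document, so the 4-byte little-endian encoding used here is purely an assumption, as are the function names `make_record` and `get_field`.

```python
import struct


def make_record(fields):
    """Build a record: an index header of offsets, then NUL-terminated strings.

    `fields` is listed lowest stack entry first; None stands in for NULL.
    """
    header_size = 4 * len(fields)   # assumption: one 4-byte integer per field
    offsets = []
    body = b""
    for value in fields:
        if value is None:
            offsets.append(0)       # the index for NULL entries is 0
            continue
        # Offsets count from the start of the completed record, header included.
        offsets.append(header_size + len(body))
        body += str(value).encode() + b"\x00"
    return struct.pack("<%di" % len(fields), *offsets) + body


def get_field(record, i):
    """Return the i-th field of a record as a string, or None for NULL."""
    (offset,) = struct.unpack_from("<i", record, 4 * i)
    if offset == 0:
        return None
    end = record.index(b"\x00", offset)
    return record[offset:end].decode()
```

An offset of 0 can safely mark NULL because a real field can never start inside the header. For `make_record(["hello", None, 42])` the header holds the offsets 12, 0 and 18 under these assumptions.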