This document explains how the bytecodes work. Bytecodes are represented as 8-bit
byte sequences which can be stored in strings. A bytecode string is made up
of a sequence of variable length instructions. Each instruction has a 1
byte opcode followed by a series of object source operands, integer source
operands and destination operands (in that order).
Note that since bytecodes are stored as strings, a zero can never be used as
any bytecode value.
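
The zero-free rule can be checked mechanically. The sketch below is only an illustration, not the interpreter's own code: the function names are hypothetical, and it assumes that small numeric arguments occupy a single byte and are stored with a +1 bias (the operand lists later in this document write such arguments as `var#+1` and `literal#+1`, which would keep them clear of zero).

```python
def is_valid_bytecode_string(code: bytes) -> bool:
    """Return True if no byte in the bytecode string is zero."""
    return 0 not in code


def bias(n: int) -> int:
    """Store a non-negative number n as n + 1 so a zero byte is never emitted.

    Assumption: the argument fits in one byte, so n must be 0..254.
    """
    if not 0 <= n <= 254:
        raise ValueError("value does not fit in one biased byte")
    return n + 1


def unbias(b: int) -> int:
    """Recover the original number from a biased byte (1..255)."""
    if not 1 <= b <= 255:
        raise ValueError("not a biased byte")
    return b - 1
```

With this convention, variable number 0 is written as the byte 1, and a bytecode string containing a zero byte is rejected outright.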
OPCODES
The opcode values are defined in interp.h and the bytecode interpreter is
in interp/interp.C.
general
movement
copy obj_src dest
dup obj_src dest
move obj_src dest
movec dest1 dest2
movei int_src dest
reloc obj_src dest
predicates
eq obj1_src obj2_src dest
iscontinuation obj_src dest
isdmutable obj_src dest
isfile obj_src dest
isfun_value obj_src dest
ishash obj_src dest
ishash_body obj_src dest
islink obj_src dest
islong obj_src dest
ismutable obj_src dest
isnil obj_src dest
isobject obj_src dest
isref obj_src dest
isstring obj_src dest
issymbol obj_src dest
isvector obj_src dest
iswait obj_src dest
control flow
error_admin obj_format_string_src obj_string_src
error_pgmr obj_format_string_src obj_string_src
error_user obj_format_string_src obj_string_src
jf obj_src label
jge int_src label
jgt int_src label
jle int_src label
jls int_src label
jmp label
jnf obj_src label
jnil obj_src label
jnnil obj_src label
jnz int_src label
jz int_src label
prepare_call fun_value_src kwnames_src #pos_args_src call_dest args_dest
prepare_send obj_src fun_value_src kwnames_src #pos_args_src call_dest args_dest
sched label
stop
done
misc
assemble obj_src dest
dsize obj_src dest
flush
hash_code obj_src dest
sched_gc
type obj_src dest
boolean
and obj1_src obj2_src dest
not obj_src dest
or obj1_src obj2_src dest
continuation
cdec dest
cnum_values continuation_src dest
file
close file_src
read_obj file_src dest
readline file_src dest
print_string file_src string_src
print file_src obj_src int_position_src int_indent_src dest
print_len obj_src dest
fun_value
cfvnew fun_value_src vector_src int_offset_src dest
compose fun_value1_src fun_value2_src int_num_args2_src dest
mfvnew operation_name_src dest
mkcor_fun next_coroutine_src int_num_args_src first_cont_dest dest2
mkcor_head int_num_args_src dest
sfvnew fun_body_src dest
hash
hdel hash_src key_src
hget hash_src key_src int_not_found_flag_src dest
hnew int_max_#elements_src int_avg_element_size_src dest
hnum_elem hash_src dest
hset hash_src key_src value_src
integer
divide int1_src int2_src dest
icmp int1_src int2_src dest
minus int1_src int2_src dest
negate int_src dest
plus int1_src int2_src dest
remainder int1_src int2_src dest
times int1_src int2_src dest
link
lnew symbol_src dest
lref link_src dest
ref
rget ref_src dest
rnew obj_src dest
rset ref_src obj_src
string
strcmp string1_src string2_src dest
strdel string_src int_start_src int_length_src dest
strfind string_src int_char_src dest
strins string_src int_char_src int_offset_src dest
strlen string_src dest
strslice string_src int_start_src int_length_src dest
strsplice outer_string_src string_to_insert_src int_offset_src dest
strsub string_src int_index_src dest
symbol
sgensym string_src dest
sget symbol_src int_not_found_flag_src dest
sintern string_src dest
sisper symbol_src dest
sset symbol_src obj_src
sunset symbol_src
user_object
oclsof user_obj_src dest
oget user_obj_src field_name_src int_src dest
oiscls user_obj_src dest
onew class_src dest
oset user_obj_src field_name_src obj_src
vector
vdel vector_src int_start_src int_length_src dest
vflt vector_src dest
vins vector_src obj_src int_offset_src dest
vlen vector_src dest
vnew int_length_src dest
vset vector_src obj_src int_index_src
vslice vector_src int_start_src int_length_src dest
vsplice outer_vector_src inner_vector_src int_offset_src dest
vsub vector_src int_index_src dest
wait
wfar wait_src dest
wnew far_src int_wait_count_src label dest
SOURCE OPERANDS
Since we don't know the region to allocate to when we're processing source
operands, source operands may not create objects.
A source operand is a variable length sequence of codes starting with an
initial source code and (optionally) followed by any number of extension
source codes. The last source code (initial or extension) has the high
order (0x80) bit set to indicate the end of the source operand sequence.
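
As a rough illustration of the end-of-sequence flag, the sketch below separates a source code byte into its code number and its "last code" flag. It assumes the low seven bits identify the code; the actual numeric values live in source.h and are not reproduced here, so the numbers used are hypothetical.

```python
from typing import Tuple

LAST_CODE_BIT = 0x80  # high-order bit: marks the final code of an operand


def split_code(byte: int) -> Tuple[int, bool]:
    """Return (code number without the flag, True if this is the last code)."""
    return byte & 0x7F, bool(byte & LAST_CODE_BIT)


def mark_last(code: int) -> int:
    """Set the high-order bit on a code to terminate the operand sequence."""
    return code | LAST_CODE_BIT
```

For a hypothetical code number 5, `mark_last(5)` gives 0x85, and `split_code(0x85)` reports code 5 with the terminator flag set. Arguments that follow a code (such as a literal number) are separate bytes and are not handled by this sketch.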
There are two kinds of source operands: object source operands and integer
source operands.
All source codes (initials and extensions) are legal in integer source
operands. But object source operands may not include certain source
codes that are only legal in integer source operands (each of these is marked
below).
Labels (e.g., for jump instructions) are stored as integer source operands
indicating the byte offset (positive or negative) from the start of the
next bytecode instruction to the desired point within the bytecode
string.
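
The label rule amounts to one addition. The following sketch shows it under stated assumptions: the function name and parameters are hypothetical, and the bounds check (target must be a valid byte position in the string) is an added safety assumption rather than documented interpreter behavior.

```python
def jump_target(next_instruction_start: int, label_offset: int,
                bytecode_length: int) -> int:
    """Resolve a label to an absolute byte position in the bytecode string.

    The offset is relative to the start of the instruction that follows
    the jump, and may be positive or negative.
    """
    target = next_instruction_start + label_offset
    # Assumption: a label must land on a byte inside the bytecode string.
    if not 0 <= target < bytecode_length:
        raise ValueError("label points outside the bytecode string")
    return target
```

For example, if the instruction after a jump starts at byte 10, an offset of -4 targets byte 6 and an offset of 0 falls through to byte 10.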
The values for the source codes are defined in source.h. The code that evaluates source
operands is interp/source.C.
Initial Source Codes
A source operand starts with one of the following codes. If this is the
only code for the source operand, the high-order bit (0x80) is set.
- nil
- zero ; only valid in integer sources
- imm1 signed_int(!=0) ; only valid in integer sources
- imm2 hi_int+1 low_int+1 ; value is hi_int * 255 + low_int - 127*255 ; only valid in integer sources
- lit literal#+1
- far
- cur_ret
- cur_ret_offset
- cur_local var#+1
- cur_num_args
- cur_rest_len ; only valid in integer sources
- cur_rest var#+1
- cur_keyword literal#+1 literal#+1
- global literal#+1
- symget literal#+1
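
The imm2 arithmetic is easier to follow with a worked example. The sketch below applies the formula given for imm2 (value is hi_int * 255 + low_int - 127*255), assuming hi_int and low_int each range over 0..254 so that the stored bytes hi_int+1 and low_int+1 are never zero. The function names are hypothetical.

```python
from typing import Tuple

BASE = 255        # each stored byte carries one base-255 digit
BIAS = 127 * 255  # subtracted so that negative values can be represented


def decode_imm2(hi_byte: int, low_byte: int) -> int:
    """Decode the two argument bytes of imm2 into a signed integer."""
    hi_int, low_int = hi_byte - 1, low_byte - 1
    return hi_int * BASE + low_int - BIAS


def encode_imm2(value: int) -> Tuple[int, int]:
    """Encode a signed integer as the two zero-free argument bytes of imm2."""
    n = value + BIAS
    if not 0 <= n < BASE * BASE:
        raise ValueError("value out of range for imm2")
    hi_int, low_int = divmod(n, BASE)
    return hi_int + 1, low_int + 1
```

Under these assumptions the representable range is -32385 to 32639, and the value 1000 is stored as the bytes 131 and 236. The `wait` destination code defines its bc_offset with the same arithmetic.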
Extension Source Codes
These codes may be used to extend a source operand. There may be any number
of these after the initial source code (above). The last extension source
code has the high-order bit (0x80) set.
- fun_body follows far
- ret follows far
- ret_offset follows far
- local var#+1 follows far
- num_args follows far
- rest var#+1 follows far
- keyword literal#+1 literal#+1 follows far
- vsub index#(>=0)+1 follows vector
- hget literal#+1 follows hash
- refget follows ref
- symget follows symbol
- sintern follows string
- sgensym follows string
- clsof follows user_object
- field literal#+1 follows user_object
- lref follows link
- deref follows any object
All of the remaining source codes are only valid for integer sources:
- rest_len follows far
- slen follows string
- vlen follows vector
- hlen follows hash
- cont_num_values follows continuation
- plus signed_int(!=0) follows integer
- times signed_int(!=0) follows integer
- divide signed_int(!=0) follows integer
- remainder signed_int(!=0) follows integer
DESTINATION OPERANDS
Destination operands specify two things: a continuation object (either
already existing or to be newly created) and the index to be passed to its
'store' operation.
These may be specified with two operand sequences: the first for the
continuation object and the second for the index. But, if the final
continuation code has the high-order bit (0x80) set, there is no sequence
for the index and the index is taken as 0 (except for 'cur_ret', see below).
The destination codes are defined in dest.h.
The code that evaluates destination operands is interp/dest.C.
The continuation object operand sequence may select an existing continuation
or may create a new standard continuation. (Argument
continuations are only created by function values on prepare_call or
prepare_send.)
Selecting Existing Continuation
The destination codes for using existing continuations are the following.
If the high-order bit is set, the index is taken as 0 (except for 'cur_ret',
where it is taken as the 'cur_ret_offset'), otherwise a source integer operand
follows to specify the index.
- cur_ret
- local var#+1
- cont_follows ; source obj operand follows
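
The index rule for existing continuations can be summarized in a few lines. This is a sketch only: the real code numbers are in dest.h, so the caller passes a hypothetical `is_cur_ret` flag instead of comparing against an actual code value.

```python
from typing import Optional

LAST_CODE_BIT = 0x80  # high-order bit on the continuation code


def implicit_index(code_byte: int, is_cur_ret: bool,
                   cur_ret_offset: int) -> Optional[int]:
    """Return the implied store index, or None if an explicit one follows.

    When the high-order bit is clear, an integer source operand follows
    and supplies the index, so nothing is implied here.
    """
    if not code_byte & LAST_CODE_BIT:
        return None
    # With the bit set, the index is 0, except for cur_ret, which uses
    # the current return offset instead.
    return cur_ret_offset if is_cur_ret else 0
```

A result of None tells the caller to go on and evaluate the integer source operand that follows the continuation code.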
Creating a New Standard Continuation
Alternatively, the following sequence may be used to create a standard
continuation. Three things must be specified: the wait information, the
allocation region, and the destination address. These are specified by three
operand sequences in that order. If the destination address code has the
high-order bit set, the index is taken as 0, otherwise a source integer
operand follows the destination address sequence to specify the index.
The wait codes are:
- <omitted> ; no wait processing
- wait hi_bc+1 low_bc+1 wait_count(!=0) ; bc_offset = hi_bc*255 + low_bc - 127*255 ; specify new wait
- wait_follows ; wait from following source obj operand
The region codes are:
- <omitted> ; use region of destination address, below
- cur_far
- global
- region_of ; any object in following source obj operand
- cont_region index+1 ; does get_region on continuation in following source obj operand
The destination address codes are:
- discard num_values#+0x80
- cur_local var#+1 num_values#+0x80
- global literal#+1
- symset literal#+1
- local var#+1 num_values#+0x80 ; source obj operand follows
- vsub ; index int source operand ; num_values int source operand ; vector obj source operand
- hset ; key obj source operand ; hash source obj operand
- ref ; ref source obj operand follows
- symset_of ; symbol source obj operand follows
- field literal#+1 ; user_object source obj operand follows