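The word thunk has at least three related meanings in computer science. A "thunk" may be a piece of code that performs a delayed computation (as in call-by-name and call-by-need evaluation), a compiler-generated wrapper function used in some virtual function table implementations, or a mapping of calls and data from one system-specific form to another, usually for compatibility reasons.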
In all three senses, the word thunk refers to a piece of low-level code, usually machine-generated, that implements some detail of a particular software system.
In call-by-need, the thunk is replaced by its return value after its first execution. In languages with late binding, the "computation" performed by the thunk may include a lookup in the run-time context of the program to determine the current binding of a variable.
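As an illustration of the call-by-need behaviour described above, the following is a minimal C++17 sketch of a memoizing thunk. The class name `LazyThunk` and its `force` method are hypothetical names chosen for this example, not part of any standard library.

```cpp
#include <functional>
#include <optional>
#include <utility>

// Hypothetical call-by-need thunk: wraps a delayed computation and
// caches the result the first time it is forced.
template <typename T>
class LazyThunk {
public:
    explicit LazyThunk(std::function<T()> compute)
        : compute_(std::move(compute)) {}

    // Runs the computation on the first call only; later calls
    // return the cached value without re-running it.
    const T& force() {
        if (!value_) {
            value_ = compute_();
            compute_ = nullptr;  // the computation is no longer needed
        }
        return *value_;
    }

private:
    std::function<T()> compute_;
    std::optional<T> value_;
};
```

Constructing `LazyThunk<int> t([] { return 6 * 7; });` performs no work; the lambda runs on the first `t.force()`, and later calls reuse the stored result. A call-by-name thunk would instead re-run the computation on every use.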
An early implementation of thunks for call-by-name was in Algol 60.
    begin
      integer idx;
      real procedure sum (i, lo, hi, term);
        value lo, hi;
        integer i, lo, hi;
        real term;
        comment term is passed by-name, and so is i;
      begin
        real temp;
        temp := 0;
        for i := lo step 1 until hi do
          temp := temp + term;
        sum := temp
      end;
      print (sum (idx, 1, 100, 1/idx))
    end

The above example (see Jensen's Device) relies on the fact that the actual parameters idx and 1/idx are passed "by name", so that the program is equivalent to the "inlined" version:
    begin
      integer idx;
      real sum;
      begin
        real temp;
        temp := 0;
        for idx := 1 step 1 until 100 do
          temp := temp + 1/idx;
        sum := temp
      end;
      print (sum)
    end

Notice that the expression 1/idx is not evaluated at the point of the call to sum; instead, it is evaluated anew each time the formal parameter term is mentioned in the definition of sum. A compiler using thunks to implement call-by-name would process the original code as if it had been written using function pointers or lambdas (represented below in Algol-like pseudocode):
    begin
      integer idx;
      real procedure sum (i_lvalue, lo, hi, term_rvalue);
        value lo, hi;
        integer lo, hi;
        thunk i_lvalue;
        thunk term_rvalue;
      begin
        real temp;
        temp := 0;
        for i_lvalue() := lo step 1 until hi do
          temp := temp + term_rvalue();
        sum := temp
      end;
      procedure lvalue_of_idx ();
      begin
        lvalue_of_idx := address of idx
      end;
      procedure rvalue_of_1_over_idx ();
      begin
        rvalue_of_1_over_idx := 1/idx
      end;
      print (sum (lvalue_of_idx, 1, 100, rvalue_of_1_over_idx))
    end

The procedures lvalue_of_idx and rvalue_of_1_over_idx would be generated automatically by the compiler whenever a call-by-name actual parameter was encountered. These automatically generated procedures would be called thunks.
Thunks also arise naturally in other situations, for example in the implementation of constant functions, which may be useful in higher-order programming. In Common Lisp, constant functions are created by constantly: (constantly 6) evaluates to a thunk that, when called, always yields the value 6.
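A rough C++14 analogue of a constant function can be written with a generic lambda. The function name `constantly` below is borrowed from Common Lisp for illustration; it is not a C++ standard library facility.

```cpp
// Hypothetical analogue of Common Lisp's constantly: returns a callable
// that ignores any arguments and always yields the captured value.
template <typename T>
auto constantly(T value) {
    return [value](auto&&...) { return value; };
}
```

With this sketch, `auto six = constantly(6);` produces a callable for which both `six()` and `six(1, 2)` yield 6.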
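Some C++ compilers also generate functions called thunks to optimize virtual function calls in the presence of multiple inheritance. Consider the following code:

    struct A {
        int value;
        virtual int access() { return this->value; }
    };
    struct B {
        int value;
        virtual int access() { return this->value; }
    };
    struct C : public A, public B {
        int better_value;
        virtual int access() { return this->better_value; }
    };
    int use(B *b)
    {
        return b->access();
    }
    // ... C c; use(&c); ...
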
Since the function B::access is virtual, the call to b->access() requires a vtable dispatch. In naïve implementations, the dispatch will consist of five steps:
1. The object that b points to holds a pointer to its vtable. Load that pointer into a register.
2. B::access is at some known offset in the vtable for B; find that entry E.
3. E holds a pointer to the function to call (for example, C::access). Load that function pointer.
4. Since C::access expects a this pointer to an instance of C, but b points to the B subobject, we must decrement b by the offset of B in C (in this example, probably 8 bytes on a typical 32-bit platform: the size of C::A::value plus the size of C::A's vtable pointer). Since this offset is not known to the function use at compile time, it must also be loaded from E.
5. Call C::access with the adjusted value of b.
The fourth step, in which an offset is loaded from E and added to b, can be completely eliminated by the compiler, thus speeding up every virtual method call, if the compiler generates a wrapper function like this, and places its address in the vtable entry E:
    int thunk_for_C_access_in_B(B *b)
    {
        C *adjusted_b = (C *)b; /* decrements "b" by 8 */
        return adjusted_b->C::access(); /* a tail call to the original method */
    }
Then the steps for use() become:
1. The object that b points to holds a pointer to its vtable. Load that pointer into a register.
2. B::access is at some known offset in the vtable for B; find that entry E.
3. E holds a pointer to the function to call (for example, B::access or the thunk for C::access). Load that function pointer W.
4. Call W with the unadjusted value of b.

If b was really of dynamic type B, then W = B::access, and so we have saved two instructions (an expensive memory load and a cheap addition). If b was really of dynamic type C, then W = thunk_for_C_access_in_B, and so we have added one instruction (a cheap unconditional branch at the end of thunk_for_C_access_in_B).
Since the particular pattern of multiple inheritance in class C is rare in practice, we will generally save more instructions than we add. At the same time, we no longer need to store an offset for each entry E in the vtable, and so we have halved the size of every vtable in the program.
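The pointer adjustment discussed above can be observed in ordinary C++. The sketch below reuses the classes from the example, adds member initializers so the results are visible, and writes the thunk by hand; a real compiler would emit the thunk itself and place it in the vtable. The helper `offset_of_B_in_C` is a hypothetical name, and the offset it reports depends on the platform ABI.

```cpp
#include <cstddef>

struct A {
    int value = 1;
    virtual int access() { return value; }
};
struct B {
    int value = 2;
    virtual int access() { return value; }
};
struct C : public A, public B {
    int better_value = 3;
    int access() override { return better_value; }
};

// Hand-written stand-in for the compiler-generated thunk.
int thunk_for_C_access_in_B(B* b) {
    C* adjusted_b = static_cast<C*>(b);  // subtracts the offset of B within C
    return adjusted_b->C::access();      // qualified name: non-virtual call
}

// Hypothetical helper: distance in bytes from the start of a C object
// to its B subobject (ABI-dependent).
std::ptrdiff_t offset_of_B_in_C() {
    C c;
    B* b = &c;  // implicit conversion adds the offset
    return reinterpret_cast<char*>(b) - reinterpret_cast<char*>(&c);
}

int use(B* b) { return b->access(); }
```

Given `C c;`, both `use(&c)` and `thunk_for_C_access_in_B(&c)` reach `C::access`, while `use` applied to a plain `B` object reaches `B::access` with no adjustment.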
The name "thunk" for these compiler-generated functions is something of a misnomer, since they don't have anything to do with delaying computation and could have been described simply as compiler-generated wrapper functions, but the term "thunk" for these functions is now quite well established.
Thunking may also refer to creating a 16-bit virtual DOS machine (VDM) within a 32-bit operating platform so that there is backward compatibility for applications using older code or system calls.
The most common usage is in the Win16/Win32 APIs, where thunking is used to convert a 16-bit address into a 32-bit equivalent or vice versa. An early example was Windows for Workgroups version 3.11, which shipped with a 32-bit TCP/IP protocol stack (code-named "Wolverine"; this was an early implementation of the TCP/IP stack that would later ship with Windows 95). To allow this stack to operate with 16-bit applications, a version of the 16-bit winsock.dll library was included that simply thunked WinSock calls into the 32-bit stack.
Microsoft later created a mostly complete thunking layer, called Win32s, which allowed 32-bit Windows applications (written to a specific subset of the Win32 API, hence the "s" in Win32s) to run on top of 16-bit Windows 3.1x. In many ways, Windows 95 was essentially a full-scale expansion of Win32s, because many of the underpinnings of Windows 95 were still 16-bit.
Similar thunking was required in many cases in OS/2 2.x—while most of the operating system was 32-bit, many parts of the kernel and device drivers were 16-bit for compatibility reasons.
Thunking was used in Windows NT/2000 compatibility subsystems: the OS/2 subsystem allowed 16-bit console-mode OS/2 applications to run on Windows NT (x86 only), and the Windows on Windows (a.k.a. "WoW") subsystem permitted 16-bit Windows applications the same ability. The OS/2 subsystem was dropped after Windows 2000, and the WoW subsystem is not provided in 64-bit versions of Windows. 64-bit versions of Windows provide a similar thunking layer, WOW64, to allow use of 32-bit Windows applications.