Ruby Hacking Guide

Chapter 16: Blocks

Iterator

In this chapter, BLOCK, which is the last big name among the seven Ruby stacks, comes in. After finishing this, the internal state of the evaluator is virtually understood.

The Whole Picture

What is the mechanism of iterators? First, let’s think about a small program as below:

▼The Source Program

iter_method() do
  9   # a mark to find this block
end

Let’s check the terms just in case. As for this program, iter_method is an iterator method, do ~ end is an iterator block. Here is the syntax tree of this program being dumped.

▼Its Syntax Tree

NODE_ITER
nd_iter:
    NODE_FCALL
    nd_mid = 9617 (iter_method)
    nd_args = (null)
nd_var = (null)
nd_body:
    NODE_LIT
    nd_lit = 9:Fixnum

Looking for the block by using the 9 written in the iterator block as a trace, we can understand that NODE_ITER seems to represent the iterator block. And NODE_FCALL which calls iter_method is at the “below” of that NODE_ITER. In other words, the node of iterator block appears earlier than the call of the iterator method. This means, before calling an iterator method, a block is pushed at another node.

And checking by following the flow of code with debugger, I found that the invocation of an iterator is separated into 3 steps: NODE_ITER NODE_CALL and NODE_YIELD. This means,

that’s all.

Push a block

First, let’s start with the first step, that is NODE_ITER, which is the node to push a block.

▼ `rb_eval()` − `NODE_ITER` (simplified)

case NODE_ITER:
  {
    iter_retry:
      PUSH_TAG(PROT_FUNC);
      PUSH_BLOCK(node->nd_var, node->nd_body);

      state = EXEC_TAG();
      if (state == 0) {
          PUSH_ITER(ITER_PRE);
          result = rb_eval(self, node->nd_iter);
          POP_ITER();
      }
      else if (_block.tag->dst == state) {
          state &= TAG_MASK;
          if (state == TAG_RETURN || state == TAG_BREAK) {
              result = prot_tag->retval;
          }
      }
      POP_BLOCK();
      POP_TAG();
      switch (state) {
        case 0:
          break;

        case TAG_RETRY:
          goto iter_retry;

        case TAG_BREAK:
          break;

        case TAG_RETURN:
          return_value(result);
          /* fall through */
        default:
          JUMP_TAG(state);
      }
  }
  break;

Since the original code contains the support of the for statement, it is deleted. After removing the code relating to tags, there are only push/pop of ITER and BLOCK left. Because the rest is ordinarily doing rb_eval() with NODE_FCALL, these ITER and BLOCK are the necessary conditions to turn a method into an iterator.

The necessity of pushing BLOCK is fairly reasonable, but what’s ITER for? Actually, to think about the meaning of ITER, you need to think from the viewpoint of the side that uses BLOCK.

For example, suppose a method is just called. And ruby_block exists. But since BLOCK is pushed regardless of the break of method calls, the existence of a block does not mean the block is pushed for that method. It’s possible that the block is pushed for the previous method. (Figure 1)

figure 1: no one-to-one correspondence between `FRAME` and `BLOCK`
figure 1: no one-to-one correspondence between FRAME and BLOCK

So, in order to determine for which method the block is pushed, ITER is used. BLOCK is not pushed for each FRAME because pushing BLOCK is a little heavy. How much heavy is, let’s check it in practice.

PUSH_BLOCK()

The argument of PUSH_BLOCK() is (the syntax tree of) the block parameter and the block body.

▼ `PUSH_BLOCK() POP_BLOCK()`

 592  #define PUSH_BLOCK(v,b) do { \
 593      struct BLOCK _block;                  \
 594      _block.tag = new_blktag();            \
 595      _block.var = v;                       \
 596      _block.body = b;                      \
 597      _block.self = self;                   \
 598      _block.frame = *ruby_frame;           \
 599      _block.klass = ruby_class;            \
 600      _block.frame.node = ruby_current_node;\
 601      _block.scope = ruby_scope;            \
 602      _block.prev = ruby_block;             \
 603      _block.iter = ruby_iter->iter;        \
 604      _block.vmode = scope_vmode;           \
 605      _block.flags = BLOCK_D_SCOPE;         \
 606      _block.dyna_vars = ruby_dyna_vars;    \
 607      _block.wrapper = ruby_wrapper;        \
 608      ruby_block = &_block

 610  #define POP_BLOCK() \
 611     if (_block.tag->flags & (BLOCK_DYNAMIC))              \
 612         _block.tag->flags |= BLOCK_ORPHAN;                \
 613     else if (!(_block.scope->flags & SCOPE_DONT_RECYCLE)) \
 614         rb_gc_force_recycle((VALUE)_block.tag);           \
 615     ruby_block = _block.prev;                             \
 616  } while (0)

(eval.c)

Let’s make sure that a BLOCK is “the snapshot of the environment of the moment of creation”. As a proof of it, except for CREF and BLOCK, the six stack frames are saved. CREF can be substituted by ruby_frame->cbase, there’s no need to push.

And, I’d like to check the three points about the mechanism of push. BLOCK is fully allocated on the stack. BLOCK contains the full copy of FRAME at the moment. BLOCK is different from the other many stack frame structs in having the pointer to the previous BLOCK (prev).

The flags used in various ways at POP_BLOCK() is not explained now because it can only be understood after seeing the implementation of Proc later.

And the talk is about “BLOCK is heavy”, certainly it seems a little heavy. When looking inside of new_blktag(), we can see it does malloc() and store plenty of members. But let’s defer the final judge until after looking at and comparing with PUSH_ITER().

PUSH_ITER()

▼ `PUSH_ITER() POP_ITER()`

 773  #define PUSH_ITER(i) do {               \
 774      struct iter _iter;                  \
 775      _iter.prev = ruby_iter;             \
 776      _iter.iter = (i);                   \
 777      ruby_iter = &_iter

 779  #define POP_ITER()                      \
 780      ruby_iter = _iter.prev;             \
 781  } while (0)

(eval.c)

On the contrary, this is apparently light. It only uses the stack space and has only two members. Even if this is pushed for each FRAME, it would probably matter little.

Iterator Method Call

After pushing a block, the next thing is to call an iterator method (a method which is an iterator). There also needs a little machinery. Do you remember that there’s a code to modify the value of ruby_iter at the beginning of rb_call0? Here.

▼ `rb_call0()` − moving to `ITER_CUR`

4498      switch (ruby_iter->iter) {
4499        case ITER_PRE:
4500          itr = ITER_CUR;
4501          break;
4502        case ITER_CUR:
4503        default:
4504          itr = ITER_NOT;
4505          break;
4506      }

(eval.c)

Since ITER_PRE is pushed previously at NODE_TER, this code makes ruby_iter ITER_CUR. At this moment, a method finally “becomes” an iterator. Figure 2 shows the state of the stacks.

figure 2: the state of the Ruby stacks on an iterator call.
figure 2: the state of the Ruby stacks on an iterator call.

The possible value of ruby_iter is not the one of two boolean values (for that method or not), but one of three steps because there’s a little gap between the timings when pushing a block and invoking an iterator method. For example, there’s the evaluation of the arguments of an iterator method. Since it’s possible that it contains method calls inside it, there’s the possibility that one of that methods mistakenly thinks that the just pushed block is for itself and uses it during the evaluation. Therefore, the timing when a method becomes an iterator, this means turning into ITER_CUR, has to be the place inside of rb_call() that is just before finishing the invocation.

▼ the processing order

method(arg) { block } # push a block
method(arg) { block } # evaluate the aruguments
method(arg) { block } # a method call

For example, in the last chapter “Method”, there’s a macro named BEGIN_CALLARGS at a handler of NODE_CALL. This is where making use of the third step ITER. Let’s go back a little and try to see it.

BEGIN_CALLARGS END_CALLARGS

▼ `BEGIN_CALLARGS END_CALLARGS`

1812  #define BEGIN_CALLARGS do {\
1813      struct BLOCK *tmp_block = ruby_block;\
1814      if (ruby_iter->iter == ITER_PRE) {\
1815          ruby_block = ruby_block->prev;\
1816      }\
1817      PUSH_ITER(ITER_NOT)

1819  #define END_CALLARGS \
1820      ruby_block = tmp_block;\
1821      POP_ITER();\
1822  } while (0)

(eval.c)

When ruby_iter is ITER_PRE, a ruby_block is set aside. This code is important, for instance, in the below case:

obj.m1 { yield }.m2 { nil }

The evaluation order of this expression is:

Therefore, if there was not BEGIN_CALLARGS, m1 will call the block of m2.

And, if there’s one more iterator connected, the number of BEGIN_CALLARGS increases at the same time in this case, so there’s no problem.

Block Invocation

The third phase of iterator invocation, it means the last phase, is block invocation.

▼ `rb_eval()` − `NODE_YIELD`

2579        case NODE_YIELD:
2580          if (node->nd_stts) {
2581              result = avalue_to_yvalue(rb_eval(self, node->nd_stts));
2582          }
2583          else {
2584              result = Qundef;    /* no arg */
2585          }
2586          SET_CURRENT_SOURCE();
2587          result = rb_yield_0(result, 0, 0, 0);
2588          break;

(eval.c)

nd_stts is the parameter of yield. avalue_to_yvalue() was mentioned a little at the multiple assignments, but you can ignore this.
((errata: actually, it was not mentioned. You can ignore this anyway.))
The heart of the behavior is not this but rb_yield_0(). Since this function is also very long, I show the code after extremely simplifying it. Most of the methods to simplify are previously used.

And this time, I turn on the “optimize for readability option” as follows.

If things are done until this, it becomes very shorter.

▼ `rb_yield_0()` (simplified)

static VALUE
rb_yield_0(val, self, klass, /* pcall=0 */)
    VALUE val, self, klass;
{
    volatile VALUE result = Qnil;
    volatile VALUE old_cref;
    volatile VALUE old_wrapper;
    struct BLOCK * volatile block;
    struct SCOPE * volatile old_scope;
    struct FRAME frame;
    int state;

    PUSH_VARS();
    PUSH_CLASS();
    block = ruby_block;
    frame = block->frame;
    frame.prev = ruby_frame;
    ruby_frame = &(frame);
    old_cref = (VALUE)ruby_cref;
    ruby_cref = (NODE*)ruby_frame->cbase;
    old_wrapper = ruby_wrapper;
    ruby_wrapper = block->wrapper;
    old_scope = ruby_scope;
    ruby_scope = block->scope;
    ruby_block = block->prev;
    ruby_dyna_vars = new_dvar(0, 0, block->dyna_vars);
    ruby_class = block->klass;
    self = block->self;

    /* set the block arguments */
    massign(self, block->var, val, pcall);

    PUSH_ITER(block->iter);
    /* execute the block body */
    result = rb_eval(self, block->body);
    POP_ITER();

    POP_CLASS();
    /* ……collect ruby_dyna_vars…… */
    POP_VARS();
    ruby_block = block;
    ruby_frame = ruby_frame->prev;
    ruby_cref = (NODE*)old_cref;
    ruby_wrapper = old_wrapper;
    ruby_scope = old_scope;

    return result;
}

As you can see, the most stack frames are replaced with what saved at ruby_block. Things to simple save/restore are easy to understand, so let’s see the handling of the other frames we need to be careful about.

FRAME

struct FRAME frame;

frame = block->frame;     /* copy the entire struct */
frame.prev = ruby_frame;  /* by these two lines…… */
ruby_frame = &(frame);    /* ……frame is pushed */

Differing from the other frames, a FRAME is not used in the saved state, but a new FRAME is created by duplicating. This would look like Figure 3.

figure 3: push a copied frame
figure 3: push a copied frame

As we’ve seen the code until here, it seems that FRAME will never be “reused”. When pushing FRAME, a new FRAME will always be created.

BLOCK

block = ruby_block;
         :
ruby_block = block->prev;
         :
ruby_block = block;

What is the most mysterious is this behavior of BLOCK. We can’t easily understand whether it is saving or popping. It’s comprehensible that the first statement and the third statement are as a pair, and the state will be eventually back. However, what is the consequence of the second statement?

To put the consequence of I’ve pondered a lot in one phrase, “going back to the ruby_block of at the moment when pushing the block”. An iterator is, in short, the syntax to go back to the previous frame. Therefore, all we have to do is turning the state of the stack frame into what was at the moment when creating the block. And, the value of ruby_block at the moment when creating the block is, it seems certain that it was block->prev. Therefore, it is contained in prev.

Additionally, for the question “is it no problem to assume what invoked is always the top of ruby_block?”, there’s no choice but saying “as the rb_yield_0 side, you can assume so”. To push the block which should be invoked on the top of the ruby_block is the work of the side to prepare the block, and not the work of rb_yield_0.

An example of it is BEGIN_CALLARGS which was discussed in the previous chapter. When an iterator call cascades, the two blocks are pushed and the top of the stack will be the block which should not be used. Therefore, it is purposefully checked and set aside.

VARS

Come to think of it, I think we have not looked the contents of PUSH_VARS() and POP_VARS() yet. Let’s see them here.

▼ `PUSH_VARS() POP_VARS()`

 619  #define PUSH_VARS() do { \
 620      struct RVarmap * volatile _old; \
 621      _old = ruby_dyna_vars;          \
 622      ruby_dyna_vars = 0

 624  #define POP_VARS() \
 625     if (_old && (ruby_scope->flags & SCOPE_DONT_RECYCLE)) {   \
 626         if (RBASIC(_old)->flags) /* if were not recycled */ \
 627             FL_SET(_old, DVAR_DONT_RECYCLE);                  \
 628      }                                                        \
 629      ruby_dyna_vars = _old;                                   \
 630  } while (0)

(eval.c)

This is also not pushing a new struct, to say “set aside/restore” is closer. In practice, in rb_yield_0, PUSH_VARS() is used only to set aside the value. What actually prepares ruby_dyna_vars is this line.

ruby_dyna_vars = new_dvar(0, 0, block->dyna_vars);

This takes the dyna_vars saved in BLOCK and sets it. An entry is attached at the same time. I’d like you to recall the description of the structure of ruby_dyna_vars in Part 2, it said the RVarmap whose id is 0 such as the one created here is used as the break between block scopes.

However, in fact, between the parser and the evaluator, the form of the link stored in ruby_dyna_vars is slightly different. Let’s look at the dvar_asgn_curr() function, which assigns a block local variable at the current block.

▼ `dvar_asgn_curr()`

 737  static inline void
 738  dvar_asgn_curr(id, value)
 739      ID id;
 740      VALUE value;
 741  {
 742      dvar_asgn_internal(id, value, 1);
 743  }

 699  static void
 700  dvar_asgn_internal(id, value, curr)
 701      ID id;
 702      VALUE value;
 703      int curr;
 704  {
 705      int n = 0;
 706      struct RVarmap *vars = ruby_dyna_vars;
 707
 708      while (vars) {
 709          if (curr && vars->id == 0) {
 710              /* first null is a dvar header */
 711              n++;
 712              if (n == 2) break;
 713          }
 714          if (vars->id == id) {
 715              vars->val = value;
 716              return;
 717          }
 718          vars = vars->next;
 719      }
 720      if (!ruby_dyna_vars) {
 721          ruby_dyna_vars = new_dvar(id, value, 0);
 722      }
 723      else {
 724          vars = new_dvar(id, value, ruby_dyna_vars->next);
 725          ruby_dyna_vars->next = vars;
 726      }
 727  }

(eval.c)

The last if statement is to add a variable. If we focus on there, we can see a link is always pushed in at the “next” to ruby_dyna_vars. This means, it would look like Figure 4.

figure 4: the structure of `ruby_dyna_vars`
figure 4: the structure of ruby_dyna_vars

This differs from the case of the parser in one point: the headers (id=0) to indicate the breaks of scopes are attached before the links. If a header is attached after the links, the first one of the scope cannot be inserted properly. (Figure 5)
((errata: It was described that ruby_dyna_vars of the evaluator always forms a single straight link. But according to the errata, it was wrong. That part and relevant descriptions are removed.))

figure 5: The entry cannot be inserted properly.
figure 5: The entry cannot be inserted properly.

Target Specified Jump

The code relates to jump tags are omitted in the previously shown code, but there’s an effort that we’ve never seen before in the jump of rb_yield_0. Why is the effort necessary? I’ll tell the reason in advance. I’d like you to see the below program:

[0].each do
  break
end
# the place to reach by break

like this way, in the case when doing break from inside of a block, it is necessary to get out of the block and go to the method that pushed the block. What does it actually mean? Let’s think by looking at the (dynamic) call graph when invoking an iterator.

rb_eval(NODE_ITER)                   .... catch(TAG_BREAK)
    rb_eval(NODE_CALL)               .... catch(TAG_BREAK)
        rb_eval(NODE_YIELD)
            rb_yield_0
                rb_eval(NODE_BREAK)  .... throw(TAG_BREAK)

Since what pushed the block is NODE_ITER, it should go back to a NODE_ITER when doing break. However, NODE_CALL is waiting for TAG_BREAK before NODE_ITER, in order to turn a break over methods into an error. This is a problem. We need to somehow find a way to go straight back to a NODE_ITER.

And actually, “going back to a NODE_ITER” will still be a problem. If iterators are nesting, there could be multiple NODE_ITERs, thus the one corresponds to the current block is not always the first NODE_ITER. In other words, we need to restrict only “the NODE_ITER that pushed the currently being invoked block”

Then, let’s see how this is resolved.

▼ `rb_yield_0()` − the parts relates to tags

3826      PUSH_TAG(PROT_NONE);
3827      if ((state = EXEC_TAG()) == 0) {
              /* ……evaluate the body…… */
3838      }
3839      else {
3840          switch (state) {
3841            case TAG_REDO:
3842              state = 0;
3843              CHECK_INTS;
3844              goto redo;
3845            case TAG_NEXT:
3846              state = 0;
3847              result = prot_tag->retval;
3848              break;
3849            case TAG_BREAK:
3850            case TAG_RETURN:
3851              state |= (serial++ << 8);
3852              state |= 0x10;
3853              block->tag->dst = state;
3854              break;
3855            default:
3856              break;
3857          }
3858      }
3859      POP_TAG();

(eval.c)

The parts of TAG_BREAK and TAG_RETURN are crucial.

First, serial is a static variable of rb_yield_0(), its value will be different every time calling rb_yield_0. “serial” is the serial of “serial number”.

The reason why left shifting by 8 bits seems in order to avoid overlapping the values of TAG_xxxx. TAG_xxxx is in the range between 0x1 ~ 0x8, 4 bits are enough. And, the bit-or of 0x10 seems to prevent serial from overflow. In 32-bit machine, serial can use only 24 bits (only 16 million times), recent machine can let it overflow within less than 10 seconds. If this happens, the top 24 bits become all 0 in line. Therefore, if 0x10 did not exist, state would be the same value as TAG_xxxx (See also Figure 6).

figure 6: `block->tag->dst`
figure 6: block->tag->dst

Now, tag->dst became the value which differs from TAG_xxxx and is unique for each call. In this situation, because an ordinary switch as previous ones cannot receive it, the side to stop jumps should need efforts to some extent. The place where making an effort is this place of rb_eval:NODE_ITER:

▼ `rb_eval()` − `NODE_ITER` (to stop jumps)

case NODE_ITER:
  {
      state = EXEC_TAG();
      if (state == 0) {
          /* …… invoke an iterator …… */
      }
      else if (_block.tag->dst == state) {
          state &= TAG_MASK;
          if (state == TAG_RETURN || state == TAG_BREAK) {
              result = prot_tag->retval;
          }
      }
  }

In corresponding NODE_ITER and rb_yield_0, block should point to the same thing, so tag->dst which was set at rb_yield_0 comes in here. Because of this, only the corresponding NODE_ITER can properly stop the jump.

Check of a block

Whether or not a currently being evaluated method is an iterator, in other words, whether there’s a block, can be checked by rb_block_given_p(). After reading the above all, we can tell its implementation.

▼ `rb_block_given_p()`

3726  int
3727  rb_block_given_p()
3728  {
3729      if (ruby_frame->iter && ruby_block)
3730          return Qtrue;
3731      return Qfalse;
3732  }

(eval.c)

I think there’s no problem. What I’d like to talk about this time is actually another function to check, it is rb_f_block_given_p().

▼ `rb_f_block_given_p()`

3740  static VALUE
3741  rb_f_block_given_p()
3742  {
3743      if (ruby_frame->prev && ruby_frame->prev->iter && ruby_block)
3744          return Qtrue;
3745      return Qfalse;
3746  }

(eval.c)

This is the substance of Ruby’s block_given?. In comparison to rb_block_given_p(), this is different in checking the prev of ruby_frame. Why is this?

Thinking about the mechanism to push a block, to check the current ruby_frame like rb_block_given_p() is right. But when calling block_given? from Ruby-level, since block_given? itself is a method, an extra FRAME is pushed. Hence, we need to check the previous one.

Proc

To describe a Proc object from the viewpoint of implementing, it is “a BLOCK which can be bring out to Ruby level”. Being able to bring out to Ruby level means having more latitude, but it also means when and where it will be used becomes completely unpredictable. Focusing on how the influence of this fact is, let’s look at the implementation.

Proc object creation

A Proc object is created with Proc.new. Its substance is proc_new().

▼ `proc_new()`

6418  static VALUE
6419  proc_new(klass)
6420      VALUE klass;
6421  {
6422      volatile VALUE proc;
6423      struct BLOCK *data, *p;
6424      struct RVarmap *vars;
6425
6426      if (!rb_block_given_p() && !rb_f_block_given_p()) {
6427          rb_raise(rb_eArgError,
                "tried to create Proc object without a block");
6428      }
6429
          /* (A)allocate both struct RData and struct BLOCK */
6430      proc = Data_Make_Struct(klass, struct BLOCK,
                                  blk_mark, blk_free, data);
6431      *data = *ruby_block;
6432
6433      data->orig_thread = rb_thread_current();
6434      data->wrapper = ruby_wrapper;
6435      data->iter = data->prev?Qtrue:Qfalse;
          /* (B)the essential initialization is finished by here */
6436      frame_dup(&data->frame);
6437      if (data->iter) {
6438          blk_copy_prev(data);
6439      }
6440      else {
6441          data->prev = 0;
6442      }
6443      data->flags |= BLOCK_DYNAMIC;
6444      data->tag->flags |= BLOCK_DYNAMIC;
6445
6446      for (p = data; p; p = p->prev) {
6447          for (vars = p->dyna_vars; vars; vars = vars->next) {
6448              if (FL_TEST(vars, DVAR_DONT_RECYCLE)) break;
6449              FL_SET(vars, DVAR_DONT_RECYCLE);
6450          }
6451      }
6452      scope_dup(data->scope);
6453      proc_save_safe_level(proc);
6454
6455      return proc;
6456  }

(eval.c)

The creation of a Proc object itself is unexpectedly simple. Between (A) and (B), a space for an Proc object is allocated and its initialization completes. Data_Make_Struct() is a simple macro that does both malloc() and Data_Wrap_Struct() at the same time.

The problems exist after that:

These four have the same purposes. They are:

Here, “all” means the all things including prev. For the all stack frames pushed there, it duplicates each frame by doing malloc() and copying. VARS is usually forced to be collected by rb_gc_force_recycle() at the same moment of POP, but this behavior is stopped by setting the DVAR_DONT_RECYCLE flag. And so on. Really extreme things are done.

Why are these extreme things necessary? This is because, unlike iterator blocks, a Proc can persist longer than the method that created it. And the end of a method means the things allocated on the machine stack such as FRAME, ITER, and local_vars of SCOPE are invalidated. It’s easy to predict what the consequence of using the invalidated memories. (An example answer: it becomes troublesome).

I tried to contrive a way to at least use the same FRAME from multiple Proc, but since there are the places such as old_frame where setting aside the pointers to the local variables, it does not seem going well. If it requires a lot efforts in anyway, another effort, say, allocating all of them with malloc() from the frist place, seems better to give it a try.

Anyway, I sentimentally think that it’s surprising that it runs with that speed even though doing these extreme things. Indeed, it has become a good time.

Floating Frame

Previously, I mentioned it just in one phrase “duplicate all frames”, but since that was unclear, let’s look at more details. The points are the next two:

Then first, let’s start with the summary of how each stack frame is saved.

_. Frame _. location _. has prev pointer?
FRAME stack yes
SCOPE stack no
local_tbl heap  
local_vars stack  
VARS heap no
BLOCK stack yes

CLASS CREF ITER are not necessary this time. Since CLASS is a general Ruby object, rb_gc_force_recycle() is not called with it even by mistake (it’s impossible) and both CREF and ITER becomes unnecessary after storing its values at the moment in FRAME. The four frames in the above table are important because these will be modified or referred to multiple times later. The rest three will not.

Then, this talk moves to how to duplicate all. I said “how”, but it does not about such as “by malloc()”. The problem is how to duplicate “all”. It is because, here I’d like you to see the above table, there are some frames without any prev pointer. In other words, we cannot follow links. In this situation, how can we duplicate all?

A fairly clever technique used to counter this. Let’s take SCOPE as an example. A function named scope_dup() is used previously in order to duplicate SCOPE, so let’s see it first.

▼ `scope_dup()` only the beginning

6187  static void
6188  scope_dup(scope)
6189      struct SCOPE *scope;
6190  {
6191      ID *tbl;
6192      VALUE *vars;
6193
6194      scope->flags |= SCOPE_DONT_RECYCLE;

(eval.c)

As you can see, SCOPE_DONT_RECYCLE is set. Then next, take a look at the definition of POP_SCOPE():

▼ `POP_SCOPE()` only the beginning

 869  #define POP_SCOPE()                                      \
 870      if (ruby_scope->flags & SCOPE_DONT_RECYCLE) {        \
 871         if (_old) scope_dup(_old);                        \
 872      }                                                    \

(eval.c)

When it pops, if SCOPE_DONT_RECYCLE flag was set to the current SCOPE (ruby_scope), it also does scope_dup() of the previous SCOPE (_old). In other words, SCOPE_DONT_RECYCLE is also set to this one. In this way, one by one, the flag is propagated at the time when it pops. (Figure 7)

figure 7: flag propagation
figure 7: flag propagation

Since VARS also does not have any prev pointer, the same technique is used to propagate the DVAR_DONT_RECYCLE flag.

Next, the second point, try to think about “why all of them are duplicated”. We can understand that the local variables of SCOPE can be referred to later if its Proc is created. However, is it necessary to copy all of them including the previous SCOPE in order to accomplish that?

Honestly speaking, I couldn’t find the answer of this question and has been worried about how can I write this section for almost three days, I’ve just got the answer. Take a look at the next program:

def get_proc
  Proc.new { nil }
end

env = get_proc { p 'ok' }
eval("yield", env)

I have not explained this feature, but by passing a Proc object as the second argument of eval, you can evaluate the string in that environment.

It means, as the readers who have read until here can probably tell, it pushes the various environments taken from the Proc (meaning BLOCK) and evaluates. In this case, it naturally also pushes BLOCK and you can turn the BLOCK into a Proc again. Then, using the Proc when doing eval … if things are done like this, you can access almost all information of ruby_block from Ruby level as you like. This is the reason why the entire stacks need to be fully duplicated.
((errata: we cannot access ruby_block as we like from Ruby level. The reason why all SCOPEs are duplicated was not understood. It seems all we can do is to investigate the mailing list archives of the time when this change was applied. (It is still not certain whether we can find out the reason in this way.)
))

Invocation of Proc

Next, we’ll look at the invocation of a created Proc. Since Proc#call can be used from Ruby to invoke, we can follow the substance of it.

The substance of Proc#call is proc_call():

▼ `proc_call()`

6570  static VALUE
6571  proc_call(proc, args)
6572      VALUE proc, args;           /* OK */
6573  {
6574      return proc_invoke(proc, args, Qtrue, Qundef);
6575  }

(eval.c)

Delegate to proc_invoke(). When I look up invoke in a dictionary, it was written such as “call on (God, etc.) for help”, but when it is in the context of programming, it is often used in the almost same meaning as “activate”.

The prototype of the proc_invoke() is,

proc_invoke(VALUE proc, VALUE args, int pcall, VALUE self)

However, according to the previous code, pcall=Qtrue and self=Qundef in this case, so these two can be removed by constant foldings.

▼ `proc_invoke` (simplified)

static VALUE
proc_invoke(proc, args, /* pcall=Qtrue */, /* self=Qundef */)
    VALUE proc, args;
    VALUE self;
{
    struct BLOCK * volatile old_block;
    struct BLOCK _block;
    struct BLOCK *data;
    volatile VALUE result = Qnil;
    int state;
    volatile int orphan;
    volatile int safe = ruby_safe_level;
    volatile VALUE old_wrapper = ruby_wrapper;
    struct RVarmap * volatile old_dvars = ruby_dyna_vars;

    /*(A)take BLOCK from proc and assign it to data */
    Data_Get_Struct(proc, struct BLOCK, data);
    /*(B)blk_orphan */
    orphan = blk_orphan(data);

    ruby_wrapper = data->wrapper;
    ruby_dyna_vars = data->dyna_vars;
    /*(C)push BLOCK from data */
    old_block = ruby_block;
    _block = *data;
    ruby_block = &_block;

    /*(D)transition to ITER_CUR */
    PUSH_ITER(ITER_CUR);
    ruby_frame->iter = ITER_CUR;

    PUSH_TAG(PROT_NONE);
    state = EXEC_TAG();
    if (state == 0) {
        proc_set_safe_level(proc);
        /*(E)invoke the block */
        result = rb_yield_0(args, self, 0, pcall);
    }
    POP_TAG();

    POP_ITER();
    if (ruby_block->tag->dst == state) {
        state &= TAG_MASK;      /* target specified jump */
    }
    ruby_block = old_block;
    ruby_wrapper = old_wrapper;
    ruby_dyna_vars = old_dvars;
    ruby_safe_level = safe;

    switch (state) {
      case 0:
        break;
      case TAG_BREAK:
        result = prot_tag->retval;
        break;
      case TAG_RETURN:
        if (orphan) {   /* orphan procedure */
            localjump_error("return from proc-closure", prot_tag->retval);
        }
        /* fall through */
      default:
        JUMP_TAG(state);
    }
    return result;
}

The crucial points are three: C, D, and E.

==(C)== At NODE_ITER a BLOCK is created from the syntax tree and pushed, but this time, a BLOCK is taken from Proc and pushed.

(D) It was ITER_PRE before becoming ITER_CUR at rb_call0(), but this time it goes directly into ITER_CUR.

(E) If the case was an ordinary iterator, its method call exists before yeild occurs then going to rb_yield_0, but this time rb_yield_() is directly called and invokes the just pushed block.

In other words, in the case of iterator, the procedures are separated into three places, NODE_ITER ~ rb_call0() ~ NODE_YIELD. But this time, they are done all at once.

Finally, I’ll talk about the meaning of blk_orphan(). As the name suggests, it is a function to determine the state of “the method which created the Proc has finished”. For example, the SCOPE used by a BLOCK has already been popped, you can determine it has finished.

Block and Proc

In the previous chapter, various things about arguments and parameters of methods are discussed, but I have not described about block parameters yet. Although it is brief, here I’ll perform the final part of that series.

def m(&block)
end

This is a “block parameter”. The way to enable this is very simple. If m is an iterator, it is certain that a BLOCK was already pushed, turn it into a Proc and assign into (in this case) the local variable block. How to turn a block into a Proc is just calling proc_new(), which was previously described. The reason why just calling is enough can be a little incomprehensible. However whichever Proc.new or m, the situation “a method is called and a BLOCK is pushed” is the same. Therefore, from C level, anytime you can turn a block into a Proc by just calling proc_new().

And if m is not an iterator, all we have to do is simply assigning nil.

Next, it is the side to pass a block.

m(&block)

This is a “block argument”. This is also simple, take a BLOCK from (a Proc object stored in) block and push it. What differs from PUSH_BLOCK() is only whether a BLOCK has already been created in advance or not.

The function to do this procedure is block_pass(). If you are curious about, check and confirm around it. However, it really does just only what was described here, it’s possible you’ll be disappointed…