The malloc Flow

Key Data Structures

1.malloc_state

First comes the arena that we deal with all the time:

struct malloc_state
{
  /* Serialize access.  */
  mutex_t mutex;

  /* Flags (formerly in max_fast).  */
  int flags;

  /* Array of fastbin list heads */
  mfastbinptr fastbinsY[NFASTBINS];

  /* Pointer to the top chunk */
  mchunkptr top;

  /* The remainder from the most recent split of a small request */
  mchunkptr last_remainder;

  /* NBINS is a macro equal to 128; this array holds the headers of every bin
     except the fastbins */
  mchunkptr bins[NBINS * 2 - 2];

  /* Bitmap over the bins array */
  unsigned int binmap[BINMAPSIZE];

  /* Pointer linking to the next malloc_state */
  struct malloc_state *next;

  /* Linked list for free arenas.  Access to this field is serialized
     by free_list_lock in arena.c.  */
  struct malloc_state *next_free;

  /* Number of threads attached to this arena.  0 if the arena is on
     the free list.  Access to this field is serialized by
     free_list_lock in arena.c.  */
  INTERNAL_SIZE_T attached_threads;

  /* Amount of memory obtained from the system for this arena */
  INTERNAL_SIZE_T system_mem;
  INTERNAL_SIZE_T max_system_mem;
};

In the source code this structure is used through the typedef mstate (a pointer to struct malloc_state).

2.malloc_chunk

struct malloc_chunk {

INTERNAL_SIZE_T prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T size; /* Size in bytes, including overhead. */

struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;

/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};

The malloc_chunk details given in the source comments:

/*
malloc_chunk details:

(The following includes lightly edited explanations by Colin Plumb.)

Chunks of memory are maintained using a `boundary tag' method as
described in e.g., Knuth or Standish. (See the paper by Paul
Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
survey of such techniques.) Sizes of free chunks are stored both
in the front of each chunk and at the end. This makes
consolidating fragmented chunks into bigger chunks very fast. The
size fields also hold bits representing whether chunks are free or
in use.

An allocated chunk looks like this:


    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk, if allocated            | |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk, in bytes                       |M|P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             User data starts here...                          .
            .                                                               .
            .             (malloc_usable_size() bytes)                      .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of chunk                                     |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Where "chunk" is the front of the chunk for the purpose of most of
the malloc code, but "mem" is the pointer that is returned to the
user. "Nextchunk" is the beginning of the next contiguous chunk.

Chunks always begin on even word boundaries, so the mem portion
(which is returned to the user) is also on an even word boundary, and
thus at least double-word aligned.

Free chunks are stored in circular doubly-linked lists, and look like this:

    chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Size of previous chunk                            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `head:' |             Size of chunk, in bytes                         |P|
      mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Forward pointer to next chunk in list             |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Back pointer to previous chunk in list            |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
            |             Unused space (may be 0 bytes long)                .
            .                                                               .
            .                                                               |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    `foot:' |             Size of chunk, in bytes                           |
            +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The P (PREV_INUSE) bit, stored in the unused low-order bit of the
chunk size (which is always a multiple of two words), is an in-use
bit for the *previous* chunk. If that bit is *clear*, then the
word before the current chunk size contains the previous chunk
size, and can be used to find the front of the previous chunk.
The very first chunk allocated always has this bit set,
preventing access to non-existent (or non-owned) memory. If
prev_inuse is set for any given chunk, then you CANNOT determine
the size of the previous chunk, and might even get a memory
addressing fault when trying to do so.

Note that the `foot' of the current chunk is actually represented
as the prev_size of the NEXT chunk. This makes it easier to
deal with alignments etc but can be very confusing when trying
to extend or adapt this code.

The two exceptions to all this are

1. The special chunk `top' doesn't bother using the
trailing size field since there is no next contiguous chunk
that would have to index off it. After initialization, `top'
is forced to always exist. If it would become less than
MINSIZE bytes long, it is replenished.

2. Chunks allocated via mmap, which have the second-lowest-order
bit M (IS_MMAPPED) set in their size fields. Because they are
allocated one-by-one, each must contain its own trailing size field.

*/
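To connect the diagram above with what a program actually sees, here is a tiny observation sketch of my own (assuming a typical 64-bit glibc where SIZE_SZ is 8): the user pointer mem sits 2*SIZE_SZ bytes past the chunk start, the word just before it is the size field with its flag bits, and malloc_usable_size() reports the room up to the next chunk's size field.

/* minimal sketch, assuming a 64-bit glibc: read the size word right before the
   user pointer and compare it with malloc_usable_size() */
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>

int main(void)
{
    size_t *mem = malloc(24);                 /* "mem" in the diagram above */
    size_t size_field = mem[-1];              /* the chunk's "size | flags" word */

    printf("size|flags = %#zx\n", size_field);             /* typically 0x21: 0x20 + PREV_INUSE */
    printf("usable     = %zu\n", malloc_usable_size(mem)); /* typically 24 = 0x20 - SIZE_SZ */
    free(mem);
    return 0;
}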

0x01 The malloc procedure

Step 1: Preparing the malloc environment

When we call malloc(size), the library function actually invoked is __libc_malloc, shown below:

void * __libc_malloc (size_t bytes)
{
  mstate ar_ptr;   /* mstate: pointer to the arena (malloc_state) */
  void *victim;    /* victim: pointer to the chunk we are going to return */

  void *(*hook) (size_t, const void *)
    = atomic_forced_read (__malloc_hook);
  if (__builtin_expect (hook != NULL, 0))
    return (*hook)(bytes, RETURN_ADDRESS (0));   /* call __malloc_hook if it is set */

  arena_get (ar_ptr, bytes);   /* a macro: lock and return an arena pointer */

  victim = _int_malloc (ar_ptr, bytes);
  /* _int_malloc does the real allocation; the first argument is the arena pointer,
     the second is the number of bytes we asked for */
  /* Retry with another arena only if we were able to find a usable arena
     before.  */
  if (!victim && ar_ptr != NULL)   /* allocation failed and the arena pointer is non-NULL */
    {
      LIBC_PROBE (memory_malloc_retry, 1, bytes);
      ar_ptr = arena_get_retry (ar_ptr, bytes);
      victim = _int_malloc (ar_ptr, bytes);   /* try the allocation again */
    }

  if (ar_ptr != NULL)
    (void) mutex_unlock (&ar_ptr->mutex);

  assert (!victim || chunk_is_mmapped (mem2chunk (victim)) ||
          ar_ptr == arena_for_chunk (mem2chunk (victim)));
  return victim;   /* return the user pointer */
}

As we can see, __libc_malloc mainly obtains a suitable arena and then hands it to _int_malloc, which performs the real allocation. _int_malloc is a very long function, so we will walk through it piece by piece.

First, the local variables it defines:

static void * _int_malloc (mstate av, size_t bytes)  /* two arguments: the arena pointer av and the request size bytes */
{
  INTERNAL_SIZE_T nb;               /* normalized request size */
  unsigned int idx;                 /* bin index associated with that size */
  mbinptr bin;                      /* the bin corresponding to that index */

  mchunkptr victim;                 /* inspected/selected chunk */
  INTERNAL_SIZE_T size;             /* its size */
  int victim_index;                 /* its bin index */

  mchunkptr remainder;              /* remainder left over after a split */
  unsigned long remainder_size;     /* its size */

  unsigned int block;               /* bit map traverser */
  unsigned int bit;                 /* bit map traverser */
  unsigned int map;                 /* current word of binmap */

  mchunkptr fwd;                    /* misc temp for linking */
  mchunkptr bck;                    /* misc temp for linking */

  const char *errstr = NULL;

Aside: checked_request2size(bytes, nb)

/* convert the number of bytes requested into the actual chunk size needed */
checked_request2size (bytes, nb);
/* implemented by the following macros */
#define request2size(req)                                         \
  (((req) + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE)  ?             \
   MINSIZE :                                                      \
   ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)

/* Same, except also perform argument check */

#define checked_request2size(req, sz)                             \
  if (REQUEST_OUT_OF_RANGE (req)) {                               \
    __set_errno (ENOMEM);                                         \
    return 0;                                                     \
  }                                                               \
  (sz) = request2size (req);
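To make the arithmetic concrete, here is a small standalone sketch of my own (the constants assume a typical x86-64 build where SIZE_SZ = 8, MALLOC_ALIGNMENT = 16 and MINSIZE = 32; they are assumptions, not values taken from the article):

/* sketch of request2size under the assumptions above */
#include <stdio.h>

#define SIZE_SZ           8UL     /* assumption: 64-bit build */
#define MALLOC_ALIGN_MASK 15UL    /* MALLOC_ALIGNMENT - 1 */
#define MINSIZE           32UL

static unsigned long request2size(unsigned long req)
{
    return (req + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE)
               ? MINSIZE
               : (req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK;
}

int main(void)
{
    /* malloc(0) -> 0x20 chunk, malloc(24) -> 0x20 chunk, malloc(25) -> 0x30 chunk */
    printf("%#lx %#lx %#lx\n", request2size(0), request2size(24), request2size(25));
    return 0;
}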

Continuing with _int_malloc:

/* if the av passed in is NULL, fall back to sysmalloc, which obtains a chunk via mmap */
if (__glibc_unlikely (av == NULL))
{
void *p = sysmalloc (nb, av);
if (p != NULL)
alloc_perturb (p, bytes);
return p;
}

Step 2: The request is in the fastbin range

Next the code checks whether the request falls within the fastbin range:

/*
   If the size qualifies as a fastbin, first check the corresponding bin.
   This code is safe to execute even if av is not yet initialized, so we
   can try it without checking, which saves time on this fast path.
 */

if ((unsigned long) (nb) <= (unsigned long) (get_max_fast ()))
  {
    idx = fastbin_index (nb);               /* map the chunk size to a fastbin index */
    mfastbinptr *fb = &fastbin (av, idx);   /* a macro yielding a pointer into fastbinsY */
    mchunkptr pp = *fb;
    do
      {
        victim = pp;
        /* if no suitable victim is found, leave the loop */
        if (victim == NULL)
          break;
      }
    while ((pp = catomic_compare_and_exchange_val_acq (fb, victim->fd, victim))
           != victim);   /* atomic compare-and-swap: if *fb is still victim, replace it
                            with victim->fd, i.e. pop the chunk from the head of the list */

    /* now check whether the victim we took passes the sanity checks */
    if (victim != 0)
      {
        if (__builtin_expect (fastbin_index (chunksize (victim)) != idx, 0))   /* verify the size field */
          {
            errstr = "malloc(): memory corruption (fast)";
          errout:
            malloc_printerr (check_action, errstr, chunk2mem (victim), av);
            return NULL;   /* abort if the size does not match this fastbin's index */
          }
        check_remalloced_chunk (av, victim, nb);   /* debug check of the re-allocated chunk */
        void *p = chunk2mem (victim);   /* macro converting the chunk pointer to the user pointer */
        alloc_perturb (p, bytes);       /* wipe the returned memory if perturb_byte is set */
        return p;
      }
  }

In short: the request size is mapped to an index into the fastbinsY array, a chunk is popped from the head of that list, and (after an optional wipe) it is handed back to the user.
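A quick way to observe this head-first (LIFO) behaviour is the sketch below, which is my own and not from the article; the exact behaviour depends on the glibc version (2.23 here, with no tcache), so treat it as an experiment rather than a guarantee.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *a = malloc(0x20);       /* both requests fall in the fastbin range */
    void *b = malloc(0x20);

    free(a);
    free(b);                      /* 0x30 fastbin, head to tail: b -> a */

    void *c = malloc(0x20);       /* taken from the head of the list */
    printf("a=%p b=%p c=%p\n", a, b, c);
    /* on glibc 2.23 (no tcache), c == b, i.e. the most recently freed chunk */
    return 0;
}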

Step 3: The request is in the smallbin range

Continuing with _int_malloc: reaching this point means the request exceeds the fastbin range, or the fastbin allocation failed, so the smallbin path is tried next.

/*
   If the requested size is in the smallbin range, check the regular bins.
   Since these "smallbins" hold one size each, no searching within bins is
   necessary.  (For a large request, we must wait until unsorted chunks are
   processed to find the best fit.  But for small ones, fits are exact
   anyway, so we can check now, which is faster.)
 */

if (in_smallbin_range (nb))
  {
    idx = smallbin_index (nb);
    bin = bin_at (av, idx);   /* get the smallbin header for this index */

    if ((victim = last (bin)) != bin)
      /* last(bin) is the macro bin->bk; if the bin is non-empty, bin->bk points at
         the last chunk of the list, so this tests whether the smallbin holds any
         chunk at all */
      {
        if (victim == 0) /* initialization check: 0 means the smallbins were never set up */
          malloc_consolidate (av);   /* initialize av (and fold the fastbin chunks together) */
        else   /* already initialized, so proceed with the ordinary path */
          {
            bck = victim->bk;
            /* victim is the tail of the smallbin; bck is the chunk just before it,
               which becomes the new tail once victim is taken out */
            if (__glibc_unlikely (bck->fd != victim))   /* integrity check */
              {
                errstr = "malloc(): smallbin double linked list corrupted";
                goto errout;
              }
            set_inuse_bit_at_offset (victim, nb);   /* set the inuse bit in the next adjacent chunk */
            bin->bk = bck;
            bck->fd = bin;   /* unlink from the tail: the last chunk leaves the list */

            if (av != &main_arena)
              victim->size |= NON_MAIN_ARENA;
            check_malloced_chunk (av, victim, nb);
            void *p = chunk2mem (victim);
            alloc_perturb (p, bytes);   /* wipe bytes of the returned memory if perturb is enabled */
            return p;
          }
      }
  }

Allocation from a smallbin is just as simple: chunks are taken straight from the tail of the list. Before taking one, the code checks whether the arena has been initialized; if it has not, malloc_consolidate is called to initialize it.
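For reference, here is a rough sketch of my own of how a chunk size maps to a smallbin index (assuming a 64-bit build with SMALLBIN_WIDTH = 16, NSMALLBINS = 64 and no SMALLBIN_CORRECTION; these constants are assumptions about a typical build, not quoted from the article):

#include <stdio.h>

#define NSMALLBINS      64
#define SMALLBIN_WIDTH  16
#define MIN_LARGE_SIZE  ((unsigned long) (NSMALLBINS * SMALLBIN_WIDTH))  /* 0x400 */

static int in_smallbin_range(unsigned long sz) { return sz < MIN_LARGE_SIZE; }
static unsigned smallbin_index(unsigned long sz) { return (unsigned) (sz >> 4); }

int main(void)
{
    /* 0x20 -> bin 2, 0x100 -> bin 16, 0x3f0 -> bin 63; 0x400 is already large */
    printf("%u %u %u %d\n",
           smallbin_index(0x20), smallbin_index(0x100), smallbin_index(0x3f0),
           in_smallbin_range(0x400));
    return 0;
}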

Aside: malloc_consolidate

This is a standalone walkthrough of the malloc_consolidate function, separate from the _int_malloc discussion above. The complete code:

/*
  ------------------------- malloc_consolidate -------------------------

  malloc_consolidate is a specialized free() that tears down the chunks held
  in the fastbins.  free() itself cannot be used for this purpose because,
  among other things, it might put chunks back into fastbins.  So we need a
  slightly different routine that performs the same consolidation work.

  Also, because this code is called the first time malloc runs, it is the
  perfect place to trigger our initialization code.
*/
static void malloc_consolidate(mstate av)
{
  mfastbinptr*    fb;                 /* current fastbin being consolidated */
  mfastbinptr*    maxfb;              /* last fastbin (for loop control) */
  mchunkptr       p;                  /* current chunk being consolidated */
  mchunkptr       nextp;              /* next chunk to consolidate */
  mchunkptr       unsorted_bin;       /* bin header */
  mchunkptr       first_unsorted;     /* chunk to link to */

  /* These have same use as in free() */
  mchunkptr       nextchunk;
  INTERNAL_SIZE_T size;
  INTERNAL_SIZE_T nextsize;
  INTERNAL_SIZE_T prevsize;
  int             nextinuse;
  mchunkptr       bck;
  mchunkptr       fwd;

  /*
    If max_fast is 0, the arena has not been initialized yet, and we fall
    through to the initialization branch at the bottom.
  */
  if (get_max_fast () != 0) {
    clear_fastchunks(av);                 /* clear the arena's "has fast chunks" flag */
    unsorted_bin = unsorted_chunks(av);   /* header of the unsorted bin */

    /*
      Remove each chunk from its fast bin, consolidate it, and then place it
      in the unsorted bin.  Among other reasons for doing this, placing it in
      the unsorted bin avoids needing to calculate actual bins until malloc
      is sure that the chunks are not going to be reused right away anyway.
    */
    maxfb = &fastbin (av, NFASTBINS - 1);   /* fastbin of the largest size */
    fb = &fastbin (av, 0);                  /* fastbin of the smallest size */
    do {
      p = atomic_exchange_acq (fb, 0);   /* atomically take the whole list and empty this bin */
      if (p != 0) {   /* a non-NULL pointer means this fastbin list holds chunks */
        do {
          check_inuse_chunk(av, p);
          /* debug check that reads the next chunk's inuse (and mmap) bits; for fastbin
             chunks the next chunk's PREV_INUSE bit stays set even while they are free,
             so this check is expected to pass */
          nextp = p->fd;

          /* Slightly streamlined version of consolidation code in free() */
          size = p->size & ~(PREV_INUSE|NON_MAIN_ARENA);   /* strip the flag bits to get the size */
          nextchunk = chunk_at_offset(p, size);            /* a macro: p + size */
          nextsize = chunksize(nextchunk);                 /* size of the next chunk */

          if (!prev_inuse(p)) {   /* the previous (lower-address) chunk is free: merge backwards */
            prevsize = p->prev_size;
            size += prevsize;
            p = chunk_at_offset(p, -((long) prevsize));    /* p - prevsize: the previous chunk */
            unlink(av, p, bck, fwd);                       /* the famous unlink, a macro in glibc 2.23 */
          }

          if (nextchunk != av->top) {   /* the next chunk is not the top chunk */
            nextinuse = inuse_bit_at_offset(nextchunk, nextsize);   /* inuse bit of nextchunk, read from the chunk after it */

            if (!nextinuse) {   /* nextchunk is free, so merge forwards */
              size += nextsize;
              unlink(av, nextchunk, bck, fwd);   /* unlink nextchunk from its bin */
            } else   /* nextchunk is in use, so just clear its PREV_INUSE bit */
              clear_inuse_bit_at_offset(nextchunk, 0);

            first_unsorted = unsorted_bin->fd;
            unsorted_bin->fd = p;
            first_unsorted->bk = p;   /* insert the merged chunk at the head of the unsorted bin */

            if (!in_smallbin_range (size)) {   /* largebin-sized: clear the nextsize pointers */
              p->fd_nextsize = NULL;
              p->bk_nextsize = NULL;
            }

            set_head(p, size | PREV_INUSE);   /* write the chunk head ... */
            p->bk = unsorted_bin;
            p->fd = first_unsorted;
            set_foot(p, size);                /* ... and the foot (the next chunk's prev_size) */
          }
          else {   /* the next chunk is the top chunk */
            size += nextsize;                 /* merge into the top chunk */
            set_head(p, size | PREV_INUSE);
            av->top = p;
          }

        } while ( (p = nextp) != 0);   /* inner loop: walk this fastbin's singly linked list,
                                          visiting every free chunk of the same size */

      }
    } while (fb++ != maxfb);   /* outer loop: walk the fastbinsY array, one fixed-size fastbin at a time */
  }
  else {   /* not initialized yet: perform a simple initialization */
    malloc_init_state(av);
    check_malloc_state(av);
  }
}

For each chunk, the code first tries to merge backwards. The merge simply grows size, moves p back to the previous chunk, and then calls the unlink() macro to take that neighbouring chunk off whatever bin it is linked into:

if (!prev_inuse(p)) {
prevsize = p->prev_size;
size += prevsize;
p = chunk_at_offset(p, -((long) prevsize));
unlink(av, p, bck, fwd);
}

Then it tries to merge forwards. The forward merge is a bit more involved and is handled like this:

  1. If the chunk is adjacent to top_chunk, it is merged straight into top_chunk and the unsorted_bin is not involved.
  2. If it is not adjacent to top_chunk, it is merged forwards where possible and then inserted into the unsorted_bin.

The code:

if (nextchunk != av->top) {
nextinuse = inuse_bit_at_offset(nextchunk, nextsize);

if (!nextinuse) {
size += nextsize;
unlink(av, nextchunk, bck, fwd);
} else
clear_inuse_bit_at_offset(nextchunk, 0);

first_unsorted = unsorted_bin->fd;
unsorted_bin->fd = p;
first_unsorted->bk = p;

if (!in_smallbin_range (size)) {
p->fd_nextsize = NULL;
p->bk_nextsize = NULL;
}

set_head(p, size | PREV_INUSE);
p->bk = unsorted_bin;
p->fd = first_unsorted;
set_foot(p, size);
}

else {
size += nextsize;
set_head(p, size | PREV_INUSE);
av->top = p;
}

Summary:

  1. If get_max_fast() returns 0, initialize the heap, then go to step 7
  2. Take a free chunk from a fastbin
  3. Try to merge it backwards
  4. If it is adjacent to top_chunk, merge it into top_chunk and go to step 6
  5. Otherwise try to merge it forwards, then insert it into the unsorted_bin
  6. Take the next free chunk and go back to step 2; once all fastbins are empty, go to step 7
  7. Return from the function
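To see the consolidation being triggered, here is a small sketch of my own; the trigger condition described in the comments is specific to glibc 2.23-era allocators without tcache, so treat it as an assumption about that version rather than a general rule.

#include <stdlib.h>

int main(void)
{
    /* three adjacent fastbin-sized chunks (0x30 each including overhead) */
    void *a = malloc(0x20), *b = malloc(0x20), *c = malloc(0x20);
    void *guard = malloc(0x20);   /* keeps the freed chunks away from the top chunk */

    free(a); free(b); free(c);    /* three entries in the 0x30 fastbin */

    /* a request outside the smallbin range finds have_fastchunks(av) true and calls
       malloc_consolidate(): a, b and c are unlinked from the fastbin, merged into one
       0x90 chunk and dropped into the unsorted bin; the same malloc call then sorts
       that chunk into its smallbin while it keeps looking for a fit */
    void *big = malloc(0x500);

    (void) guard; (void) big;
    return 0;
}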

Aside: the unlink macro

The unlink macro looks like this:

/* take a chunk P out of its doubly linked bin list */
#define unlink(AV, P, BK, FD) {                                              \
    FD = P->fd;                                                              \
    BK = P->bk;                                                              \
    if (__builtin_expect (FD->bk != P || BK->fd != P, 0))  /* check 1: P->fd->bk == P && P->bk->fd == P */ \
      malloc_printerr (check_action, "corrupted double-linked list", P, AV); \
    else {                                                                   \
        FD->bk = BK;                                                         \
        BK->fd = FD;                                        /* the actual unlinking */ \
        if (!in_smallbin_range (P->size)                    /* not smallbin-sized, so a largebin chunk */ \
            && __builtin_expect (P->fd_nextsize != NULL, 0)) {               \
            if (__builtin_expect (P->fd_nextsize->bk_nextsize != P, 0)       \
                || __builtin_expect (P->bk_nextsize->fd_nextsize != P, 0))   \
              /* check 2: P->fd_nextsize->bk_nextsize == P && P->bk_nextsize->fd_nextsize == P */ \
              malloc_printerr (check_action,                                 \
                               "corrupted double-linked list (not small)",   \
                               P, AV);                                       \
            if (FD->fd_nextsize == NULL) {                  /* FD is not yet on the nextsize skip list */ \
                if (P->fd_nextsize == P)                    /* P was the only size in this largebin */ \
                  FD->fd_nextsize = FD->bk_nextsize = FD;                    \
                else {                                                       \
                    FD->fd_nextsize = P->fd_nextsize;                        \
                    FD->bk_nextsize = P->bk_nextsize;                        \
                    P->fd_nextsize->bk_nextsize = FD;                        \
                    P->bk_nextsize->fd_nextsize = FD;       /* FD replaces P in the skip list */ \
                  }                                                          \
              } else {                                      /* FD already represents this size, just drop P from the skip list */ \
                P->fd_nextsize->bk_nextsize = P->bk_nextsize;                \
                P->bk_nextsize->fd_nextsize = P->fd_nextsize;                \
              }                                                              \
          }                                                                  \
      }                                                                      \
}
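The core of the macro is the first integrity check plus the fd/bk splice. Below is a simplified standalone re-implementation of my own (illustration only, not the glibc macro; it models just the plain doubly linked list, not the nextsize skip list):

#include <stdio.h>
#include <stdlib.h>

struct chunk { struct chunk *fd, *bk; };

/* simplified version of check 1 followed by the fd/bk splice */
static void unlink_chunk(struct chunk *p)
{
    struct chunk *fd = p->fd, *bk = p->bk;

    if (fd->bk != p || bk->fd != p) {        /* both neighbours must point back at p */
        fprintf(stderr, "corrupted double-linked list\n");
        abort();
    }
    fd->bk = bk;                             /* splice p out of the list */
    bk->fd = fd;
}

int main(void)
{
    struct chunk bin, a, b;                  /* circular list: bin <-> a <-> b <-> bin */
    bin.fd = &a; a.fd = &b; b.fd = &bin;
    bin.bk = &b; b.bk = &a; a.bk = &bin;

    unlink_chunk(&a);                        /* passes the check, a is removed */
    printf("%d\n", bin.fd == &b && b.bk == &bin);   /* prints 1 */
    return 0;
}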

Step 4: Cleaning up heap fragments

We have covered the smallbin case above, but what if the requested size falls in neither the fastbin range nor the smallbin range?

/*
   If this is a large request, consolidate the fastbins before continuing,
   i.e. call malloc_consolidate.  While it might look excessive to get rid of
   all fastbin chunks even when plenty of free blocks remain, it avoids the
   fragmentation problems that fastbin chunks would otherwise cause.
 */
else
  {
    idx = largebin_index (nb);    /* compute the largebin index */
    if (have_fastchunks (av))
      malloc_consolidate (av);    /* consolidate the fastbin chunks and empty the fastbins */
  }

Once malloc_consolidate has run, we move on to the next stage: the big outer loop.

Step 5: The big loop, part 1: draining the unsorted bin

    /*
       Process recently freed or remaindered chunks, taking one only if it is
       an exact fit, or, if this is a small request, the chunk is the remainder
       from the most recent non-exact fit.  Place the other traversed chunks in
       their bins.  Note that this step is the only place in any routine where
       chunks are placed in bins.

       The outer loop here is needed because we might not realize until near
       the end of malloc that we should have consolidated, so must do so and
       retry.  This happens at most once, and only when we would otherwise need
       to expand memory to service a "small" request.
     */

    for (;; )
      {
        int iters = 0;
        while ((victim = unsorted_chunks (av)->bk) != unsorted_chunks (av))
          /* loop while the unsorted bin is non-empty; victim points at the tail of the list */
          {
            bck = victim->bk;   /* bck is the second-to-last chunk */
            if (__builtin_expect (victim->size <= 2 * SIZE_SZ, 0)   /* check: size must be neither too small */
                || __builtin_expect (victim->size > av->system_mem, 0))   /* nor larger than the arena's memory */
              malloc_printerr (check_action, "malloc(): memory corruption",
                               chunk2mem (victim), av);
            size = chunksize (victim);

            /*
               If this is a small request, and the most recent remainder is the
               only chunk in the unsorted bin, try to use it for the allocation.
               This helps promote locality for runs of consecutive small
               requests.  It is the only exception to best-fit, and applies
               only when there is no exact fit for a small chunk.
             */

            if (in_smallbin_range (nb) &&
                bck == unsorted_chunks (av) &&
                victim == av->last_remainder &&
                (unsigned long) (size) > (unsigned long) (nb + MINSIZE))
              /* the request is smallbin-sized, the unsorted bin holds exactly one chunk,
                 that chunk is last_remainder, and it is large enough to be split */
              {
                /* split and reattach remainder: carve the request out of the chunk
                   and leave the rest in the unsorted bin */
                remainder_size = size - nb;
                remainder = chunk_at_offset (victim, nb);
                unsorted_chunks (av)->bk = unsorted_chunks (av)->fd = remainder;
                av->last_remainder = remainder;
                remainder->bk = remainder->fd = unsorted_chunks (av);
                if (!in_smallbin_range (remainder_size))   /* the remainder is largebin-sized */
                  {
                    remainder->fd_nextsize = NULL;
                    remainder->bk_nextsize = NULL;
                  }

                set_head (victim, nb | PREV_INUSE |
                          (av != &main_arena ? NON_MAIN_ARENA : 0));
                set_head (remainder, remainder_size | PREV_INUSE);
                set_foot (remainder, remainder_size);

                check_malloced_chunk (av, victim, nb);
                void *p = chunk2mem (victim);
                alloc_perturb (p, bytes);
                return p;   /* the carved-out chunk is returned immediately */
              }

            /* remove victim from the unsorted list */
            unsorted_chunks (av)->bk = bck;   /* unsortedbin->bk now points at the second-to-last chunk */
            bck->fd = unsorted_chunks (av);   /* and that chunk's fd points back at the unsorted bin */

            /* Take now instead of binning if exact fit */

            if (size == nb)   /* exact fit */
              {
                set_inuse_bit_at_offset (victim, size);
                if (av != &main_arena)
                  victim->size |= NON_MAIN_ARENA;
                check_malloced_chunk (av, victim, nb);
                void *p = chunk2mem (victim);
                alloc_perturb (p, bytes);
                return p;
              }

            /* reaching this point means there was no exact fit, so the chunk is now sorted into its proper bin */

            if (in_smallbin_range (size))   /* smallbin-sized */
              {
                victim_index = smallbin_index (size);
                bck = bin_at (av, victim_index);
                fwd = bck->fd;
              }
            else   /* largebin-sized */
              {
                victim_index = largebin_index (size);
                bck = bin_at (av, victim_index);   /* bck points at the bin header */
                fwd = bck->fd;                     /* fwd is the first chunk in the bin */

                /* maintain large bins in sorted order */
                if (fwd != bck)   /* the largebin is not empty */
                  {
                    /* Or with inuse bit to speed comparisons */
                    size |= PREV_INUSE;
                    /* if smaller than smallest, bypass loop below */
                    assert ((bck->bk->size & NON_MAIN_ARENA) == 0);
                    if ((unsigned long) (size) < (unsigned long) (bck->bk->size))   /* bck->bk is the tail chunk */
                      /* largebins keep the smallest chunk at the tail, so if victim is smaller
                         than the smallest chunk it simply goes in at the tail */
                      {
                        fwd = bck;        /* fwd now points at the bin header */
                        bck = bck->bk;    /* bck now points at the tail chunk */
                        /* bin -> head -> ... -> tail -> (insert here) -> bin
                                                  ^bck                   ^fwd
                           inserting between bck and fwd is therefore a tail insert */

                        victim->fd_nextsize = fwd->fd;
                        /* the nextsize list skips the bin header, so victim->fd_nextsize is the head chunk
                           (fd_nextsize points to the next chunk of a different, smaller size) */
                        victim->bk_nextsize = fwd->fd->bk_nextsize;
                        /* this is not written as victim->bk_nextsize = bck;
                           presumably because when several chunks of bck's size hang at the tail,
                           victim->bk_nextsize must point at the first chunk of that size run */
                        fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
                        /* the head chunk's bk_nextsize becomes victim; the fd_nextsize of victim's
                           bk_nextsize becomes victim */
                      }
                    else
                      /* victim is not smaller than the smallest chunk, so walk from the head until
                         we find a chunk no larger than victim and insert in front of it */
                      {
                        assert ((fwd->size & NON_MAIN_ARENA) == 0);
                        while ((unsigned long) size < fwd->size)
                          {
                            fwd = fwd->fd_nextsize;
                            assert ((fwd->size & NON_MAIN_ARENA) == 0);
                          }

                        if ((unsigned long) size == (unsigned long) fwd->size)
                          /* if the sizes are equal, always insert at the second position of the size run */
                          fwd = fwd->fd;
                          /* same1 -> (insert here) -> same2 -> same3
                                  ^bck (later)       ^fwd after fwd = fwd->fd */
                        else   /* size > fwd->size */
                          {
                            /* at this point the ordering is: fwd->bk > victim > fwd */
                            victim->fd_nextsize = fwd;                  /* the next smaller size after victim is fwd */
                            victim->bk_nextsize = fwd->bk_nextsize;     /* the next larger size before victim is fwd->bk_nextsize */
                            fwd->bk_nextsize = victim;                  /* fwd's next larger size is now victim */
                            victim->bk_nextsize->fd_nextsize = victim;
                            /* and the next smaller size of victim's bk_nextsize is victim itself */
                          }
                        bck = fwd->bk;   /* in either case bck becomes the chunk before fwd,
                                            ready for the insertion between bck and fwd */
                      }
                  }
                else   /* the largebin is empty */
                  victim->fd_nextsize = victim->bk_nextsize = victim;
              }

            mark_bin (av, victim_index);
            /* initial state (the suffixes _fd/_bk denote the members, the arrows the links):
                 bck_fd -> fwd
                 bck    <- fwd_bk                    */
            victim->bk = bck;   /* after this line:
                 bck_fd -> fwd
                 bck    <- fwd_bk
                 bck    <- victim_bk                 */
            victim->fd = fwd;   /* after this line:
                 victim_fd -> fwd
                 bck_fd    -> fwd
                 bck       <- fwd_bk
                 bck       <- victim_bk              */
            fwd->bk = victim;   /* after this line:
                 victim_fd -> fwd
                 bck_fd    -> fwd
                 bck       <- victim_bk <- fwd_bk    */
            bck->fd = victim;   /* after this line:
                 bck_fd -> victim_fd -> fwd
                 bck    <- victim_bk <- fwd_bk
               the net effect is that victim is inserted between bck and fwd:
               for a smallbin this inserts victim at the head;
               for a largebin this inserts victim between bck and fwd, and when sizes are
               equal it lands at the second position of that equal-size run */

#define MAX_ITERS 10000
            if (++iters >= MAX_ITERS)
              break;
          }
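Before moving on, here is a small observation sketch of my own of the last_remainder shortcut; the concrete offsets assume glibc 2.23 on x86-64 without tcache, and may differ on other versions.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *a = malloc(0x100);      /* larger than the fastbin range */
    void *guard = malloc(0x20);   /* keeps the freed chunk from merging into top */

    free(a);                      /* the 0x110 chunk ends up in the unsorted bin */

    void *b = malloc(0x30);       /* carved from the freed chunk; the remainder stays
                                     free and is recorded as av->last_remainder */
    void *c = malloc(0x30);       /* served from that remainder via the shortcut above */

    printf("b=%p c=%p\n", b, c);  /* on glibc 2.23 (x86-64): c == (char *)b + 0x40 */
    free(guard); free(b); free(c);
    return 0;
}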

Step 6: The big loop, part 2: searching the largebins

After the unsorted-bin loop, the next check:

/*
   If this is a large request, find the smallest chunk that fits.
   Use the skip list for this.
 */

if (!in_smallbin_range (nb))
  {
    bin = bin_at (av, idx);   /* get the bin header */

    /* skip scan if empty or largest chunk is too small */
    if ((victim = first (bin)) != bin &&
        (unsigned long) (victim->size) >= (unsigned long) (nb))   /* non-empty and the largest chunk can satisfy nb */
      {
        victim = victim->bk_nextsize;
        while (((unsigned long) (size = chunksize (victim)) <
                (unsigned long) (nb)))   /* walk from the smallest size upwards until a chunk >= nb is found */
          victim = victim->bk_nextsize;

        /* Avoid removing the first entry for a size so that the skip
           list does not have to be rerouted. */
        /* if victim is not the last chunk in the bin and there are at least two chunks
           of this size, take the second chunk of the equal-size run */
        if (victim != last (bin) && victim->size == victim->fd->size)
          victim = victim->fd;

        remainder_size = size - nb;     /* a non-exact fit may leave a remainder */
        unlink (av, victim, bck, fwd);  /* take victim off the bin */

        /* Exhaust */
        if (remainder_size < MINSIZE)
          {
            set_inuse_bit_at_offset (victim, size);
            if (av != &main_arena)
              victim->size |= NON_MAIN_ARENA;
          }
        /* Split */
        else   /* there is a usable remainder, which goes into the unsorted bin */
          {
            remainder = chunk_at_offset (victim, nb);
            /* We cannot assume the unsorted list is empty and therefore
               have to perform a complete insert here.  */
            bck = unsorted_chunks (av);
            fwd = bck->fd;
            if (__glibc_unlikely (fwd->bk != bck))
              {
                errstr = "malloc(): corrupted unsorted chunks";
                goto errout;
              }
            remainder->bk = bck;
            remainder->fd = fwd;
            bck->fd = remainder;
            fwd->bk = remainder;   /* insert the remainder at the head of the unsorted bin */
            if (!in_smallbin_range (remainder_size))
              {
                remainder->fd_nextsize = NULL;
                remainder->bk_nextsize = NULL;
              }
            set_head (victim, nb | PREV_INUSE |
                      (av != &main_arena ? NON_MAIN_ARENA : 0));
            set_head (remainder, remainder_size | PREV_INUSE);
            set_foot (remainder, remainder_size);
          }
        check_malloced_chunk (av, victim, nb);
        void *p = chunk2mem (victim);
        alloc_perturb (p, bytes);
        return p;
      }
  }
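To make the nextsize skip-list traversal easier to picture, here is a toy model of my own (a simplified standalone structure, not glibc code; chunk sizes carry no flag bits here, and largebin_best_fit is just my name for the search loop shown above):

#include <stdio.h>
#include <stddef.h>

struct chunk {
    size_t size;
    struct chunk *fd, *bk;                    /* fully sorted list, largest first */
    struct chunk *fd_nextsize, *bk_nextsize;  /* skip list: first chunk of each distinct size */
};

/* mirror of the search above: start from the largest chunk, jump to the smallest
   size via bk_nextsize, walk towards larger sizes until one of size >= nb is
   found; returns NULL if the bin is empty or too small */
static struct chunk *largebin_best_fit(struct chunk *bin, size_t nb)
{
    struct chunk *victim = bin->fd;                /* first (largest) chunk */
    if (victim == bin || victim->size < nb)
        return NULL;

    victim = victim->bk_nextsize;                  /* smallest size in the bin */
    while (victim->size < nb)
        victim = victim->bk_nextsize;              /* next larger size */

    /* prefer the second chunk of an equal-size run, so the skip list
       does not have to be rerouted when the chunk is removed */
    if (victim != bin->bk && victim->size == victim->fd->size)
        victim = victim->fd;
    return victim;
}

int main(void)
{
    struct chunk bin = {0};
    struct chunk big = {0x500}, a = {0x400}, b = {0x400};

    /* sorted list, largest first: bin <-> big <-> a <-> b <-> bin */
    bin.fd = &big; big.fd = &a;  a.fd = &b;   b.fd = &bin;
    bin.bk = &b;   b.bk  = &a;   a.bk = &big; big.bk = &bin;

    /* nextsize skip list over the two distinct sizes (0x500 and 0x400) */
    big.fd_nextsize = &a;  a.fd_nextsize = &big;
    big.bk_nextsize = &a;  a.bk_nextsize = &big;

    struct chunk *v = largebin_best_fit(&bin, 0x400);
    printf("%d\n", v == &b);   /* prints 1: the second chunk of the 0x400 run is chosen */
    return 0;
}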

Step 7: The big loop, part 3: allocation via the binmap

Then we continue further down:

/*
   Search for a chunk by scanning bins, starting with the next largest bin.
   This search is strictly best-fit; i.e., the smallest chunk that fits
   (with ties going to approximately the least recently used one) is selected.

   The bitmap avoids needing to check whether most of the bins are non-empty.
   The special case of skipping all bins during the warm-up phase, when no
   chunks have been returned yet, is faster than it looks.
 */

++idx;
bin = bin_at (av, idx);
block = idx2block (idx); //a macro: idx >> BINMAPSHIFT, i.e. shift right by 5 bits
map = av->binmap[block];
bit = idx2bit (idx);

for (;; )
{
/* Skip rest of block if there are no more set bits in this block. */
if (bit > map || bit == 0)
{
do
{
if (++block >= BINMAPSIZE) /* out of bins */
goto use_top;
}
while ((map = av->binmap[block]) == 0);

bin = bin_at (av, (block << BINMAPSHIFT));
bit = 1;
}

/* Advance to bin with set bit. There must be one. */
while ((bit & map) == 0)
{
bin = next_bin (bin);
bit <<= 1;
assert (bit != 0);
}

/* Inspect the bin. It is likely to be non-empty */
victim = last (bin);

/* If a false alarm (empty bin), clear the bit. */
if (victim == bin)
{
av->binmap[block] = map &= ~bit; /* Write through */
bin = next_bin (bin);
bit <<= 1;
}

else
{
size = chunksize (victim);

/* We know the first chunk in this bin is big enough to use. */
assert ((unsigned long) (size) >= (unsigned long) (nb));

remainder_size = size - nb;

/* unlink */
unlink (av, victim, bck, fwd);

/* Exhaust */
if (remainder_size < MINSIZE)
{
set_inuse_bit_at_offset (victim, size);
if (av != &main_arena)
victim->size |= NON_MAIN_ARENA;
}

/* split the chunk; the remainder goes to the unsorted bin */
else
{
remainder = chunk_at_offset (victim, nb);

/* We cannot assume the unsorted list is empty and therefore
have to perform a complete insert here. */
bck = unsorted_chunks (av);
fwd = bck->fd;
if (__glibc_unlikely (fwd->bk != bck))
{
errstr = "malloc(): corrupted unsorted chunks 2";
goto errout;
}
remainder->bk = bck;
remainder->fd = fwd;
bck->fd = remainder;
fwd->bk = remainder;

/* advertise as last remainder */
if (in_smallbin_range (nb))
av->last_remainder = remainder;
if (!in_smallbin_range (remainder_size))
{
remainder->fd_nextsize = NULL;
remainder->bk_nextsize = NULL;
}
set_head (victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0));
set_head (remainder, remainder_size | PREV_INUSE);
set_foot (remainder, remainder_size);
}
check_malloced_chunk (av, victim, nb);
void *p = chunk2mem (victim);
alloc_perturb (p, bytes);
return p;
}
}
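For reference, a quick sketch of my own of the binmap arithmetic used above (assuming BINMAPSHIFT = 5, i.e. 32 bins per bitmap word, which matches the 2.23 sources as far as I know):

#include <stdio.h>

#define BINMAPSHIFT  5
#define idx2block(i) ((i) >> BINMAPSHIFT)                        /* which binmap word */
#define idx2bit(i)   (1U << ((i) & ((1U << BINMAPSHIFT) - 1)))   /* which bit in that word */

int main(void)
{
    unsigned idx = 70;                                            /* an arbitrary bin index */
    printf("block=%u bit=%#x\n", idx2block(idx), idx2bit(idx));   /* block=2 bit=0x40 */
    return 0;
}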

Step 8: Using the top chunk

use_top:
/*
If large enough, split off the chunk bordering the end of memory
(held in av->top). Note that this is in accord with the best-fit
search rule. In effect, av->top is treated as larger (and thus
less well fitting) than any other available chunk since it can
be extended to be as large as necessary (up to system
limitations).
We require that av->top always exists (i.e., has size >=
MINSIZE) after initialization, so if it would otherwise be
exhausted by current request, it is replenished. (The main
reason for ensuring it exists is that we may need MINSIZE space
to put in fenceposts in sysmalloc.)
*/

victim = av->top;
size = chunksize (victim);

if ((unsigned long) (size) >= (unsigned long) (nb + MINSIZE)) //if the top chunk is big enough, simply split it
{
remainder_size = size - nb;
remainder = chunk_at_offset (victim, nb);
av->top = remainder;
set_head (victim, nb | PREV_INUSE |
(av != &main_arena ? NON_MAIN_ARENA : 0));
set_head (remainder, remainder_size | PREV_INUSE);

check_malloced_chunk (av, victim, nb);
void *p = chunk2mem (victim);
alloc_perturb (p, bytes);
return p;
}

/* When we are using atomic ops to free fast chunks we can get
here for all block sizes. */
else if (have_fastchunks (av))
//if the top chunk is too small but fast chunks exist, run malloc_consolidate to merge them,
//then go around the outer loop and try the allocation again
{
malloc_consolidate (av);
/* restore original bin index */
if (in_smallbin_range (nb))
idx = smallbin_index (nb);
else
idx = largebin_index (nb);
}

/*
Otherwise, relay to handle system-dependent cases
*/
else //none of the above worked, so fall back to the system-dependent path (sysmalloc)
{
void *p = sysmalloc (nb, av);
if (p != NULL)
alloc_perturb (p, bytes);
return p;
}
}
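As a final observation sketch of my own (the 0x110 offset assumes a typical 64-bit glibc; other builds may differ): on a quiet heap, requests that no bin can serve are carved consecutively off the top chunk, so back-to-back allocations come out adjacent.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    void *a = malloc(0x100);      /* both served by splitting av->top */
    void *b = malloc(0x100);

    /* 0x100 bytes of user data plus 0x10 of chunk overhead, 16-byte aligned:
       on a typical 64-bit glibc, b == (char *)a + 0x110 */
    printf("a=%p b=%p diff=%#lx\n", a, b, (unsigned long) ((char *)b - (char *)a));
    return 0;
}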

Summary flow chart

[Figure: malloc flow chart]

References:

Glibc: a brief look at the implementation of malloc_consolidate() (CSDN blog)

Malloc_Free (peiwithhao's Valhalla)