接前篇Continue the previous post .net垃圾回收和CLR 4.0对垃圾回收所做的改进之一
CLR4.0所带来的变化仍然没有在这篇,请看下篇。
内存释放和压缩
创建对象引用图之后,垃圾回收器将那些没有在这个图中的对象(即不再需要的对象)释放。释放内存之后, 出现了内存碎片, 垃圾回收器扫描托管堆,找到连续的内存块,然后移动未回收的对象到更低的地址, 以得到整块的内存,同时所有的对象引用都将被调整为指向对象新的存储位置。这就象一个夯实的动作。
After building up the reference relationship graph, garbage collector reclaims the objects not in the graph(no longer needed), after releasing the objects not in the graph, there is memory scrap. Garbage collector scans the managed heap to find continous memory block, and shifts the remaining objects to lower address to get consecutive memory space, and then adjusts the references of objects according to the shifted address of objects. This is looking like a tamping on the managed heap.
下面要说到的是代的概念。代概念的引入是为了提高垃圾收集器的整体性能。We come to the concept of generations next. The importing of generation concept is to improve the performance of garbage collector.
代Generations
请想一想如果垃圾收集器每次总是扫描所有托管堆中的对象,对性能会有什么影响。会不会很慢?是的。微软因此引入了代的概念。
Please think about what will happen if garbage collector scans all the objects in the whole heap in every garbage collecting cycle. Will it be very slow? Yes, therefore Microsoft imported the concept of generations.
为什么代的概念可以提高垃圾收集器的性能?因为微软是基于对大量编程实践的科学估计,做了一些假定而这些假定符合绝大多数的编程实践:
Why generation concept can help improve performance of garbage collector? Because Microsoft did scientific valuation on mass of programming practice, and made assumptions and the assumptions conform to most of programming practice:
越新的对象,其生命周期越短。The newer an object is, the shorter its lifetime will be.
越老的对象,其生命周越长。The older an object is, the longer its lifetime will be.
新对象之间通常有强的关系并被同时访问。Newer objects tend to have strong relationships to each other and are frequently accessed around the same time.
压缩一部分堆比压缩整个堆快。Compacting a portion of the heap is faster than compacting the whole heap.
有了代的概念,垃圾回收活动就可以大部分局限于一个较小的区域来进行。这样就对垃圾回收的性能有所提高。After importing the concept of generations, most of garbage collecting will be limited in in smaller range of memory. This enhances the performance of garbage collector.
让我们来看垃圾收集器具体是怎么实现代的: Let’s see how generations are exactly implemented in garbage collector:
第0代:新建对象和从未经过垃圾回收对象的集合 Generation 0: A collection of newly created object and the objects never collected.
第1代:在第0代收集活动中未回收的对象集合 Generation 1: A collection of objects not collected by garbage collector in collecting cycle of generation 0.
第2代:在第1和第2代中未回收的对象集合, 即垃圾收集器最高只支持到第2代, 如果某个对象在第2代的回收活动中留下来,它仍呆在第2代的内存中。 Generation 2: A collection of objects not collected by garbage collector in generation 1 and generation 2. This means the highest generation that garbage collector supports is generation 2. If an object survives in generation 2 collecting cycle, it still remains in memory of generation 2.
当程序刚开始运行,垃圾收集器分配为每一代分配了一定的内存,这些内存的初始大小由.net framework的策略决定。垃圾收集器记录了这三代的内存起始地址和大小。这三代的内存是连接在一起的。第2代的内存在第1代内存之下,第1代内存在第0代内存之下。应用程序分配新的托管对象总是从第0代中分配。如果第0代中内存足够,CLR就很简单快速地移动一下指针,完成内存的分配。这是很快速的。当第0代内存不足以容纳新的对象时,就触发垃圾收集器工作,来回收第0代中不再需要的对象,当回收完毕,垃圾收集器就夯实第0代中没有回收的对象至低的地址,同时移动指针至空闲空间的开始地址(同时按照移动后的地址去更新那些相关引用),此时第0代就空了,因为那些在第0代中没有回收的对象都移到了第1代。
When the program initializes, garbage collector allocates memory for generations. The initial size of memory blocks are determined according to the strategies of the .net framework. Garbage collector records the start address and size of the memory block for generations. The memory blocks of generations are continuous and adjacent. The memory of generation 2 is under the memory of generation 1, and the memory of generation 1 is under the memory of generation 0. CLR always allocates memory for new objects in generation 0. If there is enough memory in generation 0, CLR simply moves the pointer to allocate memory. This is really fast. When there is not enough memory in generation 0 to accommodate new objects, CLR triggers garbage collector starts to collect objects no longer needed from generation 0. When the collecting action in generation 0 finishs, garbage collector tamps(or compacts) the objects not collected in generation 0 to lower address, and moves the pointer to start address of free memory(and updates the related references according to the shifted address of objects). At this time, generation 0 is empty, because the objects survived in generation 0 are moved to generation 1.
当只对第0代进行收集时,所发生的就是部分收集。这与之前所说的全部收集有所区别(因为代的引入)。对第0代收集时,同样是从根开始找那些正引用的对象,但接下来的步骤有所不同。当垃圾收集器找到一个指向第1代或者第2代地址的根,垃圾收集器就忽略此根,继续找其他根,如果找到一个指向第0代对象的根,就将此对象加入图。这样就可以只处理第0代内存中的垃圾。这样做有个先决条件,就是应用程序此前没有去写第1代和第2代的内存,没有让第1代或者第2代中某个对象指向第0代的内存。但是实际中应用程序是有可能写第1代或者第2代的内存的。针对这种情况,CLR有专门的数据结构(Card table)来标志应用程序是否曾经写第1代或者第2代的内存。如果在此次对第0代进行收集之前,应用程序写过第1代或者第2代的内存,那些被Card Table登记的对象(在第1代或者第2代)将也要在此次对第0代收集时作为根。这样,才可以正确地对第0代进行收集。
When collecting generation 0 only, it is partial collection. It is different from full collection mentioend earlier(because of the generations). When collecting generation 0, garbage collector starts from the roots, which is the same as the full collection, but it is different in coming steps. When garbage collector finds a root pointing to an address of generation 1 or 2, garbage collector ignores the root, and goes to next root. If garbage collector finds a root pointing to an object of generation 0, garbage collector addes the object into the graph. That way garbage collector processes the objects of generation 0 only. There is a pre-condition to do that. It is that the application does not write to the memory of generation 1 and 2, does not allow some objects of generation 1 or 2 refer to the memory of generation 0. But in our daily work, the applicaiton is possible to write the memory of generation 1 or 2. In this case, CLR has a dedicated data structure called Card Table to record whether the application writes the memory of generation 1 or 2. If the application writes the memory of generation 1 or 2 before the collecting on generation 0, the objects recorded by the Card Table will become roots during the collecting on generation 0. Garbage collection on generation 0 can be done correctly in this case.
以上说到了第0代收集发生的一个条件,即第0代没有足够内存去容纳新对象。执行GC.Collect()也会触发对第0代的收集。另外,垃圾收集器还为每一代都维护着一个监视阀值。第0代内存达到这个第0代的阀值时也会触发对第0代的收集。对第1代的收集发生在执行GC.Collect(1)或者第1代内存达到第1代的阀值时。第2代也有类似的触发条件。当第1代收集时,第0代也需要收集。当第2代收集时,第1和第0代也需要收集。在第n代收集之后仍然存留下来的对象将被转移到第n+1代的内存中,如果n=2, 那么存留下来的对象还将留在第2代中。
We mentioned a criteria to trigger collecting on generation 0 in above paragraphs: generation 0 does not have enough memory to accommodate new objects. When execute GC.Collect(), it launches collecting on generation 0 also. In addition, garbage collector sets up a threshold for each of generations. When the memory of generation 0 reaches the threshold, collecting on generation 0 happens also. Collecting on generation 1 happens when executing GC.Collect() or the memory of generation 1 reaches the threshold of generation 1. Generation 2 has similar trigger conditions. When collecting on generation 1, collecting on generation 0 happens also. When collecting on generation 2, collecting on generation 1 and 0 happen also. The survived object in collecting generation n will be moved to the memory of generation n+1. If n=2, the remaining objects still stay in generation 2.
对象结束Finalization of objects
对象结束机制是程序员忘记用Close或者Dispose等方法清理申请的资源时的一个保证措施。如下的一个类,当一个此类的实例创建时,在第0代中分配内存,同时此对象的引用要被加入到一个由CLR维护的结束队列中去。
Finalization is an ensuring mechanism when programmers forget to use Close or Dispose method to clean up resources. For exmaple, a class like the following, when an instane of the class is created, it is allocated in memory of generation 0, and a reference of the object is appended to Finalization quere maintained by CLR.
public class BaseObj { public BaseObj() { } protected override void Finalize() { // Perform resource cleanup code here... // Example: Close file/Close network connection Console.WriteLine("In Finalize."); }}当此对象成为垃圾时,垃圾收集器将其引用从结束队列移到待结束队列中,同时此对象会被加入引用关系图。一个独立运行的CLR线程将一个个从待结束队列(Jeffrey Richter称之为Freachable quere)取出对象,执行其Finalize方法以清理资源。因此,此对象不会马上被垃圾收集器回收。只有当此对象的Finalize方法被执行完毕后,其引用才会从待结束队列中移除。等下一轮回收时,垃圾回收器才会将其回收。
When the object becomes garbage, garbage collector moves the reference from Finalization queue to ToBeFinalized queue(Jeffrey Richter called it Freachable queue), and appends the object to the reference graph. A standalone thread of CLR will fetch objects from the ToBeFinalized queue one by one, and execute the Finalize() method of objects to clean up resources. Therefore, the object will not be collected right away by garbage collector. After the Finalize() method is executed, its reference will be removed from the ToBeFinalizaed queue. When next collecting comes, garbage collector reclaims its memory.
GC类有两个公共静态方法GC.ReRegisterForFinalize和GC.SuppressFinalize大家也许想了解一下,ReRegisterForFinalize是将指向对象的引用添加到结束队列中(即表明此对象需要结束),SuppressFinalize是将结束队列中该对象的引用移除,CLR将不再会执行其Finalize方法。
There are two public static methods of GC class you guys may want to know: GC.ReRegisterForFinalize and GC.SuppressFinalize. ReRegisterForFinalize is to append the reference of objects to finalization queue(meaning the objects need to be finalized), SuppressFinalize is to remove the reference of objects from finalization queue, then CLR would not
[1] [2] 下一页