Optimize unnecessary check in Vec::retain
The function `vec::Vec::retain` only have two stages:
1. Nothing was deleted.
2. Some elements were deleted.
Here is an unnecessary check `if g.deleted_cnt > 0` in the loop, and it's difficult for compiler to optimize it. I split the loop into two stages manully and keep the code clean using const generics.
I write a special but common bench case for this optimization. I call retain on vec but keep all elements.
Before and after this optimization:
```
test vec::bench_retain_whole_100000 ... bench: 84,803 ns/iter (+/- 17,314)
```
```
test vec::bench_retain_whole_100000 ... bench: 42,638 ns/iter (+/- 16,910)
```
The result is expected, there are two `if`s before the optimization and one `if` after.