Hi there,

I’m working with the max drawdown indicator, and I need a very fast implementation to run in a machine learning experiment. The implementation will be called many times, but the data set used in each call is relatively small.

I put together a solution to benchmark the implementations from this question’s answers; the only part I wrote myself is the vectorized implementation using Math.NET.

```
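using System.Linq;
using MathNet.Numerics.LinearAlgebra;

public static class VectorizedMaxDrawDown
{
    // ccr is treated as log returns: the values are summed, then exponentiated.
    public static double Run(double[] ccr)
    {
        // Running sum of the returns.
        double[] cumCCR = new double[ccr.Length];
        double summedCcr = 0;
        for (int idx = 0; idx < ccr.Length; idx++)
        {
            summedCcr += ccr[idx];
            cumCCR[idx] = summedCcr;
        }

        // Exponentiate in place to get the cumulative growth factor.
        var cumulativeCcr = Vector<double>.Build.DenseOfArray(cumCCR);
        cumulativeCcr.PointwiseExp(cumulativeCcr);
        var invCumulativeCcr = 1 / cumulativeCcr;

        // Entry (i, j) is growth[i] / growth[j] - 1, so this matrix is n x n.
        var cumulativeCcrMat = Vector<double>.OuterProduct(cumulativeCcr, invCumulativeCcr) - 1;

        // The lower triangle keeps i >= j (a later value relative to an earlier one);
        // its minimum is the max drawdown, as a non-positive number.
        return cumulativeCcrMat.LowerTriangle().Enumerate().Min();
    }
}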
```

I expected the vectorized version to be way faster than the others, but the simpler for loop version is two orders of magnitude faster!
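
For comparison, a single-pass loop along these lines computes the same quantity. This is only a minimal sketch, not the exact code from the benchmark: the class name `LoopMaxDrawDown` is made up for illustration, and it assumes `ccr` holds log returns and that the drawdown is measured from the running peak of the cumulative series, starting at the first element, to match the vectorized version.

```csharp
using System;

// Hypothetical single-pass version, for illustration only.
public static class LoopMaxDrawDown
{
    public static double Run(double[] ccr)
    {
        double sum = 0;                          // cumulative log return so far
        double peak = double.NegativeInfinity;   // highest cumulative log return seen so far
        double maxDrawDown = 0;                  // most negative drawdown found (0 if none)

        for (int idx = 0; idx < ccr.Length; idx++)
        {
            sum += ccr[idx];
            if (sum > peak)
            {
                peak = sum;
            }

            // Relative change from the running peak to the current point.
            double drawDown = Math.Exp(sum - peak) - 1;
            if (drawDown < maxDrawDown)
            {
                maxDrawDown = drawDown;
            }
        }

        return maxDrawDown;
    }
}
```

Calling `LoopMaxDrawDown.Run(new[] { 0.1, -0.2, 0.05 })` should give `Math.Exp(-0.2) - 1`, since the peak is at the first element and the lowest point is right after it.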

So the questions are:

- Why is the simple for loop version faster?
- Is my implementation correct?
- How can it be improved? (e.g. a better way to compute the cumulative sum of the vector)

Any hint will be much appreciated, thanks in advance.

JJ