* port the mimalloc allocator
* hook mimalloc opt into common.h and reduction ops
* repurpose USE_MIMALLOC to denote only replacing the default allocator with mimalloc, plus some refactoring
* fix unintended cherry-pick diffs
* polish allocator_mimalloc
* explicitly disable mimalloc where it already had been disabled
* update mimalloc to pull in stl allocator
* switch mimalloc stl allocator to use mimalloc library version
* turn mimalloc on by default (only the stl changes are enabled; the python-interacting ones are already off and shall remain so)
* move FastAllocVector into cpu specific code
* separate out defines into arena and stl changes
* the rest of the define renames
* bfc arena allocator
* fix some typos and rename the bfc arena allocator to fit existing class naming conventions
* adjustments in response to comments
* different template instantiations are friends
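
For reviewers unfamiliar with the last item, here is a minimal sketch of why an STL-style allocator may need its template instantiations to be friends of one another. It is not the code from this change: `SketchAllocator`, `tag_`, and `make_tagged_vector` are hypothetical names, and `std::malloc`/`std::free` stand in for whichever backend (for example mimalloc) is compiled in. The rebinding constructor reads a private member of a different instantiation, which only compiles because of the friend declaration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical allocator, for illustration only.
// std::malloc/std::free are placeholders for the real backend.
template <typename T>
class SketchAllocator {
 public:
  using value_type = T;

  SketchAllocator() noexcept = default;
  explicit SketchAllocator(int tag) noexcept : tag_(tag) {}

  // Rebinding constructor: containers convert SketchAllocator<T> into
  // SketchAllocator<U> for their internal node or buffer types.
  // It reads other.tag_, a private member of a *different* instantiation.
  template <typename U>
  SketchAllocator(const SketchAllocator<U>& other) noexcept
      : tag_(other.tag_) {}

  T* allocate(std::size_t n) {
    // Guard against overflow in n * sizeof(T).
    if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
    void* p = std::malloc(n * sizeof(T));
    if (p == nullptr) throw std::bad_alloc();
    return static_cast<T*>(p);
  }

  void deallocate(T* p, std::size_t) noexcept { std::free(p); }

  int tag() const noexcept { return tag_; }

 private:
  // Every instantiation is a friend of every other one; without this line
  // the rebinding constructor above could not access other.tag_.
  template <typename U>
  friend class SketchAllocator;

  int tag_ = 0;
};

// Allocators compare equal when they carry the same (hypothetical) tag.
template <typename T, typename U>
bool operator==(const SketchAllocator<T>& a,
                const SketchAllocator<U>& b) noexcept {
  return a.tag() == b.tag();
}

template <typename T, typename U>
bool operator!=(const SketchAllocator<T>& a,
                const SketchAllocator<U>& b) noexcept {
  return !(a == b);
}

// Usage: a std::vector whose storage goes through the sketch allocator.
inline std::vector<int, SketchAllocator<int>> make_tagged_vector(int tag) {
  return std::vector<int, SketchAllocator<int>>(SketchAllocator<int>(tag));
}
```

A public accessor would also work for this toy state; the friend declaration matters more when the shared state has no public getter.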