* BLD: Try github actions again
* new requirements for p36
* fix code that differs across numpy versions
* silence the correct warnings for tests to run
* MAINT: Use loc instead of deprecated ix
* comment out windows for now
Co-authored-by: Richard Frank <rich@quantopian.com>
The following methods are supported:
- mean
- median
- stddev
- max
- min
- sum
- notnull_count
Each of these methods produces a term with ndim=1, meaning that it yields a
single scalar value per day.
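The ndim=1 semantics can be illustrated with plain NumPy (this is a sketch of the behavior, not the pipeline implementation): each daily reduction collapses a 2-D (days x assets) array into a column with one scalar per day.

```python
import numpy as np

# Illustrative only: a (2 days x 3 assets) input term.
data = np.array([[1.0, 2.0, np.nan],
                 [4.0, 5.0, 6.0]])

# Reductions along axis=1 produce one value per day (ndim=1).
daily_mean = np.nanmean(data, axis=1)
daily_notnull = np.count_nonzero(~np.isnan(data), axis=1)
```

Here `daily_mean` and `daily_notnull` each have shape `(2,)`, one entry per day.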
- Adds a new method, `Filter.if_else()`, that wraps `np.where`. This allows
users to create expressions that conditionally draw from the outputs of one
of two terms. This is implemented using the newly-standardized "universal"
term machinery, similarly to `downsample()`, `alias()`, and `Slice`.
- Adds a new method, `ComputableTerm.fillna()`, implemented in terms of
`if_else()`. `fillna` allows users to fill missing data with either a
constant value, or values from another term. It supports both 2d and 1d
terms.
- As part of the implementation of `fillna()`, adds a new universal mixin for
creating constant terms.
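The semantics of both methods can be sketched with `np.where` directly (the variable names below are stand-ins, not zipline terms):

```python
import numpy as np

# `if_else`-style selection: where the filter is True, draw from one
# term's output; otherwise draw from the other's.
deciding = np.array([True, False, True])
if_true = np.array([1.0, 2.0, 3.0])
if_false = np.array([-1.0, -2.0, -3.0])
result = np.where(deciding, if_true, if_false)

# `fillna` with a constant is expressible the same way: replace missing
# entries, keep everything else.
values = np.array([1.0, np.nan, 3.0])
filled = np.where(np.isnan(values), 0.0, values)
```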
Reworks how we choose a benchmark source for the zipline CLI and
zipline.utils.run_algo.run_algorithm. The logic for choosing a benchmark is now
consolidated in a new `BenchmarkSpec` class.
Benchmarks can be configured from several possible options:
- An in-memory series can be passed as `benchmark_returns`. This is mostly
useful for testing, or for use with `run_algorithm`.
- A path or file-like object can be passed as `benchmark_file`.
- A sid can be passed as `benchmark_sid`.
- A symbol can be passed as `benchmark_symbol`.
- `no_benchmark` can be passed to use a dummy benchmark of all zero returns.
`BenchmarkSpec` takes all the parameters listed above, and "resolves" them into
either a sid or a series of benchmark returns, which are forwarded to
`TradingAlgorithm`, which already takes `benchmark_returns` or
`benchmark_sid`. If none of the above parameters are passed, we log a warning,
emit None for both values, and assume that the algorithm will call
`set_benchmark` in its `initialize` method.
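A hypothetical sketch of that resolution logic, under one plausible precedence order (the real `BenchmarkSpec` interface may differ; `load_file` and `lookup_symbol` are assumed callables, not zipline functions):

```python
import logging

def resolve_benchmark(returns=None, filepath=None, sid=None, symbol=None,
                      no_benchmark=False, load_file=None, lookup_symbol=None):
    """Collapse benchmark options into the (benchmark_sid, benchmark_returns)
    pair forwarded to TradingAlgorithm. Hypothetical sketch only."""
    if returns is not None:
        return None, returns
    if filepath is not None:
        return None, load_file(filepath)
    if sid is not None:
        return sid, None
    if symbol is not None:
        return lookup_symbol(symbol), None
    if no_benchmark:
        # Stand-in for constructing an all-zero returns series.
        return None, [0.0]
    # Nothing configured: warn and assume the algorithm calls
    # set_benchmark() in initialize().
    logging.warning("No benchmark configured; expecting set_benchmark() "
                    "in initialize().")
    return None, None
```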
Avoid making an extra copy of non-C-contiguous arrays when factorizing inputs
to LabelArray. This requires taking care to ensure that we use the same memory
order both when ravelling and unravelling the input arrays.
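The memory-order concern can be demonstrated with plain NumPy: ravelling and reshaping round-trip without a copy only when both steps use the same order.

```python
import numpy as np

# A Fortran-ordered (column-major) input array.
arr = np.asfortranarray(np.arange(6).reshape(2, 3))

# Ravelling in matching order produces a view, not a copy.
flat = arr.ravel(order='F')
assert flat.base is not None  # a view shares memory with arr

# Unravelling in the same order recovers the original array.
roundtrip = flat.reshape(arr.shape, order='F')

# Mismatched orders scramble the elements.
wrong = arr.ravel(order='C').reshape(arr.shape, order='F')
```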
If an FX rate query requests a date greater than the last date in the FX rate
file, forward-fill from the last value in the file rather than raising an
error.
We do this for a few reasons:
1. We'd like to gracefully handle the possibility of an FX rates file that's
older than another input file.
2. Relative to other non-erroring behaviors, forward-filling is the simplest
thing to implement. Specifically, it's what the implementation prior to this
change would do naturally if there weren't an explicit check to prevent it.
3. For an FX rates file containing prices on a 24/5 calendar, some amount of
forward-filling is required to handle any market with a non-weekday date.
- When reading before the start of data, return NaN. We do this because it's
hard to reliably apply a lower bound to the queried dates in core-loader
style pipeline loaders.
- When reading an unknown base currency, return NaN. We might get data from
third parties with unknown currencies. Doing so should not be an error.
- When reading after the end of data, emit an error rather than forward-filling
forever. We may want to revisit this in the future.
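The three rules above can be sketched as follows (the names and sample data are illustrative, not zipline's actual reader):

```python
import numpy as np

dates = np.array(['2020-01-02', '2020-01-03', '2020-01-06'], dtype='M8[D]')
rates = {'EUR': np.array([1.10, 1.11, 1.12])}

def get_rate(currency, query_date):
    """Hypothetical FX lookup following the rules described above."""
    day = np.datetime64(query_date, 'D')
    if day > dates[-1]:
        # Reading after the end of data is an error.
        raise ValueError('query past the end of FX data: %s' % day)
    if currency not in rates:
        # Unknown base currency: NaN, not an error.
        return np.nan
    ix = dates.searchsorted(day, side='right') - 1
    if ix < 0:
        # Before the start of data: NaN.
        return np.nan
    # Otherwise forward-fill from the most recent known rate.
    return float(rates[currency][ix])
```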
Rather than trying to use `S3` (fixed-width bytes) dtypes everywhere, which is
awkward in Python 3 and makes it harder to represent missing data, just use
object arrays with None as
the missing value. This is the representation we want anyway for loading
currency data in pipelines, and the main downsides are performance (which
doesn't appear to be meaningfully affected) and difficulty with sorting, which
we don't need to do (at least right now).
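The representation choice looks like this (a sketch of the two options, not zipline code):

```python
import numpy as np

# Object dtype: None is a natural missing value, and entries stay
# ordinary Python strings under Python 3.
currencies = np.array(['USD', 'EUR', None, 'JPY'], dtype=object)
missing = np.array([c is None for c in currencies])

# A fixed-width bytes ('S3') array has no natural missing value; the
# best available sentinel is an empty byte string, and every comparison
# has to juggle bytes vs str.
as_bytes = np.array([b'USD', b'EUR', b'', b'JPY'], dtype='S3')
```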