    series/quantiles: source-based quantile support · 99de1bbf
    Jason Rhinelander authored
    Old behaviour:
    ==============
    
    creativity-series previously removed non-finite values, and sorted the
    rest, producing something like:
    
    t,1st,2nd,3rd,...
    2,10,11,12
    3,3
    
    (where t=0 and t=1 have no finite values, t=3 has only one finite value,
    etc.).  This discards any association with the files they come from,
    however, which means any sort of source-based confidence exclusion is
    impossible.
    
    creativity-series-quantiles then used these pre-sorted values to produce
    another .csv of quantile values.
    
    creativity-series-graphs accepted either one: given the series file, it
    calculated quantiles on the fly; given the quantiles file, it used the
    pre-calculated quantiles.
    
    New behaviour:
    ==============
    
    creativity-series now just generates something like:
    
    t,source1,source2,source3
    0,nan,nan,nan
    1,nan,nan,nan
    2,12,10,11
    3,nan,3,nan
    
    which creativity-series-quantiles and creativity-series-graphs now
    understand: they do the nan trimming and sorting themselves, then
    calculate quantiles.
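
As a rough illustration of the trim, sort, then quantile step (not the
project's actual code): the function names and the linear-interpolation
quantile rule below are assumptions made for this sketch.

```python
import math

def finite_sorted(values):
    """Drop nan/inf entries and return the remaining values in ascending order."""
    return sorted(v for v in values if math.isfinite(v))

def quantile(sorted_values, q):
    """Quantile q (0..1) of an already-sorted list, by linear interpolation.

    The interpolation rule is an assumption; returns nan for an empty list.
    """
    if not sorted_values:
        return math.nan
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

# Rows mirror the example above: one list of per-source values per t.
rows = {
    0: [math.nan, math.nan, math.nan],
    1: [math.nan, math.nan, math.nan],
    2: [12.0, 10.0, 11.0],
    3: [math.nan, 3.0, math.nan],
}
for t, row in rows.items():
    values = finite_sorted(row)
    print(t, values, quantile(values, 0.5))
```

Because the source columns are kept until this point, the same input can
feed both plain quantiles and any per-source filtering.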
    
    creativity-series-graphs also gains an entirely new ability via the
    --source-confidence flag (available only when given a creativity-series
    file): it selects a confidence region by excluding (1-x)% of source
    files, namely those with the most non-median values, then plots the
    minimum and maximum of the values that remain.  Each source file is
    scored by calculating (p-0.5)^2, where p is the inverse quantile of its
    value within a time period, and summing these across time periods: the
    source files with the largest scores are excluded (see --help for the
    gory details).
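
A minimal sketch of the scoring idea, under several assumptions that the
description above does not settle: x is treated as a fraction in [0, 1],
the number of excluded sources is round((1-x) * sources), non-finite
values contribute nothing to a score, and ties get a mid-rank. All
function names are hypothetical.

```python
import math

def inverse_quantile(value, finite_values):
    """Position of value among finite_values: 0 = minimum, 1 = maximum."""
    n = len(finite_values)
    if n <= 1:
        return 0.5  # a lone value is its own median
    below = sum(1 for v in finite_values if v < value)
    equal = sum(1 for v in finite_values if v == value)
    return (below + 0.5 * (equal - 1)) / (n - 1)  # mid-rank for ties

def source_scores(table):
    """Sum (p - 0.5)^2 over time periods for each source (column)."""
    scores = [0.0] * len(table[0])
    for row in table:
        finite = [v for v in row if math.isfinite(v)]
        for s, v in enumerate(row):
            if math.isfinite(v):  # assumption: nan adds no penalty
                scores[s] += (inverse_quantile(v, finite) - 0.5) ** 2
    return scores

def confidence_band(table, x):
    """Drop the highest-scoring sources; return (kept, [(min, max) per row])."""
    n_sources = len(table[0])
    n_drop = round((1 - x) * n_sources)  # assumed rounding rule
    scores = source_scores(table)
    by_score = sorted(range(n_sources), key=lambda s: scores[s])
    kept = sorted(by_score[:n_sources - n_drop])
    band = []
    for row in table:
        vals = [row[s] for s in kept if math.isfinite(row[s])]
        band.append((min(vals), max(vals)) if vals else (math.nan, math.nan))
    return kept, band

# Three time periods, four sources; the last source is always extreme.
table = [
    [1.0, 2.0, 3.0, 10.0],
    [2.0, 3.0, 4.0, 0.0],
    [5.0, 6.0, 7.0, 20.0],
]
kept, band = confidence_band(table, 0.75)
print(kept, band)
```

Squaring the distance from 0.5 penalises a source that is consistently
at either extreme more than one that is only mildly off-median in every
period.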