Documentation for Lowess.
Lowess
This package provides a pure-Julia lowess function, an implementation of the LOWESS smoother (see the References section), along with a lowess_model function that builds a corresponding model for predicting at new abscissas.
Tutorial
In this section we will go through a simple example to see how the package works. Consider the following code snippet.
using Lowess, Plots, Random
Random.seed!(42)
xs = 10 .* rand(100)
xs = sort(xs)
ys = sin.(xs) .+ 0.5 * rand(100)
model = lowess_model(xs, ys, 0.2)
us = range(extrema(xs)...; step = 0.1)
vs = model(us)
scatter(xs, ys)
plot!(us, vs, legend = false)
The code above draws 100 random abscissas between 0 and 10, sorts them, and generates ordinates from a sine curve with some noise added. A model is fitted to this data with f = 0.2 (f is the parameter that controls the amount of smoothing).
us is a range of abscissas lying within the range of the abscissas passed as input to the model. Calling the model on us returns vs, the vector of predicted smooth values.
Finally, we draw the scatter plot of the input points and overlay the smooth curve given by us and vs.
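To get a feel for what f does, one can fit the same data with a small and a large value and compare how much the fitted curves wiggle. The snippet below is a minimal sketch that relies only on the positional call lowess(x, y, f) documented in the API reference; the roughness helper is a hypothetical name introduced here for illustration and is not part of the package.

```julia
using Lowess, Random

Random.seed!(42)
xs = sort(10 .* rand(100))
ys = sin.(xs) .+ 0.5 * rand(100)

# Fit the same data with a small and a large smoothing parameter.
fit_small_f = lowess(xs, ys, 0.1)
fit_large_f = lowess(xs, ys, 0.8)

# Hypothetical helper: total variation of a fitted curve, used here
# as a crude measure of how much the curve wiggles.
roughness(v) = sum(abs, diff(v))

println("roughness with f = 0.1: ", roughness(fit_small_f))
println("roughness with f = 0.8: ", roughness(fit_large_f))
```

A larger f is generally expected to give the smoother (less wiggly) curve, but the exact numbers depend on the data and on the default nsteps and delta.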
Comparison with Loess.jl
Our package is an alternative to Loess.jl, which exports the more general LOESS predictor. In this section, we benchmark both packages on a common dataset and compare their run time and memory use.
For our test, we use the example given in the tutorial, and for the benchmarks we use the BenchmarkTools package. The code below benchmarks our package.
julia> using BenchmarkTools, Lowess, Random;
julia> Random.seed!(42);
julia> xs = 10 .* rand(100);
julia> xs = sort(xs);
julia> ys = sin.(xs) .+ 0.5 * rand(100);
julia> @benchmark begin
           model = lowess_model(xs, ys, 0.2)
           us = range(extrema(xs)...; step = 0.1)
           vs = model(us)
       end
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  88.501 μs …  3.688 ms  ┊ GC (min … max): 0.00% … 97.20%
 Time  (median):     91.401 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   92.211 μs ± 36.061 μs  ┊ GC (mean ± σ):  0.39% ±  0.97%
 88.5 μs  Histogram: frequency by time  98 μs <
 Memory estimate: 4.66 KiB, allocs estimate: 12.
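The corresponding benchmark with Loess.jl is given below. Note that it calls loess with span = 0.5, whereas the benchmark of our package uses f = 0.2, so the two smoothers are not configured identically.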
julia> using Loess;
julia> @benchmark begin
           model = loess(xs, ys, span = 0.5)
           us = range(extrema(xs)...; step = 0.1)
           vs = predict(model, us)
       end
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  222.805 μs …   3.360 ms  ┊ GC (min … max):  0.00% … 89.35%
 Time  (median):     231.806 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   296.725 μs ± 347.596 μs  ┊ GC (mean ± σ):  19.13% ± 14.41%
 223 μs  Histogram: log(frequency) by time  2.37 ms <
 Memory estimate: 1.24 MiB, allocs estimate: 2559.
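In this test, our package has a median run time of about 91 μs against about 232 μs for Loess.jl (roughly 2.5 times faster), and it allocates 4.66 KiB in 12 allocations against 1.24 MiB in 2559 allocations.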
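The benchmarks above reference the global variables xs and ys directly. BenchmarkTools recommends interpolating such globals with $ so that the cost of accessing untyped globals is not included in the timing. The snippet below is a sketch of the same Lowess benchmark written that way; it is not the code that produced the numbers above, and the name trial is introduced here for illustration.

```julia
using BenchmarkTools, Lowess, Random
using Statistics: median

Random.seed!(42)
xs = sort(10 .* rand(100))
ys = sin.(xs) .+ 0.5 * rand(100)

# `$xs` and `$ys` splice the current values into the benchmarked
# expression, so the timing excludes global-variable lookups.
trial = @benchmark begin
    model = lowess_model($xs, $ys, 0.2)
    us = range(extrema($xs)...; step = 0.1)
    model(us)
end

# Summarise the trial by its median sample.
println(median(trial))
```

Timings obtained this way will differ from machine to machine, so they should be compared only against other runs on the same system.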
References
[1] Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association, 74(368), 829-836. DOI: 10.1080/01621459.1979.10481038
[2] Cleveland, W. S., & Devlin, S. J. (1988). Locally weighted regression: an approach to regression analysis by local fitting. Journal of the American Statistical Association, 83(403), 596-610. DOI: 10.1080/01621459.1988.10478639
[3] Cleveland, W. S., & Grosse, E. (1991). Computational methods for local regression. Statistics and Computing, 1(1), 47-62. DOI: 10.1007/BF01890836
API Reference
Lowess.lowess (method)
lowess(x, y, f = 2 / 3, nsteps = 3, delta = 0.01 * (maximum(x) - minimum(x)))
Compute the smooth of a scatterplot of y against x using robust locally weighted regression. The input vectors x and y must contain either integers or floats. The parameters f and delta must be of type T, where T <: AbstractFloat. Returns a vector ys, where ys[i] is the fitted value at x[i]. To get the smooth plot, plot ys against x.
Arguments
x::Vector: Abscissas of the points on the scatterplot. x must be ordered.
y::Vector: Ordinates of the points on the scatterplot.
f::T: The amount of smoothing.
nsteps::Integer: Number of iterations in the robust fit.
delta::T: A nonnegative parameter which may be used to save computations.
Example
using Lowess, Plots
x = sort(10 .* rand(100))
y = sin.(x) .+ 0.5 * rand(100)
ys = lowess(x, y, 0.2)
scatter(x, y)
plot!(x, ys)
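For intuition about what "locally weighted regression" means, the following is a simplified, self-contained sketch of a single local linear fit with tricube weights, in the spirit of Cleveland (1979) [1]. It is not this package's implementation: it omits the robustness iterations (nsteps) and the delta shortcut, it assumes that f is the fraction of points used in each local fit (rounded down here), and tricube and local_linear_fit are hypothetical names introduced for illustration.

```julia
# Tricube weight: positive inside the window, zero at and beyond its edge.
tricube(u) = abs(u) < 1 ? (1 - abs(u)^3)^3 : zero(u)

# Hypothetical helper: one weighted linear fit evaluated at x0, using
# roughly the fraction f of the points that are nearest to x0.
# Assumes at least one point lies strictly inside the window.
function local_linear_fit(x, y, x0, f)
    n = length(x)
    k = clamp(floor(Int, f * n), 2, n)      # number of neighbours (assumed rounding)
    d = abs.(x .- x0)                       # distances from x0
    h = max(sort(d)[k], eps(Float64))       # window half-width: k-th smallest distance
    w = tricube.(d ./ h)                    # weights decay with distance from x0

    # Weighted least squares for a straight line, in centred form.
    sw = sum(w)
    xbar = sum(w .* x) / sw
    ybar = sum(w .* y) / sw
    sxx = sum(w .* (x .- xbar) .^ 2)
    sxy = sum(w .* (x .- xbar) .* (y .- ybar))
    slope = sxx > 0 ? sxy / sxx : 0.0
    return ybar + slope * (x0 - xbar)
end

# Usage: on exactly linear data the local fit reproduces the line.
x = collect(0.0:0.1:10.0)
y = 2 .* x .+ 1
fit_at_5 = local_linear_fit(x, y, 5.0, 0.3)
println(fit_at_5)
```

Because the data in the usage example lie exactly on y = 2x + 1, the printed value should be very close to 11. A full smoother repeats such a fit at every x[i] and then reweights the points by the size of their residuals for nsteps iterations.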
Lowess.lowess_model (function)
lowess_model(xs, ys, f = 2 / 3, nsteps = 3, delta = 0.01 * (maximum(xs) - minimum(xs)))
Return a lowess model which can be used to predict the ordinate corresponding to a new abscissa. Takes the same arguments as lowess.
Example
using Lowess, Plots
xs = 10 .* rand(100)
xs = sort(xs)
ys = sin.(xs) .+ 0.5 * rand(100)
model = lowess_model(xs, ys, 0.2)
us = range(extrema(xs)...; step = 0.1)
vs = model(us)
scatter(xs, ys)
plot!(us, vs, legend = false)