merge() optimization #363
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master     #363     +/-  ##
=========================================
+ Coverage   86.19%   86.81%   +0.61%
=========================================
  Files          10       10
  Lines         478      508      +30
=========================================
+ Hits          412      441      +29
- Misses         66       67       +1
Continue to review full report at Codecov.
src/utilities.jl
Outdated
idx_b = Vector{IndexType}(length(b))
k = 1
@inbounds while (i <= na) && (j <= nb)
    if a[i] < b[j]
The indentation looks weird here. Please use spaces.
I think the original code had some tabs and my editor was set to auto-indent. Should be fixed now.
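For context, the hunk above is part of the sorted_unique_merge helper (its closing end shows up in a later hunk). Below is a rough sketch of the two-pointer idea behind it; the name, index types, and details are my own reconstruction, not the PR's exact code:

# Rough sketch of the two-pointer merge idea (illustration only; the names and
# details differ from the PR's actual sorted_unique_merge).
# Given two sorted, de-duplicated vectors a and b, walk both at once and record,
# for each element of a and of b, where it lands in the merged output.
function sorted_unique_merge_sketch(a::AbstractVector, b::AbstractVector)
    na, nb = length(a), length(b)
    out   = similar(a, 0)   # merged, de-duplicated values
    idx_a = Int[]           # idx_a[m] = position of a[m] in out
    idx_b = Int[]           # idx_b[m] = position of b[m] in out
    i = j = k = 1
    @inbounds while (i <= na) && (j <= nb)
        if a[i] < b[j]
            push!(out, a[i]); push!(idx_a, k); i += 1
        elseif a[i] > b[j]
            push!(out, b[j]); push!(idx_b, k); j += 1
        else                # the value appears in both inputs
            push!(out, a[i]); push!(idx_a, k); push!(idx_b, k); i += 1; j += 1
        end
        k += 1
    end
    while i <= na           # drain the tail of a
        push!(out, a[i]); push!(idx_a, k); i += 1; k += 1
    end
    while j <= nb           # drain the tail of b
        push!(out, b[j]); push!(idx_b, k); j += 1; k += 1
    end
    out, idx_a, idx_b
end

Index vectors of this kind are what the insertbyidx! helper discussed below then consumes to scatter each source's columns into the merged result.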
src/combine.jl
Outdated
function merge(ta1::TimeArray{T, N, D}, ta2::TimeArray{T, M, D}, method::Symbol=:inner;
-              colnames::Vector=[], meta::Any=Void) where {T, N, M, D}
+              colnames::Vector=[], meta::Any=Void, missingvalue=NaN) where {T, N, M, D}
We already have a padding keyword in other APIs, like lag(..., padding=true). I think something like padvalues or paddedvalue might be better?
Renamed to padvalue
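As a small illustration of how the renamed keyword ends up being used, here is a sketch of my own (not from the PR's tests; the toy data, array names, and string column names follow the Julia 0.6-era TimeSeries API this PR targets, and conventions differ in newer versions):

# Hypothetical usage sketch of the padvalue keyword (toy data, not the PR's tests).
using TimeSeries
# On Julia 0.6 Date and Day come from Base; newer versions also need `using Dates`.

t1 = collect(Date(2018, 1, 1):Day(1):Date(2018, 1, 5))
t2 = collect(Date(2018, 1, 3):Day(1):Date(2018, 1, 7))
ta1 = TimeArray(t1, rand(5), ["a"])
ta2 = TimeArray(t2, rand(5), ["b"])

# Outer join keeps every timestamp from either input; rows present in only
# one input are filled with padvalue (NaN by default) in the other's column.
merged = merge(ta1, ta2, :outer; padvalue = 0.0)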
@iblis17 Do these changes look like they're in a state you're happy to merge now?
end # sorted_unique_merge

"""
For each column in src, insert elements from src[srcidx[i], column] to dst[dstidx[i], column].
"""
function insertbyidx!(dst::AbstractArray, src::AbstractArray, dstidx::Vector, srcidx::Vector)
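The docstring above describes a plain scatter loop; a sketch of what such a helper can look like, reconstructed from that docstring rather than copied from the merged body:

# Illustrative reconstruction of a loop-based insertbyidx! (not the PR's exact code).
# For each column, copies src[srcidx[i], col] into dst[dstidx[i], col], assuming
# dstidx and srcidx have the same length and hold valid row indices.
function insertbyidx_sketch!(dst::AbstractArray, src::AbstractArray,
                             dstidx::Vector, srcidx::Vector)
    @inbounds for col in 1:size(src, 2), i in 1:length(dstidx)
        dst[dstidx[i], col] = src[srcidx[i], col]
    end
    dst
end

The zero allocations reported in the benchmark below come from exactly this shape: nothing is materialized, unlike the broadcast_getindex variant, which builds a temporary array.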
I think this function can be replaced by broadcast_setindex!(dst, broadcast_getindex(src, srcidx), dstidx). Is that faster in your use case?
using BenchmarkTools
using TimeSeries
dst = zeros(1_000_000)
src = zeros(1_000_000)
dstidx = [1:length(dst)...]
srcidx = [1:length(dst)...]
@btime broadcast_setindex!(dst, broadcast_getindex(src, srcidx), dstidx)
@btime TimeSeries.insertbyidx!(dst, src, dstidx, srcidx)
9.486 ms (8 allocations: 7.63 MiB)
2.208 ms (0 allocations: 0 bytes)
I would love for the core language and library features to work optimally, so it's worth revisiting this in the future.
Thanks for your great contributions! 👍
It's fun to help a little bit when so much great work has been done by other people before me 😃. Btw,
Actually,

julia> f = (dst, src, srcidx, dstidx) -> @inbounds(dst[dstidx, :] = @view(src[srcidx, :]))
(::#26) (generic function with 1 method)

julia> @btime f($dst, $src, $srcidx, $dstidx)
  1.700 ms (5 allocations: 192 bytes)

julia> @btime TimeSeries.insertbyidx!($dst, $src, $dstidx, $srcidx)
  1.632 ms (0 allocations: 0 bytes)

julia> @btime broadcast_setindex!($dst, broadcast_getindex($src, $srcidx), $dstidx)
  4.636 ms (4 allocations: 7.63 MiB)
Yep, aware of the globals type instability. In this case I figured it wasn't going to make much difference because it only affects the dispatch, and the functions in the benchmark operate on a fairly large dataset. You're right though, with
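The globals point is the standard BenchmarkTools caveat; a minimal illustration of my own (the array name xs is not from the thread):

# Non-constant globals are type-unstable, so without `$` the benchmark also
# times dynamic dispatch on the global binding; `$` splices the value in.
using BenchmarkTools

xs = rand(1_000)

@btime sum(xs)     # global looked up at call time
@btime sum($xs)    # value interpolated into the benchmark expression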
Summary:
Mostly focused on optimizing outer join. Benchmark example:
Previous result:
New result:
At these scales it's mostly load/store bound, so any reduction in that (e.g. smaller index types, smaller data types, fewer passes) makes the biggest difference. For my, and maybe other people's, use cases, Float32 support helps a lot. The above benchmark with Float32 values is ~224ms.
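Below is a sketch of the kind of measurement the summary describes: two large, mostly non-overlapping TimeArrays merged with :outer and Float32 values. The sizes, timestamps, and column names are my own illustration rather than the PR's actual benchmark, and no timings are reproduced here:

# Sketch of an outer-join merge benchmark with Float32 values
# (illustrative setup only; not the benchmark from the PR description).
using TimeSeries, BenchmarkTools
# On Julia 0.6 Date/Day come from Base; add `using Dates` on newer versions,
# where column names are Symbols rather than Strings.

n  = 500_000
d1 = Date(2000, 1, 1) .+ Day.(0:2:(2n - 2))   # every other day, starting Jan 1
d2 = Date(2000, 1, 2) .+ Day.(0:2:(2n - 2))   # the interleaving days
ta1 = TimeArray(d1, rand(Float32, n), ["a"])
ta2 = TimeArray(d2, rand(Float32, n), ["b"])

@btime merge($ta1, $ta2, :outer)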