-
Notifications
You must be signed in to change notification settings - Fork 2k
wgpu: Implement Tqueries #8669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
wgpu: Implement Tqueries #8669
Conversation
2a92144
to
7cfcd2d
Compare
7cfcd2d
to
658390f
Compare
std::atomic<uint64_t> elapsed{ 0 };// only valid if available is true | ||
}; | ||
|
||
std::shared_ptr<Status> status; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does status need a std::shared_ptr
? From what I can tell, only this class owns and uses this? Thus, if a smart pointer were needed I would think to reach for std::unique_ptr
, but I don't see the need for a pointer at all. Why can't Status
just be a member variable of this class, or even just its members directory (available
and elapsed
)? In any of these cases, should the underlying status be private
, since only this class needs it?
if (!status->available.load()) { | ||
return false; | ||
} | ||
if (outElapsedTime) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, this should not ever be true right? If the status is available, the time should be. Perhaps an assert_invariant(...)
makes more sense? Second, even if this were true we would return that the results are available, but we never set anything in outElapsedTime which could have undefined behavior, such as junk in that integer.
|
||
void WebGPUDriver::endTimerQuery(Handle<HwTimerQuery> tqh) { | ||
auto* tq = handleCast<WGPUTimerQuery>(tqh); | ||
tq->endTimeElapsedQuery(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is certainly one way to do it, and maybe we should review this with some of the other backend engineers, to see if this gets to the spirit of what is needed by these queries. But, what I originally had in mind was something more like:
...
// somewhere in the driver initiation... maybe the constructor:
...
mQueue.OnSubmittedWorkDone(wgpu::CallbackMode::AllowSpontaneous, [this](auto const& status) {
if (status == wgpu::QueueWorkDoneStatus::Success) {
mCurrentTimerQuery.end();
} else {
... log something ...
}
});
...
}
void WebGPUDriver::commit(Handle<HwSwapChain> sch) {
mCommandEncoder = nullptr;
mCurrentTimerQuery.begin();
mQueue.Submit(1, &mCommandBuffer);
...
beginTimerQuery(...)
and endTimerQuery(...)
maybe could be used to set (and unset) mCurrentTimerQuery
.
You also have to account for finish, where we may need to add additional time if we submit multiple command buffers in a frame (we can discuss offline if you need clarity on that). In that case, you might want some kind of state for intermediate queues to account for in OnSubmittedWorkDone and maybe another argument to the timer query's "end" call.
#include "WebGPUHandles.h" | ||
|
||
#include <chrono> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to include (include-what-you-use) <memory>
for std::weak_ptr
and <cstdint>
for uint64_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should at least time the command buffer queue submission time (I believe close to what Metal is doing) instead of the approach as-is at this point. We need to prioritize the approach of timing all the render/compute passes, accumulating those, and reporting that, which is a bit more complicated.
BUGS=[407961733]
This PR Implements Timer Queries. It does it by taking a timestamp before it beings a frame and after it ends processing