On 18/04/2017 22:10, Chris Wilson wrote:
On Tue, Apr 18, 2017 at 05:56:15PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
> Building on top of the previous patch which exported the concept
> of engine classes and instances, we can also use this instead of
> the current awkward engine selection uAPI.
> This is primarily interesting for the VCS engine selection which
> is a) currently done via a disjoint set of flags, and b) the
> current I915_EXEC_BSD flag has different semantics depending on
> the underlying hardware, which is bad.
> The proposed idea here is to reserve 16 bits of flags, to pass in
> the engine class and instance (8 bits each), and a new flag
> named I915_EXEC_CLASS_INSTANCE to tell the kernel this new engine
> selection API is in use.
> The new uAPI also removes access to the weak VCS engine
> balancing as currently existing in the driver.
> Example usage to send a command to VCS0:
> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0);
> Or to send a command to VCS1:
> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1);
To save a bit of space, we can use the ring selector as a class selector
if bit 18 is set, with 19-27 as instance. That limits us to 64 classes -
hopefully not a problem for the near future. At least I might have sold
you on a flexible execbuf3 by then.
I was considering re-using those bits, yes. I was thinking about the
benefit of keeping it completely separate, but I suppose there is not
much value in that. So I can re-use the ring selector just as well and
have a smaller impact on the number of bits left over.
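To make the layout concrete, here is a rough sketch of how the flags
could be packed if the ring selector doubles as the class selector. It
is only an illustration of the idea above, not the patch itself: all
SKETCH_* names and the sketch_* helpers are made up for this mail, the
9-bit instance width simply follows from "19-27" as written, and the
only value taken from the existing uAPI is the 6-bit ring selector mask
(0x3f), which is where the 64 class limit comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Existing 6-bit ring selector field (same width as I915_EXEC_RING_MASK). */
#define SKETCH_EXEC_RING_MASK		0x3full
/* Assumption: bit 18 means "the ring selector holds an engine class". */
#define SKETCH_EXEC_CLASS_INSTANCE	(1ull << 18)
/* Assumption: the instance lives in bits 19-27, i.e. 9 bits. */
#define SKETCH_EXEC_INSTANCE_SHIFT	19
#define SKETCH_EXEC_INSTANCE_MAX	0x1ffu
#define SKETCH_EXEC_INSTANCE_MASK \
	((uint64_t)SKETCH_EXEC_INSTANCE_MAX << SKETCH_EXEC_INSTANCE_SHIFT)

/* Pack class and instance into execbuffer2-style flags (hypothetical helper). */
static inline uint64_t
sketch_execbuffer2_engine(unsigned int engine_class, unsigned int instance)
{
	/* The class has to fit into the 6-bit ring selector: at most 64 classes. */
	assert(engine_class <= SKETCH_EXEC_RING_MASK);
	assert(instance <= SKETCH_EXEC_INSTANCE_MAX);

	return SKETCH_EXEC_CLASS_INSTANCE |
	       engine_class |
	       ((uint64_t)instance << SKETCH_EXEC_INSTANCE_SHIFT);
}

/* Non-zero if the flags use the class/instance encoding. */
static inline int sketch_flags_uses_class_instance(uint64_t flags)
{
	return (flags & SKETCH_EXEC_CLASS_INSTANCE) != 0;
}

/* Extract the class from the ring selector bits. */
static inline unsigned int sketch_flags_class(uint64_t flags)
{
	return (unsigned int)(flags & SKETCH_EXEC_RING_MASK);
}

/* Extract the instance from bits 19-27. */
static inline unsigned int sketch_flags_instance(uint64_t flags)
{
	return (unsigned int)((flags & SKETCH_EXEC_INSTANCE_MASK) >>
			      SKETCH_EXEC_INSTANCE_SHIFT);
}
```

With that, selecting VCS1 would stay a one-liner along the lines of
eb.flags = sketch_execbuffer2_engine(video_decode_class, 1), where the
class value is whatever the previous patch exports. When bit 18 is
clear the ring selector keeps its current meaning, so existing
userspace is unaffected.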
(As a digression, some cryptic notes for an implementation I did over
* - per context
* - per engine
We have this already so I am missing something I guess.
* - PAGE_SIZE ctl [ro head, rw tail] + user pot
* - kthread [i915/$ctx-$engine] (optional?)
No idea what these two are. :)
* - assumes NO_RELOC-esque awareness
Ok ok NO_RELOC. :)
* SYNC flags [wait/signal], handle [semaphore/fence]
Sync fence in/out just as today, but probably more?
* BIND handle, offset [user provided]
* ALLOC[32,64] handle, flags, *offset [kernel provided, need RELOC]
* RELOC[32,64] handle, target_handle, offset, delta
* CLEAR flags, handle
* UNBIND handle
Explicit VMA management? Separate ioctl maybe would be better?
* BATCH flags, handle, offset
* [or SVM flags, address]
* PIN flags (MAY_RELOC), count, handle[count]
* FENCE flags, count, handle[count]
* SUBMIT handle [fence/NULL with error]
No idea again. :)
At the moment it is just trying to do execbuf2, but more compactly
with fewer ioctls. But one of the main selling points is that we can
extend the information passed around more freely than execbuf2.)
I have nothing against a better eb, since I trust you know much better
than I do whether it is needed and when. But I don't know how long it
will take to get there.
This class/instance idea could be implemented quickly to solve the sore
point of VCS/VCS2 engine selection. But yeah, it is another uABI to keep
in that case.