| // Copyright 2018 The Fuchsia Authors. All rights reserved. |
| // Use of this source code is governed by a BSD-style license that can be |
| // found in the LICENSE file. |
| |
| library fuchsia.media; |
| |
| using fuchsia.media.audio; |
| |
| /// |
| /// AudioRenderers can be in one of two states at any time: _configurable_ or _operational_. A |
| /// renderer is considered operational whenever it has packets queued to be rendered; otherwise it |
| /// is _configurable_. Once an AudioRenderer enters the operational state, calls to "configuring" |
| /// methods are disallowed and will cause the audio service to close the client's connection. |
| /// The following are considered configuring methods: `AddPayloadBuffer`, `SetPcmStreamType`, |
| /// `SetStreamType`, `SetPtsUnits`, `SetPtsContinuityThreshold`. |
| /// |
| /// If an AudioRenderer must be reconfigured, the client must ensure that no packets are still |
| /// enqueued when these "configuring" methods are called. Thus it is best practice to call |
| /// `DiscardAllPackets` on the AudioRenderer (and ideally `Pause` before `DiscardAllPackets`), prior |
| /// to reconfiguring the renderer. |
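| /// |
| /// For example, a reconfiguration might follow this sketch (method names are from this |
| /// protocol and its composed `StreamSink`; the exact sequence shown is a suggestion, not |
| /// a requirement): |
| /// |
| ///   Pause() -> (ref_time, media_time);  // halt the presentation timeline |
| ///   DiscardAllPackets() -> ();          // empty the queue; renderer is now _configurable_ |
| ///   SetPcmStreamType(new_format);       // safe to call "configuring" methods again |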
| /// |
| protocol AudioRenderer { |
| compose StreamBufferSet; |
| compose StreamSink; |
| |
| /// Binds to the gain control for this AudioRenderer. |
| BindGainControl(request<fuchsia.media.audio.GainControl> gain_control_request); |
| |
| /// Sets the units used by the presentation (media) timeline. By default, PTS units are |
| /// nanoseconds (as if this were called with numerator of 1e9 and denominator of 1). |
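| /// For example, to express PTS values as frame numbers for 48 kHz audio, a client could |
| /// call `SetPtsUnits(48000, 1)`; to express PTS in milliseconds, `SetPtsUnits(1000, 1)`. |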
| SetPtsUnits(uint32 tick_per_second_numerator, |
| uint32 tick_per_second_denominator); |
| |
| /// Sets the maximum difference (in seconds) allowed between an explicit user-provided PTS |
| /// and the expected PTS (determined using interpolation). Beyond this threshold, |
| /// a stream is no longer considered 'continuous' by the renderer. |
| /// |
| /// Defaults to an interval of half a PTS 'tick', using the currently-defined PTS units. |
| /// Most users should not need to change this value from its default. |
| /// |
| /// Example: |
| /// A user is playing back 48 kHz audio from a container, which also contains |
| /// video and needs to be synchronized with the audio. The timestamps are |
| /// provided explicitly per packet by the container, and expressed in mSec |
| /// units. This means that a single tick of the media timeline (1 mSec) |
| /// represents exactly 48 frames of audio. The application in this scenario |
| /// delivers packets of audio to the AudioRenderer, each with exactly 470 |
| /// frames of audio, and each with an explicit timestamp set to the best |
| /// possible representation of the presentation time (given this media |
| /// clock's resolution). So, starting from zero, the timestamps would be... |
| /// |
| /// [ 0, 10, 20, 29, 39, 49, 59, 69, 78, 88, ... ] |
| /// |
| /// In this example, attempting to use the presentation time to compute the |
| /// starting frame number of the audio in the packet would be wrong the |
| /// majority of the time. The first timestamp is correct (by definition), but |
| /// it will be 24 packets before the timestamps and frame numbers come back |
| /// into alignment (the 24th packet would start with the 11280th audio frame |
| /// and have a PTS of exactly 235). |
| /// |
| /// One way to fix this situation is to set the PTS continuity threshold |
| /// (henceforth, CT) for the stream to be equal to 1/2 of the time taken by |
| /// the number of frames contained within a single tick of the media clock, |
| /// rounded up. In this scenario, that would be 24.0 frames of audio, or 500 |
| /// uSec. Any packets whose expected PTS was within +/-CT of the |
| /// explicitly provided PTS would be considered a continuation of the |
| /// previously delivered audio. For this example, calling `SetPtsContinuityThreshold(0.0005)` |
| /// would work well. |
| /// |
| /// Other possible uses: |
| /// Users who are scheduling audio explicitly, relative to a clock which has |
| /// not been configured as the reference clock, can use this value to control |
| /// the maximum acceptable synchronization error before a discontinuity is |
| /// introduced. E.g., if a user is scheduling audio based on a recovered |
| /// common media clock, and has not published that clock as the reference |
| /// clock, and they set the CT to 20 mSec, then up to 20 mSec of drift error |
| /// can accumulate before the AudioRenderer deliberately inserts a |
| /// presentation discontinuity to account for the error. |
| /// |
| /// Users who need to deal with a container where their timestamps may be |
| /// even less correct than +/- 1/2 of a PTS tick may set this value to |
| /// something larger. This should be the maximum level of inaccuracy present |
| /// in the container timestamps, if known. Failing that, it could be set to |
| /// the maximum tolerable level of drift error before absolute timestamps are |
| /// explicitly obeyed. Finally, a user could set this number to a very large |
| /// value (86400.0 seconds, for example) to effectively cause *all* |
| /// timestamps to be ignored after the first, thus treating all audio as |
| /// continuous with previously delivered packets. Conversely, users who wish |
| /// to *always* explicitly schedule their audio packets exactly may specify |
| /// a CT of 0. |
| /// |
| /// Note: explicitly specifying high-frequency PTS units reduces the default |
| /// continuity threshold accordingly. Internally, this threshold is stored as an |
| /// integer number of subframes, each 1/8192 of a frame. The default threshold is |
| /// computed as follows: |
| /// RoundUp((AudioFPS/PTSTicksPerSec) * 4096) / (AudioFPS * 8192) |
| /// For this reason, specifying PTS units with a frequency greater than 8192x |
| /// the frame rate (or NOT calling SetPtsUnits, which leaves the default PTS |
| /// unit of 1 nanosec in effect) will result in a default continuity threshold of zero. |
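| /// |
| /// As a worked instance of that formula (reusing the 48 kHz / 1 mSec-tick example |
| /// above): |
| /// |
| ///   RoundUp((48000/1000) * 4096) / (48000 * 8192) |
| ///     = 196608 / 393216000 |
| ///     = 0.0005 sec, i.e. 500 uSec, or half of one 1-mSec PTS tick |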
| /// |
| SetPtsContinuityThreshold(float32 threshold_seconds); |
| |
| /// Retrieves the stream's reference clock. The returned handle will have READ, DUPLICATE |
| /// and TRANSFER rights, and will refer to a zx::clock that is MONOTONIC and CONTINUOUS. |
| /// |
| GetReferenceClock() -> (handle<clock> reference_clock); |
| |
| /// Sets the reference clock that controls this renderer's playback rate. If the input |
| /// parameter is valid, it must have READ, DUPLICATE, TRANSFER rights and refer to a clock |
| /// that is MONOTONIC and CONTINUOUS. If instead ZX_HANDLE_INVALID is passed, then this |
| /// renderer will use the so-called 'optimal' clock generated by AudioCore for this stream. |
| /// |
| /// Although |SetReferenceClock| is not considered a configuring method, calling it when |
| /// packets are queued (even if paused) will generally cause a reference timeline |
| /// discontinuity, with packets either discarded or delayed (based on the discontinuity). |
| /// Cases that PRESERVE continuity would include setting as reference either the 'optimal' |
| /// clock provided by AudioCore, or the clone of a client-controlled reference clock. |
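| /// |
| /// A minimal client-side sketch (C++ syntax using zircon's `zx` library; error handling |
| /// is omitted, and the creation flags shown are an assumption about how to satisfy the |
| /// MONOTONIC/CONTINUOUS requirement): |
| /// |
| ///   zx::clock ref_clock; |
| ///   zx::clock::create(ZX_CLOCK_OPT_MONOTONIC | ZX_CLOCK_OPT_CONTINUOUS | |
| ///                     ZX_CLOCK_OPT_AUTO_START, nullptr, &ref_clock); |
| ///   // Keep ref_clock locally; send a duplicate restricted to the required rights. |
| ///   zx::clock dup; |
| ///   ref_clock.duplicate(ZX_RIGHT_READ | ZX_RIGHT_DUPLICATE | ZX_RIGHT_TRANSFER, &dup); |
| ///   audio_renderer->SetReferenceClock(std::move(dup)); |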
| /// |
| SetReferenceClock(handle<clock>? reference_clock); |
| |
| /// Sets the usage of the render stream. This method may not be called after |
| /// `SetPcmStreamType` is called. The default usage is `MEDIA`. |
| SetUsage(AudioRenderUsage usage); |
| |
| /// Sets the type of the stream to be delivered by the client. Using this method implies |
| /// that the stream encoding is `AUDIO_ENCODING_LPCM`. |
| /// |
| /// This must be called before `Play` or `PlayNoReply`. After a call to `SetPcmStreamType`, |
| /// the client must then send an `AddPayloadBuffer` request, then the various `StreamSink` |
| /// methods such as `SendPacket`/`SendPacketNoReply`. |
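| /// |
| /// A minimal setup sketch (the format values, buffer id, and packet shown are |
| /// illustrative assumptions): |
| /// |
| ///   SetPcmStreamType({ SIGNED_16, 2, 48000 });  // sample_format, channels, frames_per_second |
| ///   AddPayloadBuffer(0, payload_vmo);           // from the composed StreamBufferSet |
| ///   SendPacketNoReply(packet);                  // from the composed StreamSink |
| ///   PlayNoReply(NO_TIMESTAMP, NO_TIMESTAMP); |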
| SetPcmStreamType(AudioStreamType type); |
| |
| /// Enable or disable notifications about changes to the minimum clock lead |
| /// time (in nanoseconds) for this AudioRenderer. Calling this method with |
| /// 'enabled' set to true will trigger an immediate `OnMinLeadTimeChanged` |
| /// event with the current minimum lead time for the AudioRenderer. If the |
| /// value changes, an `OnMinLeadTimeChanged` event will be raised with the |
| /// new value. This behavior will continue until the user calls |
| /// `EnableMinLeadTimeEvents(false)`. |
| /// |
| /// The minimum clock lead time is the amount of time ahead of the reference |
| /// clock's understanding of "now" that packets need to arrive (relative to |
| /// the playback clock transformation) in order for the mixer to be able to |
| /// mix them. For example... |
| /// |
| /// ++ Let the PTS of packet X be P(X) |
| /// ++ Let the function which transforms PTS -> RefClock be R(p) (this |
| /// function is determined by the call to Play(...)) |
| /// ++ Let the minimum lead time be MLT |
| /// |
| /// If R(P(X)) < RefClock.Now() + MLT |
| /// Then the packet is late, and some (or all) of the packet's payload will |
| /// need to be skipped in order to present the packet at the scheduled time. |
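| /// |
| /// As a hypothetical illustration: if MLT is 30 mSec and a packet's transformed |
| /// PTS R(P(X)) is only 20 mSec ahead of RefClock.Now(), the packet is 10 mSec |
| /// late; roughly the first 10 mSec of its payload would be skipped so that the |
| /// remainder can still be presented at the scheduled time. |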
| /// |
| // TODO(mpuryear): What should the units be here? Options include... |
| // |
| // 1) Normalized to nanoseconds (this is the current API) |
| // 2) Reference clock units (what happens if the reference clock changes?) |
| // 3) PTS units (what happens when the user changes the PTS units?) |
| // |
| // TODO(mpuryear): Should `EnableMinLeadTimeEvents` have an optional -> () |
| // return value for synchronization purposes? Probably not; users should be |
| // able to send a disable request and clear their event handler if they no |
| // longer want notifications. Their in-process dispatcher framework can |
| // handle draining and dropping any lead time changed events that were |
| // already in flight when the disable message was sent. |
| // |
| EnableMinLeadTimeEvents(bool enabled); |
| -> OnMinLeadTimeChanged(int64 min_lead_time_nsec); |
| |
| // TODO(mpuryear): Eliminate this method when possible. Right now, it is |
| // used by code requiring synchronous FIDL interfaces to talk to |
| // AudioRenderers. |
| /// |
| /// While it is possible to call `GetMinLeadTime` before `SetPcmStreamType`, |
| /// there's little reason to do so. This is because lead time is a function |
| /// of format/rate, so lead time will be recalculated after `SetPcmStreamType`. |
| /// If min lead time events are enabled before `SetPcmStreamType` (with |
| /// `EnableMinLeadTimeEvents(true)`), then an event will be generated in |
| /// response to `SetPcmStreamType`. |
| GetMinLeadTime() -> (int64 min_lead_time_nsec); |
| |
| /// Immediately put the AudioRenderer into a playing state. Start the advance |
| /// of the media timeline, using specific values provided by the caller (or |
| /// default values if not specified). In an optional callback, return the |
| /// timestamp values ultimately used -- these set the ongoing relationship |
| /// between the media and reference timelines (i.e., how to translate between |
| /// the domain of presentation timestamps, and the realm of local system |
| /// time). |
| /// |
| /// Local system time is specified in units of nanoseconds; media_time is |
| /// specified in the units defined by the user in the `SetPtsUnits` function, |
| /// or nanoseconds if `SetPtsUnits` is not called. |
| /// |
| /// The act of placing an AudioRenderer into the playback state establishes a |
| /// relationship between 1) the user-defined media (or presentation) timeline |
| /// for this particular AudioRenderer, and 2) the real-world system reference |
| /// timeline. To communicate how to translate between timelines, the Play() |
| /// callback provides an equivalent timestamp in each time domain. The first |
| /// value ('reference_time') is given in terms of the local system clock; the |
| /// second value ('media_time') is what media instant exactly corresponds to |
| /// that local time. Restated, the frame at 'media_time' in the audio stream |
| /// should be presented at system local time 'reference_time'. |
| /// |
| /// Note: on calling this API, media_time immediately starts advancing. It is |
| /// possible (if uncommon) for a caller to specify a system time that is |
| /// far in the past, or far into the future. This, along with the specified |
| /// media time, is simply used to determine what media time corresponds to |
| /// 'now', and THAT media time is then intersected with presentation |
| /// timestamps of packets already submitted, to determine which media frames |
| /// should be presented next. |
| /// |
| /// With the corresponding reference_time and media_time values, a user can |
| /// translate arbitrary time values from one timeline into the other. After |
| /// calling `SetPtsUnits(pts_per_sec_numerator, pts_per_sec_denominator)` and |
| /// given the 'ref_start' and 'media_start' values from `Play`, then for |
| /// any 'ref_time': |
| /// |
| /// media_time = ( (ref_time - ref_start) / 1e9 |
| /// * (pts_per_sec_numerator / pts_per_sec_denominator) ) |
| /// + media_start |
| /// |
| /// Conversely, for any presentation timestamp 'media_time': |
| /// |
| /// ref_time = ( (media_time - media_start) |
| /// * (pts_per_sec_denominator / pts_per_sec_numerator) |
| /// * 1e9 ) |
| /// + ref_start |
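| /// |
| /// As a hypothetical numeric check of these formulas: with `SetPtsUnits(1000, 1)` |
| /// (PTS in mSec) and a `Play` callback that returned ref_start = 1e9 and |
| /// media_start = 0, ref_time = 1.5e9 maps to |
| /// media_time = ((1.5e9 - 1e9) / 1e9) * (1000 / 1) + 0 = 500 (mSec), |
| /// and media_time = 500 maps back to ref_time = 1.5e9. |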
| /// |
| /// Users, depending on their use case, may optionally choose not to specify |
| /// one or both of these timestamps. A timestamp may be omitted by supplying |
| /// the special value `NO_TIMESTAMP`. The AudioRenderer automatically deduces |
| /// any omitted timestamp value using the following rules: |
| /// |
| /// Reference Time |
| /// If 'reference_time' is omitted, the AudioRenderer will select a "safe" |
| /// reference time to begin presentation, based on the minimum lead times for |
| /// the output devices that are currently bound to this AudioRenderer. For |
| /// example, if an AudioRenderer is bound to an internal audio output |
| /// requiring at least 3 mSec of lead time, and an HDMI output requiring at |
| /// least 75 mSec of lead time, the AudioRenderer might (if 'reference_time' |
| /// is omitted) select a reference time 80 mSec from now. |
| /// |
| /// Media Time |
| /// If media_time is omitted, the AudioRenderer will select one of two |
| /// values. |
| /// - If the AudioRenderer is resuming from the paused state, and packets |
| /// have not been discarded since being paused, then the AudioRenderer will |
| /// use a media_time corresponding to the instant at which the presentation |
| /// became paused. |
| /// - If the AudioRenderer is being placed into a playing state for the first |
| /// time following startup or a 'discard packets' operation, the initial |
| /// media_time will be set to the PTS of the first payload in the pending |
| /// packet queue. If the pending queue is empty, initial media_time will be |
| /// set to zero. |
| /// |
| /// Return Value |
| /// When requested, the AudioRenderer will return the 'reference_time' and |
| /// 'media_time' which were selected and used (whether they were explicitly |
| /// specified or not) in the return value of the play call. |
| /// |
| /// Examples |
| /// 1. A user has queued some audio using `SendPacket` and simply wishes it |
| /// to start playing as soon as possible. The user may call Play without |
| /// providing explicit timestamps -- `Play(NO_TIMESTAMP, NO_TIMESTAMP)`. |
| /// |
| /// 2. A user has queued some audio using `SendPacket`, and wishes to start |
| /// playback at a specified 'reference_time', in sync with some other media |
| /// stream, either initially or after discarding packets. The user would call |
| /// `Play(reference_time, NO_TIMESTAMP)`. |
| /// |
| /// 3. A user has queued some audio using `SendPacket`. The first of these |
| /// packets has a PTS of zero, and the user wishes playback to begin as soon |
| /// as possible, but wishes to skip all of the audio content between PTS 0 |
| /// and PTS 'media_time'. The user would call |
| /// `Play(NO_TIMESTAMP, media_time)`. |
| /// |
| /// 4. A user has queued some audio using `SendPacket` and wants to present |
| /// this media in sync with another player on a different device. The |
| /// coordinator of the group of distributed players sends an explicit |
| /// message to each player telling them to begin presentation of audio at |
| /// PTS 'media_time', at the time (based on the group's shared reference |
| /// clock) 'reference_time'. Here the user would call |
| /// `Play(reference_time, media_time)`. |
| /// |
| // TODO(mpuryear): Define behavior in the case that a user calls `Play()` |
| // while the system is already playing. We should probably do nothing but |
| // return a valid correspondence pair in response -- unless both reference |
| // and media times are provided (and do not equate to the current timeline |
| // relationship), in which case we should introduce a discontinuity. |
| // |
| // TODO(mpuryear): Collapse these if we ever have optional retvals in FIDL |
| Play(int64 reference_time, int64 media_time) |
| -> (int64 reference_time, int64 media_time); |
| PlayNoReply(int64 reference_time, int64 media_time); |
| |
| /// Immediately put the AudioRenderer into the paused state, then (if |
| /// requested) report the relationship between the media and reference |
| /// timelines that was established at the pause point. |
| /// |
| // TODO(mpuryear): Define behavior in the case that a user calls `Pause()` |
| // while the system is already in the paused state. We should probably do |
| // nothing but provide a valid correspondence pair in response. |
| // |
| // TODO(mpuryear): Collapse these if we ever have optional retvals in FIDL |
| Pause() -> (int64 reference_time, int64 media_time); |
| PauseNoReply(); |
| |
| // TODO(mpuryear): Spec methods/events which can be used for unintentional |
| // discontinuity/underflow detection. |
| // |
| // TODO(mpuryear): Spec methods/events which can be used to report routing |
| // changes. (Presuming that they belong at this level at all; they may |
| // belong on some sort of policy object). |
| // |
| // TODO(mpuryear): Spec methods/events which can be used to report policy |
| // induced gain/ducking changes. (Presuming that they belong at this level |
| // at all; they may belong on some sort of policy object). |
| }; |