Closed Bug 1194876 Opened 9 years ago Closed 9 years ago

[Music] Simultaneously tapping forward and previous while song is playing on lockscreen causes skipping issues.

Categories

(Core :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla43
blocking-b2g 2.5+
Tracking Status
firefox43 --- fixed
b2g-v2.2 --- unaffected
b2g-master --- fixed

People

(Reporter: NicholasN, Assigned: kats)

References

()

Details

(Keywords: regression, Whiteboard: [2.5-Daily-Testing][Spark][music])

Attachments

(5 files, 4 obsolete files)

Attached file logcat_music.txt
Description:
The user opens the music app and selects the Songs view. They begin playing a song, and then lock the phone. If they tap forward and previous while on the screen the song with play, but with severe skipping/repeating issues.


Repro Steps:
1) Update an Aries to 20150814042814
2) Ensure there are multiple songs on the device.
3) Open the music app and go to the Songs list.
4) Play any song.
5) Lock the screen.
6) Tap forward and previous at the same time.


Actual:
Song plays with severe skipping/repeating issues.


Expected:
Song plays normally.


Notes:

Environmental Variables:
Device: Aries 2.5
Build ID: 20150814042814
Gaia: 39b121515ab8a8c3ea07f26d3ba1dd792e90217c
Gecko: 4e883591bb5dff021c108d3e30198a99547eed1e
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 43.0a1 (2.5)
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0


Repro frequency: 4/4
See attached: video clip, logcat
Issue reproduces on Flame 2.5 but not on Flame 2.2.

Flame 2.5

Actual Result:
Song plays with severe skipping/repeating issues.

Environmental Variables:
Device: Flame 2.5
BuildID: 20150814073053
Gaia: 4b09b8824e3c68d8f5208a53f9ae3a8917dc9560
Gecko: beb9cc29efb9f727d22f179bcf551ca0bcb842ef
Gonk: c4779d6da0f85894b1f78f0351b43f2949e8decd
Version: 43.0a1 (2.5) 
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0

Flame 2.2

Actual Result:
Song plays normally.

Environmental Variables:
Device: Flame 2.2
BuildID: 20150811123013
Gaia: 102f1299e9eafe3760e1deb44d556b5c4f36b5af
Gecko: 9295034c0ee3
Gonk: bd9cb3af2a0354577a6903917bc826489050b40d
Version: 37.0 (2.2) 
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Keywords: regression
Whiteboard: [2.5-Daily-Testing][Spark]
[Blocking Requested - why for this release]:
performance regression

requesting a window.
blocking-b2g: --- → 2.5?
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
QA Contact: mshuman
This issue appears to be caused by:
Bug 1178847 - Refactor code that computes the CSS viewport

Mozilla-inbound Regression Window

Last Working 
Environmental Variables:
Device: Flame 2.5
BuildID: 20150721075041
Gaia: 4fe0507781f3ed56c8ae5e66dd9489165d1ff68e
Gecko: 49ae0961591e
Version: 42.0a1 (2.5)
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:42.0) Gecko/42.0 Firefox/42.0

First Broken 
Environmental Variables:
Device: Flame 2.5
BuildID: 20150721075241
Gaia: 4fe0507781f3ed56c8ae5e66dd9489165d1ff68e
Gecko: fee45cd9a4d9
Version: 42.0a1 (2.5)
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:42.0) Gecko/42.0 Firefox/42.0

Last Working gaia / First Broken gecko - Issue DOES reproduce
Gaia: 4fe0507781f3ed56c8ae5e66dd9489165d1ff68e
Gecko: fee45cd9a4d9

First Broken gaia / Last Working gecko - Issue does NOT reproduce
Gaia: 4fe0507781f3ed56c8ae5e66dd9489165d1ff68e
Gecko: 49ae0961591e

Gecko Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=49ae0961591e&tochange=fee45cd9a4d9
Blocks: 1178847
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmercado)
Kartikaya, can you please take a look?  The changes for bug 1178847 seem to be the culprit.
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmercado) → needinfo?(bugmail.mozilla)
Yup, thanks, I'll look into it.
Assignee: nobody → bugmail.mozilla
Flags: needinfo?(bugmail.mozilla)
Based on my investigation so far the MVM changes changed the structure of the layer tree slightly, which (maybe) exposed a pre-existing bug that might have happened in other circumstances. The bug results in a GestureEventListener getting a touch-start and nothing else, and results in that GestureEventListener issuing a long-tap when it shouldn't be. This results in the triggering of the "seek" behaviour on the music player controls, which is what results in the skipping issues. At least as far as I can tell so far.
I confirmed my diagnosis above and have a gtest for the APZ code that replicates the problem. Still trying to figure out the right way to fix it, but I probably won't have time to work on it for a couple of days.
blocking-b2g: 2.5? → 2.5+
Component: Gaia::Music → General
Product: Firefox OS → Core
Whiteboard: [2.5-Daily-Testing][Spark] → [2.5-Daily-Testing][Spark][music]
Just clarifying that this only happens when you touch the forward and previous buttons *simultaneously* which is a pretty uncommon usage scenario. Still, technically a regression so we should fix it.
Summary: [Music] Tapping forward and previous while song is playing on lockscreen causes skipping issues. → [Music] Simultaneously tapping forward and previous while song is playing on lockscreen causes skipping issues.
Attached patch Part 1 - Refactor some code (obsolete) — Splinter Review
Attachment #8650709 - Flags: review?(botond)
Not entirely sure if we need this but it seemed odd that the code was just setting the state to NONE without clearing anything else.
Attachment #8650713 - Flags: review?(botond)
Attached patch Part 4 - test (obsolete) — Splinter Review
Attachment #8650716 - Flags: review?(botond)
Comment on attachment 8650709 [details] [diff] [review]
Part 1 - Refactor some code

Review of attachment 8650709 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/apz/src/APZCTreeManager.cpp
@@ +794,5 @@
>            outTransform, touchData.mScreenPoint);
>      }
>    }
>  
> +  mTouchCounter.Update(aInput);

There's an early return here: http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/APZCTreeManager.cpp?rev=78747fd20d8d#777 , perhaps we should call Update() before it?

::: gfx/layers/moz.build
@@ +105,5 @@
>      'apz/util/APZThreadUtils.h',
>      'apz/util/ChromeProcessController.h',
>      'apz/util/DoubleTapToZoom.h',
>      'apz/util/InputAPZContext.h',
> +    'apz/util/TouchCounter.h',

Why is this file in apz/util when it's only used by things in apz/src?
Comment on attachment 8650710 [details] [diff] [review]
Part 2 - Reset the input state on APZCs that are left with active touch points

Review of attachment 8650710 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/apz/src/InputBlockState.cpp
@@ +550,5 @@
>    while (HasEvents()) {
>      TBS_LOG("%p returning first of %" PRIuSIZE " events\n", this, mEvents.Length());
>      MultiTouchInput event = mEvents[0];
>      mEvents.RemoveElementAt(0);
> +    mTouchCounter->Update(event);

Suggestion: instead of making DispatchImmediate virtual, factor out a function DispatchEvent which just calls |GetTargetAPZC()->HandleInputEvent()|, and make *that* virtual. DispatchImmediate() and this call site would both call DispatchEvent().

This way, we can override DispatchEvent() and add the TouchCounter::Update() call in just one place.

::: gfx/layers/apz/src/InputBlockState.h
@@ +359,5 @@
>    bool mAllowedTouchBehaviorSet;
>    bool mDuringFastFling;
>    bool mSingleTapOccurred;
>    nsTArray<MultiTouchInput> mEvents;
> +  TouchCounter* mTouchCounter;

Might as well make this a reference, since you're setting it in the constructor.

Also, since there are two touch counters (APZCTreeManager's and InputQueue's, mention in a comment that this referes to InputQueue's one.)

::: gfx/layers/apz/src/InputQueue.cpp
@@ +449,5 @@
>      } else if (curBlock->IsDefaultPrevented()) {
>        curBlock->DropEvents();
>        target->ResetInputState();
>      } else {
> +      UpdateTargetApzc(curBlock->GetTargetApzc());

Should we be calling this before calling DispatchImmediate() in MaybeHandleCurrentBlock(), too?

@@ +472,5 @@
>  
>  void
> +InputQueue::UpdateTargetApzc(const nsRefPtr<AsyncPanZoomController>& aNewTarget) {
> +  if (mLastTargetApzc && mLastTargetApzc != aNewTarget
> +      && mTouchCounter.GetActiveTouchCount()) {

|GetActiveTouchCount() > 0| reads clearer to me.

::: gfx/layers/apz/src/InputQueue.h
@@ +146,5 @@
>  
>    void ScheduleMainThreadTimeout(const nsRefPtr<AsyncPanZoomController>& aTarget, uint64_t aInputBlockId);
>    void MainThreadTimeout(const uint64_t& aInputBlockId);
>    void ProcessInputBlocks();
> +  void UpdateTargetApzc(const nsRefPtr<AsyncPanZoomController>& aNewTarget);

I'd prefer calling this function something else, to avoid confusion with InputBlockState::UpdateTargetApzc(), which does something different.

SwitchToNewApzc?
Comment on attachment 8650713 [details] [diff] [review]
Part 3 - Tweak the APZC reset code a bit

Review of attachment 8650713 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/apz/src/AsyncPanZoomController.cpp
@@ +3133,3 @@
>      listener->HandleInputEvent(cancel);
>    }
> +  OnTouchCancel(cancel);

This will, among other things, dispatch an APZStateChange::EndTouch notification. Do we want that? (Other things it will do that it didn't do before, like clear axis velocities, we probably do want done.)
Comment on attachment 8650716 [details] [diff] [review]
Part 4 - test

Review of attachment 8650716 [details] [diff] [review]:
-----------------------------------------------------------------

Note that content telling APZ to target an ancestor of the tentative target is a somewhat unexpected scenario (and kats filed bug 1197350 to investigate why this was happening in production), but it's orthogonal to what the test is testing, so the test is fine.
Attachment #8650716 - Flags: review?(botond) → review+
> >  
> > +  mTouchCounter.Update(aInput);
> 
> There's an early return here:
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/
> APZCTreeManager.cpp?rev=78747fd20d8d#777 , perhaps we should call Update()
> before it?

I thought about this, and I don't think we do want that. If we move it up, it the touch counter will get updated with the touches before we remove the non-retained touch points. That means it will introduce a behaviour difference in the scenario where the user pulls an APZC into overscroll, puts down a second finger and then lifts the first finger. In this scenario all the touches dealing with the second finger are not retained. On current m-c we would clear mApzcForInputBlock as soon as the first finger gets lifted. If I add the Update() call before the code to remove the non-retained touch points, then the active touch count will remain 1 until after the second finger goes up. If I add the Update() call where it is in the patch, then the active touch count will drop to 0 as soon as the first finger goes up, which maintains the behaviour we see on m-c. (You could definitely argue that we want the behaviour to change, but this was my rationale for putting it where I did).

> >      'apz/util/InputAPZContext.h',
> > +    'apz/util/TouchCounter.h',
> 
> Why is this file in apz/util when it's only used by things in apz/src?

Good point, moved it to apz/src. It is included by APZCTreeManager.h though so I still had to export it.

(In reply to Botond Ballo [:botond] from comment #14)
> Suggestion: instead of making DispatchImmediate virtual, factor out a
> function DispatchEvent which just calls
> |GetTargetAPZC()->HandleInputEvent()|, and make *that* virtual.
> DispatchImmediate() and this call site would both call DispatchEvent().
> 
> This way, we can override DispatchEvent() and add the TouchCounter::Update()
> call in just one place.

Good suggestion, thanks! Done.

> ::: gfx/layers/apz/src/InputBlockState.h
> > +  TouchCounter* mTouchCounter;
> 
> Might as well make this a reference, since you're setting it in the
> constructor.
> 
> Also, since there are two touch counters (APZCTreeManager's and
> InputQueue's, mention in a comment that this referes to InputQueue's one.)

Done and done.

> ::: gfx/layers/apz/src/InputQueue.cpp
> >      } else {
> > +      UpdateTargetApzc(curBlock->GetTargetApzc());
> 
> Should we be calling this before calling DispatchImmediate() in
> MaybeHandleCurrentBlock(), too?

Yes, good catch!

> > +  if (mLastTargetApzc && mLastTargetApzc != aNewTarget
> > +      && mTouchCounter.GetActiveTouchCount()) {
> 
> |GetActiveTouchCount() > 0| reads clearer to me.

Updated

> ::: gfx/layers/apz/src/InputQueue.h
> > +  void UpdateTargetApzc(const nsRefPtr<AsyncPanZoomController>& aNewTarget);
> 
> I'd prefer calling this function something else, to avoid confusion with
> InputBlockState::UpdateTargetApzc(), which does something different.
> 
> SwitchToNewApzc?

Fair enough. I renamed it to UpdateActiveApzc just because "SwitchToNewApzc" implies that the APZC is different, when it might actually be the same. I'm open to other name suggestions.

(In reply to Botond Ballo [:botond] from comment #15)
> ::: gfx/layers/apz/src/AsyncPanZoomController.cpp
> @@ +3133,3 @@
> >      listener->HandleInputEvent(cancel);
> >    }
> > +  OnTouchCancel(cancel);
> 
> This will, among other things, dispatch an APZStateChange::EndTouch
> notification. Do we want that? (Other things it will do that it didn't do
> before, like clear axis velocities, we probably do want done.)

Hm, good point. But I think this was already happening on m-c, because the APZCTreeManager code sends a TOUCH_CANCEL event to the old mApzcForInputBlock when it changes. So going from one finger -> two fingers, the mApzcForInputBlock may change and the old one would receive a cancel event, which would also send the EndTouch notification. Also at the only place that uses the EndTouch notification [1] it probably makes sense to be sending it in this scenario. I'm happy to revisit this if we find actual problems from this behaviour.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/util/APZEventState.cpp?rev=bb91cf98a42b#388
Attachment #8650709 - Attachment is obsolete: true
Attachment #8650709 - Flags: review?(botond)
Attachment #8651866 - Flags: review?(botond)
Attachment #8650713 - Attachment is obsolete: true
Attachment #8650713 - Flags: review?(botond)
Attachment #8651868 - Flags: review?(botond)
Attached patch Part 4 - testSplinter Review
Rebased, carrying r+
Attachment #8650716 - Attachment is obsolete: true
Attachment #8651869 - Flags: review+
Attachment #8651866 - Flags: review?(botond) → review+
Attachment #8651867 - Flags: review?(botond) → review+
Attachment #8651868 - Flags: review?(botond) → review+
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #17)
> > >  
> > > +  mTouchCounter.Update(aInput);
> > 
> > There's an early return here:
> > http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/
> > APZCTreeManager.cpp?rev=78747fd20d8d#777 , perhaps we should call Update()
> > before it?
> 
> I thought about this, and I don't think we do want that. If we move it up,
> it the touch counter will get updated with the touches before we remove the
> non-retained touch points. That means it will introduce a behaviour
> difference in the scenario where the user pulls an APZC into overscroll,
> puts down a second finger and then lifts the first finger. In this scenario
> all the touches dealing with the second finger are not retained. On current
> m-c we would clear mApzcForInputBlock as soon as the first finger gets
> lifted. If I add the Update() call before the code to remove the
> non-retained touch points, then the active touch count will remain 1 until
> after the second finger goes up. If I add the Update() call where it is in
> the patch, then the active touch count will drop to 0 as soon as the first
> finger goes up, which maintains the behaviour we see on m-c. (You could
> definitely argue that we want the behaviour to change, but this was my
> rationale for putting it where I did).

Good point, thanks for reasoning this out! My concern was that Update() wouldn't be called in the case where we take the early return, but I see now that this is not a problem, because in that case all the touches in the event would be non-retained, and the touch count wouldn't incorporate them to begin with.

> (In reply to Botond Ballo [:botond] from comment #15)
> > ::: gfx/layers/apz/src/AsyncPanZoomController.cpp
> > @@ +3133,3 @@
> > >      listener->HandleInputEvent(cancel);
> > >    }
> > > +  OnTouchCancel(cancel);
> > 
> > This will, among other things, dispatch an APZStateChange::EndTouch
> > notification. Do we want that? (Other things it will do that it didn't do
> > before, like clear axis velocities, we probably do want done.)
> 
> Hm, good point. But I think this was already happening on m-c, because the
> APZCTreeManager code sends a TOUCH_CANCEL event to the old
> mApzcForInputBlock when it changes. So going from one finger -> two fingers,
> the mApzcForInputBlock may change and the old one would receive a cancel
> event, which would also send the EndTouch notification. Also at the only
> place that uses the EndTouch notification [1] it probably makes sense to be
> sending it in this scenario. I'm happy to revisit this if we find actual
> problems from this behaviour.
> 
> [1]
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/util/
> APZEventState.cpp?rev=bb91cf98a42b#388

Note that ResetInputState() has another call site, in InputQueue::ProcessInputBlocks() when a block was prevent-defaulted. However, I can't think of any issues with dispatching an EndTouch notification in that case, either, so let's roll with this.
Issues is still Reproducing on the latest Flame and Aries Central Build. 

Actual Results: 
Music skips after pressing both back and forward buttons simultaneously at the lock-screen. 

Environmental Variables:
Device: Flame 2.5
Build ID: 20150916030229
Gaia: 994ff1537c2d7ca4d1658806c50f3ceba1053f9b
Gecko: 3e8dde8f8c174cce2c0b65c951808f88e35d1875
Gonk: c4779d6da0f85894b1f78f0351b43f2949e8decd
Version: 43.0a1 (2.5)
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0


Environmental Variables:
Device: Aries 2.5
BuildID: 20150916015546
Gaia: 994ff1537c2d7ca4d1658806c50f3ceba1053f9b
Gecko: 3e8dde8f8c174cce2c0b65c951808f88e35d1875
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 43.0a1 (2.5) 
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?][failed-verification]
Flags: needinfo?(jmercado)
Is it just that the music switches to a different song, and the back/forward buttons stop working? Or are the full symptoms from comment 0 still reproducible, in that the song plays with a lot of skipping and jerkiness?
The full symptoms from comment 0 are still reproducible on the latest Flame and Aries builds.


Environmental Variables:
Device: Flame 2.5
Build ID: 20150916030229
Gaia: 994ff1537c2d7ca4d1658806c50f3ceba1053f9b
Gecko: 3e8dde8f8c174cce2c0b65c951808f88e35d1875
Gonk: c4779d6da0f85894b1f78f0351b43f2949e8decd
Version: 43.0a1 (2.5)
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0


Environmental Variables:
Device: Aries 2.5
BuildID: 20150916015546
Gaia: 994ff1537c2d7ca4d1658806c50f3ceba1053f9b
Gecko: 3e8dde8f8c174cce2c0b65c951808f88e35d1875
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 43.0a1 (2.5) 
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0
QA Whiteboard: [QAnalyst-Triage?][failed-verification] → [QAnalyst-Triage+][failed-verification]
Flags: needinfo?(jmercado)
Flags: needinfo?(bugmail.mozilla)
Thanks. I'll clone this bug and look into it again then. These patches landed a while ago and I don't want to back them out or keep working in this bug.
Flags: needinfo?(bugmail.mozilla)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: