_______________________________________________
dev-fxos mailing list
dev-...@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-fxos
All the nga apps (music, contacts, sms) show significant regressions. Is
that only a lack of optimizations in these apps, in the bridge they all
use or design flaws in nga itself?
In any case, we have to stop porting new apps to nga until these
questions are answered.
Fabrice
--
Fabrice Desré
b2g team
Mozilla Corporation
On Oct 2, 2015, at 9:02 AM, Eli Perelman <eper...@mozilla.com> wrote:
If you would like to use Raptor to start performance testing your apps to get these numbers down, it's all documented on MDN [1]. These numbers were captured on a Flame-KK with 319MB of memory with the light reference workload, which is the baseline device for v2.2 -> v2.5. Raptor does require Node v0.12 [2], so if you find you need to switch between Node 0.10 and 0.12 for Gaia, I recommend something like "n" [3] to easily switch between them.

Thanks,
Eli
[1] https://developer.mozilla.org/en-US/Firefox_OS/Automated_testing/Raptor
[2] https://developer.mozilla.org/en-US/Firefox_OS/Automated_testing/Raptor#Prerequisites
[3] https://www.npmjs.com/package/n
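Since Raptor wants Node 0.12 while the Gaia build tooling may still expect 0.10, one way to avoid running Raptor against the wrong interpreter is to guard invocations behind a version check. This is a sketch only: `require_node` is a hypothetical helper, and the `npm install -g n && n 0.12` hint in the error message uses the "n" version manager mentioned above.

```shell
# Sketch only: guard Raptor runs behind a Node version check, since
# Raptor needs 0.12 while Gaia's build tooling may want 0.10.
# `require_node` is a hypothetical helper, not part of Raptor.
require_node() {
  # Succeed only when `node --version` reports the wanted major.minor.
  case "$(node --version 2>/dev/null)" in
    "v$1."*) return 0 ;;
    *) echo "need node v$1.x (try: npm install -g n && n $1)" >&2
       return 1 ;;
  esac
}

if require_node 0.12; then
  raptor test coldlaunch --app clock --runs 10
fi
```

The `if` guard keeps the script from aborting when the wrong Node (or no Raptor) is on the PATH; it just prints a hint instead.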
On Fri, Oct 2, 2015 at 10:54 AM, Gareth Aye <garet...@gmail.com> wrote:
muy cache!
On Fri, Oct 2, 2015 at 11:37 AM, Wilson Page <wp...@mozilla.com> wrote:
Wow! Can the Email team share what changes they made to get such a big improvement?
W I L S O N P A G E
Front-end Developer
Firefox OS (Gaia)
London Office
Twitter: @wilsonpage
IRC: wilsonpage
Correction, my bad. Eli just informed me that his numbers earlier were from the 319MB Flame configuration. I was under the impression that we were no longer supporting that config. In addition, I was looking at “fullyLoaded” and not “visuallyLoaded”.

We have a patch on the way that should optimize the way we fetch album art and should drastically cut down on memory usage. Once that has landed, I will re-test under 319MB and see where we stand.

-Justin
On Oct 2, 2015, at 1:16 PM, Fabrice Desré <fab...@mozilla.com> wrote:
On 10/02/2015 09:49 AM, Justin D'Arcangelo wrote:
> I would also like to add that this policy of immediately pouncing on devs who attempt to try something new that may cause the perf numbers to momentarily dip is part of why we seem to have a culture problem in FxOS dev where everyone is afraid to take any kind of risks. If we are not allowed to have a 2-3 week window to optimize after a huge landing such as this, then how are we supposed to experiment or take risks?
You have all the time you want if you don't put dogfooders at risk. No
one is saying that you should not take the risk to try something new
(side note, you spent enough time on spark & flyweb to know that). But
when it comes to shipping there is a minimum bar to meet, and with
basically 2x the memory usage we are not meeting it in this app yet,
sorry. Feel free to ship a new app alongside the existing one instead
and ask people to try it, since we can't do A/B testing.
The problem we have is that most people don't care enough about having a
stable nightly, which is why we haven't updated dogfooders for more than
a month now.
Fabrice
On 10/05/2015 01:46 AM, Christopher Lord wrote:
> I'll preface what I say with the hopefully obvious statement that we
> should always aim for everything to be better. That said, however, I'd
> take a 2mb memory regression and a half-second startup time regression
> if it meant the app was polished and performed well.
Some apps regressed by way more than 2MB. And also, beware of the
boiling frog.
> Have you guys used an Android phone recently? Their startup time for
> apps is generally atrocious compared to ours (even on high-end devices)
> - we shouldn't drop the ball, but it's not where we compare badly. Given
> we aren't targeting 256mb devices anymore, I'd gladly have all our apps
> use double the memory they did in 2.2 if it meant we had a consistent
> 60Hz update, consistent transitions and snappy response.
That's not what I see on a Nexus 4 running CM 12 and on a z3c running L.
They are both super fast and snappy when launching the default apps.
Still better than us on the same hardware.
Fabrice
If by try runs you mean automated performance testing when opening a PR, then no. Right now the best way to ensure performance is up to snuff with your patch is to run Raptor during development. With Raptor installed, use a performance profile on the device by using the same flags as `make raptor`; then testing an app is as easy as:

raptor test coldlaunch --app clock --runs 10
raptor test coldlaunch --app communications --entry-point dialer --runs 20

During development, use a small number of runs for a tighter feedback loop. Before committing/landing, use many runs for a stronger statistical guarantee. See the Raptor docs for details on getting started [1]. If you need any help getting up and running with Raptor, Rob Wood and I would be happy to lend a hand; just ping us [2].
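The "few runs while iterating, many runs before landing" workflow can be wrapped in a tiny helper. A sketch only: `coldlaunch` is a hypothetical wrapper, while the `raptor` CLI, the app names, and the `--entry-point` flag come from the thread.

```shell
# Sketch only: wrap the two Raptor invocation styles shown above.
# Usage: coldlaunch APP RUNS [EXTRA_FLAGS...]
coldlaunch() {
  app=$1; runs=$2; shift 2
  # Any remaining arguments (e.g. --entry-point dialer) pass through.
  raptor test coldlaunch --app "$app" --runs "$runs" "$@"
}

if command -v raptor >/dev/null; then
  coldlaunch clock 10                               # quick feedback loop
  coldlaunch communications 20 --entry-point dialer # before landing
fi
```

The `command -v raptor` guard lets the same script live in a repo without failing on machines where Raptor isn't installed.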
To be clear, you mean manually but not locally, right? I am talking about using automation prior to landing on the trees consumed by downstream, like is done with gecko+Firefox with inbound trees, etc. Things like try servers.
Nobody should have to run them locally and our learning there is that locally run test results are not to be trusted.
I don't see any individual bashing in this thread. It's an organizational change, as you say.
Raptor automatically reports performance regressions. If one is due to
gecko (like when someone broke nuwa recently) it needs to be treated the
same way we would do with gaia. I see absolutely no difference there.
On Oct 5, 2015, at 4:53 PM, Dietrich Ayala <auto...@gmail.com> wrote:

I strongly disagree with any acceptance of any performance regression for any reason except emergency security patches. Only a zero tolerance policy for perf regressions will result in performant software in such a large and complex project.

+1 to the frog metaphor. History has shown it's *incredibly* hard to claw back from performance regressions. And every moment spent doing so is done *at the cost* of exactly the type of work Chris described, work that actually moves the project *forward*.

If you have a tension between perf and features, then it's time to cut the slow features, or get some more time.

The polish/bugs problems mentioned are fixed by landing fewer bugs (a culture of detailed automated tests and a project-wide love and acceptance of backouts), not by accepting perf regressions. Also, I recommend not using any subjective measure to compare app startup times across different platforms. We used tools to do this in the past.

(My first patch ever, in 2006, regressed Firefox startup time and I spent a few days on the hook... until my feature could land with no startup hit. Can you tell it had an impact on me :D)
On Oct 6, 2015, at 12:39 PM, Fabrice Desré <fab...@mozilla.com> wrote:

On 10/06/2015 01:57 AM, Julien Wajsberg wrote:
> On 10/02/2015 at 19:16, Fabrice Desré wrote:
>> On 10/02/2015 09:49 AM, Justin D'Arcangelo wrote:
>>> I would also like to add that this policy of immediately pouncing on devs who attempt to try something new that may cause the perf numbers to momentarily dip is part of why we seem to have a culture problem in FxOS dev where everyone is afraid to take any kind of risks. If we are not allowed to have a 2-3 week window to optimize after a huge landing such as this, then how are we supposed to experiment or take risks?
>> You have all the time you want if you don't put dogfooders at risk. No one is saying that you should not take the risk to try something new (side note, you spent enough time on spark & flyweb to know that). But when it comes to shipping there is a minimum bar to meet, and with basically 2x the memory usage we are not meeting it in this app yet, sorry. Feel free to ship a new app alongside the existing one instead and ask people to try it, since we can't do A/B testing.
> Sorry, I disagree here. I don't completely disagree though, so bear with me :)
> I think the best way to find bugs and regressions is exposing the changes to users. _Of course_ we need to make sure we don't badly break the phone first. But users of the master branch will have regressions. That's normal and expected. Any big feature will get at least a handful of regressions. Our goal is to track them and fix them before we ship to less technical/less engaged users. IMO that's why we wanted the dogfood process in the first place.
> I guess we only disagree on the magnitude of the regressions we are happy to ship to dogfooders. Your bar seems higher than mine.
> If you don't want dogfooders to get the master regressions, then don't use the master branch. BTW, I personally think we should let the dogfooders choose between "master-dogfood" and "aurora-dogfood" branches.

There's already "dogfood" (with QA sign off) and "dogfood-latest" (nightlies, use at your own risk).

> Now, I guess you're afraid that we're losing dogfooders, even those on master that are aware they can get issues. Big news: they don't leave the program because an app takes 2x memory. Users don't even see it. You can look at the list of foxfood bugs [1]; very few bugs have "slow" or "performance" in their summary.
> So please don't mix and confuse topics and concerns. The performance concern is important, but it's not what puts dogfooders at risk.
Well... we have almost no dogfooders, because we have been unable to
ship updates fixing a bunch of bugs that were submitted at the beginning
of the program. So right now I don't think we can draw any conclusions
from the foxfood feedback unfortunately. And it's not because they won't
notice memory regressions that they are not important. I was merely
pointing out that we have an overall quality issue, and memory/startup
time regressions are part of that.
Fabrice
So we're fine with the system that didn't work for 2.5, and we're making no promise for the future. Nice commitment to performance.
On Tue, Oct 6, 2015 at 11:58 AM, Etienne Segonzac <eti...@mozilla.com> wrote:
> So we're fine with the system that didn't work for 2.5, and we're making no promise for the future. Nice commitment to performance.

I'd hardly say that what we've built doesn't have a positive impact towards performance. The fact that this conversation can even exist with real data is a testament to how far we've come. The intent of my email was not to back people into corners and play blame games, but just to shine a light on what things look like right now so owners and peers have ammunition to make decisions. Let me repeat what I just said, because it is the crux of the problem: the current scope of performance automation is not in a state where it is automatically sheriffable. We've built up the tools and infrastructure almost from the ground up, and they have accomplished exactly what they were set out to do: to put the knowledge of performance and problems into the hands of owners and peers to make their own decisions.

Automating performance is usually not a binary decision like a unit test. It takes analysis and guesswork, and even then it still needs human eyes. Rob and I are working towards making this better, to automate as much as possible, but right now the burden of making the tough calls still lies with those landing patches. We equip you to make those determinations until we have more tooling and automation in place for sheriffing to actually be an option, because right now it is not.
We're back to my first message in the thread: we don't have adequate tooling to achieve our performance goal.

Every release we talk about performance as if the issue were a "developer awareness" issue, and we take a strong stance on how "we should never regress". But if we meant it, we'd have more than 2 people working on the very challenging tooling work required. And believe me, I'm fully aware of how challenging it is.

We can't hand every Gaia and Gecko developer a link to moztrap (the manual test case tracker), remove all automated tests, and then be all high-minded about how we should never regress a feature. But that's exactly what we're doing with launch time performance.
I think the question lies in answering: how do we resolve this, and who resolves it? Perhaps having a quarterly goal for someone would help push this through. I think it's evident that we need someone to work on it and have it as part of their goals, even if it's in parts.
I would love to, but my Firefox OS Flame reference phone went dead on me and won't come back on, even after I charged the battery to 100%.
On Fri, Oct 2, 2015 at 3:51 PM, Eli Perelman <eper...@mozilla.com> wrote:

Hello fxos,

With deadlines for v2.5 approaching, I thought I would take a couple minutes and summarize the current state of performance for Gaia. At the outset of v2.5 we captured metrics of v2.2 and have used that as the baseline to determine whether applications have regressed their performance since. Any application whose performance has significantly regressed since v2.2 will need approval to not block, as major increases will block v2.5.

Enough of the chatter, here's the data (cold launch and USS, v2.2 -> current):

Dialer:   launch 851ms -> 944ms (~90ms regression, still under 1000ms); USS 17.48MB -> 13.04MB (good!)
Contacts: launch 773ms -> 1246ms (~475ms regression); USS 18.26MB -> 20.04MB (~1.75MB regression)
Clock:    launch 1232ms -> 1260ms (acceptable); USS 13.98MB -> 14.95MB (~1MB regression)
Camera:   launch 1492ms -> 2090ms (~600ms regression); USS 13.83MB -> 16.05MB (~2.2MB regression)
Calendar: launch 1454ms -> 1638ms (~180ms regression); USS 14.01MB -> 13.99MB (good)
Email:    launch 2129ms -> 606ms (good!); USS 16.17MB -> 15.78MB (good)
FM:       launch 604ms -> 783ms (~175ms regression); USS 10.37MB -> 10.51MB (acceptable)
Gallery:  launch 1113ms -> 1207ms (~90ms regression); USS 17.71MB -> 18.98MB (~1.25MB regression)
Music:    launch 1066ms -> 1717ms (~650ms regression); USS 13.37MB -> 29.49MB (~16.12MB regression)
SMS:      launch 1340ms -> 1630ms (~290ms regression); USS 12.86MB -> 19.94MB (~7MB regression)
Settings: launch 2474ms -> 2950ms (~475ms regression); USS 17.18MB -> 17.54MB (acceptable)
Video:    launch 1115ms -> 1309ms (~190ms regression); USS 12.13MB -> 13MB (acceptable)

TL;DR: there seem to be quite a few serious regressions across many applications, in both cold launch time and USS memory usage. As a comparison, the Test Startup Limit app when first captured started off in the 880ms range, spent a good chunk of June and July around 620ms, and is now around 850ms.

Also, kudos to the Email team for the massive improvement in both launch time and memory.

If anyone has any questions about the data or needs additional information, please let me know.

Thanks,
Eli Perelman
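The deltas in Eli's table can be recomputed mechanically from the raw numbers. A sketch only: `regression` is a hypothetical helper (not part of Raptor), using integer shell arithmetic, so percentages are truncated.

```shell
# Sketch only: recompute cold-launch deltas from the v2.2 baseline and
# current numbers quoted above. Usage: regression V22_MS CURRENT_MS
regression() {
  echo "$(($2 - $1))ms ($((($2 - $1) * 100 / $1))%)"
}

regression 1066 1717   # Music, the ~650ms regression in the table
regression 1340 1630   # SMS, ~290ms
regression 2474 2950   # Settings, ~475ms
```

Seeing the same delta as both milliseconds and a percentage makes it easier to compare apps with very different baselines (a 475ms hit on Settings' 2474ms start is far less dramatic than 650ms on Music's 1066ms).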
Good points. I agree.
I would also like to add that this policy of immediately pouncing on devs who attempt to try something new that may cause the perf numbers to momentarily dip is part of why we seem to have a culture problem in FxOS dev where everyone is afraid to take any kind of risks. If we are not allowed to have a 2-3 week window to optimize after a huge landing such as this, then how are we supposed to experiment or take risks?
In a worst-case scenario, we have the old Music app in the dev_apps folder that we can switch back to at a moment’s notice. But we should be encouraging devs by giving them time to optimize after taking a huge risk. We are shipping to Mozillians (foxfooders) right now, who presumably understand that we are trying to make FxOS better.
-Justin
> On Oct 2, 2015, at 12:41 PM, Justin D'Arcangelo <jdarc...@mozilla.com> wrote:
>
> I understand, but the whole point of doing the switch now was to ensure that there were more people using the app to make sure there were no showstopper bugs that we weren’t aware of. If we don’t get people using the app for a few weeks, it’s possible that some major bugs could slip through. Especially in the case of an app like Music, where everyone possesses a wide variety of media files. It would be impossible for any of the Music app devs or QA testers to check every possible media file out there. However, with dogfooders using the app, it’s more likely that more types of media files will get tested.
>
> -Justin
>
>
>> On Oct 2, 2015, at 12:36 PM, Fabrice Desré <fab...@mozilla.com> wrote:
>>
>> Think about people dogfooding. It's barely acceptable to suddenly switch
>> to a much worse version of any app, even if your target is some arbitrary
>> deadline in the future and you're confident you'll fix issues. You would not
>> do that on a live website, right?
>>
>> On 10/02/2015 09:31 AM, Justin D'Arcangelo wrote:
>>> I feel like this is the 3rd or 4th time I’ve had to give this explanation, but at least in the case of Music NGA, we merely landed a completely new, feature-complete app this week. The optimization phase of the code had not yet begun, hence the reason for the increase in the perf numbers. However, prior to landing, I *did* run Raptor every day for the past 2 weeks on Flame. In my Raptor results, Music NGA was coming out ~500ms *faster* than the old app. However, as I noted in the bug, I do not trust the numbers because of the OS-wide perf regression that was causing *both* Music apps to take about 3-4 seconds to launch.
>>>
>>> This week, the focus has been mainly on identifying and quickly addressing any bugs that came up after the initial testing of the app. I feel that we have things somewhat under control as far as broken functionality goes. Yesterday, we started working on optimizations. There are several areas where we are completely unoptimized at the moment:
>>>
>>> - album art caching/loading
>>> - thumbnail sizes
>>> - script loading
>>> - view caching
>>>
>>> All of these items will address the memory usage, startup time or both. So, please do not assume that we spent weeks optimizing the app before landing this week. We merely reached a state of “feature-complete” with a new codebase. We hope to meet or beat the prior app’s numbers before the v2.5 deadline.
>>>
>>> Thanks!
>>>
>>> -Justin
>>>
>>>
>>>> On Oct 2, 2015, at 12:01 PM, Fabrice Desré <fab...@mozilla.com> wrote:
>>>>
>>>> All the nga apps (music, contacts, sms) show significant regressions. Is
>>>> that only a lack of optimizations in these apps, in the bridge they all
>>>> use or design flaws in nga itself?
>>>> In any case, we have to stop porting new apps to nga until these
>>>> questions are answered.
>>>>
>>>> Fabrice
>>>
>>
>>
>
Now, I understand.
I have two phones, one of which has boot2gecko 3.0.0.0-prerelease, and one with version 1.3 that updates brought only up to version 2.0; it didn't go further than that (not sure why). But how do I continue testing, since I can no longer access and test version 3.0.0.0-prerelease now that the phone that had it is dead, yet my other FxOS phone, currently on 2.0, isn't being updated?
It seems that way, because I know one dev that has to do that: his app that worked before on one version no longer works on v2.5 or higher.
On 10/06/2015 09:58 AM, Etienne Segonzac wrote:
> So we're fine with the system that didn't work for 2.5 and we're making
> no promise for the future.
> Nice commitment to performance.
We have tools to detect regressions and report bugs. We have people
triaging and following up on these bugs. What's left? Locking down devs
to fix issues? If you have suggestions I'm all ears, but I'm out of
politically correct ideas.