Round 13 preview data is available for sanity checking


Brian Hauer

unread,
Aug 18, 2016, 7:43:02 PM8/18/16
to framework-benchmarks
Round 13 preview data from Azure is available for sanity checks.

https://www.techempower.com/benchmarks/previews/round13/

Thank you for your patience!  We hope to address some known issues with the results and accept fix pull requests for approximately two weeks prior to finalizing the round.  If you identify and can correct any issues, we'd appreciate pull requests that fix problems.

Notes and known issues:
  • We are in the process of migrating the logs to our log server.  If the logs directory at http://tfb-logs.techempower.com/round-13/ is empty when you check it, please try again in a few hours.
  • Stripped test implementations are now hidden by default.
  • Framework attributes data are somewhat unclean. (E.g., "PHP", "PHP5", and "PHP7" in the Language attributes.) We hope to clean this prior to Round 13 final.
  • The LWAN multiple-queries result appears to be using either a database batch or result-set cursors, but a review of the source code has not yet confirmed this.
Thanks for your help and contributions!

Zhong Yu

unread,
Aug 18, 2016, 8:19:53 PM8/18/16
to Brian Hauer, framework-benchmarks
Thanks Brian, much appreciated.

It's typical among frameworks that the throughput drops significantly when the number of connections goes up from 4k to 16k. Does anybody have a theory? Is it the increased bookkeeping in the TCP stack? Also, can we test with even more connections, like 64k?

Zhong Yu
bayou.io

rik...@ngs.hr

unread,
Aug 19, 2016, 5:23:41 AM8/19/16
to framework-benchmarks
Hi Brian,

thanks for the results.
But I can't seem to understand why Revenj.JVM is missing from multiple queries and updates.
The logs indicate it finished successfully, but on the preview page it is shown as did not complete.

Did you manually exclude the results for those two tests?

Btw, the Lwan tests seem to be using SQLite; that's why they have much higher numbers than the others.

Regards,
Rikard

Fredrik Widlund

unread,
Aug 19, 2016, 9:57:28 AM8/19/16
to Brian Hauer, framework-benchmarks
Hi Brian,

Are we still running on Ubuntu 12.04?

Kind regards,
Fredrik Widlund


Brian Hauer

unread,
Aug 19, 2016, 11:03:33 AM8/19/16
to framework-benchmarks
Aha, SQLite of course!  I will hide that row in the data set.  As discussed elsewhere, we do not presently support in-process or local database tests.  Removing these options for the time being may be a side effect of our planned work to clean up the attributes/filters.

We should have GitHub issues to represent the other hidden data soon.  Until an issue is created, I can summarize: the Revenj.JVM implementation appears to be using a SQL batch rather than independent queries.  Review this line:

https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Java/revenj/src/main/java/hello/Context.java#L66

The requirements should be clarified for this, however, since the only current citation of batches is to say they are "acceptable" for the updates portion of the updates test; they are not cited specifically in the multiple queries test requirements.  The intent of the test is to require the use of N query statements without a batching behavior—N round trips to the database server.
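
For illustration, here is a minimal JDBC sketch of the pattern that intent describes: N independent single-row SELECTs, each its own round trip. This is not taken from any actual test implementation; the class name is made up, and the World/randomNumber names merely follow the benchmark's usual convention.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import javax.sql.DataSource;

// Hypothetical sketch of the intended multi-query pattern:
// N independent SELECTs, each one a separate round trip to the database.
public class MultiQuerySketch {
    public static List<int[]> queryWorlds(DataSource ds, int queries) throws Exception {
        List<int[]> worlds = new ArrayList<>(queries);
        try (Connection conn = ds.getConnection();
             PreparedStatement ps =
                 conn.prepareStatement("SELECT id, randomNumber FROM World WHERE id = ?")) {
            for (int i = 0; i < queries; i++) {
                ps.setInt(1, ThreadLocalRandom.current().nextInt(1, 10001));
                try (ResultSet rs = ps.executeQuery()) {   // one round trip per iteration
                    rs.next();
                    worlds.add(new int[] { rs.getInt("id"), rs.getInt("randomNumber") });
                }
            }
        }
        return worlds;
    }
}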

Brian Hauer

unread,
Aug 19, 2016, 11:04:17 AM8/19/16
to framework-benchmarks, teona...@gmail.com
Hi Fredrik,

Good question!  This is now on Ubuntu 14.

rik...@ngs.hr

unread,
Aug 19, 2016, 11:19:40 AM8/19/16
to framework-benchmarks
That explains why you hid the multiple queries test, but not why you hid the updates test.

And now onto batches: I don't believe that query is doing anything against the spirit of the test.
If your test represents a use case of loading data from N different data sources (e.g., tables), that is aligned with the Revenj implementation.
I'm 99.9% sure that no other framework supports such a feature in a general way. And it's not feasible to implement such a query by hand on Postgres. So Revenj is doing something unique and is following the rule of not doing a query such as "IN (ids)".

Your rules don't state that an implementation must do N round trips (which would not be a really good rule), but that it has to do multiple queries.
If I were using MSSQL, I would write such queries this way:

SELECT * FROM table1 WHERE id = @id1; SELECT * FROM table2 WHERE id = @id2

Those are batched queries, but in a single round trip. It makes no sense to forbid such queries, because that is the way you should talk to the database.
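
For context, the single-round-trip style being described looks roughly like this in JDBC. This is a hypothetical sketch, not code from the Revenj implementation, and support for multiple result sets from one statement varies by driver and database.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

// Hypothetical sketch of the batched, single-round-trip style:
// two SELECTs sent as one statement; the driver returns multiple result sets.
public class BatchedSelectSketch {
    public static void readBoth(DataSource ds, int id1, int id2) throws Exception {
        // string concatenation only for brevity in this sketch
        String sql = "SELECT * FROM table1 WHERE id = " + id1 + "; "
                   + "SELECT * FROM table2 WHERE id = " + id2;
        try (Connection conn = ds.getConnection();
             Statement st = conn.createStatement()) {
            boolean hasResultSet = st.execute(sql);    // one round trip carrying both queries
            while (hasResultSet || st.getUpdateCount() != -1) {
                if (hasResultSet) {
                    try (ResultSet rs = st.getResultSet()) {
                        while (rs.next()) {
                            // consume the current result set
                        }
                    }
                }
                hasResultSet = st.getMoreResults();    // advance to the next result set, if any
            }
        }
    }
}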

Regards,
Rikard

Brian Hauer

unread,
Aug 19, 2016, 1:51:32 PM8/19/16
to framework-benchmarks
The preview results were improperly scaled for 15-second tests rather than the actual 60-second tests we ran for this preview.  I've corrected the preview results to account for the 60-second test duration.

Because 15 seconds is one quarter of 60 seconds, all numbers are now 25% of what you would have seen if you looked at the results earlier today or yesterday.  However, the rank order of results is identical.

Sorry for this error!

David D

unread,
Aug 20, 2016, 9:12:08 AM8/20/16
to framework-benchmarks
Excellent that these results are now available. Thanks everyone for all your hard work!

Could someone please confirm which branch we should now be raising PRs against, and whether they will actually make it into Round 13 or the (new?) Round 14?

thanks!

Aliaksandr Valialkin

unread,
Aug 20, 2016, 9:38:31 AM8/20/16
to framework-benchmarks
Plaintext and JSON results seem capped either by network speed or by client (wrk) performance compared to Round 12.
Btw, it would be great to see CPU load and RAM usage on the server host for each framework - these values might help in identifying bottlenecks.

Aliaksandr Valialkin

unread,
Aug 20, 2016, 9:52:49 AM8/20/16
to framework-benchmarks
I suspect the cap is in the Azure network, which limits the packet rate to the cap numbers shown in the JSON results. Plaintext results are capped at higher numbers because HTTP pipelining is enabled in that test, so multiple requests or responses may travel in a single network packet.

Nikolche Mihajlovski

unread,
Aug 20, 2016, 3:08:33 PM8/20/16
to Aliaksandr Valialkin, framework-benchmarks
I agree, the numbers look definitely wrong (the top 20 frameworks have almost identical performance), and I also assume the network is the bottleneck.

Rapidoid magically dropped from 1st place to 12th place in "Plaintext", and from 2nd to 17th in "JSON" - just like that, without any changes in the implementation.
I don't see any other reason for this except a bottleneck in the infrastructure.

I understand there was a strong pressure from the community to produce Round 13 soon, but I just hope the final round will have proper results.

Aliaksandr's idea about reporting the CPU and RAM is great!

Matthieu Garrigues

unread,
Aug 22, 2016, 3:39:46 AM8/22/16
to framework-benchmarks


On Friday, August 19, 2016 at 01:43:02 UTC+2, Brian Hauer wrote:
Round 13 preview data from Azure is available for sanity checks.

https://www.techempower.com/benchmarks/previews/round13/

Thank you for your patience!  We hope to address some known issues with the results and accept fix pull requests for approximately two weeks prior to finalizing the round.  If you identify and can correct any issues, we'd appreciate pull requests that fix problems.


Thanks to all the TFB team for the results. I noticed in the logs that the Silicon installation failed because cmake could not find the clang++-3.5 compiler. This compiler should have been
installed by prerequisites.sh (at least it is the case in the vagrant development env):

Is there any reason that could make apt-get fail in the Azure environment but not in the dev environment?

Best,
Matthieu

Brian Hauer

unread,
Aug 22, 2016, 5:24:18 PM8/22/16
to framework-benchmarks
Hi Rikard,

I removed both the Multi-query result and Updates result for Revenj because both appeared to be selecting data using a single round-trip to the database server.  We have previously done the same in similar circumstances for other test implementations until they were corrected.

The Multiple-query and Updates tests are designed to exercise the database connection pool, the database driver, the ORM, and all other aspects of the database pipeline repeatedly (as well as the HTTP request pipeline once per request).  The N iterations are intended to be the equivalent of doing the database work of the Single-query test in its entirety N times, but without the overhead of an additional HTTP request.

The use-case this is approximating is an application behavior where you need to read item A and then based on A's value and other logic, you need to read item B and then based on B's value and other logic, you now need to read item C, and so on.  Imagine branching code.

The intent has always been to do N round-trips to the database server to fully exercise the connection pool, database driver, ORM (where applicable), and other elements of database connectivity.  I have attempted to clarify the requirements further as a result of your feedback.  I have rewritten requirement #6 of the Multi-query test as such:

"This test is designed to exercise multiple queries, each requiring a round-trip to the database server, and with each resulting row selected individually. It is not acceptable to use batches. It is not acceptable to execute multiple SELECTs within a single statement. It is not acceptable to retrieve all required rows using a SELECT ... WHERE id IN (...) clause."

It sounds as if you are generating multiple ResultSets from a single statement.  That is obviously a perfectly sensible thing to do in some use-cases and may be a suitable use-case to use as the basis for a future test type in our project.  But it is not in-line with the intent of the existing Multi-query test.  If you would like to propose a new test type that executes multiple queries within a single statement, please do so here:

https://github.com/TechEmpower/FrameworkBenchmarks/issues/133

I hope you understand that we do this not to frustrate your efforts but to keep the results accurate and fair.  This is analogous to our (current) stance that we do not yet have a test type suitable for SQLite since an embedded database also avoids the principal work of making round-trips to an external service.

Thank you for your understanding!

Brian Hauer

unread,
Aug 22, 2016, 6:03:58 PM8/22/16
to framework-benchmarks
Hi Aliaksandr,

You are correct, these two test types are being network-limited.  While we didn't measure the network performance directly, we suspect it is approximately 1 Gbps.  We saw exactly this sort of compression of results at the high-end in our original physical hardware environment, which was i7 workstations on gigabit Ethernet.

We hope to eventually be able to run Round 13 on our new physical server environment that has 10-gigabit Ethernet.  However, we're not yet comfortable that we're seeing consistent network performance in that environment, and therefore do not yet have preview data that is suitable to share. 

Brian Hauer

unread,
Aug 22, 2016, 6:05:19 PM8/22/16
to framework-benchmarks
Hi David,

For fixes based on the preview results, PRs should go to round-13.  For additions and other changes, PRs should go to round-14.

Brian Hauer

unread,
Aug 22, 2016, 6:13:34 PM8/22/16
to framework-benchmarks
Hi Matthieu,

Unfortunately, it's not obvious why apt-get would fail in our test and not in your development environment.  Some things to note:
  • We're now on Ubuntu 14 for our tests.  So it's possible the package in question isn't available for Ubuntu 14.
  • Perhaps the repos were not available at the time of our test.

Daniel Nicoletti

unread,
Aug 23, 2016, 12:36:37 AM8/23/16
to framework-benchmarks
Hi,

what sort of hardware is this Azure instance?
Do I need to mark when keep-alive is not supported?

The results for Cutelyst are much better on my laptop with
an i5 dual core; also, I didn't have any errors. Where can I see
what the errors were?

Thanks.

rik...@ngs.hr

unread,
Aug 23, 2016, 4:40:01 AM8/23/16
to framework-benchmarks
Hi Brian,


On Monday, August 22, 2016 at 11:24:18 PM UTC+2, Brian Hauer wrote:
Hi Rikard,

I removed both the Multi-query result and Updates result for Revenj because both appeared to be selecting data using a single round-trip to the database server.  We have previously done the same in similar circumstances for other test implementations until they were corrected.

Can you point me to an instance where you instructed an implementation that did not break the rules, as stated at the time, to change its implementation?
 

The Multiple-query and Updates tests are designed to exercise the database connection pool, the database driver, the ORM, and all other aspects of the database pipeline repeatedly (as well as the HTTP request pipeline once per request).  The N iterations are intended to be the equivalent of doing the database work of the Single-query test in its entirety N times, but without the overhead of an additional HTTP request.

What you are saying is that you want implementations to work hard and will not allow implementations to work smart.
 
The use-case this is approximating is an application behavior where you need to read item A and then based on A's value and other logic, you need to read item B and then based on B's value and other logic, you now need to read item C, and so on.  Imagine branching code.

Frankly, I think you now just made up that use case in trying to prove that the Revenj implementation does not abide by it.
But let's say that this is your originally intended use case. In that case your current requirements are again lacking; e.g., you need to state that queries must be executed in serial fashion. Otherwise knowledge obtained from a previous result can't be used for the next query.
By quickly browsing a few implementations it's obvious that they don't abide by this rule. Instead they are starting N parallel requests to the database to minimize the duration of the total DB interaction (see the sketch after these links). For example:
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Scala/akka-http/src/main/scala/com/typesafe/akka/http/benchmark/handlers/QueriesHandler.scala#L50
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Java/undertow/src/main/java/hello/DbSqlHandler.java#L53
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/JavaScript/nodejs/handlers/mongodb-raw.js#L57

to name just a few.
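
A minimal sketch of the parallel pattern being pointed out (hypothetical names; assumes a pooled DataSource and an ExecutorService): the N queries are issued concurrently, but each is still its own round trip.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import javax.sql.DataSource;

// Hypothetical sketch: N single-row queries issued in parallel on a connection pool.
// Each query is still its own round trip; only the waiting overlaps.
public class ParallelQueriesSketch {
    public static List<Integer> queryWorlds(DataSource ds, ExecutorService pool, int n) {
        List<CompletableFuture<Integer>> futures = IntStream.range(0, n)
            .mapToObj(i -> CompletableFuture.supplyAsync(() -> querySingleWorld(ds), pool))
            .collect(Collectors.toList());
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    private static Integer querySingleWorld(DataSource ds) {
        String sql = "SELECT randomNumber FROM World WHERE id = ?";
        try (Connection conn = ds.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, ThreadLocalRandom.current().nextInt(1, 10001));
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
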
And on top of that, Revenj actually can execute multiple queries in a single round trip and use the result of a previous query as input for the next query. Are you saying that if I change the implementation to support such a use case you will allow it (or will you invoke the multiple-round-trips-to-the-database rule again)?
 
The intent has always been to do N round-trips to the database server to fully exercise the connection pool, database driver, ORM (where applicable), and other elements of database connectivity.  I have attempted to clarify the requirements further as a result of your feedback.  I have rewritten requirement #6 of the Multi-query test as such:

"This test is designed to exercise multiple queries, each requiring a round-trip to the database server, and with each resulting row selected individually. It is not acceptable to use batches. It is not acceptable to execute multiple SELECTs within a single statement. It is not acceptable to retrieve all required rows using a SELECT ... WHERE id IN (...) clause."

Don't take this the wrong way, but you should try and learn from constructive criticism.
Rules which specify how an implementation should behave are broken rules.
Rules should specify only the resulting output and the behavior in specific scenarios (e.g., flushing to disk before returning a response).
But your benchmark, due to its use of randomness and various simplifications, has issues verifying whether frameworks are behaving correctly.
Since it's unlikely that anything will change at this point due to the sheer size of the implementations, maybe you should reconsider your stance on letting the community help you with the benchmark.

It sounds as if you are generating multiple ResultSets from a single statement.  That is obviously a perfectly sensible thing to do in some use-cases and may be a suitable use-case to use as the basis for a future test type in our project.  But it is not in-line with the intent of the existing Multi-query test.  If you would like to propose a new test type that executes multiple queries within a single statement, please do so here:

https://github.com/TechEmpower/FrameworkBenchmarks/issues/133

That's just wishful thinking. I don't expect new tests from this benchmark anytime soon.
 

I hope you understand that we do this not to frustrate your efforts but to keep the results accurate and fair.  This is analogous to our (current) stance that we do not yet have a test type suitable for SQLite since an embedded database also avoids the principal work of making round-trips to an external service.

What frustrates me is that you have a few "broken" rules in the benchmark. But that is irrelevant; this is your benchmark, not a community one, and you are free to make up rules as you see fit. But you should not be surprised when people complain about issues with your benchmark.

Allowing a bulk update on a single table, but not allowing a bulk-reading implementation which supports multiple tables, makes no sense.
In the end, which results would benefit the people looking at them more?
The ones which suggest that you should do a parallel implementation of multiple DB queries (which is not even the fastest in this bench), or the ones which clearly show that batching multiple different queries is much faster?
 

Thank you for your understanding!

Sorry for not having too much understanding for your explanations.
I hope that you will read and try to understand my response, although I don't have much faith that you will understand my point of view.

Regards,
Rikard

Brian Hauer

unread,
Aug 23, 2016, 5:30:32 PM8/23/16
to framework-benchmarks
Hi Daniel,

You may be able to see the errors in the log files that are linked from the banner on the preview site.

The Azure tests were run on D3 v2 instances.  Keep-alive is used in all test types.  HTTP pipelining is only used in the plaintext test type.

Brian Hauer

unread,
Aug 23, 2016, 6:41:52 PM8/23/16
to framework-benchmarks
Hi Rikard,


On Tuesday, August 23, 2016 at 1:40:01 AM UTC-7, rikard wrote:
Can you point me to an instance where you instructed an implementation that did not break the rules, as stated at the time, to change its implementation?

We have removed and will continue to remove results from SQLite implementations for a similar reason—namely, such tests are avoiding a principal portion of the expected work.  We've also removed Redis implementations since we would prefer to have Redis implementations show up in a future caching-enabled test type.  NB: This has been and continues to be a manual process; we're working on automating more of this as time permits.
 
What you are saying is that you want implementations to work hard and will not allow implementations to work smart.

Yes.  This is a benchmark exercise.  Working hard is precisely what we want.

Working "smart" is encouraged but—as I am sure you can understand—working too smart is potentially dangerous for a benchmarking project.  Being too clever may mean intentionally avoiding the expected work load (e.g., an implementation that doesn't even talk to a database and just returns results that fool our validation tests) or unintentionally avoiding expected work (using an optimization that seems reasonable but we deem invalid for the test type).  Previously we have had to enforce very subtle things like requiring that the JSON implementations instantiate objects, rather than return a serialization of a single static object.  Enforcing some of these things may not even measurably affect the results, but we try to do so as we can to keep the results fair.

That said, I will admit that we're only so good at noticing every clever thing that may or may not be violating the spirit of the tests.  In large part we count on the generosity of the community to help us keep an eye on test implementations that may be playing a bit fast and loose with the rules.
 
Here is the bottom-line: the multi-query test type has always been intended to require N round-trips to the database server as an external system.  Implementations that use SQLite are avoiding those round-trips to an external system.  An implementation that runs the N queries as a batch avoids making several round-trips.  An implementation that runs the N queries in a single Statement with multiple ResultSets avoids making several round-trips.  None of these are acceptable.

The legwork of communicating with an external database is in large part what we are concerned with in this test type.  If you can remove that legwork in a real-world application, that's great.  But we're measuring a scenario where you are required to make N round-trips to an external system.  That's just what the test is.


The use-case this is approximating is an application behavior where you need to read item A and then based on A's value and other logic, you need to read item B and then based on B's value and other logic, you now need to read item C, and so on.  Imagine branching code.

Frankly, I think you now just made up that use case in trying to prove that the Revenj implementation does not abide by it.

Yes, I did just make up that analogy to try to paint a picture of a case where you might need to run N queries but are not able to batch them together.  It's one example of why N round trips might be necessary.  But you are right, I've never used that particular analogy before.
 
But let's say that this is your originally intended use case. In that case your current requirements are again lacking; e.g., you need to state that queries must be executed in serial fashion. Otherwise knowledge obtained from a previous result can't be used for the next query.
By quickly browsing a few implementations it's obvious that they don't abide by this rule. Instead they are starting N parallel requests to the database to minimize the duration of the total DB interaction. For example:
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Scala/akka-http/src/main/scala/com/typesafe/akka/http/benchmark/handlers/QueriesHandler.scala#L50
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Java/undertow/src/main/java/hello/DbSqlHandler.java#L53
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/JavaScript/nodejs/handlers/mongodb-raw.js#L57

To be clear, it is not a requirement that the queries run serially.  The scenario I used above is one reason you might need N queries.  Another example reason is that you need to interact with N separate databases on different servers.  Whatever the reason, we are insisting on N round-trips.  We want to exercise the framework and platform's database driver code (and ORM, where applicable) repeatedly during the scope of each request.

If, for the sake of argument, we did allow one implementation to avoid the work of N round-trips, all implementations would suddenly be out of date and everyone would feel the pressure to update their implementations to use batches, multiple ResultSets, or whatever other tactics they have in mind for doing all the work in one iteration.  But again, that's not what we are doing.
 
And on top of that, Revenj actually can execute multiple queries in a single round trip and use the result of a previous query as input for the next query. Are you saying that if I change the implementation to support such a use case you will allow it (or will you invoke the multiple-round-trips-to-the-database rule again)?

You are required to make N round trips.
 
 
The intent has always been to do N round-trips to the database server to fully exercise the connection pool, database driver, ORM (where applicable), and other elements of database connectivity.  I have attempted to clarify the requirements further as a result of your feedback.  I have rewritten requirement #6 of the Multi-query test as such:

"This test is designed to exercise multiple queries, each requiring a round-trip to the database server, and with each resulting row selected individually. It is not acceptable to use batches. It is not acceptable to execute multiple SELECTs within a single statement. It is not acceptable to retrieve all required rows using a SELECT ... WHERE id IN (...) clause."

Don't take this the wrong way, but you should try and learn from constructive criticism.
Rules which specify how an implementation should behave are broken rules.

You may feel that way, but I feel otherwise.  These tests are required to use a database server—they are not permitted to generate responses that appear convincing to our validation tests without hitting a database.  The Fortunes test is always returning the same payload but it is required to repeatedly query the database for those fortune cookie messages that never change.  The JSON serialization test must incur the cost of instantiating an object or allocating memory.  And so on.  These are implementation details that we consider fundamental to the benchmarking exercise at hand.

Yes, a framework may have clever features that in some use-cases would avoid similar workload in real-world applications.  We simply do not have the necessary test type diversity (yet?) to demonstrate all of those cases.
 
Rules should specify only the resulting output and the behavior in specific scenarios (e.g., flushing to disk before returning a response).
But your benchmark, due to its use of randomness and various simplifications, has issues verifying whether frameworks are behaving correctly.
Since it's unlikely that anything will change at this point due to the sheer size of the implementations, maybe you should reconsider your stance on letting the community help you with the benchmark.

I'm not sure what you mean here.  We greatly appreciate all of the help we have received from the community on this project!  We hope to continue receiving such help.

But you are right, there is a large volume of code in the existing implementations that the community has contributed.  We prefer to not change test type requirements unless there is a necessary clarification.  Changing the Multi-query test to allow batching would change the intention of the test and render many/most implementations out of date.  We are averse to changes that will render many implementations obsolete.  Doing so would not just be a burden on us, but more importantly, a burden on the community contributors.
 
What frustrates me is that you have a few "broken" rules in the benchmark. But that is irrelevant; this is your benchmark, not a community one, and you are free to make up rules as you see fit. But you should not be surprised when people complain about issues with your benchmark.

Complaining about benchmarks is commonplace and this one is no exception.  And you are right, we continue to have final authority on this particular benchmark project.  But I feel we have been open and engaging with the community.  The community contributions to this project are extensive.  My feeling is that saying this is not a community project is not fair to the community.

It is impossible to have community consensus on everything all the time.  Software development is an opinionated universe.

Sometimes consensus is more or less clear.  We recently changed the implementation approach classification of the Rapidoid implementation to Stripped based on feedback from the community.  Other times, it's not as clear and we have to make an executive decision.  We try to communicate the rationale for those decisions and understand that not everyone will agree.
 
Allowing a bulk update on a single table, but not allowing a bulk-reading implementation which supports multiple tables, makes no sense.

Your desire to bulk read required debating whether to allow bulk reading in the Updates test.  I was on the fence about this but we ultimately landed where we did in deference to leaving implementations as-is.  It is also perhaps worthwhile to know that the Updates test was originally derived from the Multi-query test.  While we decided to allow batch updates, we anticipated that implementations would leverage the existing implementation of the Multi-query test for the reads portion and then add writes.
 
In the end, which results would benefit the people looking at them more?

I feel the most benefit would be achieved by a greater diversity of test types.  It remains a goal of ours to diversify test types.
 
Sorry for not having too much understanding for your explanations.

No worries.  I don't know if we will end up agreeing here, but I do hope that you at least recognize our perspective.  I also invite anyone else in the community to join the conversation.

Daniel Nicoletti

unread,
Aug 23, 2016, 7:59:59 PM8/23/16
to Brian Hauer, framework-benchmarks
Hi Brian,

I looked at all the logs, and on the server side I only saw one line regarding
the uwsgi listen queue being full,
which ATM is 100; but since SOMAXCONN is 128, I'm not sure this was the
cause of the 14k errors.
Are there logs from the client side?

I know keep-alive is used, but uwsgi < 2.1 does not support it, and
uwsgi 2.1 is not released yet;
still, from my weighttp tests it runs fine omitting the -k option.
Nginx->uwsgi should support
keep-alive, but performance looks like it does when I run with it disabled (though
I'll try to investigate this
further).

Best,
--
Daniel Nicoletti

KDE Developer - http://dantti.wordpress.com

rik...@ngs.hr

unread,
Aug 24, 2016, 9:30:37 AM8/24/16
to framework-benchmarks
Hi Brian,


On Wednesday, August 24, 2016 at 12:41:52 AM UTC+2, Brian Hauer wrote:
We have removed and will continue to remove results from SQLite implementations for a similar reason—namely, such tests are avoiding a principal portion of the expected work.  We've also removed Redis implementations since we would prefer to have Redis implementations show up in a future caching-enabled test type.  NB: This has been and continues to be a manual process; we're working on automating more of this as time permits.

I don't see those as similar issues, because to me that falls under which databases are supported and which are not. But ok... let's move on.
 
 
What you are saying is that you want implementations to work hard and will not allow implementations to work smart.

Yes.  This is a benchmark exercise.  Working hard is precisely what we want.


I think that's not a good way to think about benchmarking. You want the best results, which are often a combination of both.
 
Working "smart" is encouraged but—as I am sure you can understand—working too smart is potentially dangerous for a benchmarking project.  Being too clever may mean intentionally avoiding the expected work load (e.g., an implementation that doesn't even talk to a database and just returns results that fool our validation tests) or unintentionally avoiding expected work (using an optimization that seems reasonable but we deem invalid for the test type).  Previously we have had to enforce very subtle things like requiring that the JSON implementations instantiate objects, rather than return a serialization of a single static object.  Enforcing some of these things may not even measurably affect the results, but we try to do so as we can to keep the results fair.

That said, I will admit that we're only so good at noticing every clever thing that may or may not be violating the spirit of the tests.  In large part we count on the generosity of the community to help us keep an eye on test implementations that may be playing a bit fast and loose with the rules.
 

There are certainly some irrelevant optimizations done in some products to look better on specific benchmarks. But I don't think you have a healthy attitude about how benchmarking should be done. I will expand on that later.
 
Here is the bottom-line: the multi-query test type has always been intended to require N round-trips to the database server as an external system.  Implementations that use SQLite are avoiding those round-trips to an external system.  An implementation that runs the N queries as a batch avoids making several round-trips.  An implementation that runs the N queries in a single Statement with multiple ResultSets avoids making several round-trips.  None of these are acceptable.

The legwork of communicating with an external database is in large part what we are concerned with in this test type.  If you can remove that legwork in a real-world application, that's great.  But we're measuring a scenario where you are required to make N round-trips to an external system.  That's just what the test is.


What irks me is that while in your mind it was always about doing N round trips to the database, from my point of view that is not a valuable result, and it was not specified in the rules either.
You might want to pull in an analogy that this is just a generic use case of calling N external services, but the name of the test is not "call to N external services".
If you create a test with such a name there would not be any problems with it.

The only valid point in banning Revenj from being displayed is that it might be "unfair" to the implementations which chose not to do a multiple-result-set implementation but could.
But the issue there is that Postgres doesn't support multiple result sets (so most implementations would not be able to do so).
There are workarounds, and I think it would be more valuable to the people looking at the results to be able to differentiate frameworks/libraries based on whether they support such a feature or not.
Otherwise, you are not really benching frameworks, but rather the network/databases.

When I asked you to show me examples of similar situations, I was hoping that you would show me an example where someone submitted a PR with bulk reading/multiple result sets (which did not break the rules as specified).
But I'm not aware of any. Therefore I don't think being unfair to others is really a valid point.
 

Don't take this the wrong way, but you should try and learn from constructive criticism.
Rules which specify how an implementation should behave are broken rules.

You may feel that way, but I feel otherwise.  These tests are required to use a database server—they are not permitted to generate responses that appear convincing to our validation tests without hitting a database.  The Fortunes test is always returning the same payload but it is required to repeatedly query the database for those fortune cookie messages that never change.  The JSON serialization test must incur the cost of instantiating an object or allocating memory.  And so on.  These are implementation details that we consider fundamental to the benchmarking exercise at hand.

Yes, a framework may have clever features that in some use-cases would avoid similar workload in real-world applications.  We simply do not have the necessary test type diversity (yet?) to demonstrate all of those cases.
 

I have authored several benchmarks and have submitted my solutions to various ones. This is the first time someone has accused me of cheating and changed the rules after my submission.
You are repeating some of your "wrong" views here on how benchmarking should be done, so let me expand on that.
Some people have also tried to explain some of this stuff to you before, but at best you came out of it with the conclusion that you are OK with making an exception for their case.

JSON rules which state that you have to create an instance of an object before passing it down to the JSON library are fundamentally broken.
First of all, creating an instance makes sense only in some languages.
Then some languages are more advanced and will optimize dead code away anyway.
Therefore the best you can do if you want to "enforce" such rules is to pass in an external variable.
With such a rule, all those rules which state how an implementation should behave are unnecessary.

This will lead to automated framework verification.
What if I decide to submit a solution in my own language, which only I understand?
How will you evaluate it?
Will you check the machine code to see if the solution abides by your implementation rules?

And it's the same with the other tests.
A test which states that you have to do N round trips to the database is a fundamentally broken one.
Especially when someone submits a solution which is more aligned with how you would solve that problem (as stated in the rules) in the real world.
 
But you are right, there is a large volume of code in the existing implementations that the community has contributed.  We prefer to not change test type requirements unless there is a necessary clarification.  Changing the Multi-query test to allow batching would change the intention of the test and render many/most implementations out of date.  We are averse to changes that will render many implementations obsolete.  Doing so would not just be a burden on us, but more importantly, a burden on the community contributors.

You are now changing the rules to disallow batching. So you are changing them anyway (you may not feel that way because you had implied it in your mind).
What I dislike most about this situation is that you hid it without asking the community for its opinion.
If you had shown Revenj among the results and asked the community what it thinks about that, most of my complaints would be null and void.
 
 
What frustrates me is that you have a few "broken" rules in the benchmark. But that is irrelevant; this is your benchmark, not a community one, and you are free to make up rules as you see fit. But you should not be surprised when people complain about issues with your benchmark.

Complaining about benchmarks is commonplace and this one is no exception.  And you are right, we continue to have final authority on this particular benchmark project.  But I feel we have been open and engaging with the community.  The community contributions to this project are extensive.  My feeling is that saying this is not a community project is not fair to the community.


Complaints are often valid, because it's very easy to create problems in benchmarks.
Round 12 was a mess. You did not share anything with the community.
This round is 6+ months overdue.
You ask the community for patience while you alone go through the setup.
People from the community offer their time for whatever task is needed, but you don't even reply to those offers.
That doesn't feel like a "community" project.
 
It is impossible to have community consensus on everything all the time.  Software development is an opinionated universe.


But benchmark results are not opinionated. They are just numbers, which say something.
Sometimes what they say is not what you were expecting.
 
Sometimes consensus is more or less clear.  We recently changed the implementation approach classification of the Rapidoid implementation to Stripped based on feedback from the community.  Other times, it's not as clear and we have to make an executive decision.  We try to communicate the rationale for those decisions and understand that not everyone will agree.

That's fine, but in this case there was not any discussion.
Even right now, the LWAN results, which use an "unapproved" DB, are shown, while the Revenj results are hidden.
And you hid the Revenj results only because you have issues with how much faster it is than the others.
You did not hide the Revenj.NET results, which use the same algorithms but run on a platform with a crappy GC, so they are nowhere near the top of the results.
I'm sure you will fix that mistake now.
But then again, you did not hide the Dropwizard results, which use Hibernate and implicit caching, again because they don't stand out.

While I don't have anything against Hibernate, since I did a benchmark of database drivers I have also built validation tests which fail when Hibernate is used
(e.g., change data in the database through some alternative method and then try to load it with Hibernate after that).
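
A minimal sketch of the kind of check being described, assuming JPA/Hibernate and a made-up Item entity: data is changed behind the ORM's back with plain JDBC, and an ORM read that is served from a cache will still report the old value.

import java.sql.Connection;
import java.sql.Statement;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.Id;

// Hypothetical entity used only for this sketch.
@Entity
class Item {
    @Id int id;
    int value;
}

// Hypothetical validation sketch: update a row behind the ORM's back,
// then re-read it through the ORM. A caching layer that is not invalidated
// will still return the stale value, which is what such a test is meant to catch.
public class StaleCacheCheck {
    public static boolean readIsStale(EntityManager em, Connection jdbc, int id) throws Exception {
        Item before = em.find(Item.class, id);            // primes the ORM / persistence-context cache
        int newValue = before.value + 1;

        try (Statement st = jdbc.createStatement()) {      // out-of-band change, bypassing the ORM
            st.executeUpdate("UPDATE Item SET value = " + newValue + " WHERE id = " + id);
        }

        Item after = em.find(Item.class, id);               // may be served from the cache
        return after.value != newValue;                     // true => the ORM returned stale data
    }
}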
 
 
Allowing a bulk update on a single table, but not allowing a bulk-reading implementation which supports multiple tables, makes no sense.

Your desire to bulk read required debating whether to allow bulk reading in the Updates test.  I was on the fence about this but we ultimately landed where we did in deference to leaving implementations as-is.  It is also perhaps worthwhile to know that the Updates test was originally derived from the Multi-query test.  While we decided to allow batch updates, we anticipated that implementations would leverage the existing implementation of the Multi-query test for the reads portion and then add writes.

Yes, a debate is what I was expecting. And I would be OK if the result of that debate were to disallow the Revenj submission.
But there was no debate, so I'm kind of forcing one onto you now.

If anything, I think multiple queries should not allow IN (ids), but updates should.
Only if you decide not to allow a bulk update which works on a single table should IN (ids) be disallowed in the updates test.
But that's my opinion and it's irrelevant to this discussion now.
 
 
In the end, which results would benefit the people looking at them more?

I feel the most benefit would be achieved by a greater diversity of test types.  It remains a goal of ours to diversify test types.

Frankly, I doubt this benchmark will get new tests anytime soon. But it has plenty of tests which could be even more valuable.
When you see the plaintext and JSON tests limited by bandwidth, you feel that there is something wrong with the setup.
But when you see the updates and multiple-queries tests with the same kind of results, you think that is OK because of the network round-trip overhead.
That's kind of strange.
If you clarify that those two tests are allowed to use batched queries which work on multiple tables (keeping the IN (ids) part of the rule), then you will see greater diversification among the results.
And that is a good thing.
 
 
Sorry for not having too much understanding for your explanations.

No worries.  I don't know if we will end up agreeing here, but I do hope that you at least recognize our perspective.  I also invite anyone else in the community to join the conversation.

I do understand your perspective, but I'm trying to explain to you the flaws in your reasoning (at least from my POV).
As I said, I'm fine if the "community thinks" my implementation is not an acceptable one (Revenj will still be on top, just not so much better than the competition).

Regards,
Rikard

Nick Kasvosve

unread,
Aug 24, 2016, 11:45:14 AM8/24/16
to framework-benchmarks
It seems there is also some confusion about the version of Java. On the dev platform, frameworks using Java 8 run just fine, but they failed on the preview run.

Can we address this please so that these frameworks can run successfully next time?

Do we need to create a pull request to downgrade to Java 7?



al...@lunabyte.io

unread,
Aug 28, 2016, 8:33:38 PM8/28/16
to framework-benchmarks
Why doesn't Phoenix/Elixir seem to be represented in the results anywhere?

al...@lunabyte.io

unread,
Aug 28, 2016, 8:37:01 PM8/28/16
to framework-benchmarks
Okay, it appears that you are running elixir 1.1 instead of 1.3, which caused compilation problems. Should one of us make a pull request to fix that or is that something you can do on your end?

bma...@techempower.com

unread,
Aug 29, 2016, 12:02:12 PM8/29/16
to framework-benchmarks
Version upgrades for languages and frameworks can be done by the community (and are more than welcome). In this particular case, Elixir has already been updated and should appear in the next set of results. It was merged in after the preview run results that you've referenced. See https://github.com/TechEmpower/FrameworkBenchmarks/pull/2214.

Thanks,
Brittany 

kne...@techempower.com

unread,
Sep 6, 2016, 2:53:34 PM9/6/16
to framework-benchmarks
Hey everyone, just a quick update on the status of the Round 13 Preview 2 test run. We did a run of the suite up to commit 2e8402fec98296e28e141a8135abff70c7da62bf on August 29th, but were unable to reproduce results that align with what we were seeing in Preview 1. It is possible that there was a change made between Preview 1 (August 11th) and Preview 2 that is affecting the test suite or the machines that we are testing on. We're currently investigating the source of the problem.

Sorry for the delay, but know that this is our top priority and we're hoping to fix the problem soon so we can get you Preview 2 results!

Fredrik Widlund

unread,
Sep 7, 2016, 6:22:09 PM9/7/16
to kne...@techempower.com, framework-benchmarks
IMHO the only way to produce a meaningful and realistic benchmark, and I've said this before, is to run on dedicated hardware on at least a 10GbE network (ideally multiple aggregated 10GbE). If you want to actually test the ability to scale, you will want at least 16 cores. Testing with 4 cores on a machine with hyper-threading will basically give you the same result as testing it on 2 cores, which is basically not scaling out at all. Saturating 1GbE when being nowhere near maximum throughput is even worse. Running on d3v2 instances will result in noisy-neighbour issues and unpredictable results, not just between benchmark runs, but also between candidates in each run.

Having said that please let's be agile, move forward, deliver iteratively, and improve incrementally.

Kind regards,
Fredrik Widlund


kne...@techempower.com

unread,
Sep 7, 2016, 6:37:18 PM9/7/16
to framework-benchmarks, kne...@techempower.com
Fredrik,

We actually have 2 test environments we are using for round 13: the d3v2 Azure instances we posted for the 1st preview and a high-performance, physical hardware environment (10GbE, 40 cores). We did not post the physical hardware results for the preview because there are still some kinks we're working out with it to ensure that we're getting the results we expect from the machines, but you can expect high-performance results when we release Round 13 Final.

I appreciate your attentiveness and concern with the environments and look forward to getting you those updated results soon!

Thanks,
Keith

Cody Lerum

unread,
Sep 7, 2016, 6:44:22 PM9/7/16
to kne...@techempower.com, framework-benchmarks
There are a couple of Round 13 PRs that don't appear to have been merged.

https://github.com/TechEmpower/FrameworkBenchmarks/pulls?q=is%3Apr+is%3Aopen+label%3ARound-13

-C

Fredrik Widlund

unread,
Sep 7, 2016, 6:44:29 PM9/7/16
to kne...@techempower.com, framework-benchmarks
Keith,

That is truly great and much appreciated news! I look forward to the results with newfound inspiration!

Kind regards,
Fredrik

Hiram Abiff

unread,
Sep 27, 2016, 5:02:31 AM9/27/16
to framework-benchmarks
Why is there no data for Single-Query, Multiple-Queries, Fortunes & Data-Updates for the aspnetcore-linux framework?
Is there no code (yet) for these tests?
I was looking forward to seeing how .NET Core compares with other frameworks on Linux.
From what I see in JSON-serialization & plaintext, it's somewhere between mid-field and the bottom quarter.
I find it strange that it can perform better on JSON serialization than when returning a plaintext file...

nat...@microsoft.com

unread,
Oct 3, 2016, 2:09:37 PM10/3/16
to framework-benchmarks
We recently finished adding database tests to our benchmark repo (https://github.com/aspnet/benchmarks) but haven't yet ported them over to the TechEmpower repo.  This will happen for round 14.

As for preliminary results, we have merged a few configuration tweaks that we expect to greatly improve the initial results.

Andrei Neculai

unread,
Oct 18, 2016, 10:33:15 AM10/18/16
to framework-benchmarks
Hi!

Are there any updates on the date of a preview or round 13?

Brian Hauer

unread,
Nov 3, 2016, 7:13:51 PM11/3/16
to framework-benchmarks
A second preview for Round 13 is now available for review and sanity checks.

https://www.techempower.com/benchmarks/previews/round13/

The preview run was started on October 31 within our new physical hardware environment at ServerCentral.  This new environment varies from the previous physical hardware, so it will not be directly comparable to Round 12.  The application server is a Dell R910 (4x 10-Core E7-4850 CPUs) and the database server is a Dell R420 (2x 4-Core E5-2406 CPUs).

Logs for this second preview can be found at the following location:

http://tfb-logs.techempower.com/round-13/preview-2/

We are attempting to conclude Round 13 within approximately 1 to 2 weeks (posting the results around mid-November), so if you have any last minute fix PRs, please submit those as soon as possible.

Thanks everyone for your patience!

Daniel Nicoletti

unread,
Nov 3, 2016, 8:19:03 PM11/3/16
to Brian Hauer, framework-benchmarks
Thanks for the preview,

Sadly I don't have hardware with so many cores, and in such a case I'd probably
need a smart load balancer, as I believe only the first workers are handling
most connections. I believe in this case Nginx as the front end to my
framework would give better results, but for some reason all the tests failed,
even though they passed the local and Travis tests and were also OK in preview 1.

Any way of debugging this?
Server cutelyst-nginx: 2016/10/27 21:22:15 [crit] 24524#0: *7
connect() to unix:/tmp/uwsgi.sock failed (2: No such file or
directory) while connecting to upstream, client: 204.93.249.210,
server: localhost, request: "GET /json HTTP/1.1", upstream:
"uwsgi://unix:/tmp/uwsgi.sock:", host: "localhost"

as the uwsgi log shows it did create the file.

Best,

Nikolche Mihajlovski

unread,
Nov 4, 2016, 5:48:45 AM11/4/16
to Daniel Nicoletti, Brian Hauer, framework-benchmarks
After the long wait, I am very disappointed to see that someone corrupted Rapidoid's benchmark configuration, so the tests were not executed:

https://github.com/TechEmpower/FrameworkBenchmarks/commit/88e374bccee0dbc84881005449463b12359ed868

Fredrik Widlund

unread,
Nov 4, 2016, 6:05:38 AM11/4/16
to Brian Hauer, framework-benchmarks
How is it possible that the JSON benchmark tops out at 500k, when in rounds 11-12 it topped out at 2+M? Or that plaintext is less than half of rounds 11-12? Is this sanity checked?

Kind regards,
Fredrik

Brian Hauer

unread,
Nov 4, 2016, 11:31:14 AM11/4/16
to framework-benchmarks, dant...@gmail.com, teona...@gmail.com
Hi Nikolche,

We'll investigate that.  A fairly significant amount of effort has gone into cleaning up the metadata about the framework tests prior to this Preview run, so there may have been a mistake in this description file.  We'll work to get it corrected before the final run.

Brian Hauer

unread,
Nov 4, 2016, 11:35:02 AM11/4/16
to framework-benchmarks, teona...@gmail.com
Hi Fredrik,

At the end of Round 12, we were notified by the hosting company who had been providing us with physical hardware that the hardware was being decommissioned.  The final run of Round 12 was a bit rushed as a result.

Since then, we have received hardware from a new partner, ServerCentral.  The hardware is not the same as Round 12.  While it's still on 10-gigabit Ethernet, the CPUs are older.

I have posted about this change previously: https://groups.google.com/d/msg/framework-benchmarks/IiDBC6l1QuQ/gDyw6OR4BwAJ

Nick Kasvosve

unread,
Nov 4, 2016, 12:20:43 PM11/4/16
to Brian Hauer, framework-benchmarks
May I ask, when is the final run scheduled for?

Regards,

Nick


Brian Hauer

unread,
Nov 4, 2016, 12:42:37 PM11/4/16
to framework-benchmarks, teona...@gmail.com
Hi Nick,

We are attempting to conclude Round 13 within approximately 1-2 weeks.  I'll create a separate thread to make this more prominent.

Daniel Nicoletti

unread,
Nov 4, 2016, 1:04:55 PM11/4/16
to Brian Hauer, framework-benchmarks
So both the Cutelyst and Bottle frameworks have the same errors
with nginx connecting to the uwsgi socket; it looks like the new
server is using namespaced temporary directories.

This is odd, as running verify in vagrant mode works
but benchmark doesn't, so I guess there is some change
between these modes. I'll be pushing patches to Cutelyst to use
/var/tmp, as /run wasn't writable by testuser.

Nick Kasvosve

unread,
Nov 4, 2016, 1:08:02 PM11/4/16
to Daniel Nicoletti, Brian Hauer, framework-benchmarks
Thanks Brian.

Would appreciate another preview run before the final run, assuming it does not take a great deal of your resources.

Thanks again,

Nick



Brian Hauer

unread,
Nov 4, 2016, 1:17:10 PM11/4/16
to framework-benchmarks, dant...@gmail.com, teona...@gmail.com
Hi Nick,

Understood.  We'll keep running tests and we'll post updates whenever a run completes.

Brian Hauer

unread,
Nov 11, 2016, 12:21:52 PM11/11/16
to framework-benchmarks
We have posted another preview of Round 13.  This one is from our new Azure environment.  Note that we are aware that a large number of the MongoDB tests failed in this run and are investigating.

https://www.techempower.com/benchmarks/previews/round13/azure.html

Logs for this preview:

http://tfb-logs.techempower.com/round-13/preview-3/

We may have one more preview, but this also may be the final preview for this round.

Nick Kasvosve

unread,
Nov 12, 2016, 11:41:34 AM11/12/16
to Brian Hauer, framework-benchmarks
I see this in my logs:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'world0_.random_number' in 'field list'

Which is odd because the framework builds and works just fine locally and on Travis.

Would somebody kindly post the database schema for the most recent Preview run please.

Thanks

Nick
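
One common cause of this particular error, offered here only as an assumption rather than a confirmed diagnosis, is an ORM naming strategy that maps the camelCase property randomNumber to a snake_case column while the schema uses a camelCase column name. A hypothetical entity mapping that pins the column name explicitly would look like this:

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical sketch: pin the column name explicitly so the ORM's naming
// strategy cannot rewrite randomNumber into random_number.
@Entity
@Table(name = "World")
public class World {
    @Id
    private int id;

    @Column(name = "randomNumber")   // assumes the schema's camelCase column name
    private int randomNumber;

    public int getId() { return id; }
    public int getRandomNumber() { return randomNumber; }
}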


Fredrik Widlund

unread,
Nov 14, 2016, 4:59:24 AM11/14/16
to Brian Hauer, framework-benchmarks
Hi Brian,

Could you give us any update on whether this was the final preview or not? This would be very valuable to know.

Kind regards,
Fredrik


zloster

unread,
Nov 14, 2016, 10:43:06 AM11/14/16
to framework-benchmarks
About the /round-13/preview-3 data.
It seems there is a problem with the toolset scripts:

190 occurrences of "sudo: Argument list too long":
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/ulib/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/asyncio/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/asyncio-json/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/cutelyst-thread/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/cutelyst-mysql-raw/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/fintrospect/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/cutelyst/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/cutelyst-nginx/out.txt:/bin/sh: 1: sudo: Argument list too long
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/ulib-mysql/out.txt:/bin/sh: 1: sudo: Argument list too long


190 occurrences of "Could not empty /tmp":
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/cutelyst/out.txt: Error: Could not empty /tmp
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/cutelyst-nginx/out.txt: Error: Could not empty /tmp
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/ulib-mysql/out.txt: Error: Could not empty /tmp
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/cutelyst-nginx-postgres-raw/out.txt: Error: Could not empty /tmp
./tfb-logs.techempower.com/round-13/preview-3/updates/logs/asyncio-plaintext/out.txt: Error: Could not empty /tmp

Brian Hauer

unread,
Nov 15, 2016, 4:43:04 PM11/15/16
to framework-benchmarks, teona...@gmail.com
Hi Fredrik,

There will be no more previews for Round 13.  In fact, Round 13's ETA is now tomorrow (Nov 16).

That said, we have our new hardware environment set up to run and capture results more quickly going forward, so we hope that this will really cut down on the time between rounds.


On Monday, November 14, 2016 at 1:59:24 AM UTC-8, Fredrik Widlund wrote:
Hi Brian,

Could you give us any update on whether this was the final preview or not? This would be very valuable to know.

Kind regards,
Fredrik

Ludovic Gasc

unread,
Nov 15, 2016, 4:51:10 PM11/15/16
to Brian Hauer, framework-benchmarks
I confirm I have some strange logs for the Python-AsyncIO test suite; that's why I have bad results on some tests.

OK, to have a Round 14 more or less quickly, I'm working on a rewrite of the Python-AsyncIO test suite.

--
Ludovic Gasc (GMLudo)

Shawn Bandy

unread,
Nov 15, 2016, 5:35:05 PM11/15/16
to framework-benchmarks, mo...@edno.moe
Hi Zloster,

This is a known issue and will be fixed shortly in round 14.  The good news is that it does not impact the gathering of benchmarking data at all.

-Shawn

Shawn Bandy

unread,
Nov 15, 2016, 5:58:56 PM11/15/16
to framework-benchmarks, teona...@gmail.com
Hi Nick,

The database schema is initialized with https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/config/create.sql

Which framework are you working on?

-Shawn

Nick Kasvosve

unread,
Nov 16, 2016, 10:32:56 AM11/16/16
to Shawn Bandy, framework-benchmarks, Brian Hauer
Thanks Shawn. I found the issue. It was my mistake.

On Tue, Nov 15, 2016 at 10:58 PM, Shawn Bandy <sba...@techempower.com> wrote:
Hi Nick,

The database schema is initialized with https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/config/create.sql

Which framework are you working on?

-Shawn


On Saturday, November 12, 2016 at 8:41:34 AM UTC-8, Nick Kasvosve wrote:
I see this in my logs:

com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Unknown column 'world0_.random_number' in 'field list'

Which is odd because the framework builds and works just fine locally and on Travis.

Would somebody kindly post the database schema for the most recent Preview run please.

Thanks

Nick
On Fri, Nov 11, 2016 at 5:21 PM, Brian Hauer <teona...@gmail.com> wrote:
We have posted another preview of Round 13.  This one is from our new Azure environment.  Note that we are aware that a large number of the MongoDB tests failed in this run and are investigating.

https://www.techempower.com/benchmarks/previews/round13/azure.html

Logs for this preview:

http://tfb-logs.techempower.com/round-13/preview-3/

We may have one more preview, but this also may be the final preview for this round.
