Also, this sort of bottleneck seems like it would be a pretty common occurrence. In your experience, is it typical that scaling an Elixir/Erlang app is mainly a matter of discovering worker servers that are getting bogged down and switching them to a lighter-weight, non-blocking approach? Or is it typical/advisable to architect that way from the beginning?
- Have I architected this application best for my purposes? Currently I have a gen server for msa meta info (mysql) and msa gis info (postgres). The controller for the web request simply farms out the work to those servers and sends back the responses. So as far as I can understand it, the main scalability issue would be to make sure each data provider doesn't get blocked up. Or is there something I'm missing? Are there other architectures that may work better for this kind of use case?
- How would I get the web controller to call for the meta and gis info asynchronously? In this example the queries are quick, so it's less of a direct concern, but in the API gateway we're speccing out, some data pulls can be pretty sluggish, so async would be necessary.
- I ended up in a situation where I have to call Msas.Meta.get(Msas.Meta, code) in the web controller (app/msas_server/lib/router.ex). As I understand it, the first Msas.Meta is the actual module, the second is the name I gave for the supervised process in apps/msas/lib/supervisor.ex. Why the redundancy? Is that necessary, or did I overcomplicate things?
Thanks for any time you have to dig through. Overall I've really enjoyed the architecture that Elixir seems to be pushing toward, and it's proven to be a lot more accessible than Erlang itself (I tried this same exercise there). The main concern I have at this point is getting a little lost in how heavy-weight OTP feels, at least at first.
--
You received this message because you are subscribed to the Google Groups "elixir-lang-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-talk/42cddc1b-4dbe-4c50-a579-53d630ef4e76%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I'm in the process of evaluating different concurrency-based languages (mainly node, Go, and Erlang/Elixir) for an API Gateway service, which will translate a single request into multiple requests to backend microservices. For evaluation, my team is building a tiny app in a few languages and testing it out in terms of code readability, scalability, tooling, etc. I've got a version working in Elixir, but I immediately ran into a scaling problem because of what I think is a misuse of OTP, so I'd love to get some design pattern pointers to see how difficult it is to get it resolved.

The only requirement for the app currently is to take in an MSA code and respond with JSON like this: { "code": "10100", "name": "Aberdeen, SD", "centroid": {"long": -98.696, "lat": 45.521} }. To try and replicate our actual environment, the name for the code comes from a mysql database, while the centroid comes from a postgres database. I set up the app as an umbrella app, first getting the core data-gathering functionality working - msas - then I added a web server using plug - msas_server.

The working code is here: https://github.com/dlinnemeyer/elixir-msas-app

The main problem I run into is that as we scale up concurrency, we very quickly start running into timeouts when calling Msas.Meta.get(). This sends a call to the Msas.Meta gen server. From what I can tell, this seems to be happening because there's only a single Msas.Meta worker process running (see apps/msas/lib/supervisor.ex), and it processes each query one at a time. Is this likely where the problem is? How would I go about solving this? My gut feeling would be that there'd be two approaches:
    defmodule Msas.Meta do
      ## Client API

      def get(msa) do
        query = """
        SELECT code, name
        FROM def_us_area_2014_4
        WHERE level = 'MSA' AND code = ?
        """

        %{rows: [result | _]} = Ecto.Adapters.SQL.query(Msas.MySQL, query, [msa])
        {areaid, name} = result
        %{areaid: areaid, name: name}
      end
    end
This is exactly the problem.

My answer would be: simply remove the GenServer.

Ecto already provides a pool of workers. Accessing it from a GenServer means you are serializing the access to the whole pool. Your get function should directly access Ecto:

    defp get_meta(msa) do
      query = """
      SELECT code, name
      FROM def_us_area_2014_4
      WHERE level = 'MSA' AND code = ?
      """

      %{rows: [result | _]} = Ecto.Adapters.SQL.query(Msas.MySQL, query, [msa])
      {areaid, name} = result
      %{areaid: areaid, name: name}
    end

In other words, both Postgres and MySQL adapters already provide their GenServers that are properly pooled by Ecto. You don't need to do (or even redo) this work by putting it in a GenServer.
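To make the serialization point concrete, here is a minimal sketch that does not touch Ecto at all. `SlowLookup`, `SerializedMeta` and `Compare` are hypothetical names, and the 100 ms sleep is only an assumed stand-in for a pooled database query. It times ten concurrent callers going through one GenServer against ten callers invoking the function directly.

```elixir
defmodule SlowLookup do
  # Stand-in for a pooled database query; the 100 ms sleep is an assumption.
  def run(code) do
    Process.sleep(100)
    %{areaid: code, name: "name for #{code}"}
  end
end

defmodule SerializedMeta do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def get(server, code), do: GenServer.call(server, {:get, code})

  @impl true
  def init(:ok), do: {:ok, nil}

  # The slow work runs inside the server process, so calls are handled one at a time.
  @impl true
  def handle_call({:get, code}, _from, state) do
    {:reply, SlowLookup.run(code), state}
  end
end

defmodule Compare do
  # Runs `fun` from `n` concurrent callers and returns the elapsed milliseconds.
  def time(n, fun) do
    {micros, _results} =
      :timer.tc(fn ->
        1..n
        |> Enum.map(fn i -> Task.async(fn -> fun.(i) end) end)
        |> Enum.map(&Task.await/1)
      end)

    div(micros, 1000)
  end
end

{:ok, server} = SerializedMeta.start_link()
via_server = Compare.time(10, fn i -> SerializedMeta.get(server, i) end)
direct = Compare.time(10, fn i -> SlowLookup.run(i) end)
IO.puts("through one GenServer: #{via_server} ms, direct: #{direct} ms")
```

With these assumed numbers the GenServer path should take roughly ten times as long as the direct path, because each caller waits for every call queued ahead of it.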
I answer this in more detail below, but there is no such thing as a "non-blocking approach" in Elixir as you see in most other languages. Because everything happens in tiny processes, if one tiny process is waiting on IO, the VM is simply going to choose another tiny process to run. There is no way one process can directly block another process due to CPU usage or because it is waiting on IO. So non-blocking is the default.

In any case, your observation is right that part of scaling is a matter of discovering worker servers that are getting bogged down. Luckily the VM and tooling make them extremely easy to find. The solutions will vary though. If the bottleneck is because of data access, then ETS may be a solution. If it is because of serial work that could be parallelized, then it is another solution, and so on.
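One rough way to spot a bogged-down process from a running node is to look for long mailboxes. The sketch below is only an illustration: `Mailboxes` is a hypothetical helper, while `Process.list/0` and `Process.info/2` are standard library calls. The graphical `:observer.start()` tool exposes similar information.

```elixir
defmodule Mailboxes do
  # Returns the `n` processes with the longest message queues as {pid, length} tuples.
  def busiest(n \\ 5) do
    Process.list()
    |> Enum.map(fn pid -> {pid, Process.info(pid, :message_queue_len)} end)
    # Process.info/2 returns nil for a process that has exited in the meantime.
    |> Enum.flat_map(fn
      {pid, {:message_queue_len, len}} -> [{pid, len}]
      {_pid, nil} -> []
    end)
    |> Enum.sort_by(fn {_pid, len} -> -len end)
    |> Enum.take(n)
  end
end

# Simulate an overloaded worker: a process that never reads its mailbox.
stuck = spawn(fn -> Process.sleep(:infinity) end)
for i <- 1..1_000, do: send(stuck, {:work, i})

[{top_pid, top_len} | _rest] = Mailboxes.busiest(3)
IO.inspect({top_pid, top_len})
```

A process whose queue keeps growing under load is usually the one that callers are timing out on.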
You don't need to call anything asynchronously. Let's suppose you have a long request that is waiting on the database. While that particular Elixir process waits on the database I/O, the VM will be able to schedule other Elixir processes to do the work. You absolutely don't need to worry about this. Elixir will be able to efficiently distribute requests, regardless if they are IO or CPU bound. Every request is already running on its own Elixir process.
Although you don't need a GenServer in this particular case, that is indeed necessary with a GenServer. The module is about code. The name is one of the ways to identify the process that is running the code so you can send commands to it. They are orthogonal abstractions.

I actually mentioned a couple of days ago that most languages allow you to think about how you organize and structure your code (modules, packages, classes). Very few languages give you an abstraction for how you interact with the code and the data in the runtime system, and processes offer exactly that.
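A small sketch of why the two are separate: the same module can back several running processes, each with its own name and state. `Counter` and the names `:page_views` and `:signups` are made up for illustration.

```elixir
defmodule Counter do
  use GenServer

  # `name` identifies which running process to start or talk to;
  # the module only says which code that process runs.
  def start_link(name), do: GenServer.start_link(__MODULE__, 0, name: name)
  def increment(server), do: GenServer.call(server, :increment)

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}
end

{:ok, _} = Counter.start_link(:page_views)
{:ok, _} = Counter.start_link(:signups)

Counter.increment(:page_views)
views = Counter.increment(:page_views)
signups = Counter.increment(:signups)
IO.inspect({views, signups})
# prints {2, 1}
```

When there will only ever be one instance, a common idiom is to register the process under the module name and have the client functions pass `__MODULE__` themselves, which hides the repetition from callers.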
Welcome! Nice to hear you are enjoying it.

One final note: we have uncovered some bottlenecks with the mariaex driver for MySQL on some occasions. Fixes are coming this weekend. Just a heads up in case the benchmark is below your expectations.
Maybe the comparison would be singletons here then? So the question would be how I'd call Msas.Meta.get without passing in the reference to its instance. But you already answered that, moving away from the gen server structure.
Okay, that makes sense. But are there general pointers on when to use a gen server and when to build a module with functions to call? Is the main distinction whether you have any server state to keep track of?
In any case, your observation is right that part of scaling is a matter of discovering worker servers that are getting bogged down. Luckily the VM and tooling make it extremely easy to find. The solutions will vary though. If the bottleneck is because of data access, then ETS may be a solution. If it is because of serial work that could be parallelism, then it is another solution and so on.

What sort of tooling should I check into for this? One thing we're trying to do is get a handle on what debugging looks like in Elixir/Erlang vs. Go, since that seems to be one of the most important comparisons. We won't be producing tons of concurrent apps, so most of our time will probably be in maintenance and optimization.
You don't need to call anything asynchronously. Let's suppose you have a long request that is waiting on the database. While that particular Elixir process waits on the database I/O, the VM will be able to schedule other Elixir processes to do the work. You absolutely don't need to worry about this. Elixir will be able to efficiently distribute requests, regardless if they are IO or CPU bound. Every request is already running on its own Elixir process.

Yeah, I'm not so much worried about that aspect of it, since I assumed plug/cowboy was handling each request on a separate process. I want to dig into async to get a speedup for the client. In our actual target API, we'll be firing off potentially 5-6 calls, several of which could be slow, and there's no reason we couldn't do them in parallel, and it'd save the client a good bit of time.

This is actually the main use case for which I'm comparing concurrent languages. It's not so much for scalability - though that's a nice side benefit - it's mainly for finding solid code architecture for handling concurrency within an individual request.

So if that's the case, what's the typical pattern? Something like what Alex recommends, with Task.async? Or are there other patterns to be aware of?
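For the within-one-request case described here, a minimal sketch with `Task.async/1` and `Task.await/2` might look like the following. `Gateway`, `fetch_meta/1` and `fetch_centroid/1` are hypothetical stand-ins for the MySQL and Postgres lookups, and the 200 ms sleeps are assumed latencies.

```elixir
defmodule Gateway do
  # Hypothetical stand-ins for the two backend lookups; the sleeps simulate latency.
  def fetch_meta(code) do
    Process.sleep(200)
    %{code: code, name: "Aberdeen, SD"}
  end

  def fetch_centroid(_code) do
    Process.sleep(200)
    %{long: -98.696, lat: 45.521}
  end

  # Starts both lookups at once, then waits for each result.
  def lookup(code) do
    meta_task = Task.async(fn -> fetch_meta(code) end)
    centroid_task = Task.async(fn -> fetch_centroid(code) end)

    meta = Task.await(meta_task, 5_000)
    centroid = Task.await(centroid_task, 5_000)
    Map.put(meta, :centroid, centroid)
  end
end

{micros, result} = :timer.tc(fn -> Gateway.lookup("10100") end)
IO.inspect(result)
IO.puts("took #{div(micros, 1000)} ms")
```

Under these assumptions the total time should be close to the slower of the two lookups rather than their sum. The timeout passed to `Task.await/2` is per task, so a real gateway would need to decide what to return when one backend is slow or fails.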
Also, on this level, what do you feel is Elixir/Erlang's main benefit over, say, Go? There I could pretty easily spawn up a couple processes, wait on channels, and respond. On the code architecture level, if the benefit is mainly in the area of something like OTP, I'm trying to understand when and where that impacts my code. Or do you see advantages on the more primitive level of basic async operations, too?
Although you don't need a GenServer in this particular case, that is indeed necessary in a GenServer. The module is about code. The name is one of the ways to identify the process that is running the code so you can send commands to it. They are orthogonal abstractions.

I have actually mentioned a couple days ago that most languages allow you to think how you organize and structure your code (modules, packages, classes). Very few languages give you an abstraction about how you interact with the code and the data in the runtime system and processes offer exactly it.

This part is hard to adjust to. I think I like it, but it creates some strange parallels. In OO code, you care about a class, and you instantiate it into an object that you can carry around. Gen servers feel like this; you initialize them and get a process ID back that you can pass in as the first parameter to further function calls. So even though the underlying code is very different, I'm running into strange similarities. Can't tell if that's just me over-translating from an OO background.
Needing state is a good starting point. For example, let's suppose some information from the database never changes. You could load it and store it in a GenServer so you don't need to do a database roundtrip every time.
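A minimal sketch of that idea, assuming the data fits in memory and never changes: `MetaCache` and its `loader` argument are hypothetical, with `loader` standing in for the one-time database query.

```elixir
defmodule MetaCache do
  use GenServer

  # `loader` is a hypothetical zero-arity function returning a map of code => name.
  def start_link(loader), do: GenServer.start_link(__MODULE__, loader, name: __MODULE__)
  def get(code), do: GenServer.call(__MODULE__, {:get, code})

  # The "database roundtrip" happens once, when the process starts.
  @impl true
  def init(loader), do: {:ok, loader.()}

  # Later lookups are served from the state held by the process.
  @impl true
  def handle_call({:get, code}, _from, names) do
    {:reply, Map.fetch(names, code), names}
  end
end

{:ok, _pid} = MetaCache.start_link(fn -> %{"10100" => "Aberdeen, SD"} end)
hit = MetaCache.get("10100")
miss = MetaCache.get("99999")
IO.inspect({hit, miss})
# prints {{:ok, "Aberdeen, SD"}, :error}
```

Lookups still pass through a single process, but each one is an in-memory map fetch rather than a query. If even that became a hot spot, ETS (mentioned earlier in the thread) is the usual next step.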
For your questions regarding running it in production and what Elixir brings to the table, I recommend two things:

* A quick introduction on the pieces and how they fit together: http://stackoverflow.com/questions/30422184/where-does-elixir-erlang-fit-into-the-microservices-approach/30423183
* Read Erlang in Anger, a fantastic free book about running Erlang in Production: http://www.erlang-in-anger.com/
If they are named processes, a comparison with a singleton is correct from the code organization perspective. But the difference, really, is the runtime abstraction of a process with all the amazing monitoring, introspection and resilience built into it. That's the main difference over the mainstream OO languages and Go imo. :)
So in that case, is there a standard method to avoid bottlenecking on that gen server? A pool of genservers? A non-blocking (in itself) genserver (with :noreply and handle_info)? Or does it just depend too much on the details? In your experience, is this a common problem to run into? We're trying to play with the language to try to identify anticipated pain points, but I don't want to fixate on something that really just isn't a huge deal.
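One of the options raised here, a GenServer that does not block while work is in flight, can be sketched with `{:noreply, state}` plus `GenServer.reply/2`. This is only an illustration under assumptions: `AsyncMeta` is a hypothetical module, the 100 ms sleep stands in for slow work, and the reply is sent from a separate task rather than from `handle_info`.

```elixir
defmodule AsyncMeta do
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def get(server, code), do: GenServer.call(server, {:get, code})

  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:get, code}, from, state) do
    # Hand the slow work to a separate process and return immediately,
    # so the server can accept the next call while this one is in flight.
    Task.start(fn ->
      Process.sleep(100)
      GenServer.reply(from, %{areaid: code})
    end)

    {:noreply, state}
  end
end

{:ok, server} = AsyncMeta.start_link()

{micros, results} =
  :timer.tc(fn ->
    1..10
    |> Enum.map(fn i -> Task.async(fn -> AsyncMeta.get(server, i) end) end)
    |> Enum.map(&Task.await/1)
  end)

IO.inspect(Enum.map(results, & &1.areaid))
IO.puts("took #{div(micros, 1000)} ms")
```

The caller still sees an ordinary synchronous `GenServer.call/2`. The trade-off in this sketch is that the task is unsupervised, so if it crashes the caller only finds out when its call times out.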
For example, in Go, you can catch panics from your whole web server, meaning a failure in a controller won't crash the server. Of course it's pretty far back, so I'd imagine it's caught later, but still not catastrophic? Then you have other portions of your app, maybe a process that updates some metadata in an ETS table every few minutes, not part of the web server. In Erlang, a crash there is easily caught and the process restarted, which is nice, but it obviously doesn't prevent a bug that corrupts data in the ETS table and only crashes later, after the corruption. Those, I'd imagine, are roughly equivalent?
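The restart half of this scenario can be sketched as follows. `Refresher` is a hypothetical stand-in for the periodic updater, and the `:boom` cast simulates the bug. The sketch only shows that a supervisor replaces the crashed process; it says nothing about repairing data that was already written elsewhere.

```elixir
defmodule Refresher do
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok), do: {:ok, %{}}

  # Simulated bug: this cast always crashes the process.
  @impl true
  def handle_cast(:boom, _state), do: raise("simulated bug")
end

{:ok, _sup} = Supervisor.start_link([Refresher], strategy: :one_for_one)

first = Process.whereis(Refresher)
ref = Process.monitor(first)
GenServer.cast(Refresher, :boom)

# Wait for the crash itself rather than guessing how long it takes.
receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :ok
end

# Give the supervisor a moment to start the replacement.
Process.sleep(100)
second = Process.whereis(Refresher)
IO.inspect({first, second})
```

The crash is logged and a fresh process appears under the same name with a clean initial state, which is why the restart helps with transient faults but not with state that was corrupted before the crash.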
A couple of other misc. questions. Sasa, you mentioned working with some sample Go code and killing a web server with a division by zero error. It's my understanding that you can wrap your web server and recover from any panics caused by goroutines called within any controllers. I think that's how it panned out in our testing, but we only handled, I think, one or two examples. Am I right about that, or are you talking about non-recoverable errors?
Also, does anyone have any insights into training/hiring for Elixir/Erlang vs. Go? Elixir seems very small in itself, but I'd guess you could potentially hire Erlang devs for an Elixir project? Or train people with a background in Ruby, Python, PHP, etc.? My guess is that concurrency in general is in high demand, so I'm expecting that it'd be relatively difficult regardless of language. I'd also expect training to be comparatively difficult regardless of language, given how complicated concurrent systems seem to be. But I'd definitely like to take it into account if Go has a leg up here.