How can we improve Lift's stateless/cloud story?


Joe Barnes

Feb 4, 2015, 10:40:19 PM
to lif...@googlegroups.com
Over this past holiday season, as several of you know, I worked on migrating my blog from WordPress to a Lift application.  While going through this I put a lot of thought into problems we have experienced at work with our Lift application deployment on AWS Elastic Beanstalk.  One of the issues that inevitably arises is the use of server-side session state.  As a result I've put a lot of thought into how Lift can be used to build a stateless server despite how most of us typically lean on server-side session state to build our applications.  I hadn't been quite ready to strike up a thread on this until today when I saw Alexandre's thread.  He is asking the sorts of questions I've been thinking about, so it seems like a good time to start thinking out loud about this with everyone.

I don't want to fixate too much on state, though.  While server-side state can create challenges, state in and of itself is not a core issue. I'd like to ask the broader question: how can we make Lift ideal for today's expectations of web applications on the cloud?  For starters, some key expectations that come to my mind are:
  • Zero downtime 
  • Continuous deployment of updates
  • Multiple application instances
I'm by no means suggesting a Lift application cannot meet the above expectations.  My blog does all of the above.  Our application at work... not so much.  It took me some effort and care to make sure my blog application could meet those expectations, even though it is just static content with some ajax for lazy-loading. I think the reason for the extra effort is that Lift's happy path is stateful.  Had I not been thinking about the above concerns, I would have tossed in some SHtml.ajaxCall with a JsCmds.SetHtml and called it a day.  Instead, I set up a RestHelper and so forth so that even if the server was swapped out after page load, those ajax requests would still work.
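
To make that difference concrete, here is a minimal, framework-free sketch in plain Scala. It is not Lift's actual implementation; `ServerInstance`, `bindCallback`, and the `/api/posts` path are hypothetical names. It models a session-bound callback (in the spirit of `SHtml.ajaxCall`) as a function stored under a random ID in one instance's memory, and a REST-style endpoint (in the spirit of a `RestHelper` route) as a path that every instance knows from startup.

```scala
import java.util.UUID
import scala.collection.mutable

// Hypothetical model of one running application server.
final class ServerInstance {
  // Callbacks bound while rendering a page live only in this instance's memory.
  private val callbacks = mutable.Map.empty[String, String => String]

  // Routes are part of the code, so every instance has the same ones.
  private val routes: Map[String, String => String] =
    Map("/api/posts" -> ((page: String) => s"posts for page $page"))

  // Stateful style: register a closure and hand its random ID to the browser.
  def bindCallback(f: String => String): String = {
    val id = "F" + UUID.randomUUID().toString.replace("-", "")
    callbacks(id) = f
    id
  }

  def invokeCallback(id: String, arg: String): Option[String] =
    callbacks.get(id).map(f => f(arg))

  // Stateless style: look the handler up by its stable path.
  def invokeRoute(path: String, arg: String): Option[String] =
    routes.get(path).map(f => f(arg))
}

object ServerSwapDemo {
  val oldServer = new ServerInstance
  val callbackId: String =
    oldServer.bindCallback((page: String) => s"posts for page $page")

  // The load balancer now sends the browser's next request to a fresh instance.
  val newServer = new ServerInstance

  val callbackOnOld: Option[String] = oldServer.invokeCallback(callbackId, "2")
  val callbackOnNew: Option[String] = newServer.invokeCallback(callbackId, "2")
  val routeOnNew: Option[String] = newServer.invokeRoute("/api/posts", "2")
}
```

In this model the callback ID is meaningless to the replacement instance, while the route keeps working, which is the trade-off described above.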

I feel that the first challenge we face is this happy path.  Again, it's not that Lift cannot be stateless, but much of the tooling and documentation in place guides developers down a stateful path.  One idea I've had here is to work towards documentation, blog posts, whatever, that show the best ways to use Lift without requiring server-side state.  I had in mind to help the Lift community with the blog series I've started where I'm sharing how Lift applications (among other things) can be deployed using techniques like immutable infrastructure and baking virtual machines.  By the time I was deploying my blog, I felt we needed not only tips on deploying the applications, but also guidance on which parts of Lift you should/shouldn't use.

Our challenge does not seem to be only in education. I think we still face some technical challenges.  Antonio's work on comet rehydration comes to mind.  Perhaps there are other areas that need some code love.  It seems likely there are other facilities we could add to the framework to make a stateless path easier/more natural.

To echo what Alexandre said, I feel that Lift's greatest (technical) strengths are the view-first paradigm, the snippets, and the templating.  I'd add comet pushes to that (sans the need for rehydration).  My hope is that we could also feel that cloud-readiness is one of our best strengths.

What do you think we can do to make Lift better in this regard?

Joe

ti com

Feb 5, 2015, 12:04:16 AM
to liftweb

Isn't stateful Lift's whole claim to fame? I mean view first and snippets is a nice philosophy, but probably not too hard to do on top of Play. So it seems stateful is what makes Lift Lift?

--
--
Lift, the simply functional web framework: http://liftweb.net
Code: http://github.com/lift
Discussion: http://groups.google.com/group/liftweb
Stuck? Help us help you: https://www.assembla.com/wiki/show/liftweb/Posting_example_code

---
You received this message because you are subscribed to the Google Groups "Lift" group.
To unsubscribe from this group and stop receiving emails from it, send an email to liftweb+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Antonio Salazar Cardozo

Feb 5, 2015, 8:24:06 AM
to lif...@googlegroups.com
Just for the record, Lift's happy path is stateful because the secure path is stateful. I agree
that there's work to be done so that zero-downtime deploys can happen without serious issues,
but the key is to make sure we're not also sacrificing security in the process.

The easiest parts of Lift are also the toughest to handle on redeploy—idMemoize will be
tougher to rebind without a page reload than a simple rendering, comets that just send
JSON will be harder to rehydrate than ones that are tied to part of a page. To that end,
the work that's going into Lift 3 for what you might call “headless” comets, ones that aren't
attached to any elements at all on the client, will give us a basis for comets that will be
easy to reconnect after reload. Fixing more complex things like idMemoize and comets
that bind their contents will be a lot harder—though I believe it should be doable.

In short, I don't think state and the goals you listed are at odds with one another, but I
do think there's work to be done to make the stateful corners of Lift support those goals.
Adding more documentation to the stateless parts of Lift isn't necessarily a bad idea—but
I think there are better ways to hit the goals you're describing.

As for claims to fame, Lift shouldn't care about a claim to fame; it has goals. Being
secure by default, making complex things like comet and AJAX content updating
straightforward, view first, hiding some of the weird corners of HTTP when you don't need
to know about them… These are goals. Server-side state is how we choose to achieve
them because it's proven to be a tremendous tool when building towards them.
Thanks,
Antonio

Joe Barnes

Feb 8, 2015, 3:52:38 PM
to lif...@googlegroups.com
Thanks Antonio for chiming in on my ambitious thoughts.  I knew that statefulness is pivotal in the way we implement many of our security features, but I should certainly have mentioned that in my original post (I should have also cited "security" as one of my favorite technical merits of Lift).

I am glad to see that you feel the goals I mentioned are achievable even with the stateful features of Lift.  I purposefully didn't want to focus too much on statefulness in my post because that's probably not the root issue... it's just what stands in the way of how I initially think to meet those goals.

Before I leave the topic of state, I have one other kinda crazy idea... My belief (read: could be wrong) is that there is no fundamental reason why the server-side state should reside in memory.  Perhaps if there was a hook that allowed it to be placed elsewhere, you could swap out the running server and the next one would just continue with the previous run's state.  Furthermore, this state could be shared by multiple instances simultaneously to form a cluster.  I'm not suggesting this is anything practical, but I'm just wrestling with the fundamentals of server-side state and security in my mind. :)

As I've thought this over some more, I'm not really keen on the idea of solving this by documenting stateless implementation alternatives, as you mentioned.  I'd like to figure out how to make Lift hit these goals well without chucking tried-and-true features.

At the risk of having multiple threads within a thread, perhaps first we should articulate what would be the goal for Lift with respect to modern cloud deployment.  Here is my stab at it:

Lift applications are:
  Cloud-ready - Application updates can be deployed continuously without downtime facilitated across multiple server instances.

And maybe one day we'll meet that goal and add it to the homepage. :)

Joe



Diego Medina

Feb 8, 2015, 4:30:58 PM
to Lift
> Before I leave the topic of state, I have one other kinda crazy idea... My belief (read: could be wrong) is that there is no fundamental reason why the server-side state should reside in memory.  Perhaps if there was a hook that allowed it to be placed elsewhere, you could swap out the running server and the next one would just continue with the previous run's state.  Furthermore, this state could be shared by multiple instances simultaneously to form a cluster.  I'm not suggesting this is anything practical, but I'm just wrestling with the fundamentals of server-side state and security in my mind. :)


I've been thinking about trying this out too, but one place where this would not work, or at least would need more work to get around, is where we invoke a server-side closure/function that changed in the next deployment. In other words, if we store the key -> function mapping in something like Redis, but the actual code for the function has changed, we would be calling something that is not correct/valid any more.

This is where we now store function name -> functions:

I don't want to say it is impossible, just that this is one place where we may need to think more.

 
> As I've thought this over some more, I'm not really keen on the idea of solving this by documenting stateless implementation alternatives, as you mentioned.  I'd like to figure out how to make Lift hit these goals well without chucking tried-and-true features.

I would disagree. Documenting how to use Lift for your particular use case is a good thing, and if that means knowing when not to use certain very popular features of Lift, it is much better than going around saying that Lift failed in xyz situation when the problem was lack of knowledge.

 

> At the risk of having multiple threads within a thread, perhaps first we should articulate what would be the goal for Lift with respect to modern cloud deployment.  Here is my stab at it:
>
> Lift applications are:
>   Cloud-ready - Application updates can be deployed continuously without downtime facilitated across multiple server instances.

Here is where we need to really define what we are looking for. "Cloud ready" can mean anything, and I don't think Lift isn't cloud ready: I run a personal Lift app on the cloud, and I have another client, whom I have been working for for over a year, who also has their app on the cloud. This feels just like how, all of a sudden, people want to be "reactive". Now, continuous deployment is an actual technique to explore, and you could do that without changing anything in Lift. Not too long ago Tim mentioned the idea of blue/green deployment, which is something I had heard of before but never took the time to read about. It turns out to be a very simple concept: you have two sets of servers, with only one live, where all your users go. If you have an update, you deploy it to the other server group and, once it is ready, change your load balancer to start redirecting traffic to that new set of servers. If you still have users left on the "old" codebase, you can push a message/notification asking them to either reload the page or log out/log in (up to you), so they go to the new server group with the new code. And that's it: nobody lost any data, and the users were able to decide *when* to be disrupted and move to the new server group. You could even have several groups of servers if you wanted to deploy several times a day.
Of course there are things to consider, but we could start a separate thread to discuss them if anyone feels like it.
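
The routing rule behind the blue/green description above can be sketched in a few lines of plain Scala. This is only a model of the idea, not a load balancer configuration; `BlueGreenRouter`, `Group`, and the session IDs are hypothetical names.

```scala
import scala.collection.mutable

sealed trait Group
case object Blue extends Group
case object Green extends Group

// Models a load balancer that pins existing sessions to the group that
// created them and sends new sessions to whichever group is live.
final class BlueGreenRouter(initial: Group) {
  private var live: Group = initial
  private val pinned = mutable.Map.empty[String, Group]

  // Existing sessions stay where they are; new ones go to the live group.
  def route(sessionId: String): Group =
    pinned.getOrElseUpdate(sessionId, live)

  // After deploying to the idle group, flip which group receives new sessions.
  def switchLive(): Unit = {
    live = if (live == Blue) Green else Blue
  }

  // The user reloaded or logged in again, so the old session is discarded.
  def endSession(sessionId: String): Unit = {
    pinned.remove(sessionId)
    ()
  }

  // Sessions still attached to a group; it can be retired when this is empty.
  def sessionsOn(group: Group): Set[String] =
    pinned.collect { case (id, g) if g == group => id }.toSet
}

object BlueGreenDemo {
  val router = new BlueGreenRouter(Blue)
  val before: Group = router.route("alice")        // Blue is live
  router.switchLive()                              // new code is ready on Green
  val aliceAfter: Group = router.route("alice")    // still pinned to Blue
  val bobAfter: Group = router.route("bob")        // new session goes to Green
  router.endSession("alice")                       // alice accepts the reload prompt
  val aliceReloaded: Group = router.route("alice") // now served by Green
  val leftOnBlue: Set[String] = router.sessionsOn(Blue)
}
```

Once `sessionsOn(Blue)` is empty, the old group can be shut down without interrupting anyone.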

Thanks

Diego

 



--
Diego Medina
Lift/Scala consultant
di...@fmpwizard.com
http://fmpwizard.telegr.am

Antonio Salazar Cardozo

Feb 8, 2015, 9:17:50 PM
to lif...@googlegroups.com
On Sunday, February 8, 2015 at 4:30:58 PM UTC-5, fmpwizard wrote:
>> Before I leave the topic of state, I have one other kinda crazy idea... My belief (read: could be wrong) is that there is no fundamental reason why the server-side state should reside in memory.  Perhaps if there was a hook that allowed it to be placed elsewhere, you could swap out the running server and the next one would just continue with the previous run's state.  Furthermore, this state could be shared by multiple instances simultaneously to form a cluster.  I'm not suggesting this is anything practical, but I'm just wrestling with the fundamentals of server-side state and security in my mind. :)


> I've been thinking about trying this out too, but one place where this would not work, or at least would need more work to get around, is where we invoke a server-side closure/function that changed in the next deployment. In other words, if we store the key -> function mapping in something like Redis, but the actual code for the function has changed, we would be calling something that is not correct/valid any more.

This is the core problem. A lot of Lift's server-side state is in functions, and Scala anonymous functions
are difficult if not impossible to serialize (some may be serializable, most wouldn't be). There may be
ways we could try to get around that (one that comes to mind would be an alternate form of
function invocation that would involve message-sending or somesuch), but I suspect they'd become
a bit complicated.
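
One way to picture the message-sending idea is sketched below in plain Scala, not Lift code. Instead of storing a closure under a key, the session stores a small data record that names a handler and carries the captured arguments, and each deployment supplies its own handler table. `Invocation`, `HandlerTable`, and the handler names are hypothetical, and the text encoding assumes keys and values contain no `;` or `=` characters.

```scala
// A stored "function" reduced to data: a handler name plus captured arguments.
// Unlike an anonymous function, this is easy to write to an external store.
final case class Invocation(handler: String, captured: Map[String, String])

// Each deployed version of the code ships its own table of named handlers.
final class HandlerTable(
    handlers: Map[String, (Map[String, String], String) => String]
) {
  // Returns Left when the running code no longer knows the handler.
  def run(inv: Invocation, input: String): Either[String, String] =
    handlers.get(inv.handler) match {
      case Some(h) => Right(h(inv.captured, input))
      case None    => Left("unknown handler: " + inv.handler)
    }
}

object MessageDispatchDemo {
  // Encode an invocation as one line of text: handler;key=value;key=value
  def encode(inv: Invocation): String = {
    val pairs = inv.captured.toSeq.sorted.map { case (k, v) => k + "=" + v }
    (inv.handler +: pairs).mkString(";")
  }

  def decode(line: String): Invocation = {
    val parts = line.split(";").toList
    val captured = parts.tail.map { kv =>
      val Array(k, v) = kv.split("=", 2)
      k -> v
    }.toMap
    Invocation(parts.head, captured)
  }

  val v1 = new HandlerTable(Map(
    "renameItem" -> ((c: Map[String, String], in: String) =>
      "item " + c("itemId") + " renamed to " + in)
  ))
  // A later release in which the handler was removed.
  val v2 = new HandlerTable(Map.empty)

  val stored: String = encode(Invocation("renameItem", Map("itemId" -> "42")))
  val onV1: Either[String, String] = v1.run(decode(stored), "Widget")
  val onV2: Either[String, String] = v2.run(decode(stored), "Widget")
}
```

The second result shows the version problem raised earlier in the thread: the stored record survives a redeploy, but the new code may no longer have a handler for it, so the failure has to be handled explicitly.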

In the past when I've had to deal with this problem I've gone a slightly different path, having the client
stash all client content (e.g., textareas) into local browser storage and then giving the page a kick and
restoring the content based on field id or whatever. Not too terrible.

And the other approach is the one that comet rehydration did in a hacky way, which was to render
the full page and then try to figure out the places that map one-to-one between the old page and new
one and preserve information between the two. That one's pretty painful, though.

>> As I've thought this over some more, I'm not really keen on the idea of solving this by documenting stateless implementation alternatives, as you mentioned.  I'd like to figure out how to make Lift hit these goals well without chucking tried-and-true features.

> I would disagree. Documenting how to use Lift for your particular use case is a good thing, and if that means knowing when not to use certain very popular features of Lift, it is much better than going around saying that Lift failed in xyz situation when the problem was lack of knowledge.

+1
 
>> At the risk of having multiple threads within a thread, perhaps first we should articulate what would be the goal for Lift with respect to modern cloud deployment.  Here is my stab at it:
>>
>> Lift applications are:
>>   Cloud-ready - Application updates can be deployed continuously without downtime facilitated across multiple server instances.

> Here is where we need to really define what we are looking for. "Cloud ready" can mean anything, and I don't think Lift isn't cloud ready: I run a personal Lift app on the cloud, and I have another client, whom I have been working for for over a year, who also has their app on the cloud. This feels just like how, all of a sudden, people want to be "reactive". Now, continuous deployment is an actual technique to explore, and you could do that without changing anything in Lift. Not too long ago Tim mentioned the idea of blue/green deployment, which is something I had heard of before but never took the time to read about. It turns out to be a very simple concept: you have two sets of servers, with only one live, where all your users go. If you have an update, you deploy it to the other server group and, once it is ready, change your load balancer to start redirecting traffic to that new set of servers. If you still have users left on the "old" codebase, you can push a message/notification asking them to either reload the page or log out/log in (up to you), so they go to the new server group with the new code. And that's it: nobody lost any data, and the users were able to decide *when* to be disrupted and move to the new server group. You could even have several groups of servers if you wanted to deploy several times a day.

I agree with most of this. In a lot of ways I think the question of continuous deployment
might best be addressed via a how-to guide rather than intense addition of features, and
perhaps through that how-to guide's writing we can annotate pain points (I know you've
already to some extent done that, Joe) and then think about if and how those pain points
might be alleviated.
Thanks,
Antonio

Joe Barnes

Feb 8, 2015, 10:45:27 PM
to lif...@googlegroups.com
Thanks for the great responses guys!  This is helping me a lot... not sure about everyone else who is witnessing my naked ignorance. :D

To clarify what I meant about NOT documenting that stateless stuff => I mean it's not the solution to what I'm digging at.  Indeed, it's a Good Thing to doc.  It just wouldn't quite solve things I think we could be better at.

A funny thing about blue/green deployment.  I've heard the term, and applied a janky version of it at work, but didn't connect the dots lol.  I forgot where we found the technique we're using... but it's basically blue/green as Diego described.  It's janky because the switch happens in the Amazon Route 53 service, so it's a sudden hard switch over.  It's ok for where we're at in our development, but it bugs me.  Deeply.

I guess to fill in more details of where I'm coming from and what my goal here is... My team uses Elastic Beanstalk in particular (not the case with my blog tho).  Between that and our blue/green deployment, I feel there is much left to be desired.  Firstly, when we deploy new changes, everyone's page reloads to the index regardless of where they are on the site.  The same thing happens to users when the autoscaler drops an instance or if one were to ever fail and be replaced.  Since none of this is a big deal right now, we've not put a lot of effort into solving it.  Nonetheless, I've been digging around in this stuff over the weekends.  

After taking in the feedback on the thread today, I feel maybe a combo of documentation and code elbow grease will do the job.  Document blue/green as the way to go with some pointers, and see if we can handle some of the zero-downtime state-restoring stuff without worrying about version changes.  I certainly understand in principle the challenge with closures.  Maybe we get Heather Miller to help us here (check out her Spores).  

At the end of the day, I would like to know that someone new to Lift, like me, can take the plunge with the resources needed to handle these concerns well.

*Sits back with more beer to see how this crap flies*

Joe



 


ti com

Feb 9, 2015, 3:44:22 AM
to liftweb

Right, I meant goals. I guess one benefit of state is also the function names and everything built on them, like SHtml. But it's an interesting thought that it's really flowing from another goal, security.

I guess you can always build stateful on top of stateless but not the opposite (proof: HTTP is stateless, and Lift and servlets are built on it).


Alexandre Richonnier

Feb 9, 2015, 4:35:15 AM
to lif...@googlegroups.com
Hi Joe, I'm currently working on the same problems.

My goal is to have something like Node.js + Passport + an API for external applications (mobile or desktop) and web HTML, all stateless.

To solve these problems, I'm creating an authentication library which supports JOSE tokens using the "nimbus-jose-jwt" library.
Then I use a REST API in statelessDispatch.
I don't use any session state on the server. To optimize some requests I have a custom cache inherited from TemplateCache.
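
To illustrate why a signed token removes the need for server-side session state, here is a deliberately simplified sketch using only the JDK's HMAC support. It is not JOSE/JWT-compliant and it is not the `nimbus-jose-jwt` API; `TokenSketch`, the `user.expiry.signature` layout, and the secret are assumptions made for the example. Any instance holding the same secret can verify a token without looking anything up.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec
import scala.util.Try

// Simplified signed token: "<user>.<expiryMillis>.<signature>".
// Assumes user names contain no '.' characters.
final class TokenSketch(secret: Array[Byte]) {
  private def sign(payload: String): String = {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(new SecretKeySpec(secret, "HmacSHA256"))
    val raw = mac.doFinal(payload.getBytes(UTF_8))
    Base64.getUrlEncoder().withoutPadding().encodeToString(raw)
  }

  def issue(user: String, expiresAtMillis: Long): String = {
    val payload = user + "." + expiresAtMillis
    payload + "." + sign(payload)
  }

  // Returns the user name when the signature matches and the token is unexpired.
  def verify(token: String, nowMillis: Long): Option[String] =
    token.split('.') match {
      case Array(user, expiry, signature) =>
        val expected = sign(user + "." + expiry)
        // Constant-time comparison of the two signatures.
        val matches =
          MessageDigest.isEqual(expected.getBytes(UTF_8), signature.getBytes(UTF_8))
        val unexpired = Try(expiry.toLong).toOption.exists(_ > nowMillis)
        if (matches && unexpired) Some(user) else None
      case _ => None
    }
}

object TokenDemo {
  val secret: Array[Byte] = "demo-secret-change-me".getBytes(UTF_8)
  // Two instances that share only the secret, not any session state.
  val instanceA = new TokenSketch(secret)
  val instanceB = new TokenSketch(secret)

  val token: String = instanceA.issue("alice", 2000L)
  val accepted: Option[String] = instanceB.verify(token, 1000L)
  val expired: Option[String] = instanceB.verify(token, 3000L)
  val tampered: Option[String] =
    instanceB.verify(token.replace("alice", "mallory"), 1000L)
}
```

A real deployment would use a vetted JOSE library and proper key management; the sketch only shows that verification needs no per-user state on the server.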

Why and how do I do this?

_ As a mobile app developer, cookies are hell, and a user can make a new request each time a session dies.
_ State is saved in the browser or app, or in the URL (hashbang); there is no need to save it on the server.
_ No need for a page refresh after a server reboot.
_ Cloud ready and really scalable: no more session load balancing or complex stuff.
_ For HTML pages, no session is needed. I will use only AngularJS against my API, or simple snippets. By doing that, I'm taking the opposite approach to the FoBo library's strategy...
_ If I need JavaScript, I write it directly. I don't use the Lift DSL; I had too much trouble with that.
_ sbt and the Scala compiler are very slow; for the past 3 months I have used gulp (a Node.js streaming build system) to manage all my web assets.



Why do I keep Lift?
_ Because it's a powerful framework.
_ Because I like view first.
_ Because we can use it in stateless or stateful mode.


But,

_ I had a project that was completely Lift-dependent. Now I have changed my mind: my stuff is more platform-independent, and I use less and less Lift functionality.
_ I think inter-machine communication will become the majority of web traffic, and it will be stateless.


How to improve Lift's stateless/cloud story?

_ The Lift core looks good to me, but the development/production packaging is tricky. For example, I wrote an sbt task like this:



import java.nio.charset.StandardCharsets

// Deletes the generated jetty-web.xml so development runs use Jetty's defaults.
val developmentJettyTask = TaskKey[Unit]("development-jetty")

developmentJettyTask := {
  val dir = (sourceDirectory in Compile).value
  IO.delete(dir / "webapp" / "WEB-INF" / "jetty-web.xml")
}


// Writes a jetty-web.xml that sets run.mode and the session cookie domain.
val productionPackageTask = TaskKey[Unit]("production-package")

productionPackageTask := {
  val dir = (sourceDirectory in Compile).value
  val template = """|<?xml version="1.0"  encoding="UTF-8"?>
                    |<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
                    |
                    |<Configure class="org.eclipse.jetty.webapp.WebAppContext">
                    |    <Call class="java.lang.System" name="setProperty">
                    |        <Arg>run.mode</Arg>
                    |        <Arg>%s</Arg>
                    |    </Call>
                    |    <Call name="setInitParameter">
                    |          <Arg>org.eclipse.jetty.servlet.SessionDomain</Arg>
                    |          <Arg>mysite.com</Arg>
                    |    </Call>
                    |</Configure>"""
  // val propertiesPath = (sourceDirectory in Compile)(_ / "resources")
  val file = dir / "webapp" / "WEB-INF" / "jetty-web.xml"
  // %s in the template is replaced by the run mode; the file is overwritten, not appended.
  IO.write(file, template.stripMargin.format("production"), StandardCharsets.UTF_8, false)
}


// Generate the production jetty-web.xml before the WAR is packaged.
(packageWar in Compile) := ((packageWar in Compile) dependsOn productionPackageTask).value


_ Improve the Props tools to read any properties, such as environment properties, and add a getOrDie accessor.
Why? In a cloud/Heroku deployment, you set your parameters in the environment...
 Something like this:
 
 

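import java.io.{ByteArrayInputStream, InputStream}
import java.util.{InvalidPropertiesFormatException, Properties}

import net.liftweb.common.{Box, Empty, Full, Logger}
import net.liftweb.util.{Helpers, Props}
import net.liftweb.util.Helpers.tryo

import scala.sys.SystemProperties

object PropsUtils extends Logger {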

  private val sysProps = new SystemProperties()

  /**
   * Property access utility which favors SystemProperties over the Lift Props.  The rationale is that system
   * properties are often set via the command-line or the cloud container, and hence are per-deployment specific
   * in those cases.  Therefore, they are regarded as having higher precedence than the Lift properties.
   *
   * Created by barnesjd on 1/25/14.
   */
  def get(key:String):Box[String] = sysProps.get(key) match {
    case Some(v) => Full(v)
    case _ => Props.get(key)
  }


  /**
   * Property access utility to get a property from java system properties or from Lift Props
   *
   * @param str : Property name
   * @return : value or throw an exception if not found
   */
  def getOrDie(str: String) = {
    get(str).openOrThrowException("The following required property is not defined: " + str)
  }

  def getOrDie(props: Properties, str: String) = {
    Box.legacyNullTest[String](props.getProperty(str)).openOrThrowException("The following required property is not defined: " + str)
  }

  def get(props: Properties, str: String) = {
    Box.legacyNullTest[String](props.getProperty(str))
  }

  /**
   * Get Properties from a property file
   * @param path file path e.g.: "/ddd.props" search in /resources
   */
  def getProperties(path: String): Box[Properties] = {

    def getInput(): Box[InputStream] = {
      val res = tryo {
        getClass.getResourceAsStream(path)
      }.filter(_ ne null)
      trace("Trying to open resource %s. Result=%s".format(path, res))
      res
    }

    getInput() match {
      case Full(x) => {
        val ret = new Properties
        val ba = Helpers.readWholeStream(x)
        try {
          ret.loadFromXML(new ByteArrayInputStream(ba))
          debug("Loaded XML properties from resource %s".format(path))
        } catch {
          case _: InvalidPropertiesFormatException =>
            ret.load(new ByteArrayInputStream(ba))
            debug("Loaded key/value properties from resource %s".format(path))
        }
        Full(ret)
      }
      case _ =>
        error("cannot load property file for resource: " + path)
        Empty
    }
  }
}
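
`PropsUtils.get` above consults JVM system properties before Lift's `Props`. On platforms such as Heroku, settings usually arrive as environment variables, so a lookup chain could include those as well. The sketch below is plain Scala with the sources passed in as maps, so it can be exercised without a container; `ConfigChain`, the `db.url` to `DB_URL` naming convention, and the example keys are assumptions, not Lift APIs.

```scala
// Looks a key up in several sources; the first match wins.
// In a real app `env` would be sys.env and `sysProps` would be sys.props.toMap;
// `fileProps` stands in for values loaded from a props file.
final class ConfigChain(
    env: Map[String, String],
    sysProps: Map[String, String],
    fileProps: Map[String, String]
) {
  // Assumed convention: "db.url" is exposed as the environment variable "DB_URL".
  private def envName(key: String): String =
    key.toUpperCase.replace('.', '_')

  def get(key: String): Option[String] =
    env.get(envName(key))
      .orElse(sysProps.get(key))
      .orElse(fileProps.get(key))

  // Fail fast at startup when a required setting is missing.
  def getOrDie(key: String): String =
    get(key).getOrElse(
      throw new IllegalStateException(
        "The following required property is not defined: " + key)
    )
}

object ConfigChainDemo {
  val config = new ConfigChain(
    env = Map("DB_URL" -> "jdbc:postgresql://prod/db"),
    sysProps = Map("run.mode" -> "production", "db.url" -> "jdbc:h2:mem:test"),
    fileProps = Map("mail.from" -> "noreply@example.com")
  )

  val dbUrl: String = config.getOrDie("db.url")       // environment wins
  val runMode: String = config.getOrDie("run.mode")   // from system properties
  val mailFrom: String = config.getOrDie("mail.from") // from the props file
  val missing: Option[String] = config.get("api.key")
}
```

Passing the sources in explicitly also makes the precedence rules easy to unit test.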

_ Produce an official stateless auth library and documentation.
=> If my tests are successful, I will publish something on that...

_ Add a LiftRules.dispatch / LiftRules.statelessDispatch API dictionary with a helper to produce API documentation with something like Swagger or JSONDoc.
=> For me, cloud includes APIs, and an API means a function listing with documentation and automated testing. Currently I can't find a way to look up DispatchPF paths.
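
A sketch of what such a dictionary could look like, in plain Scala. It does not hook into `LiftRules.dispatch`; `ApiRegistry`, `Endpoint`, and the example routes are hypothetical. The idea is that routes are registered together with their documentation, so the same table can drive dispatch and produce a listing.

```scala
// One documented endpoint: HTTP method, path segments, description, handler.
final case class Endpoint(
    method: String,
    path: List[String],
    description: String,
    handler: () => String
)

final class ApiRegistry(endpoints: List[Endpoint]) {
  // Dispatch by method and exact path; None means "not found".
  def dispatch(method: String, path: List[String]): Option[String] =
    endpoints
      .find(e => e.method == method && e.path == path)
      .map(e => e.handler())

  // A plain-text listing; a real tool might emit Swagger JSON instead.
  def describe: List[String] =
    endpoints.map { e =>
      val joined = e.path.mkString("/")
      e.method + " /" + joined + " - " + e.description
    }
}

object ApiRegistryDemo {
  val registry = new ApiRegistry(List(
    Endpoint("GET", List("api", "posts"), "List all posts", () => "[]"),
    Endpoint("GET", List("api", "health"), "Liveness check", () => "ok")
  ))

  val listing: List[String] = registry.describe
  val health: Option[String] = registry.dispatch("GET", List("api", "health"))
  val unknown: Option[String] = registry.dispatch("POST", List("api", "posts"))
}
```

Because the listing is derived from the same data that serves requests, the documentation cannot drift away from the routes.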


Best regards,

Alexandre

ti com

Feb 9, 2015, 8:51:38 AM
to liftweb

For serializing functions, I think you should keep an eye on spores.


Joe Barnes

Feb 14, 2015, 10:20:50 PM
to lif...@googlegroups.com
Just a quick follow up, I found some good stuff on Blue/Green Deployment with AWS in particular: http://www.thoughtworks.com/insights/blog/implementing-blue-green-deployments-aws

Since that is what I'm targeting at work and where I'm starting with my blogging, I'll get a good post or two written up on the topic in the coming weeks.  Again, I feel between equipping the Lift community with good zero-downtime deployment techniques and improving our failure recovery, I'll be a happy camper. :)

Joe

Diego Medina

Feb 14, 2015, 10:23:52 PM
to Lift
great! make sure to post links to your blogs here

Joe Barnes

Feb 14, 2015, 10:35:55 PM
to lif...@googlegroups.com
Absolutely.  Should we consider cross-posting them to the liftweb.net blog?

Joe

Diego Medina

Feb 14, 2015, 11:10:52 PM
to Lift
sure

Joe Barnes

Mar 10, 2015, 8:11:42 AM
to lif...@googlegroups.com
So I would like to bounce another idea off the community => Clustering

[Image: Spongebob - clustering]


Before I go there... I've been mulling over this thread some more the past couple of days.  To restate what I was convinced of earlier, I think our only technical deficiency is failure recovery.  If a redundant node goes down for any reason (including scaling down), all of the user sessions on that node get reset when the first ajax or comet request arrives at the new server instance.  I don't think any of the ideas I've thrown out there like spores are sufficient to equip Lift to handle this scenario.

Without any details of how we would implement clustering (as an optional module, that sort of thing), can we see clustering as a solution to the problem?  I suspect it's a good candidate solution.  It could possibly even allow us to not require session affinity.  A request arriving at any instance could theoretically identify resources such as in-memory session variables and actors via the cluster mechanism.

(Full disclosure => I think that Lift providing hooks for clustering and clustering implementations via modules (ex. lift-cluster-akka, lift-cluster-rabbitmq, etc) is the right answer, but that doesn't matter if a cluster cannot solve the problem.)

Am I on to something or am I just dreaming?

Joe

Diego Medina

Mar 10, 2015, 8:35:27 AM
to Lift
You may want to look at WildFly (http://wildfly.org/). It's the next-generation JBoss container, and it supports session replication/clustering across servers.
If you go in that direction, instead of using SessionVars you can use ContainerVars, which use the container's mechanism to store users' data, so that it is replicated across servers.

Here is an old but still valid video showing ContainerVars across servers:

https://www.youtube.com/watch?v=xsXzrzQ8NXk

Hope that gives you some ideas.

Diego





David Whittaker

Mar 10, 2015, 12:33:16 PM
to liftweb
I'm not sure I'm following what you have in mind with modules.  In a cluster you would either need to keep state out of the nodes, route follow up requests to the same node or replicate state between the nodes.  

Removing all state from the server is already a possibility with Lift, but like with every framework it complicates things from a security standpoint (more state on the server means less the client needs to see) and performance standpoint (nothing is going to be faster than retrieving the state from RAM).

While it's not built into Lift, there are also mechanisms for routing follow up requests to the same node using nginx, apache, Amazon ELB... and I'd imagine every other web front end commonly used.  So if it's affinity you are targeting, it's hard for me to see how it would be a good addition when there are already solid, well performing, battle tested solutions.

That leaves us with state replication which is more complicated.  You can use container session replication and Lift's ContainerVar, but it is a *huge* pain to set up.  Having done replication with both Tomcat & Jetty in the past though I can tell you the experience has not been a good one.  Both of those containers have their own proprietary replication config, neither of which is well documented, and the amount of work that goes into adapting it to your infrastructure makes you feel like you might as well be writing your own.  A slick way to configure session replication in Lift, and to mark which SessionVars are replicate-able could vastly improve that experience.  However...  as Antonio pointed out, Scala's anonymous functions are not serializable and a *lot* of Lift's security features involve placing anonymous functions in the Lift Session.  Even though we could improve Lift's ability to replicate Sessions between nodes, we'd still be unable to replicate a lot of the contents of a Session, and I don't know if that's a solvable problem given the state of Scala today.

My personal opinion on this is that if someone wanted to add a replication mechanism for SessionVars that uses pluggable storage (Redis, Memcache) and pluggable discovery (Zookeeper, Consul, Eureka) that would absolutely be a great addition to Lift.  It would make configuring replicating that data vastly easier than it currently is, and it would make Lift apps that rely on replication a lot more portable between containers (including Netty!).  I think we need to be realistic about what can be done though.   Finding a replication solution that supports typical usage of SHtml for instance, without requiring a page refresh after a restart, would be awesome but I just don't think it's feasible right now.
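To make the pluggable-storage idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `ReplicatedStore`, `InMemoryStore` and `ReplicatedVar` are hypothetical names, not Lift APIs, and a real backend (Redis, Memcache) would replace the in-memory stand-in.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import scala.collection.concurrent.TrieMap

// Hypothetical storage hook: a Redis or Memcache backend would implement this.
trait ReplicatedStore {
  def put(sessionId: String, key: String, value: Array[Byte]): Unit
  def get(sessionId: String, key: String): Option[Array[Byte]]
  def removeSession(sessionId: String): Unit
}

// Stand-in backend that keeps everything in local memory (for experiments only).
final class InMemoryStore extends ReplicatedStore {
  private val data = TrieMap.empty[(String, String), Array[Byte]]

  def put(sessionId: String, key: String, value: Array[Byte]): Unit =
    data.put((sessionId, key), value)

  def get(sessionId: String, key: String): Option[Array[Byte]] =
    data.get((sessionId, key))

  // Drop every value that belongs to the given session.
  def removeSession(sessionId: String): Unit =
    data.keys.toList.filter(_._1 == sessionId).foreach(k => data.remove(k))
}

// Hypothetical session-scoped String variable that reads and writes through the store,
// so any node holding the same session id sees the same value.
final class ReplicatedVar(store: ReplicatedStore, sessionId: String, name: String, default: String) {
  def set(value: String): Unit = store.put(sessionId, name, value.getBytes(UTF_8))
  def get: String =
    store.get(sessionId, name).map(bytes => new String(bytes, UTF_8)).getOrElse(default)
}
```

Two `ReplicatedVar` instances created on different nodes with the same session id and name would then read the same value, which is the portability benefit described above. Node discovery is left out entirely.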

Joe Barnes

unread,
Mar 10, 2015, 1:21:22 PM3/10/15
to lif...@googlegroups.com
Don't worry too much about the modules.  I think that just threw you off the real discussion a bit.  What I have in mind is say... Lift needs a clustered messaging system to do cluster... Then it would have the hooks where you could plug in different implementations of a messaging system.  That way Lift core isn't married to any implementations nor do we need to develop our own.

The real discussion lies outside of that detail.  Session affinity is NOT something I'm targeting, but is possibly a side-effect of solving the problem I am after.  And that problem is that when a server dies, everyone's session is restarted. Today, AFAIK, it is not possible to recover a Lift client session when a server goes away.  This happens commonly because of scaling and releasing new code, both of which we do frequently in our shop.  I have not yet found a blue/green deployment strategy that handles this either.  They often have features such as AWS's connection draining, but that only ensures that the request is completed before the server is canned.   It does NOT ensure that the session is completed, which is what Lift really needs to avoid interrupting in-progress sessions.

As I think through these issues again, I'm actually growing less convinced that clustering can solve this problem.  I don't know what the answer is, but I'm highly optimistic that there is one.  Otherwise, to be frank, I cannot use Lift for some of our upcoming projects.  They involve long-running user sessions and resetting that will be unacceptably frustrating to the user experience, regardless of restoring their last good state from the DB.  This circles around back to my original statement regarding "Cloud readiness".  I don't feel we can regard ourselves as "cloud ready" if we can't deploy code changes and remove unneeded servers without disrupting the user experience.  

So maybe Lift "is what it is" and you trade off these things for security.  If so, that's fine but I hope it's clear to folks from the start that it is your trade off.  Not to suggest I'm the brightest crayon in the box, but I've been playing with Lift for over two years and I still don't understand how to overcome these challenges.

To be clear, this is by no means a slam on Lift but an effort to make Lift work well in these scenarios.

Joe



You received this message because you are subscribed to a topic in the Google Groups "Lift" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/liftweb/KHjbjev8A0E/unsubscribe.
To unsubscribe from this group and all its topics, send an email to liftweb+u...@googlegroups.com.

David Whittaker

unread,
Mar 10, 2015, 5:21:11 PM3/10/15
to liftweb

Joe,

I don’t think that I’m thrown off by the idea of modules. I think what I’m having trouble with is understanding what you mean by clustering. To me, clustering refers to using multiple servers (a cluster) to provide a service. You could be doing that for many reasons, all of which potentially have different requirements when it comes to managing state. It seems like what you’re most interested in is reliability: that when a node is lost, either through failure or due to an intentional action, your clients are unaffected. I’m also getting the impression that when you say clustering, what you mean is replication of state within a cluster.

One thing I was trying to explain is that there are multiple types of state in a Lift app (if you choose to hold state on the server at all); some of them can be replicated while others can’t.

If you want to: Replicate user login information so the user won’t be logged out when their node goes away, then yes this is possible with Lift now. You can use a ContainerVar to store the user information and set up HttpSession replication. Log in, kill the node, be amazed!

If you want to: Replicate any arbitrary values between nodes, also a yes. Same process as above.

If you want to: Make it so that each and every one of Lift’s features can be used without users being affected when a node disappears, then unfortunately the answer is no. Here is the main reason. Registering functions in the session and assigning them a secure id makes SHtml very secure and powerful. Unfortunately, those functions cannot be reliably serialized, which means that they cannot be replicated between nodes.
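As a rough illustration of why this state is tied to one node, here is a simplified model of a function map. It is a sketch only: `FuncRegistry` is a hypothetical name and this is not how Lift implements it internally.

```scala
import java.math.BigInteger
import java.security.SecureRandom
import scala.collection.concurrent.TrieMap

// Simplified, hypothetical model of a per-node function map.
final class FuncRegistry {
  private val random = new SecureRandom()
  private val funcs = TrieMap.empty[String, () => String]

  // Bind a server-side closure to an unguessable id; only the id goes to the browser.
  def register(f: () => String): String = {
    val id = "F" + new BigInteger(130, random).toString(36)
    funcs.put(id, f)
    id
  }

  // Run the closure for an id, or return None when this node never registered it.
  def invoke(id: String): Option[String] = funcs.get(id).map(f => f())
}
```

The browser only ever sees the random id, which is what makes the approach secure. The closure itself lives in the memory of the node that rendered the page, so a second `FuncRegistry` (another node) has nothing to run for that id.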

Of course, there is no reason that you need to use SHtml, and while I still find plenty of situations where SHtml is useful, the more I move to apps using libraries like ReactJS, the less reliant I’ve become on it. You can absolutely develop an awesome Lift app that doesn’t use SHtml at all, stores session related data in a ContainerVar, and ties into the container’s (e.g. Jetty’s) session replication. Configure that all correctly and it should fail over seamlessly.

That said, this process could absolutely be improved. Like I said, configuring container based session replication sucks. If replication of SessionVars was built into Lift then the whole dance with ContainerVar / container session replication could be avoided and that would also have the benefits of

(a) making what can be replicated in a Lift app, and how to accomplish replication, more obvious
(b) making replicated Lift apps more portable
(c) making session replication work outside of a servlet container (netty) and
(d) making me happy to never have to curse about container replication again.

So my question to you is, what exactly are you trying to accomplish? If it’s replicating data, then you have a mechanism at hand, and if you want to improve on it, that would be awesome, but it should work either way. If your need goes beyond that, then I think you should really consider how adopting another framework would help. To my knowledge, there is nothing that exists that offers the type of functionality you get from SHtml in Lift in a way that can seamlessly be replicated between nodes.

-Dave

Joe Barnes

unread,
Mar 10, 2015, 10:07:28 PM3/10/15
to lif...@googlegroups.com
Thanks for the reply, Dave.  Let's forget about clustering.  I had some initial thoughts on how it would solve the problem I have, but I had gotten fixated on aspects that it would help with but that aren't relevant here.

So yes, my problem is reliability/fault tolerance.  I understand that I can refrain from certain features of Lift and not experience the problems I have.  Both my blog and npmaven.org are running great with zero-downtime green/blue deployment and all that good stuff. 

However at work, we have built an application using lift-ng which makes heavy use of both AFuncHolder and comets.  I know how we could replace all of the usages of AFuncHolder in our application (basically quit using lift-ng and go straight to a RestHelper).  I'm not so sure about getting around comet, tho.  I believe it's the only built-in way Lift has for pushing updates to a client.

If I'm not mistaken, both of these features inhibit the fault tolerance I am looking for.  The AFuncHolder states cannot be reliably serialized, and I suspect the same holds for CometActors.  That is what I meant earlier when I said that regardless of what I do with the user's session data, my application must refresh the current page if it ever changes hands on the server-side.  I completely understand what you are saying how Lift can do some replication and so forth, but it comes with the caveats of not using AFuncHolder or CometActor and hence no pushes to the client... right?

This also circles back around to some of the earlier parts of this discussion where I wrote about how the happy path of Lift has you using these features.  Now that I know more about how to deploy it and the implications, I know not to use them in certain projects.  But what I'm finding unfortunate is that one comes into Lift to try things out, follows the guides down the happy path, and ends up with a project that is difficult to put into production with zero-downtime and fault tolerance.

I'm bringing all of this up in hopes we can solve this problem, and I'm all in to help get it done.  Unfortunately, I'm getting the feeling that I may be asking for a bit much... maybe even trying to have my cake and eat it too. :-/

Joe

David Miguel Antunes

unread,
Mar 12, 2015, 7:31:11 AM3/12/15
to Joe Barnes, lif...@googlegroups.com
Hi,

My 2 cents:
Regarding: "The AFuncHolder states cannot be reliably serialized"

I haven't played much with function serialization in Scala. But, if I understand correctly, functions are serializable unless they capture something which is not serializable (which makes perfect sense).
Heather Miller has a presentation where she talks about this: https://speakerdeck.com/heathermiller/spores-distributable-functions-in-scala?slide=31

Here is an example:
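// Serialize any object to a byte array using standard Java serialization.
def serialize(obj: AnyRef): Array[Byte] = {
  val baos = new java.io.ByteArrayOutputStream()
  val oos = new java.io.ObjectOutputStream(baos)
  oos.writeObject(obj)
  oos.close()
  baos.toByteArray
}

// Not Serializable: func captures `this`, so serializing func fails.
class C1 {
  var variable = "hello world"
  val func = () => println(variable)
}
// Serializable: the captured `this` can be written along with func.
class C2 extends Serializable {
  var variable = "hello world"
  val func = () => println(variable)
}

serialize(new C1().func) // <-- CRASHES (java.io.NotSerializableException)
serialize(new C2().func) // <-- WORKS FINE
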
In the case of C1 we're capturing the "this" of C1, which is not serializable, hence the function is not serializable. In C2 it works fine.
So it seems this isn't some strange "unreliable" rule, but simply: "if you capture an external variable the class must be serializable".

That said, Java(/Scala) serialization even captures the "object graph" (which is very impressive/useful).
Therefore I see no reason why you shouldn't be able to serialize the map(?) where the functions are stored in Lift to database/filesystem, and then restore it back again when lift restarts (plus probably the sessions info etc).

Best,
David

Andreas Joseph Krogh

unread,
Mar 12, 2015, 8:59:02 AM3/12/15
to lif...@googlegroups.com
Serializing doesn't help here, no matter what technology is used, as we need the object-references, which are never preserved when serializing. When I register a func in the map I rely on the func to reference the actual instances captured, by reference.
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 

Antonio Salazar Cardozo

unread,
Mar 12, 2015, 10:16:23 AM3/12/15
to lif...@googlegroups.com
I don't think you are asking too much, but there will be effort involved. The approach I've referenced
before for restoring comets post-restart is sound, IMO, though it won't work 100% of the time (e.g., if
your templates change). The draft implementation I did was nasty, but there are a variety of things
that can be done to improve it.

The key to restoring state without reloading the page is identifying the places where that state was
bound (e.g., input names, comets, etc) and then ensuring that state is recreated. Recreating the state
is a matter of reinstantiating the associated snippets and comets, and changing the client-side page
to the now-restored session and RenderVersion (which will preserve the new set of RequestVars),
then fixing the function id references to point to the functions bound in the new session.
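
The steps above can be modeled in a few lines. This is only a sketch under stated assumptions: `Session`, `renderCounter` and `rehydrate` are hypothetical names rather than Lift APIs, and it assumes a snippet binds its functions in the same order every time it runs (which is exactly what breaks when templates change).

```scala
import java.util.UUID
import scala.collection.mutable

object Rehydration {
  // Hypothetical model: a session holds functions keyed by random ids.
  final class Session {
    val funcs = mutable.LinkedHashMap.empty[String, () => String]
    def bind(f: () => String): String = {
      val id = UUID.randomUUID().toString
      funcs(id) = f
      id
    }
  }

  // A "snippet" binds its functions in a fixed order and returns the ids it put on the page.
  def renderCounter(session: Session): Vector[String] = {
    var amount = 0
    Vector(
      session.bind(() => { amount += 1; amount.toString }),
      session.bind(() => { amount -= 1; amount.toString })
    )
  }

  // Re-run the snippet in a fresh session and map each old id to the new id
  // bound at the same position, so the client-side page can be repointed.
  def rehydrate(oldIds: Vector[String], fresh: Session): Map[String, String] =
    oldIds.zip(renderCounter(fresh)).toMap
}
```

Note that in this model the closure state (`amount`) starts again from its initial value in the new session; only the wiring between the page and the server is restored.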

I think there's good work that can be done in this department, though there are probably a lot of
things that need to happen to make it work. It won't work perfectly 100% of the time, but I think we
can provide tools that will make the 80% use case work with minimal effort.
Thanks,
Antonio

David Miguel Antunes

unread,
Mar 12, 2015, 10:50:32 AM3/12/15
to lif...@googlegroups.com
Andreas,

What do you mean by "object-references"?
The serialization captures the entire "graph" of object references which are referenced by the objects you serialize.
So, if you serialize the whole Lift function map, you should get a map exactly as the one you serialized. I.e. if the user web page makes an ajax call with some function ID, your code should run exactly in the same state you left it before restoring the state.

Here is an example with a map of functions (it runs on the scala repl):
import java.io._

def serialize(obj: AnyRef): Array[Byte] = {
  val baos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(baos)
  oos.writeObject(obj); oos.close(); baos.toByteArray
}
def deserialize[T](arr: Array[Byte]): T = {
  val ois = new ObjectInputStream(new ByteArrayInputStream(arr))
  try { ois.readObject().asInstanceOf[T] } finally { ois.close() }
}

class C2 extends Serializable {
  var variable = "hello world"
  val func = () => println(variable)
}

object FMap {
  var map = Map[String, () => Unit]()
}

val uid = java.util.UUID.randomUUID().toString
FMap.map = FMap.map + (uid -> new C2().func)
FMap.map(uid)()
FMap.map = deserialize[Map[String, () => Unit]](serialize(FMap.map))
FMap.map(uid)()

Best,
David

David Whittaker

unread,
Mar 12, 2015, 11:22:03 AM3/12/15
to liftweb
Joe,

I feel like narrowing the scope of what you need down to failover of comet and lift-ng will help a lot with developing a plan.  It seems like Antonio already has an approach in mind for comet, and if anyone outside of DPP would know how to approach replicating comet, it would be him.  

As for lift-ng, well, you kind of have an advantage there yourself :)  I've never looked at the lift-ng internals but unlike SHtml, I think lift-ng's usage of AFuncHolder for secure invocation could maybe be adapted to work in a failover situation.  The big difference in my eyes is that SHtml methods are generally intended to close over their local scope, while that's not necessarily true for lift-ng.  I think that serializing arbitrary AFuncHolder instances is a lost cause, but that doesn't mean we couldn't devise a method for marking a specific AFuncHolder as serializable/replicate-able.  That ability could be expressed in turn through the lift-ng API.  A developer could choose to mark their module / methods as serializable, in which case it would be up to them to make sure they follow certain rules.  There is a lot to figure out there: how do modules / methods get marked as serializable, how are the serializable AFuncHolders in the function map replicated, what happens when someone screws up and registers something that can't in fact be serialized, etc, but it doesn't seem impossible.
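
One possible shape for the opt-in marking, as a sketch only: `Replicable`, `MarkedRegistry`, `OpenConnection`, `Greeting` and `Broken` are hypothetical names, and neither Lift nor lift-ng offers this today. The registry keeps every function locally, but stores a replica only for functions that opted in and that actually serialize, which also answers the "someone screws up" case by falling back to local-only.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.collection.mutable
import scala.util.Try

// Hypothetical marker: handlers opt in to replication by extending this trait.
trait Replicable extends (() => String) with Serializable

final class MarkedRegistry {
  val local = mutable.Map.empty[String, () => String]
  val replicated = mutable.Map.empty[String, Array[Byte]]

  private def toBytes(obj: AnyRef): Try[Array[Byte]] = Try {
    val baos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(baos)
    try oos.writeObject(obj) finally oos.close()
    baos.toByteArray
  }

  // Every function is registered locally; only marked functions that really
  // serialize get a replica. Returns true when a replica was stored.
  def register(id: String, f: () => String): Boolean = {
    local(id) = f
    f match {
      case r: Replicable => toBytes(r).map(bytes => replicated(id) = bytes).isSuccess
      case _             => false
    }
  }
}

// Follows the rules: captures only serializable data.
final class Greeting(name: String) extends Replicable {
  def apply(): String = s"hello $name"
}

// Breaks the rules: claims to be replicable but holds a non-serializable resource.
final class OpenConnection
final class Broken(conn: OpenConnection) extends Replicable {
  def apply(): String = conn.toString
}
```

Whether a failed replica should be a silent fallback, a logged warning or a hard error at registration time is one of the open questions listed above.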




Joe Barnes

unread,
Mar 12, 2015, 12:01:13 PM3/12/15
to lif...@googlegroups.com
Awesome David W!  I agree as I look back over this thread I've had an ugly wad of big picture stuff and a sprinkling of specifics, producing a rather unsavory concoction.  (Ok, I'll put the thesaurus away...)  I appreciate the time you've taken to write up emails to steer me onto a path where we can actually have actionable items. :)

And of course, thanks Antonio for highlighting what can be done.  

I'm definitely interested in this side-discussion between David A and Andreas.  I get the feeling that more of this works than we realized. :)

This is all precisely the sorts of things I hoped to dig up and see if we could improve.

As I mentioned in another thread, I'm gonna be tied up the next week or so but hopefully soon I can start tugging around in the code and examining Antonio's previous work on comet rehydration.  Maybe then we can start forming a real plan.  I know this effort will be an awesome improvement for what we have at work, and I hope that others will likewise benefit from the enhancements we have an opportunity to make.

Joe



Joe Barnes

unread,
Mar 13, 2015, 6:39:50 AM3/13/15
to lif...@googlegroups.com
How about the following high-level idea as the approach?

First, some assumptions:
1) Lift is running with potentially multiple instances with session affinity/stickiness enabled
2) Serialization Just Works™.
2`) Our Lift app developer was diligent to not reference bad things from a function.
3) We have Antonio's comet rehydration.
3`) We have a way to handle code changes (by purposefully failing or whatever)

When in this cluster mode or whatever we should call it, Lift serializes and writes out the functions, etc. that it would need in order to recover, and it does so in parallel with processing the actual request.  This backend persistence could be pluggable so you could use either something performant/in-memory or just a dumb ol' SQL DB.

Requests which come in go through the usual process unless there is a miss where Lift cannot find the comet or function requested.  This is where today's default behavior is to redirect to the index page.  In this case, Lift first checks the backend persistence to see if it is an asset that one of the instances previously served.  If found, grab it and assume it into its own working state.  If a miss, then resume current behavior lest we expose ourselves to the possibility of fake requests, playback, etc.

This writing would simply be the fail-over copy of the state and done in parallel to the normal processing.  Thus under normal circumstances the performance impact should be negligible.  When misses occur, we'll have a little overhead of finding this stuff in the backend store.  We also are taking care not to, for instance, rehydrate a phony comet.  We will only repair state for an asset which was served previously by the application.
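
Here is a minimal sketch of that lookup path. The names (`Failover`, `BackingStore`, `Node`, the `Outcome` cases) are hypothetical, the "descriptor" string stands in for whatever would be needed to rebuild an asset, and the fail-over copy is written synchronously here for simplicity rather than in parallel.

```scala
import scala.collection.concurrent.TrieMap

object Failover {
  sealed trait Outcome
  case object ServedLocally extends Outcome
  case object Recovered extends Outcome
  case object Rejected extends Outcome

  // Hypothetical shared store: records the assets the application has actually served.
  final class BackingStore {
    private val served = TrieMap.empty[String, String] // asset id -> descriptor
    def record(id: String, descriptor: String): Unit = served.put(id, descriptor)
    def lookup(id: String): Option[String] = served.get(id)
  }

  final class Node(store: BackingStore) {
    private val local = TrieMap.empty[String, String]

    // Normal path: keep the asset locally and write the fail-over copy.
    def serve(id: String, descriptor: String): Unit = {
      local.put(id, descriptor)
      store.record(id, descriptor)
    }

    // On a local miss, consult the store; adopt known assets, reject unknown ids.
    def handle(id: String): Outcome =
      if (local.contains(id)) ServedLocally
      else store.lookup(id) match {
        case Some(descriptor) =>
          local.put(id, descriptor) // assume it into this node's working state
          Recovered
        case None =>
          Rejected // unknown id: resume today's behavior
      }
  }
}
```

Because only ids that were previously recorded by `serve` can be recovered, a forged id still ends in `Rejected`, which matches the goal of not rehydrating phony assets.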

A little "hand wavy" I know, but is the idea in principle a sound one?

Joe


David Miguel Antunes

unread,
Mar 13, 2015, 7:06:22 AM3/13/15
to lif...@googlegroups.com

I'd just like to note that you may need to serialize functions together:
Consider this:
def render(): NodeSeq = {
  var amount = 0

  <button onclick={SHtml.ajaxInvoke(() => {amount += 1})}>Increment</button> ++
  <button onclick={SHtml.ajaxInvoke(() => {amount -= 1})}>Decrement</button>
}
You have 2 functions referencing the amount.
If you serialize/deserialize them separately you'll have 2 separate "amounts" in memory after they are deserialized.

It seems to me the finest granularity would be the session: serializing all the session's functions together, and then check the database for the session id and, if it exists, deserialize and load all that session's functions.
That would solve the problem in the example (because serialization takes care of the object graph).
(If there was state shared between sessions, that would be a problem again for the same reason, of course.)

This would also mean that, when writing to DB, all the session's functions would need to be written, so it would probably be better to do it in a shutdown hook before the app goes down or something. In this case the normal processing overhead would be nonexistent; it would only have an impact when restoring a session on its first request (if it was an old session), and it could take a few seconds(?) to complete shutdown (depending on the persistence backend).

Cheers,
David

Joe Barnes

unread,
Mar 13, 2015, 7:32:25 AM3/13/15
to lif...@googlegroups.com
Excellent point regarding serialization, David.  However, I'm not certain that session persistence will grab that `amount` variable hanging over there.  It would need to be a `SessionVar` or something to work like that, no?

Regarding the writing of state, I would like to not wait for a shutdown hook.  The idea works well in most cases where an intentional shutdown was triggered, but it doesn't handle the case of an unintentional failure.

The granularity of session sounds good though.  Perhaps this idea I am pondering should simply augment the existing capabilities we already possess with container clustering.  In this case we have the session state replicated already and we just need to also save the anonymous functions and their keys.

Joe

David Miguel Antunes

unread,
Mar 13, 2015, 8:03:49 AM3/13/15
to lif...@googlegroups.com
What do you mean by grabbing the amount variable?
The serialization also serializes the variable - in fact, it serializes the enclosing class, which contains the variable (which is why it must be serializable).
Here is an example (runs on the repl):
def test(): Unit = {

  import java.io._

  def serialize(obj: AnyRef): Array[Byte] = {
    val baos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(baos)
    oos.writeObject(obj); oos.close(); baos.toByteArray
  }
  def deserialize[T](arr: Array[Byte]): T = {
    val ois = new ObjectInputStream(new ByteArrayInputStream(arr))
    try { ois.readObject().asInstanceOf[T] } finally { ois.close() }
  }

  val fmap = collection.mutable.Map[String, () => Unit]()

  // Enclosing class must be serializable (otherwise will get: "java.io.NotSerializableException: $anon$1"):
  new AnyRef with Serializable {
    var amount = 0
    def render() = {
      fmap += ("incF_9389120" -> (() => {amount += 1; println(s"Inc to $amount")}))
      fmap += ("decF_1837599" -> (() => {amount -= 1; println(s"Dec to $amount")}))
    }
  }.render()

  // Serializing/deserializing the single functions:
  val singleIncFuncDeserialized = deserialize[() => Unit](serialize(fmap("incF_9389120")))
  val singleDecFuncDeserialized = deserialize[() => Unit](serialize(fmap("decF_1837599")))

  println("Wrong: changing different variables:")
  singleIncFuncDeserialized()
  singleIncFuncDeserialized()
  singleDecFuncDeserialized()
  singleDecFuncDeserialized()

  // Serializing/deserializing the whole map:
  val wholeMapDeserialized = deserialize[collection.mutable.Map[String, () => Unit]](serialize(fmap))

  println("Correct: changing the same variable:")
  wholeMapDeserialized("incF_9389120")()
  wholeMapDeserialized("incF_9389120")()
  wholeMapDeserialized("decF_1837599")()
  wholeMapDeserialized("decF_1837599")()
}
test()

This is the output:
scala> test()
Wrong: changing different variables:
Inc to 1
Inc to 2
Dec to -1
Dec to -2
Correct: changing the same variable:
Inc to 1
Inc to 2
Dec to 1
Dec to 0
(In the first case we're changing separate variables)

This may also lead to size issues: suppose you have a page with a table and the table data is 10 MB (so instead of the amount variable, you have a 10 MB "data" variable). If you have 100 functions in the page (clicking table buttons, etc) which capture the "data" variable, the size of serializing the functions individually is 100 × 10 MB = 1 GB, while serializing them together only takes 10 MB (because they share the same "data" object).
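
The size effect can be measured directly. The following is a scaled-down sketch (10 handlers sharing a 100 KB payload instead of 100 handlers sharing 10 MB); `SizeDemo` and `RowHandler` are made-up names, and an explicit serializable class stands in for a closure that captures the shared "data".

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object SizeDemo {
  // Stand-in for a handler closing over a shared table; `data` is the captured variable.
  final class RowHandler(val row: Int, val data: Array[Byte]) extends Serializable

  // Number of bytes Java serialization produces for one object graph.
  def serializedSize(obj: AnyRef): Int = {
    val baos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(baos)
    try oos.writeObject(obj) finally oos.close()
    baos.size()
  }

  val data = new Array[Byte](100000) // shared payload
  val handlers = Array.tabulate(10)(i => new RowHandler(i, data))

  // Each handler in its own stream: the payload is written once per handler.
  def separately: Int = handlers.map(h => serializedSize(h)).sum

  // All handlers in one stream: the payload is written once, then back-referenced.
  def together: Int = serializedSize(handlers)
}
```

With these numbers `separately` comes out at roughly ten times `together`, since each separate stream carries its own copy of the payload.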

That said, the SessionVariable could work (I'm not very familiar with how that part of Lift is implemented...), but even so if the developer uses local variables like the "amount" he'll have this problem.

Cheers,
David

Joe Barnes

unread,
Mar 13, 2015, 8:47:39 AM3/13/15
to lif...@googlegroups.com
I understand that serialization will grab the variable.  What I should have said is that the two functions need to be serialized together so that the variable will be deserialized as the same instance between the two.  Now that I think about it, perhaps this would work.  If those two functions are in the session object, then serializing the session would grab them at the same time.

I'm likewise concerned about data size as you mentioned.  I'm also mulling over an alternative approach.  Rather than serialize all of this stuff, what if we instead try to run the functions again?  In a sense, re-render the page server-side and map all of the client-visible values so that they hook up to the already-served page.  I'm sure there are challenges on mapping them over and so forth, but maybe this would give us better behavior and performance.

Joe

David Whittaker

unread,
Mar 13, 2015, 2:39:26 PM3/13/15
to liftweb
I think this is a pretty good broad outline.  Here are a few questions I think will need answers though:

1. Do we plug into container session replication?  The upsides I see are that it would involve less work and that it is in line with existing features (ContainerVar).  The downsides I see are that, from my experience, container session replication is difficult to set up and that it's specific to the container (jetty, tomcat) being used, which makes the process non-portable and more difficult to document.

2. I strongly believe that trying to serialize the entire function map is a lost cause, and that going down that route will lead to failure.  For serialization of functions attached to IDs to work properly, I think there will need to be a way to specify that specific functions in the map can be replicated.  David A., I understand your arguments to the contrary, but I hope you understand that my opinion isn't that serialization of all anonymous functions *can't be done*, it's that it can't be done *reliably* and *with reasonable performance*.  For it to be done reliably, we'd need to know that everything in the object graph is serializable.  For it to be done with reasonable performance, we'd have to know that *only* the information we're interested in is getting serialized.   Since, like you've mentioned, the entire outer class may be serialized along with an anonymous function I don't see how that last bit can be guaranteed.

Diego Medina

unread,
Mar 13, 2015, 2:47:22 PM3/13/15
to Lift
And just to add more complications, if the idea is to also help when you deploy new versions of your app, note that some of the function IDs will point to "stale" functions, so bringing those back up isn't what you want.

Joe Barnes

unread,
Mar 14, 2015, 1:01:12 PM3/14/15
to lif...@googlegroups.com
I think what David W just stated regarding serialization makes a lot of sense. I'll put more thought into the alternate approach to recreate the functions and re-map the IDs when a page load is rehydrated.  

Perhaps to state the problem even more clearly now, the goal is to replace the Lift server from behind a served page.  Session replication is NOT what I think is important here.  I don't think that leveraging container clustering will be helpful in that case.

Diego, I completely understand and agree.  This is 100% NOT for the case of deploying new versions of code.  That would be doing it wrong. :)

Joe

David Whittaker

unread,
Mar 14, 2015, 1:34:11 PM3/14/15
to liftweb
Joe,

I think we're zeroing in on an approach, but let's not throw away container session replication as a part of the solution just yet.

On the one hand, I'd prefer if I never had to use container session replication again.  It can be difficult to configure, the configuration is container dependent, and the options for a container like Jetty are limited (putting serialized session in JDBC (ugh) and MongoDB are all they mention in their documentation http://www.eclipse.org/jetty/documentation/current/session-clustering-jdbc.html).

On the other hand... container session serialization already exists and if something new needs to get built, I think there are a lot of hard questions to answer.

- Does replication use an external data store, or do we replicate in memory between nodes?
- If the answer is an external store, which ones do we support, and how do we make sure that stale data is removed from the store?
- If the answer is replication between nodes, how are nodes discovered, how is data transmitted, how are sessions partitioned?
- Either way, how do we identify the LiftSession that is referenced for a function invocation? (right now, a LiftSession is tied to a container session and I'm not sure they even have their own unique ID)

While I think that container session replication sucks, the path to a working solution using it is definitely less steep.  All of the questions above can basically be ignored and replicating the function map can be done by either:

a) Putting the LiftSession into the container session (whereas right now it just tracks the lifecycle of the container session) and making it java.io.Externalizable.  Implement readExternal / writeExternal in a way so that only "safe" data is serialized / deserialized.

b) Hooking into the session map registration / de-registration process and making sure "safe" functions are also stored to the container session.  In a failover, when a new LiftSession is created it can grab whatever is in the container session to populate its initial state.

Even if that's not the best long term solution, I think it might be a good idea to do a proof of concept using one of those methods.  It will at least prove out how well the "safe" functions serialize.

Joe Barnes

unread,
Mar 14, 2015, 2:14:50 PM3/14/15
to lif...@googlegroups.com
I agree 100% that building a solution would be very difficult and we should reuse something that already exists.  As for dismissing container replication, I felt that way because I don't see a need to replicate entire sessions and I would like this to also be a good solution for Netty deployments of Lift.  I just need somewhere to store the function IDs and comet IDs (perhaps other data to help tie that together) such that a Lift server can fail and another pick up that state.  The goal is not salvaging sessions but salvaging a page that has already been rendered.  Maybe we will find that we need the entire session for that, but I hope not.  

What I have in mind is that Lift would have the hooks and perhaps define an interface that you can implement in order to be such a store.  We could then implement one as a Lift module that could then depend on some other library or deployment to carry out that work.  This way different implementations can be selected and Lift itself won't need more dependencies.
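
A rough sketch of what such an interface could look like, purely for illustration. The trait name `PageStateStore`, its methods, and the in-memory implementation are all hypothetical; nothing here exists in Lift. A module backed by a replicated store would implement the same trait.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical interface: names and signatures are illustrative, not part of Lift.
trait PageStateStore {
  def put(guid: String, payload: Array[Byte]): Unit
  def get(guid: String): Option[Array[Byte]]
  def remove(guid: String): Unit
}

// Simplest possible implementation: in-process only, no replication.
// A separate module could implement the same trait on top of a replicated store.
class InMemoryPageStateStore extends PageStateStore {
  private val data = TrieMap.empty[String, Array[Byte]]

  def put(guid: String, payload: Array[Byte]): Unit = data.update(guid, payload)

  def get(guid: String): Option[Array[Byte]] = data.get(guid)

  def remove(guid: String): Unit = {
    data.remove(guid)
    ()
  }
}
```

Keeping the payload as opaque bytes leaves the choice of serialization to the caller, so the core would not need any extra dependencies.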

Joe

Joe Barnes

unread,
Mar 22, 2015, 8:36:47 PM3/22/15
to lif...@googlegroups.com
After Scaladays SF last week, I had the pleasure of meeting DPP in person and we discussed this problem I'm trying to solve.  He had a great idea that completed the picture I have in my head.... are you ready for it??

Macros are what I am missing.  Here is the plan I'm going to try fleshing out soon (I hope):

The idea is to store what is minimally needed.  Hence, I'm throwing out concerns of serializing functions and the mess that could produce.  I want to only serialize the GUIDs put into the pages served and what they were attached to.  In the case of comets, I don't suppose this is difficult because they always have a type and optionally a name.  For ajax stuff, they are always associated with an anonymous function which with vanilla Scala isn't readily referenced by something like a name.  With macros, we could generate a static GUID for each function.  This way when we have a GUID miss and go to the store, we can find it and invoke the anonymous function called.  

As I mentioned earlier, I have no interest in this working for page loads served from different versions of the app.  This approach breaks in that case, which is the right behavior (assuming that particular function was recompiled).
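
The lookup half of that plan can be sketched without macros. In the sketch below the stable IDs are written by hand where a macro would generate them, and every name (`StableFunctions`, `GuidBindings`, the ID strings) is hypothetical. Only the page GUID to stable ID mapping would need to be stored, and it is just strings.

```scala
// Hand-written stable IDs stand in for what a macro might generate per function.
object StableFunctions {
  private val registry: Map[String, String => String] = Map(
    "snippet.Greeter#greet" -> ((name: String) => s"Hello, $name"),
    "snippet.Greeter#shout" -> ((name: String) => name.toUpperCase)
  )

  // Every node running the same build has the same registry.
  def lookup(stableId: String): Option[String => String] = registry.get(stableId)
}

// Per-page GUID -> stable ID. Only strings, so trivially serializable.
class GuidBindings(var bindings: Map[String, String] = Map.empty) {
  def bind(pageGuid: String, stableId: String): Unit =
    bindings = bindings.updated(pageGuid, stableId)

  // On a GUID miss in local memory, resolve through the stored binding.
  def invoke(pageGuid: String, arg: String): Option[String] =
    for {
      id <- bindings.get(pageGuid)
      fn <- StableFunctions.lookup(id)
    } yield fn(arg)
}
```

Note that the function found this way is a fresh one from the registry, so anything the original closure captured at render time is gone.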

The first thing I need to identify is a good replicated in memory key/value store, preferably a JVM one which will just be an added dependency.  This would keep the deployment/ops of this feature simple.  (I still see this as a module in case you wanted to back this with redis or whatever).  Furthermore, this would always clean up nicely when you deploy a new version, ensuring you don't accidentally retain dead state.

Somebody hand me my helmet.

Joe



David Whittaker

unread,
Mar 23, 2015, 11:51:04 AM3/23/15
to liftweb
Hi Joe,

Sounds interesting.  So you're going to try to capture something like a cursor that points to the anonymous function location?  If so, have you thought much on how the GUID is re-attached within an alternate server during failover?  If the cursor includes information about the class of the (snippet, lift-ng controller, whatever...) it was nested within, the idea is that an instance of that thing could then be freshly instantiated?  Seems like you'd lose any state that wasn't captured by the macro... but we've already come to terms with the fact that the state was going to be super problematic anyway.  Yeah... I agree that this could be a good approach.

Joe Barnes

unread,
Mar 23, 2015, 1:31:08 PM3/23/15
to lif...@googlegroups.com
Yes David, I think you understand my idea.  This cursor would capture the surrounding classes and such so that it can be rerun as if the page is being rendered a second time.  I'm hopeful I'll find a way to hook in so I can run the snippets exactly this way, and use the static GUID cursor things to hook the functions back up to the page that is already served.  All of the state and such is lost.  If you use this feature, you need to be mindful of this behavior so it does the right thing.  

Joe

You received this message because you are subscribed to a topic in the Google Groups "Lift" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/liftweb/KHjbjev8A0E/unsubscribe.
To unsubscribe from this group and all its topics, send an email to liftweb+u...@googlegroups.com.

David Whittaker

unread,
Mar 23, 2015, 7:31:00 PM3/23/15
to liftweb
Cool.  Really excited to see what you put together.

One quick thought though... before you settle on an external key/value store, consider container sessions.  I know I've ranted against them earlier in this thread, but, at least for a POC, they seem like exactly what you need.  Configure Jetty for distributed sessions, store your GUID -> Cursor mapping in the HttpSession, profit.

Joe Barnes

unread,
Mar 26, 2015, 12:14:49 PM3/26/15
to lif...@googlegroups.com
I think I'm ready to circle back around and like the idea of ContainerVar.  In addition to the point you make, this also means that if you in fact do have stuff you want persistent across the session, you can do it.  Otherwise, one must essentially configure two different clusters.  Perhaps we could still have the hooks that let you handle this yourself if you so desire, but I bet no one would find a need for it...

Thanks,
Joe

Joe Barnes

unread,
Aug 17, 2015, 5:51:31 PM8/17/15
to Lift
Now that I've knocked away a lot of junk that had kept me busy since starting this thread, I'm itching to get something to happen here.  

Is anyone around here running a Lift app in a cluster?  I know David W has mentioned in this thread that it can be quite tricky.  I'd like to dive into one that someone else is already familiar with in case I need tips along the way.

Joe
To unsubscribe from this group and stop receiving emails from it, send an email to liftweb+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
--
Lift, the simply functional web framework: http://liftweb.net
Code: http://github.com/lift
Discussion: http://groups.google.com/group/liftweb
Stuck? Help us help you: https://www.assembla.com/wiki/show/liftweb/Posting_example_code


Joe Barnes

unread,
Aug 18, 2015, 5:53:36 PM8/18/15
to Lift
It turns out that setting up Jetty session clustering with a SQL database is pretty easy.  I have this sample Lift project on github.  Next I'll write packer/terraform stuff for deploying a pair of these Lift servers in AWS.  Then I'll do a test similar to Tim's old screencast to convince myself that my setup is correct.  If that all looks good, then I will have a good test bed for trying out the ideas to store GUIDs and such in the replicated session.

Joe

Joe Barnes

unread,
Oct 2, 2015, 10:26:08 PM10/2/15
to Lift
I finally have some good news to report! Thanks to David W's suggestion to try storing the GUIDs and functions in the HttpSession, I was able to get a proof-of-concept working with very few code changes in my fail-over branch.

I created a LiftRules Boolean var named putAjaxFnsInContainerSession.  When enabled, any time S.functionMap is updated in the LiftSession, I put a copy of the map entries into an HttpSession attribute here.  During LiftSession construction, I read from the HttpSession attribute to initialize the nmessageCallback map here.  Originally I wanted to keep the entire nmessageCallback as an HttpSession attribute, but it contains more than just the ajax functions.  In particular, it has some stateful stuff about the page with loads of unserializable stuff in the object graph.
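
For readers who want the shape of that change without opening the branch, here is a simplified sketch of the write-through and restore steps. It is not the code from the branch: `ContainerSession` mimics HttpSession attributes with a plain map, `SketchSession` mimics only the callback-tracking part of a session, the attribute name is assumed, and no real serialization happens here.

```scala
import scala.collection.mutable

// Hypothetical stand-in for HttpSession attributes (no serialization involved).
class ContainerSession {
  private val attrs = mutable.Map.empty[String, AnyRef]
  def setAttribute(name: String, value: AnyRef): Unit = attrs.update(name, value)
  def getAttribute(name: String): Option[AnyRef] = attrs.get(name)
}

// Hypothetical stand-in for the part of a session that tracks ajax callbacks.
class SketchSession(container: ContainerSession, putAjaxFnsInContainerSession: Boolean) {
  private val AttrName = "ajax-fns" // assumed attribute name

  // On construction, seed the callback map from whatever a previous node stored.
  private var callbacks: Map[String, String => String] =
    container.getAttribute(AttrName)
      .map(_.asInstanceOf[Map[String, String => String]])
      .getOrElse(Map.empty)

  // Every update is copied to the container only when the flag is on.
  def addFunction(guid: String, fn: String => String): Unit = {
    callbacks = callbacks.updated(guid, fn)
    if (putAjaxFnsInContainerSession) container.setAttribute(AttrName, callbacks)
  }

  def run(guid: String, arg: String): Option[String] = callbacks.get(guid).map(_(arg))
}
```

With a real clustered container the attribute would be serialized on its way to the other node, which is where closures over unserializable objects would start to fail.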

If you want to try it for yourself, publish my branch locally and use my lift-jetty-cluster app's failover-testing branch.  Be sure to read the notes about running locally so you can configure your environment for clustering.

Hopefully someone here isn't working a real job and can take some time to review this work.  :)

Next up is to see if I can get Antonio's comet rehydration to work.  (BTW, if you read this, can you tell me what branch that is?)

Joe

Matt Farmer

unread,
Oct 2, 2015, 10:44:34 PM10/2/15
to Lift
I haven’t taken too deep a look into this, but it sounds awesome Joe!


Matt Farmer Blog | Twitter


Antonio Salazar Cardozo

unread,
Oct 4, 2015, 9:30:07 PM10/4/15
to Lift
I believe rehydration used the one commit @ https://github.com/lift/framework/commit/cdc99eb25ad938d657dab1dee9bc61128610c887
and the gist @ https://gist.github.com/Shadowfiend/3007331 . It was pretty primitive.
Thanks,
Antonio

Joe Barnes

unread,
Nov 1, 2015, 1:29:02 PM11/1/15
to Lift
It only took me a month, but I'm finally reviewing your gist for rehydrating comets.  At the moment, it won't work with the changes we've made to Lift 3.x's JS code.  In particular, you use the old lift_cometEntry function (now simply named "cometEntry"), but that is now a private function.

I already mentioned in this other thread that I'd like some of these functions which have been made private to be configurable in some way. I figure the best approach will be following the precedent of functions like noCometSessionCmd and adding them to LiftRules as configurable functions.  I'll probably have a look at this next time I can sit down with our code.

Joe


Joe Barnes

unread,
Feb 6, 2016, 3:51:48 PM2/6/16
to Lift
I've opened a PR for the first bit of this work which will make Lift support failover for ajax functions.

Joe

Joe Barnes

unread,
Feb 7, 2016, 3:27:50 PM2/7/16
to Lift
Cool!  I was able to adapt Antonio's proof-of-concept comet rehydration: https://github.com/lift/framework/commit/fa84b7cd95d7a8ebd3567b06e69b68a6d7aa802e

Of course we would never ship Lift requiring jquery for this feature, but it's a great start!

Joe

Joe Barnes

unread,
Feb 7, 2016, 5:43:26 PM2/7/16
to Lift
Naturally, I'm really digging into the way comet works in Lift.  I'm trying to find a cleaner way to call the server for a fresh page load and build up the "toWatch" object.  That object has the comet IDs which are in the DOM as keys, and the values are timestamps.  It would be really clean if those timestamps were also in the DOM.  I'm considering making the comet divs have data-lift-comet="$timestamp".  Would that pose a security risk in any way?  It seems doubtful since that data MUST get to the client in some form (today it does via the page js calling registerComets).  I just thought I would see what everyone thought before I went down that path.

Joe

Matt Farmer

unread,
Feb 7, 2016, 7:47:37 PM2/7/16
to lif...@googlegroups.com
Can you attach it in such a way that wouldn't show up in the web inspector? Like jQuery.fn.data or something?

Just a thought!


Matt Farmer Blog | Twitter
GPG: CD57 2E26 F60C 0A61 E6D8  FC72 4493 8917 D667 4D07

Joe Barnes

unread,
Feb 8, 2016, 9:17:17 AM2/8/16
to Lift
Matt, that would attach it as a data attribute, right?  That's what I plan to do but it will still certainly show up under inspection...

It turns out that I learned something interesting either way...  It appears that once upon a time we did this exact thing!  I'm hopeful that Antonio is still tuned into this thread because github suggests that he put the code there.

In this commit, Antonio added code (or perhaps moved existing code) to put data-lift-comet attributes on the page with the values set to the comet version.  We later register the comets on page load after collecting these attributes.  I'm not sure which setting causes this code to be put to use, though.  When I curl my page, I see the other two attributes but not the comet(s):

<body data-lift-session-id="F384714483040YW2B5P" data-lift-gc="F384714483041O5VFVI">

With the highly sophisticated debugging capabilities of the JVM, I did a println of the S.requestCometVersions where we build the GUID/Version pairs in the LiftMerge code.  It turns out that it is empty.  I think it's because this portion of the page rendering happens before the comets are added to the page's request context.

I suspect there must be a settings configuration that I'm not familiar with that makes this code relevant, or we refactored some stuff rendering this code moot.  

Anyway, for now I'm going to slap those comet versions in the DOM like a boss.

Joe

Joe Barnes

unread,
Feb 8, 2016, 11:03:34 PM2/8/16
to Lift
I just updated my branch with a rehydrateComets() which does not use jquery nor does it allow the page's JS to run!  For the use case I've focused on, I think this is shaping up nicely.

Right now this work assumes that comets are always in the form of a div with an outer div wrapper.  I've seen while walking through the code that there are a lot of other ways comets can be configured which I guess could affect this DOM structure.  Does anyone have any ideas what else a comet could look like that I should expect?

Joe

Antonio Salazar Cardozo

unread,
Feb 9, 2016, 1:15:29 PM2/9/16
to Lift
Any default comet rendered via `data-lift="Comet..."` should have a div container.
Comets that aren't attached to the HTML won't have any visible markers. Lastly,
and perhaps most importantly, even if someone changes the container (which you
can do by overriding some stuff in CometActor), you should still have data-lift-comet-version
on the element.

I'd make getComets configurable by the provider (liftVanilla vs liftJQuery) and use
a selector lookup, `[data-lift-comet-version]`, where possible. If impossible (i.e., in
liftVanilla on a browser that doesn't have document.querySelectorAll), you can
maybe fall back to rummaging through the whole document.

On the other hand, I see you're trying to clean script tags with a regex… Amongst
other things, that'll miss bits like `onload` attributes that might try to trigger
something. Whether that matters depends on when loading behavior triggers (I think
most browsers don't do it until the element gets attached to the DOM, but older
ones may not conform to that behavior).

I do think cleaning isn't overkill, mind you—otherwise it's entirely possible for you
to accidentally load this page's own JS again in weird ways. This is why I originally
went with an iframe: it provides complete isolation from this page, so you have total
control over what you pull in and what you don't. I'm still not sure that isn't the best
solution.

Lastly, it's worth noting that a core reason I never considered my solution “final” is
there was no mechanism to decide when it's kosher to just rehydrate the comets, vs
when the page has changed in such a way that simple rehydration is no longer a
viable path and you need to reload the page wholesale. Some examples of times
when this could happen are if the backing JS on the page has changed, or when
the page structure has changed in such a way that it doesn't make sense to rehydrate,
etc.

There also wasn't a good way to straightforwardly test whether the basic rehydration
would make your page work wonderfully with new code, or make it explode in a shower
of fiery sparks. That's maybe a higher-level concern—but if we don't solve it then comet
rehydration needs to be off by default IMO.
Thanks,
Antonio

Joe Barnes

unread,
Feb 9, 2016, 9:19:42 PM2/9/16
to lif...@googlegroups.com
Great tips and feedback as always, Antonio.  I can certainly use the configurable JS route as you mentioned.  I imagine that a jquery version will be a bit quicker than wading through the divs of the whole world.

I love the link on the regex. Classic!  I found the regex while trying to google if I should care in the first place.  Even though my Chrome browser was not loading that code, I thought I'd take an attempt to find scripts to see how it went.  It turns out that it didn't even catch the destroy script Lift drops in there.  I decided to not pursue it hard without further discussion.

Regarding the iframe, you make an excellent point.  I went away from it because I didn't want the scripts to run, and I learned that there isn't a good way around it.  However, I'll see if I can do the same trick I did with the current document, but with one from an iframe.  Maybe we can get the isolation without having to worry about the scripts running.

100% agree that this will be off by default.  If nothing else, it would possibly be a pain while developing, confusing the developer if something was broken or not.  Furthermore, I would assume it would not work if the backend code has changed.  This is the same for any of this fail-over stuff I'm attempting.  A Lift page should never be serviced by another Lift server that isn't running the exact same code.  All of this functionality is for identical replacement fail-over only, NOT zero-downtime deployment of new code.  If you don't have the infrastructure set up properly, you're gonna have a bad time.

FWIW, since the noCometSessionCmd is a FactoryMaker, the developer does have an opportunity to decide if the page should be reloaded with server logic, or call a JS function to determine that.  I imagine it wouldn't be the easiest code to write, but the hooks are there.

Thanks again for the time,

Joe


Antonio Salazar Cardozo

unread,
Feb 10, 2016, 1:42:31 PM2/10/16
to Lift
Yep yep. Lot to consider here for sure; thanks for putting in the effort to
start fleshing it out! It's often easier to talk about concrete code than it
is to talk about broad outlines :)
Thanks,
Antonio

Joe Barnes

unread,
Feb 13, 2016, 6:04:39 PM2/13/16
to Lift
I've been thinking some more about the first PR of our Lift failover support after understanding how to rehydrate the comets.  

The current approach for ajax (and potentially other stateful features like SHtml.Text which could be implemented similarly) of depending on container clustering and serialization of the functions into the underlying container session has some drawbacks.  Firstly, the function MUST be fully java.lang.Serializable meaning everything it closes over must also be serializable.  I know that Jetty in particular will throw an Exception immediately if anything in the object graph isn't, but this could potentially vary from container to container.  Furthermore, this opens up a possibility that things work great in development mode when not clustering, but go belly up once stood up in the cluster.  I don't know of any cheap serialization checks, but perhaps they exist (I'm assuming that a test serialization of the thing is not very ideal in terms of performance).  Then if it does fail serialization, how should we handle it?  Blow up immediately breaking the page now?  Leave it out of the container session, potentially breaking the page during a failover?  Allow configurable behavior?  There is a bit of a can of worms here.  I'm curious if anyone has some insightful ideas.
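
To make the trade-off concrete, here is a sketch of the "test serialization" check together with a configurable failure policy. It is exactly the expensive approach mentioned above (it really serializes the value), so it is probably only reasonable in dev mode. The names `OnFailure`, `FailFast`, `SkipEntry`, and `SerializationCheck` are hypothetical and not part of Lift.

```scala
import java.io.{ByteArrayOutputStream, IOException, ObjectOutputStream}

// Hypothetical policy for what to do when a value will not serialize.
sealed trait OnFailure
case object FailFast extends OnFailure  // blow up immediately, breaking the page now
case object SkipEntry extends OnFailure // leave it out; the page may break on failover

object SerializationCheck {
  // Right(byteCount) if the whole object graph serializes, Left(error) otherwise.
  def trySerialize(value: AnyRef): Either[Throwable, Int] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try {
      out.writeObject(value)
      out.flush()
      Right(bytes.size())
    } catch {
      // NotSerializableException is a subclass of IOException.
      case e: IOException => Left(e)
    } finally out.close()
  }

  // Keep only the entries that can be replicated, honoring the policy.
  def replicable(entries: Map[String, AnyRef], policy: OnFailure): Map[String, AnyRef] =
    entries.filter { case (guid, v) =>
      trySerialize(v) match {
        case Right(_) => true
        case Left(e) =>
          policy match {
            case FailFast  => throw new IllegalStateException(s"$guid is not serializable", e)
            case SkipEntry => false
          }
      }
    }
}
```

Running the same check in development, even when no cluster is configured, would at least surface the "works locally, fails in the cluster" case early.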

Back to the rehydration, it seems we could consider going a similar route for repairing the ajax endpoints.  That would be nice because it would not be a feature which requires servlet clustering.  Being similar to the comet restoration, it would be one less complex concern for Lift developers to grok.  However, I feel that ajax functions are more likely to contain some state that would be different when re-serving the page from the new Lift instance.  

There is also a third option of providing implementations of both approaches and letting developers choose.  However, I already anticipate all of this failover stuff will need some thorough documentation on our wiki, and having multiple options to accomplish failover might be a bit overwhelming.

Any opinions?

Joe

Joe Barnes

unread,
Feb 13, 2016, 10:03:26 PM2/13/16
to Lift
I've taken care of nearly everything Antonio mentioned here and opened the PR.  I thought I had it all but after re-reading I'm reminded to try document.querySelectorAll in liftVanilla.  I'll get that up there momentarily.

Joe


Antonio Salazar Cardozo

unread,
Feb 20, 2016, 12:21:00 PM2/20/16
to Lift
I think the main story is that no matter what you do, it's going to be really
hard to provide cluster migration of session/page state unless you're very
careful.

The question is, what do we want “being careful” to mean?

I think part of the issue here is we haven't defined particularly well the
scope of the issue that's being solved. What kinds of things are people
typically doing in callback functions that can't be restored to some extent
by reloading the page? Can we say “we don't support X, Y, and Z, but in
cases A, B, and C you can restore state”? How do we define those
boundaries?

Whatever we end up doing, we need to be very careful to define the
limits of it—and ideally make as many of those limits checkable (via
types, or perhaps via annotations, or whatever) as possible.
Thanks,
Antonio
...

Joe Barnes

unread,
Feb 20, 2016, 1:52:10 PM2/20/16
to lif...@googlegroups.com
I agree with you 100% Antonio.  That has mostly been my thought process behind this set of features.  I expect we won't be able to support this for an arbitrary Lift application, but my intent is to support as many of our features as we reasonably can.  I may just continue with my approach thus far which is to pick a feature, and explore ways it can be supported.  Perhaps if I continue this path, we'll have a clearer picture of what our options are.

How does everyone like that as the approach I take?

Joe



Antonio Salazar Cardozo

unread,
Feb 21, 2016, 2:20:05 PM2/21/16
to Lift
I dig it!

j...@joescii.com

unread,
Jul 6, 2017, 11:14:01 AM7/6/17
to Lift
After a little over a year hiatus, I am FINALLY getting back to this task! To get rolling, I'll start by recapping what I'm trying to accomplish and then I'll report what works as of today.

Very simply stated, my goal is to allow pages served by one instance of a Lift app server to be serviced later by another identical instance. By design a page served by Lift can only be serviced with ajax calls, comet pushes, etc from the exact in-memory instance that served the page. This is one of our core security features. This capability simply expands the authority of a Lift server to be able to service other pages served by siblings as needed. If I can pull this off, then Lift will be a little more suited for modern operations where we commonly replace infrastructure and servers on the fly with zero-downtime. 

The first open PR addresses ajax functions, and the second PR addresses re-attaching comets to a page. The failover-testing branch of this project is set up to demo these features. The readme explains how to demo and what behavior to expect. 

My current approach is the same which is to set up a demo of each basic feature of Lift and see what shakes out. I'm currently looking under the hood at ScreenVars. 

Joe

j...@joescii.com

unread,
Jul 11, 2017, 4:27:42 PM7/11/17
to Lift
I've started trying these features on an existing Lift application to start exploring what other Lift features are dependent upon in-memory state which I have not offloaded to the container's session. Originally I had thought I would leave SessionVar as is because we have ContainerVar which works for this use case. Hence data which makes sense to survive a server failure should be in a ContainerVar, but objects which aren't serializable or don't make sense to migrate (actors, connections perhaps, etc) can go in SessionVars. However, I quickly discovered that this particular app is using the squeryl Lift module which uses a SessionVar which should be a ContainerVar in this app. After discovering that, it quickly occurred to me that it is impractical to expect folks to build Lift apps without the ability to have SessionVars write through to the container like ContainerVars do.

This poses an interesting challenge. I don't think there is a good way to generically know (i.e. from the Lift code) the difference between SessionVars which should be written through to the container vs those which are transient. Even if the object successfully serializes, it could be the case that it's no good when deserialized (an akka ActorRef is a good example, where the actor is bound to the single JVM running the Lift app). 

My best idea at the moment is to make every SessionVar behave like a ContainerVar based on a LiftRules setting. Then I'll add TransientSessionVar to explicitly force the current SessionVar behavior. At least this way, developers have a way to tell Lift to NOT attempt serializing the values down into the container. Here I'm being optimistic that data saved in a SessionVar of a module which an app might consume is more likely to be something that is appropriate to serialize. Also... it's precisely the behavior I need from the SessionVar in the squeryl module. LOL

Any thoughts?

Joe

Antonio Salazar Cardozo

unread,
Jul 12, 2017, 1:30:28 PM7/12/17
to Lift
Another idea might be to make it a toggle on `SessionVar`. We could have an
overloaded constructor that takes a `storeInContainer`, or what have you, and
the current constructor sets that to `true`.

One question is, if we default these to storing at the container level, and
someone has not configured clustering, will the app blow up? Or will it just work
normally until clustering is enabled?
Thanks,
Antonio

j...@joescii.com

unread,
Jul 20, 2017, 12:22:01 PM7/20/17
to Lift
Thanks for the feedback Antonio. Been meaning to reply.

Firstly and more importantly, anyone using Lift who hasn't configured themselves for clustering shouldn't notice a change at all. That's the goal. 

I had thought about your idea to make SessionVar have a parameter. But if I'm not mistaken, it would be the only Var type with a parameter other than the default value. At the moment I'm leaning towards having SessionVar behave like a ContainerVar when appropriate, and giving the developers VolatileSessionVar for the non-cluster case. Furthermore, I plan to build in some dev-helper functionality that will test values for serialization when in dev mode (or whatever condition they like for that matter). Hence if someone uses a SessionVar naively when they really need a VolatileSessionVar, it'll be tested and a helpful message will be produced even though it wasn't necessary to serialize in non-cluster mode. 

How do you like the approach?

Joe

Antonio Salazar Cardozo

unread,
Jul 20, 2017, 2:55:04 PM7/20/17
to Lift
Well I certainly don't hate it :) Worried “Volatile” is a bit of a scary word to include
on something that really means “the way SessionVar has worked until now”, particularly
given volatile has a somewhat specific meaning in JVM/Scala-land. Perhaps
something more like “UnreplicatedSessionVar” or “UnserializedSessionVar”. But
we can discuss those details on a PR.

The other thing is I wouldn't want folks who are using Lift successfully in its current
configuration to suddenly sprout a bunch of runtime warnings. We could make the check
a flag, but it feels like it would be easier to make the serializing version be a new
type of SessionVar. Then if you wanted to opt in, you could pretty easily change all
your SessionVars to SerializableSessionVars or ClusteredSessionVars or whatever,
and then you'd get warnings if your data isn't serializable.

I realize this makes the shift slightly less transparent, but to me it feels like a slightly
better compromise in terms of being trivial to opt into clustering behavior and related
warnings both piecewise and explicitly, while keeping current behavior basically
untouched (realize in a sense this is still true if you don't enable clustering with your
proposal, though).
Thanks,
Antonio

j...@joescii.com

unread,
Jul 20, 2017, 4:51:34 PM7/20/17
to Lift
Yeah, I actually have it named "LegacySessionVar" at the moment. I wondered if you would feel that "Volatile" had some undesirable implications/connotations that wouldn't work.

I agree about the warnings. I've not figured out exactly what the LiftRules settings need to look like for this pile of work, but I would want those warnings to default to "OFF" for that exact reason.

We actually already have ContainerVar which will do what you're referring to, Antonio. The problem I encountered there (I think I mentioned it in this thread at some point) is that some Lift modules use a SessionVar when really it would need to be a ContainerVar. My gut here is it would be pretty rough to go through and update our Lift modules.

Joe

Matt Farmer

unread,
Jul 21, 2017, 9:27:06 AM7/21/17
to Lift
Following along loosely here, but I do have some opinions to share. =)

I'm not wild about changing the meaning of "SessionVar". I would prefer there to be a new concept and for us to update liftmodules as appropriate. In general, I've found it to be a good principle that serialization of any kind is something that a developer should opt-in to, not opt-out of. We can't be sure that doing anything else won't cause edge cases to sprout up in the various lift modules that already exist. I actually think that testing for that sounds more difficult than updating places in those modules we know are safe to serialize somewhere.

Cheers,
Matt

j...@joescii.com

Jul 21, 2017, 10:24:27 AM
to Lift
I certainly don't disagree with your and Antonio's hesitance about using a switch to change the behavior of a SessionVar, and you bring up some good points regarding updating modules.

One other wrench to throw in here: I've discovered that RequestVars need the same treatment. If one cluster node serves the page and another services an ajax request, then the RequestVars need to be properly serialized/deserialized to continue handling the page.

When I get the hood up on the Vars again, I'll try to evaluate the effort to outfit the Lift source with "Container" vars as is appropriate for Session/Requests, rather than my current approach to change the behavior of existing Vars.

Thanks for the feedback as always folks!

Joe

Antonio Salazar Cardozo

Jul 21, 2017, 5:11:49 PM
to Lift
Worth noting: typical usage of *Vars looks like:

  object myVarThing extends *Var[Type](default)

If we could make the containerized/clustering behavior a mixed-in trait, we could
imagine:

  object myVarThing extends *Var[Type](default) with ContainrizedClusteringAwesomeSauce

Or a better name if preferred :D I think that might be cool/valuable as well.
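To make the mix-in idea concrete, here is a rough, self-contained sketch with no Lift dependency. ToyVar is a hypothetical stand-in for a Lift *Var, and ClusterAware is a made-up name for the opt-in trait; neither exists in Lift. The trait just records a warning when a value that Java serialization can't handle is set, which is the kind of opt-in warning discussed earlier in the thread:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a Lift *Var; NOT Lift's actual implementation.
abstract class ToyVar[T](default: => T) {
  private var current: Option[T] = None
  def get: T = current.getOrElse(default)
  def set(value: T): T = { current = Some(value); value }
}

// Hypothetical opt-in trait: on every set, checks whether the value
// survives Java serialization and records a warning if it does not.
trait ClusterAware[T] extends ToyVar[T] {
  var warnings: List[String] = Nil

  override def set(value: T): T = {
    if (!ClusterAware.isSerializable(value))
      warnings ::= s"value of type ${value.getClass.getName} is not serializable"
    super.set(value)
  }
}

object ClusterAware {
  // Attempts a throwaway serialization; only a NotSerializableException
  // is treated as "not serializable".
  def isSerializable(value: Any): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(value)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}

// Existing vars keep their behavior; clustering checks are opted into per var.
object plainVar extends ToyVar[AnyRef](null)
object clusteredVar extends ToyVar[AnyRef](null) with ClusterAware[AnyRef]
```

The appeal of this shape is that the declaration site states the intent: a var without the trait behaves exactly as before, and one with it opts into the checks explicitly.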
Thanks,
Antonio

Matt Farmer

Jul 21, 2017, 5:42:47 PM
to Lift
What's wrong with ContainrizedClusteringAwesomeSauce? :P

j...@joescii.com

Oct 3, 2017, 2:27:17 PM
to Lift
Ok, I have a long overdue update on this effort. I'd almost call it an announcement. :)

A pair of Lift PRs (comet rehydration and session serialization support) and the new lift-cluster module seem to work! We're still in the testing phase with our application, but at this point I primarily anticipate issues to be in the application code. 

The main thing I have left to do is to complete the readme for lift-cluster. There isn't enough there yet for someone to successfully deploy a Lift application as a cluster. There are also gotchas and such which need to be documented.

But overall the design is fairly simple as far as we are concerned (there is NOTHING simple about using Kryo to serialize LiftSession, lol). I wrap up the LiftSession instance with a class that implements Java serialization on the surface, but uses Kryo to do the serialization. The container then can serialize the session state and put it wherever it likes (Jetty in particular stores it in a SQL DB). If a node on the cluster fails and gets removed, then the other nodes will be able to deserialize the LiftSession and continue servicing the previously served pages. If the page has comets, then it will fire the code added in the comet rehydration PR. That code refetches the page and injects the new comet IDs.
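
For anyone who wants a feel for the wrapper pattern, here is a minimal, self-contained sketch. It is not the lift-cluster code: ToySession is a hypothetical stand-in for LiftSession, and ToyCodec is a hand-rolled stand-in for the role Kryo plays in the real thing. All names here are made up for illustration. The point is only that the container sees a java.io.Serializable object while the actual bytes come from a different serializer:

```scala
import java.io._

// Hypothetical stand-in for LiftSession: deliberately NOT java.io.Serializable.
final class ToySession(val id: String, val vars: Map[String, String])

// Hypothetical codec standing in for Kryo in this sketch.
object ToyCodec {
  def encode(s: ToySession): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    out.writeUTF(s.id)
    out.writeInt(s.vars.size)
    s.vars.foreach { case (k, v) => out.writeUTF(k); out.writeUTF(v) }
    out.flush()
    bytes.toByteArray
  }

  def decode(data: Array[Byte]): ToySession = {
    val in = new DataInputStream(new ByteArrayInputStream(data))
    val id = in.readUTF()
    val count = in.readInt()
    val vars = (0 until count).map { _ =>
      val key = in.readUTF()
      val value = in.readUTF()
      key -> value
    }.toMap
    new ToySession(id, vars)
  }
}

// Serializable on the surface; the container only ever handles this wrapper.
final class SessionWrapper(initial: ToySession) extends Serializable {
  // Transient so that Java serialization never touches the session directly.
  @transient private var underlying: ToySession = initial

  def session: ToySession = underlying

  // Hook called by Java serialization: delegate the real work to the codec.
  @throws[IOException]
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    val data = ToyCodec.encode(underlying)
    out.writeInt(data.length)
    out.write(data)
  }

  // Hook called by Java deserialization: rebuild the session from codec bytes.
  @throws[IOException]
  @throws[ClassNotFoundException]
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    val data = new Array[Byte](in.readInt())
    in.readFully(data)
    underlying = ToyCodec.decode(data)
  }
}
```

A container would write the wrapper with a plain ObjectOutputStream and read it back with an ObjectInputStream on another node, without ever knowing which serializer produced the payload.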

Joe

Matt Farmer

Oct 6, 2017, 10:13:04 AM
to Lift
This is awesome Joe!